📄 expr2.c
字号:
/* $EPIC: expr2.c,v 1.19 2004/11/10 03:20:35 jnelson Exp $ *//* * Zsh: math.c,v 3.1.2.1 1997/06/01 06:13:15 hzoli Exp * math.c - mathematical expression evaluation * This file is based on 'math.c', which is part of zsh, the Z shell. * * Copyright (c) 1992-1997 Paul Falstad, All rights reserved. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the copyright notice, * this list of conditions and the two following paragraphs. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimers in the * documentation and/or other materials provided with the distribution * 3. The names of the author(s) may not be used to endorse or promote * products derived from this software without specific prior written * permission. * * In no event shall Paul Falstad or the Zsh Development Group be liable * to any party for direct, indirect, special, incidental, or consequential * damages arising out of the use of this software and its documentation, * even if Paul Falstad and the Zsh Development Group have been advised of * the possibility of such damage. * * Paul Falstad and the Zsh Development Group specifically disclaim any * warranties, including, but not limited to, the implied warranties of * merchantability and fitness for a particular purpose. The software * provided hereunder is on an "as is" basis, and Paul Falstad and the * Zsh Development Group have no obligation to provide maintenance, * support, updates, enhancements, or modifications. * *//* * Substantial modifications by Jeremy Nelson which are * Coypright 1998, 2003 EPIC Software Labs, All rights reserved. * * You may distribute this file under the same terms as the above, by * including the parties "Jeremy Nelson" and "EPIC Software Labs" to the * limitations of liability and the express disclaimer of all warranties. * This software is provided "AS IS". */#include <math.h>#define STACKSZ 80#define TOKENCOUNT 80#define MAGIC_TOKEN -14/* * THIS IS THE "NEW NEW" MATH PARSER -- or shall I say, this is the * new math parser, second edition. *//* * Question: Why is this math parser so much more hideous than the old one? * * Answer: The implementation looks very very complex, and to a certain * extent it is. However, do not be frightened by the malicious abuse of * macros and the over-use of inline functions. The design of this math * parser is not nearly as complex as its implementation. Maybe that is * due to my not being the world's best C programmer, and I probably wrote * this code too quickly. One of the most important design considerations * was that it should be possible to prove correctness, while maintaining * the possibility for general optimization. Those are conflicting goals * and this implementation tries to balance the two. *//* * Question: Why should I use this math parser instead of the old one? * * Answer: Like the old math parser, this new math parser evaluates infix * notation expressions and returns the resulting value of the entire * expression. Unlike the old math parser, this new math parser correctly * obeys both operator precedence as well as associativity. It still can * do short-circuiting of &&, ||, and ?: operators. All operands start * off their life as text strings, but can be converted to and from other * data types (such as floating point, integer, and boolean). Type conversion * is automatic and efficient, and the results of every conversion is cached, * so the conversion is only done once, and only when it is actually needed. * This new math parser has a slew of new operators, as well as correct * implementations of some of the old operators. This new math parser also * handles both integer and floating point operations gracefully. *//* * Question: Why is everything stored in a struct? * * Answer: All the information for each expression is stored in a struct. * This is done so that there are no global variables in use (they're all * collected making them easier to handle), and makes re-entrancy possible * since you don't have to worry about whether or not all of the state * information is accounted for (it's all on the stack). *//* * Question: Why do you keep a 'symbol table' for each expression? * * By keeping a record of every direct symbol or derived symbol during * the entire lifetime of the expression, we can be assured that we have * a clean way to clean up after an expression (no memory leaks), and * permit a reference to any operator to persist for the entire lifetime * of the expression, without having to worry about who might be holding * a reference to an operand (tokens never change over their lifetime). * By refering to each token through an integer, rather than a pointer, * we can also prevent stale pointers, which can cause crashes. * * This also solves several more problems. There is never any concern over * whether or not a certain string is malloced(), or just who is responsible * for free()ing it. If you need a value to stay around as a temporary value * you can always tokenize() it and get a handle which you then use further. * The value will not go away until the entire expression has been parsed. *//* * Question: Why don't you support pre-compiled expressions? * * Because this implementation does not create a compiled spanning tree * out of the expression before executing it, but rather tokenizes the * operands and reduces the operations based on prior operations. Perhaps * this might change in the future, but don't count on it. The lexer uses * the results of prior operations to support such things as short circuits * and changing that would be a big pain. */typedef int TOKEN;typedef int BooL;/* * These flags tell us whether or not a given token has a useful value * of the given type. This is used to tell us when we need to do automatic * conversion, or whether the conversion was done before and we can just * grab the cached value. These flags are used by the "used" field, and * are cumulative. */#define USED_NONE 0#define USED_LVAL 1 << 0#define USED_RAW 1 << 1#define USED_EXPANDED 1 << 2#define USED_INTEGER 1 << 3#define USED_FLOAT 1 << 4#define USED_BOOLEAN 1 << 5/* * Theoretically, all these macros are independant because the return value of * INT*FUNC is typecasted back to INTTYPE. One caveat is that INT2STR * must return a malloc'd string. */#ifdef HAVE_LONG_LONGtypedef long long INTTYPE;#define FORMAT "%lld"#define STR2INT(x) ((INTTYPE)atoll(x))#define INT2STR(x) (malloc_sprintf(NULL, FORMAT , (INTTYPE)(x)))#elsetypedef long INTTYPE;#define FORMAT "%ld"#define STR2INT(x) ((INTTYPE)atol(x))#define INT2STR(x) (malloc_sprintf(NULL, FORMAT , (INTTYPE)(x)))#endif/* * This is a symbol table entry. */typedef struct TOKEN_type { int used; /* Which fields contain useful info? */ char * lval; /* Cached unexpanded variable name */ char * raw_value; /* Cached unexpected string */ char * expanded_value; /* Cached full expanded string */ INTTYPE integer_value; /* Cached integer value */ double float_value; /* Cached floating point value */ short boolean_value; /* Cached boolean value */} SYMBOL;#define __inline /* * This is an expression context */typedef struct{ /* POSITION AND STATE INFORMATION */ /* * This is the current position in the lexing. */ char *ptr; /* * When set, the expression is lexed, but nothing that may have a side * effect (function calls, assignments, etc) are actually performed. * Dummy values are instead substituted. */ int noeval; /* * When set, this means the next token may either be a prefix operator * or an operand. When clear, it means the next operator must be a * non-prefix operator. */ int operand; /* TOKEN TABLE */ /* * Each registered 'token' is given a TOKEN id. The idea is * that we want TOKEN to be an opaque type to be used to refer * to a token in a generic way, but in practice its just an integer * offset into a char ** table. We register all tokens sequentially, * so this just gets incremented when we want to register a new token. */ TOKEN token; /* * This is the list of operand (string) tokens we have extracted * so far from the expression. Offsets into this array are stored * into the parsing stack. */ SYMBOL tokens[TOKENCOUNT + 1]; /* OPERAND STACK */ /* * This is the operand shift stack. These are the operands that * are currently awaiting reduction. Note that rather than keeping * track of the lvals and rvals here, we simply keep track of offsets * to the 'tokens' table that actually stores all the relevant data. * Then we can just call the token-class functions to get that data. * This is more efficient because it allows us to recycle tokens * more reasonably without wasteful malloc-copies. */ TOKEN stack[STACKSZ + 1]; /* Current index to the operand stack */ int sp; /* This is the last token that was lexed. */ TOKEN mtok; /* This is set when an error happens */ int errflag; TOKEN last_token; const char *args; int *args_flag;} expr_info;/* * Useful macro to get at a specific token. * 'c' is the expression context, 'v' is the token handle. */#define TOK(c, v) c->tokens[v]/* Forward function references */__inline static TOKEN tokenize_raw (expr_info *c, char *t); static char * after_expando_special (expr_info *c); static char * alias_special_char (char **buffer, char *ptr, const char *args, char *quote_em, int *args_flag);/******************** EXPRESSION CONSTRUCTOR AND DESTRUCTOR ****************//* * Clean the expression context pointed to by 'c'. * This function must be called before you call mathparse(). */static void setup_expr_info (expr_info *c){ int i; c->ptr = NULL; c->noeval = 0; c->operand = 1; c->token = 0; for (i = 0; i <= TOKENCOUNT; i++) { TOK(c, i).used = USED_NONE; TOK(c, i).lval = NULL; TOK(c, i).raw_value = NULL; TOK(c, i).expanded_value = NULL; TOK(c, i).integer_value = 0; TOK(c, i).float_value = 0; TOK(c, i).boolean_value = 0; } for (i = 0; i <= STACKSZ; i++) c->stack[i] = 0; c->sp = -1; c->mtok = 0; c->errflag = 0; c->last_token = 0; tokenize_raw(c, empty_string); /* Always token 0 */}/* * Clean up the expression context pointed to by 'c'. * This function must be called after you call mathparse(). */static void destroy_expr_info (expr_info *c){ int i; c->ptr = NULL; c->noeval = -1; c->operand = -1; for (i = 0; i < c->token; i++) { TOK(c, i).used = USED_NONE; new_free(&TOK(c, i).lval); new_free(&TOK(c, i).raw_value); new_free(&TOK(c, i).expanded_value); } c->token = -1; for (i = 0; i <= STACKSZ; i++) c->stack[i] = -1; c->sp = -1; c->mtok = -1; c->errflag = -1; c->last_token = -1;}/**************** TOKEN, PRECEDENCE, and ASSOCITIVITY TABLES ****************//* * LR = left-to-right associativity * RL = right-to-left associativity * BOOL = short-circuiting boolean */#define LR 0#define RL 1#define BOOL 2/* * These are all the token-types. Each of the operators is represented, * as is the generic operand type */enum LEX { M_INPAR, NOT, COMP, PREMINUS, PREPLUS, UPLUS, UMINUS, STRLEN, WORDC, DEREF, POWER, MUL, DIV, MOD, PLUS, MINUS, STRCAT, SHLEFT, SHRIGHT, LES, LEQ, GRE, GEQ, MATCH, NOMATCH, DEQ, NEQ, AND, XOR, OR, DAND, DXOR, DOR, QUEST, COLON, EQ, PLUSEQ, MINUSEQ, MULEQ, DIVEQ, MODEQ, ANDEQ, XOREQ, OREQ, SHLEFTEQ, SHRIGHTEQ, DANDEQ, DOREQ, DXOREQ, POWEREQ, STRCATEQ, STRPREEQ, SWAP, COMMA, POSTMINUS, POSTPLUS, ID, M_OUTPAR, ERROR, EOI, TOKCOUNT};/* * Precedence table: Operators with a lower precedence VALUE have a higher * precedence. The theory behind infix notation (algebraic notation) is that * you have a sequence of operands seperated by (typically binary) operators. * The problem of precedence is that each operand is surrounded by two * operators, so it is ambiguous which operator the operand "binds" to. This * is resolved by "precedence rules" which state that given two operators, * which one is allowed to "reduce" (operate on) the operand. For a simple * explanation, take the expression (3+4*5). Now the middle operand is a * '4', but we dont know if it should be reduced via the plus, or via the * multiply. If we look up both operators in the prec table, we see that * multiply has the lower value -- therefore the 4 is reduced via the multiply * and then the result of the multiply is reduced by the addition. */static int prec[TOKCOUNT] = { 1, 2, 2, 2, 2, 2, 2, 2, 2, 2, 3, 4, 4, 4, 5, 5, 5, 6, 6, 7, 7, 7, 7, 8, 8, 9, 9, 10, 11, 12, 13, 14, 15, 16, 16, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 18, 2, 2, 0, 137, 156, 200};#define TOPPREC 21/* * Associativity table: But precedence isnt enough. What happens when you * have two identical operations to determine between? Well, the easy way * is to say that the first operation is always done first. But some * operators dont work that way (like the assignment operator) and always * reduce the LAST (or rightmost) operation first. For example: * (3+4+5) ((4+3)+5) (7+5) (12) * (v1=v2=3) (v1=(v2=3)) (v1=3) (3) * So for each operator we need to specify how to determine the precedence * of the same operator against itself. This is called "associativity". * Left-to-right associativity means that the operations occur left-to-right, * or first-operator, first-reduced. Right-to-left associativity means * that the operations occur right-to-left, or last-operator, first-reduced. * * We have a special case of associativity called BOOL, which is just a * special type of left-to-right associtivity whereby the right hand side * of the operand is not automatically parsed. (not really, but its the * theory that counts.) */static int assoc[TOKCOUNT] ={ LR, RL, RL, RL, RL, RL, RL, RL, RL, RL, RL, LR, LR, LR, LR, LR, LR, LR, LR, LR, LR, LR, LR, LR, LR, LR, LR, LR, LR, LR, BOOL, BOOL, BOOL, RL, RL, RL, RL, RL, RL, RL, RL, RL, RL, RL, RL, RL, RL, RL, RL, RL, RL, RL, RL, LR, RL, RL, LR, LR, LR, LR};/* ********************* CREATE NEW SYMBOLS AND TOKENS ********************//* Lvalues are stored via this routine */__inline static TOKEN tokenize_lval (expr_info *c, const char *t){ if (c->token >= TOKENCOUNT) { error("Too many tokens for this expression"); return -1; } TOK(c, c->token).used = USED_LVAL; malloc_strcpy(&TOK(c, c->token).lval, t); return c->token++;}
⌨️ 快捷键说明
复制代码
Ctrl + C
搜索代码
Ctrl + F
全屏模式
F11
切换主题
Ctrl + Shift + D
显示快捷键
?
增大字号
Ctrl + =
减小字号
Ctrl + -