        * regex.c (most routines): use unsigned char vs. char consistently.

        * regex.h (re_compile_pattern): do not declare the length arg as
        const.
        * regex.c (re_compile_pattern): likewise.

        * regex.c (POINTER_TO_REG): rename to `POINTER_TO_OFFSET'.

        * regex.h (re_registers): declare `start' and `end' as
        `regoff_t', instead of `int'.

        * regex.c (regexec): if either of the malloc's for the register
        information fail, return failure.

        * regex.h (RE_NREGS): define this again, as 30 (from jla).
        (RE_ALLOCATE_REGISTERS): remove this.
        (RE_SYNTAX_*): remove it from definitions.
        (re_pattern_buffer): remove `return_default_num_regs', add
        `caller_allocated_regs'.
        * regex.c (re_compile_pattern): clear no_sub and
        caller_allocated_regs in the pattern.
        (regcomp): set caller_allocated_regs.
        (re_match_2): do all register allocation at the end of the
        match; implement new semantics.

        * regex.c (MAX_REGNUM): new macro.
        (regex_compile): at handle_open and handle_close, if the group
        number is too large, don't push the start/stop memory.

Thu Jan  2 07:56:10 1992  Karl Berry  (karl at hayley)

        * regex.c (re_match_2): if the back reference is to a group that
        never matched, then goto fail, not really_fail. Also, don't test
        if the pattern can match the empty string. Why did we ever do
        that?
        (really_fail): this label no longer needed.

        * regexinc.c [STDC_HEADERS]: use only this to test if we should
        include <stdlib.h>.

        * regex.c (DO_RANGE, regex_compile): translate in all cases
        except the single character after a \.

        * regex.h (RE_AWK_CLASS_HACK): rename to
        RE_BACKSLASH_ESCAPE_IN_LISTS.
        * regex.c (regex_compile): change use.

        * regex.c (re_compile_fastmap): do not translate the characters
        again; we already translated them at compilation. (From
        ylo@ngs.fi.)

        * regex.c (re_match_2): in case for at_dot, invert sense of
        comparison and find the character number properly. (From
        worley@compass.com.)
        (re_match_2) [emacs]: remove the cases for before_dot and
        after_dot, since there's no way to specify them, and the code is
        wrong (judging from this change).

Wed Jan  1 09:13:38 1992  Karl Berry  (karl at hayley)

        * psx-{interf,basic,extend}.c, other.c: set `t' as the first
        thing, so that if we run them in succession, general_test's
        kludge to see if we're doing POSIX tests works.

        * test.h (test_type): add `all_test'.
        * main.c: add case for `all_test'.

        * regexinc.c (partial_compiled_pattern_printer,
        double_string_printer): don't print anything if we're passed
        null.

        * regex.c (PUSH_FAILURE_POINT): do not scan for the highest and
        lowest active registers.
        (re_match_2): compute lowest/highest active regs at start_memory
        and stop_memory.
        (NO_{LOW,HIGH}EST_ACTIVE_REG): new sentinel values.
        (pop_failure_point): return the lowest/highest active reg values
        popped; change calls.

        * regex.c [DEBUG]: include <assert.h>.
        (various routines) [DEBUG]: change conditionals to assertions.

        * regex.c (DEBUG_STATEMENT): new macro.
        (PUSH_FAILURE_POINT): use it to increment num_regs_pushed.
        (re_match_2) [DEBUG]: only declare num_regs_pushed if DEBUG.

        * regex.c (*can_match_nothing): rename to *unmatchable.

        * regex.c (re_match_2): at stop_memory, adjust argument reading.

        * regex.h (re_pattern_buffer): declare `can_be_null' as a 2-bit
        bit field.

        * regex.h (re_pattern_buffer): declare `buffer' unsigned char *;
        no, dumb idea. The pattern can have signed numbers.

        * regex.c (re_match_2): in maybe_pop_jump case, skip over the
        right number of args to the group operators, and don't do
        anything with endline if posix_newline is not set.

        * regex.c, regexinc.c (all the things we just changed): go back
        to putting the inner group count after the start_memory, because
        we need it in the on_failure_jump case in re_match_2. But leave
        it after the stop_memory also, since we need it there in
        re_match_2, and we don't have any way of getting back to the
        start_memory.
        * regexinc.c (partial_compiled_pattern_printer): adjust argument
        reading for start/stop_memory.
        * regex.c (re_compile_fastmap, group_can_match_nothing):
        likewise.

Tue Dec 31 10:15:08 1991  Karl Berry  (karl at hayley)

        * regex.c (bits list routines): remove these.
        (re_match_2): get the number of inner groups from the pattern,
        instead of keeping track of it at start and stop_memory. Put the
        count after the stop_memory, not after the start_memory.
        (compile_stack_element): remove `fixup_inner_group' member,
        since we now put it in when we can compute it.
        (regex_compile): at handle_open, don't push the inner group
        offset, and at handle_close, don't pop it.

        * regex.c (level routines): remove these, and their uses in
        regex_compile. This was another manifestation of having to find
        $'s that were endlines.

        * regex.c (regexec): this does searching, not matching (a
        well-disguised part of the standard). So rewrite to use
        `re_search' instead of `re_match'.
        * psx-interf.c (test_regexec): add tests to, uh, match.

        * regex.h (RE_TIGHT_ALT): remove this; nobody uses it.
        * regex.c: remove the code that was supposed to implement it.

        * other.c (test_others): ^ and $ never match newline characters;
        RE_CONTEXT_INVALID_OPS doesn't affect anchors.

        * psx-interf.c (test_regerror): update for new error messages.

        * psx-extend.c: it's now ok to have an alternative be just a $,
        so remove all the tests which supposed that was invalid.

Wed Dec 25 09:00:05 1991  Karl Berry  (karl at hayley)

        * regex.c (regex_compile): in handle_open, don't skip over ^ and
        $ when checking for an empty group. POSIX has changed the
        grammar.
        * psx-extend.c (test_posix_extended): thus, move (^$) tests to
        valid section.

        * regexinc.c (boolean): move from here to test.h and regex.c.
        * test files: declare verbose, omit_register_tests, and
        test_should_match as boolean.

        * psx-interf.c (test_posix_c_interface): remove the `c_'.
        * main.c: likewise.
        * psx-basic.c (test_posix_basic): ^ ($) is an anchor after
        (before) an open (close) group.

        * regex.c (re_match_2): in endline, correct precedence of
        posix_newline condition.

Tue Dec 24 06:45:11 1991  Karl Berry  (karl at hayley)

        * test.h: incorporate private-tst.h.
        * test files: include test.h, not private-tst.h.

        * test.c (general_test): set posix_newline to zero if we are
        doing POSIX tests (unfortunately, it's difficult to call regcomp
        in this case, which is what we should really be doing).

        * regex.h (reg_syntax_t): make this an enumeration type which
        defines the syntax bits; renames re_syntax_t.

        * regex.c (at_endline_op_p): don't preincrement p; then if it's
        not an empty string op, we lose.

        * regex.h (reg_errcode_t): new enumeration type of the error
        codes.
        * regex.c (regex_compile): return that type.

        * regex.c (regex_compile): in [, initialize
        just_had_a_char_class to false; somehow I had changed this to
        true.

        * regex.h (RE_NO_CONSECUTIVE_REPEATS): remove this, since we
        don't use it, and POSIX doesn't require this behavior anymore.
        * regex.c (regex_compile): remove it from here.

        * regex.c (regex_compile): remove the no_op insertions for
        verify_and_adjust_endlines, since that doesn't exist anymore.

        * regex.c (regex_compile) [DEBUG]: use printchar to print the
        pattern, so unprintable bytes will print properly.

        * regex.c: move re_error_msg back.

        * test.c (general_test): print the compile error if the pattern
        was invalid.

Mon Dec 23 08:54:53 1991  Karl Berry  (karl at hayley)

        * regexinc.c: move re_error_msg here.

        * regex.c (re_error_msg): the ``message'' for success must be
        NULL, to keep the interface to re_compile_pattern the same.
        (regerror): if the msg is null, use "Success".

        * rename most test files for consistency. Change Makefile
        correspondingly.
        * test.c (most routines): add casts to (unsigned char *) when we
        call re_{match,search}{,_2}.

Sun Dec 22 09:26:06 1991  Karl Berry  (karl at hayley)

        * regex.c (re_match_2): declare string args as unsigned char *
        again; don't declare non-pointer args const; declare the pattern
        buffer const.
        (re_match): likewise.
        (re_search_2, re_search): likewise, except don't declare the
        pattern const, since we make a fastmap.
        * regex.h [__STDC__]: change prototypes.

        * regex.c (regex_compile): return an error code, not a string.
        (re_err_list): new table to map from error codes to string.
        (re_compile_pattern): return an element of re_err_list.
        (regcomp): don't test all the strings.
        (regerror): just use the list.
        (put_in_buffer): remove this.

        * regex.c (equivalent_failure_points): remove this.

        * regex.c (re_match_2): don't copy the string arguments into
        non-const pointers. We never alter the data.

        * regex.c (re_match_2): move assignment to `is_a_jump_n' out of
        the main loop. Just initialize it right before we do something
        with it.

        * regex.[ch] (re_match_2): don't declare the int parameters
        const.

Sat Dec 21 08:52:20 1991  Karl Berry  (karl at hayley)

        * regex.h (re_syntax_t): new type; declare to be unsigned
        (previously we used int, but since we do bit operations on this,
        unsigned is better, according to H&S).
        (obscure_syntax, re_pattern_buffer): use that type.
        * regex.c (re_set_syntax, regex_compile): likewise.

        * regex.h (re_pattern_buffer): new field `posix_newline'.
        * regex.c (re_comp, re_compile_pattern): set to zero.
        (regcomp): set to REG_NEWLINE.
        * regex.h (RE_HAT_LISTS_NOT_NEWLINE): remove this (we can just
        check `posix_newline' instead.)

        * regex.c (op_list_type, op_list, add_op): remove these.
        (verify_and_adjust_endlines): remove this.
        (pattern_offset_list_type, *pattern_offset* routines): and
        these. These things all implemented the nonleading/nontrailing
        position code, which was very long, had a few remaining
        problems, and is no longer needed. So...
        * regexinc.c (STREQ): new macro to abbreviate strcmp(,)==0, for
        brevity. Change various places in regex.c to use it.

        * regex{,inc}.c (enum regexpcode): change to a typedef
        re_opcode_t, for brevity.

        * regex.h (re_syntax_table) [SYNTAX_TABLE]: remove this; it
        should only be in regex.c, I think, since we don't define it in
        this case. Maybe it should be conditional on !SYNTAX_TABLE?

        * regexinc.c (partial_compiled_pattern_printer): simplify and
        distinguish the emacs/not-emacs (not)wordchar cases.

Fri Dec 20 08:11:38 1991  Karl Berry  (karl at hayley)

        * regexinc.c (regexpcode) [emacs]: only define the Emacs opcodes
        if we are ifdef emacs.

        * regex.c (BUF_PUSH*): rename to PAT_PUSH*.

        * regex.c (regex_compile): in $ case, go back to essentially the
        original code for deciding endline op vs. normal char.
        (at_endline_op_p): new routine.
        * regex.h (RE_ANCHORS_ONLY_AT_ENDS, RE_CONTEXT_INVALID_ANCHORS,
        RE_REPEATED_ANCHORS_AWAY, RE_NO_ANCHOR_AT_NEWLINE): remove
        these. POSIX has simplified the rules for anchors in draft 11.2.
        (RE_NEWLINE_ORDINARY): new syntax bit.
        (RE_CONTEXT_INDEP_ANCHORS): change description to be compatible
        with POSIX.
        * regex.texinfo (Syntax Bits): remove the descriptions.

Mon Dec 16 08:12:40 1991  Karl Berry  (karl at hayley)

        * regex.c (re_match_2): in jump_past_next_alt, unconditionally
        goto no_pop. The only register we were finding was one which
        enclosed the whole alternative expression, not one around an
        individual alternative. So we were never doing what we thought
        we were doing, and this way makes (|a) against the empty string
        fail.

        * regex.c (regex_compile): remove `highest_ever_regnum', and
        don't restore regnum from the stack; just put it into a
        temporary to put into the stop_memory. Otherwise, groups aren't
        numbered consecutively.

        * regex.c (is_in_compile_stack): rename to
        `group_in_compile_stack'; remove unnecessary test for the stack
        being empty.
        * regex.c (re_match_2): in on_failure_jump, skip no_op's before
        checking for the start_memory, in case we were called from
        succeed_n.

Sun Dec 15 16:20:48 1991  Karl Berry  (karl at hayley)

        * regex.c (regex_compile): in duplicate case, use
        highest_ever_regnum instead of regnum, since the latter is
        reverted at stop_memory.