ChangeLog for PCRE (from the php-4.4.7 source tree)

      #include "pcre_internal.h"

    to pcre_chartables.c because without it, gcc 4.x may remove the array
    definition from the final binary if PCRE is built into a static library and
    dead code stripping is activated.

46. For an unanchored pattern, if a match attempt fails at the start of a
    newline sequence, and the newline setting is CRLF or ANY, and the next two
    characters are CRLF, advance by two characters instead of one.


Version 6.7 04-Jul-06
---------------------

 1. In order to handle tests when input lines are enormously long, pcretest has
    been re-factored so that it automatically extends its buffers when
    necessary. The code is crude, but this _is_ just a test program. The
    default size has been increased from 32K to 50K.

 2. The code in pcre_study() was using the value of the re argument before
    testing it for NULL. (Of course, in any sensible call of the function, it
    won't be NULL.)

 3. The memmove() emulation function in pcre_internal.h, which is used on
    systems that lack both memmove() and bcopy() - that is, hardly ever -
    was missing a "static" storage class specifier.

 4. When UTF-8 mode was not set, PCRE looped when compiling certain patterns
    containing an extended class (one that cannot be represented by a bitmap
    because it contains high-valued characters or Unicode property items, e.g.
    [\pZ]). Almost always one would set UTF-8 mode when processing such a
    pattern, but PCRE should not loop if you do not (it no longer does).
    [Detail: two cases were found: (a) a repeated subpattern containing an
    extended class; (b) a recursive reference to a subpattern that followed a
    previous extended class. It wasn't skipping over the extended class
    correctly when UTF-8 mode was not set.]

 5. A negated single-character class was not being recognized as fixed-length
    in lookbehind assertions such as (?<=[^f]), leading to an incorrect
    compile error "lookbehind assertion is not fixed length".

 6. The RunPerlTest auxiliary script was showing an unexpected difference
    between PCRE and Perl for UTF-8 tests. It turns out that it is hard to
    write a Perl script that can interpret lines of an input file either as
    byte characters or as UTF-8, which is what "perltest" was being required to
    do for the non-UTF-8 and UTF-8 tests, respectively. Essentially what you
    can't do is switch easily at run time between having the "use utf8;" pragma
    or not. In the end, I fudged it by using the RunPerlTest script to insert
    "use utf8;" explicitly for the UTF-8 tests.

 7. In multiline (/m) mode, PCRE was matching ^ after a terminating newline at
    the end of the subject string, contrary to the documentation and to what
    Perl does. This was true of both matching functions. Now it matches only at
    the start of the subject and immediately after *internal* newlines.

 8. A call of pcre_fullinfo() from pcretest to get the option bits was passing
    a pointer to an int instead of a pointer to an unsigned long int. This
    caused problems on 64-bit systems.

 9. Applied a patch from the folks at Google to pcrecpp.cc, to fix "another
    instance of the 'standard' template library not being so standard".

10. There was no check on the number of named subpatterns nor the maximum
    length of a subpattern name. The product of these values is used to compute
    the size of the memory block for a compiled pattern. By supplying a very
    long subpattern name and a large number of named subpatterns, the size
    computation could be caused to overflow. This is now prevented by limiting
    the length of names to 32 characters, and the number of named subpatterns
    to 10,000.

11. Subpatterns that are repeated with specific counts have to be replicated in
    the compiled pattern. The size of memory for this was computed from the
    length of the subpattern and the repeat count. The latter is limited to
    65535, but there was no limit on the former, meaning that integer overflow
    could in principle occur. The compiled length of a repeated subpattern is
    now limited to 30,000 bytes in order to prevent this.

12. Added the optional facility to have named substrings with the same name.

13. Added the ability to use a named substring as a condition, using the
    Python syntax: (?(name)yes|no). This overloads (?(R)... and names that
    are numbers (not recommended). Forward references are permitted.

14. Added forward references in named backreferences (if you see what I mean).

15. In UTF-8 mode, with the PCRE_DOTALL option set, a quantified dot in the
    pattern could run off the end of the subject. For example, the pattern
    "(?s)(.{1,5})"8 did this with the subject "ab".

16. If PCRE_DOTALL or PCRE_MULTILINE were set, pcre_dfa_exec() behaved as if
    PCRE_CASELESS was set when matching characters that were quantified with ?
    or *.

17. A character class other than a single negated character that had a minimum
    but no maximum quantifier - for example [ab]{6,} - was not handled
    correctly by pcre_dfa_exec(). It would match only one character.

18. A valid (though odd) pattern that looked like a POSIX character
    class but used an invalid character after [ (for example [[,abc,]]) caused
    pcre_compile() to give the error "Failed: internal error: code overflow" or
    in some cases to crash with a glibc free() error. This could even happen if
    the pattern terminated after [[ but there just happened to be a sequence of
    letters, a binary zero, and a closing ] in the memory that followed.

19. Perl's treatment of octal escapes in the range \400 to \777 has changed
    over the years. Originally (before any Unicode support), just the bottom 8
    bits were taken. Thus, for example, \500 really meant \100. Nowadays the
    output from "man perlunicode" includes this:

      The regular expression compiler produces polymorphic opcodes.  That
      is, the pattern adapts to the data and automatically switches to
      the Unicode character scheme when presented with Unicode data--or
      instead uses a traditional byte scheme when presented with byte
      data.

    Sadly, a wide octal escape does not cause a switch, and in a string with
    no other multibyte characters, these octal escapes are treated as before.
    Thus, in Perl, the pattern  /\500/ actually matches \100 but the pattern
    /\500|\x{1ff}/ matches \500 or \777 because the whole thing is treated as a
    Unicode string.

    I have not perpetrated such confusion in PCRE. Up till now, it took just
    the bottom 8 bits, as in old Perl. I have now made octal escapes with
    values greater than \377 illegal in non-UTF-8 mode. In UTF-8 mode they
    translate to the appropriate multibyte character.

20. Applied some refactoring to reduce the number of warnings from Microsoft
    and Borland compilers. This has included removing the fudge introduced
    seven years ago for the OS/2 compiler (see 2.02/2 below) because it caused
    a warning about an unused variable.

21. PCRE has not included VT (character 0x0b) in the set of whitespace
    characters since release 4.0, because Perl (from release 5.004) does not.
    [Or at least, is documented not to: some releases seem to be in conflict
    with the documentation.] However, when a pattern was studied with
    pcre_study() and all its branches started with \s, PCRE still included VT
    as a possible starting character. Of course, this did no harm; it just
    caused an unnecessary match attempt.

22. Removed a now-redundant internal flag bit that recorded the fact that case
    dependency changed within the pattern. This was once needed for "required
    byte" processing, but is no longer used. This recovers a now-scarce options
    bit. Also moved the least significant internal flag bit to the
    most-significant bit of the word, which was not previously used (hangover
    from the days when it was an int rather than a uint) to free up another bit
    for the future.

23. Added support for CRLF line endings as well as CR and LF. As well as the
    default being selectable at build time, it can now be changed at runtime
    via the PCRE_NEWLINE_xxx flags. There are now options for pcregrep to
    specify that it is scanning data with non-default line endings.

24. Changed the definition of CXXLINK to make it agree with the definition of
    LINK in the Makefile, by replacing LDFLAGS to CXXFLAGS.

25. Applied Ian Taylor's patches to avoid using another stack frame for tail
    recursions. This makes a big difference to stack usage for some patterns.

26. If a subpattern containing a named recursion or subroutine reference such
    as (?P>B) was quantified, for example (xxx(?P>B)){3}, the calculation of
    the space required for the compiled pattern went wrong and gave too small a
    value. Depending on the environment, this could lead to "Failed: internal
    error: code overflow at offset 49" or "glibc detected double free or
    corruption" errors.

27. Applied patches from Google (a) to support the new newline modes and (b) to
    advance over multibyte UTF-8 characters in GlobalReplace.

28. Change free() to pcre_free() in pcredemo.c. Apparently this makes a
    difference for some implementation of PCRE in some Windows version.

29. Added some extra testing facilities to pcretest:

    \q<number>   in a data line sets the "match limit" value
    \Q<number>   in a data line sets the "match recursion limit" value
    -S <number>  sets the stack size, where <number> is in megabytes

    The -S option isn't available for Windows.


Version 6.6 06-Feb-06
---------------------

 1. Change 16(a) for 6.5 broke things, because PCRE_DATA_SCOPE was not defined
    in pcreposix.h. I have copied the definition from pcre.h.

 2. Change 25 for 6.5 broke compilation in a build directory out-of-tree
    because pcre.h is no longer a built file.

 3. Added Jeff Friedl's additional debugging patches to pcregrep. These are
    not normally included in the compiled code.


Version 6.5 01-Feb-06
---------------------

 1. When using the partial match feature with pcre_dfa_exec(), it was not
    anchoring the second and subsequent partial matches at the new starting
    point. This could lead to incorrect results. For example, with the pattern
    /1234/, partially matching against "123" and then "a4" gave a match.

 2. Changes to pcregrep:

    (a) All non-match returns from pcre_exec() were being treated as failures
        to match the line. Now, unless the error is PCRE_ERROR_NOMATCH, an
        error message is output. Some extra information is given for the
        PCRE_ERROR_MATCHLIMIT and PCRE_ERROR_RECURSIONLIMIT errors, which are
        probably the only errors that are likely to be caused by users (by
        specifying a regex that has nested indefinite repeats, for instance).
        If there are more than 20 of these errors, pcregrep is abandoned.

    (b) A binary zero was treated as data while matching, but terminated the
        output line if it was written out. This has been fixed: binary zeroes
        are now no different to any other data bytes.

    (c) Whichever of the LC_ALL or LC_CTYPE environment variables is set is
        used to set a locale for matching. The --locale=xxxx long option has
        been added (no short equivalent) to specify a locale explicitly on the
        pcregrep command, overriding the environment variables.

    (d) When -B was used with -n, some line numbers in the output were one less
        than they should have been.

    (e) Added the -o (--only-matching) option.

    (f) If -A or -C was used with -c (count only), some lines of context were
        accidentally printed for the final match.

    (g) Added the -H (--with-filename) option.

    (h) The combination of options -rh failed to suppress file names for files
        that were found from directory arguments.

    (i) Added the -D (--devices) and -d (--directories) options.

    (j) Added the -F (--fixed-strings) option.

    (k) Allow "-" to be used as a file name for -f as well as for a data file.

    (l) Added the --colo(u)r option.

    (m) Added Jeffrey Friedl's -S testing option, but within #ifdefs so that it
        is not present by default.

 3. A nasty bug was discovered in the handling of recursive patterns, that is,
    items such as (?R) or (?1), when the recursion could match a number of
    alternatives. If it matched one of the alternatives, but subsequently,
    outside the recursion, there was a failure, the code tried to back up into
    the recursion. However, because of the way PCRE is implemented, this is not
    possible, and the result was an incorrect result from the match.

    In order to prevent this happening, the specification of recursion has
    been changed so that all such subpatterns are automatically treated as
    atomic groups. Thus, for example, (?R) is treated as if it were (?>(?R)).

 4. I had overlooked the fact that, in some locales, there are characters for
    which isalpha() is true but neither isupper() nor islower() are true. In
    the fr_FR locale, for instance, the \xAA and \xBA characters (ordfeminine
    and ordmasculine) are like this. This affected the treatment of \w and \W
    when they appeared in character classes, but not when they appeared outside
    a character class. The bit map for "word" characters is now created
    separately from the results of isalnum() instead of just taking it from the
    upper, lower, and digit maps. (Plus the underscore character, of course.)
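
    Illustration for item 4 of the 6.5 changes (a minimal sketch, not taken
    from the PCRE source): one way to build a 256-bit "word" map directly from
    isalnum() plus underscore, so that a character for which isalpha() is true
    but isupper() and islower() are both false is still counted. The names
    MAP_BYTES, build_word_map and is_word_char are hypothetical, and PCRE's
    real table-building code is laid out differently.

```c
#include <assert.h>
#include <ctype.h>
#include <string.h>

/* Hypothetical names for illustration only. */
enum { MAP_BYTES = 32 };   /* 32 bytes = 256 bits, one bit per byte value */

/* Set one bit for every byte value that isalnum() accepts in the current
   locale, plus the underscore. Asking isalnum() directly avoids missing
   letters that are neither upper case nor lower case. */
static void build_word_map(unsigned char map[MAP_BYTES])
{
    int c;
    assert(map != NULL);
    memset(map, 0, MAP_BYTES);
    for (c = 0; c < 256; c++) {
        if (isalnum(c) || c == '_')
            map[c / 8] |= (unsigned char)(1u << (c % 8));
    }
}

/* Return 1 if byte value c is marked as a word character, otherwise 0. */
static int is_word_char(const unsigned char map[MAP_BYTES], unsigned char c)
{
    return (map[c / 8] >> (c % 8)) & 1;
}
```

    A caller would select a locale with setlocale() first if it wants
    anything other than the default "C" locale, then call build_word_map()
    once and test individual bytes with is_word_char().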