📄 tests
# regular expression test set
# Lines are at least three fields, separated by one or more tabs. "" stands
# for an empty field. First field is an RE. Second field is flags. If
# C flag given, regcomp() is expected to fail, and the third field is the
# error name (minus the leading REG_).
#
# Otherwise it is expected to succeed, and the third field is the string to
# try matching it against. If there is no fourth field, the match is
# expected to fail. If there is a fourth field, it is the substring that
# the RE is expected to match. If there is a fifth field, it is a comma-
# separated list of what the subexpressions should match, with - indicating
# no match for that one. In both the fourth and fifth fields, a (sub)field
# starting with @ indicates that the (sub)expression is expected to match
# a null string followed by the stuff after the @; this provides a way to
# test where null strings match. The character `N' in REs and strings
# is newline, `S' is space, `T' is tab, `Z' is NUL.
#
# The full list of flags:
#	-	placeholder, does nothing
#	b	RE is a BRE, not an ERE
#	&	try it as both an ERE and a BRE
#	C	regcomp() error expected, third field is error name
#	i	REG_ICASE
#	m	("mundane") REG_NOSPEC
#	s	REG_NOSUB (not really testable)
#	n	REG_NEWLINE
#	^	REG_NOTBOL
#	$	REG_NOTEOL
#	#	REG_STARTEND (see below)
#	p	REG_PEND
#
# For REG_STARTEND, the start/end offsets are those of the substring
# enclosed in ().

# basics
a	&	a	a
abc	&	abc	abc
abc|de	-	abc	abc
a|b|c	-	abc	a

# parentheses and perversions thereof
a(b)c	-	abc	abc
a\(b\)c	b	abc	abc
a(	C	EPAREN
a(	b	a(	a(
a\(	-	a(	a(
a\(	bC	EPAREN
a\(b	bC	EPAREN
a(b	C	EPAREN
a(b	b	a(b	a(b
# gag me with a right parenthesis -- 1003.2 goofed here (my fault, partly)
a)	-	a)	a)
)	-	)	)
# end gagging (in a just world, those *should* give EPAREN)
a)	b	a)	a)
a\)	bC	EPAREN
\)	bC	EPAREN
a()b	-	ab	ab
a\(\)b	b	ab	ab

# anchoring and REG_NEWLINE
^abc$	&	abc	abc
a^b	-	a^b
a^b	b	a^b	a^b
a$b	-	a$b
a$b	b	a$b	a$b
^	&	abc	@abc
$	&	abc	@
^$	&	""	@
$^	-	""	@
\($\)\(^\)	b	""	@
# stop retching, those are legitimate (although disgusting)
^^	-	""	@
$$	-	""	@
b$	&	abNc
b$	&n	abNc	b
^b$	&	aNbNc
^b$	&n	aNbNc	b
^$	&n	aNNb	@Nb
^$	n	abc
^$	n	abcN	@
$^	n	aNNb	@Nb
\($\)\(^\)	bn	aNNb	@Nb
^^	n^	aNNb	@Nb
$$	n	aNNb	@NN
^a	^	a
a$	$	a
^a	^n	aNb
^b	^n	aNb	b
a$	$n	bNa
b$	$n	bNa	b
a*(^b$)c*	-	b	b
a*\(^b$\)c*	b	b	b

# certain syntax errors and non-errors
|	C	EMPTY
|	b	|	|
*	C	BADRPT
*	b	*	*
+	C	BADRPT
?	C	BADRPT
""	&C	EMPTY
()	-	abc	@abc
\(\)	b	abc	@abc
a||b	C	EMPTY
|ab	C	EMPTY
ab|	C	EMPTY
(|a)b	C	EMPTY
(a|)b	C	EMPTY
(*a)	C	BADRPT
(+a)	C	BADRPT
(?a)	C	BADRPT
({1}a)	C	BADRPT
\(\{1\}a\)	bC	BADRPT
(a|*b)	C	BADRPT
(a|+b)	C	BADRPT
(a|?b)	C	BADRPT
(a|{1}b)	C	BADRPT
^*	C	BADRPT
^*	b	*	*
^+	C	BADRPT
^?	C	BADRPT
^{1}	C	BADRPT
^\{1\}	bC	BADRPT

# metacharacters, backslashes
a.c	&	abc	abc
a[bc]d	&	abd	abd
a\*c	&	a*c	a*c
a\\b	&	a\b	a\b
a\\\*b	&	a\*b	a\*b
a\bc	&	abc	abc
a\	&C	EESCAPE
a\\bc	&	a\bc	a\bc
\{	bC	BADRPT
a\[b	&	a[b	a[b
a[b	&C	EBRACK
# trailing $ is a peculiar special case for the BRE code
a$	&	a	a
a$	&	a$
a\$	&	a
a\$	&	a$	a$
a\\$	&	a
a\\$	&	a$
a\\$	&	a\$
a\\$	&	a\	a\

# back references, ugh
a\(b\)\2c	bC	ESUBREG
a\(b\1\)c	bC	ESUBREG
a\(b*\)c\1d	b	abbcbbd	abbcbbd	bb
a\(b*\)c\1d	b	abbcbd
a\(b*\)c\1d	b	abbcbbbd
^\(.\)\1	b	abc
a\([bc]\)\1d	b	abcdabbd	abbd	b
a\(\([bc]\)\2\)*d	b	abbccd	abbccd
a\(\([bc]\)\2\)*d	b	abbcbd
# actually, this next one probably ought to fail, but the spec is unclear
a\(\(b\)*\2\)*d	b	abbbd	abbbd
# here is a case that no NFA implementation does right
\(ab*\)[ab]*\1	b	ababaaa	ababaaa	a
# check out normal matching in the presence of back refs
\(a\)\1bcd	b	aabcd	aabcd
\(a\)\1bc*d	b	aabcd	aabcd
\(a\)\1bc*d	b	aabd	aabd
\(a\)\1bc*d	b	aabcccd	aabcccd
\(a\)\1bc*[ce]d	b	aabcccd	aabcccd
^\(a\)\1b\(c\)*cd$	b	aabcccd	aabcccd

# ordinary repetitions
ab*c	&	abc	abc
ab+c	-	abc	abc
ab?c	-	abc	abc
a\(*\)b	b	a*b	a*b
a\(**\)b	b	ab	ab
a\(***\)b	bC	BADRPT
*a	b	*a	*a
**a	b	a	a
***a	bC	BADRPT

# the dreaded bounded repetitions
{	&	{	{
{abc	&	{abc	{abc
{1	C	BADRPT
{1}	C	BADRPT
a{b	&	a{b	a{b
a{1}b	-	ab	ab
a\{1\}b	b	ab	ab
a{1,}b	-	ab	ab
a\{1,\}b	b	ab	ab
a{1,2}b	-	aab	aab
a\{1,2\}b	b	aab	aab
a{1	C	EBRACE
a\{1	bC	EBRACE
a{1a	C	EBRACE
a\{1a	bC	EBRACE
a{1a}	C	BADBR
a\{1a\}	bC	BADBR
a{,2}	-	a{,2}	a{,2}
a\{,2\}	bC	BADBR
a{,}	-	a{,}	a{,}
a\{,\}	bC	BADBR
a{1,x}	C	BADBR
a\{1,x\}	bC	BADBR
a{1,x	C	EBRACE
a\{1,x	bC	EBRACE
a{300}	C	BADBR
a\{300\}	bC	BADBR
a{1,0}	C	BADBR
a\{1,0\}	bC	BADBR
ab{0,0}c	-	abcac	ac
ab\{0,0\}c	b	abcac	ac
ab{0,1}c	-	abcac	abc
ab\{0,1\}c	b	abcac	abc
ab{0,3}c	-	abbcac	abbc
ab\{0,3\}c	b	abbcac	abbc
ab{1,1}c	-	acabc	abc
ab\{1,1\}c	b	acabc	abc
ab{1,3}c	-	acabc	abc
ab\{1,3\}c	b	acabc	abc
ab{2,2}c	-	abcabbc	abbc
ab\{2,2\}c	b	abcabbc	abbc
ab{2,4}c	-	abcabbc	abbc
ab\{2,4\}c	b	abcabbc	abbc
((a{1,10}){1,10}){1,10}	-	a	a	a,a

# multiple repetitions
a**	&C	BADRPT
a++	C	BADRPT
a??	C	BADRPT
a*+	C	BADRPT
a*?	C	BADRPT
a+*	C	BADRPT
a+?	C	BADRPT
a?*	C	BADRPT
a?+	C	BADRPT
a{1}{1}	C	BADRPT
a*{1}	C	BADRPT
a+{1}	C	BADRPT
a?{1}	C	BADRPT
a{1}*	C	BADRPT
a{1}+	C	BADRPT
a{1}?	C	BADRPT
a*{b}	-	a{b}	a{b}
a\{1\}\{1\}	bC	BADRPT
a*\{1\}	bC	BADRPT
a\{1\}*	bC	BADRPT

# brackets, and numerous perversions thereof
a[b]c	&	abc	abc
a[ab]c	&	abc	abc
a[^ab]c	&	adc	adc
a[]b]c	&	a]c	a]c
a[[b]c	&	a[c	a[c
a[-b]c	&	a-c	a-c
a[^]b]c	&	adc	adc
a[^-b]c	&	adc	adc
a[b-]c	&	a-c	a-c
a[b	&C	EBRACK
a[]	&C	EBRACK
a[1-3]c	&	a2c	a2c
a[3-1]c	&C	ERANGE
a[1-3-5]c	&C	ERANGE
a[[.-.]--]c	&	a-c	a-c
a[1-	&C	ERANGE
a[[.	&C	EBRACK
a[[.x	&C	EBRACK
a[[.x.	&C	EBRACK
a[[.x.]	&C	EBRACK
a[[.x.]]	&	ax	ax
a[[.x,.]]	&C	ECOLLATE
a[[.one.]]b	&	a1b	a1b
a[[.notdef.]]b	&C	ECOLLATE
a[[.].]]b	&	a]b	a]b
a[[:alpha:]]c	&	abc	abc
a[[:notdef:]]c	&C	ECTYPE
a[[:	&C	EBRACK
a[[:alpha	&C	EBRACK
a[[:alpha:]	&C	EBRACK
a[[:alpha,:]	&C	ECTYPE
a[[:]:]]b	&C	ECTYPE
a[[:-:]]b	&C	ECTYPE
a[[:alph:]]	&C	ECTYPE
a[[:alphabet:]]	&C	ECTYPE
[[:alnum:]]+	-	-%@a0X-	a0X
[[:alpha:]]+	-	-%@aX0-	aX
[[:blank:]]+	-	aSSTb	SST
[[:cntrl:]]+	-	aNTb	NT
[[:digit:]]+	-	a019b	019
[[:graph:]]+	-	Sa%bS	a%b
[[:lower:]]+	-	AabC	ab
[[:print:]]+	-	NaSbN	aSb
[[:punct:]]+	-	S%-&T	%-&
[[:space:]]+	-	aSNTb	SNT
[[:upper:]]+	-	aBCd	BC
[[:xdigit:]]+	-	p0f3Cq	0f3C
a[[=b=]]c	&	abc	abc
a[[=	&C	EBRACK
a[[=b	&C	EBRACK
a[[=b=	&C	EBRACK
a[[=b=]	&C	EBRACK
a[[=b,=]]	&C	ECOLLATE
a[[=one=]]b	&	a1b	a1b

# complexities
a(((b)))c	-	abc	abc
a(b|(c))d	-	abd	abd
a(b*|c)d	-	abbd	abbd
# just gotta have one DFA-buster, of course
a[ab]{20}	-	aaaaabaaaabaaaabaaaab	aaaaabaaaabaaaabaaaab
# and an inline expansion in case somebody gets tricky
a[ab][ab][ab][ab][ab][ab][ab][ab][ab][ab][ab][ab][ab][ab][ab][ab][ab][ab][ab][ab]	-	aaaaabaaaabaaaabaaaab	aaaaabaaaabaaaabaaaab
# and in case somebody just slips in an NFA...
a[ab][ab][ab][ab][ab][ab][ab][ab][ab][ab][ab][ab][ab][ab][ab][ab][ab][ab][ab][ab](wee|week)(knights|night)	-	aaaaabaaaabaaaabaaaabweeknights	aaaaabaaaabaaaabaaaabweeknights
# fish for anomalies as the number of states passes 32
12345678901234567890123456789	-	a12345678901234567890123456789b	12345678901234567890123456789
123456789012345678901234567890	-	a123456789012345678901234567890b	123456789012345678901234567890
1234567890123456789012345678901	-	a1234567890123456789012345678901b	1234567890123456789012345678901
12345678901234567890123456789012	-	a12345678901234567890123456789012b	12345678901234567890123456789012
123456789012345678901234567890123	-	a123456789012345678901234567890123b	123456789012345678901234567890123
# and one really big one, beyond any plausible word width
1234567890123456789012345678901234567890123456789012345678901234567890	-	a1234567890123456789012345678901234567890123456789012345678901234567890b	1234567890123456789012345678901234567890123456789012345678901234567890
# fish for problems as brackets go past 8
[ab][cd][ef][gh][ij][kl][mn]	-	xacegikmoq	acegikm
[ab][cd][ef][gh][ij][kl][mn][op]	-	xacegikmoq	acegikmo
[ab][cd][ef][gh][ij][kl][mn][op][qr]	-	xacegikmoqy	acegikmoq
[ab][cd][ef][gh][ij][kl][mn][op][q]	-	xacegikmoqy	acegikmoq

# subtleties of matching
abc	&	xabcy	abc
a\(b\)?c\1d	b	acd
aBc	i	Abc	Abc
a[Bc]*d	i	abBCcd	abBCcd
0[[:upper:]]1	&i	0a1	0a1
0[[:lower:]]1	&i	0A1	0A1
a[^b]c	&i	abc
a[^b]c	&i	aBc
a[^b]c	&i	adc	adc
[a]b[c]	-	abc	abc
[a]b[a]	-	aba	aba
[abc]b[abc]	-	abc	abc
[abc]b[abd]	-	abd	abd
a(b?c)+d	-	accd	accd
(wee|week)(knights|night)	-	weeknights	weeknights
(we|wee|week|frob)(knights|night|day)	-	weeknights	weeknights
a[bc]d	-	xyzaaabcaababdacd	abd
a[ab]c	-	aaabc	abc
abc	s	abc	abc
a*	&	b	@b

# Let's have some fun -- try to match a C comment.
# first the obvious, which looks okay at first glance...
/\*.*\*/	-	/*x*/	/*x*/
# but...
/\*.*\*/	-	/*x*/y/*z*/	/*x*/y/*z*/
# okay, we must not match */ inside; try to do that...
/\*([^*]|\*[^/])*\*/	-	/*x*/	/*x*/
/\*([^*]|\*[^/])*\*/	-	/*x*/y/*z*/	/*x*/
# but...
/\*([^*]|\*[^/])*\*/	-	/*x**/y/*z*/	/*x**/y/*z*/
# and a still fancier version, which does it right (I think)...
/\*([^*]|\*+[^*/])*\*+/	-	/*x*/	/*x*/
/\*([^*]|\*+[^*/])*\*+/	-	/*x*/y/*z*/	/*x*/
/\*([^*]|\*+[^*/])*\*+/	-	/*x**/y/*z*/	/*x**/
/\*([^*]|\*+[^*/])*\*+/	-	/*x****/y/*z*/	/*x****/
/\*([^*]|\*+[^*/])*\*+/	-	/*x**x*/y/*z*/	/*x**x*/
/\*([^*]|\*+[^*/])*\*+/	-	/*x***x/y/*z*/	/*x***x/y/*z*/

# subexpressions
.*	-	abc	abc	-
a(b)(c)d	-	abcd	abcd	b,c
a(((b)))c	-	abc	abc	b,b,b
a(b|(c))d	-	abd	abd	b,-
a(b*|c|e)d	-	abbd	abbd	bb
a(b*|c|e)d	-	acd	acd	c
a(b*|c|e)d	-	ad	ad	@d
a(b?)c	-	abc	abc	b
a(b?)c	-	ac	ac	@c
a(b+)c	-	abc	abc	b
a(b+)c	-	abbbc	abbbc	bbb
a(b*)c	-	ac	ac	@c
(a|ab)(bc([de]+)f|cde)	-	abcdef	abcdef	a,bcdef,de
# the regression tester only asks for 9 subexpressions
a(b)(c)(d)(e)(f)(g)(h)(i)(j)k	-	abcdefghijk	abcdefghijk	b,c,d,e,f,g,h,i,j
a(b)(c)(d)(e)(f)(g)(h)(i)(j)(k)l	-	abcdefghijkl	abcdefghijkl	b,c,d,e,f,g,h,i,j,k
a([bc]?)c	-	abc	abc	b
a([bc]?)c	-	ac	ac	@c
a([bc]+)c	-	abc	abc	b
a([bc]+)c	-	abcc	abcc	bc
a([bc]+)bc	-	abcbc	abcbc	bc
a(bb+|b)b	-	abb	abb	b
a(bbb+|bb+|b)b	-	abb	abb	b
a(bbb+|bb+|b)b	-	abbb	abbb	bb
a(bbb+|bb+|b)bb	-	abbb	abbb	b
(.*).*	-	abcdef	abcdef	abcdef
(a*)*	-	bc	@b	@b

# do we get the right subexpression when it is used more than once?
a(b|c)*d	-	ad	ad	-
a(b|c)*d	-	abcd	abcd	c
a(b|c)+d	-	abd	abd	b
a(b|c)+d	-	abcd	abcd	c
a(b|c?)+d	-	ad	ad	@d
a(b|c?)+d	-	abcd	abcd	@d
a(b|c){0,0}d	-	ad	ad	-
a(b|c){0,1}d	-	ad	ad	-
a(b|c){0,1}d	-	abd	abd	b
a(b|c){0,2}d	-	ad	ad	-
a(b|c){0,2}d	-	abcd	abcd	c
a(b|c){0,}d	-	ad	ad	-
a(b|c){0,}d	-	abcd	abcd	c
a(b|c){1,1}d	-	abd	abd	b
a(b|c){1,1}d	-	acd	acd	c
a(b|c){1,2}d	-	abd	abd	b
a(b|c){1,2}d	-	abcd	abcd	c
a(b|c){1,}d	-	abd	abd	b
a(b|c){1,}d	-	abcd	abcd	c
a(b|c){2,2}d	-	acbd	acbd	b
a(b|c){2,2}d	-	abcd	abcd	c
a(b|c){2,4}d	-	abcd	abcd	c
a(b|c){2,4}d	-	abcbd	abcbd	b
a(b|c){2,4}d	-	abcbcd	abcbcd	c
a(b|c){2,}d	-	abcd	abcd	c
a(b|c){2,}d	-	abcbd	abcbd	b
a(b+|((c)*))+d	-	abd	abd	@d,@d,-
a(b+|((c)*))+d	-	abcd	abcd	@d,@d,-

# check out the STARTEND option
[abc]	&#	a(b)c	b
[abc]	&#	a(d)c
[abc]	&#	a(bc)d	b
[abc]	&#	a(dc)d	c
.	&#	a()c
b.*c	&#	b(bc)c	bc
b.*	&#	b(bc)c	bc
.*c	&#	b(bc)c	bc

# plain strings, with the NOSPEC flag
abc	m	abc	abc
abc	m	xabcy	abc
abc	m	xyz
a*b	m	aba*b	a*b
a*b	m	ab
""	mC	EMPTY

# cases involving NULs
aZb	&	a	a
aZb	&p	a
aZb	&p#	(aZb)	aZb
aZ*b	&p#	(ab)	ab
a.b	&#	(aZb)	aZb
a.*	&#	(aZb)c	aZb

# word boundaries (ick)
[[:<:]]a	&	a	a
[[:<:]]a	&	ba
[[:<:]]a	&	-a	a
a[[:>:]]	&	a	a
a[[:>:]]	&	ab
a[[:>:]]	&	a-	a
[[:<:]]a.c[[:>:]]	&	axcd-dayc-dazce-abc	abc
[[:<:]]a.c[[:>:]]	&	axcd-dayc-dazce-abc-q	abc
[[:<:]]a.c[[:>:]]	&	axc-dayc-dazce-abc	axc
[[:<:]]b.c[[:>:]]	&	a_bxc-byc_d-bzc-q	bzc
[[:<:]].x..[[:>:]]	&	y_xa_-_xb_y-_xc_-axdc	_xc_
[[:<:]]a_b[[:>:]]	&	x_a_b

# past problems, and suspected problems
(A[1])|(A[2])|(A[3])|(A[4])|(A[5])|(A[6])|(A[7])|(A[8])|(A[9])|(A[A])	-	A1	A1
abcdefghijklmnop	i	abcdefghijklmnop	abcdefghijklmnop
abcdefghijklmnopqrstuv	i	abcdefghijklmnopqrstuv	abcdefghijklmnopqrstuv
(ALAK)|(ALT[AB])|(CC[123]1)|(CM[123]1)|(GAMC)|(LC[23][EO ])|(SEM[1234])|(SL[ES][12])|(SLWW)|(SLF )|(SLDT)|(VWH[12])|(WH[34][EW])|(WP1[ESN])	-	CC11	CC11
CC[13]1|a{21}[23][EO][123][Es][12]a{15}aa[34][EW]aaaaaaa[X]a	-	CC11	CC11
Char \([a-z0-9_]*\)\[.*	b	Char xyz[k	Char xyz[k	xyz
a?b	-	ab	ab
-\{0,1\}[0-9]*$	b	-5	-5
a*a*a*a*a*a*a*	&	aaaaaa	aaaaaa
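
The field layout described in the header comment can be read mechanically. The following is a minimal Python sketch of such a reader, not the regression tester that ships with this test set; the names `RegexTest`, `parse_test_line`, and `expand` are hypothetical and introduced only for illustration. It assumes that fields are split on runs of tabs, that blank lines and lines starting with `#` are skipped, and that the `N`, `S`, `T`, `Z` substitutions are applied separately by the caller to the RE and string fields.

```python
import re
from dataclasses import dataclass, field
from typing import List, Optional

# Substitutions named in the header: N = newline, S = space, T = tab, Z = NUL.
_ESCAPES = str.maketrans({"N": "\n", "S": " ", "T": "\t", "Z": "\0"})


@dataclass
class RegexTest:
    """One parsed test line (hypothetical container, not part of the test set)."""
    pattern: str
    flags: str
    expect_error: Optional[str] = None   # error name without REG_, when flag C is set
    subject: Optional[str] = None        # string to match against
    match: Optional[str] = None          # expected matched substring; None = must fail
    submatches: List[str] = field(default_factory=list)


def expand(text: str) -> str:
    """Apply the N/S/T/Z substitutions to an RE or subject string."""
    return text.translate(_ESCAPES)


def parse_test_line(line: str) -> Optional[RegexTest]:
    """Parse one line; return None for comments and blank lines."""
    line = line.rstrip("\n")
    if not line or line.startswith("#"):
        return None
    # One or more tabs separate fields; "" stands for an empty field.
    fields = ["" if f == '""' else f for f in re.split(r"\t+", line)]
    if len(fields) < 3:
        raise ValueError(f"expected at least three fields: {line!r}")
    pattern, flags, third = fields[0], fields[1], fields[2]
    if "C" in flags:
        # regcomp() is expected to fail; the third field names the error.
        return RegexTest(pattern, flags, expect_error=third)
    match = fields[3] if len(fields) > 3 else None
    subs = fields[4].split(",") if len(fields) > 4 else []
    return RegexTest(pattern, flags, subject=third, match=match, submatches=subs)
```

A caller would feed each line of the file to `parse_test_line`, skip the `None` results, and then interpret `flags`, the `@` prefix, and the `-` subexpression marker as the header describes; that interpretation is deliberately left out of this sketch.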