⭐ 欢迎来到虫虫下载站! | 📦 资源下载 📁 资源专辑 ℹ️ 关于我们
⭐ 虫虫下载站

📄 parsewords.pm

📁 Astercon2 开源软交换 2.2.0
💻 PM
字号:
package Text::ParseWords;use vars qw($VERSION @ISA @EXPORT $PERL_SINGLE_QUOTE);$VERSION = "3.24";require 5.000;use Exporter;@ISA = qw(Exporter);@EXPORT = qw(shellwords quotewords nested_quotewords parse_line);@EXPORT_OK = qw(old_shellwords);sub shellwords {    my(@lines) = @_;    $lines[$#lines] =~ s/\s+$//;    return(quotewords('\s+', 0, @lines));}sub quotewords {    my($delim, $keep, @lines) = @_;    my($line, @words, @allwords);    foreach $line (@lines) {	@words = parse_line($delim, $keep, $line);	return() unless (@words || !length($line));	push(@allwords, @words);    }    return(@allwords);}sub nested_quotewords {    my($delim, $keep, @lines) = @_;    my($i, @allwords);    for ($i = 0; $i < @lines; $i++) {	@{$allwords[$i]} = parse_line($delim, $keep, $lines[$i]);	return() unless (@{$allwords[$i]} || !length($lines[$i]));    }    return(@allwords);}sub parse_line {    my($delimiter, $keep, $line) = @_;    my($word, @pieces);    no warnings 'uninitialized';	# we will be testing undef strings    while (length($line)) {	$line =~ s/^(["'])			# a $quote        	    ((?:\\.|(?!\1)[^\\])*)	# and $quoted text		    \1				# followed by the same quote		   |				# --OR--		   ^((?:\\.|[^\\"'])*?)		# an $unquoted text		    (\Z(?!\n)|(?-x:$delimiter)|(?!^)(?=["']))  		    				# plus EOL, delimiter, or quote		  //xs or return;		# extended layout	my($quote, $quoted, $unquoted, $delim) = ($1, $2, $3, $4);	return() unless( defined($quote) || length($unquoted) || length($delim));        if ($keep) {	    $quoted = "$quote$quoted$quote";	}        else {	    $unquoted =~ s/\\(.)/$1/sg;	    if (defined $quote) {		$quoted =~ s/\\(.)/$1/sg if ($quote eq '"');		$quoted =~ s/\\([\\'])/$1/g if ( $PERL_SINGLE_QUOTE && $quote eq "'");            }	}        $word .= substr($line, 0, 0);	# leave results tainted        $word .= defined $quote ? $quoted : $unquoted;         if (length($delim)) {            push(@pieces, $word);            push(@pieces, $delim) if ($keep eq 'delimiters');            undef $word;        }        if (!length($line)) {            push(@pieces, $word);	}    }    return(@pieces);}sub old_shellwords {    # Usage:    #	use ParseWords;    #	@words = old_shellwords($line);    #	or    #	@words = old_shellwords(@lines);    #	or    #	@words = old_shellwords();	# defaults to $_ (and clobbers it)    no warnings 'uninitialized';	# we will be testing undef strings    local *_ = \join('', @_) if @_;    my (@words, $snippet);    s/\A\s+//;    while ($_ ne '') {	my $field = substr($_, 0, 0);	# leave results tainted	for (;;) {	    if (s/\A"(([^"\\]|\\.)*)"//s) {		($snippet = $1) =~ s#\\(.)#$1#sg;	    }	    elsif (/\A"/) {		require Carp;		Carp::carp("Unmatched double quote: $_");		return();	    }	    elsif (s/\A'(([^'\\]|\\.)*)'//s) {		($snippet = $1) =~ s#\\(.)#$1#sg;	    }	    elsif (/\A'/) {		require Carp;		Carp::carp("Unmatched single quote: $_");		return();	    }	    elsif (s/\A\\(.)//s) {		$snippet = $1;	    }	    elsif (s/\A([^\s\\'"]+)//) {		$snippet = $1;	    }	    else {		s/\A\s+//;		last;	    }	    $field .= $snippet;	}	push(@words, $field);    }    return @words;}1;__END__=head1 NAMEText::ParseWords - parse text into an array of tokens or array of arrays=head1 SYNOPSIS  use Text::ParseWords;  @lists = &nested_quotewords($delim, $keep, @lines);  @words = &quotewords($delim, $keep, @lines);  @words = &shellwords(@lines);  @words = &parse_line($delim, $keep, $line);  @words = &old_shellwords(@lines); # DEPRECATED!=head1 DESCRIPTIONThe &nested_quotewords() and &quotewords() functions accept a delimiter (which can be a regular expression)and a list of lines and then breaks those lines up into a list ofwords ignoring delimiters that appear inside quotes.  &quotewords()returns all of the tokens in a single long list, while &nested_quotewords()returns a list of token lists corresponding to the elements of @lines.&parse_line() does tokenizing on a single string.  The &*quotewords()functions simply call &parse_line(), so if you're only splittingone line you can call &parse_line() directly and save a functioncall.The $keep argument is a boolean flag.  If true, then the tokens aresplit on the specified delimiter, but all other characters (quotes,backslashes, etc.) are kept in the tokens.  If $keep is false then the&*quotewords() functions remove all quotes and backslashes that arenot themselves backslash-escaped or inside of single quotes (i.e.,&quotewords() tries to interpret these characters just like the Bourneshell).  NB: these semantics are significantly different from theoriginal version of this module shipped with Perl 5.000 through 5.004.As an additional feature, $keep may be the keyword "delimiters" whichcauses the functions to preserve the delimiters in each string astokens in the token lists, in addition to preserving quote andbackslash characters.&shellwords() is written as a special case of &quotewords(), and itdoes token parsing with whitespace as a delimiter-- similar to mostUnix shells.=head1 EXAMPLESThe sample program:  use Text::ParseWords;  @words = &quotewords('\s+', 0, q{this   is "a test" of\ quotewords \"for you});  $i = 0;  foreach (@words) {      print "$i: <$_>\n";      $i++;  }produces:  0: <this>  1: <is>  2: <a test>  3: <of quotewords>  4: <"for>  5: <you>demonstrating:=over 4=item 0a simple word=item 1multiple spaces are skipped because of our $delim=item 2use of quotes to include a space in a word=item 3use of a backslash to include a space in a word=item 4use of a backslash to remove the special meaning of a double-quote=item 5another simple word (note the lack of effect of thebackslashed double-quote)=backReplacing C<&quotewords('\s+', 0, q{this   is...})>with C<&shellwords(q{this   is...})>is a simpler way to accomplish the same thing.=head1 AUTHORSMaintainer is Hal Pomeranz <pomeranz@netcom.com>, 1994-1997 (Originalauthor unknown).  Much of the code for &parse_line() (including theprimary regexp) from Joerk Behrends <jbehrends@multimediaproduzenten.de>.Examples section another documentation provided by John Heidemann <johnh@ISI.EDU>Bug reports, patches, and nagging provided by lots of folks-- thankseverybody!  Special thanks to Michael Schwern <schwern@envirolink.org>for assuring me that a &nested_quotewords() would be useful, and to Jeff Friedl <jfriedl@yahoo-inc.com> for telling me not to worry abouterror-checking (sort of-- you had to be there).=cut

⌨️ 快捷键说明

复制代码 Ctrl + C
搜索代码 Ctrl + F
全屏模式 F11
切换主题 Ctrl + Shift + D
显示快捷键 ?
增大字号 Ctrl + =
减小字号 Ctrl + -