
📄 chartokenizer.pm

📁 Plucene-1.25.tar.gz: Perl version of Lucene
package Plucene::Analysis::CharTokenizer;

=head1 NAME

Plucene::Analysis::CharTokenizer - base class for character tokenisers

=head1 SYNOPSIS

	# isa Plucene::Analysis::Tokenizer

	my $next = $chartokenizer->next;

=head1 DESCRIPTION

This is an abstract base class for simple, character-oriented tokenizers.

=head1 METHODS

=cut

use strict;
use warnings;

use Carp;
use Plucene::Analysis::Token;

use base 'Plucene::Analysis::Tokenizer';

=head2 token_re

This should be defined in subclasses.

=cut

# And here we deviate from the script
sub token_re { die "You should define this" }

# Class::Virtually::Abstract doesn't like being called twice.

=head2 normalize

This will normalise the character before it is added to the token.

=cut

sub normalize { return $_[1] }

=head2 next

	my $next = $chartokenizer->next;

This will return the next token in the string, or undef at the end of
the string.

=cut

sub next {
	my $self = shift;
	my $re   = $self->token_re();
	my $fh   = $self->{reader};
	retry:
	if (!defined $self->{buffer} or !length $self->{buffer}) {
		return if eof($fh);
		$self->{start} = tell($fh);
		$self->{buffer} .= <$fh>;
	}
	return unless length $self->{buffer};

	if ($self->{buffer} =~ s/(.*?)($re)//) {
		$self->{start} += length $1;
		my $word = $self->normalize($2);
		my $rv   = Plucene::Analysis::Token->new(
			text  => $word,
			start => $self->{start},
			end   => ($self->{start} + length($word)));
		$self->{start} += length($word);
		return $rv;
	}

	# No match, rest of buffer is useless.
	$self->{buffer} = "";

	# But we should try for some more text
	goto retry;
}

1;
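
The scanning strategy in next() can be tried without installing Plucene. The following is a minimal standalone sketch, not part of the module: make_tokenizer is a hypothetical helper, tokens are plain hash refs rather than Plucene::Analysis::Token objects, and the token pattern and lower-casing normaliser are assumptions chosen for illustration. It reads from an in-memory filehandle one line at a time, strips everything up to and including the next match from the buffer, and tracks character offsets the same way the method does.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of Plucene): returns an iterator that yields
# one token per call, or undef once the input is exhausted.
sub make_tokenizer {
    my ($text, $re, $normalize) = @_;

    # In-memory filehandle standing in for $self->{reader}.
    open my $fh, '<', \$text or die "Cannot open in-memory handle: $!";
    my ($buffer, $start) = ('', 0);

    return sub {
        while (1) {
            if (!length $buffer) {
                return if eof($fh);      # no more input: end of token stream
                $start  = tell($fh);     # offset of the line about to be read
                $buffer = <$fh>;
            }

            # Remove the skipped prefix and the matched token from the buffer.
            if ($buffer =~ s/(.*?)($re)//) {
                my ($skipped, $match) = ($1, $2);
                $start += length $skipped;
                my $word  = $normalize->($match);
                my $token = {
                    text  => $word,
                    start => $start,
                    end   => $start + length($word),
                };
                $start += length($word);
                return $token;
            }

            # No match: the rest of this line is useless, read the next one.
            $buffer = '';
        }
    };
}

# Assumed token pattern (runs of ASCII letters) and normaliser (lower-casing).
my $next = make_tokenizer("Hello, World\nfoo bar", qr/[a-zA-Z]+/, sub { lc $_[0] });

my @tokens;
while (my $token = $next->()) {
    push @tokens, $token;
}

printf "%s [%d,%d)\n", @{$_}{qw(text start end)} for @tokens;
# hello [0,5)
# world [7,12)
# foo [13,16)
# bar [17,20)
```

As in the original method, the end offset is computed from the normalised word, so a normaliser that changes the length of the text would shift later offsets; lower-casing ASCII letters does not.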
