charstream.html

来自「This a JavaCC documentation.」· HTML 代码 · 共 279 行

HTML
279
字号
<!--Copyright 漏 2002 Sun Microsystems, Inc., 4150 Network Circle, Santa Clara,California 95054, U.S.A. All rights reserved.  Sun Microsystems, Inc. hasintellectual property rights relating to technology embodied in the productthat is described in this document. In particular, and without limitation,these intellectual property rights may include one or more of the U.S.patents listed at http://www.sun.com/patents and one or more additionalpatents or pending patent applications in the U.S. and in other countries.U.S. Government Rights - Commercial software. Government users are subjectto the Sun Microsystems, Inc. standard license agreement and applicableprovisions of the FAR and its supplements.  Use is subject to license terms.Sun,  Sun Microsystems,  the Sun logo and  Java are trademarks or registeredtrademarks of Sun Microsystems, Inc. in the U.S. and other countries.  Thisproduct is covered and controlled by U.S. Export Control laws and may besubject to the export or import laws in other countries.  Nuclear, missile,chemical biological weapons or nuclear maritime end uses or end users, whetherdirect or indirect, are strictly prohibited.  Export or reexport to countriessubject to U.S. embargo or to entities identified on U.S. export exclusionlists, including, but not limited to, the denied persons and speciallydesignated nationals lists is strictly prohibited.--><HTML><HEAD> <title>JavaCC: CharStream Classes MiniTutorial</title><!-- Changed by: Michael Van De Vanter, 14-Jan-2003 --></HEAD><BODY bgcolor="#FFFFFF" ><H1>JavaCC [tm]: CharStream Classes MiniTutorial</H1>This document describes in some detail the methods of the CharStreamclasses. Note that some of the details may not be relevant for theCharStream interface (to be used with USER_CHAR_STREAM).<P>There are 4 different kinds of char stream classes that are generated based on combinations of various options.<UL>  <LI> <H3>ASCII_CharStream</H3>       Generated when neither of the two options - <TT>UNICODE_INPUT</TT>       or <TT>JAVA_UNICODE_ESCAPE</TT> is set. <P>       This class treats the input as a stream of 1-byte (ISO-LATIN1)       characters. Note that this class can also be used to parse       binary files. It just reads a byte and returns it as a 16 bit       quantity to the lexical analyzer. So any character returned by       this class will be in the range <TT>'\u0000'-'\u00ff'</TT>.<P>  <LI> <H3>ASCII_UCodeESC_CharStream</H3>       Generated when the option <TT>JAVA_UNICODE_ESCAPE</TT> is set       and the <TT>UNICODE_INPUT</TT> option is not set.<P>       This class treats the input as a stream of 1-byte characters.       However, the special escape sequence<UL><BR>             <TT>("\\\\")* "\\" ("u")+</TT> - (<I>odd number of backslahes followed       by one or more 'u's.</I>)</UL><BR>       is treated as a tag indicating that the next 4 bytes following       the tag will be hexadecimal digits forming a 4-digit hex number       whose value will be treated as the value of the character at the       position indicated by the first backslash. Note that this value       can be anything in the range <TT>0x0-0xffff</TT>.<P>  <LI> <H3>UCode_CharStream</H3>       Generated when the option <TT>UNICODE_INPUT</TT> is set and        the option <TT>JAVA_UNICODE_ESCAPE</TT> is not set.<P>       This class treats the input as a stream of 2-byte characters. So       it reads 2 bytes <TT>b1</TT> and <TT>b2</TT> and returns them as        a single character using the expression <TT> b1 << 8 | b2 </TT>       assuming bigendian order. So in particular all the characters in       the range <TT>0x00-0xff</TT> are assumed to be stored as 2 bytes       with the first (higher-order) byte being 0.<P>  <LI> <H3>UCode_UCodeESC_CharStream</H3>       Generated when both the options <TT>UNICODE_INPUT</TT> and        <TT>JAVA_UNICODE_ESCAPE</TT> are set.<P>       This class input is a stream of 2-byte characters (just       like the UCode_CharStream class) and the special escape sequence<UL><BR>             <TT>("\\\\")* "\\" ("u")+</TT> - (<I>odd number of backslahes followed       by one or more 'u's.</I>)</UL><BR>       is treated as a tag indicating that the next 4 2-byte characters       following the tag will be hexadecimal digits forming a 4-digit hex       number whose value will be treated as the value of the character at the       position indicated by the first backslash. Note that this value       can be any value in the range <TT>0x0-0xffff</TT>. Also note that       the backslash(es) and u(s) are all assumed to be given as 2-byte       characters (with the higher order byte value being 0).<P></UL><H4>Note : None of the above classes can be used to read characters in a       mixed mode, i.e., some characters given as 1-byte characters and others       as 2-byte characters. To do this, you need to set USER_CHAR_STREAM       option to true and define your own char stream.</H4><HR>(Throughout the following, we use the notation XXXCharStream that stands for any of the above described 4 classes.)<h3>Constructors</h3><UL><TT>  <LI> public XXXCharStream(java.io.InputStream dstream,                                        int startline, int startcolumn)</TT><P>  Takes an input stream, starting line and column numbers and constructs a  CharStream object. It also creates buffers of initial size 4K for buffering the  characters and also for line and column numbers for each of those characters.<P><TT>  <LI> public XXXCharStream(java.io.InputStream dstream,                                        int startline, int startcolumn, int buffersize)</TT><P>  Takes an input stream, starting line and column numbers and constructs a  CharStream object. It also creates buffers of initial size <TT>buffsize</TT> for buffering the  characters and also for line and column numbers for each of those characters.<P>  So when you have an estimate on the maximum size of any token that can occur,  you can use that size to optimize the buffer sizes. Note, however, that   these sizes are only initial sizes and they will be expanded as and when  needed (in 2K steps).</UL><P><HR><h3>Methods</h3>All the following methods will be static or nonstatic depending onwhether the STATIC option is true or false at the generation time. Also onlythose methods that users can use in their lexical actions (using the<TT>input_stream</TT> variable of the lexical analyzer) are documentedhere. Rest of the (public) methods are very tightly coupled with theimplementation of the lexical analyzer and thus <B> should not </B> beused in lexical actions. In the future when we adopt version 1.1 of the Java [tm] programming language, we willstreamline this by making that part of the interface an innerclass tothe lexical analyzer.<P><UL>  <LI> <TT>public final char readChar() throws java.io.IOException</TT><P>       This method returns the next "character" in the input according       to the rules of the CharStream class as described above. It will       throw <TT>java.io.IOException</TT> if it reaches EOF during the       process of "constructing" the character. It also updates the line       and column number and buffers the character for any possible       backtracking that may be required later. It also stores the line       and column numbers for the same purpose.<P>  <LI> <TT> public final int getBeginLine() </TT><P>       This method returns the line number for the beginning of the       current match.<P>  <LI> <TT> public final int getBeginColumn() </TT><P>       This method returns the column number for the beginning of the       current match.<P>  <LI> <TT> public final int getEndLine() </TT><P>       This method returns the line number for the ending of the       current match.<P>  <LI> <TT> public final int getEndColumn() </TT><P>       This method returns the column number for the ending of the       current match.<P>  <LI> <TT> public final void backup(int amount) </TT><P>       This method puts back <TT>amount</TT> number of characters       into the stream. Note that the amount indicates the number       of characters as constructed by <TT>readChar</TT>. Since the       buffers used are circular buffers, it cannot check for       illegal <TT>amount</TT> values, it just wraps around. So it       is the user's responsibility to make sure that those many       characters are really produced before a call to this method.<P>  <LI><TT> public final String GetImage()</TT><P>       Returns the image of the current match. As far as the XXXCharStream       is concerned, all characters after the last call to the private method       <TT>BeginToken</TT> are considered a part of the current match.<P>  <LI><TT> public void ReInit(java.io.InputStream dstream,                                        int startline, int startcolumn)</TT><P>       This method reinitializes the XXXCharStream classes with a (possibly       new) input stream and starting line and column numbers.<P>  <LI><TT> public void ReInit(java.io.InputStream dstream,                 int startline, int startcolumn, int buffersize)</TT><P>       This method reinitializes the XXXCharStream classes with a (possibly       new) input stream and starting line and column numbers and adjusts       the size of the buffers to <TT>buffersize</TT>, by extending them.       Note that if the value of <TT>buffersize</TT> is less than the current       buffer sizes, they remain unchanged.<P>   <LI><TT> public void adjustBeginLineColumn(int newLine, int newCol)</TT><P>       This method adjusts the line and column number of the beginning of       the current match to <TT>newLine</TT> and </TT>newCol</TT> and        also adjusts the line and column numbers for all the characters       in the lookahead buffer.</UL></BODY></HTML>

⌨️ 快捷键说明

复制代码Ctrl + C
搜索代码Ctrl + F
全屏模式F11
增大字号Ctrl + =
减小字号Ctrl + -
显示快捷键?