Chapter 11 The java.io Package
StreamTokenizer
Name
StreamTokenizer
- Class Name:
-
java.io.StreamTokenizer
- Superclass:
-
java.lang.Object
- Immediate Subclasses:
-
None
- Interfaces Implemented:
-
None
- Availability:
-
JDK 1.0 or later
The StreamTokenizer class performs
lexical analysis on an InputStream
object and breaks the stream into tokens. Although StreamTokenizer
is not a general-purpose parser, it recognizes tokens that
are similar to those used in the Java language. A StreamTokenizer
recognizes identifiers, numbers, quoted strings, and various comment
styles.
A StreamTokenizer object can be wrapped around an
InputStream. In this case, when the
StreamTokenizer reads bytes from the stream, the
bytes are converted to Unicode characters by simply zero-extending the
byte values to 16 bits, which gives correct results only for
ISO 8859-1 (Latin-1) data. As of Java 1.1, a
StreamTokenizer can be wrapped around a
Reader to eliminate this problem.
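As a rough illustration of the Reader-based approach, the sketch below decodes UTF-8 bytes through an InputStreamReader before tokenizing. The class name ReaderWrapDemo, the method firstWord(), and the choice of UTF-8 are assumptions made for this example; they are not part of the StreamTokenizer API.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.Reader;
import java.io.StreamTokenizer;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; only the java.io and java.nio types are real API.
class ReaderWrapDemo {
    // Returns the first word token found in UTF-8 encoded bytes,
    // or null if the first token is not a word.
    static String firstWord(byte[] utf8Bytes) {
        // The InputStreamReader decodes the bytes, so the tokenizer
        // only ever sees complete characters.
        Reader reader = new InputStreamReader(
                new ByteArrayInputStream(utf8Bytes), StandardCharsets.UTF_8);
        StreamTokenizer st = new StreamTokenizer(reader);
        try {
            return st.nextToken() == StreamTokenizer.TT_WORD ? st.sval : null;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        byte[] bytes = "h\u00e9llo w\u00f6rld".getBytes(StandardCharsets.UTF_8);
        System.out.println(firstWord(bytes));
    }
}
```

With the deprecated InputStream constructor, each of the two bytes that encode the accented letter in UTF-8 would instead be zero-extended into its own character.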
The nextToken() method returns
the next token from the stream. The rest of the methods in StreamTokenizer
control how the object interprets and tokenizes the characters that it
reads.
The parsing functionality of StreamTokenizer is
controlled by a table and a number of flags. Each character that is
read from the input is in the range
'\u0000' to
'\uFFFF'. The character value
is used to look up the attributes of the character in the table. A character can
have zero or more of the following attributes: whitespace, alphabetic,
numeric, string quote, and comment character.
By default, a StreamTokenizer
recognizes the following:
- Whitespace characters from '\u0000'
through '\u0020'
- Alphabetic characters from 'a'
through 'z', from 'A'
through 'Z', and from
'\u00A0' through
'\u00FF'
- Numeric characters '0' through '9',
'.', and '-'
- String quote characters "'"
and '"'
- Comment character '/'
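The defaults listed above can be observed with a small sketch like the following. The class DefaultSyntaxDemo, its describe() method, and the "WORD:"-style labels are invented for illustration; no syntax methods are called, so the tokenizer runs with its default table.

```java
import java.io.IOException;
import java.io.StreamTokenizer;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class showing the default syntax table in action.
class DefaultSyntaxDemo {
    // Labels each token produced by an unconfigured StreamTokenizer.
    static List<String> describe(String text) {
        StreamTokenizer st = new StreamTokenizer(new StringReader(text));
        List<String> out = new ArrayList<>();
        try {
            while (st.nextToken() != StreamTokenizer.TT_EOF) {
                if (st.ttype == StreamTokenizer.TT_WORD) {
                    out.add("WORD:" + st.sval);
                } else if (st.ttype == StreamTokenizer.TT_NUMBER) {
                    out.add("NUMBER:" + st.nval);
                } else if (st.ttype == '"' || st.ttype == '\'') {
                    out.add("QUOTED:" + st.sval);   // ttype is the quote character
                } else {
                    out.add("CHAR:" + (char) st.ttype);
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out;
    }

    public static void main(String[] args) {
        // Everything after the '/' is skipped as a comment.
        System.out.println(describe("count = -42 \"hi there\" / rest is ignored"));
    }
}
```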
public class java.io.StreamTokenizer extends java.lang.Object {
// Variables
public double nval;
public String sval;
public int ttype;
public final static int TT_EOF;
public final static int TT_EOL;
public final static int TT_NUMBER;
public final static int TT_WORD;
// Constructors
public StreamTokenizer(InputStream in); // Deprecated in 1.1
public StreamTokenizer(Reader in); // New in 1.1
// Instance Methods
public void commentChar(int ch);
public void eolIsSignificant(boolean flag);
public int lineno();
public void lowerCaseMode(boolean flag);
public int nextToken() throws IOException;
public void ordinaryChar(int ch);
public void ordinaryChars(int low, int hi);
public void parseNumbers();
public void pushBack();
public void quoteChar(int ch);
public void resetSyntax();
public void slashSlashComments(boolean flag);
public void slashStarComments(boolean flag);
public String toString();
public void whitespaceChars(int low, int hi);
public void wordChars(int low, int hi);
}
public double nval
- Description
-
This variable contains the value of a TT_NUMBER
token.
public String sval
- Description
-
This variable contains the value of a TT_WORD
token.
public int ttype
- Description
-
This variable indicates the token type. The value is either one of the
TT_ constants defined below
or the character that has just been parsed from the input stream.
public final static int TT_EOF
- Description
-
This token type indicates that the end of the stream has been reached.
public final static int TT_EOL
- Description
-
This token type indicates that the end of a line has been reached. The
value is not returned by nextToken()
unless eolIsSignificant(true)
has been called.
public final static int TT_NUMBER
- Description
-
This token type indicates that a number has been parsed. The number is
placed in nval.
public final static int TT_WORD
- Description
-
This token type indicates that a word has been parsed. The word is placed
in sval.
public StreamTokenizer(InputStream in)
- Availability
-
Deprecated as of JDK 1.1
- Parameters
-
- in
-
The input stream to
tokenize.
- Description
-
This constructor creates a StreamTokenizer
that reads from the given InputStream.
As of JDK 1.1, this constructor is deprecated and
StreamTokenizer(Reader) should be
used instead.
public StreamTokenizer(Reader in)
- Availability
-
New as of JDK 1.1
- Parameters
-
- in
-
The reader to tokenize.
- Description
-
This constructor creates a StreamTokenizer
that reads from the given Reader.
public void commentChar(int ch)
- Parameters
-
- ch
-
The character to use
to indicate comments.
- Description
-
This method tells this StreamTokenizer to
treat the given character as the beginning of a comment that ends at the
end of the line. The StreamTokenizer
ignores all of the characters from the comment character to the end of
the line. By default, a StreamTokenizer
treats the '/'
character as a comment character. This method may be called multiple times
if there are multiple characters that begin comment lines.
To specify that a character is not a comment character,
use ordinaryChar().
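A minimal sketch of commentChar() follows, assuming a script-like input in which '#' introduces a comment. The class HashCommentDemo and its words() method are hypothetical names used only for this example.

```java
import java.io.IOException;
import java.io.StreamTokenizer;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class: treats '#' as a single-line comment character.
class HashCommentDemo {
    // Collects the word tokens in text, skipping '#' comments.
    static List<String> words(String text) {
        StreamTokenizer st = new StreamTokenizer(new StringReader(text));
        st.commentChar('#');   // from '#' to the end of the line is ignored
        List<String> out = new ArrayList<>();
        try {
            while (st.nextToken() != StreamTokenizer.TT_EOF) {
                if (st.ttype == StreamTokenizer.TT_WORD) {
                    out.add(st.sval);
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(words("alpha # skip this\nbeta"));
    }
}
```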
public void eolIsSignificant(boolean flag)
- Parameters
-
- flag
-
A boolean
value that specifies whether or not this StreamTokenizer
returns TT_EOL
tokens.
- Description
-
A StreamTokenizer recognizes
"\n", "\r", and
"\r\n" as the end of a line. By default,
end-of-line characters are treated as whitespace and thus, the
StreamTokenizer does not return
TT_EOL tokens from
nextToken(). Call
eolIsSignificant(true) to tell the
StreamTokenizer to return TT_EOL
tokens.
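One way to use TT_EOL tokens is to group tokens by line, as in the sketch below. The class LineTokenCountDemo and the method tokensPerLine() are assumptions for illustration; the input mixes "\n" and "\r\n" terminators to show that each is reported as a single TT_EOL.

```java
import java.io.IOException;
import java.io.StreamTokenizer;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class: counts tokens on each line of the input.
class LineTokenCountDemo {
    static List<Integer> tokensPerLine(String text) {
        StreamTokenizer st = new StreamTokenizer(new StringReader(text));
        st.eolIsSignificant(true);   // report line ends as TT_EOL tokens
        List<Integer> counts = new ArrayList<>();
        int current = 0;
        try {
            while (st.nextToken() != StreamTokenizer.TT_EOF) {
                if (st.ttype == StreamTokenizer.TT_EOL) {
                    counts.add(current);   // a line just ended
                    current = 0;
                } else {
                    current++;
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        if (current > 0) {
            counts.add(current);   // last line had no terminator
        }
        return counts;
    }

    public static void main(String[] args) {
        System.out.println(tokensPerLine("a b\nc\r\nd e f"));
    }
}
```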
public int lineno()
- Returns
-
The current line number.
- Description
-
This method returns the current line number. Line numbers begin at 1.
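The following sketch uses lineno() to report where a word first appears, which is a typical use in error messages. The class LineNumberDemo and the method lineOf() are hypothetical.

```java
import java.io.IOException;
import java.io.StreamTokenizer;
import java.io.StringReader;
import java.io.UncheckedIOException;

// Hypothetical demo class: finds the line on which a word first occurs.
class LineNumberDemo {
    // Returns the 1-based line number of the first matching word, or -1.
    static int lineOf(String text, String word) {
        StreamTokenizer st = new StreamTokenizer(new StringReader(text));
        try {
            while (st.nextToken() != StreamTokenizer.TT_EOF) {
                if (st.ttype == StreamTokenizer.TT_WORD && st.sval.equals(word)) {
                    return st.lineno();
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return -1;
    }

    public static void main(String[] args) {
        System.out.println(lineOf("one\ntwo\nthree", "three"));
    }
}
```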
public void lowerCaseMode(boolean flag)
- Parameters
-
- flag
-
A boolean value that specifies
whether or not this StreamTokenizer
returns TT_WORD
tokens in lowercase.
- Description
-
By default, a StreamTokenizer
does not change the case of the words that it parses. However, if you call
lowerCaseMode(true),
whenever nextToken() returns
a TT_WORD token, the word in
sval is converted to lowercase.
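A short sketch of lowerCaseMode() is shown below; it could be useful for case-insensitive keyword matching. The class LowerCaseDemo and its words() method are invented for this example.

```java
import java.io.IOException;
import java.io.StreamTokenizer;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class: returns every word token folded to lowercase.
class LowerCaseDemo {
    static List<String> words(String text) {
        StreamTokenizer st = new StreamTokenizer(new StringReader(text));
        st.lowerCaseMode(true);   // sval of TT_WORD tokens is lowercased
        List<String> out = new ArrayList<>();
        try {
            while (st.nextToken() != StreamTokenizer.TT_EOF) {
                if (st.ttype == StreamTokenizer.TT_WORD) {
                    out.add(st.sval);
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(words("Hello WORLD"));
    }
}
```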
public int nextToken() throws IOException
- Returns
-
One of the token types (TT_EOF,
TT_EOL, TT_NUMBER,
or TT_WORD) or a character
code.
- Throws
-
- IOException
-
If any kind
of I/O error occurs.
- Description
-
This method reads the next token from the stream. The value returned is
the same as the value of the variable ttype.
The nextToken() method parses
the following tokens:
- TT_EOF
-
The end of the input stream has been reached.
- TT_EOL
-
The end of a line has been reached. The eolIsSignificant()
method controls whether end-of-line characters are treated as whitespace
or returned as TT_EOL tokens.
- TT_NUMBER
-
A number has been parsed. The value can be found in the variable nval.
The parseNumbers() method tells
the StreamTokenizer to recognize
numbers as distinct from words.
- TT_WORD
-
A word has been parsed. The word can be found in the variable
sval.
- Quoted string
-
A quoted string has been parsed. The variable ttype
is set to the quote character, and sval contains the
string itself. You can tell the StreamTokenizer
what characters to use as quote characters using
quoteChar().
- Character
-
A single character has been parsed. The variable ttype
is set to the character value.
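The token kinds listed above can be handled with a switch on ttype, as sketched here. The class TokenKindsDemo, the method kinds(), and the label strings are assumptions for illustration; eolIsSignificant(true) is called only so that TT_EOL appears in the result.

```java
import java.io.IOException;
import java.io.StreamTokenizer;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class: dispatches on every kind of token nextToken() returns.
class TokenKindsDemo {
    static List<String> kinds(String text) {
        StreamTokenizer st = new StreamTokenizer(new StringReader(text));
        st.eolIsSignificant(true);
        List<String> out = new ArrayList<>();
        try {
            while (st.nextToken() != StreamTokenizer.TT_EOF) {
                switch (st.ttype) {
                    case StreamTokenizer.TT_EOL:
                        out.add("EOL");
                        break;
                    case StreamTokenizer.TT_NUMBER:
                        out.add("NUMBER(" + st.nval + ")");
                        break;
                    case StreamTokenizer.TT_WORD:
                        out.add("WORD(" + st.sval + ")");
                        break;
                    case '"':
                    case '\'':
                        // Quoted string: ttype is the quote character itself.
                        out.add("QUOTE(" + st.sval + ")");
                        break;
                    default:
                        // Any other single character.
                        out.add("CHAR(" + (char) st.ttype + ")");
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(kinds("x 1.5\n'q' +"));
    }
}
```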
public void ordinaryChar(int ch)
- Parameters
-
- ch
-
The character to treat
normally.
- Description
-
This method causes this StreamTokenizer to treat
the given character as an ordinary
character. This means that the character has no special significance
as a comment, string quote, alphabetic, numeric, or whitespace
character. For example, to tell the StreamTokenizer
that the slash does not start a single-line comment, use
ordinaryChar('/').
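To make the effect of ordinaryChar('/') concrete, the sketch below tokenizes a small arithmetic expression with and without the call. The class SlashOrdinaryDemo and the method tokens() are hypothetical.

```java
import java.io.IOException;
import java.io.StreamTokenizer;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class: shows '/' as a comment character versus an ordinary one.
class SlashOrdinaryDemo {
    static List<String> tokens(String text, boolean slashIsOrdinary) {
        StreamTokenizer st = new StreamTokenizer(new StringReader(text));
        if (slashIsOrdinary) {
            st.ordinaryChar('/');   // '/' no longer starts a comment
        }
        List<String> out = new ArrayList<>();
        try {
            while (st.nextToken() != StreamTokenizer.TT_EOF) {
                if (st.ttype == StreamTokenizer.TT_NUMBER) {
                    out.add(String.valueOf(st.nval));
                } else if (st.ttype == StreamTokenizer.TT_WORD) {
                    out.add(st.sval);
                } else {
                    out.add(String.valueOf((char) st.ttype));
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(tokens("6 / 3", false));  // the slash hides the rest of the line
        System.out.println(tokens("6 / 3", true));   // the slash is returned as a token
    }
}
```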
public void ordinaryChars(int low, int hi)
- Parameters
-
- low
-
The beginning of a range of character values.
- hi
-
The end of a range of character values.
- Description
-
This method tells this StreamTokenizer
to treat all of the characters in the given range as ordinary
characters. See the description of ordinaryChar()
above for more information.
public void parseNumbers()
- Description
-
This method tells this StreamTokenizer
to recognize numbers. The StreamTokenizer
constructor calls this method, so the default behavior of a StreamTokenizer
is to recognize numbers. This method modifies the syntax table of the StreamTokenizer
so that the following characters have the numeric attribute:
'1',
'2', '3',
'4', '5',
'6', '7',
'8', '9',
'0', '.',
and '-'.
When the parser encounters a token that has the format of a double-precision
floating-point number, the token is treated as a number rather than a word.
The ttype variable is set to
TT_NUMBER, and nval
is set to the value of the number.
To use a StreamTokenizer that does
not parse numbers, make the above characters ordinary using
ordinaryChar() or
ordinaryChars().
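A sketch of turning number parsing off might look like the following. The class NoNumbersDemo, the method tokens(), and the labels are invented for illustration; only the three ordinaryChar()/ordinaryChars() calls come from the description above.

```java
import java.io.IOException;
import java.io.StreamTokenizer;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class: a tokenizer that never returns TT_NUMBER.
class NoNumbersDemo {
    static List<String> tokens(String text) {
        StreamTokenizer st = new StreamTokenizer(new StringReader(text));
        // Remove the numeric attribute from every character that has it.
        st.ordinaryChars('0', '9');
        st.ordinaryChar('.');
        st.ordinaryChar('-');
        List<String> out = new ArrayList<>();
        try {
            while (st.nextToken() != StreamTokenizer.TT_EOF) {
                if (st.ttype == StreamTokenizer.TT_WORD) {
                    out.add("WORD:" + st.sval);
                } else if (st.ttype == StreamTokenizer.TT_NUMBER) {
                    out.add("NUMBER:" + st.nval);   // not expected with this setup
                } else {
                    out.add("CHAR:" + (char) st.ttype);
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out;
    }

    public static void main(String[] args) {
        // Each digit now comes back as a separate single-character token.
        System.out.println(tokens("id 42"));
    }
}
```

If digit runs should come back as words instead, calling wordChars('0', '9') after the calls above gives the digits the alphabetic attribute.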
public void pushBack()
- Description
-
This method has the effect of pushing the current token back onto the stream.
In other words, after a call to this method, the next call to the nextToken()
method returns the same result as the previous call to the nextToken() method
without reading any input.
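pushBack() is typically used for one token of lookahead. The sketch below reads numbers that may or may not be followed by a unit word; when the lookahead token is not a word, it is pushed back so the main loop sees it again. The class NumberUnitDemo, the method measurements(), and the input format are assumptions for this example.

```java
import java.io.IOException;
import java.io.StreamTokenizer;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class: numbers with an optional trailing unit word.
class NumberUnitDemo {
    static List<String> measurements(String text) {
        StreamTokenizer st = new StreamTokenizer(new StringReader(text));
        List<String> out = new ArrayList<>();
        try {
            while (st.nextToken() != StreamTokenizer.TT_EOF) {
                if (st.ttype != StreamTokenizer.TT_NUMBER) {
                    continue;   // ignore anything that is not a number
                }
                double n = st.nval;
                // Look one token ahead for a unit.
                if (st.nextToken() == StreamTokenizer.TT_WORD) {
                    out.add(n + st.sval);
                } else {
                    st.pushBack();   // not a unit: let the loop read it again
                    out.add(String.valueOf(n));
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(measurements("10 px 20 30 em"));
    }
}
```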
public void quoteChar(int ch)
- Parameters
-
- ch
-
The character to use
as a delimiter for quoted strings.
- Description
-
This method tells this StreamTokenizer
to treat the given character as the beginning or end of a quoted string.
By default, the single-quote character and the double-quote character are
string-quote characters. When the parser encounters a string-quote character,
the ttype variable is set to
the quote character, and sval
is set to the actual string. The string consists of all the characters
after (but not including) the string-quote character up to (but not including)
the next occurrence of the same string-quote character, a line terminator,
or the end of the stream.
To specify that a character is not a string-quote character, use ordinaryChar().
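As an illustration of quoteChar(), the sketch below adds the backquote as a string-quote character and collects the strings it delimits. The class BacktickQuoteDemo, the method quoted(), and the choice of the backquote are assumptions for this example.

```java
import java.io.IOException;
import java.io.StreamTokenizer;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class: treats the backquote as a string-quote character.
class BacktickQuoteDemo {
    static List<String> quoted(String text) {
        StreamTokenizer st = new StreamTokenizer(new StringReader(text));
        st.quoteChar('`');
        List<String> out = new ArrayList<>();
        try {
            while (st.nextToken() != StreamTokenizer.TT_EOF) {
                if (st.ttype == '`') {
                    out.add(st.sval);   // the text between the two backquotes
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(quoted("name `hello world` end"));
    }
}
```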
public void resetSyntax()
- Description
-
This method resets this StreamTokenizer,
which causes it to treat all characters as ordinary characters.
See the description of ordinaryChar()
above for more information.
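resetSyntax() is usually the first step in building a syntax table from scratch. In the sketch below, only lowercase letters are made alphabetic and only the space is made whitespace, so digits and quotes come back as single characters. The class CustomSyntaxDemo and the method tokens() are hypothetical.

```java
import java.io.IOException;
import java.io.StreamTokenizer;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class: a tokenizer configured from an empty syntax table.
class CustomSyntaxDemo {
    static List<String> tokens(String text) {
        StreamTokenizer st = new StreamTokenizer(new StringReader(text));
        st.resetSyntax();              // every character is now ordinary
        st.wordChars('a', 'z');        // lowercase letters form words
        st.whitespaceChars(' ', ' ');  // the space separates tokens
        List<String> out = new ArrayList<>();
        try {
            while (st.nextToken() != StreamTokenizer.TT_EOF) {
                if (st.ttype == StreamTokenizer.TT_WORD) {
                    out.add("WORD:" + st.sval);
                } else {
                    out.add("CHAR:" + (char) st.ttype);
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out;
    }

    public static void main(String[] args) {
        // Digits and quotes have no special meaning after resetSyntax().
        System.out.println(tokens("ab 12 'c'"));
    }
}
```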
public void slashSlashComments(boolean flag)
- Parameters
-
- flag
-
A boolean
value that specifies whether or not this StreamTokenizer
recognizes double-slash comments (//).
- Description
-
By default, a StreamTokenizer
does not recognize double-slash comments. However, if you call slashSlashComments(true),
the nextToken() method recognizes
and ignores double-slash comments.
public void slashStarComments(boolean flag)
- Parameters
-
- flag
-
A boolean
value that specifies whether or not this StreamTokenizer
recognizes slash-star (/* ... */) comments.
- Description
-
By default, a StreamTokenizer
does not recognize slash-star comments. However, if you call slashStarComments(true),
the nextToken() method recognizes
and ignores slash-star comments.
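The two comment flags can be combined to accept Java-style comments, as in the sketch below. The call to ordinaryChar('/') is an extra step assumed here so that a lone slash is returned as a token rather than starting a single-line comment; the class SlashCommentsDemo and the method tokens() are hypothetical.

```java
import java.io.IOException;
import java.io.StreamTokenizer;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class: skips // and /* */ comments but keeps a lone '/'.
class SlashCommentsDemo {
    static List<String> tokens(String text) {
        StreamTokenizer st = new StreamTokenizer(new StringReader(text));
        st.ordinaryChar('/');          // a single '/' is no longer a comment
        st.slashSlashComments(true);   // recognize // comments
        st.slashStarComments(true);    // recognize /* ... */ comments
        List<String> out = new ArrayList<>();
        try {
            while (st.nextToken() != StreamTokenizer.TT_EOF) {
                if (st.ttype == StreamTokenizer.TT_WORD) {
                    out.add(st.sval);
                } else if (st.ttype == StreamTokenizer.TT_NUMBER) {
                    out.add(String.valueOf(st.nval));
                } else {
                    out.add(String.valueOf((char) st.ttype));
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(tokens("a /* b */ c // d\ne / f"));
    }
}
```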
public String toString()
- Returns
-
A String representation of
the current token.
- Overrides
-
Object.toString()
- Description
-
This method returns a string representation of the current token recognized
by the nextToken() method. This
string representation consists of the value of ttype,
the value of sval if the token
is a word or the value of nval
if the token is a number, and the current line number.
public void whitespaceChars(int low, int hi)
- Parameters
-
- low
-
The beginning of a range of character values.
- hi
-
The end of a range of character values.
- Description
-
This method causes this StreamTokenizer
to treat characters in the specified range as whitespace. The only function
of whitespace characters is to separate tokens in the stream.
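For instance, separators such as commas and semicolons can be declared as whitespace so that they merely split tokens. The sketch below assumes a simple delimited input; the class SeparatorDemo and the method words() are invented names.

```java
import java.io.IOException;
import java.io.StreamTokenizer;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class: commas and semicolons act as token separators.
class SeparatorDemo {
    static List<String> words(String text) {
        StreamTokenizer st = new StreamTokenizer(new StringReader(text));
        st.whitespaceChars(',', ',');   // a one-character range
        st.whitespaceChars(';', ';');
        List<String> out = new ArrayList<>();
        try {
            while (st.nextToken() != StreamTokenizer.TT_EOF) {
                if (st.ttype == StreamTokenizer.TT_WORD) {
                    out.add(st.sval);
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(words("a,b;c"));
    }
}
```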
public void wordChars(int low, int hi)
- Parameters
-
- low
-
The beginning of a range of character values.
- hi
-
The end of a range of character values.
- Description
-
This method causes this StreamTokenizer
to treat characters in the specified range as characters that are part
of a word token, or, in other words, consider the characters to be alphabetic.
A word token consists of a sequence of characters that begins with
an alphabetic character and is followed by zero or more numeric or alphabetic
characters.
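A common use of wordChars() is to allow underscores inside identifiers. The sketch below compares the word tokens produced with and without that setting; the class UnderscoreWordDemo and the method words() are hypothetical.

```java
import java.io.IOException;
import java.io.StreamTokenizer;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class: optionally treats '_' as part of a word.
class UnderscoreWordDemo {
    static List<String> words(String text, boolean underscoreInWords) {
        StreamTokenizer st = new StreamTokenizer(new StringReader(text));
        if (underscoreInWords) {
            st.wordChars('_', '_');   // give '_' the alphabetic attribute
        }
        List<String> out = new ArrayList<>();
        try {
            while (st.nextToken() != StreamTokenizer.TT_EOF) {
                if (st.ttype == StreamTokenizer.TT_WORD) {
                    out.add(st.sval);
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(words("max_value user_id2", false));
        System.out.println(words("max_value user_id2", true));
    }
}
```

In both cases the trailing digit stays attached to its word, because a word may continue with numeric characters once it has started with an alphabetic one.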
See Also
InputStream,
IOException,
Reader,
StringTokenizer