The Character class wraps a value of the primitive type char
in an object. An object of type Character contains a
single field whose type is char.
In addition, this class provides several methods for determining
the type of a character and converting characters from uppercase
to lowercase and vice versa.
Many of the methods of class Character are defined
in terms of a "Unicode attribute table" that specifies
a name for every defined Unicode code point. The table also
includes other attributes, such as a decimal value, an uppercase
equivalent, a lowercase equivalent, and/or a titlecase equivalent.
The Unicode attribute table is available on the World Wide Web as
the file:
For a more detailed specification of the Character
class, one that encompasses the exact behavior of methods such as
isDigit, isLetter,
isLowerCase, and isUpperCase over the
full range of Unicode values, see Gosling, Joy, and Steele, The
Java Language Specification.
Determines if the specified character is a
"Java" letter, that is, the character is permissible
as the first character in an identifier in the Java language.
Deprecated.
Determines if the specified character is a
"Java" letter or digit, that is, the character is
permissible as a non-initial character in an identifier in the
Java language.
Deprecated.
The minimum radix available for conversion to and from Strings.
The constant value of this field is the smallest value permitted
for the radix argument in radix-conversion methods such as the
digit method, the forDigit
method, and the toString method of class
Integer.
The maximum radix available for conversion to and from Strings.
The constant value of this field is the largest value permitted
for the radix argument in radix-conversion methods such as the
digit method, the forDigit
method, and the toString method of class
Integer.
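As a rough illustration of how these two constants bound the radix argument, the sketch below checks a radix before it is handed to a radix-conversion method. The class RadixLimits and the helper isValidRadix are hypothetical names used only for this example.

```java
public class RadixLimits {
    // Hypothetical helper: true if the radix lies between MIN_RADIX and MAX_RADIX inclusive.
    static boolean isValidRadix(int radix) {
        return radix >= Character.MIN_RADIX && radix <= Character.MAX_RADIX;
    }

    public static void main(String[] args) {
        System.out.println("MIN_RADIX = " + Character.MIN_RADIX);
        System.out.println("MAX_RADIX = " + Character.MAX_RADIX);
        System.out.println("radix 16 valid: " + isValidRadix(16));
        System.out.println("radix 37 valid: " + isValidRadix(37));
    }
}
```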
The Class object representing the primitive type char.
UNASSIGNED
public static final byte UNASSIGNED
UPPERCASE_LETTER
public static final byte UPPERCASE_LETTER
LOWERCASE_LETTER
public static final byte LOWERCASE_LETTER
TITLECASE_LETTER
public static final byte TITLECASE_LETTER
MODIFIER_LETTER
public static final byte MODIFIER_LETTER
OTHER_LETTER
public static final byte OTHER_LETTER
NON_SPACING_MARK
public static final byte NON_SPACING_MARK
ENCLOSING_MARK
public static final byte ENCLOSING_MARK
COMBINING_SPACING_MARK
public static final byte COMBINING_SPACING_MARK
DECIMAL_DIGIT_NUMBER
public static final byte DECIMAL_DIGIT_NUMBER
LETTER_NUMBER
public static final byte LETTER_NUMBER
OTHER_NUMBER
public static final byte OTHER_NUMBER
SPACE_SEPARATOR
public static final byte SPACE_SEPARATOR
LINE_SEPARATOR
public static final byte LINE_SEPARATOR
PARAGRAPH_SEPARATOR
public static final byte PARAGRAPH_SEPARATOR
CONTROL
public static final byte CONTROL
FORMAT
public static final byte FORMAT
PRIVATE_USE
public static final byte PRIVATE_USE
SURROGATE
public static final byte SURROGATE
DASH_PUNCTUATION
public static final byte DASH_PUNCTUATION
START_PUNCTUATION
public static final byte START_PUNCTUATION
END_PUNCTUATION
public static final byte END_PUNCTUATION
CONNECTOR_PUNCTUATION
public static final byte CONNECTOR_PUNCTUATION
OTHER_PUNCTUATION
public static final byte OTHER_PUNCTUATION
MATH_SYMBOL
public static final byte MATH_SYMBOL
CURRENCY_SYMBOL
public static final byte CURRENCY_SYMBOL
MODIFIER_SYMBOL
public static final byte MODIFIER_SYMBOL
OTHER_SYMBOL
public static final byte OTHER_SYMBOL
Character
public Character(char value)
Constructs a Character object and initializes it so
that it represents the primitive value argument.
Compares this object against the specified object.
The result is true if and only if the argument is not
null and is a Character object that
represents the same char value as this object.
Parameters:
obj - the object to compare with.
Returns:
true if the objects are the same;
false otherwise.
Returns a String object representing this character's value.
Converts this Character object to a string. The
result is a string whose length is 1. The string's
sole component is the primitive char value represented
by this object.
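The equals and toString behavior described above can be sketched as follows. The class CharacterBoxing and the helper sameValue are hypothetical. The example obtains wrappers through Character.valueOf, a factory method from later Java releases, rather than the constructor documented here; that choice is an assumption made so the sketch compiles cleanly on current JDKs.

```java
public class CharacterBoxing {
    // Hypothetical helper: wraps two chars and compares the wrappers with equals.
    static boolean sameValue(char a, char b) {
        Character left = Character.valueOf(a);
        Character right = Character.valueOf(b);
        return left.equals(right);
    }

    public static void main(String[] args) {
        Character c = Character.valueOf('x');
        System.out.println(sameValue('x', 'x'));   // same char value
        System.out.println(sameValue('x', 'y'));   // different char value
        System.out.println(c.equals(null));        // null is never equal
        System.out.println(c.equals("x"));         // a String is not a Character
        System.out.println(c.toString().length()); // the string has length 1
    }
}
```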
isLowerCase
public static boolean isLowerCase(char ch)
Determines if the specified character is a lowercase character.
A character is lowercase if it is not in the range
'\u2000' through '\u2FFF', the Unicode
attribute table does not specify a mapping to lowercase for the
character, and at least one of the following is true:
The attribute table specifies a mapping to uppercase for the
character.
The name for the character contains the words "SMALL
LETTER".
The name for the character contains the words "SMALL
LIGATURE".
A character is considered to be lowercase if and only if
it is specified to be lowercase by the Unicode 2.0 standard
(category "Ll" in the Unicode specification data file).
Of the ISO-LATIN-1 characters (character codes 0x0000 through 0x00FF),
the following are lowercase:
a b c d e f g h i j k l m n o p q r s t u v w x y z
\u00DF \u00E0 \u00E1 \u00E2 \u00E3 \u00E4 \u00E5 \u00E6 \u00E7
\u00E8 \u00E9 \u00EA \u00EB \u00EC \u00ED \u00EE \u00EF \u00F0
\u00F1 \u00F2 \u00F3 \u00F4 \u00F5 \u00F6 \u00F8 \u00F9 \u00FA
\u00FB \u00FC \u00FD \u00FE \u00FF
Many other Unicode characters are lowercase, too.
Parameters:
ch - the character to be tested.
Returns:
true if the character is lowercase;
false otherwise.
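A minimal sketch of isLowerCase in use, counting the lowercase characters in a string. The class LowerCaseCheck and the helper countLowerCase are hypothetical. A current JDK uses newer Unicode tables than the Unicode 2.0 data described here, so results for characters other than those shown may differ.

```java
public class LowerCaseCheck {
    // Hypothetical helper: counts the chars in s for which isLowerCase is true.
    static int countLowerCase(String s) {
        int count = 0;
        for (int i = 0; i < s.length(); i++) {
            if (Character.isLowerCase(s.charAt(i))) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        System.out.println(countLowerCase("aBc"));
        // Sharp s is in the lowercase list above; the division sign is not.
        System.out.println(countLowerCase("\u00DF\u00F7"));
    }
}
```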
isUpperCase
public static boolean isUpperCase(char ch)
Determines if the specified character is an uppercase character.
A character is uppercase if it is not in the range
'\u2000' through '\u2FFF', the Unicode
attribute table does not specify a mapping to uppercase for the
character, and at least one of the following is true:
The attribute table specifies a mapping to lowercase for the
character.
The name for the character contains the words
"CAPITAL LETTER".
The name for the character contains the words
"CAPITAL LIGATURE".
Of the ISO-LATIN-1 characters (character codes 0x0000 through 0x00FF),
the following are uppercase:
A B C D E F G H I J K L M N O P Q R S T U V W X Y Z
\u00C0 \u00C1 \u00C2 \u00C3 \u00C4 \u00C5 \u00C6 \u00C7
\u00C8 \u00C9 \u00CA \u00CB \u00CC \u00CD \u00CE \u00CF \u00D0
\u00D1 \u00D2 \u00D3 \u00D4 \u00D5 \u00D6 \u00D8 \u00D9 \u00DA
\u00DB \u00DC \u00DD \u00DE
Many other Unicode characters are uppercase, too.
Parameters:
ch - the character to be tested.
Returns:
true if the character is uppercase;
false otherwise.
isTitleCase
public static boolean isTitleCase(char ch)
Determines if the specified character is a titlecase character.
A character is considered to be titlecase if and only if
it is specified to be titlecase by the Unicode 2.0 standard
(category "Lt" in the Unicode specification data file).
The printed representations of four Unicode characters look like
pairs of Latin letters. For example, there is an uppercase letter
that looks like "LJ" and has a corresponding lowercase letter that
looks like "lj". A third form, which looks like "Lj",
is the appropriate form to use when rendering a word in lowercase
with initial capitals, as for a book title.
These are the Unicode characters for which this method returns
true:
LATIN CAPITAL LETTER D WITH SMALL LETTER Z WITH CARON
LATIN CAPITAL LETTER L WITH SMALL LETTER J
LATIN CAPITAL LETTER N WITH SMALL LETTER J
LATIN CAPITAL LETTER D WITH SMALL LETTER Z
Parameters:
ch - the character to be tested.
Returns:
true if the character is titlecase;
false otherwise.
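To make the three case categories concrete, the sketch below classifies a character with isTitleCase, isUpperCase, and isLowerCase, using the DZ WITH CARON family as sample input. The class CaseKind and the helper describe are hypothetical names for this example only.

```java
public class CaseKind {
    // Hypothetical helper: names the case category of a character.
    static String describe(char ch) {
        if (Character.isTitleCase(ch)) return "title";
        if (Character.isUpperCase(ch)) return "upper";
        if (Character.isLowerCase(ch)) return "lower";
        return "none";
    }

    public static void main(String[] args) {
        System.out.println(describe('\u01C4')); // LATIN CAPITAL LETTER DZ WITH CARON
        System.out.println(describe('\u01C5')); // LATIN CAPITAL LETTER D WITH SMALL LETTER Z WITH CARON
        System.out.println(describe('\u01C6')); // LATIN SMALL LETTER DZ WITH CARON
        System.out.println(describe('1'));      // a digit has no case
    }
}
```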
Determines if the specified character is a digit.
A character is considered to be a digit if it is not in the range
'\u2000' <= ch <= '\u2FFF'
and its Unicode name contains the word
"DIGIT". For a more complete
specification that encompasses all Unicode characters that are
defined as digits, see Gosling, Joy, and Steele, The Java
Language Specification.
These are the ranges of Unicode characters that are considered digits:
0x0030 through 0x0039
ISO-LATIN-1 digits ('0' through '9')
0x0660 through 0x0669
Arabic-Indic digits
0x06F0 through 0x06F9
Extended Arabic-Indic digits
0x0966 through 0x096F
Devanagari digits
0x09E6 through 0x09EF
Bengali digits
0x0A66 through 0x0A6F
Gurmukhi digits
0x0AE6 through 0x0AEF
Gujarati digits
0x0B66 through 0x0B6F
Oriya digits
0x0BE7 through 0x0BEF
Tamil digits
0x0C66 through 0x0C6F
Telugu digits
0x0CE6 through 0x0CEF
Kannada digits
0x0D66 through 0x0D6F
Malayalam digits
0x0E50 through 0x0E59
Thai digits
0x0ED0 through 0x0ED9
Lao digits
0x0F20 through 0x0F29
Tibetan digits
0xFF10 through 0xFF19
Fullwidth digits
Parameters:
ch - the character to be tested.
Returns:
true if the character is a digit;
false otherwise.
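Because isDigit accepts digits from many scripts, a check built on it accepts more than '0' through '9'. The sketch below illustrates this; the class DigitCheck and the helper allDigits are hypothetical.

```java
public class DigitCheck {
    // Hypothetical helper: true if s is non-empty and every char is a digit per isDigit.
    static boolean allDigits(String s) {
        if (s.isEmpty()) {
            return false;
        }
        for (int i = 0; i < s.length(); i++) {
            if (!Character.isDigit(s.charAt(i))) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(allDigits("2024"));               // ISO-LATIN-1 digits
        System.out.println(allDigits("\u0661\u0662\u0663")); // Arabic-Indic digits
        System.out.println(allDigits("\uFF11\uFF12"));       // Fullwidth digits
        System.out.println(allDigits("12a"));                // contains a letter
    }
}
```

Code that needs ASCII digits only should compare against '0' and '9' directly instead.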
Determines if the specified character is a letter. For a
more complete specification that encompasses all Unicode
characters, see Gosling, Joy, and Steele, The Java Language
Specification.
A character is considered to be a letter if and only if
it is specified to be a letter by the Unicode 2.0 standard
(category "Lu", "Ll", "Lt", "Lm", or "Lo" in the Unicode
specification data file).
Note that most ideographic characters are considered
to be letters (category "Lo") for this purpose.
Note also that not all letters have case: many Unicode characters are
letters but are neither uppercase nor lowercase nor titlecase.
Parameters:
ch - the character to be tested.
Returns:
true if the character is a letter;
false otherwise.
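The point that a letter need not have case can be checked directly. In this sketch the class LetterCheck and the helper isCaselessLetter are hypothetical; the sample characters are a CJK ideograph and a Hebrew letter, both in category "Lo".

```java
public class LetterCheck {
    // Hypothetical helper: true for letters that are neither upper-, lower-, nor titlecase.
    static boolean isCaselessLetter(char ch) {
        return Character.isLetter(ch)
                && !Character.isUpperCase(ch)
                && !Character.isLowerCase(ch)
                && !Character.isTitleCase(ch);
    }

    public static void main(String[] args) {
        System.out.println(isCaselessLetter('\u4E2D')); // CJK ideograph
        System.out.println(isCaselessLetter('\u05D0')); // HEBREW LETTER ALEF
        System.out.println(isCaselessLetter('a'));      // lowercase letter
        System.out.println(isCaselessLetter('1'));      // not a letter
    }
}
```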
isLetterOrDigit
public static boolean isLetterOrDigit(char ch)
Determines if the specified character is a letter or digit.
For a more complete specification that encompasses all Unicode
characters, see Gosling, Joy, and Steele, The Java Language
Specification.
A character is considered to be a letter or digit if and only if
it is specified to be a letter or a digit by the Unicode 2.0 standard
(category "Lu", "Ll", "Lt", "Lm", "Lo", or "Nd" in the Unicode
specification data file). In other words, isLetterOrDigit is true
of a character if and only if either isLetter is true of the character
or isDigit is true of the character.
Parameters:
ch - the character to be tested.
Returns:
true if the character is a letter or digit;
false otherwise.
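The stated equivalence between isLetterOrDigit and the pair isLetter and isDigit can be spot-checked over every char value. This is only a sketch; the class LetterOrDigitCheck and the helper countDisagreements are hypothetical.

```java
public class LetterOrDigitCheck {
    // Hypothetical helper: counts chars where isLetterOrDigit differs from
    // (isLetter || isDigit). The loop variable is an int so it can pass 0xFFFF.
    static int countDisagreements() {
        int disagreements = 0;
        for (int code = 0; code <= 0xFFFF; code++) {
            char ch = (char) code;
            boolean combined = Character.isLetterOrDigit(ch);
            boolean separate = Character.isLetter(ch) || Character.isDigit(ch);
            if (combined != separate) {
                disagreements++;
            }
        }
        return disagreements;
    }

    public static void main(String[] args) {
        System.out.println("disagreements: " + countDisagreements());
    }
}
```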
isJavaLetter
public static boolean isJavaLetter(char ch)
Note: isJavaLetter() is deprecated. Replaced by isJavaIdentifierStart(char).
Determines if the specified character is a
"Java" letter, that is, the character is permissible
as the first character in an identifier in the Java language.
A character is considered to be a Java letter if and only if it
is a letter, the ASCII dollar sign character '$', or
the underscore character '_'.
Parameters:
ch - the character to be tested.
Returns:
true if the character is a Java letter;
false otherwise.
Note: isJavaLetterOrDigit() is deprecated. Replaced by isJavaIdentifierPart(char).
Determines if the specified character is a
"Java" letter or digit, that is, the character is
permissible as a non-initial character in an identifier in the
Java language.
A character is considered to be a Java letter or digit if and
only if it is a letter, a digit, the ASCII dollar sign character
'$', or the underscore character '_'.
Parameters:
ch - the character to be tested.
Returns:
true if the character is a Java letter or digit;
false otherwise.
Determines if the specified character is
permissible as the first character in a Java identifier.
A character may start a Java identifier if and only if
it is one of the following:
a letter
a currency symbol (such as "$")
a connecting punctuation character (such as "_").
Parameters:
ch - the character to be tested.
Returns:
true if the character may start a Java identifier;
false otherwise.
Determines if the specified character may be part of a Java
identifier as other than the first character.
A character may be part of a Java identifier if and only if
it is one of the following:
a letter
a currency symbol (such as "$")
a connecting punctuation character (such as "_").
a digit
a numeric letter (such as a Roman numeral character)
a combining mark
a non-spacing mark
an ignorable control character
Parameters:
ch - the character to be tested.
Returns:
true if the character may be part of a Java identifier;
false otherwise.
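Taken together, isJavaIdentifierStart and isJavaIdentifierPart give a simple way to test whether a string has the shape of a Java identifier. The following is a minimal sketch; the class JavaIdentifierCheck and the helper isJavaIdentifier are hypothetical, and the check does not reject reserved words such as "class".

```java
public class JavaIdentifierCheck {
    // Hypothetical helper: true if s is non-empty, starts with a valid start
    // character, and continues with valid part characters. Keywords are not rejected.
    static boolean isJavaIdentifier(String s) {
        if (s.isEmpty() || !Character.isJavaIdentifierStart(s.charAt(0))) {
            return false;
        }
        for (int i = 1; i < s.length(); i++) {
            if (!Character.isJavaIdentifierPart(s.charAt(i))) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(isJavaIdentifier("_count1")); // connector, letters, digit
        System.out.println(isJavaIdentifier("$value"));  // currency symbol start
        System.out.println(isJavaIdentifier("1abc"));    // a digit may not start
        System.out.println(isJavaIdentifier("a-b"));     // '-' is not permitted
    }
}
```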
Determines if the specified character is
permissible as the first character in a Unicode identifier.
A character may start a Unicode identifier if and only if
it is a letter.
Parameters:
ch - the character to be tested.
Returns:
true if the character may start a Unicode identifier;
false otherwise.
Determines if the specified character may be part of a Unicode
identifier as other than the first character.
A character may be part of a Unicode identifier if and only if
it is one of the following:
a letter
a connecting punctuation character (such as "_").
a digit
a numeric letter (such as a Roman numeral character)
a combining mark
a non-spacing mark
an ignorable control character
Parameters:
ch - the character to be tested.
Returns:
true if the character may be part of a Unicode identifier;
false otherwise.
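The Unicode identifier rules differ from the Java rules mainly in the treatment of '$' and of a leading '_'. The sketch below applies isUnicodeIdentifierStart and isUnicodeIdentifierPart to whole strings; the class UnicodeIdentifierCheck and the helper isUnicodeIdentifier are hypothetical.

```java
public class UnicodeIdentifierCheck {
    // Hypothetical helper: true if s is non-empty and satisfies the Unicode
    // identifier start and part rules character by character.
    static boolean isUnicodeIdentifier(String s) {
        if (s.isEmpty() || !Character.isUnicodeIdentifierStart(s.charAt(0))) {
            return false;
        }
        for (int i = 1; i < s.length(); i++) {
            if (!Character.isUnicodeIdentifierPart(s.charAt(i))) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(isUnicodeIdentifier("abc1")); // letter start, digit part
        System.out.println(isUnicodeIdentifier("a_b"));  // connector allowed as a part
        System.out.println(isUnicodeIdentifier("_abc")); // connector may not start
        System.out.println(isUnicodeIdentifier("a$b"));  // currency symbol not allowed
    }
}
```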
Determines if the specified character should be regarded as
an ignorable character in a Java identifier or a Unicode identifier.
The following Unicode characters are ignorable in a Java identifier
or a Unicode identifier:
0x0000 through 0x0008, 0x000E through 0x001B, and 0x007F through 0x009F
ISO control characters that are not whitespace
0x200C through 0x200F
join controls
0x202A through 0x202E
bidirectional controls
0x206A through 0x206F
format controls
0xFEFF
zero-width no-break space
Parameters:
ch - the character to be tested.
Returns:
true if the character is an ignorable character in a Java or Unicode identifier;
false otherwise.
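One possible use of isIdentifierIgnorable is to drop ignorable characters before comparing identifiers. The sketch below does this; the class IgnorableStrip and the helper stripIgnorable are hypothetical names.

```java
public class IgnorableStrip {
    // Hypothetical helper: returns s without the characters that
    // isIdentifierIgnorable reports as ignorable.
    static String stripIgnorable(String s) {
        StringBuilder kept = new StringBuilder(s.length());
        for (int i = 0; i < s.length(); i++) {
            char ch = s.charAt(i);
            if (!Character.isIdentifierIgnorable(ch)) {
                kept.append(ch);
            }
        }
        return kept.toString();
    }

    public static void main(String[] args) {
        // A null control character and a zero-width no-break space are removed.
        System.out.println(stripIgnorable("a\u0000b\uFEFFc"));
        // A tab is whitespace, so it is not ignorable and is kept.
        System.out.println(stripIgnorable("a\tb").length());
    }
}
```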
toLowerCase
public static char toLowerCase(char ch)
The given character is mapped to its lowercase equivalent; if the
character has no lowercase equivalent, the character itself is
returned.
A character has a lowercase equivalent if and only if a lowercase
mapping is specified for the character in the Unicode attribute
table.
Note that some Unicode characters in the range
'\u2000' to '\u2FFF' have lowercase
mappings; this method does map such characters to their lowercase
equivalents even though the method isUpperCase does
not return true for such characters.
Parameters:
ch - the character to be converted.
Returns:
the lowercase equivalent of the character, if any;
otherwise the character itself.
toUpperCase
public static char toUpperCase(char ch)
Converts the character argument to uppercase. A character has an
uppercase equivalent if and only if an uppercase mapping is
specified for the character in the Unicode attribute table.
Note that some Unicode characters in the range
'\u2000' through '\u2FFF' have uppercase
mappings; this method does map such characters to their uppercase
equivalents even though the method isLowerCase does
not return true for such characters.
Parameters:
ch - the character to be converted.
Returns:
the uppercase equivalent of the character, if any;
otherwise the character itself.
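A short sketch of toLowerCase and toUpperCase follows, including a Roman numeral pair from the '\u2000' through '\u2FFF' range mentioned above. The class CaseFlip and the helper flip are hypothetical.

```java
public class CaseFlip {
    // Hypothetical helper: swaps the case of a cased character and
    // returns any other character unchanged.
    static char flip(char ch) {
        if (Character.isUpperCase(ch)) {
            return Character.toLowerCase(ch);
        }
        if (Character.isLowerCase(ch)) {
            return Character.toUpperCase(ch);
        }
        return ch;
    }

    public static void main(String[] args) {
        System.out.println(flip('a'));
        System.out.println(flip('Q'));
        System.out.println(flip('7')); // no case mapping, returned as is
        // SMALL ROMAN NUMERAL ONE maps to ROMAN NUMERAL ONE and back.
        System.out.println(Character.toUpperCase('\u2170') == '\u2160');
        System.out.println(Character.toLowerCase('\u2160') == '\u2170');
    }
}
```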
toTitleCase
public static char toTitleCase(char ch)
Converts the character argument to titlecase. A character has a
titlecase equivalent if and only if a titlecase mapping is
specified for the character in the Unicode attribute table.
Note that some Unicode characters in the range
'\u2000' through '\u2FFF' have titlecase
mappings; this method does map such characters to their titlecase
equivalents even though the method isTitleCase does
not return true for such characters.
There are only four Unicode characters that are truly titlecase forms
that are distinct from uppercase forms. As a rule, if a character has no
true titlecase equivalent but does have an uppercase mapping, then the
Unicode 2.0 attribute table specifies a titlecase mapping that is the
same as the uppercase mapping.
Parameters:
ch - the character to be converted.
Returns:
the titlecase equivalent of the character, if any;
otherwise the character itself.
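As an illustration of the book-title use case, the sketch below titlecases the first character of a word and lowercases the rest. The class TitleWord and the helper titleWord are hypothetical; the helper works char by char and ignores locale-specific rules.

```java
public class TitleWord {
    // Hypothetical helper: first char to titlecase, remaining chars to lowercase.
    static String titleWord(String word) {
        if (word.isEmpty()) {
            return word;
        }
        StringBuilder out = new StringBuilder(word.length());
        out.append(Character.toTitleCase(word.charAt(0)));
        for (int i = 1; i < word.length(); i++) {
            out.append(Character.toLowerCase(word.charAt(i)));
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(titleWord("hELLO"));
        // For most letters the titlecase mapping equals the uppercase mapping.
        System.out.println(Character.toTitleCase('a') == 'A');
        // Both the uppercase and lowercase DZ WITH CARON map to the titlecase form.
        System.out.println(Character.toTitleCase('\u01C4') == '\u01C5');
        System.out.println(Character.toTitleCase('\u01C6') == '\u01C5');
    }
}
```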
Returns the numeric value of the character ch in the
specified radix.
If the radix is not in the range MIN_RADIX <=
radix <= MAX_RADIX or if the
value of ch is not a valid digit in the specified
radix, -1 is returned. A character is a valid digit
if at least one of the following is true:
The method isDigit is true of the character
and the Unicode decimal digit value of the character (or its
single-character decomposition) is less than the specified radix.
In this case the decimal digit value is returned.
The character is one of the uppercase Latin letters
'A' through 'Z' and its code is less than
radix + 'A' - 10.
In this case, ch - 'A' + 10
is returned.
The character is one of the lowercase Latin letters
'a' through 'z' and its code is less than
radix + 'a' - 10.
In this case, ch - 'a' + 10
is returned.
Parameters:
ch - the character to be converted.
radix - the radix.
Returns:
the numeric value represented by the character in the
specified radix.
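The rules above are enough to build a small number parser. The following is a minimal sketch, not how any library method is implemented; the class DigitParse and the helper parseUnsigned are hypothetical, and the helper performs no overflow checking.

```java
public class DigitParse {
    // Hypothetical helper: parses an unsigned number in the given radix using
    // Character.digit. An invalid radix makes digit return -1, which is reported
    // here as a NumberFormatException. Overflow is not detected.
    static int parseUnsigned(String s, int radix) {
        if (s.isEmpty()) {
            throw new NumberFormatException("empty input");
        }
        int value = 0;
        for (int i = 0; i < s.length(); i++) {
            int d = Character.digit(s.charAt(i), radix);
            if (d < 0) {
                throw new NumberFormatException("bad digit at index " + i);
            }
            value = value * radix + d;
        }
        return value;
    }

    public static void main(String[] args) {
        System.out.println(parseUnsigned("ff", 16));
        System.out.println(parseUnsigned("FF", 16));
        System.out.println(parseUnsigned("777", 8));
        System.out.println(Character.digit('9', 8));  // 9 is not an octal digit
        System.out.println(Character.digit('1', 37)); // radix above MAX_RADIX
    }
}
```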
Returns the Unicode numeric value of the character as a
nonnegative integer.
If the character does not have a numeric value, then -1 is returned.
If the character has a numeric value that cannot be represented as a
nonnegative integer (for example, a fractional value), then -2
is returned.
Parameters:
ch - the character to be converted.
Returns:
the numeric value of the character, as a nonnegative int value;
-2 if the character has a numeric value that is not a
nonnegative integer; -1 if the character has no numeric value.
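The three kinds of result (a value, -1, and -2) can be told apart as in the sketch below. The class NumericValueDemo and the helper describe are hypothetical; the sample characters are a decimal digit, ROMAN NUMERAL EIGHT, VULGAR FRACTION ONE HALF, and the dollar sign.

```java
public class NumericValueDemo {
    // Hypothetical helper: turns the result of getNumericValue into a short description.
    static String describe(char ch) {
        int value = Character.getNumericValue(ch);
        if (value == -1) {
            return "no numeric value";
        }
        if (value == -2) {
            return "not a nonnegative integer";
        }
        return Integer.toString(value);
    }

    public static void main(String[] args) {
        System.out.println(describe('7'));
        System.out.println(describe('\u2167')); // ROMAN NUMERAL EIGHT
        System.out.println(describe('\u00BD')); // VULGAR FRACTION ONE HALF
        System.out.println(describe('$'));
    }
}
```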
isSpaceChar
public static boolean isSpaceChar(char ch)
Determines if the specified character is a Unicode space character.
A character is considered to be a space character if and only if
it is specified to be a space character by the Unicode 2.0 standard
(category "Zs", "Zl", or "Zp" in the Unicode specification data file).
Parameters:
ch - the character to be tested.
Returns:
true if the character is a space character; false otherwise.
isWhitespace
public static boolean isWhitespace(char ch)
Determines if the specified character is white space according to Java.
A character is considered to be a Java whitespace character if and only
if it satisfies one of the following criteria:
It is a Unicode space separator (category "Zs"), but is not
a no-break space (\u00A0 or \uFEFF).
It is a Unicode line separator (category "Zl").
It is a Unicode paragraph separator (category "Zp").
It is \u0009, HORIZONTAL TABULATION.
It is \u000A, LINE FEED.
It is \u000B, VERTICAL TABULATION.
It is \u000C, FORM FEED.
It is \u000D, CARRIAGE RETURN.
It is \u001C, FILE SEPARATOR.
It is \u001D, GROUP SEPARATOR.
It is \u001E, RECORD SEPARATOR.
It is \u001F, UNIT SEPARATOR.
Parameters:
ch - the character to be tested.
Returns:
true if the character is a Java whitespace character;
false otherwise.
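isSpaceChar and isWhitespace disagree on a few characters, which the sketch below makes visible. The class WhitespaceCompare and the helper javaOnlyWhitespace are hypothetical names for this example.

```java
public class WhitespaceCompare {
    // Hypothetical helper: true for characters that are Java whitespace
    // but not Unicode space characters (for example tab and line feed).
    static boolean javaOnlyWhitespace(char ch) {
        return Character.isWhitespace(ch) && !Character.isSpaceChar(ch);
    }

    public static void main(String[] args) {
        System.out.println(javaOnlyWhitespace('\t')); // control character, Java whitespace only
        System.out.println(javaOnlyWhitespace('\n'));
        System.out.println(javaOnlyWhitespace(' '));  // both methods accept a plain space
        // A no-break space is a space character but not Java whitespace.
        System.out.println(Character.isSpaceChar('\u00A0'));
        System.out.println(Character.isWhitespace('\u00A0'));
    }
}
```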
isISOControl
public static boolean isISOControl(char ch)
Determines if the specified character is an ISO control character.
A character is considered to be an ISO control character if its
code is in the range \u0000 through \u001F or in the range
\u007F through \u009F.
Parameters:
ch - the character to be tested.
Returns:
true if the character is an ISO control character;
false otherwise.
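The two ranges above contain 32 and 33 code values respectively, which a brute-force count can confirm. This is only a sketch; the class IsoControlCount and the helper countIsoControls are hypothetical.

```java
public class IsoControlCount {
    // Hypothetical helper: counts the char values for which isISOControl is true.
    // The loop variable is an int so it can pass 0xFFFF without wrapping.
    static int countIsoControls() {
        int count = 0;
        for (int code = 0; code <= 0xFFFF; code++) {
            if (Character.isISOControl((char) code)) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        System.out.println(countIsoControls());            // 32 + 33 = 65
        System.out.println(Character.isISOControl(' '));   // 0x0020 lies just outside the first range
    }
}
```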
forDigit
public static char forDigit(int digit,
int radix)
Determines the character representation for a specific digit in
the specified radix. If the value of radix is not a
valid radix, or the value of digit is not a valid
digit in the specified radix, the null character
('\u0000') is returned.
The radix argument is valid if it is greater than or
equal to MIN_RADIX and less than or equal to
MAX_RADIX. The digit argument is valid if
0 <= digit < radix.
If the digit is less than 10, then
'0' + digit is returned. Otherwise, the value
'a' + digit - 10 is returned.
Parameters:
digit - the number to convert to a character.
radix - the radix.
Returns:
the char representation of the specified digit
in the specified radix.
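forDigit is the building block for rendering a number in an arbitrary radix. The following is a minimal sketch under that assumption; the class RadixFormat and the helper toRadixString are hypothetical and handle nonnegative values only.

```java
public class RadixFormat {
    // Hypothetical helper: formats a nonnegative value in the given radix by
    // repeatedly taking the lowest digit with forDigit, then reversing.
    static String toRadixString(int value, int radix) {
        if (value < 0 || radix < Character.MIN_RADIX || radix > Character.MAX_RADIX) {
            throw new IllegalArgumentException("unsupported value or radix");
        }
        if (value == 0) {
            return "0";
        }
        StringBuilder digits = new StringBuilder();
        while (value > 0) {
            digits.append(Character.forDigit(value % radix, radix));
            value /= radix;
        }
        return digits.reverse().toString();
    }

    public static void main(String[] args) {
        System.out.println(toRadixString(255, 16));
        System.out.println(toRadixString(5, 2));
        // A digit equal to the radix is out of range, so the null character is returned.
        System.out.println(Character.forDigit(16, 16) == '\u0000');
    }
}
```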