Primitive Data Types (Java in a Nutshell)

2.4. Primitive Data Types

Java supports eight basic data types known as primitive types. In addition, it supports classes and arrays as composite data types, or reference types. Classes and arrays are documented later in this chapter. The primitive types are: a boolean type, a character type, four integer types, and two floating-point types. The four integer types and the two floating-point types differ in the number of bits that represent them, and therefore in the range of numbers they can represent. Table 2-2 summarizes these primitive data types.

Table 2-2. Java Primitive Data Types

Type	Contains	Default	Size	Range
`boolean`	`true` or `false`	`false`	1 bit	NA
`char`	Unicode character	`\u0000`	16 bits	`\u0000` to `\uFFFF`
`byte`	Signed integer	0	8 bits	-128 to 127
`short`	Signed integer	0	16 bits	-32768 to 32767
`int`	Signed integer	0	32 bits	-2147483648 to 2147483647
`long`	Signed integer	0	64 bits	-9223372036854775808 to 9223372036854775807
`float`	IEEE 754 floating point	0.0	32 bits	±1.4E-45 to ±3.4028235E+38
`double`	IEEE 754 floating point	0.0	64 bits	±4.9E-324 to ±1.7976931348623157E+308

2.4.1. The boolean Type

The boolean type represents a truth value. There are only two possible values of this type, representing the two boolean states: on or off, yes or no, true or false. Java reserves the words true and false to represent these two boolean values.

C and C++ programmers should note that Java is quite strict about its boolean type: boolean values can never be converted to or from other data types. In particular, a boolean is not an integral type, and integer values cannot be used in place of a boolean. In other words, you cannot take shortcuts such as the following in Java:

if (o) {
  while(i) {
  }
}

Instead, Java forces you to write cleaner code by explicitly stating the comparisons you want:

if (o != null) {
  while(i != 0) {
  }
}

2.4.2. The char Type

The char type represents Unicode characters. It surprises many experienced programmers to learn that Java char values are 16 bits long, but in practice this fact is totally transparent. To include a character literal in a Java program, simply place it between single quotes (apostrophes):

char c = 'A';

You can, of course, use any Unicode character as a character literal, and you can use the \u Unicode escape sequence. In addition, Java supports a number of other escape sequences that make it easy both to represent commonly used nonprinting ASCII characters such as newline and to escape certain punctuation characters that have special meaning in Java. For example:

char tab = '\t', apostrophe = '\'', nul = '\000', aleph='\u05D0';

Table 2-3 lists the escape characters that can be used in char literals. These characters can also be used in string literals, which are covered later in this chapter.

Table 2-3. Java Escape Characters

Escape Sequence	Character Value
`\b`	Backspace
`\t`	Horizontal tab
`\n`	Newline
`\f`	Form feed
`\r`	Carriage return
`\"`	Double quote
`\'`	Single quote
`\\`	Backslash
`\`xxx	The Latin-1 character with the encoding xxx, where xxx is an octal (base 8) number between 000 and 377. The forms `\`x and `\`xx are also legal, as in `'\0'`, but are not recommended because they can cause difficulties in string constants where the escape sequence is followed by a regular digit.
`\u`xxxx	The Unicode character with encoding xxxx, where xxxx is four hexadecimal digits. Unicode escapes can appear anywhere in a Java program, not only in character and string literals.

char values can be converted to and from the various integral types. Unlike byte, short, int, and long, however, char is an unsigned type. The Character class defines a number of useful static methods for working with characters, including isDigit(), isLetter(), isLowerCase(), and toUpperCase().
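As a small illustration of the conversions and Character methods just described, consider the following sketch. The CharDemo class and its method names are assumptions made for this example; only the Character methods come from the Java platform.

```java
class CharDemo {
    // Widening conversion: a char converts to an int automatically,
    // yielding its (unsigned) Unicode encoding.
    static int codeOf(char c) {
        return c;
    }

    // char arithmetic is performed as int arithmetic, so the result
    // must be cast back to char explicitly.
    static char shift(char c, int offset) {
        return (char) (c + offset);
    }

    public static void main(String[] args) {
        System.out.println(codeOf('A'));                 // 65
        System.out.println(shift('A', 2));               // C
        System.out.println(Character.isDigit('7'));      // true
        System.out.println(Character.isLowerCase('Q'));  // false
        System.out.println(Character.toUpperCase('q'));  // Q
    }
}
```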

2.4.3. Integer Types

The integer types in Java are byte, short, int, and long. As shown in Table 2-2, these four types differ only in the number of bits and, therefore, in the range of numbers each type can represent. All integral types represent signed numbers; there is no unsigned keyword as there is in C and C++.

Literals for each of these types are written exactly as you would expect: as a string of decimal digits. Although it is not technically part of the literal syntax, any integer literal can be preceded by the unary minus operator to indicate a negative number. Here are some legal integer literals:

0
1
123
-42000

Integer literals can also be expressed in hexadecimal or octal notation. A literal that begins with 0x or 0X is taken as a hexadecimal number, using the letters A to F (or a to f) as the additional digits required for base-16 numbers. Integer literals beginning with a leading 0 are taken to be octal (base-8) numbers and cannot include the digits 8 or 9. Since Java 7, integer literals can also be written in binary (base-2) notation with a 0b or 0B prefix; earlier versions of Java do not allow this. Legal hexadecimal and octal literals include:

0xff              // Decimal 255, expressed in hexadecimal
0377              // The same number, expressed in octal (base 8)
0xCAFEBABE        // A magic number used to identify Java class files

Integer literals are 32-bit int values unless they end with the character L or l, in which case they are 64-bit long values:

1234        // An int value
1234L       // A long value
0xffL       // Another long value
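The following sketch shows how these literal forms relate to one another. The LiteralDemo class and its constant names are hypothetical, chosen only for this example.

```java
class LiteralDemo {
    static final int HEX = 0xff;     // 255 written in hexadecimal
    static final int OCTAL = 0377;   // 255 written in octal
    static final long WIDE = 0xffL;  // the same value as a 64-bit long

    // 0xCAFEBABE has its highest bit set, so as a 32-bit int it is negative.
    static final int MAGIC = 0xCAFEBABE;

    // A decimal literal too large for an int needs the L suffix;
    // writing 2147483648 without it is a compile-time error.
    static final long TOO_BIG_FOR_INT = 2147483648L;

    public static void main(String[] args) {
        System.out.println(HEX == OCTAL);   // true
        System.out.println(HEX == WIDE);    // true (the int is widened to long)
        System.out.println(MAGIC);          // -889275714
    }
}
```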

Integer arithmetic in Java is modular, which means that it never reports an overflow or an underflow when a result exceeds the range of a given integer type. Instead, numbers just wrap around. Note that the operands of + are promoted to int, so the result must be cast back to byte. For example:

byte b1 = 127, b2 = 1;         // Largest byte is 127
byte sum = (byte) (b1 + b2);   // Sum wraps to -128, which is the smallest byte

Neither the Java compiler nor the Java interpreter warns you in any way when this occurs. When doing integer arithmetic, you simply must ensure that the type you are using has a sufficient range for the purposes you intend. Integer division by zero and modulo by zero are illegal and cause an ArithmeticException to be thrown.
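The behavior described above can be sketched as follows. The OverflowDemo class and its methods are hypothetical helpers written for this example; widening to long before adding is just one possible way to obtain a sufficient range.

```java
class OverflowDemo {
    // Plain int addition: wraps around silently on overflow.
    static int wrappingAdd(int a, int b) {
        return a + b;
    }

    // Widening one operand to long first gives the sum enough range.
    static long widenedAdd(int a, int b) {
        return (long) a + b;
    }

    // Reports whether the integer division a / b throws ArithmeticException.
    static boolean divisionThrows(int a, int b) {
        try {
            int ignored = a / b;
            return false;
        } catch (ArithmeticException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(wrappingAdd(Integer.MAX_VALUE, 1) == Integer.MIN_VALUE); // true
        System.out.println(widenedAdd(Integer.MAX_VALUE, 1));                       // 2147483648
        System.out.println(divisionThrows(1, 0));                                   // true
    }
}
```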

Each integer type has a corresponding wrapper class: Byte, Short, Integer, and Long. Each of these classes defines MIN_VALUE and MAX_VALUE constants that describe the range of the type. The classes also define useful static methods, such as Byte.parseByte() and Integer.parseInt(), for converting strings to integer values.
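A brief sketch of how these wrapper classes might be used follows. The WrapperDemo class and its parseOrDefault() helper are assumptions introduced for illustration; Integer.parseInt() and the MIN_VALUE and MAX_VALUE constants are the platform features being shown.

```java
class WrapperDemo {
    // Parses a decimal string as an int. Integer.parseInt() throws
    // NumberFormatException for malformed or out-of-range input, so this
    // hypothetical helper returns a caller-supplied fallback instead.
    static int parseOrDefault(String text, int fallback) {
        try {
            return Integer.parseInt(text);
        } catch (NumberFormatException e) {
            return fallback;
        }
    }

    public static void main(String[] args) {
        System.out.println(Byte.MIN_VALUE + " to " + Byte.MAX_VALUE);  // -128 to 127
        System.out.println(Short.MAX_VALUE);                           // 32767
        System.out.println(parseOrDefault("42", -1));                  // 42
        System.out.println(parseOrDefault("4x2", -1));                 // -1
    }
}
```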

2.4.4. Floating-Point Types

Real numbers in Java are represented with the float and double data types. As shown in Table 2-2, float is a 32-bit, single-precision floating-point value, and double is a 64-bit, double-precision floating-point value. Both types adhere to the IEEE 754-1985 standard, which specifies both the format of the numbers and the behavior of arithmetic for the numbers.

Floating-point values can be included literally in a Java program as an optional string of digits, followed by a decimal point and another string of digits. Here are some examples:

123.45
0.0
.01

Floating-point literals can also use exponential, or scientific, notation, in which a number is followed by the letter e or E (for exponent) and another number. This second number represents the power of ten by which the first number is multiplied. For example:

1.2345E02      // 1.2345 * 10^2, or 123.45
1e-6           // 1 * 10^-6, or 0.000001
6.02e23        // Avogadro's number: 6.02 * 10^23

Floating-point literals are double values by default. To include a float value literally in a program, follow the number by the character f or F:

double d = 6.02E23;
float f = 6.02e23f;

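Floating-point literals cannot be expressed in octal notation. Since Java 5, they can be written in hexadecimal with a binary exponent (for example, 0x1.8p1, which is 3.0), although this form is rarely needed.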

Most real numbers, by their very nature, cannot be represented exactly in any finite number of bits. Thus, it is important to remember that float and double values are only approximations of the numbers they are meant to represent. A float is a 32-bit approximation, which results in at least 6 significant decimal digits, and a double is a 64-bit approximation, which results in at least 15 significant digits. In practice, these data types are suitable for most real-number computations.
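To make the point about approximation concrete, here is a minimal sketch. The ApproxDemo class, the nearlyEqual() helper, and the tolerance of 1e-9 are all assumptions chosen for this example; an appropriate tolerance depends on the computation.

```java
class ApproxDemo {
    // Compares two doubles within an absolute tolerance rather than
    // requiring the bit patterns to match exactly.
    static boolean nearlyEqual(double a, double b, double tolerance) {
        return Math.abs(a - b) <= tolerance;
    }

    public static void main(String[] args) {
        double sum = 0.1 + 0.2;   // neither 0.1 nor 0.2 is exactly representable

        System.out.println(sum == 0.3);                   // false
        System.out.println(nearlyEqual(sum, 0.3, 1e-9));  // true
    }
}
```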

In addition to representing ordinary numbers, the float and double types can also represent four special values: positive and negative infinity, zero, and NaN. The infinity values result when a floating-point computation produces a value that overflows the representable range of a float or double. When a floating-point computation underflows the representable range of a float or a double, a zero value results. The Java floating-point types make a distinction between positive zero and negative zero, depending on the direction from which the underflow occurred. In practice, positive and negative zero behave pretty much the same. Finally, the last special floating-point value is NaN, which stands for not-a-number. The NaN value results when an illegal floating-point operation, such as 0.0/0.0, is performed. Note that the operands must be floating-point values: the integer expression 1/0 throws an ArithmeticException. Here are examples of statements that result in these special values:

double inf = 1.0/0.0;         // Infinity
double neginf = -1.0/0.0;     // -Infinity
double negzero = -1.0/inf;    // Negative zero
double NaN = 0.0/0.0;         // NaN

Because the Java floating-point types can handle overflow to infinity and underflow to zero and have a special NaN value, floating-point arithmetic never throws exceptions, even when performing illegal operations, like dividing zero by zero or taking the square root of a negative number.

The float and double primitive types have corresponding classes, named Float and Double. Each of these classes defines the following useful constants: MIN_VALUE, MAX_VALUE, NEGATIVE_INFINITY, POSITIVE_INFINITY, and NaN.

The infinite floating-point values behave as you would expect. Adding or subtracting any finite value to or from infinity, for example, yields infinity. Negative zero behaves almost identically to positive zero, and, in fact, the == equality operator reports that negative zero is equal to positive zero. One way to distinguish negative zero from positive, or regular, zero is to divide by it: 1.0/0.0 yields positive infinity, but 1.0 divided by negative zero yields negative infinity. Finally, since NaN is not-a-number, the == operator says that it is not equal to any other number, including itself! To check whether a float or double value is NaN, use the Float.isNaN() and Double.isNaN() methods.
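The following sketch exercises these special values. The SpecialValuesDemo class and its isNegativeZero() helper are hypothetical, written for this example; the helper uses the division technique described above.

```java
class SpecialValuesDemo {
    // Distinguishes -0.0 from 0.0 by dividing by it: the two zeros compare
    // equal with ==, but 1.0 / -0.0 is negative infinity.
    static boolean isNegativeZero(double d) {
        return d == 0.0 && 1.0 / d == Double.NEGATIVE_INFINITY;
    }

    public static void main(String[] args) {
        double inf = 1.0 / 0.0;
        double nan = 0.0 / 0.0;
        double negZero = -1.0 / inf;

        System.out.println(inf == Double.POSITIVE_INFINITY);  // true
        System.out.println(inf + 1000.0 == inf);              // true
        System.out.println(negZero == 0.0);                   // true
        System.out.println(isNegativeZero(negZero));          // true
        System.out.println(nan == nan);                       // false
        System.out.println(Double.isNaN(nan));                // true
        System.out.println(Double.isNaN(Math.sqrt(-1.0)));    // true
    }
}
```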

2.4.5. Strings

In addition to the boolean, character, integer, and floating-point data types, Java also has a data type for working with strings of text (usually simply called strings). The String type is a class, however, and is not one of the primitive types of the language. Because strings are so commonly used, though, Java does have a syntax for including string values literally in a program. A String literal consists of arbitrary text within double quotes. For example:

"Hello, world"
"'This' is a string!"

String literals can contain any of the escape sequences that can appear as char literals (see Table 2-3). Use the \" sequence to include a double-quote within a String literal. Strings and string literals are discussed in more detail later in this chapter. Chapter 4, "The Java Platform", demonstrates some of the ways you can work with String objects in Java.

2.4.6. Type Conversions

Java allows conversions between integer values and floating-point values. In addition, because every character corresponds to a number in the Unicode encoding, char types can be converted to and from the integer and floating-point types. In fact, boolean is the only primitive type that cannot be converted to or from another primitive type in Java.

There are two basic types of conversions. A widening conversion occurs when a value of one type is converted to a wider type--one that is represented with more bits and therefore has a wider range of legal values. A narrowing conversion occurs when a value is converted to a type that is represented with fewer bits. Java performs widening conversions automatically when, for example, you assign an int literal to a double variable or a char literal to an int variable.

Narrowing conversions are another matter, however, and are not always safe. It is reasonable to convert the integer value 13 to a byte, for example, but it is not reasonable to convert 13000 to a byte, since byte can only hold numbers between -128 and 127. Because you can lose data in a narrowing conversion, the Java compiler complains when you attempt any narrowing conversion, even if the value being converted would in fact fit in the narrower range of the specified type:

int i = 13;
byte b = i;    // The compiler does not allow this

The one exception to this rule is that you can assign an integer literal (an int value) to a byte or short variable, if the literal falls within the range of the variable.

If you need to perform a narrowing conversion and are confident you can do so without losing data or precision, you can force Java to perform the conversion using a language construct known as a cast. Perform a cast by placing the name of the desired type in parentheses before the value to be converted. For example:

int i = 13;
byte b = (byte) i;   // Force the int to be converted to a byte
i = (int) 13.456;    // Force this double literal to the int 13

Casts of primitive types are most often used to convert floating-point values to integers. When you do this, the fractional part of the floating-point value is simply truncated (i.e., the floating-point value is rounded towards zero, not towards the nearest integer). The methods Math.round(), Math.floor(), and Math.ceil() perform other types of rounding.
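The difference between truncation and the rounding methods, and the data loss a narrowing cast can cause, can be sketched as follows. The RoundingDemo class and its method names are assumptions made for this example.

```java
class RoundingDemo {
    // A cast from double to int discards the fractional part
    // (rounds toward zero).
    static int truncate(double d) {
        return (int) d;
    }

    // A narrowing cast from int to byte keeps only the low 8 bits.
    static byte narrowToByte(int i) {
        return (byte) i;
    }

    public static void main(String[] args) {
        System.out.println(truncate(3.7));        // 3
        System.out.println(truncate(-3.7));       // -3
        System.out.println(Math.round(-3.7));     // -4
        System.out.println(Math.floor(-3.7));     // -4.0
        System.out.println(Math.ceil(-3.7));      // -3.0
        System.out.println(narrowToByte(13));     // 13
        System.out.println(narrowToByte(13000));  // -56
    }
}
```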

The char type acts like an integer type in most ways, so a char value can be used anywhere an int or long value is required. Recall, however, that the char type is unsigned, so it behaves differently than the short type, even though both of them are 16 bits wide:

short s = (short) 0xffff; // These bits represent the number -1
char c = '\uffff';        // The same bits, representing a Unicode character
int i1 = s;               // Converting the short to an int yields -1
int i2 = c;               // Converting the char to an int yields 65535

Table 2-4 is a grid that shows which primitive types can be converted to which other types and how the conversion is performed. The letter N in the table means that the conversion cannot be performed. The letter Y means that the conversion is a widening conversion and is therefore performed automatically and implicitly by Java. The letter C means that the conversion is a narrowing conversion and requires an explicit cast. Finally, the notation Y* means that the conversion is an automatic widening conversion, but that some of the least significant digits of the value may be lost by the conversion. This can happen when converting an int to a float, or a long to a float or double. The floating-point types have a larger range than the integer types, so any int or long can be represented by a float or double. However, the floating-point types are approximations of numbers and cannot always hold as many significant digits as the integer types.

Table 2-4. Java Primitive Type Conversions

Convert	Convert To:
From:	`boolean`	`byte`	`short`	`char`	`int`	`long`	`float`	`double`
`boolean`	-	N	N	N	N	N	N	N
`byte`	N	-	Y	C	Y	Y	Y	Y
`short`	N	C	-	C	Y	Y	Y	Y
`char`	N	C	C	-	Y	Y	Y	Y
`int`	N	C	C	C	-	Y	Y*	Y
`long`	N	C	C	C	C	-	Y*	Y*
`float`	N	C	C	C	C	C	-	Y
`double`	N	C	C	C	C	C	C	-
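The Y* entries in Table 2-4 can be illustrated with a short sketch. The PrecisionLossDemo class and its intSurvivesFloat() helper are hypothetical names used only for this example.

```java
class PrecisionLossDemo {
    // Returns true if the int converts to float without losing digits.
    // The float is converted back to long (not int) so that values near
    // Integer.MAX_VALUE are not clamped back to their original value.
    static boolean intSurvivesFloat(int value) {
        float f = value;          // automatic widening conversion (Y*)
        return (long) f == value;
    }

    public static void main(String[] args) {
        // A float has a 24-bit significand, so 2^24 + 1 cannot be held exactly.
        System.out.println(intSurvivesFloat(16777216));  // true
        System.out.println(intSurvivesFloat(16777217));  // false

        // A double has a 53-bit significand, so 2^53 + 1 is rounded as well.
        long big = 9007199254740993L;
        double d = big;                                  // automatic widening (Y*)
        System.out.println((long) d);                    // 9007199254740992
    }
}
```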

2.4.7. Reference Types

In addition to its eight primitive types, Java defines two additional categories of data types: classes and arrays. Java programs consist of class definitions; each class defines a new data type that can be manipulated by Java programs. For example, a program might define a class named Point and use it to store and manipulate X,Y points in a Cartesian coordinate system. This makes Point a new data type in that program. An array type represents a list of values of some other type. char is a data type, and an array of char values is another data type, written char[]. An array of Point objects is a data type, written Point[]. And an array of Point arrays is yet another type, written Point[][].

As you can see, there are an infinite number of possible class and array data types. Collectively, these data types are known as reference types. The reason for this name will become clear later in this chapter. For now, however, what is important to understand is that class and array types differ significantly from primitive types, in that they are compound, or composite, types. A primitive data type holds exactly one value. Classes and arrays are aggregate types that contain multiple values. The Point type, for example, holds two double values representing the X and Y coordinates of the point. And char[] is obviously a compound type because it represents a list of characters. By their very nature, class and array types are more complicated than the primitive data types. We'll discuss classes and arrays in detail later in this chapter and examine classes in even more detail in Chapter 3, "Object-Oriented Programming in Java".
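As a minimal sketch of the types mentioned above, the following code declares a Point class holding two double values and uses it in array types. The field names, constructor, and ReferenceTypesDemo class are assumptions for illustration; classes and arrays are covered properly later in the chapter.

```java
// A hypothetical Point class: an aggregate of two double values.
class Point {
    double x, y;

    Point(double x, double y) {
        this.x = x;
        this.y = y;
    }
}

class ReferenceTypesDemo {
    public static void main(String[] args) {
        char[] letters = { 'a', 'b', 'c' };        // array of char
        Point[] path = { new Point(0.0, 0.0),      // array of Point objects
                         new Point(1.0, 2.0) };
        Point[][] grid = new Point[2][3];          // array of Point arrays

        System.out.println(letters.length);        // 3
        System.out.println(path[1].y);             // 2.0
        System.out.println(grid[0].length);        // 3
    }
}
```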