1.0 Introduction
A string is one of the fundamental building blocks of data that
JavaScript works with.
Any script that touches URLs or user entries in form text boxes works
with strings. Most document object model properties are string
values. Data that you read or write to a browser cookie is a string.
Strings are everywhere!
The core JavaScript language has a repertoire of the common string
manipulation properties and methods that you find in most programming
languages. You can tear apart a string character by character if you
like, change the case of all letters in the string, or work with
subsections of a string. Most scriptable browsers now in circulation
also benefit from the power of regular expressions, which greatly
simplify numerous string manipulation tasks—once you surmount a
fairly steep learning curve.
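As a minimal sketch of the operations just mentioned, the following lines walk a string character by character, change its case, and extract a subsection. The sample text and variable names are assumptions chosen for illustration only.

```javascript
// Hypothetical sample text; any string value behaves the same way.
var pet = "Fluffy is a pretty cat.";

// Tear the string apart character by character.
var chars = [];
for (var i = 0; i < pet.length; i++) {
    chars.push(pet.charAt(i));
}

// Change the case of all letters in the string.
var shout = pet.toUpperCase();    // "FLUFFY IS A PRETTY CAT."
var whisper = pet.toLowerCase();  // "fluffy is a pretty cat."

// Work with a subsection: characters 0 through 5 (index 6 is excluded).
var firstWord = pet.substring(0, 6);  // "Fluffy"
```

None of these methods alters the original string; each returns a new value that you can assign elsewhere.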
Your scripts will commonly be handed values that are already string
data types. For instance, if you need to inspect the text that a user
has entered into a form's text box, the
value property of that text box object returns a
value already typed as a string. All properties and methods of any
string object are immediately available for your scripts to operate
on that text box value.
1.0.1 Creating a String
If you need to create a string, you have a couple of ways to
accomplish it. The simplest is to assign a quoted string of
characters to a variable (or object property):
var myString = "Fluffy is a pretty cat.";
Quotes around a JavaScript string can be either single or double
quotes, but each pair must be of the same type. Therefore, both of
the following statements are acceptable:
var myString = "Fluffy is a pretty cat.";
var myString = 'Fluffy is a pretty cat.';
But the following mismatched pair is illegal and throws a script
error:
var myString = "Fluffy is a pretty cat.';
Having the two sets of quote symbols is handy when you need to embed
one string within another. The following document.write() statement,
which would execute while a page loads into the browser, has one
outer string (the entire string being written by the method) and
nested sets of quotes that surround each string value for an HTML
element attribute:
document.write("<img src='img/logo.jpg' height='30' width='100' alt='Logo'>");
You are also free to reverse the order of double and single quotes as
your style demands. Thus, the above statement would be interpreted
the same way if it were written as follows:
document.write('<img src="img/logo.jpg" height="30" width="100" alt="Logo">');
Two more levels of nesting are also possible if you use escape
characters with the quote symbols. See Recipe 1.8 for examples of
escaped character usage in JavaScript strings.
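As a brief illustration only (the fuller treatment belongs to the recipe cited above), the following sketch nests one more level of quotes by escaping the innermost pair with backslashes. The markup and variable names are hypothetical.

```javascript
// Outer string in double quotes, attribute values in single quotes,
// and an innermost pair of double quotes escaped with backslashes.
var tag = "<a href='#' onclick='alert(\"Hello\")'>Greet</a>";

// The backslashes are not part of the resulting string; they only tell
// the interpreter that the quote that follows does not end the string.
var hasInnerQuotes = tag.indexOf('"Hello"') !== -1;  // true
var hasBackslash = tag.indexOf("\\") !== -1;         // false
```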
Technically speaking, the strings described so far aren't precisely
string objects in the purest sense of JavaScript. They are string
values, which, as it turns out, can use all of the properties and
methods of the global String object that inhabits every scriptable
browser window. Use string values for all of your JavaScript text
manipulation. In a few rare instances, however, a JavaScript string
value isn't quite good enough. You may encounter this situation if
you are using JavaScript to communicate with a Java applet, and one
of the applet's public methods requires an argument as a string data
type. In this case, you might need to create a full-fledged instance
of a String object and pass that object as the method argument. To
create such an object, use the constructor function of the String
object:
var myString = new String("Fluffy is a pretty cat.");
The data type of the myString variable after this
statement executes is object rather than
string. But this object inherits all of the same
String object properties and methods that a string
value has, and works fine with a Java applet.
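To see the difference between the two forms for yourself, a short sketch such as the following compares a string value with a String object. The variable names are illustrative assumptions.

```javascript
var strValue = "Fluffy is a pretty cat.";
var strObject = new String("Fluffy is a pretty cat.");

// The typeof operator reports different data types for the two forms.
var valueType = typeof strValue;    // "string"
var objectType = typeof strObject;  // "object"

// Both forms expose the same String properties and methods.
var sameLength = strValue.length === strObject.length;                // true
var sameUpper = strValue.toUpperCase() === strObject.toUpperCase();   // true

// Loose equality converts the object to its string value first;
// strict equality also compares data types, so it fails.
var looseEqual = strValue == strObject;    // true
var strictEqual = strValue === strObject;  // false

// valueOf() recovers a plain string value from the object.
var recoveredType = typeof strObject.valueOf();  // "string"
```

The strict comparison result is one practical reason to stay with string values unless an object is specifically required.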
1.0.2 Regular Expressions
For the uninitiated, regular
expressions can be cryptic and confusing. This isn't
the forum to teach you regular expressions from scratch, but perhaps
the recipes in this chapter that demonstrate them will pique your
interest enough to pursue their study.
The purpose of a regular expression is to define a pattern of
characters that you can then use to compare against an existing
string. If the string contains characters that match the pattern, the
regular expression tells you where the match is within the string,
facilitating further manipulation (perhaps a search-and-replace
operation). Regular expression patterns are powerful entities because
they let you go much further than simply defining a pattern of fixed
characters. For example, you can define a pattern to be a sequence of
five numerals bounded on each side by whitespace. Another pattern can
define the format for a typical email address, regardless of the
length of the username or domain, but the full domain must include at
least one period.
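The two patterns just described might be sketched as follows. The sample strings are hypothetical, and the email pattern is a deliberately loose assumption for illustration (any username, an @ sign, and a domain containing at least one period), not a complete address validator.

```javascript
// Five numerals bounded on each side by whitespace.
var fiveDigits = /\s\d{5}\s/;
var address = "Springfield, IL 62704 USA";   // hypothetical sample text
var where = address.search(fiveDigits);      // index of the leading whitespace
var found = address.substring(where + 1, where + 6);  // "62704"

// Six numerals in a row do not satisfy the pattern.
var tooLong = fiveDigits.test(" 123456 ");   // false

// A loose email shape: username of any length, then a domain that
// must include at least one period.
var emailShape = /^[^\s@]+@[^\s@.]+(\.[^\s@.]+)+$/;
var okAddress = emailShape.test("fluffy@example.com");  // true
var noPeriod = emailShape.test("fluffy@localhost");     // false
```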
The cryptic part of regular expressions is the notation they use to
specify the various conditions within the pattern. JavaScript regular
expression notation is nearly identical to the notation found in
languages such as Perl; the syntax differs only in some of the more
esoteric uses. One definite difference is the way you create a
regular expression object from a pattern. You can use either the
formal constructor function or shortcut syntax. The following two
syntax examples create the same regular expression object:
var re = /pattern/ [g | i | gi]; // Shortcut syntax
var re = new RegExp(["pattern", ["g" | "i" | "gi"]]); // Formal constructor
The optional trailing characters
(g, i, and
gi) indicate whether the pattern
should be applied globally and whether the pattern is
case-insensitive. Internet Explorer 5.5 or later for Windows and
Netscape 6 or later also recognize the optional m
modifier, which influences string boundary pattern matching within
multiline strings.
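As a concrete sketch of the two creation styles and the modifiers, consider the following. The pattern text and sample strings are assumptions for illustration. Note that a pattern passed to the constructor is an ordinary string, so each backslash in the pattern must itself be escaped.

```javascript
// The same pattern created both ways.
var literal = /fluffy/gi;
var constructed = new RegExp("fluffy", "gi");

var samePattern = literal.source === constructed.source;            // true
var bothGlobal = literal.global && constructed.global;              // true
var bothIgnoreCase = literal.ignoreCase && constructed.ignoreCase;  // true

// Backslashes are doubled inside the constructor's string argument.
var digitsLiteral = /\d+/;
var digitsConstructed = new RegExp("\\d+");
var sameDigits = digitsLiteral.source === digitsConstructed.source; // true

// The g modifier applies the pattern to every match, not just the first.
var text = "Fluffy likes fluffy toys.";
var allReplaced = text.replace(literal, "Max");    // "Max likes Max toys."
var firstOnly = text.replace(/fluffy/i, "Max");    // "Max likes fluffy toys."

// The m modifier lets ^ match at the start of each line.
var withoutM = /^cat/.test("dog\ncat");   // false
var withM = /^cat/m.test("dog\ncat");     // true
```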
If you have been exposed to regular expressions in the past, Table 1-1
lists the regular expression pattern notation available in browsers
since Netscape Navigator 4 (NN 4) and Internet Explorer 4 (IE 4).
Table 1-1. Regular expression notation
Character | Matches | Example
\b | Word boundary | /\bto/ matches "tomorrow"; /to\b/ matches "Soweto"; /\bto\b/ matches "to"
\B | Word nonboundary | /\Bto/ matches "stool" and "Soweto"; /to\B/ matches "stool" and "tomorrow"; /\Bto\B/ matches "stool"
\d | Numeral 0 through 9 | /\d\d/ matches "42"
\D | Nonnumeral | /\D\D/ matches "to"
\s | Single whitespace | /under\sdog/ matches "under dog"
\S | Single nonwhitespace | /under\Sdog/ matches "under-dog"
\w | Letter, numeral, or underscore | /1\w/ matches "1A"
\W | Not a letter, numeral, or underscore | /1\W/ matches "1%"
. | Any character except a newline | /../ matches "Z3"
[...] | Any one of the character set in brackets | /J[aeiou]y/ matches "Joy"
[^...] | Negated character set | /J[^eiou]y/ matches "Jay"
* | Zero or more times | /\d*/ matches "", "5", or "444"
? | Zero or one time | /\d?/ matches "" or "5"
+ | One or more times | /\d+/ matches "5" or "444"
{n} | Exactly n times | /\d{2}/ matches "55"
{n,} | n or more times | /\d{2,}/ matches "555"
{n,m} | At least n, at most m times | /\d{2,4}/ matches "5555"
^ | At beginning of a string or line | /^Sally/ matches "Sally says..."
$ | At end of a string or line | /Sally.$/ matches "Hi, Sally."
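If you want to confirm the behavior of the notation in Table 1-1, a sketch like the following runs several of the table's examples through the test() method of each regular expression. The array layout and variable names are assumptions made for this illustration.

```javascript
// Each entry pairs a pattern from Table 1-1 with a string it should match.
var examples = [
    [/\bto/, "tomorrow"],
    [/to\b/, "Soweto"],
    [/\Bto\B/, "stool"],
    [/\d\d/, "42"],
    [/under\sdog/, "under dog"],
    [/under\Sdog/, "under-dog"],
    [/J[aeiou]y/, "Joy"],
    [/J[^eiou]y/, "Jay"],
    [/\d{2,4}/, "5555"],
    [/^Sally/, "Sally says..."],
    [/Sally.$/, "Hi, Sally."]
];

// test() returns true when the string contains a match for the pattern.
var allMatch = true;
for (var i = 0; i < examples.length; i++) {
    if (!examples[i][0].test(examples[i][1])) {
        allMatch = false;
    }
}

// A few counterexamples show where the patterns refuse to match.
var wholeWordInside = /\bto\b/.test("tomorrow");  // false: "to" runs into "m"
var negatedSet = /J[^eiou]y/.test("Joy");         // false: "o" is excluded
var tooFewDigits = /\d{2}/.test("5");             // false: only one numeral
```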
See Recipe 1.5 through Recipe 1.7, as well as Recipe 8.2, to see how
regular expressions can empower a variety of string examination
operations with less overhead than more traditional string
manipulations. For in-depth coverage of regular expressions, see
Mastering Regular Expressions, by Jeffrey E. F.
Friedl (O'Reilly).