
1.11 Encoding and Decoding URL Strings

NN 6, IE 5.5(Win)

1.11.1 Problem

You want to convert a string of plain text to a format suitable for use as a URL or URL search string, or vice versa.

1.11.2 Solution

To convert a string consisting of an entire URL to a URL-encoded form, use the encodeURI( ) method, passing the string needing conversion as an argument. For example:

document.myForm.action = encodeURI(myString);

If you are assembling content for values of search string name/value pairs, apply the encodeURIComponent( ) method:

var srchString = "?name=" + encodeURIComponent(myString);

Both methods have complementary partners that perform conversions in the opposite direction:

decodeURI(encodedURIString)
decodeURIComponent(encodedURIComponentString)

In all cases, the original string is not altered when passed as an argument to these methods. Capture the results from the value returned by the methods.
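As a quick illustration of the four methods working in pairs, here is a minimal sketch. The sample strings and variable names are hypothetical, chosen only for demonstration:

```javascript
// Hypothetical sample values, used only for illustration.
var rawURL = "http://www.example.com/my docs/report.html";
var rawValue = "Gizmo & Widget";

// Each method returns a new string; the arguments are left unchanged.
var encodedURL = encodeURI(rawURL);
var encodedValue = encodeURIComponent(rawValue);

console.log(encodedURL);    // http://www.example.com/my%20docs/report.html
console.log(encodedValue);  // Gizmo%20%26%20Widget

// Decoding with the matching partner restores the original text.
var decodedURL = decodeURI(encodedURL);
var decodedValue = decodeURIComponent(encodedValue);
console.log(decodedURL === rawURL);      // true
console.log(decodedValue === rawValue);  // true
```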

1.11.3 Discussion

Although the escape( ) and unescape( ) methods have been available since the first scriptable browsers, they have been deprecated in the formal language specification (ECMA-262) in favor of a set of new methods. The new methods are available in IE 5.5 or later for Windows and Netscape 6 or later.

These new encoding methods work by slightly different rules than the old escape( ) and unescape( ) methods. As a result, you must encode and decode using the same pairs of methods at all times. In other words, if a URL is encoded with encodeURI( ), the resulting string should be decoded only with decodeURI( ); a mismatched decoder may leave some sequences encoded or fail on them.
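To see why the pairing matters, consider this small sketch (the sample value is hypothetical). A value encoded with encodeURIComponent( ) but decoded with decodeURI( ) comes back with its delimiter characters still encoded, because decodeURI( ) deliberately leaves sequences such as %26 and %2F alone:

```javascript
// Hypothetical value containing characters that act as URL delimiters.
var value = "R&D/Sales";

var encoded = encodeURIComponent(value);
console.log(encoded);     // R%26D%2FSales

// Mismatched partner: %26 and %2F are not converted back.
var mismatched = decodeURI(encoded);
console.log(mismatched);  // R%26D%2FSales

// Matching partner: the original value is restored.
var matched = decodeURIComponent(encoded);
console.log(matched);     // R&D/Sales
```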

The differences between encodeURI( ) and encodeURIComponent( ) are defined by the range of characters that the methods convert to the URI-friendly form of a percent sign (%) followed by a two-digit hexadecimal value (e.g., a space becomes %20). Characters outside the ASCII range are first converted to UTF-8, and each resulting byte is written as its own %XX sequence. Regular alphanumeric characters are not converted, but when it comes to punctuation and special characters, the two methods diverge in their coverage. The encodeURI( ) method converts the following symbols from the characters in the ASCII range of 32 through 126:

space  "  %  <  >  [  \  ]  ^  `  {  |  }

For example, if you are assembling a URL with a simple search string on the end, pass the URL through encodeURI( ) before navigating to the URL to make sure the URL is well-formed:

var newURL = "http://www.megacorp.com?prod=Gizmo Deluxe";
location.href = encodeURI(newURL);
// encoded URL is: http://www.megacorp.com?prod=Gizmo%20Deluxe
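The following sketch shows the same idea without navigating anywhere, so it can be run outside a browser. The /find path and the size parameter are hypothetical additions for illustration. Note that encodeURI( ) leaves the ?, &, and = delimiters intact, and that a non-ASCII character is emitted as the bytes of its UTF-8 encoding:

```javascript
// Hypothetical URL with characters from the encodeURI( ) list above.
var rawURL = "http://www.megacorp.com/find?prod=Gizmo Deluxe&size=<10>";
var encodedURL = encodeURI(rawURL);
console.log(encodedURL);
// http://www.megacorp.com/find?prod=Gizmo%20Deluxe&size=%3C10%3E

// A character beyond ASCII becomes one %XX sequence per UTF-8 byte.
var encodedWord = encodeURI("caf\u00E9");  // "café"
console.log(encodedWord);  // caf%C3%A9
```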

In contrast, the encodeURIComponent( ) method encodes far more characters that might find their way into value strings of forms or script-generated search strings. It encodes all of the following:

space  "  #  $  %  &  +  ,  /  :  ;  <  =  >  ?  @  [  \  ]  ^  `  {  |  }

Of these, the characters that only encodeURIComponent( ) encodes (encodeURI( ) leaves them alone) are:

#  $  &  +  ,  /  :  ;  =  ?  @

You may recognize some of the characters that encodeURIComponent( ) encodes as those frequently appearing within complex URLs, especially the ?, &, and = symbols. For this reason, apply encodeURIComponent( ) only to the values of name/value pairs, before those values are inserted into or appended to a URL. Do not then pass the composite URL through encodeURI( ): the % symbols of the already-encoded characters would themselves be encoded (as %25), probably causing problems on the server end when it parses the input from the client.
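A minimal sketch of this approach follows. The buildSearch( ) helper, the /find path, and the sample data are all hypothetical; the point is only that each value is encoded once, on its own, and that running the finished URL through encodeURI( ) afterward double-encodes it:

```javascript
// buildSearch( ) is a hypothetical helper, not a built-in method.
// It encodes only the values, then joins the pairs with & delimiters.
function buildSearch(params) {
    var pairs = [];
    for (var name in params) {
        if (Object.prototype.hasOwnProperty.call(params, name)) {
            pairs.push(name + "=" + encodeURIComponent(params[name]));
        }
    }
    return pairs.length ? "?" + pairs.join("&") : "";
}

var url = "http://www.megacorp.com/find" +
    buildSearch({prod: "Gizmo & Gadget", q: "50% off?"});
console.log(url);
// http://www.megacorp.com/find?prod=Gizmo%20%26%20Gadget&q=50%25%20off%3F

// Encoding the composite URL again turns every % into %25.
var twice = encodeURI(url);
console.log(twice);
// http://www.megacorp.com/find?prod=Gizmo%2520%2526%2520Gadget&q=50%2525%2520off%253F
```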

If, for backward-compatibility reasons, you need to use the escape( ) method, be aware that this method uses a heavy hand in choosing characters to encode. Encodable characters for the escape( ) method are as follows:

space  !  "  #  $  %  &  '  (  )  ,  :  ;  <  =  >  ?  @  [  \  ]  ^  `  {  |  }  ~

The @ symbol, however, is not converted in Internet Explorer browsers via the escape( ) method.
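The older and newer methods also differ for characters outside the ASCII range, which is another reason not to mix them. The sketch below assumes an environment that still provides the deprecated escape( ) and unescape( ) functions (browsers and Node.js both do); the sample string is hypothetical:

```javascript
// Hypothetical sample: "café + 5€"
var sample = "caf\u00E9 + 5\u20AC";

// escape( ) writes the character code directly (%XX or %uXXXX)
// and leaves the + sign alone.
var escaped = escape(sample);
console.log(escaped);    // caf%E9%20+%205%u20AC

// encodeURIComponent( ) writes UTF-8 bytes and encodes the + sign.
var component = encodeURIComponent(sample);
console.log(component);  // caf%C3%A9%20%2B%205%E2%82%AC

// Decoding escape( ) output with the wrong partner fails outright,
// because %E9 followed by %20 is not a valid UTF-8 sequence.
var mismatchFailed = false;
try {
    decodeURIComponent(escaped);
} catch (e) {
    mismatchFailed = (e instanceof URIError);
}
console.log(mismatchFailed);                // true
console.log(unescape(escaped) === sample);  // true
```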

You can see now why it is important to use the matching decoding method if you need to turn one of your encoded strings back into plain text. If the encoded string you are trying to decode comes from an external source (e.g., part of a URL search string returned by the server), apply the decodeURIComponent( ) method only to the value portion of each name/value pair. That's typically where the heart of your passed information is, as well as where you want to obtain the most correct conversion.
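One way to do this is sketched below. The parseSearch( ) helper and the sample search string are hypothetical; in a browser you would typically pass location.search. The string is split on its & and = delimiters first, and only then is each value decoded, so an encoded & or = inside a value cannot be mistaken for a delimiter:

```javascript
// parseSearch( ) is a hypothetical helper, not a built-in method.
// It returns an object mapping each name to its decoded value.
function parseSearch(search) {
    var result = {};
    // Drop the leading ? if present.
    var query = search.charAt(0) === "?" ? search.substring(1) : search;
    if (!query) {
        return result;
    }
    var pairs = query.split("&");
    for (var i = 0; i < pairs.length; i++) {
        var eq = pairs[i].indexOf("=");
        if (eq === -1) {
            continue;  // skip items that are not name/value pairs
        }
        var name = pairs[i].substring(0, eq);
        var value = pairs[i].substring(eq + 1);
        // Decode only the value portion of the pair.
        result[name] = decodeURIComponent(value);
    }
    return result;
}

var data = parseSearch("?prod=Gizmo%20%26%20Gadget&q=50%25%20off%3F");
console.log(data.prod);  // Gizmo & Gadget
console.log(data.q);     // 50% off?
```

This sketch assumes spaces arrive as %20. Search strings produced by HTML form submissions usually encode spaces as + signs, which decodeURIComponent( ) does not convert, so such input would need the + signs replaced with spaces before decoding.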

1.11.4 See Also

Recipe 10.6 for passing data to another page via URLs, during which value encoding is used.
