CDATA Sections (XML in a Nutshell, 2nd Edition)

2.6. CDATA Sections

When an XML document includes samples of XML or HTML source code, the < and & characters in those samples must be encoded as &lt; and &amp;. The more sections of literal code a document includes and the longer they are, the more tedious this encoding becomes. Instead you can enclose each sample of literal code in a CDATA section. A CDATA section is set off by <![CDATA[ and ]]>. Everything between the <![CDATA[ and the ]]> is treated as raw character data. Less-than signs don't begin tags. Ampersands don't start entity references. Everything is simply character data, not markup.

For example, in a Scalable Vector Graphics (SVG) tutorial written in XHTML, you might see something like this:

<p>You can use a default <code>xmlns</code> attribute to avoid
having to add the svg prefix to all your elements:</p>
     <![CDATA[
       <svg xmlns="http://www.w3.org/2000/svg"
            width="12cm" height="10cm">
         <ellipse rx="110" ry="130" />
         <rect x="4cm" y="1cm" width="3cm" height="6cm" />
       </svg>
     ]]>

The SVG source code has been included directly in the XHTML file without carefully replacing each < with &lt;. The result will be a sample SVG document, not an embedded SVG picture, as might happen if this example were not placed inside a CDATA section.
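
One way to check this behavior is to parse a CDATA-wrapped sample and look at what the parser reports. The following is a minimal sketch using Python's standard xml.etree.ElementTree module. The wrapping div element and the shortened SVG sample are assumptions made for illustration and are not part of the example above.

```python
import xml.etree.ElementTree as ET

# Hypothetical wrapper: a <div> whose only content is a CDATA section
# holding a shortened SVG sample.
doc = """<div><![CDATA[
  <svg xmlns="http://www.w3.org/2000/svg" width="12cm" height="10cm">
    <ellipse rx="110" ry="130" />
  </svg>
]]></div>"""

root = ET.fromstring(doc)

# The parser reports no child elements: the SVG source is text, not markup.
child_count = len(list(root))
sample_text = root.text

print(child_count)
print("<svg" in sample_text)
```

If the CDATA delimiters were removed, the same parser would instead report an svg child element of the div.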

The only thing that cannot appear in a CDATA section is the CDATA section end delimiter ]]>.
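
If the text being wrapped might itself contain ]]>, a common workaround is to split the delimiter across two adjacent CDATA sections so that it never appears whole inside either one. The sketch below shows that approach in Python. The helper name wrap_cdata is hypothetical, and the technique is a widely used convention rather than something this section prescribes.

```python
import xml.etree.ElementTree as ET

def wrap_cdata(text: str) -> str:
    """Wrap text in CDATA, splitting any ]]> across two sections.

    Hypothetical helper: each "]]>" becomes "]]" closing one section
    and ">" opening the next.
    """
    safe = text.replace("]]>", "]]]]><![CDATA[>")
    return "<![CDATA[" + safe + "]]>"

original = "if (a[b[0]]> 1) { ... }"
wrapped = wrap_cdata(original)

# Round-trip through a parser to confirm the text survives unchanged.
recovered = ET.fromstring("<code>" + wrapped + "</code>").text

print(wrapped)
print(recovered == original)
```

The parser concatenates the adjacent sections, so a program reading the document sees the original text with ]]> intact.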

CDATA sections exist for the convenience of human authors, not for programs. Parsers are not required to tell you whether a particular block of text came from a CDATA section, from normal character data, or from character data that contained entity references such as &lt; and &amp;. By the time you get access to the data, these differences will have been washed away.
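
To illustrate, the following minimal sketch parses the same text written three ways with Python's standard xml.etree.ElementTree module. The code element name and the sample string are assumptions chosen for the demonstration. Other parsers and APIs may expose more detail about where the text came from, but this one does not.

```python
import xml.etree.ElementTree as ET

# The same character data written with a CDATA section, with predefined
# entity references, and with numeric character references.
from_cdata = ET.fromstring("<code><![CDATA[a < b && c]]></code>").text
from_entities = ET.fromstring("<code>a &lt; b &amp;&amp; c</code>").text
from_numeric = ET.fromstring("<code>a &#60; b &#38;&#38; c</code>").text

# All three arrive as the identical string; the original form is gone.
print(from_cdata == from_entities == from_numeric)
print(from_cdata)
```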