11.10. Converting ASCII to HTML
11.10.1. Problem
You want to turn plaintext into
reasonably formatted HTML.
11.10.2. Solution
First, encode entities with htmlentities( ); then, transform the text
into various HTML structures. The pc_ascii2html( ) function shown in
Example 11-3 has basic transformations for links and paragraph breaks.
Example 11-3. pc_ascii2html( )
function pc_ascii2html($s) {
    $s = htmlentities($s);
    // Paragraphs are separated by blank lines; explode( ) replaces
    // split( ), which was removed in PHP 7
    $grafs = explode("\n\n",$s);
    for ($i = 0, $j = count($grafs); $i < $j; $i++) {
        // Link to what seem to be http or ftp URLs
        $grafs[$i] = preg_replace('/((ht|f)tp:\/\/[^\s&]+)/',
                                  '<a href="$1">$1</a>',$grafs[$i]);
        // Link to email addresses; the outer group makes $1 the whole address
        $grafs[$i] = preg_replace('/([^@\s]+@([-a-z0-9]+\.)+[a-z]{2,})/i',
                                  '<a href="mailto:$1">$1</a>',$grafs[$i]);
        // Wrap the chunk in paragraph tags
        $grafs[$i] = '<p>'.$grafs[$i].'</p>';
    }
    return join("\n\n",$grafs);
}
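The order of the two steps in the Solution matters: entities are encoded before any tags are added, so the tags the conversion inserts are not themselves escaped. The following is a minimal sketch of that point, not part of the original recipe; the sample text and the variable names $text, $linked, and $wrong are made up for illustration.

```php
<?php
// Hypothetical sample input: it contains markup-like text and a URL.
$text = 'Use <b> sparingly; see http://www.example.com/ for details.';
$urlPattern = '/((ht|f)tp:\/\/[^\s&]+)/';
$urlReplace = '<a href="$1">$1</a>';

// Encode first, then link: only the user's own "<b>" is escaped.
$linked = preg_replace($urlPattern, $urlReplace, htmlentities($text));

// Link first, then encode: the inserted <a> tag is escaped too,
// so the browser shows it as literal text instead of a link.
$wrong = htmlentities(preg_replace($urlPattern, $urlReplace, $text));

echo $linked, "\n";
echo $wrong, "\n";
```

The first line printed contains a working link after the escaped &lt;b&gt;, while the second begins its link with &lt;a href= and so displays the tag as text.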
11.10.3. Discussion
The more you know about what the ASCII text looks like, the better
your HTML conversion can be. For example, if emphasis is indicated
with *asterisks* or /slashes/ around words, you can add rules that
take care of that, as follows:
$grafs[$i] = preg_replace('/(\A|\s)\*([^*]+)\*(\s|\z)/',
'$1<b>$2</b>$3',$grafs[$i]);
$grafs[$i] = preg_replace('{(\A|\s)/([^/]+)/(\s|\z)}',
'$1<i>$2</i>$3',$grafs[$i]);
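To try these two rules on their own, they can be wrapped in a small helper. This is only a sketch: the name pc_emphasize( ) and the sample sentence are hypothetical and do not appear in the recipe, and the helper assumes its argument has already been passed through htmlentities( ).

```php
<?php
// Hypothetical helper, not part of the original recipe: applies the two
// emphasis rules from the Discussion to one entity-encoded paragraph.
function pc_emphasize($graf) {
    // *word* surrounded by whitespace (or string boundaries) becomes bold
    $graf = preg_replace('/(\A|\s)\*([^*]+)\*(\s|\z)/',
                         '$1<b>$2</b>$3', $graf);
    // /word/ becomes italic; braces serve as the pattern delimiters so
    // the slashes inside the pattern need no escaping
    $graf = preg_replace('{(\A|\s)/([^/]+)/(\s|\z)}',
                         '$1<i>$2</i>$3', $graf);
    return $graf;
}

echo pc_emphasize('This is *very* important and /quite/ urgent.'), "\n";
// prints: This is <b>very</b> important and <i>quite</i> urgent.
```

Because each rule requires whitespace or the end of the string right after the closing marker, a marker followed directly by punctuation, as in *very*., is left untouched. Two emphasized words separated by a single space are also affected: the first match consumes that space, so the second word is not converted.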
Copyright © 2003 O'Reilly & Associates. All rights reserved.