
UNIX Power Tools

Chapter 35: You Can't Quite Call This Editing

35.12 Converting Between ASCII and EBCDIC

The first time I was handed an EBCDIC tape, I discovered the wonders of the standard UNIX utility dd. It is great for reading tapes generated on non-UNIX systems. (The GNU version of dd is on the CD-ROM.)

You do need to understand a bit about the blocking factors on the foreign tape (20.6), but once you've got that down, you can handle just about anything.

For example, to read an EBCDIC tape on tape device /dev/rmt0 and convert it to ASCII, putting the output in file was_ibm:


dd if=/dev/rmt0 of=was_ibm ibs=800 cbs=80 conv=ascii

By default, dd reads standard input and writes to standard output. To name files or devices instead, use the if= and of= options, whose keyword=value syntax is unusual for a UNIX command, to specify the input file and output file, respectively.
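A small sketch of that equivalence, assuming any POSIX-style dd. The scratch directory and the file names sample.txt, via_stdio, and via_options are made up for illustration:

```shell
# Scratch directory and file names are illustrative only.
work=$(mktemp -d)
printf 'some sample data\n' > "$work/sample.txt"

# Shell redirection: dd copies standard input to standard output.
dd < "$work/sample.txt" > "$work/via_stdio" 2>/dev/null

# The same copy, naming the files with if= and of=.
dd if="$work/sample.txt" of="$work/via_options" 2>/dev/null

# cmp is silent and succeeds when the two copies are identical.
cmp "$work/via_stdio" "$work/via_options" && echo "copies match"
```

The 2>/dev/null only hides the record counts that dd reports on standard error.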

If you wanted to convert the other way, you could use this command:


dd if=was_unix of=/dev/rmt0 obs=800 cbs=80 conv=ebcdic
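The cbs=80 in both commands is the conversion record size. As POSIX describes it, conv=ebcdic pads each newline-terminated line with blanks to that many bytes and drops the newline, while conv=ascii does the reverse. Here is a sketch on an ordinary pipe rather than a tape, with a short record length chosen only to keep the output readable; the function names are hypothetical:

```shell
# Hypothetical helpers: $1 is the record length handed to cbs=.
to_ebcdic_records() { dd cbs="$1" conv=ebcdic 2>/dev/null; }
to_ascii_lines()    { dd cbs="$1" conv=ascii 2>/dev/null; }

# Two short lines become two 8-byte records: 16 bytes in all.
printf 'one\ntwo\n' | to_ebcdic_records 8 | wc -c

# 'A' is 0xc1 in EBCDIC, and the padding blanks are 0x40.
printf 'A\n' | to_ebcdic_records 4 | od -An -tx1

# Converting back trims the trailing blanks and restores the newlines.
printf 'one\ntwo\n' | to_ebcdic_records 8 | to_ascii_lines 8
```

Lines longer than the record length are truncated on the way out, so pick a cbs at least as long as the longest line.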

There's also a conv=ibm option, which uses a different ASCII-to-EBCDIC conversion table. According to the dd manual page, "The ASCII/EBCDIC conversion tables are taken from the 256 character standard in the CACM Nov, 1968. The ibm conversion, while less blessed as a standard, corresponds better to certain IBM print train conventions. There is no universal solution."
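To see where the two tables actually disagree on a given system, one approach is to push all 256 byte values through each and compare the results. This is a rough sketch with hypothetical helper names (all_bytes, table_hex, diff_tables). The list it prints depends on the local dd's tables, so none is shown here:

```shell
# Emit all 256 byte values, 0 through 255, in order.
all_bytes() {
    i=0
    while [ "$i" -lt 256 ]; do
        printf "\\$(printf '%03o' "$i")"
        i=$((i + 1))
    done
}

# Convert stdin with the given conv= table and print one hex byte per line.
table_hex() {
    dd conv="$1" 2>/dev/null | od -An -v -tx1 | tr -s ' ' '\n' | sed '/^$/d'
}

# Print every ASCII code that the two tables map to different bytes.
diff_tables() {
    tmp=$(mktemp -d) || return 1
    all_bytes | table_hex ebcdic > "$tmp/std"
    all_bytes | table_hex ibm > "$tmp/ibm"
    paste "$tmp/std" "$tmp/ibm" |
        awk '$1 "" != $2 "" { printf "ascii 0x%02x: ebcdic=%s ibm=%s\n", NR - 1, $1, $2 }'
    rm -rf "$tmp"
}

diff_tables
```

If a character you care about shows up in that list, check with whoever will read the tape before choosing a table.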

Some gotchas:

  • You need to be able to read the raw device (20.3) to do the conversion, since the tape probably doesn't use standard UNIX tape block sizes.

  • You need to know the blocking factor of the foreign tape, so you can tell dd about it.

  • If the foreign tape has multiple files on it, you'll have to use the tape device name that allows "no rewind on close" (20.3) to read past the first file.

One last thing to mention about dd: all options that refer to sizes expect counts in bytes, unless otherwise mentioned. However, you can append a key letter to multiply the number: k multiplies by 1024, b by 512 (a block), and w by the size of a word, which depends on the implementation (2 bytes in GNU dd; the size of an integer, typically 4 bytes, in BSD versions). You can also write an arbitrary product by separating two numbers with an x, as in ibs=10x80.
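A quick way to check how a local dd reads a size expression is to copy exactly one block of zeros and count the bytes that come out. The helper name dd_size is made up for this sketch, and it assumes /dev/zero exists:

```shell
# Hypothetical helper: print the number of bytes dd understands by size expression $1.
dd_size() {
    dd if=/dev/zero bs="$1" count=1 2>/dev/null | wc -c | tr -d ' '
}

dd_size 1k      # 1024
dd_size 1b      # 512
dd_size 10x80   # 800
```

The last form is handy for tape work: 10x80 reads as "ten 80-byte records per block" more clearly than a bare 800.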

