iconv(3C)    HP-UX 11i Version 3: February 2007
NAME
iconv(), iconv_open(), iconv_close() - codeset conversion routines

SYNOPSIS
    #include <iconv.h>

    iconv_t iconv_open(const char *tocode, const char *fromcode);

    size_t iconv(
        iconv_t cd,
        const char **inbuf,
        size_t *inbytesleft,
        char **outbuf,
        size_t *outbytesleft
    );

    int iconv_close(iconv_t cd);

Remarks
These interfaces conform to the XPG4 standard and should be used instead of the 9.0 iconv interfaces, such as iconvopen(), iconvclose(), iconvsize(), iconvlock(), ICONV(), ICONV1(), and ICONV2().

Refer to the white paper entitled HP-UX 11.0 - 11i Internationalization Features White Paper for an understanding of the conversion process. The white paper explains how iconv uses tables and methods to do the conversions, and shows how to customize your own conversions. The white paper is available at http://docs.hp.com.

DESCRIPTION
The iconv_open() routine uses two configuration files, system.config.iconv and config.iconv, residing in the /usr/lib/nls/iconv directory. The entries in the system.config.iconv and config.iconv files are the set of conversions (codeset names) that are supported by iconv(). The first two columns correspond to the fromcode and tocode names. These names, or their corresponding aliases, may be used as parameters to iconv_open().
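The open/convert/close sequence implied by the SYNOPSIS can be sketched for a single in-memory string as follows. This is a minimal illustration, not a reference implementation: convert_string() is a hypothetical helper name, the caller is assumed to supply codeset names that the local configuration supports, and the (void *) cast is an assumption made only to cope with systems whose iconv() prototype declares inbuf without const.

```c
#include <errno.h>
#include <iconv.h>
#include <stdio.h>
#include <string.h>

/*
 * Convert the NUL-terminated string `in` from `fromcode` to `tocode`,
 * storing at most outsize-1 bytes plus a terminating NUL in `out`.
 * Returns 0 on success and -1 on failure (errno is set by the iconv calls).
 * convert_string() is a hypothetical helper, not part of the iconv API.
 */
int convert_string(const char *tocode, const char *fromcode,
                   const char *in, char *out, size_t outsize)
{
    iconv_t cd;
    const char *inptr = in;
    char *outptr = out;
    size_t inleft = strlen(in);
    size_t outleft;
    size_t rc;
    int saved;

    if (outsize == 0) {
        errno = E2BIG;
        return -1;
    }
    outleft = outsize - 1;      /* reserve room for the terminating NUL */

    /* Note the argument order: the destination codeset comes first. */
    cd = iconv_open(tocode, fromcode);
    if (cd == (iconv_t)-1)
        return -1;              /* the requested conversion is unavailable */

    /* The (void *) cast hides the const difference between prototypes. */
    rc = iconv(cd, (void *)&inptr, &inleft, &outptr, &outleft);
    saved = errno;              /* iconv_close() may overwrite errno */
    iconv_close(cd);

    if (rc == (size_t)-1) {
        errno = saved;
        return -1;
    }
    *outptr = '\0';
    return 0;
}
```

A caller might use it as convert_string("roman8", "iso88591", text, buf, sizeof buf), using the codeset names from the EXAMPLES section; whether those names resolve depends on the entries in the configuration files described above.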
APPLICATION USAGE
Portable applications must assume that conversion descriptors are not valid after calls to any of the exec functions.

Special Usage
In state-dependent encodings, characters are interpreted according to the current "state" of the input. A state shift occurs when a specific sequence of bytes is seen in the input. Such a sequence changes the way subsequent characters are interpreted (for example, characters may initially be single-byte characters and, after a state shift, be interpreted as two-byte characters).

For state-dependent encodings, the conversion descriptor returned by iconv_open() is in a codeset-dependent initial shift state, ready for immediate use with iconv().

For state-dependent encodings, the conversion descriptor cd is placed into its initial shift state by a call to iconv() for which inbuf is a null pointer, or for which inbuf points to a null pointer. When iconv() is called in this way, and outbuf is not a null pointer or a pointer to a null pointer, and outbytesleft points to a positive value, iconv() places into the output buffer the byte sequence that changes it to its initial shift state. If the output buffer is not large enough to hold the entire reset sequence, iconv() fails and sets errno to E2BIG. Subsequent calls with inbuf set to other than a null pointer or a pointer to a null pointer cause the conversion to take place from the current state of the conversion descriptor.

For state-dependent encodings, the conversion descriptor is updated to reflect the shift state in effect at the end of the last successfully converted byte sequence.

RETURN VALUE
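The shift-state reset described under Special Usage (a call to iconv() with a null inbuf) can be sketched as a small wrapper. This is an illustrative sketch only: reset_shift_state() is a hypothetical helper name, and the number of bytes it emits depends entirely on the target codeset (a stateless codeset is expected to emit none).

```c
#include <iconv.h>
#include <stddef.h>

/*
 * Reset `cd` to its initial shift state and store in `out` the byte
 * sequence, if any, that returns the output to its initial shift state.
 * Returns the number of bytes stored, or (size_t)-1 on failure; errno is
 * E2BIG when `out` cannot hold the whole reset sequence.
 * reset_shift_state() is a hypothetical helper, not part of the iconv API.
 */
size_t reset_shift_state(iconv_t cd, char *out, size_t outsize)
{
    char *outptr = out;
    size_t outleft = outsize;

    /* A null inbuf requests the reset; outbuf receives the sequence. */
    if (iconv(cd, NULL, NULL, &outptr, &outleft) == (size_t)-1)
        return (size_t)-1;

    return outsize - outleft;
}
```

A program that writes a state-dependent encoding would typically call such a helper once after the last input buffer has been converted, and append the returned bytes to its output.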
ERRORS
iconv_open() fails if any of the following conditions are encountered:
iconv() fails if any of the following conditions are encountered:
iconv_close() fails if any of the following conditions are encountered:
EXAMPLES
The following example shows how the iconv() interfaces may be used for conversions.

    #include <errno.h>
    #include <iconv.h>
    #include <stdio.h>
    #include <string.h>
    #include <unistd.h>

    /* FATAL, BAD_OPEN, BAD_CONVERSION, BAD_CLOSE, GOOD, BAD, and error()
     * are assumed to be defined elsewhere by the application. */
    extern void error();            /* local error message */

    int convert(char *tocode, char *fromcode, int Input);

    int main()
    {
        ...
        convert("roman8", "iso88591", fd);
        ...
    }

    int convert(char *tocode,       /* tocode name */
                char *fromcode,     /* fromcode name */
                int Input)          /* input file descriptor */
    {
        iconv_t cd;                 /* conversion descriptor */
        int bytesread;              /* num bytes read into input buffer */
        char inbuf[BUFSIZ];         /* input buffer */
        const char *inchar;         /* ptr to input character */
        size_t inbytesleft;         /* num bytes left in input buffer */
        char outbuf[BUFSIZ];        /* output buffer */
        char *outchar;              /* ptr to output character */
        size_t outbytesleft;        /* num bytes left in output buffer */
        size_t ret_val;             /* number of conversions */

        /* Initiate conversion -- get conversion descriptor */
        if ((cd = iconv_open(tocode, fromcode)) == (iconv_t)-1) {
            error(FATAL, BAD_OPEN);
        }

        inbytesleft = 0;            /* no leftover input bytes yet */

        /* translate the characters */
        for ( ;; ) {
            /*
             * if any bytes are leftover, they will be in the
             * beginning of the buffer on the next read().
             */
            inchar = inbuf;         /* points to input buffer */
            outchar = outbuf;       /* points to output buffer */
            outbytesleft = BUFSIZ;  /* space available in output buffer */

            if ((bytesread = read(Input, inbuf+inbytesleft,
                                  (size_t)BUFSIZ-inbytesleft)) < 0) {
                perror("prog");
                return BAD;
            }
            if (!(inbytesleft += bytesread)) {
                break;              /* end of conversions */
            }

            ret_val = iconv(cd, &inchar, &inbytesleft,
                            &outchar, &outbytesleft);

            if (write(1, outbuf, (size_t)BUFSIZ-outbytesleft) < 0) {
                perror("prog");
                return BAD;
            }

            /* iconv() returns the number of non-identical conversions
             * performed. If the entire string in the input buffer is
             * converted, the value pointed to by inbytesleft will be
             * zero. If the conversion stopped due to any reason, the
             * value pointed to by inbytesleft will be non-zero and
             * errno is set to indicate the condition.
             */
            if ((ret_val == (size_t)-1) && (errno == EINVAL)) {
                /* Input conversion stopped due to an incomplete
                 * character or shift sequence at the end of the
                 * input buffer.
                 */
                if (bytesread == 0) {
                    /* End of file: no more bytes can complete the
                     * sequence, so treat it as invalid input
                     * (see WARNINGS). */
                    error(FATAL, BAD_CONVERSION);
                }
                /* Copy data left, to the start of buffer. The regions
                 * may overlap, so memmove() is used. */
                memmove(inbuf, inchar, inbytesleft);
            } else if ((ret_val == (size_t)-1) && (errno == EILSEQ)) {
                /* Input conversion stopped due to an input byte
                 * that does not belong to the input codeset.
                 */
                error(FATAL, BAD_CONVERSION);
            } else if ((ret_val == (size_t)-1) && (errno == E2BIG)) {
                /* Input conversion stopped due to lack of space
                 * in the output buffer. inbytesleft has the
                 * number of bytes to be converted.
                 */
                memmove(inbuf, inchar, inbytesleft);
            }

            /* Go back and read from the input file. */
        }

        /* end conversion & get rid of the conversion table */
        if (iconv_close(cd) == -1) {
            error(FATAL, BAD_CLOSE);
        }

        return GOOD;
    }

WARNINGS
If you use iconv() and compile/link your application archive on PA-RISC systems, note that iconv() has a dependency on libdld.sl that requires a change to the compile/link command:

Compile:

    cc -Wl,-a,archive -Wl,-E -Wl,+n -l:libdld.sl -o outfile source

Or compile with CCOPTS and LDOPTS:

    export CCOPTS="-Wl,-a,archive options -Wl,-E -l:libdld.sl"
    export LDOPTS="options -E +n -l:libdld.sl"
    cc -o outfile source

The option -Wl,-a,archive is positionally dependent and should occur at the beginning of the compile line.

For optimum compatibility in future releases, you should avoid using archive libc with other shared libraries except for libdld.sl as needed above.

There is a corner case for multibyte characters that iconv() does not handle correctly. If the last character in the file being converted is an invalid multibyte character, iconv() returns EINVAL instead of EILSEQ. The application can work around this by checking whether EOF has been reached, or whether this is the last buffer being converted. In that case, EINVAL should be treated as EILSEQ.

FILES
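The workaround described in WARNINGS (treating EINVAL as EILSEQ once the last buffer has been reached) can be sketched as a thin wrapper around iconv(). This is only a sketch under stated assumptions: iconv_last_aware() and its at_eof parameter are hypothetical names, and the caller is assumed to know when no further input will arrive.

```c
#include <errno.h>
#include <iconv.h>
#include <stddef.h>

/*
 * Call iconv() on one buffer. `at_eof` is nonzero when no further input
 * will follow this buffer. An EINVAL failure on the final buffer is
 * reported as EILSEQ, because no later bytes can complete the sequence.
 * iconv_last_aware() is a hypothetical helper, not part of the iconv API.
 */
size_t iconv_last_aware(iconv_t cd, const char **in, size_t *inleft,
                        char **out, size_t *outleft, int at_eof)
{
    /* The (void *) cast hides the const difference between prototypes. */
    size_t rc = iconv(cd, (void *)in, inleft, out, outleft);

    if (rc == (size_t)-1 && errno == EINVAL && at_eof)
        errno = EILSEQ;

    return rc;
}
```

With at_eof set to zero the wrapper behaves exactly like iconv(), so a read loop can pass the result of its own end-of-file test on every call.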
SEE ALSO
genxlt(1), iconv(1), thread_safety(5).

HP-UX 11.0 - 11i Internationalization Features White Paper at http://docs.hp.com