iconv(3C)

HP-UX 11i Version 3: February 2007

NAME

iconv(), iconv_open(), iconv_close() — codeset conversion routines

SYNOPSIS

#include <iconv.h>

iconv_t iconv_open(const char *tocode, const char *fromcode);

size_t iconv(
    iconv_t cd,
    const char **inbuf,
    size_t *inbytesleft,
    char **outbuf,
    size_t *outbytesleft
);

int iconv_close(iconv_t cd);

Remarks

These interfaces conform to the XPG4 standard and should be used instead of the 9.0 iconv interfaces, such as iconvopen(), iconvclose(), iconvsize(), iconvlock(), ICONV(), ICONV1(), and ICONV2().

For an understanding of the conversion process, refer to the HP-UX 11.0 - 11i Internationalization Features White Paper. The white paper explains how iconv uses tables and methods to do the conversions and shows how to customize your own conversions. It is available at http://docs.hp.com.

DESCRIPTION

The iconv_open() routine uses two configuration files, system.config.iconv and config.iconv, which reside in the /usr/lib/nls/iconv directory. The entries in these files list the conversions (pairs of codeset names) that iconv() supports. The first two columns of each entry give the fromcode and tocode names. Either these names or their corresponding aliases may be passed as parameters to iconv_open().

iconv_open()

Returns a conversion descriptor that describes a conversion from the codeset specified by the string pointed to by the fromcode argument to the codeset specified by the tocode argument.

A conversion descriptor remains valid in a process until that process closes it.

The fromcode and tocode arguments must have a corresponding entry in the configuration files /usr/lib/nls/iconv/system.config.iconv or /usr/lib/nls/iconv/config.iconv.

The system file /usr/lib/nls/iconv/system.config.iconv cannot be modified and contains the codeset names supported by the operating system.

/usr/lib/nls/iconv/config.iconv can be modified by the user for customization.

iconv_open() searches the codeset names first in /usr/lib/nls/iconv/system.config.iconv and then in /usr/lib/nls/iconv/config.iconv to check if the requested conversion is supported. If so, iconv_open() determines which table and/or method to use for the conversion.
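The lookup described above can be probed from a program. The following is a minimal sketch, not part of the HP-UX interface: conversion_supported() is a hypothetical helper name, and the codeset names passed to it must be ones listed in the configuration files of the system at hand. It opens a descriptor, reports the failure reason if iconv_open() returns (iconv_t)-1, and closes the descriptor again.

```c
#include <errno.h>
#include <iconv.h>
#include <stdio.h>

/*
 * Hypothetical helper (not a library routine): returns 0 if a conversion
 * from fromcode to tocode can be opened and closed, -1 otherwise.
 */
int conversion_supported(const char *tocode, const char *fromcode)
{
    iconv_t cd = iconv_open(tocode, fromcode);

    if (cd == (iconv_t)-1) {
        /* EINVAL: unsupported pair or unusable table/method; ENOMEM: no memory */
        perror("iconv_open");
        return -1;
    }

    /* The descriptor stays valid until it is closed, so release it here. */
    if (iconv_close(cd) == -1) {
        perror("iconv_close");
        return -1;
    }
    return 0;
}
```

A caller might use conversion_supported(tocode, fromcode) once at startup and fall back to another codeset pair when it returns -1.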

iconv()

Converts a sequence of characters in one codeset, contained in the array specified by inbuf, into a sequence of corresponding characters in another codeset, placed in the array specified by outbuf. The codesets are those specified in the iconv_open() call that returned the conversion descriptor cd. The inbuf argument points to a variable that points to the first character in the input buffer, and inbytesleft indicates the number of bytes remaining to be converted in that buffer. The outbuf argument points to a variable that points to the first available byte in the output buffer, and outbytesleft indicates the number of bytes still available in that buffer.

If a sequence of input bytes does not form a valid character in the specified codeset, conversion stops after the previous successfully converted character. If the input buffer ends with an incomplete character or shift sequence (see the Special Usage section), conversion stops after the previous successfully converted character. If the output buffer is not large enough to hold the entire converted output, conversion stops just prior to the character that would cause the output buffer to overflow. The variable pointed to by inbuf is updated to point to the byte following the last byte successfully used in the conversion. The value pointed to by inbytesleft is reduced to reflect the number of bytes still not converted in the input buffer. The variable pointed to by outbuf is updated to point to the byte following the last byte of converted output data. The value pointed to by outbytesleft is reduced to reflect the number of bytes still available in the output buffer.
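The pointer and count updates described above can be observed with a small wrapper. This is a sketch under stated assumptions: convert_chunk() is a hypothetical name, and the (void *) cast on the input pointer is a portability workaround for systems whose iconv() declares inbuf as char ** rather than const char **. The wrapper reports how many input bytes were consumed and how many output bytes were produced by one call.

```c
#include <iconv.h>
#include <stddef.h>

/*
 * Hypothetical wrapper: performs one iconv() call on src and reports how far
 * the input and output pointers advanced. Returns iconv()'s own return value.
 */
size_t convert_chunk(iconv_t cd, const char *src, size_t srclen,
                     char *dst, size_t dstcap,
                     size_t *consumed, size_t *produced)
{
    const char *in = src;       /* iconv() advances this past used input */
    char *out = dst;            /* iconv() advances this past written output */
    size_t inleft = srclen;
    size_t outleft = dstcap;

    /* The cast lets this compile whether inbuf is char ** or const char **. */
    size_t rc = iconv(cd, (void *)&in, &inleft, &out, &outleft);

    *consumed = (size_t)(in - src);     /* same as srclen - inleft */
    *produced = (size_t)(out - dst);    /* same as dstcap - outleft */
    return rc;
}
```

Even when the call fails, consumed and produced describe the portion that was converted before conversion stopped, so the caller can write that output and resume from src + consumed.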

If iconv() encounters a character in the input buffer that is legal but has no identical character in the target codeset, iconv() maps it to a predefined character, called the "galley character", which is defined when the conversion table is generated (see genxlt(1)).

iconv_close()

Deallocates the conversion descriptor cd and all other associated resources allocated by iconv_open().

APPLICATION USAGE

Portable applications must assume that conversion descriptors are not valid after calls to any of the exec functions.

Special Usage

In state-dependent encodings, characters are interpreted according to the "state" of the input. State shifts occur when a specific sequence of bytes is seen in the input. These sequences change the way subsequent characters are interpreted (for example, characters may initially be single-byte characters and, after a state shift, be interpreted as two-byte characters). For state-dependent encodings, the conversion descriptor after iconv_open() is in a codeset-dependent initial shift state, ready for immediate use with iconv().

For state-dependent encodings, the conversion descriptor cd is placed into its initial shift state by a call to iconv() for which inbuf is a null pointer, or for which inbuf points to a null pointer. When iconv() is called in this way, and outbuf is not a null pointer or a pointer to a null pointer, and outbytesleft points to a positive value, iconv() places into the output buffer the byte sequence that returns the output to its initial shift state. If the output buffer is not large enough to hold the entire reset sequence, iconv() fails and sets errno to E2BIG. Subsequent calls with inbuf set to other than a null pointer or a pointer to a null pointer cause the conversion to take place from the current state of the conversion descriptor.

For state-dependent encodings, the conversion descriptor is updated to reflect the shift state in effect at the end of the last successfully converted byte sequence.
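The reset call described in this section might look like the sketch below. flush_shift_state() is a hypothetical helper name. It passes a null inbuf so that iconv() returns the descriptor to its initial shift state and writes any reset sequence into the caller's buffer. For codesets that are not state-dependent, the expectation is that nothing is written and the helper returns 0.

```c
#include <iconv.h>
#include <stddef.h>

/*
 * Hypothetical helper: returns cd to its initial shift state and stores any
 * reset sequence in dst. Returns the number of bytes written, or (size_t)-1
 * if iconv() fails (for example with E2BIG when dst is too small).
 */
size_t flush_shift_state(iconv_t cd, char *dst, size_t dstcap)
{
    char *out = dst;
    size_t outleft = dstcap;

    /* A null inbuf requests the reset; inbytesleft is not used in that case. */
    if (iconv(cd, NULL, NULL, &out, &outleft) == (size_t)-1)
        return (size_t)-1;

    return dstcap - outleft;
}
```

Calling this after the last input buffer has been converted, and appending the returned bytes to the output, leaves a state-dependent output stream in its initial state.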

RETURN VALUE

iconv_open()

Upon successful completion, iconv_open() returns a conversion descriptor for use on subsequent calls to iconv(). Otherwise iconv_open() returns (iconv_t)-1 and sets errno to indicate the error.

iconv()

iconv() updates the variables pointed to by the arguments to reflect the extent of conversion, and returns the number of non-identical conversions performed. If the entire string in the input buffer is converted, the value pointed to by inbytesleft is zero. If an error occurs, iconv() returns (size_t)-1 and sets errno to indicate the error.

iconv_close()

Upon successful completion, iconv_close() returns a value of zero. Otherwise it returns -1 and sets errno to indicate the error.

ERRORS

iconv_open() fails if any of the following conditions are encountered:

ENOMEM

Insufficient storage space is available.

EINVAL

The conversion specified by the fromcode and tocode is not supported, or the table or method specified in the configuration file could not be read or loaded correctly. This error will also occur if the configuration file itself is faulty.

iconv() fails if any of the following conditions are encountered:

EILSEQ

Input conversion stopped because an input character does not belong to the input codeset, or because the conversion table contains no entry for this input character and no galley character was defined for that table.

E2BIG

Input conversion stopped due to lack of space in the output buffer.

EINVAL

Input conversion stopped due to an incomplete character or shift sequence at the end of the input buffer.

EBADF

The cd argument is not a valid open conversion descriptor.
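One common way to react to E2BIG is to enlarge the output buffer and resume where iconv() stopped. The following is a minimal sketch of that idea, not the method used by any HP-UX utility: convert_alloc() is a hypothetical helper, the starting capacity of 16 bytes is deliberately small so that the growth path is exercised, and the (void *) cast covers systems that declare inbuf as char **. Any other error is passed back to the caller through errno.

```c
#include <errno.h>
#include <iconv.h>
#include <stdlib.h>

/*
 * Hypothetical helper: converts all of src into a malloc()ed buffer, doubling
 * the buffer whenever iconv() reports E2BIG. Returns the buffer (caller frees)
 * and stores its length in *outlen, or returns NULL with errno set.
 */
char *convert_alloc(iconv_t cd, const char *src, size_t srclen, size_t *outlen)
{
    size_t cap = 16;            /* small on purpose, so growth is exercised */
    size_t used = 0;
    const char *in = src;
    size_t inleft = srclen;
    char *buf = malloc(cap);

    if (buf == NULL)
        return NULL;

    while (inleft > 0) {
        char *out = buf + used;
        size_t outleft = cap - used;
        size_t rc = iconv(cd, (void *)&in, &inleft, &out, &outleft);

        used = cap - outleft;   /* keep whatever was converted so far */
        if (rc != (size_t)-1)
            continue;           /* all input consumed, so the loop ends */

        if (errno != E2BIG) {   /* EILSEQ, EINVAL, EBADF: give up */
            int saved = errno;
            free(buf);
            errno = saved;
            return NULL;
        }

        /* E2BIG: double the buffer and resume where iconv() stopped. */
        char *bigger = realloc(buf, cap * 2);
        if (bigger == NULL) {
            free(buf);
            return NULL;
        }
        buf = bigger;
        cap *= 2;
    }

    *outlen = used;
    return buf;
}
```

Because iconv() has already advanced the input pointer and reduced inbytesleft when it returns E2BIG, the loop does not need to re-convert anything after growing the buffer.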

iconv_close() fails if any of the following conditions are encountered:

EBADF

The conversion descriptor is invalid.

EXAMPLES

The following example shows how the iconv() interfaces may be used for conversions.

#include <errno.h>
#include <iconv.h>
#include <stdio.h>      /* BUFSIZ, perror() */
#include <string.h>     /* memmove() */
#include <unistd.h>     /* read(), write() */

/*
 * FATAL, BAD_OPEN, BAD_CONVERSION, BAD_CLOSE, GOOD, and BAD are
 * application-defined, as is error(); error(FATAL, ...) is assumed
 * not to return.
 */
extern void error();    /* local error message */

int convert(const char *tocode, const char *fromcode, int Input);

main()
{
    ...
    convert("roman8", "iso88591", fd);
    ...
}

int convert(const char *tocode,     /* tocode name */
            const char *fromcode,   /* fromcode name */
            int Input)              /* input file descriptor */
{
    iconv_t cd;             /* conversion descriptor */
    int bytesread;          /* num bytes read into input buffer */
    int conv_errno;         /* errno saved right after iconv() */
    char inbuf[BUFSIZ];     /* input buffer */
    const char *inchar;     /* ptr to input character */
    size_t inbytesleft;     /* num bytes left in input buffer */
    char outbuf[BUFSIZ];    /* output buffer */
    char *outchar;          /* ptr to output character */
    size_t outbytesleft;    /* num bytes left in output buffer */
    size_t ret_val;         /* number of conversions */

    /* Initiate conversion -- get conversion descriptor */
    if ((cd = iconv_open(tocode, fromcode)) == (iconv_t)-1) {
        error(FATAL, BAD_OPEN);
    }

    inbytesleft = 0;        /* no unconverted bytes carried over yet */

    /* translate the characters */
    for ( ;; ) {
        /*
         * If any bytes are left over, they will be at the
         * beginning of the buffer on the next read().
         */
        inchar = inbuf;             /* points to input buffer */
        outchar = outbuf;           /* points to output buffer */
        outbytesleft = BUFSIZ;      /* space available in output buffer */

        if ((bytesread = read(Input, inbuf + inbytesleft,
                              (size_t)BUFSIZ - inbytesleft)) < 0) {
            perror("prog");
            return BAD;
        }
        if (!(inbytesleft += bytesread)) {
            break;                  /* end of conversions */
        }

        ret_val = iconv(cd, &inchar, &inbytesleft, &outchar, &outbytesleft);
        conv_errno = errno;         /* write() below may change errno */

        if (write(1, outbuf, (size_t)BUFSIZ - outbytesleft) < 0) {
            perror("prog");
            return BAD;
        }

        /*
         * iconv() returns the number of non-identical conversions
         * performed. If the entire string in the input buffer is
         * converted, the value pointed to by inbytesleft will be
         * zero. If the conversion stopped for any reason, the
         * value pointed to by inbytesleft will be non-zero and
         * errno is set to indicate the condition.
         */
        if ((ret_val == (size_t)-1) && (conv_errno == EINVAL)) {
            /*
             * Input conversion stopped due to an incomplete
             * character or shift sequence at the end of the
             * input buffer.
             */
            if (bytesread == 0) {
                /* End of file: nothing can complete it (see WARNINGS). */
                error(FATAL, BAD_CONVERSION);
            }
            /* Copy data left to the start of the buffer (may overlap). */
            memmove(inbuf, inchar, inbytesleft);
        } else if ((ret_val == (size_t)-1) && (conv_errno == EILSEQ)) {
            /*
             * Input conversion stopped due to an input byte
             * that does not belong to the input codeset.
             */
            error(FATAL, BAD_CONVERSION);
        } else if ((ret_val == (size_t)-1) && (conv_errno == E2BIG)) {
            /*
             * Input conversion stopped due to lack of space
             * in the output buffer. inbytesleft has the
             * number of bytes still to be converted.
             */
            memmove(inbuf, inchar, inbytesleft);
        }
        /* Go back and read from the input file. */
    }

    /* end conversion & get rid of the conversion table */
    if (iconv_close(cd) == -1) {
        error(FATAL, BAD_CLOSE);
    }
    return GOOD;
}

WARNINGS

If you use iconv() and compile/link your application archive on PA-RISC systems, note that iconv() has a dependency on libdld.sl that will require a change to the compile/link command:

Compile:

cc -Wl,-a,archive -Wl,-E -Wl,+n -l:libdld.sl -o outfile source

Or compile with CCOPTS and LDOPTS:

export CCOPTS="-Wl,-a,archive options -Wl,-E -l:libdld.sl"
export LDOPTS="options -E +n -l:libdld.sl"
cc -o outfile source

The option -Wl,-a,archive is positionally dependent and should occur at the beginning of the compile line. For optimum compatibility in future releases, you should avoid using archive libc with other shared libraries except for libdld.sl as needed above.

There is a corner-case situation for multi-byte characters that is not correctly handled by iconv(). If the last character in the file being converted is an invalid multi-byte character, iconv() returns EINVAL instead of EILSEQ. The application can get around this by checking whether EOF is reached or if this is the last buffer being converted. In this case, EINVAL should be treated as EILSEQ.
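The workaround described above can be isolated in a small function. The sketch below is one possible shape for it, with effective_iconv_errno() as a hypothetical name: the caller passes the errno value saved after a failed iconv() call together with a flag that is non-zero when end of file has been reached or the last buffer is being converted.

```c
#include <errno.h>

/*
 * Hypothetical helper: returns the errno value the application should act on.
 * EINVAL reported for the final buffer is treated as EILSEQ, because no
 * further input can complete the character.
 */
int effective_iconv_errno(int err, int at_eof)
{
    if (err == EINVAL && at_eof)
        return EILSEQ;
    return err;
}
```

All other error values, and EINVAL in the middle of the input, pass through unchanged, so existing handling for E2BIG and EILSEQ is unaffected.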

AUTHOR

iconv() was developed by HP.

FILES

/usr/lib/nls/iconv/system.config.iconv

System iconv configuration file containing codeset names supported by the operating system.

/usr/lib/nls/iconv/config.iconv

User customizable iconv configuration file containing additional codeset names.

/usr/lib/nls/iconv/tables

Directory containing tables used for conversion.

/usr/lib/nls/iconv/methods

Directory containing methods used for conversion.

SEE ALSO

genxlt(1), iconv(1), thread_safety(5).

HP-UX 11.0 - 11i Internationalization Features White Paper at http://docs.hp.com

STANDARDS CONFORMANCE

iconv_open(): XPG4

iconv(): XPG4

iconv_close(): XPG4

© 1983-2007 Hewlett-Packard Development Company, L.P.