CHAR and VARCHAR data types

Informix®

Informix supports the following character data types:

Table 1. Informix character data types
Informix data type	Description
`CHAR(n)`	SBCS and MBCS character data (max is 32767 bytes)
`VARCHAR(n[,m])`	SBCS and MBCS character data (max is 255 bytes)
`NCHAR(n)`	Same as `CHAR`, with specific collation order
`NVARCHAR(n[,m])`	Same as `VARCHAR`, with specific collation order
`LVARCHAR(n)`	max size varies depending on the IDS version

With Informix, both CHAR/VARCHAR and NCHAR/NVARCHAR data types can be used to store single-byte or multibyte encoded character strings. The only difference between CHAR/VARCHAR and NCHAR/NVARCHAR is in how they use sorting: N[VAR]CHAR types use the collation order, while [VAR]CHAR types use the byte order.

The character set used to store strings in CHAR/VARCHAR/NCHAR/NVARCHAR columns is defined by the DB_LOCALE environment variable.

The character set used by applications is defined by the CLIENT_LOCALE environment variable.

Informix uses Byte Length Semantics (the size N that you specify in [VAR]CHAR(N) is expressed in bytes, not characters as in some other databases)

Netezza®

Netezza supports following data types to store character data:

Table 2. Netezza character data types
Netezza data type	Description
`CHAR(n)`	SBCS character data using the database character set, where `n` is specified in bytes (max is 64000 bytes)
`VARCHAR(n)`	SBCS character data using the database character set, where `n` is specified in bytes (max is 64000 bytes)
`NCHAR(n)`	Unicode/UTF-8 character data, where `n` is specified in characters (max is 16000 characters)
`NVARCHAR(n)`	Unicode/UTF-8 character data, where `n` is specified in characters (max is 16000 characters)

Netezza stores single-byte character data in CHAR/VARCHAR columns, and stores UNICODE (UTF-8 encoded) character strings in NCHAR/NVARCHAR columns. You cannot store UTF-8 strings in CHAR/VARCHAR columns.

NCHAR/NVARCHAR data is always stored in UTF-8. The database character defines the encoding for CHAR and VARCHAR columns and is defined when creating the database with the CREATE DATABASE command; the default is latin9. Note that, at the time of writing these lines, Netezza V6 does not yet support a different database character set than latin9.

No automatic character set conversion is done by the Netezza software. When using CHAR/VARCHAR columns, the client application character set (LC_ALL, LANG) and the database character set defined by CREATE DATABASE must match. When using NCHAR/NVARCHAR columns, the client application character set must be UTF-8.

Solution

If your application uses a single-byte character set (i.e. latin9), you can create tables with the CHAR and VARCHAR types. However, if you want to store UNICODE (UTF-8) strings, you must use the NCHAR/NVARCHAR types instead when creating tables. In program sources you can use CHAR/VARCHAR; these types can hold single and multibyte character sets, based on the C POSIX locale.

Important: Netezza (V6 while writing these lines) supports only the latin9 database character set for CHAR/VARCHAR types. Since character set conversion is not supported, you can only implement either latin9 or UTF-8 based applications.

When using a multibyte character set (such as UTF-8), define database columns as NCHAR and NVARCHAR, with the size in character units, and use character length semantics in BDL programs with FGL_LENGTH_SEMANTICS=CHAR.

When extracting a database schema from a Netezza database, the fgldbsch schema extractor uses the size of the column in characters, not the octet length. If you have created a CHAR(10 (characters) ) column a in Netezza database using the UTF-8 character set, the .sch file will get a size of 10, that will be interpreted by FGL_LENGTH_SEMANTICS as a number of bytes or characters.

Do not forget to properly define the database client character set, which must correspond to the runtime system character set.

See also the section about Localization.

The CHAR/VARCHAR type translation can be controlled with the following FGLPROFILE entries:

dbi.database.dsname.ifxemul.datatype.char = { true | false }
dbi.database.dsname.ifxemul.datatype.varchar = { true | false }

For more details see IBM Informix emulation parameters in FGLPROFILE.