Description

iconv_unicode - code set conversion tables for Unicode

The following code set conversions are supported:

                    CODE SET CONVERSIONS SUPPORTED
                    ------------------------------
            FROM Code Set                             TO Code Set
Code                    FROM Filename     Target Code         TO Filename
                        Element                               Element

ISO 8859-1 (Latin 1)    8859-1            UTF-8               UTF-8
ISO 8859-2 (Latin 2)    8859-2            UTF-8               UTF-8
ISO 8859-3 (Latin 3)    8859-3            UTF-8               UTF-8
ISO 8859-4 (Latin 4)    8859-4            UTF-8               UTF-8
ISO 8859-5 (Cyrillic)   8859-5            UTF-8               UTF-8
ISO 8859-6 (Arabic)     8859-6            UTF-8               UTF-8
ISO 8859-7 (Greek)      8859-7            UTF-8               UTF-8
ISO 8859-8 (Hebrew)     8859-8            UTF-8               UTF-8
ISO 8859-9 (Latin 5)    8859-9            UTF-8               UTF-8
ISO 8859-10 (Latin 6)   8859-10           UTF-8               UTF-8
Japanese EUC            eucJP             UTF-8               UTF-8
Chinese/PRC EUC
(GB 2312-1980)          gb2312            UTF-8               UTF-8
ISO-2022                iso2022           UTF-8               UTF-8
Korean EUC              ko_KR-euc         Korean UTF-8        ko_KR-UTF-8
ISO-2022-KR             ko_KR-iso2022-7   Korean UTF-8        ko_KR-UTF-8
Korean Johap
(KS C 5601-1987)        ko_KR-johap       Korean UTF-8        ko_KR-UTF-8
Korean Johap
(KS C 5601-1992)        ko_KR-johap92     Korean UTF-8        ko_KR-UTF-8
Korean UTF-8            ko_KR-UTF-8       Korean EUC          ko_KR-euc
Korean UTF-8            ko_KR-UTF-8       Korean Johap        ko_KR-johap
                                          (KS C 5601-1987)        
Korean UTF-8            ko_KR-UTF-8       Korean Johap        ko_KR-johap92
                                          (KS C 5601-1992)
KOI8-R (Cyrillic)       KOI8-R            UCS-2               UCS-2
KOI8-R (Cyrillic)       KOI8-R            UTF-8               UTF-8
PC Kanji (SJIS)         PCK               UTF-8               UTF-8
PC Kanji (SJIS)         SJIS              UTF-8               UTF-8
UCS-2                   UCS-2             KOI8-R (Cyrillic)   KOI8-R
UCS-2                   UCS-2             UCS-4               UCS-4

UCS-2              UCS-2           UTF-7                   UTF-7
UCS-2              UCS-2           UTF-8                   UTF-8
UCS-4              UCS-4           UCS-2                   UCS-2
UCS-4              UCS-4           UTF-16                  UTF-16
UCS-4              UCS-4           UTF-7                   UTF-7
UCS-4              UCS-4           UTF-8                   UTF-8
UTF-16             UTF-16          UCS-4                   UCS-4
UTF-16             UTF-16          UTF-8                   UTF-8
UTF-7              UTF-7           UCS-2                   UCS-2
UTF-7              UTF-7           UCS-4                   UCS-4
UTF-7              UTF-7           UTF-8                   UTF-8
UTF-8              UTF-8           ISO 8859-1 (Latin 1)    8859-1
UTF-8              UTF-8           ISO 8859-2 (Latin 2)    8859-2
UTF-8              UTF-8           ISO 8859-3 (Latin 3)    8859-3
UTF-8              UTF-8           ISO 8859-4 (Latin 4)    8859-4
UTF-8              UTF-8           ISO 8859-5 (Cyrillic)   8859-5
UTF-8              UTF-8           ISO 8859-6 (Arabic)     8859-6
UTF-8              UTF-8           ISO 8859-7 (Greek)      8859-7
UTF-8              UTF-8           ISO 8859-8 (Hebrew)     8859-8
UTF-8              UTF-8           ISO 8859-9 (Latin 5)    8859-9
UTF-8              UTF-8           ISO 8859-10 (Latin 6)   8859-10
UTF-8              UTF-8           Japanese EUC            eucJP
UTF-8              UTF-8           Chinese/PRC EUC         gb2312
                                   (GB 2312-1980)
UTF-8              UTF-8           ISO-2022                iso2022
UTF-8              UTF-8           KOI8-R (Cyrillic)       KOI8-R
UTF-8              UTF-8           PC Kanji (SJIS)         PCK
UTF-8              UTF-8           PC Kanji (SJIS)         SJIS
UTF-8              UTF-8           UCS-2                   UCS-2
UTF-8              UTF-8           UCS-4                   UCS-4
UTF-8              UTF-8           UTF-16                  UTF-16
UTF-8              UTF-8           UTF-7                   UTF-7
UTF-8              UTF-8           Chinese/PRC EUC         zh_CN.euc
                                   (GB 2312-1980)

UTF-8                 UTF-8             ISO 2022-CN           zh_CN.iso2022-7
UTF-8                 UTF-8             Chinese/Taiwan Big5   zh_TW-big5
UTF-8                 UTF-8             Chinese/Taiwan EUC    zh_TW-euc
                                        (CNS 11643-1992)
UTF-8                 UTF-8             ISO 2022-TW           zh_TW-iso2022-7
Chinese/PRC EUC       zh_CN.euc         UTF-8                 UTF-8
(GB 2312-1980)
ISO 2022-CN           zh_CN.iso2022-7   UTF-8                 UTF-8
Chinese/Taiwan Big5   zh_TW-big5        UTF-8                 UTF-8
Chinese/Taiwan EUC    zh_TW-euc         UTF-8                 UTF-8
(CNS 11643-1992)
ISO 2022-TW           zh_TW-iso2022-7   UTF-8                 UTF-8
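
The tables list no direct conversion between two legacy code sets such as KOI8-R and ISO 8859-5, but each has an entry to and from UTF-8, so a two-step conversion is possible. The sketch below illustrates that idea with Python's built-in codecs rather than the Solaris conversion modules. The codec names `koi8_r` and `iso8859_5` are Python's own, not the filename elements listed above, and the helper name `convert_via_utf8` is hypothetical.

```python
def convert_via_utf8(data, from_codec, to_codec):
    """Convert bytes between two code sets, using UTF-8 as the pivot."""
    # Step 1: source code set -> UTF-8 (comparable to a FROM%UTF-8 module).
    utf8_bytes = data.decode(from_codec).encode("utf-8")
    # Step 2: UTF-8 -> target code set (comparable to a UTF-8%TO module).
    return utf8_bytes.decode("utf-8").encode(to_codec)


# Sample input: a Russian word encoded as KOI8-R bytes.
koi8_bytes = "Привет".encode("koi8_r")
iso_bytes = convert_via_utf8(koi8_bytes, "koi8_r", "iso8859_5")

# The byte values differ between the code sets, but the text is the same.
print(koi8_bytes.hex(), iso_bytes.hex())
```

Characters that do not exist in the target code set would raise `UnicodeEncodeError` in this sketch; how the Solaris modules treat such characters is not described on this page.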

Examples

Example 1 The library module filename

In the conversion library, /usr/lib/iconv (see iconv(3C)), each library module filename is composed of two symbolic elements separated by a percent sign (%). The first element specifies the code set being converted from; the second element specifies the target code set, that is, the code set being converted to.

In the conversion tables above, the first element is termed the “FROM Filename Element”. The second element, representing the target code set, is the “TO Filename Element”.

For example, the library module filename to convert from the Korean EUC code set to the Korean UTF-8 code set is:

ko_KR-euc%ko_KR-UTF-8
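
The naming rule above can be sketched as a small Python helper that joins the two filename elements with a percent sign. The function names `module_filename` and `module_path` are hypothetical, and appending `.so` is an assumption based on the `/usr/lib/iconv/*.so` pattern listed under Files. The helper does not check that a module actually exists.

```python
import posixpath

ICONV_DIR = "/usr/lib/iconv"  # conversion library directory named on this page


def module_filename(from_element, to_element):
    """Join the FROM and TO filename elements with a percent sign."""
    return f"{from_element}%{to_element}"


def module_path(from_element, to_element, suffix=".so"):
    """Build a candidate module path; the '.so' suffix is an assumption."""
    name = module_filename(from_element, to_element) + suffix
    return posixpath.join(ICONV_DIR, name)


print(module_filename("ko_KR-euc", "ko_KR-UTF-8"))  # prints ko_KR-euc%ko_KR-UTF-8
```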

Files

/usr/lib/iconv/*.so: conversion modules

Notes

ISO 8859 character sets using Latin alphabetic characters are distinguished as follows:

ISO 8859-1 (Latin 1)

For most West European languages, including:

Albanian	Finnish	Italian
Catalan	French	Norwegian
Danish	German	Portuguese
Dutch	Galician	Spanish
English	Irish	Swedish
Faeroese	Icelandic

ISO 8859-2 (Latin 2)

For most Latin-written Slavic and Central European languages:

Czech	Polish	Slovak
German	Rumanian	Slovene
Hungarian	Croatian

ISO 8859-3 (Latin 3)

Popularly used for Esperanto, Galician, Maltese, and Turkish.

ISO 8859-4 (Latin 4)

Introduces letters for Estonian, Latvian, and Lithuanian. It is an incomplete predecessor of ISO 8859-10 (Latin 6).

ISO 8859-9 (Latin 5)

Replaces the rarely needed Icelandic letters in ISO 8859-1 (Latin 1) with the Turkish ones.

ISO 8859-10 (Latin 6)

Adds the last Inuit (Greenlandic) and Sami (Lappish) letters that were not included in ISO 8859-4 (Latin 4) to complete coverage of the Nordic area.
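
The note on ISO 8859-9 (Latin 5) can be checked with a short script that lists every byte value the two code sets decode differently. This is a sketch using Python's built-in `iso8859_1` and `iso8859_9` codecs as stand-ins for the code sets; it does not exercise the Solaris conversion modules, and the helper name `differing_bytes` is hypothetical.

```python
def differing_bytes(codec_a, codec_b):
    """Return (byte, char_a, char_b) for each byte the codecs decode differently."""
    diffs = []
    for value in range(256):
        raw = bytes([value])
        char_a = raw.decode(codec_a)
        char_b = raw.decode(codec_b)
        if char_a != char_b:
            diffs.append((value, char_a, char_b))
    return diffs


# Print each differing byte with the Latin 1 and Latin 5 code points.
for value, latin1_char, latin5_char in differing_bytes("iso8859_1", "iso8859_9"):
    print(f"0x{value:02X}: U+{ord(latin1_char):04X} -> U+{ord(latin5_char):04X}")
```

With these codecs only six byte positions differ: the ones holding the Icelandic letters eth, thorn, and y with acute in Latin 1, which Latin 5 assigns to Turkish letters.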

	man pages section 5: Standards, Environments, and Macros Oracle Solaris 11.1 Information Library