ISO 10646 Coding Standard
To provide a common technical basis for the processing
of electronic information in various languages,
the International Organization for Standardization
(ISO) has developed an international coding standard
called ISO 10646. ISO 10646 provides a unified
standard for the coding of characters in all major
languages of the world, including traditional and
simplified Chinese characters.
The ISO 10646 standard provides a unified character-coding
standard for the communication and exchange of
electronic information. By adopting the ISO 10646
standard, various computer systems in different
parts of the world will be able to more accurately
store, process, transmit and display electronic
information in different languages, thus facilitating
the flow of electronic information and the conduct
of electronic transactions across geographical
areas.
The ISO released the first version of the ISO
10646 standard in 1993. It was called ISO/IEC
10646-1:1993. In 2000, the ISO released ISO/IEC
10646-1:2000, which is an updated version of ISO/IEC
10646-1:1993. ISO/IEC 10646-1:2000 contains 27,484
ideographic characters consisting of the 20,902
ideographic characters of ISO/IEC 10646-1:1993
plus 6,582 newly defined ideographic characters
in Extension A. In November 2001, the ISO
released ISO/IEC 10646-2:2001 as a supplement
to ISO/IEC 10646-1:2000. ISO/IEC 10646-2:2001
contains 42,711 newly defined ideographic characters
in Extension B, bringing the total number
of ideographic characters in the ISO
10646 standard to more than 70,000.
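The counts above can be checked with a little arithmetic. The following is a minimal sketch in Python; the code point ranges are not stated in this text and are supplied here as an assumption (they are the ranges originally assigned to these three sets of ideographs), and the names `IDEOGRAPH_BLOCKS`, `ideograph_counts` and `classify` are purely illustrative.

```python
# Inclusive code point ranges as originally assigned to each set of
# ideographs. ASSUMPTION: these ranges are not given in the text above;
# they are included only to illustrate where the counts come from.
IDEOGRAPH_BLOCKS = {
    "ISO/IEC 10646-1:1993 base set": (0x4E00, 0x9FA5),
    "Extension A": (0x3400, 0x4DB5),
    "Extension B": (0x20000, 0x2A6D6),
}


def ideograph_counts():
    """Return the number of code points in each inclusive range."""
    return {name: hi - lo + 1 for name, (lo, hi) in IDEOGRAPH_BLOCKS.items()}


def classify(ch):
    """Return the name of the set containing the character, or None."""
    cp = ord(ch)
    for name, (lo, hi) in IDEOGRAPH_BLOCKS.items():
        if lo <= cp <= hi:
            return name
    return None


counts = ideograph_counts()
for name, n in counts.items():
    print(f"{name}: {n:,}")          # 20,902 / 6,582 / 42,711
print(f"Total: {sum(counts.values()):,}")  # 70,195, i.e. more than 70,000
print(classify("\u4e2d"))            # a common character from the base set
```

Under these assumed ranges, the base set and Extension A together give the 27,484 ideographs of ISO/IEC 10646-1:2000, and adding Extension B gives 70,195.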
In April 2004, the ISO published ISO/IEC 10646:2003, a single publication that merges
the two previous releases of the ISO 10646 standard: ISO/IEC
10646-1:2000 and its supplement ISO/IEC 10646-2:2001. The ideographic characters in
ISO/IEC 10646:2003 are therefore the same as those in ISO/IEC 10646-1:2000 and ISO/IEC 10646-2:2001 combined.
In November 2005, the ISO released ISO/IEC 10646:2003/Amd 1:2005.
This amendment defines a subset of ideographic characters named the International Ideographs Core (IICORE).
The subset makes day-to-day electronic communication in
Chinese more practical on resource-limited devices. It also facilitates global electronic communication
in Chinese by providing a shared subset of ideographs.
The IICORE contains 9,810 characters and can be implemented in devices with limited memory or
input/output capability, or in applications where supporting the complete ISO 10646 ideograph repertoire is not feasible.
Information on the development of the IICORE is available for reference at
http://www.cse.cuhk.edu.hk/~irg/irg/IICore/IICore.htm
For standardisation purposes, some ISO documents (including the ISO 10646 international coding
standard and its amendments) are made freely available by the ISO at the following website:
http://standards.iso.org/ittf/PubliclyAvailableStandards/
Unicode and its relationship with the ISO 10646 standard
Unicode is a character coding system designed
by the Unicode Consortium to support the interchange,
processing and display of the written texts of
all major languages in the world. Members of the
Unicode Consortium are mainly hardware and software
vendors.
In 1991, the ISO and the Unicode Consortium decided
to cooperate in defining a universal coding standard
for multilingual texts. Since then, the two organizations
have been working very closely to extend the ISO
10646 standard and Unicode, and to keep them synchronized.
The ISO publishes the characters and
code points of the ISO 10646 standard, while the
Unicode Consortium supplements them
with implementation algorithms
and semantic information. The ISO 10646 standard
and Unicode are code-to-code identical: each character
has the same code point in both. Unicode
can be regarded as the implementation version
of the ISO 10646 standard. Therefore, products
supporting Unicode also support the ISO 10646
standard.
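As an illustration of what "code-to-code identical" means in practice, the sketch below uses Python, whose strings are sequences of Unicode code points and whose standard `unicodedata` module exposes character names. The helper `describe` and the sample characters are illustrative choices, not something taken from either standard's text; the point is only that the number printed for each character is its code point in Unicode and in ISO 10646 alike.

```python
import unicodedata


def describe(ch):
    """Return (code point in U+XXXX notation, character name) for one character."""
    cp = ord(ch)  # the numeric code point shared by Unicode and ISO 10646
    return f"U+{cp:04X}", unicodedata.name(ch, "<unnamed>")


# Illustrative samples: one ideograph from the original set, the first
# code point of Extension A, and the first code point of Extension B.
for ch in ["\u4e2d", "\u3400", "\U00020000"]:
    print(describe(ch))
```

A product that handles these code points correctly as Unicode is, by the same token, handling the corresponding ISO 10646 code positions.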
Unicode 3.0 was officially released by the Unicode
Consortium in February 2000. It contains 49,194
characters of different languages, of which 27,484
are East Asian (Han) ideographic characters. Unicode
3.0 is synchronized with ISO/IEC 10646-1:2000.
Unicode 3.1 was released in March 2001. Its main feature is the addition of
44,946 new characters, of which 42,711 are ideographic characters. Together with the existing
characters in Unicode 3.0, Unicode 3.1 has 94,140 characters, of which more than 70,000 are ideographic characters.
Unicode 4.0 was released in April 2003. It adds 1,226 new characters, but the ideographic characters
it includes are the same as those in Unicode 3.1. Unicode 4.0 is synchronized with ISO/IEC 10646:2003.
Unicode 4.1 was released by the Unicode Consortium in March 2005
and corresponds to ISO/IEC 10646:2003 and its Amendments.
The latest version of the Unicode Standard is version 5.0, released in July 2006,
in which the ideographic characters are the same as those in Unicode 4.1.
Hong Kong Supplementary Character Set (HKSCS)
HKSCS-2008 is an updated version of the Hong Kong Supplementary Character Set-2001 (HKSCS-2001),
which was published in December 2001. HKSCS-2008 is aligned technically with ISO/IEC 10646:2003,
published in April 2004 by the International Organization for Standardization (ISO), and with its Amendment 1.
The Government has submitted the Hong Kong Supplementary Character Set (HKSCS-2008)
to the ISO for inclusion in the ISO 10646
international coding standard. The current version of the ISO 10646 standard includes all
5,009 characters of the HKSCS.
(Notice: Revised Principle for the Inclusion of Characters in the HKSCS)
(c) 2002-2010 Hong Kong Productivity Council
Last Updated: 1-04-2010