ISO/IEC 10646:2020 Information technology: Universal coded character set

Standard number: ISO/IEC 10646:2020

Chinese title: 信息技术 通用编码字符集

English title: Information technology — Universal coded character set (UCS)

Publication date: 2020-12

Scope

This document
- specifies the architecture of the UCS;
- defines terms used for the UCS;
- describes the general structure of the UCS codespace;
- specifies the assigned planes of the UCS: the Basic Multilingual Plane (BMP) of the UCS, the Supplementary Multilingual Plane (SMP), the Supplementary Ideographic Plane (SIP), the Tertiary Ideographic Plane (TIP), and the Supplementary Special-purpose Plane (SSP);
- defines a set of graphic characters used in scripts and the written form of languages on a world-wide scale;
- specifies the names for the graphic characters and format characters of the BMP, SMP, SIP, TIP, SSP and their coded representations within the UCS codespace;
- specifies the coded representations for control characters and private use characters;
- specifies three encoding forms of the UCS: UTF-8, UTF-16, and UTF-32;
- specifies seven encoding schemes of the UCS: UTF-8, UTF-16, UTF-16BE, UTF-16LE, UTF-32, UTF-32BE, and UTF-32LE;
- specifies the management of future additions to this coded character set.
NOTE The determination of suitability of these characters for use as identifiers in programming languages is not specified by this document but can be found in an external reference. See Annex U.
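
As an informal illustration of the planes and encoding schemes listed above, the following Python sketch maps a code point to its plane and encodes one character under each of the seven schemes. The helper names `plane_of` and `encode_all` are hypothetical, and the codec names are Python's, not identifiers from the standard. It assumes the usual plane numbering (BMP 0, SMP 1, SIP 2, TIP 3, SSP 14) and a codespace of 0 to 10FFFF. Python's unmarked `utf-16` and `utf-32` codecs prepend a byte order mark and use the platform's native byte order, which is one permitted behaviour and not the only one.

```python
# Plane numbers for the five assigned planes named in the scope
# (assumed numbering: BMP=0, SMP=1, SIP=2, TIP=3, SSP=14).
PLANES = {0: "BMP", 1: "SMP", 2: "SIP", 3: "TIP", 14: "SSP"}

# Python codec names that correspond to the seven encoding schemes.
SCHEMES = [
    "utf-8",
    "utf-16", "utf-16-be", "utf-16-le",
    "utf-32", "utf-32-be", "utf-32-le",
]


def plane_of(code_point: int) -> str:
    """Return the abbreviation of the plane that contains code_point."""
    if not 0 <= code_point <= 0x10FFFF:
        raise ValueError("code point is outside the UCS codespace")
    plane = code_point >> 16  # each plane holds 65,536 code points
    return PLANES.get(plane, f"plane {plane}")


def encode_all(ch: str) -> dict:
    """Encode a single character under each scheme in SCHEMES."""
    return {name: ch.encode(name) for name in SCHEMES}


if __name__ == "__main__":
    ch = "\U0001F600"  # a character outside the BMP
    print(f"U+{ord(ch):04X} is in the {plane_of(ord(ch))}")
    for name, data in encode_all(ch).items():
        print(f"{name:10} {data.hex(' ')}")
```

A character outside the BMP takes four bytes in UTF-8, a surrogate pair (two 16-bit code units) in UTF-16, and one 32-bit code unit in UTF-32. The BE and LE variants differ only in byte order.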
