ISO/IEC 14496-10:2008 Information technology - Coding of audio-visual objects - Part 10: Advanced Video Coding
Standard number: ISO/IEC 14496-10:2008
Chinese title: 信息技术 音频-可视对象的编码 第10部分:高级视频编码
English title: Information technology — Coding of audio-visual objects — Part 10: Advanced Video Coding
Publication date: 2008-09
Scope
ISO/IEC 14496-10:2008 was developed jointly with the ITU-T in response to the growing need for higher compression of moving pictures for various applications such as digital storage media, television broadcasting, Internet streaming, and real-time audiovisual communication. It is also designed to enable the use of the coded video representation in a flexible manner for a wide variety of network environments. It is designed to be generic in the sense that it serves a wide range of applications, bit rates, resolutions, qualities and services. The use of ISO/IEC 14496-10:2008 allows motion video to be manipulated as a form of computer data and to be stored on various storage media, transmitted and received over existing and future networks and distributed on existing and future broadcasting channels. In the course of creating ISO/IEC 14496-10:2008, requirements from a wide variety of applications have been considered, necessary algorithmic elements have been developed, and these have been integrated into a single syntax. Hence, ISO/IEC 14496-10:2008 will facilitate video data interchange among different applications.
The coded representation specified in the syntax is designed to enable a high compression capability with minimal degradation of image quality. The algorithm is not ordinarily lossless, as the exact source sample values are typically not preserved through the encoding and decoding processes. A number of syntactical features with associated decoding processes are defined that can be used to achieve highly efficient compression, and individual selected regions can be sent without loss. The expected encoding algorithm (not specified in ISO/IEC 14496-10:2008) can select between inter and intra coding for block-shaped regions of each picture. Inter coding uses motion vectors for block-based inter-picture prediction to exploit temporal statistical dependencies between different pictures. Intra coding uses spatial prediction modes to exploit spatial statistical dependencies in the source signal within a single picture. Motion vectors and intra prediction modes may be associated with a variety of block sizes in a picture. The residual signal remaining after intra or inter prediction is then processed using a spatial transform to remove spatial correlation within each transform block. The transformed blocks are then quantised. Quantisation is an irreversible process that forms an approximation that can be represented using a reduced number of bits while incurring some loss of fidelity. Finally, the motion vectors or intra prediction modes are combined with the quantised transform coefficient information and encoded using either context-adaptive variable length codes or context-adaptive binary arithmetic coding.
Annexes A through E and G contain normative requirements and are an integral part of ISO/IEC 14496-10:2008. Annex A defines eleven profiles (Baseline, Main, Extended, High, High 10, High 4:2:2, High 4:4:4 Predictive, High 10 Intra, High 4:2:2 Intra, High 4:4:4 Intra, and CAVLC 4:4:4 Intra), each being tailored to a group of application domains, and also defines levels of capability within each of these profiles. Annex B specifies the syntax and semantics of a byte stream format for delivery of the coded video as an ordered stream of bytes or bits. Annex C specifies the Hypothetical Reference Decoder and its use to check bitstream and decoder conformance. Annex D specifies syntax and semantics for Supplemental Enhancement Information message payloads. Annex E specifies syntax and semantics of the Video Usability Information parameters of coded video sequences. Annex G specifies scalable video coding in three additional profiles (Scalable Baseline, Scalable High, and Scalable High Intra) which enable a coded video bitstream to be structured into layers, such that layered subsets of the bitstream can be independently decodable to provide video quality commensurate with the quantity of data that remains in each smaller bitstream.
ISO/IEC 14496-10:2008 is the fourth edition of the ISO/IEC 14496-10 specification. It includes the same technical material specified in the first edition (ISO/IEC 14496-10:2003), plus the following:
– integrated errata corrections that were specified in corrigenda of 2004, 2005, and 2006;
– enhancements primarily for high-resolution, high-quality video applications that were specified in a 2004 amendment referred to as "Fidelity Range Extensions" and a 2007 amendment referred to as "Professional Application Extensions";
– enhancements supporting additional color spaces and aspect ratio definitions specified in a 2007 amendment;
– enhancements supporting scalable video coding as specified in a 2007 amendment.
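The scope text describes a hybrid scheme: predict a block, form the residual, then quantise it, with quantisation being the irreversible step. The following is a minimal Python sketch of that idea, not the process defined in ISO/IEC 14496-10:2008. The function names, the mean-of-neighbours predictor, the uniform quantiser, and the sample values are all assumptions made for illustration, and the spatial transform and entropy coding stages are omitted for brevity.

```python
import math

# Illustrative sketch only. The predictor, quantiser, and all names below are
# hypothetical and are NOT the normative processes of ISO/IEC 14496-10.


def dc_prediction(top, left):
    """Predict one value for the whole block: the rounded mean of the
    already-decoded neighbouring samples above and to the left."""
    neighbours = list(top) + list(left)
    return (sum(neighbours) + len(neighbours) // 2) // len(neighbours)


def encode_block(block, pred, step):
    """Form the residual (sample - prediction) and quantise it uniformly.
    The spatial transform that would normally precede quantisation is
    skipped here to keep the sketch short."""
    return [[math.floor((s - pred) / step + 0.5) for s in row] for row in block]


def decode_block(levels, pred, step):
    """Rescale the quantised levels and add the prediction back."""
    return [[pred + q * step for q in row] for row in levels]


def max_abs_error(a, b):
    """Largest per-sample difference between two equally sized blocks."""
    return max(abs(x - y) for ra, rb in zip(a, b) for x, y in zip(ra, rb))


# Hypothetical 4x4 block of samples and its decoded neighbours.
TOP = [100, 102, 104, 106]
LEFT = [98, 100, 101, 103]
BLOCK = [
    [101, 103, 105, 107],
    [100, 102, 104, 106],
    [99, 101, 103, 105],
    [98, 100, 102, 104],
]

if __name__ == "__main__":
    pred = dc_prediction(TOP, LEFT)
    for step in (1, 4):
        levels = encode_block(BLOCK, pred, step)
        recon = decode_block(levels, pred, step)
        print(f"step={step} levels={levels} max_error={max_abs_error(BLOCK, recon)}")
```

With a step of 1 the block is reconstructed exactly, while a larger step yields smaller level values (cheaper to represent) at the cost of a bounded reconstruction error. This is the fidelity-for-bits trade-off the scope attributes to quantisation.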