What Is a Codec?

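This is an installment in our ongoing series of "What Is...?" articles, designed to offer definitions, history, and context around significant terms and issues in the online video industry.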

Executive Summary

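Codecs are the oxygen of the streaming media market; no codecs, no streaming media. From shooting video to editing to encoding our streaming media files for delivery, codecs are involved every step of the way. Many video producers also touch the DVD-ROM and Blu-ray markets, as well as broadcast, and codecs play a role there as well.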

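Though you probably know what a codec is, do you really know codecs? Certainly not as well as you will after reading this article. First, we'll cover the basics of how codecs work, then we'll examine the different roles performed by various codecs. Next, we'll look at how H.264 became the most widely used video codec today, and we'll finish with a quick discussion of audio codecs.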

Codec Basics

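Codecs are compression technologies with two components: an encoder to compress the files, and a decoder to decompress them. There are codecs for data (PKZIP), still images (JPEG, GIF, PNG), audio (MP3, AAC), and video (Cinepak, MPEG-2, H.264, VP8).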

There are two kinds of codecs: lossless and lossy. Lossless codecs, like PKZIP or PNG, reproduce the same exact file as the original upon decompression. There are some lossless video codecs, including the Apple Animation codec and the Lagarith codec, but these can't compress video to data rates low enough for streaming.
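To make the lossless guarantee concrete, here is a minimal sketch in Python. It uses the standard library's zlib module (a general-purpose DEFLATE compressor, chosen here only for illustration, not a video codec) to show that the decompressed bytes match the original exactly. The sample data is an assumption made up for the example.

```python
import zlib

# Highly redundant sample data standing in for any file (made up for this example).
original = b"streaming media " * 1000

compressed = zlib.compress(original, 9)   # 9 = maximum compression level
restored = zlib.decompress(compressed)

print(len(original), len(compressed))     # the compressed copy is much smaller
print(restored == original)               # lossless: byte-for-byte identical
```

Real video frames are far less repetitive than this sample, which is one reason lossless video codecs cannot reach streaming data rates.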

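In contrast to lossless codecs, lossy codecs produce a facsimile of the original file upon decompression, but not the original file. Lossy codecs have one immutable trade-off: the lower the data rate, the less the decompressed file looks (or sounds) like the original. In other words, the more you compress, the more quality you lose.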
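The trade-off between data rate and fidelity can be illustrated with a toy example. The sketch below assumes a very crude form of lossy compression, keeping fewer bits per 8-bit sample, which is not how any particular codec works (real codecs use transforms and perceptual models). The function names are hypothetical; the point is only that fewer bits means a larger error after decoding.

```python
def quantize(samples, bits):
    """Keep only the top `bits` bits of each 8-bit sample (a lossy step)."""
    shift = 8 - bits
    # Drop the low bits, then scale back to the 8-bit range on "decode".
    return [(s >> shift) << shift for s in samples]


def mean_abs_error(original, decoded):
    """Average absolute difference between original and decoded samples."""
    return sum(abs(a - b) for a, b in zip(original, decoded)) / len(original)


samples = list(range(256))  # a simple 8-bit ramp used as test data

for bits in (8, 6, 4, 2):
    decoded = quantize(samples, bits)
    print(bits, "bits per sample -> mean error", mean_abs_error(samples, decoded))
```

Running the loop shows the error growing as the number of bits kept per sample shrinks, which mirrors the rule that more compression costs more quality.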

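Lossy compression technologies use two types of compression: intra-frame and inter-frame. Intra-frame compression is essentially still image compression applied to video, with each frame compressed without reference to any other. For example, Motion-JPEG uses only intra-frame compression, encoding each frame as a separate JPEG image. The DV codec also uses solely intra-frame compression, as does DVCPRO-HD, which essentially divides each HD frame into four SD DV blocks, all encoded solely via intra-frame compression.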

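In contrast, inter-frame compression uses redundancies between frames to compress video. For example, in a talking head scenario, much of the background remains static. Inter-frame techniques store the static background information once, then store only the changed information in subsequent frames. Inter-frame compression is much more efficient than intra-frame compression, so most codecs are optimized to search for and leverage redundant information between frames.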

Early CD-ROM based codecs like Cinepak and Indeo used two types of frames for this operation: key frames and delta frames. Key frames stored the complete frame and were compressed only with intra-frame compression. During encoding, the pixels in delta frames were compared to pixels in previous frames, and redundant information was removed. The remaining data in each delta frame was also compressed using intra-frame techniques as necessary to meet the target data rate of the file.


Figure 1. Key and delta frames as deployed by CD-ROM based codecs.

This is shown in Figure 1, which is a talking head video of the painter shown on the upper left. During the video, the only regions in the frame that change are the mouth, cigar, and eyes. The four delta frames store only the blocks of pixels that have changed and refer back to the key frame during decompression for the redundant information.

In this scenario, using an animated file, inter-frame compression would be lossless, because you could recreate the original animation bit for bit with information stored in the key and delta frames. With real-world video the operation isn't lossless, but it is very efficient, which explains why talking head videos encode at much higher quality than soccer matches or NASCAR races.
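The key frame and delta frame scheme described above can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not the Cinepak or Indeo algorithm: each frame is modeled as a list of labeled blocks, each delta frame is compared with the frame immediately before it, and the frame contents and function names are hypothetical.

```python
def encode(frames):
    """Store the first frame whole (key frame), then only the blocks that changed."""
    key_frame = list(frames[0])
    deltas = []
    previous = key_frame
    for frame in frames[1:]:
        # A delta frame maps block index -> new block value.
        delta = {}
        for i, (old, new) in enumerate(zip(previous, frame)):
            if old != new:
                delta[i] = new
        deltas.append(delta)
        previous = frame
    return key_frame, deltas


def decode(key_frame, deltas):
    """Rebuild every frame by applying each delta to the running picture."""
    current = list(key_frame)
    frames = [list(current)]
    for delta in deltas:
        for i, block in delta.items():
            current[i] = block
        frames.append(list(current))
    return frames


# Hypothetical talking-head frames: the background never changes.
frames = [
    ["bg", "bg", "eyes_open", "mouth_closed"],
    ["bg", "bg", "eyes_open", "mouth_open"],
    ["bg", "bg", "eyes_shut", "mouth_open"],
    ["bg", "bg", "eyes_open", "mouth_closed"],
]

key_frame, deltas = encode(frames)
assert decode(key_frame, deltas) == frames  # bit-for-bit reconstruction
print(deltas)
```

Because the sample frames behave like an animation, with blocks that are either identical or different, the round trip is exact. With camera footage, noise makes nearly every block differ slightly, so a practical codec has to decide what counts as "unchanged", and that is where the loss comes in.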

Long GOP Formats

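Since the CD-ROM days, inter-frame techniques have advanced, and most codecs, including MPEG-2, H.264, and VC-1, now use three frame types for compression: I-frames, B-frames, and P-frames, as shown in Figure 2. I-frames are the same as key frames and are compressed solely with intra-frame techniques, making them the largest and least efficient frame type.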

Figure 2. I-, B-, and P-frames as used in most advanced compression technologies.

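B-frames and P-frames are both delta frames. P-frames are the simplest, and can utilize redundant information in any previous I- or P-frame. B-frames are more complex, and can utilize redundant information in any previous or subsequent I-, B-, or P-frame. This makes B-frames the most efficient of the three frame types.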

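These multiple frame types are stored in a group of pictures, or GOP, that starts with each I-frame and includes all frames up to but not including the subsequent I-frame. Codecs that use all three frame types are often called "long GOP formats," primarily when the codecs are used in non-linear editing systems. This highlights the second fundamental trade-off of lossy compression technologies: quality for decode complexity. That is, the more quality a codec delivers, the harder it is to decode, particularly in interactive applications like video editing.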
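The GOP definition above translates directly into code. The following sketch assumes the frame types arrive as a string in display order, and the frame pattern itself is hypothetical; the function name is illustrative only.

```python
def split_into_gops(frame_types):
    """Split a string of frame types (for example 'IBBP...') into GOPs.

    Each GOP starts at an I-frame and runs up to, but not including,
    the next I-frame.
    """
    gops = []
    for frame_type in frame_types:
        if frame_type == "I" or not gops:
            gops.append("")      # an I-frame opens a new GOP
        gops[-1] += frame_type
    return gops


print(split_into_gops("IBBPBBPBBIBBPBBP"))
```

For the sample pattern, the function returns two groups, one of nine frames and one of seven, each beginning with its I-frame.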

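The first long GOP format used in non-linear editing systems was HDV, an MPEG-2-based format, and you can imagine the complexity it introduced. With DV and Motion-JPEG, for example, each frame is completely self-referential, so you can drag the editing playhead to any frame in the video and it can decompress in real time.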

However, with MPEG-2-based HDV, if you drag the playhead to a B-frame, the non-linear editor has to decompress all frames referenced by that B-frame, and those frames could be located before or after that B-frame in the timeline. On the underpowered computer systems of the day, most working with 32-bit operating systems that could address only 2GB of memory, long GOP formats caused noticeable latency that made editing unresponsive.
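To see why scrubbing to a B-frame is expensive, the sketch below counts the frames an editor would have to decode to display a single frame. It assumes a simplified MPEG-2-style reference model that is an assumption of this example rather than a description of any specific decoder: the sequence is in display order and starts with an I-frame, a P-frame references the nearest earlier I- or P-frame, and a B-frame references the nearest I- or P-frame on each side.

```python
def frames_to_decode(frame_types, index):
    """Return the sorted indices that must be decoded to display frame `index`."""
    needed = set()

    def visit(i):
        if i in needed:
            return
        needed.add(i)
        kind = frame_types[i]
        if kind == "I":
            return  # I-frames are self-contained
        # Nearest earlier anchor frame (I or P).
        prev_anchor = next(j for j in range(i - 1, -1, -1) if frame_types[j] in "IP")
        visit(prev_anchor)
        if kind == "B":
            # Nearest later anchor frame (I or P), if the sequence has one.
            later = [j for j in range(i + 1, len(frame_types)) if frame_types[j] in "IP"]
            if later:
                visit(later[0])

    visit(index)
    return sorted(needed)


long_gop = "IBBPBBPBBI"    # hypothetical long GOP pattern
intra_only = "IIIIIIIIII"  # every frame self-contained, as with DV or Motion-JPEG

print(frames_to_decode(long_gop, 7))    # a B-frame late in the GOP
print(frames_to_decode(intra_only, 7))  # the same position in intra-only video
```

Under these assumptions, displaying the B-frame at position 7 requires decoding five frames (the chain of anchors back to the I-frame, plus the following I-frame), while the intra-only stream needs only the one frame under the playhead.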

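As camcorders increasingly came to rely on long GOP formats like MPEG-2 and H.264 to store their data, a new type of codec, generically called an intermediate codec, arrived on the scene. These include CineForm from Cineform Inc., Apple ProRes, and Avid DNxHD. These codecs use solely intra-frame compression techniques for maximum editing responsiveness, and very high data rates for quality retention.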

Function-Specific Codecs

These intermediate codecs highlight the fact that, while there is some crossover, it's useful to identify codecs by their function, which includes the following classes:

Acquisition codecs used in camcorders

These include the Motion-JPEG-style intra-frame compression used in DV and DVCPRO HD, MPEG-2 as used in Sony's XDCAM HD and HDV, and H.264 as used in AVCHD and many digital SLR cameras. The role of the codec here is to capture at as high a quality as possible while meeting the data rate requirements of the on-board storage mechanism.

Intermediate codecs, as discussed above, used primarily during editing

As mentioned, in this role, these codecs are designed to optimize editing responsiveness and quality.

Delivery codecs

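These include MPEG-2 for DVD, broadcast, and satellite; MPEG-2, VC-1, and H.264 for Blu-ray; and H.264, VP6, WMV, WebM, and multiple other formats for streaming. In this role, the codec must match the data rate mandated by the delivery platform, which, in the case of streaming, is far below the rates used for acquisition.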

Codecs and Container Formats

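It's important to distinguish codecs from container formats, though sometimes they share the same name. Briefly, container formats, or wrappers, are file formats that can contain specific types of data, including audio, video, closed captioning text, and associated metadata. Though there are some general-purpose container formats, like QuickTime, most container formats target one aspect of the production and distribution pipeline, like MXF for file-based capture on a camcorder, and FLV and WebM for streaming Flash and WebM content.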

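In some instances, container formats have a single or predominant codec, like Windows Media Video and the WMV container format. However, most container formats can hold video from multiple codecs. QuickTime has perhaps the broadest use, with some camcorders capturing MPEG-2 or H.264 video in the QuickTime container format, and large swaths of video distributed on iTunes with the MOV extension.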

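One area of potential confusion relates to MPEG-4, which is both a container format (MPEG-4 Part 1, with the MP4 file format now specified in Part 14) and a codec (MPEG-4 Part 2). Technically, at least from the ISO side, H.264 is also an MPEG-4 codec (MPEG-4 Part 10), and it has largely supplanted use of the MPEG-4 Part 2 codec. As a container format, an MP4 file can contain video encoded using the MPEG-2, MPEG-4, VC-1, H.263, and H.264 codecs.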

Encode your MP4 file with VC-1, however, and neither the QuickTime Player nor any iOS device will be able to play the file. In this regard, when producing a file for distribution, it's critical to choose a codec and container format that are compatible with the playback capabilities of your target viewers.
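The distinction matters in practice because playback depends on both layers. The sketch below models that check in Python. Everything in it is hypothetical: the MediaFile class, the can_play function, and especially the capability table, which describes an imaginary player and is not real device data.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MediaFile:
    container: str    # the wrapper, for example "MP4"
    video_codec: str  # the codec of the video stream inside it


# Hypothetical capability table for an imaginary player; not real device data.
EXAMPLE_PLAYER_SUPPORT = {
    "MP4": {"H.264", "MPEG-4"},
    "MOV": {"H.264", "MPEG-4"},
}


def can_play(media, support=EXAMPLE_PLAYER_SUPPORT):
    """A file plays only if the player handles the container and the codec inside it."""
    return media.video_codec in support.get(media.container, set())


print(can_play(MediaFile("MP4", "H.264")))  # supported container and codec
print(can_play(MediaFile("MP4", "VC-1")))   # supported container, unsupported codec
```

The second file fails even though its extension looks familiar, which is the situation described above for VC-1 video in an MP4 file.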

Most Roads Lead to H.264

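Historically, video codecs have evolved on multiple disparate paths, and it's interesting that most paths have led to H.264, which is why the codec enjoys such predominance today. One path was via the International Organization for Standardization (ISO), whose standards impact the photography, computer, and consumer electronics markets. The ISO produced its first video standard, MPEG-1, in 1993, followed by MPEG-2 in 1994, MPEG-4 in 1999, and AVC/H.264 in 2002.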

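The next path was through the International Telecommunication Union (ITU), which is the leading United Nations agency for information and communication technology issues, and which has contributed to standards in the telephone, radio, and television markets. The ITU debuted its first video conferencing-related standard, H.120, in 1984, with H.261 following in 1990, H.262 in 1994, H.263 in 1995, and H.264, which was jointly developed with the ISO, in 2002.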

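As we've seen, digital camcorders originally used the DV codec, then transitioned to MPEG-2, which continues to have significant share. AVCHD is an H.264-based format, and use of this format in personal camcorders continues to grow, as does the number of camcorders using the H.264-based AVC-Intra format. H.264 has also captured the market for video-capable digital SLR cameras like the Canon 7D, virtually all of which use it.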

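On the delivery side, while most cable TV broadcasts are still MPEG-2-based, H.264 is growing fast in cable and enjoys broad adoption in satellite broadcast. The streaming media markets were dominated originally by proprietary codecs, initially RealNetworks' RealVideo, then Microsoft's Windows Media Video, then On2's VP6, with Sorenson Video 3 the primary QuickTime codec. In 2002, Apple debuted MPEG-4 support with QuickTime 6, with H.264 added to QuickTime 7 in 2005. That same year, the first video-capable iPod shipped, also with H.264 (and MPEG-4) support. In 2007, Adobe added H.264 support to Flash, with support in Microsoft Silverlight following soon after.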

The only market that H.264 doesn't dominate is the intermediate codec market, where long GOP formats are not appropriate. Otherwise, almost every other market, from iPods to satellite TV, is primarily driven by the H.264 codec.

Audio Codecs

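Finally, since most video is also captured with audio, the audio component must also be addressed. The most widely used audio format for acquisition and editing is PCM, which stands for pulse-code modulation, and which is usually stored in either WAV or AVI format on Windows, or AIFF or MOV on the Mac. PCM is considered uncompressed, so it may be more properly characterized as a file format rather than a codec. To preserve quality, most intermediate codecs simply pass through the uncompressed audio as delivered by the camcorder.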
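Because PCM is uncompressed, its data rate follows directly from the recording settings: sample rate times bit depth times channel count. The short sketch below works through one case. The 48 kHz, 16-bit, stereo settings are an assumption chosen for illustration, and the function name is hypothetical.

```python
def pcm_data_rate_bps(sample_rate_hz, bit_depth, channels):
    """Uncompressed PCM data rate in bits per second."""
    return sample_rate_hz * bit_depth * channels


# Assumed settings for illustration: 48 kHz, 16-bit, stereo.
rate = pcm_data_rate_bps(48_000, 16, 2)
print(rate)         # 1536000 bits per second
print(rate / 1000)  # 1536.0 kbps
```

At roughly 1.5 Mbps for audio alone, it is easy to see why delivery formats pair their video with a lossy audio codec instead.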

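Most delivery formats have associated lossy audio codecs, like MPEG audio and AC-3 Dolby Digital compression on DVDs. Most early streaming technologies, like RealVideo and Windows Media, had proprietary audio components, so RealAudio accompanied RealVideo files, just as Windows Media Audio accompanied Windows Media Video.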

This dynamic changed most prominently when Adobe paired the VP6 codec with the MP3 audio codec for Flash distribution. The standards-based audio codec for H.264 video is the Advanced Audio Coding (AAC) codec, while WebM pairs the VP8 codec with the open-source Vorbis codec.