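Unit code: 01
Student ID: 070101121
Classification number: TP391
Security level:

Literature Translation

Multi-resolution Motion Estimation and Compensation Based on Adjacent Prediction of Frame Difference in the Wavelet Domain

School (department): 信息工程学院 (School of Information Engineering)
Major: Computer Science and Technology
Student: 周幸幸
Supervisor: 王学春
March 26, 2011

黄河科技学院 Graduation Project (Literature Translation)

Translation of the English article

Multi-resolution Motion Estimation and Compensation Based on Adjacent Prediction of Frame Difference in the Wavelet Domain

Tang Guowei

Abstract: To address the high bit-rate cost of motion vector encoding and the heavy time load of full-search strategies, a multi-resolution motion estimation and compensation algorithm in the wavelet domain, based on adjacent prediction of frame difference, is proposed. Differential motion detection is applied to the image sequence, and a proper threshold is selected to identify the connected regions. The motion regions are then extracted, and motion estimation and compensation are carried out on them. Experimental results show that the algorithm improves the encoding efficiency of motion vectors, reduces the complexity of motion estimation, and gives better reconstructed image quality than the multi-resolution motion estimation (MRME) algorithm at the same bit rate.

Key words: motion estimation; motion compensation; multi-resolution analysis; video coding

1. Introduction

Because of its time-frequency localization properties, wavelet analysis is widely used in image and video coding. For image sequences, motion estimation and motion compensation effectively reduce temporal correlation and improve coding efficiency. However, traditional motion-compensated wavelet coding, which combines motion estimation with intra-frame still-image coding, cannot take full advantage of the inherent multi-resolution characteristics of wavelet decomposition.

In 1992, Yaqin Zhang and S. Zafar proposed the variable-block multi-resolution motion estimation (MRME) algorithm for video compression, which laid the foundation for motion estimation and compensation in the wavelet domain. By using a comparatively small search window and matching block, this method effectively reduces the amount of computation, removes blocking artifacts, and makes it easy to achieve scalable video coding suited to the human visual system and to progressive transmission. The MRME algorithm has two defects, however: the motion vectors are discontinuous, and real object borders do not coincide with block borders. Both increase the high-frequency components of the transform coefficients and lower the coding efficiency of the displaced frame difference (DFD).

Zan Jinwen proposed multi-resolution motion estimation with median filtering, which produces smoother motion fields and better estimation performance. Median filtering, however, has a quite negative effect on the unsmooth motion of high-frequency sub-bands at high resolution. Su [4] gave a theoretical derivation, studied the interpolation of wavelet coefficients in depth, and proposed half-pixel multi-resolution motion compensation, which effectively improves the accuracy of motion estimation. To overcome the shift-variant property of the discrete wavelet transform, Zhang proposed wavelet-domain motion estimation based on 2-channel high-pass filtering and sub-band-adapted central search point prediction. This method has fairly low computational complexity, but the PSNR data show that coder performance is also reduced. Cagnazzo studied the theoretical optimality criterion of wavelet-based video coding and proposed an optimal motion estimation and compensation method, but at the cost of increased complexity.

This paper presents a frame-difference adjacency-prediction MRME algorithm (FDMRME). It applies differential motion detection to the image sequence and extracts the motion regions for motion estimation and compensation. The method reduces the complexity of motion estimation, improves the encoding efficiency of motion vectors, and raises the quality of the reconstructed image relative to MRME at the same bit rate.

2. Motion Detection Based on Frame Difference

(1) Three-frame difference algorithm

Motion detection methods include the optical flow algorithm, the background elimination algorithm, the adjacent-frame algorithm and the three-frame difference algorithm. By computing difference images from three consecutive frames and applying an AND operation to the difference results, the three-frame difference algorithm can quickly detect the motion regions in an image sequence. The detection procedure is shown in Fig. 1.

[Fig. 1 flowchart boxes: LL3 of frames k-1, k and k+1; difference image g1(x,y) and binarization; difference image g2(x,y) and binarization; AND operation; denoising and identification of connected regions; extraction of motion objects]

Fig. 1 Motion object detection procedure using the three-frame difference algorithm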
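The three-frame difference procedure of Fig. 1 can be sketched in a few lines of Python. This is a minimal illustration, not the paper's implementation: the function names, the plain nested-list frame format, the use of `>=` when binarizing, and the fixed thresholds `t1` and `t2` are all assumptions made for the example (the paper selects its thresholds adaptively).

```python
def abs_diff(a, b):
    """Pixel-wise absolute difference of two equally sized grayscale frames."""
    return [[abs(p - q) for p, q in zip(row_a, row_b)] for row_a, row_b in zip(a, b)]


def binarize(img, threshold):
    """Map each pixel to 1 if it reaches the threshold, else 0 (>= is an assumption)."""
    return [[1 if p >= threshold else 0 for p in row] for row in img]


def three_frame_difference(prev, cur, nxt, t1, t2):
    """Binary motion mask for the middle frame of three consecutive frames."""
    g1 = binarize(abs_diff(cur, prev), t1)   # change between frames k-1 and k
    g2 = binarize(abs_diff(nxt, cur), t2)    # change between frames k and k+1
    # A pixel counts as moving only if it changed in both difference images.
    return [[p & q for p, q in zip(r1, r2)] for r1, r2 in zip(g1, g2)]


# A bright pixel moving one column to the right per frame.
prev = [[200, 0, 0, 0]]
cur = [[0, 200, 0, 0]]
nxt = [[0, 0, 200, 0]]
mask = three_frame_difference(prev, cur, nxt, t1=50, t2=50)
print(mask)  # [[0, 1, 0, 0]]
```

The AND step keeps only the position the object occupies in the middle frame: the first difference image also lights up the position the object left, and the second the position it moves to, and neither survives the intersection.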
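g1(x,y) is the motion-variation image of the first two frames, and g2(x,y) is that of the last two frames. Both g1(x,y) and g2(x,y) contain motion information. The two motion-variation images are binarized and combined with an AND operation to obtain the motion objects.

(2) Threshold selection for the difference images

To extract the motion objects, a proper threshold T must be selected for the frame difference images g1(x,y) and g2(x,y) by a gray-feature-based approach; the frame difference images are then binarized with T. Threshold selection consists of 4 steps.

First, among the three frames, divide the first and second frames into 2×2 blocks. Sum the 4 pixels of each block in the second frame to obtain a_i, and sum the 4 pixels of each corresponding block in the first frame to obtain b_i. [equation defining S missing in source] Here m and n are the length and width of the image, and k is the number of 2×2 blocks.

Second, binarize the frame difference image of the second and first frames with the threshold T = 1.1S to obtain a binary image. This step gives a rough threshold, from which a refined threshold can then be obtained.

Third, compute the mean value M of the pixels below the threshold T in the frame difference image. [equation defining M missing in source] Here q is the number of pixels in the frame difference image that are below T. Take M as the threshold of the frame difference image of the second and first frames, and binarize that image.

Fourth, binarize the frame difference image of the third and second frames with the threshold M, then compute the mean value of the pixels below M in that frame difference image. Take this mean value as the threshold of the frame difference image of the third and second frames, and binarize that image.

(3) Identification of the connected regions and extraction of the motion-region coordinates

The AND-operation image is labeled with an object clustering approach to obtain the coordinates of the motion regions. The approach has 2 steps.

First, label the pixels of each object. Scan the AND-operation image from left to right and from top to bottom. Whenever a pixel belonging to an object region (gray level 1) is met, examine its 8 neighbors. If none of them is labeled, assign a new label number (starting from 1 and increasing by 1 each time). Otherwise, label the current pixel with the smallest label number among the 8 neighbors.

Second, cluster each object region. Scan the motion object image line by line from top to bottom (from left to right and then from right to left). Whenever an object pixel is met, examine its 8 neighbors. If the smallest label number among them is smaller than that of the pixel, the pixel takes that smallest label number. When the whole image has been scanned, scan it again from bottom to top in the same way, and repeat until the labels of all object pixels stop changing.

For head-and-shoulder images, some irregular regions may remain after the AND difference image has been processed by the object clustering procedure. Denoise these irregular regions and save them as motion regions of the image.

3. Difference Adjacent-Block Prediction Motion Estimation and Compensation

Step 1: Apply a 3-level wavelet decomposition to the image sequence and run motion detection on the lowest-frequency sub-band LL3 with the three-frame difference method. Extract the motion regions and divide them into 2×2 blocks. Let the motion vector of each block be V3(x,y); the motion vectors of the other 3 sub-bands are also V3(x,y). Define a reliability flag R. If not all pixels in a block share the same state, the block lies on the border of a moving object and is not regarded as reliable for motion estimation.

Step 2: Check the flags of the pixels of the current block in the low-frequency sub-band. If they belong to the still region, then R = 1 and the motion vector is 0, so no estimation is needed. The motion vectors at the corresponding positions in the other sub-bands are also 0.

Step 3: If the current block of LL3 lies entirely in the motion region, then R = 1 and its motion vector is predicted from the motion of the adjacent blocks. The relation between the block at (x,y) and the adjacent blocks is shown in Fig. 2. The value of a1 is 0 or 1: it is 1 if the corresponding block lies in the same region as the current block, and 0 otherwise. Check whether the predicted value exceeds the bound. If it does not, start the search with the predicted value as the center. To promote consistency, the mean absolute difference (MAD) is used as the matching criterion.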
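The search in Step 3, centered on a predicted vector and scored by MAD, can be sketched as follows. This is an illustrative sketch under stated assumptions, not the paper's code: the names `mad` and `search_around_prediction`, the `radius` parameter, the sign convention (the displacement points from the current block to the reference block), and the rule of skipping candidates that leave the frame are all hypothetical choices. How the predicted vector `pred` is computed from the neighbors is not shown, since the prediction formula is not legible in this copy.

```python
def mad(cur, ref, bx, by, dx, dy, size=2):
    """Mean absolute difference between the current block at (bx, by)
    and the reference block displaced by (dx, dy)."""
    total = 0
    for y in range(size):
        for x in range(size):
            total += abs(cur[by + y][bx + x] - ref[by + dy + y][bx + dx + x])
    return total / (size * size)


def search_around_prediction(cur, ref, bx, by, pred=(0, 0), radius=1, size=2):
    """Test every displacement within `radius` of `pred`; return (cost, (dx, dy))
    for the lowest MAD, or None if no candidate fits inside the frame."""
    h, w = len(ref), len(ref[0])
    best = None
    for dy in range(pred[1] - radius, pred[1] + radius + 1):
        for dx in range(pred[0] - radius, pred[0] + radius + 1):
            # Skip candidates whose reference block would leave the frame.
            if bx + dx < 0 or by + dy < 0 or bx + dx + size > w or by + dy + size > h:
                continue
            cost = mad(cur, ref, bx, by, dx, dy, size)
            if best is None or cost < best[0]:
                best = (cost, (dx, dy))
    return best


# A 2x2 bright patch sits at (0, 0) in the reference frame and at (1, 1) now.
ref = [[9, 9, 0, 0], [9, 9, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0]]
cur = [[0, 0, 0, 0], [0, 9, 9, 0], [0, 9, 9, 0], [0, 0, 0, 0]]
result = search_around_prediction(cur, ref, bx=1, by=1, pred=(-1, 0), radius=1)
print(result)  # (0.0, (-1, -1))
```

Starting from a good prediction is what allows a small `radius`: the number of candidates grows with the square of the window size, so a 3×3 window around the prediction costs far less than a full search.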
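[Fig. 2 diagram: the current block and its adjacent blocks numbered 1 to 4]

Fig. 2 Sketch of the current block (R = 1) and its adjacent blocks

Step 4: If the current block contains both motion pixels and still pixels, it can be inferred that the block lies on the border of the motion region. Its motion is less reliable, and it cannot be processed as a reliable block. More information is needed to make the prediction reliable, so the block is processed later. When the first scan is finished, the motion vectors of all reliable blocks have been obtained. The reliable motion vectors of the blocks adjacent to an unreliable block can then be used to make the prediction. The position relation between a current block with R = 0 and its adjacent blocks is shown in Fig. 3.

[Fig. 3 diagram: the current block and its adjacent blocks numbered 1 to 8]

Fig. 3 Sketch of the current block (R = 0) and its adjacent blocks

Step 5: Using the block motion vectors of the LL3 sub-band as reference, estimate the corresponding blocks in the other sub-bands. For the sub-image at level m (m [operator missing in source] 3), the initial value of the motion vector of the block at (x,y) is [expression missing in source]. Repeating Step 3 and Step 4 gives the predicted motion estimate [expression missing in source].

Step 6: For each block (x,y) in each sub-image, the final motion vector [symbol missing in source] can then be obtained.

Consequently, the motion-compensated prediction of any pixel (x,y) in a block depends not only on the motion vector of that block but also on the motion vectors of the adjacent blocks. At a fixed bit rate, more bits can be allocated to the residual information, which improves the quality of the reconstructed image. In addition, because estimation is carried out only on the motion regions, far less time is spent on it: the time is proportional to the motion region rather than to the whole image.

4. Analysis of Experimental Results

The experimental conditions are listed in Table 1.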
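Claire (100 frames) and Miss America were tested separately. Table 2 shows the motion vector bytes (Motion vector, bytes/frame) and the total time consumed by motion estimation (Total time, s) for the MRME/MRMC method and the FDMRME method. By exploiting the orientation selectivity of the sub-bands in the wavelet decomposition, the FDMRME algorithm improves the accuracy of the base motion vectors of the low-frequency sub-band. The error of the motion vectors in each sub-band decreases, which shows that the motion vectors are encoded more efficiently. With FDMRME, the CIF sequence Miss America takes less time than the QCIF sequence Claire (6.92 s versus 8.76 s in Table 2). Motion detection divides each video frame into motion regions and still regions. The encoder performs no motion estimation or compensation on still regions, so the time consumed is determined by the motion regions rather than by the whole image, which improves system efficiency.

Table 1 Experimental conditions

Video sequence   Format   Frame rate (f/s)   Low-frequency sub-block   Search range at lowest resolution
Claire           QCIF     10                 2*2                       8*8
Miss America     CIF      10                 2*2                       10*10

Table 2 Comparison of motion estimation

                 MRME                          FDMRME
Video sequence   Motion vector   Total time    Motion vector   Total time
Claire           345.28          9.86          296.53          8.76
Miss America     298.08          10.97         243.64          6.92

Table 3 shows the coding results of the FDMRME method on the test sequences. PSNR indicates the quality of the reconstructed image, and Motion vector has the same meaning as above. ER (bytes/frame) is the number of bytes used to transmit the error image after motion compensation. TOTAL (bytes/frame) is the sum of the ER bytes and the motion vector bytes. Table 3 indicates that, at a given bit rate, the FDMRME algorithm spends far fewer bits on encoding motion vectors, so more bits can be allocated to the residual after motion compensation to improve the quality of the reconstructed image.

Table 3 Comparison of coding results

                               MRME                    FDMRME
Video sequence                 PSNR    ER     Total    PSNR    ER     Total
Claire (BR/CR: 118/65)         39.12   3698   4024     38.94   3902   4125
Miss America (BR/CR: 125/50)   34.46   3903   4216     35.93   3996   4463

5. Conclusion

Variable-block multi-resolution motion estimation and compensation is an important way to achieve highly efficient wavelet-based video coding. After analyzing the problems of the MRME method, this paper proposes using motion detection to guide the partition of motion vectors and the motion estimation process. This promotes the consistency and accuracy of motion vector estimation, improves coding efficiency, and reduces the complexity of motion estimation.

Source: Tang Guowei. Multi-resolution Motion Estimation and Compensation based on Adjacent Prediction of Frame Difference in Wavelet Domain [J]. 电子学报 (English edition), 2009.5.

Appendix: English original

Multi-resolution Motion Estimation and Compensation based on Adjacent Prediction of Frame Difference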
in Wavelet Domain

Tang Guowei

Abstract: Aiming at the high bit-rate occupation of motion vector encoding and the heavy time load of full-search strategies, a multi-resolution motion estimation and compensation algorithm based on adjacent prediction of frame difference is proposed. Differential motion detection is applied to image sequences, and a proper threshold is adopted to identify the connected region. The motion region is then extracted, and motion estimation and motion compensation are carried out on it. The experimental results show that the encoding efficiency of motion vectors is improved, the complexity of motion estimation is reduced, and the quality of the reconstructed image is better than that of Multi-Resolution Motion Estimation (MRME) at the same bit rate.

Key words: Motion estimation; Motion compensation; Multi-resolution analysis; Video coding

I. Introduction

Because of its excellent time-frequency localization properties, wavelet analysis is widely used in the field of image/video coding. For image sequences, motion estimation and motion compensation can effectively reduce temporal correlation and improve encoding efficiency. But traditional motion-compensated wavelet coding, which combines motion estimation with intra-frame still-image encoding, cannot make full use of the inherent multi-resolution characteristics of wavelet decomposition.

In 1992, Yaqin Zhang and S. Zafar proposed the variable-block Multi-Resolution Motion Estimation (MRME) algorithm for video compression, which laid the foundation for motion estimation and compensation in the wavelet domain. By using a comparatively small search window and matching block, this method effectively reduces the amount of computation, removes blocking artifacts, and makes it easy to achieve scalable video encoding suitable for the human visual system and for progressive transmission. But the MRME algorithm has two defects, discontinuous motion vectors and the mismatch between real object borders and block borders, which increase the high-frequency components in the transform coefficients and lower the encoding efficiency of the Displaced Frame Difference (DFD).

Zan Jinwen proposed multi-resolution motion estimation with median filtering, which produces smoother motion fields and better estimation performance. But median filtering has a quite negative effect on the unsmooth motion of high-frequency sub-bands at high resolution. Y.C. Su gave a theoretical derivation and an in-depth study of the interpolation of wavelet coefficients and proposed half-pixel multi-resolution motion compensation, which effectively improves the accuracy of motion estimation. To overcome the shift-variant property of the discrete wavelet transform, Lei Zhang proposed wavelet-domain motion estimation based on 2-channel high-pass filtering and subband-adapted central search point prediction. This method has rather low computational complexity, but the PSNR data show that the performance of the coder is also reduced. M. Cagnazzo studied the theoretical optimality criterion of wavelet-based video coding and proposed an optimal motion estimation and compensation method, but at the cost of increased complexity.

This paper presents a Frame Difference adjacency prediction MRME algorithm (FDMRME), which applies differential motion detection to the image sequence and extracts the motion region to carry out motion estimation and compensation. This method reduces the complexity of motion estimation, improves the encoding efficiency of motion vectors, and raises the quality of the reconstructed image relative to MRME at the same bit rate.

II. Motion Detection Based On Frame Difference

1. Three frame difference method

Motion detection methods include the optical flow algorithm, the background elimination algorithm, the adjacent frame algorithm and the three frame difference algorithm. By taking difference images of three consecutive frames and applying an AND operation to the difference results, the three frame difference algorithm can quickly detect the motion region in image sequences. The detection procedure is shown in Fig.1.
[Fig.1 flowchart boxes: LL3 of frames k-1, k and k+1; difference image g1(x,y) and binarization; difference image g2(x,y) and binarization; AND operation; denoising and identification of connected region; extraction of motion objects]

Fig.1 The procedure of motion object detection using three frame difference method

g1(x,y) is the motion variation image of the first two frames, and g2(x,y) is that of the later two frames. Motion information is contained in both g1(x,y) and g2(x,y). Binarize the two motion variation images and apply an AND operation to them to obtain the motion objects.

2. Threshold selection of difference images

In order to extract motion objects, it is necessary to select a proper threshold T for the frame difference images g1(x,y) and g2(x,y) by using a gray-feature-based approach, and then to binarize the frame difference images by T. The process of threshold selection consists of 4 steps:

(1) Among the three frame images, divide the first and the second frames into 2×2 blocks. Add the 4 pixels of each block in the second frame to get a sum a_i, and add the 4 pixels of each corresponding block in the first frame to get b_i. [equation defining S missing in source] Here m and n are respectively the length and the width of the image, and k is the number of 2×2 blocks.

(2) Binarize the frame difference image of the second and the first frames by using the threshold T = 1.1S to get the binarized image. In this step a rough threshold is obtained, from which a refined threshold can be derived.

(3) Compute the mean value M of the pixels less than the threshold T in the frame difference image. [equation defining M missing in source] Here q is the number of pixels less than the threshold T in the frame difference image. Take M as the threshold of the frame difference image of the second and the first frames, and then binarize the frame difference image.

(4) Binarize the frame difference image of the third and second frames by the threshold M, and then compute the mean value of the pixels less than M in that frame difference image. Take the mean value as the threshold of the frame difference image of the third and second frames, and then binarize the frame difference image.

3. Identification of the connected region and extraction of the coordinates of motion region

The AND operation image is labeled by an object clustering approach to get the coordinates of the motion region. The approach includes 2 steps:

(1) Label each object pixel. Scan the AND operation image from left to right and from top to bottom. When a pixel belonging to the object region (gray level 1) is met, examine the 8 neighbors of this pixel. If none of them is labeled, assign a new label number (starting from 1 and increasing by 1 each time). Otherwise label the current pixel with the smallest label number of the 8 neighboring pixels.

(2) Cluster each object region. Scan the motion object image from top to bottom line by line (from left to right and then from right to left). When an object pixel is met, examine the 8 neighbors of this pixel. If the smallest label number among them is smaller than that of the object pixel, the pixel takes the smallest label number of the 8 neighboring pixels. When the whole image has been scanned, scan the image once more from bottom to top in the same way, and repeat until all the labels of the object pixels stop changing.

For head-shoulder images there might be some irregular regions after the AND difference image has been processed by the object clustering procedure. Denoise these irregular regions and save them as the motion regions of the image.

III. Difference Adjacent Block Prediction Motion Estimation and Compensation

Step 1 Apply a 3-level wavelet decomposition to the image sequences and conduct motion detection on the lowest-frequency sub-band LL3 by using the three frame difference method. Extract the motion regions and divide them into 2×2 blocks. Let the motion vector of each block be V3(x,y); the motion vectors of the other 3 sub-bands are also V3(x,y). Define a reliability flag R. If not all the pixels in the block share the same state, the block lies on the border of the moving object and is not regarded as reliable for motion estimation.

Step 2 Check the flags of the pixels of the current block in the low-frequency sub-band. If they belong to the still region, then R = 1 and the motion vector is 0, so it does not need to be estimated. The motion vectors of the corresponding position in the other sub-bands are also 0.

Step 3 If the current block of LL3 lies entirely in the motion region, then R = 1. Predict its motion vector from the motion of the adjacent blocks. The relation between the block at (x,y) and the adjacent blocks is shown in Fig.2. The motion estimation values are: [equation missing in source] Here the value of a1 is 0 or 1. If the corresponding block is located in the same region as the current block, then the value is 1. Otherwise it is 0.
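Steps 1 to 3 sort every 2×2 block of the LL3 motion mask into one of three cases before any vector is predicted. The sketch below illustrates that sorting only; it is an assumption-laden example rather than the paper's code. The function names, the string tags, and the nested-list mask format are hypothetical, and the value R = 0 for border blocks is taken from the "R=0 block" wording used with Fig.3.

```python
def classify_block(mask, bx, by, size=2):
    """Return (R, state) for the block whose top-left corner is (bx, by)
    in a binary motion mask (1 = motion pixel, 0 = still pixel)."""
    pixels = [mask[by + y][bx + x] for y in range(size) for x in range(size)]
    if all(p == 0 for p in pixels):
        return 1, "still"    # reliable; motion vector is 0, no search needed
    if all(p == 1 for p in pixels):
        return 1, "motion"   # reliable; vector predicted from adjacent blocks
    return 0, "border"       # mixed states; unreliable, handled in a later pass


def classify_all(mask, size=2):
    """Classify every non-overlapping size x size block of the mask."""
    h, w = len(mask), len(mask[0])
    return [
        [classify_block(mask, bx, by, size) for bx in range(0, w - size + 1, size)]
        for by in range(0, h - size + 1, size)
    ]


# One row of three blocks: all still, all motion, mixed.
mask = [
    [0, 0, 1, 1, 0, 1],
    [0, 0, 1, 1, 1, 1],
]
flags = classify_all(mask)
print(flags)  # [[(1, 'still'), (1, 'motion'), (0, 'border')]]
```

Only the blocks tagged "motion" or "border" ever reach the search stage, which is why the estimation time follows the size of the motion region rather than the size of the frame.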
Determine whether the prediction value exceeds the bound. If it does not, start searching with the prediction value as the center. In order to promote consistency, the Mean Absolute Difference (MAD) is taken as the matching criterion.

[Fig.2 diagram: the current block and its adjacent blocks numbered 1 to 4]

Fig.2 The sketch of current block R=1 and the adjacencies

Step 4 If there are both motion pixels and still pixels in the current block, it can be inferred that the current block is located at the border of the motion region. The motion is less reliable, and the block cannot be processed as a reliable block. In order to promote the reliability of the prediction, more information needs to be obtained, so the block is processed later. When the first scan is finished, the motion vectors of all reliable blocks have been obtained. Then the reliable motion vectors of the blocks adjacent to the unreliable block can be used to make the prediction. The position relation between the current R=0 block and the adjacent blocks is shown in Fig.3. Here a1 has the same meaning as above.

1 2