Inter frame prediction

Inter-frame prediction is a technique that exploits the temporal correlation between consecutive frames to encode video at lower bitrates. The current frame is predicted from past and/or future frames, with each block allowed a displacement described by a motion vector. The image is divided into macroblocks, and for each one the encoder searches the reference frames for the block that best matches it.

Context

H.264, also known as MPEG-4 Part 10 or MPEG-4 AVC, is a video coding standard designed as the successor to the MPEG-2 video compression algorithm. It was developed jointly by the ITU-T and ISO/IEC. Its main goal is to code video with the same quality as prior standards at lower bitrates.

Inter-frame coding enhancements with H.264/MPEG-4 AVC

Inter-frame prediction was already used in prior standards such as MPEG-2. The key features in H.264 that allow higher compression are the following:

Flexible block divisions

1) Luma block partitions of 16x16 (as in MPEG-2), 16x8, 8x16 and 8x8. If the 8x8 mode is used, each 8x8 partition can be split again into blocks as small as 4x4.

The frame to be encoded is divided into macroblocks or sub-macroblocks of the sizes shown in the figure. The prediction of each block is a block of the same size taken from a reference frame at a displaced position, optionally with an offset applied. The prediction can also be a weighted combination of several reference blocks.
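The search for the best-matching block can be illustrated with a minimal sketch. It assumes an exhaustive (full) search and the sum of absolute differences (SAD) as the matching cost; both are illustrative choices, since the standard does not prescribe how an encoder searches. The names `sad`, `full_search` and `search_range` are hypothetical, and the motion vector is taken to point from the current block position to the matched position in the reference frame.

```python
def sad(cur, ref, cx, cy, rx, ry, bw, bh):
    """Sum of absolute differences between a block in cur and one in ref."""
    total = 0
    for j in range(bh):
        for i in range(bw):
            total += abs(cur[cy + j][cx + i] - ref[ry + j][rx + i])
    return total


def full_search(cur, ref, cx, cy, bw, bh, search_range):
    """Return (best_mv, best_sad) for the bw x bh block at (cx, cy) in cur."""
    height, width = len(ref), len(ref[0])
    best_mv, best_cost = (0, 0), None
    for dy in range(-search_range, search_range + 1):
        for dx in range(-search_range, search_range + 1):
            rx, ry = cx + dx, cy + dy
            # Skip candidates that fall outside the reference frame.
            if rx < 0 or ry < 0 or rx + bw > width or ry + bh > height:
                continue
            cost = sad(cur, ref, cx, cy, rx, ry, bw, bh)
            if best_cost is None or cost < best_cost:
                best_mv, best_cost = (dx, dy), cost
    return best_mv, best_cost


# Toy frames: the current frame is the reference shifted right by 2 pixels.
ref = [[(x * 7 + y * 13) % 256 for x in range(16)] for y in range(16)]
cur = [[ref[y][x - 2] if x >= 2 else 0 for x in range(16)] for y in range(16)]

# Search for a 4x4 block; the same function works for any partition size.
mv, cost = full_search(cur, ref, cx=4, cy=4, bw=4, bh=4, search_range=3)
print(mv, cost)  # (-2, 0) 0
```

Because the block size is a parameter, the same routine can be run for each partition shape (16x16, 16x8, 8x16, 8x8, and so on), with the encoder keeping the partitioning that gives the lowest overall cost.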

Quarter pixel accuracy

2) Quarter-pixel accuracy for motion compensation (MPEG-2 allowed half-pixel accuracy at most). Samples between the integer pixel positions of the reference frame are interpolated so that a better match can be found. When the motion vector is an integer, the prediction is read directly from pixels of the reference frame. When a non-integer motion vector is chosen, the prediction is obtained by interpolating in the horizontal and/or vertical direction with a filter.

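Pixels at half-pixel positions are obtained by applying a 6-tap filter:

H = [1 -5 20 20 -5 1] / 32

For instance, for a half-pixel position b lying between C and D in a run of six consecutive integer pixels A, B, C, D, E, F:

b = (A - 5B + 20C + 20D - 5E + F) / 32

The result is rounded and clipped to the valid sample range.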

Pixels at quarter-pixel positions are obtained by simple bilinear interpolation, that is, by averaging the two nearest integer-pixel or half-pixel samples.
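The two interpolation steps can be sketched in one dimension as follows. This is a simplified illustration that assumes 8-bit samples; the function names `half_pel` and `quarter_pel` are hypothetical, and the handling of two-dimensional half-pixel positions in the standard is not shown.

```python
def clip255(v):
    """Clamp a value to the 8-bit sample range."""
    return max(0, min(255, v))


def half_pel(a, b, c, d, e, f):
    """Half-pixel sample between c and d, using taps [1 -5 20 20 -5 1] / 32."""
    raw = a - 5 * b + 20 * c + 20 * d - 5 * e + f
    return clip255((raw + 16) >> 5)  # +16 then >> 5 divides by 32 with rounding


def quarter_pel(p, q):
    """Quarter-pixel sample: rounded average of two neighbouring samples."""
    return (p + q + 1) >> 1


# Six consecutive integer pixels across an edge.
row = [10, 10, 10, 90, 90, 90]
h = half_pel(*row)            # half-pixel sample between row[2] and row[3]
q = quarter_pel(row[2], h)    # quarter-pixel sample between row[2] and h
print(h, q)  # 50 30
```

The taps sum to 32, so a flat region is reproduced unchanged: `half_pel(100, 100, 100, 100, 100, 100)` gives 100.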

Multiple reference picture motion compensation

3) Multiple reference picture motion compensation. The encoder can search for the reference in two different buffers (usually, List 0 holds past frames and List 1 holds future frames). Each buffer can store up to 16 pictures.

The block prediction is either a single block from one reference picture or a weighted average, plus an offset, of blocks from several reference pictures. This improves performance in cases such as pans, cross-fade transitions or objects being uncovered.
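A weighted prediction from two reference blocks can be sketched as below. This is a simplified floating-point illustration, not the integer arithmetic that the standard specifies; the function name `weighted_bipred` and the parameters `w0`, `w1` and `offset` are hypothetical.

```python
def weighted_bipred(block0, block1, w0=0.5, w1=0.5, offset=0):
    """Combine two reference blocks sample by sample: w0*p0 + w1*p1 + offset."""
    out = []
    for row0, row1 in zip(block0, block1):
        out.append([
            # Round to an integer and clamp to the 8-bit sample range.
            max(0, min(255, int(round(w0 * p0 + w1 * p1 + offset))))
            for p0, p1 in zip(row0, row1)
        ])
    return out


# Cross-fade toy example: the current frame is 25% scene A and 75% scene B.
scene_a = [[200, 200], [200, 200]]
scene_b = [[40, 40], [40, 40]]
pred = weighted_bipred(scene_a, scene_b, w0=0.25, w1=0.75)
print(pred)  # [[80, 80], [80, 80]]
```

With the default weights the result is a plain average of the two blocks, which would predict this cross-fade frame poorly (120 instead of 80 per sample). That gap is why explicit weights help during fades.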

Enhanced Direct and Skip macroblocks

4) The Skip and Direct modes are used frequently and allow a lot of compression, mainly in B-pictures. A macroblock for which neither the residual error nor the motion vector is sent is called a Skip or Direct macroblock. Only a flag indicating that it is a Direct macroblock needs to be sent. The decoder infers the motion vector of such a macroblock from the already decoded frames. There are two techniques to do this:

TEMPORAL: (For bi-predictively coded areas) This is also known as Direct Mode.

The motion vector of the co-located macroblock in the frame at t+1 is used to infer the motion of the direct macroblock at t. That motion vector points to an earlier reference frame, one that precedes t.

The decoder therefore infers the motion vectors MV1 and MV2 from MV by applying weights based on the temporal distances between the frames.
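The temporal scaling can be sketched as follows. This is a simplified illustration: the names `mv_col`, `tb` and `td` are hypothetical, the mapping of the two results onto MV1 and MV2 is an assumption, and the standard uses its own fixed-point arithmetic rather than the plain integer division shown here.

```python
def temporal_direct(mv_col, tb, td):
    """Derive forward and backward vectors from the co-located vector.

    mv_col: (x, y) motion vector of the co-located block in the future frame
    tb: temporal distance from the past reference to the current frame
    td: temporal distance from the past reference to the future frame
    """
    # The forward vector is the co-located vector scaled by tb / td.
    mv_fwd = (mv_col[0] * tb // td, mv_col[1] * tb // td)
    # The backward vector covers the remaining part of the motion, reversed.
    mv_bwd = (mv_fwd[0] - mv_col[0], mv_fwd[1] - mv_col[1])
    return mv_fwd, mv_bwd


# The current frame lies halfway between the two references (tb=1, td=2).
mv_fwd, mv_bwd = temporal_direct((8, -4), tb=1, td=2)
print(mv_fwd, mv_bwd)  # (4, -2) (-4, 2)
```

The sketch assumes constant motion between the two reference frames, so the direct macroblock is given half of the co-located displacement in each direction when it lies midway in time.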

SPATIAL: (Skip Mode) The decoder estimates the motion vector from neighboring blocks in the same frame. One simple criterion is to copy the motion vector of a neighboring block. These modes are commonly used in smooth areas where there is not much motion. Prior standards did not allow motion for this kind of macroblock. The enhanced direct macroblock makes it possible to encode constant-motion sequences, such as slow panning, very efficiently. As a drawback, the decoder also has to store the motion vectors in its buffers.
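Besides copying a single neighbor, a common way to combine several neighbors is a component-wise median of their motion vectors. The sketch below assumes three neighbors (left, above and above-right) and uses the hypothetical names `median3` and `median_mv`; the full derivation in H.264 has additional rules, for example for unavailable neighbors, that are not shown.

```python
def median3(a, b, c):
    """Median of three values."""
    return sorted((a, b, c))[1]


def median_mv(left, top, top_right):
    """Component-wise median of three neighbouring motion vectors."""
    return (median3(left[0], top[0], top_right[0]),
            median3(left[1], top[1], top_right[1]))


# Two neighbours agree on a slow pan; the third is an outlier.
mv = median_mv((4, 0), (5, 1), (-20, 0))
print(mv)  # (4, 0)
```

The median discards the outlier vector, which keeps the inferred motion consistent with the dominant motion of the surrounding area.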

Figure: the pink blocks are Direct/Skip macroblocks, which appear very often in B-pictures.

