Tuesday, October 3, 2017

VP9 ELI5 Part 2

Here's the photo split into its 3 YUV components:
The U and V are the color add-on data. Notice how they are 1/4 the size of the B/W image.

Most video formats, VP9 included, depend on both intra and inter frames. An intra frame is a normal picture while an inter frame is just the difference between the current frame and a previous one. Think of it as only saving what moved in the current frame instead of saving the whole picture again.

Obviously we need to start a video with an intra frame, also known as a key frame, so that we can build future inter frames based on it.

Unlike BMP's, we don't save VP9 images pixel-by-pixel in order. Instead, we divide the image on several levels. The first, highest level of division is into 64x64 superblocks.

64x64 is a very large subdivision size. In MP4 (H.264), the largest size available is 16x16. Notice the difference below:

16x16

64x64

The superblocks are processed in raster order. That means left to right, top to bottom. In other words, each horizontal line of superblocks is processed from left to right, in order from top to bottom.

Each superblock is then optionally subdivided further. Here are some possible subdivisions:



Notice what subdivisions are allowed. Shown from left to right, we can split a superblock:
  • Vertically to make 16x32,
  • Horizontally to make 32x16, or
  • Four ways to make 4 32x32 blocks

After that, we can split the 32x32 blocks the same way or we could leave them alone. However, we may not further split one of the non-square sizes. Any non-square split is final in VP9. Also note that 4x4 is technically as small as we can get but things get messy at that size so for this tutorial we will not go smaller than 8x8.

This is just the first level of subdivision. The next level is the transform level and is more interesting.

No comments:

Post a Comment