How does MPEG-1 VIDEO work?
First off, it starts with a relatively low-resolution video sequence (possibly decimated from the original) of about 352 by 240 pixels at 30 frames/s (US; the numbers differ for Europe), but original high (CD) quality audio. The images are in color, but converted to YUV space, and the two chrominance channels (U and V) are decimated further to 176 by 120 pixels. It turns out that you can get away with a lot less resolution in those channels and not notice it, at least in "natural" (not computer-generated) images.

The basic scheme is to predict motion from frame to frame in the temporal direction, and then to use DCTs (discrete cosine transforms) to organize the redundancy in the spatial directions. The DCTs are done on 8x8 blocks, and the motion prediction is done in the luminance (Y) channel on 16x16 blocks. In other words, given the 16x16 block in the current frame that you are trying to code, you look for a close match to that block in a previous or future frame (there are backward
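To make the "look for a close match" step concrete, here is a minimal sketch in C of a brute-force block search over the luminance plane: for one 16x16 block in the current frame it scans a small window in a reference frame and keeps the offset with the lowest sum of absolute differences. The frame size, the +/-15 search range, and the exhaustive search itself are illustrative assumptions, not anything the MPEG-1 standard requires; real encoders use faster search strategies.

    /* Sketch only: full-search motion estimation for one 16x16 luma block. */
    #include <limits.h>
    #include <stdlib.h>

    #define WIDTH        352
    #define HEIGHT       240
    #define BLOCK        16
    #define SEARCH_RANGE 15   /* assumed search window of +/-15 pixels */

    /* Sum of absolute differences between the 16x16 block at (cx,cy) in the
     * current frame and the block at (rx,ry) in the reference frame. */
    static unsigned sad_16x16(const unsigned char *cur, const unsigned char *ref,
                              int cx, int cy, int rx, int ry)
    {
        unsigned sad = 0;
        for (int y = 0; y < BLOCK; y++)
            for (int x = 0; x < BLOCK; x++)
                sad += abs(cur[(cy + y) * WIDTH + cx + x] -
                           ref[(ry + y) * WIDTH + rx + x]);
        return sad;
    }

    /* Find the offset (motion vector) into the reference frame that best
     * matches the block at (cx,cy); returns the SAD of that best match. */
    static unsigned find_motion_vector(const unsigned char *cur,
                                       const unsigned char *ref,
                                       int cx, int cy, int *mvx, int *mvy)
    {
        unsigned best = UINT_MAX;
        *mvx = *mvy = 0;

        for (int dy = -SEARCH_RANGE; dy <= SEARCH_RANGE; dy++) {
            for (int dx = -SEARCH_RANGE; dx <= SEARCH_RANGE; dx++) {
                int rx = cx + dx, ry = cy + dy;
                /* skip candidates that fall outside the reference frame */
                if (rx < 0 || ry < 0 || rx + BLOCK > WIDTH || ry + BLOCK > HEIGHT)
                    continue;
                unsigned s = sad_16x16(cur, ref, cx, cy, rx, ry);
                if (s < best) {
                    best = s;
                    *mvx = dx;
                    *mvy = dy;
                }
            }
        }
        return best;
    }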