In video coding, understanding the real bitrate requirements for a desired quality level is key. Quality can be assessed with objective metrics such as PSNR and the SSIM Index, or with subjective methods such as the Mean Opinion Score (MOS). Objective metrics are good if the aim is to evaluate video coding performance without the influence of human perception; but that is also their weak side: without human perception, objective metrics fail to give meaning to their results (numbers are merely numbers). So subjective evaluation procedures, which simply consist of averaging the scores given by human observers, are fundamental for relating perceived video quality to objective evaluations.
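Since MOS is just an average of observer scores, it can be sketched in a few lines of Python. This is only an illustration: the 1-to-5 rating scale and the sample ratings are assumptions made up for the example, not data from my test.

```python
from statistics import mean


def mean_opinion_score(scores):
    """Return the MOS: the arithmetic mean of the observers' ratings."""
    if not scores:
        raise ValueError("at least one observer score is required")
    return mean(scores)


# Hypothetical ratings for one encoded clip (assumed 1-5 scale, 5 = best).
example_ratings = [4, 5, 3, 4, 4]
mos = mean_opinion_score(example_ratings)
print(f"MOS = {mos:.2f}")
```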
To evaluate the performance of the H.264 encoder in Microsoft Expression Encoder 3, I decided to set up the following pipeline:
- Choice of the test clips: I decided to use more than one clip, in order to understand the effects of motion and shot changes on video coding performance; so I chose the touchdown_pass and elephants_dream video sequences, freely available on Xiph.org;
- Choice of encoding parameters and video formats: for each file, three image formats were chosen (480p, 720p, 1080p, all set to 24 frames per second), producing 16 outputs at different average bitrates for each of them; H.264@Main Profile was used, with a 5-second key frame interval (the time interval between two I-frames), 4 reference frames and 1 B-frame for each GOP, CAVLC entropy coding, 8×8 motion estimation blocks and an in-loop deblocking filter;
- Choice of evaluation metrics: PSNR and SSIM Index for objective assessment, MOS for subjective score (15 human observers);
- Conclusions: (rough) bitrate requirements related to output quality.
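To make the setup above a bit more concrete, here is a rough Python sketch of the arithmetic behind it: how many frames fall between two key frames, how many encodes the test matrix implies, and how PSNR is computed for 8-bit samples. The function and variable names are mine, the total assumes 16 outputs per format per clip, and the PSNR formula is the standard textbook one, not necessarily what a given measurement tool does internally.

```python
import math

FPS = 24                  # frame rate used for all formats
KEYFRAME_INTERVAL_S = 5   # seconds between two I-frames

CLIPS = ["touchdown_pass", "elephants_dream"]
FORMATS = ["480p", "720p", "1080p"]
OUTPUTS_PER_FORMAT = 16   # assumed: 16 bitrates per format, per clip

# Frames between consecutive key frames, and total number of encodes.
frames_per_keyframe_interval = FPS * KEYFRAME_INTERVAL_S
total_encodes = len(CLIPS) * len(FORMATS) * OUTPUTS_PER_FORMAT


def psnr(reference, distorted, peak=255):
    """PSNR in dB between two equally sized sequences of sample values.

    `peak` is the maximum sample value (255 for 8-bit video).
    """
    if len(reference) != len(distorted) or not reference:
        raise ValueError("inputs must be non-empty and of equal length")
    # Mean squared error over all samples.
    mse = sum((r - d) ** 2 for r, d in zip(reference, distorted)) / len(reference)
    if mse == 0:
        return math.inf  # identical inputs: PSNR is unbounded
    return 10 * math.log10(peak ** 2 / mse)


print(frames_per_keyframe_interval, "frames per key frame interval")
print(total_encodes, "encodes in total")
```

In practice `psnr` would be fed the luma samples of a source frame and the matching decoded frame, then averaged over the clip.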
And finally, here are the results:
Hope this could be useful to someone besides me! ;D