The 4k codec wars: vp9 vs x265

The contenders [vp9 vs x265]

I’ve been wanting for a while to get an idea of how vp9 actually performs versus x265. While x265 is still a work in progress, I thought it would be worthwhile to run some tests with CODECbench to get a feel for where we stand.

Setup

At the time of this writing x265 and particularly vp9 are quite slow. It took me more than a week of encoding time to produce the results of this test. There were 432 runs to produce the data. Each sequence was run at 32 different bitrates to produce accurate rate distortion plots. The goal of the test was to get a feeling for the performance of the two test subjects, vp9 and x265. Getting very detailed performance metrics for every setting of these two encoders was beyond the scope of the test. There are a few other caveats you can read in the Observations section below.

Tools

The following tools were used for this test:

  • Our venerable CODECbench
  • The standard libvpx and x265 codecpacks
  • A late 2013 13” MacBook Pro

CODECbench produces a signature of the codecpacks used with version information that is shown below.

libvpx version:

"version": "vp8: v1.3.0-2993-gcf83983, vp9: v1.3.0-2993-gcf83983",

x265 version:

"version_long": "x265 [info]: HEVC encoder version 1.1+200-09450ac6dc7d
x265 [info]: build info [Mac OS X][clang 5.1.0][64 bit] 8bpp
x265 [info]: using cpu capabilities: MMX2 SSE2Fast SSSE3 SSE4.2 AVX AVX2 FMA3 LZCNT BMI2
"

The sequence set

I used the 4k versions of the sequence set we use here at codecbench.

You can learn more about the sequences and their quirks here.

12 days later, the results

It took a total of 290 hours, about 12 days, to produce the following results. That works out to about 40 minutes per run.
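As a quick sanity check on those numbers, the arithmetic can be reproduced in a few lines. This only uses the figures quoted in this post (290 hours, 432 runs); nothing here comes from CODECbench itself.

```python
# Figures quoted in the post.
total_hours = 290
runs = 432

days = total_hours / 24
minutes_per_run = total_hours * 60 / runs

print(f"{days:.1f} days, {minutes_per_run:.1f} minutes per run")
# prints: 12.1 days, 40.3 minutes per run
```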

SSIM, PSNR results

When looking at metrics I like to look at SSIM and PSNR simultaneously. SSIM is great for sequence-independent evaluation. SSIM values above 0.97 usually look pretty good, and above 0.99 it becomes almost hard to tell that the content was compressed. With PSNR you have to know which values are good and bad for that particular scene, but it’s a good exercise to compare the two when they diverge.
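For readers who have not computed PSNR by hand, here is a minimal sketch of the standard definition, 10·log10(MAX² / MSE), for 8-bit samples. The function name and the sample values are made up for illustration, and this is not how CODECbench computes its numbers; it only shows why PSNR depends on the absolute pixel error and therefore on the scene.

```python
import math

def psnr(reference, distorted, max_value=255):
    """PSNR in dB between two equally sized sequences of sample values."""
    if not reference or len(reference) != len(distorted):
        raise ValueError("inputs must be non-empty and of equal length")
    # Mean squared error over all samples.
    mse = sum((r - d) ** 2 for r, d in zip(reference, distorted)) / len(reference)
    if mse == 0:
        return math.inf  # identical inputs, no distortion
    return 10 * math.log10(max_value ** 2 / mse)

# Toy 8-bit luma samples (made-up values, not taken from the test sequences).
ref = [52, 55, 61, 66, 70, 61, 64, 73]
dec = [53, 55, 60, 66, 71, 61, 63, 73]
print(f"{psnr(ref, dec):.2f} dB")
```

In practice the same formula is applied per frame over the full picture and then averaged over the sequence.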

Full results

All the RD plots are available here:

 

My take

While you can take a look at all the data above, the globalCABScore results using SSIM were the following:

cabs4k_ssim_cabsscore

Globally, for the selected configurations (x265 at the ‘medium’ preset and vp9 at a cpu setting of ‘1’), vp9 averaged about 7% in bitrate savings vs x265.
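To make "bitrate savings" concrete: the idea is to compare how many bits each encoder needs to reach the same quality, averaged over a range of qualities. The sketch below is not the CABS score implementation. It is a simplified calculation in the spirit of a Bjøntegaard delta rate, using piecewise-linear interpolation of log-bitrate over the quality range both curves share. The function names and the toy RD points are assumptions made for illustration.

```python
import math

def interp_log_rate(points, quality):
    """Linearly interpolate log(bitrate) at a given quality.

    points: list of (bitrate_kbps, quality), sorted by strictly increasing quality.
    """
    for (r0, q0), (r1, q1) in zip(points, points[1:]):
        if q0 <= quality <= q1:
            t = (quality - q0) / (q1 - q0)
            return math.log(r0) + t * (math.log(r1) - math.log(r0))
    raise ValueError("quality outside the measured range")

def average_bitrate_savings(test, anchor, samples=100):
    """Average bitrate saved by `test` relative to `anchor` at equal quality.

    Returns a percentage; positive means `test` needs fewer bits.
    """
    test = sorted(test, key=lambda p: p[1])
    anchor = sorted(anchor, key=lambda p: p[1])
    # Only compare where both curves were actually measured.
    lo = max(test[0][1], anchor[0][1])
    hi = min(test[-1][1], anchor[-1][1])
    if lo >= hi:
        raise ValueError("curves do not overlap in quality")
    diffs = []
    for i in range(samples + 1):
        # Clamp to guard against floating-point overshoot at the top end.
        q = min(hi, lo + (hi - lo) * i / samples)
        diffs.append(interp_log_rate(test, q) - interp_log_rate(anchor, q))
    mean_diff = sum(diffs) / len(diffs)
    return (1 - math.exp(mean_diff)) * 100

# Toy RD points as (kbps, SSIM); made-up values, not results from this test.
curve_a = [(2000, 0.950), (4000, 0.965), (8000, 0.975)]
curve_b = [(0.9 * rate, ssim) for rate, ssim in curve_a]  # 10% cheaper everywhere
print(f"{average_bitrate_savings(curve_b, curve_a):.1f}% savings")
```

Working in log-bitrate means a constant percentage difference counts the same at 2000 kbps as at 8000 kbps, which is why RD comparisons are usually done this way.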

Coastguard4k

Interestingly, in ‘coastguard4k’ vp9 performed worse. Let’s look at the SSIM RD curve for that particular sequence:

coastguard4k_cabs4k_ssim

In this particular instance x265 had a 17% bitrate saving in the CABS coverage area between 2000 and 8000 kbps. This is pretty good if you consider that ‘coastguard4k’ is a pretty difficult scene to encode. What does the PSNR curve tell us?

png-1

A similar story, although here x265 leads by only 6%.

Parkjoy4k25fps

Parkjoy is a very difficult scene to encode. Just look at the SSIM RD curve:

parkjoy_ssim

Even at 10 Mbit/s you are still only getting SSIM values of 0.96, and the quality gain grows quite slowly as you increase the bitrate. Both x265 and vp9 struggle with this scene, although there is a slight lead for vp9 with these settings. PSNR is not much different:

parkjoy_psnr

Overall impression

While one could argue that vp9 did better in this test, my overall impression is actually that both codecs offer similar compression ratios. x265 was set at the ‘medium’ preset while vp9 was set with ‘cpu_used=1’ which is quite slow too. Maybe a different combination would have changed things a bit.

 

Observations

Encoding time

The goal of the test was to get an idea of the compression performance of vp9/x265. Encoding time vs compression performance was not considered, mainly because these are software codecs that were run without taking into consideration software optimizations to make them run faster.

Quality settings used

‘cpu_used=1’ was used in libvpx and the ‘medium’ preset in x265. These settings seem to skew the encoders towards using the full compression toolset, somewhat disregarding encoding time. While I tried to use ‘cpu_used=0’ in libvpx and ‘placebo’ in x265, the encoding times were so brutally slow that it became unrealistic to carry out the test with the available hardware. I hope to produce a follow-up to see how much difference there is between the different x265 presets, and likewise for vp9, to see if I was really missing much compression gain.
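For anyone who wants to try similar settings by hand, the sketch below builds roughly equivalent command lines for the stock `vpxenc` and `x265` command-line tools. This is an assumption about how such a run could look, not the exact invocation the codecpacks use: the rate-control mode, number of passes and any other options CODECbench sets are not shown here, and the file names are hypothetical.

```python
import shlex

def vp9_command(source, output, bitrate_kbps, cpu_used=1):
    """vpxenc arguments for a VP9 encode at a target bitrate (kbps)."""
    return [
        "vpxenc", "--codec=vp9",
        f"--cpu-used={cpu_used}",
        f"--target-bitrate={bitrate_kbps}",
        "-o", output, source,
    ]

def x265_command(source, output, bitrate_kbps, preset="medium"):
    """x265 arguments for an HEVC encode at a target bitrate (kbps)."""
    return [
        "x265", "--preset", preset,
        "--bitrate", str(bitrate_kbps),
        "--input", source, "--output", output,
    ]

# Hypothetical file names; print the commands instead of running them.
print(shlex.join(vp9_command("coastguard4k.y4m", "out.webm", 4000)))
print(shlex.join(x265_command("coastguard4k.y4m", "out.hevc", 4000)))
```

Switching `cpu_used` to 0 or `preset` to "placebo" gives the slower configurations mentioned above.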

Comments?

Feel free to leave comments below or send me suggestions/ideas privately.