HEVC, AV1, VVC and Past – Chips and Cheese
This text was written in March 2023. On the time of writing, the encoders we used have been nonetheless in energetic growth. As codec requirements and encoders evolve, a few of the info could change into outdated.
A Turning Level for Codecs
For the longest time, MPEG LA’s Superior Video Coding (AVC), or H.264, has been everybody’s go-to video codec. Because of Cisco’s OpenH264, software program tasks not have to fret about patents as Cisco foots the invoice for royalties. Nonetheless, AVC is displaying its age after 20 years, because it struggles to deal with resolutions past 1080p, whereas sustaining low bitrates.
AVC’s official successor – Excessive Effectivity Video Coding (HEVC), or H.265 – shouldn’t be having fun with the identical stage of recognition. Video platforms like YouTube refuse to undertake HEVC as a consequence of licensing charges. Alternatively, open-source software program tasks, akin to Mozilla Firefox, intentionally select to not implement HEVC to keep away from potential authorized bother. Though HEVC was revealed virtually a decade in the past, AVC nonetheless dominates the codec market to this present day.
In 2015, an business consortium led by Google – the Alliance for Open Media (AOMedia) – determined to develop their very own open, royalty-free different to HEVC – AOMedia Video 1 (AV1). AV1 was initially deliberate to launch in 2017, however the world of codecs strikes slowly, and mainstream cellular platforms nonetheless lack {hardware} decoding capabilities as we converse.
With its adoption inside grasp, the hype for AV1 has by no means been increased. Some anticipate AV1 to comply with AVC’s path, however its destiny will not be as clear-cut. Again in 2015, HEVC was AV1’s competitor, however HEVC’s successor – Versatile Video Coding (VVC), or H.266 – was launched in 2020. In the meantime, MPEG LA is growing two different codecs in parallel: Important Video Coding (EVC) and Low Complexity Enhancement Video Coding (LCEVC). A subset of EVC – Sort 1 – solely makes use of patents throughout the public area and thus is royalty-free. LCEVC, then again, shouldn’t be a standalone codec however an modern enhancement layer. LCEVC enhances a base codec by compressing misplaced particulars and compensating for compression artifacts. LCEVC has the potential to re-vitalize older codecs, akin to AVC, and produce their relevance again into the 4K enviornment.
We’re at a turning level for codecs, and it begs the query: born in between MPEG LA’s two generations of codecs, how does AV1 carry out? It is a large matter, so let’s slim it down a bit. There are already many good writings on how codecs work, and I’m not likely certified to touch upon the authorized facet with out an LL.B. Due to this fact, we’re solely looking on the technical efficiency of HEVC, AV1 and VVC, specifically compression effectivity and computational complexity. Each EVC and LCEVC are presently in heavy growth, and real-world purposes are tough to search out. So we’ll exclude them in immediately’s dialogue. I hope you aren’t already misplaced from too many acronyms (TMA), and I promise you it’s not getting significantly better from right here on.
An Opinionated Methodology
Codec testing is a minefield. Every codec has many encoders, and every encoder has dozens of parameters with dialectical relationships with each pace and high quality. It’s definitely perfect to cowl the widespread configurations, however testing combos of too many variables shortly turns into convoluted.
Alternatively, visible high quality is perceptual and even as much as particular person style. For instance, some could think about motion pictures with out movie grain clearer and thus increased high quality, whereas others argue staying true to movie – one hallmark of visible high quality – means retaining the grain. Due to this fact, the easiest way to evaluate visible high quality is to conduct in-person subjective assessments the place actual human judges by their notion. Nonetheless, we are able to’t afford such a large challenge, and algorithmic metrics include its personal set of pitfalls.
Due to this fact, the testing methodology is (extremely) opinionated, as now we have made numerous selections based mostly on what we predict is finest. Each time I learn an article on codec testing, I virtually at all times get annoyed by questionable selections left and proper, and depart with dozens of questions left unanswered. Let me prevent from the identical frustration by strolling by way of the selections we’ve made. You might not agree with our methodology (particularly if you happen to hang around on Doom9), and you might have good causes for doing so, however not less than you should have a proof of why we’ve carried out what we’ve carried out.
Software program Choice
For HEVC, we use the preferred open-source implantation x265 3.5+95
. For AV1, now we have picked SVT-AV1 1.4.1
as our encoder because it strikes a stability between high quality and encoding pace, however VideoLAN’s dav1d 1.1.0
is the decoder of our selection because it’s a quicker implementation. For VVC, we use Fraunhofer’s Versatile Video Encoder VVenC 1.7.0
, an intensive optimized encoder based mostly on the reference design VTM
, and its sibling decoder VVdeC 1.6.1
.
We additionally useFFmpeg
to help with our testing. VVC shouldn’t be carried out by FFmpeg
for the time being, so we use the VVC v6 patch to supply such help. Each encoding and decoding are ran on an 8-core, 16-thread KVM on the second CCD of an AMD Ryzen 7950X.
Supply Materials
We use Blender’s third “open movie” Sintel as our supply materials. The precise copy we used is an AVC-encoded 4K version available for download here. As talked about, visible high quality is in the end as much as particular person notion. Sintel is licensed beneath the Creative Commons Attribution 3.0 licence, which permits us to share the encoded end result so that you can decide your self.
Software program encoding could be painstakingly gradual, and now we have solely examined this one clip because it already took weeks to course of. To make the issue worse, no comparability can characterize all use circumstances. Please observe the limitation that in case your supply materials is drastically totally different from a typical movie akin to Sintel, you would possibly produce totally different outcomes from ours.
Codec Utilization
Now we have determined to check the aforementioned codecs for archival functions. As no mainstream system can software program encode any of the three codecs on the fly, it’s tough to check them for streaming functions. Maybe we are able to give you a follow-up article to dive into widespread GPU {hardware} encoder implementations. Tell us if you happen to’re .
Video archiving results in two main assumptions: one, archival typically goals for the “finest” subjective high quality on the smallest file measurement; two, encoding time doesn’t matter. These two assumptions will later assist us determine our encoding parameters.
Encoder Tuning
Fee Management
For x265
and SVT-AV1
, we use Fixed Fee Issue (CRF) for price management. Briefly, CRF is perfect for archival because it ensures constant high quality for a given file. We begin our testing on the default CRF whereas controlling the typical VMAF rating between 91 and 96.
VVenC
continues to be in heavy growth and doesn’t presently help CRF. Due to this fact, we use the subsequent best choice – two-pass variable bitrate (VBR) – for VVC. On this case, we begin encoding at 1000 kbps, and increas bitrates by 500 kbps till we attain a median VMAF of 96. In the course of the first cross, the encoder dynamically allocates bitrates based mostly on scene complexity. Then the encoder makes use of the collected info to encode the video for the second cross.
Presets
Every encoder has dozens of parameters, and totally different supply supplies can profit from totally different tunings to realize optimum compression effectivity and high quality. Nonetheless, since we’re evaluating codecs somewhat than encoder settings, we assume the encoder presets are sane and have determined to make use of the presets. Since encoding time doesn’t matter, we use the slowest preset for every encoder to realize finest compression effectivity.
For SVT-AV1
, the slowest preset is 0; for VVenC
, the slowest is “slower.” The slowest preset for x265
is “placebo,” however we use “veryslow” as an alternative. Because the title suggests, the enhancements from “veryslow” to “placebo” is negligible, however the encoding time is so much longer as “placebo” has reached previous the purpose of diminishing returns.
Past presets, we select 10-bit color outputs because it typically improves compression effectivity and reduces artifacts on the expense of encoding time. x265
disables AVX-512 help by default as a consequence of clock pace regressions on older Intel shopper methods. Since we’re utilizing an AMD system, we manually allow AVX-512 to hurry up encoding.
You’ll find the FFmpeg
instructions as follows:
# HEVC
ffmpeg -i reference.mkv -map 0:v -c:v libx265 -crf $CRF -preset veryslow -pix_fmt yuv420p10le -x265-params asm=avx512 -map 0:a -c:a duplicate hevc-$CRF.mkv
# AV1
ffmpeg -i reference.mkv -map 0:v -c:v libsvtav1 -crf $CRF -preset 0 -pix_fmt yuv420p10le -map 0:a -c:a duplicate av1-$CRF.mkv
# VVC
ffmpeg -i reference.mkv -map 0:v -c:v libvvenc -b:v $bitrate -preset 4 -vvenc-params passes=2:cross=1:rcstatsfile=vvc-$bitrate-stats.json -f null /dev/null
ffmpeg -i reference.mkv -map 0:v -c:v libvvenc -b:v $bitrate -preset 4 -vvenc-params passes=2:cross=2:rcstatsfile=vvc-$bitrate-stats.json -map 0:a -c:a duplicate vvc-$bitrate.mkv
Visible High quality
We use Netflix’s Emmy-winning Video Multimethod Evaluation Fusion (VMAF) to evaluate visible high quality, because it’s one of the best obtainable device for the time being. VMAF makes use of machine studying to foretell subjective video high quality, and we use the vmaf_4k_v0.6.1
mannequin particularly skilled for 4K. As mentioned, visible high quality is subjective. However for the sake of quantitive evaluation, we goal at a median VMAF rating of 95 for our last encode because the “finest” high quality.
Nonetheless, VMAF shouldn’t be excellent, and solely counting on algorithms is problematic, as machines in the end understand visuals in another way from people. Due to this fact, we will even take a more in-depth take a look at the ultimate encodes, and examine our findings with VMAF scores.
Outcome
Compression Effectivity
The VMAF scores present a transparent benefit for VVC, delivering the very best high quality at any given bitrate. HEVC and AV1 traded blows, with AV1 performing higher at decrease high quality between VMAF 91 to 94, and HEVC performing higher at increased high quality between VMAF 94 to 96. However on the goal VMAF 95, HEVC’s delivered an additional 5.6% bitrate financial savings than AV1.
Computational Complexity
Encoding Time | Decoding Time | |
HEVC | 2 Hours 34 Minutes | 1 Minute 14 Seconds |
AV1 | 18 Hours 54 Minutes | 1 Minute 42 Seconds |
VVC | 2 Days 10 Hours 35 Minutes* | 2 Minutes 42 Seconds |
Right here, we expectedly see a geometrical improve in computational complexity in newer codecs. Though we are saying encoding time doesn’t matter, AV1 is 7.5x slower than HEVC, and VVC took literal days to encode a 14-minute clip utilizing the slowest preset. The massive discrepancy will in the end have an effect on selections on codec choice and encoder tunings in real-life purposes.
We put an asterisk behind VVC’s encoding time as a result of now we have discovered VVenC
struggles to scale past 8 threads regardless of working on an 16-thread machine. I’m assured that future optimizations can drastically enhance encoding time on excessive core-count machines.
The decoding pace is affordable for all 3 codecs, however the checks have been ran on an 8-core desktop system. Within the age of cellular gadget proliferation, hardware-accelerated decoding is critically vital for a codec’s adoption. {Hardware} acceleration won’t solely assist cellular units preserve battery life, but additionally guarantee good playback efficiency on TVs and set-up containers utilizing related SOCs.
Visible High quality at VMAF 95
In response to VMAF, newer codecs produce much less constant high quality throughout frames. Essentially the most constant high quality got here from x265
with the bottom 1% rating of 84.67. In the meantime, the 1% low for AV1 is 81.28, and 79.26 for VVC.
Wait, what?
A Nearer Look: Adaptive Quantization
VMAF is suggesting newer codecs have higher compression effectivity, however visible high quality is much less constant throughout frames. Nonetheless, we have been utilizing CRF and 2-pass VBR for price management, which ought to give us constant high quality all through a video. What’s taking place right here?
If you happen to dig by way of the manuals, you’ll see all three encoders have “benchmark modes,” which disable psycho-visual optimizations in trade of upper algorithmic metrics scores, together with VMAF. We intentionally select to not use benchmark modes as a result of we’ve determined to purpose for the “finest” subjective high quality, somewhat than replicating precise copies of each body. This additionally provides all three encoders an equal start line as we’re not during which encoder is finest optimized for benchmarking.
However what are the psycho-visual optimizations? Why does encoders and VMAF interpret visible high quality so in another way? Allow us to take a more in-depth take a look at a selected body then. In response to VMAF, body 11506 is VVC’s worst body, which additionally occurs to be HEVC’s ninth worst body. How does it look?
We see drastic variations between three codecs in body 11506. HEVC retained most particulars, and VVC is outright blurry in some areas. The screenshots do corroborate VMAF’s frame-by-frame evaluation, nevertheless it raises one other query. Movies are movement photos. This body comes from a quick, panning transition scene, and it’s solely proven for 1/24 of a second. How does the scene appear like when it’s taking part in?
With the quick digital camera motion, the distinction in visible high quality is arguably negligible throughout all codecs regardless of the degradations in particular person frames. I wish to level out that we – together with you, my pricey readers – are observing and scrutinizing the tree, however the common viewer probably isn’t.
Earlier, we talked about that it’s problematic to solely depend on VMAF, which penalizes psycho-visual optimizations. One in every of these optimizations is Adaptive Quantization (AQ). When watching movement photos, the human eye is extra delicate to sure areas of a body. For instance, we’re extra drawn to flat and low-texture areas – often foreground – than the background high-texture areas. We’re additionally extra drawn to high-motion areas than low-motion areas. In Sintel‘s body 11506, the tree shouldn’t be solely the high-texture background, but additionally strikes shortly to offer technique to the protagonist within the foreground. AQ redistributes extra bits to the place the human eye is drawn to, and due to this fact permits encoders to differ compression inside a body to enhance topic visible high quality on the expense of background particulars.
This is the reason VMAF exhibits much less constant high quality in newer codecs. Though VMAF is a state-of-the-art machine studying algorithm, it in the end seems to be at movement photos in another way from human. We watch movies, however VMAF watches frames.
Outcome 2: Electrical Boogaloo
Earlier, we in contrast VMAF outcomes between HEVC, AV1 and VVC. Though VMAF is an helpful metric, we now comprehend it doesn’t precisely characterize perceptual visible high quality. If we have been to right for VMAF’s biases, on the identical common rating, newer codecs would produce increased perceptual high quality general as a consequence of enhancements in Adaptive Quantization. Due to this fact, AV1 could be typically on par with, and generally surpasses HEVC when it comes to compression effectivity. VVC would nonetheless be one of the best codec among the many three.
There are additionally some real-life implications from our outcomes. If you wish to retain as a lot particulars as doable, then you definately would possibly wish to use an older codec, akin to AVC, and switch off psycho-visual optimizations. For residence labs, HEVC might be one of the best codec of the time since royalty isn’t related for private use. Though the compression effectivity isn’t one of the best, HEVC enjoys considerably quicker encoding pace and widespread {hardware} decoding help. VVC theoretically has one of the best compression effectivity, however on the time of the writing, we couldn’t discover any video participant that helps it out of the field, not to mention any {hardware} acceleration. VVC continues to be in heavy growth, nevertheless it seems to be good for the long run.
Born in between MPEG LA’s two generations of codecs, AV1’s greatest benefit is that it’s royalty-free, which makes it enticing for industrial utilization, however every thing is nearly at all times extra sophisticated than it appears. Sisvel, a patent licensing outfit, has already began a patent pool for AV1. Though AOMedia revealed an announcement to determine a authorized defence program in opposition to patent claims, nobody can predict what the long run holds.
As beforehand talked about, I don’t really feel certified to make authorized commentary as I don’t have a legislation diploma, however I’m allowed to complain. Though HEVC is on par with AV1 when it comes to compression effectivity, AV1 is a lot extra computationally advanced, and thus vitality inefficient, as a result of AOMedia actively avoids patents that they don’t possess. VVC faces the identical patent downside as HEVC, and even AV1’s royalty-free standing wants an asterisks on the finish.
It’s really a disgrace. Within the area of multimedia, virtually all essential strategies and codecs are lined by broad and trivial patents. If HEVC isn’t encumbered by patents and royalties, we’d have been within the HEVC-era by now with vital bandwidth and storage financial savings. Beneficial engineering assets could be put into enhancing present codecs for everybody, as an alternative of painstakingly creating new ones that may in the end even be embroiled in patent disputes. As a fan of free, libre software program, I really feel strongly that patents shouldn’t be exploited to impede progress and innovation within the pursuit of revenue. The longer term seems to be brilliant after we take a look at the know-how itself, however the future can also be dishearteningly sophisticated.
Oh, earlier than I neglect, listed below are the promised downloads: HEVC | AV1 | VVC | VMAF results.
If you happen to like our articles and journalism, and also you wish to help us in our endeavors, then think about heading over to our Patreon or our PayPal if you wish to toss a couple of dollars our manner. If you want to speak with the Chips and Cheese workers and the individuals behind the scenes, then think about becoming a member of our Discord.