Who Has the Best Hardware AV1 Encoder?

2023-04-16 00:26:56

Apr 15, 2023

projects

av1

Please note that some images may not load on this page unless your browser supports JPEG-XL.

Why?

Nobody really knows which hardware accelerated AV1 encoder is best right now.

There are other resources available, but they don't use good metrics. Everyone's favorite bad, outdated, easily tricked metric is VMAF, which was good for a time but simply isn't good anymore. You can look at this Twitter thread to see what I'm talking about; these aren't edge cases either. You can produce a "better quality" video according to VMAF using Contrast Adaptive Sharpening filters, which inherently don't improve video fidelity in any way.

Other metrics are PSNR, SSIM, MS-SSIM, SSIMULACRA, Butteraugli, & SSIMULACRA2. PSNR doesn't accurately represent our visual system as humans, SSIM was good for its time but is now old and also gives inconsistent/misleading results, & MS-SSIM does better but also doesn't correlate as well with our visual systems as newer metrics. The distance-based Butteraugli metric isn't bad, but SSIMULACRA2 is currently considered the gold standard for a visual quality metric.

You can read more about why this metric is good here, but SSIMULACRA2 (ssimu2) is, in my opinion, the only way to really measure subjective visual quality right now. The scores it produces correlate best with my eyes, and many others agree.

Here's how SSIMULACRA2 assesses quality:

Returns a score in range -inf..100, which correlates to subjective visual quality scores as follows:

  • 30 = low quality. This corresponds to the p10 worst output of mozjpeg -quality 30.
  • 50 = medium quality. This corresponds to the average output of cjxl -q 40 or mozjpeg -quality 40, or the p10 output of cjxl -q 50 or mozjpeg -quality 60.
  • 70 = high quality. This corresponds to the average output of cjxl -q 65 or mozjpeg -quality 70, p10 output of cjxl -q 75 or mozjpeg -quality 80.
  • 90 = very high quality. Likely impossible to distinguish from the original when viewed at 1:1 from a normal viewing distance. This corresponds to the average output of mozjpeg -quality 95 or the p10 output of cjxl -q 95.
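If you want to label scores in your own tooling, the anchor points above can be turned into a small lookup. This is only a sketch: the metric documents anchor points rather than hard bands, so treating each anchor as the floor of a band is my simplification, and the function name and the "below low quality" label are made up for illustration.

```python
import math

# Anchor points quoted in the list above, highest first: (minimum score, label).
SSIMU2_BANDS = [
    (90.0, "very high quality"),
    (70.0, "high quality"),
    (50.0, "medium quality"),
    (30.0, "low quality"),
]


def describe_ssimu2(score: float) -> str:
    """Return the label of the highest anchor that `score` reaches."""
    if math.isnan(score) or score > 100.0:
        raise ValueError("SSIMULACRA2 scores lie in the range -inf..100")
    for floor, label in SSIMU2_BANDS:
        if score >= floor:
            return label
    # Assumption: anything under the lowest anchor (including negative
    # scores) is simply reported as being below it.
    return "below low quality"
```

Negative scores are valid input here, which matters for the low bitrate end of the graphs further down.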

In my Image Codec Comparison benchmark, I said "Because of the irrelevance of negative SSIMULACRA2 scores, I ended the vertical axis on the graph(s) at 0." I still maintain this logic was correct for the image benchmark, but with livestreams, it is not uncommon to see a low bitrate stream due to a poor network connection on either a streamer's end or your own. This test is also more about how the encoders perform at certain target bitrates, as mentioned above, and since the goal isn't necessarily always to target a certain quality like it is with images, I think some of the less relevant data (eg target 500kb/s) is still important to include because of how it frames the more useful data (eg the 2000-6000kb/s range popular with streamers).

Anyway, I've seen testing from many using VMAF, and their conclusions vary as to which hardware encoder is the best. These tests are more thorough than mine, but aren't accompanied by much explanation or any VAAPI results for Linux. They also use VMAF, so everything must be taken with a grain of salt. I still think they're useful in judging the quality of many hardware encoders that I haven't tested here (like AMD's stuff) so I'll still recommend you go take a look if you're interested. So, what did we find in our own testing?

Methodology (Gaming Corpus)

In order to understand who uses hardware encoders and where, look no further than livestreaming. Livestreaming can be done via OBS, and you have a choice between CBR, VBR, & CQP encoding. Many choose to go for CBR or VBR, since your connection is most likely the bottleneck & you can target a certain bitrate. I'd personally go for VBR so that the encoder can intelligently allocate more or fewer bits when necessary while still targeting a specific bitrate on average. This also allows us to get better overall quality per bit, so the viewer can see a higher quality stream while using less data on average. It is more reliable for data usage & bandwidth than CQP, which targets quality, & more efficient than CBR, which will give every scene the same amount of bits no matter its content.
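To make the CBR vs VBR difference concrete, here is a toy model in Python. It is not how any real rate controller works: the per-scene "complexity" numbers, the equal scene lengths, and both function names are assumptions made up for illustration. It only shows that VBR can move bits toward harder scenes while the average still lands on the target.

```python
def allocate_cbr(complexities, target_kbps):
    """CBR-like: every scene gets the same bitrate regardless of content."""
    return [float(target_kbps) for _ in complexities]


def allocate_vbr(complexities, target_kbps):
    """VBR-like: bitrate follows complexity while the mean stays on target.

    Assumes all scenes have equal duration and non-negative complexity.
    """
    if not complexities:
        return []
    if any(c < 0 for c in complexities):
        raise ValueError("complexity values must be non-negative")
    mean_complexity = sum(complexities) / len(complexities)
    if mean_complexity == 0:
        # Nothing to distinguish the scenes, so fall back to a flat split.
        return allocate_cbr(complexities, target_kbps)
    return [target_kbps * c / mean_complexity for c in complexities]
```

For three scenes with relative complexity 1, 2, and 3 and a 3000kb/s target, the CBR-like split gives each scene 3000kb/s, while the VBR-like split gives the simplest scene half the target and the hardest one and a half times the target, with the same average.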

With help from a couple friends online (namely BlacKnight & Maik), I tested hardware & software encoders from target 500kb/s to target 8500kb/s in increments of 500. This gave us 16 data points per encoder per clip to work with. I used two clips from DERF's Test Media, namely the CSGO clip & the Minecraft clip under "Gaming", and transcoded them to h264 with crf 1 to provide the encoders with near mathematically lossless video that they could efficiently decode. The loss introduced by reencoding at such high quality is minuscule & should have a nearly undetectable effect on the benchmark results.
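The procedure above can be sketched as a few helpers that build FFmpeg argument lists. These are not the exact commands used for the benchmark: the file names are placeholders, and dropping audio with -an and picking SVT-AV1 as the example encoder are my assumptions. Note that an inclusive ladder from 500 to 8500 in steps of 500 has 17 rungs, so to reproduce the 16 points mentioned above you would drop one endpoint; which one is not stated, so the bounds are left as parameters.

```python
def bitrate_ladder(start_kbps=500, stop_kbps=8500, step_kbps=500):
    """Target bitrates in kb/s, inclusive of both ends."""
    return list(range(start_kbps, stop_kbps + 1, step_kbps))


def mezzanine_command(source, output):
    """x264 at CRF 1: a near-lossless intermediate that decodes quickly."""
    return ["ffmpeg", "-i", source, "-c:v", "libx264", "-crf", "1",
            "-an", output]


def svtav1_command(mezzanine, output, kbps, preset=8, ten_bit=True):
    """One target-bitrate SVT-AV1 encode of the intermediate file."""
    pix_fmt = "yuv420p10le" if ten_bit else "yuv420p"
    return ["ffmpeg", "-i", mezzanine, "-c:v", "libsvtav1",
            "-preset", str(preset), "-b:v", f"{kbps}k",
            "-pix_fmt", pix_fmt, "-an", output]


if __name__ == "__main__":
    # Placeholder file names; nothing is executed, the commands are printed.
    print(" ".join(mezzanine_command("csgo.y4m", "csgo_crf1.mkv")))
    for kbps in bitrate_ladder():
        out = f"csgo_svtav1_p8_{kbps}k.mkv"
        print(" ".join(svtav1_command("csgo_crf1.mkv", out, kbps)))
```

Each argument list can be handed to subprocess.run once you are happy with it; hardware encoders would swap in their own -c:v name and options.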

Here is a screenshot from the CSGO clip:

csgo

& here is one from the Minecraft clip:

minecraft

All hardware encoders were on their highest effort preset (-compression_level 7) for maximum quality. It didn't have a big effect on speed in my testing.

The encoders tested for the CSGO clip were the following:

  • h264 (UHD770 VAAPI)
  • h264 (Arc A770 VAAPI)
  • HEVC (UHD770 VAAPI)
  • HEVC (Arc A770 VAAPI)
  • HEVC (Arc A770 VAAPI, 10 bit)
  • VP9 (UHD770 VAAPI)
  • x264 (slower preset)
  • SVT-AV1 (preset 6, 10 bit)
  • SVT-AV1 (preset 8)
  • SVT-AV1 (preset 8, 10 bit)
  • Nvenc AV1 (RTX 4090)
  • Nvenc AV1 (RTX 4090, 10 bit)
  • QSV AV1 (Arc A770)
  • QSV AV1 (Arc A770, 10 bit)

The encoders tested for the Minecraft clip were the following:

  • x265 (medium preset, 10 bit)
  • SVT-AV1 (preset 4, 10 bit)
  • SVT-AV1 (preset 6, 10 bit)
  • SVT-AV1 (preset 8, 10 bit)
  • Nvenc AV1 (RTX 4090, 10 bit)
  • QSV AV1 (Arc A770, 10 bit)

You may be wondering why there is such an emphasis on 10 bit encoding despite the 8 bit source. In my testing, even just with my eyes, encoding to 10 bit video with most lossy video codecs (HEVC & AV1 especially) yields better coding efficiency than encoding to 8 bit. I can't tell you exactly why, but I'm sure others on the AV1 Community Discord can.

You may also be wondering about the lack of AMD numbers. I have a Radeon RX 6600 XT, & myself and others found it very difficult to properly test AMD's hardware encoders. You can look at the results I linked above that report VMAF numbers; the massive discrepancy in AMD's AV1 performance compared to other AV1 encoders should be enough to let you know how good AMD's results are.

Results

Finally, we come to the graphs. It is also worth noting that the default GOP size for the hardware encoders is around 300 frames; leaving it on "auto" in FFmpeg often yielded slightly better results by a few fractions of a point in ssimu2, so that is what we did.

CSGO

Here's a graph for target bitrate (VBR) with the CSGO clip:

csgo_results

And a log scale version that we can use to see discrepancies better:

csgo_results_log

That's quite a lot to unpack, as there are a considerable number of overlapping lines here. I've attached a CSV file here so that you can visualize this data on your own in any way you please, but I think we can come to some interesting conclusions given this current visualization. For this clip:

  • There are clear, marked differences in coding efficiency depending on the encoder even when the codec is the same. So, just because your video is h264 doesn't mean it was encoded as efficiently as every other h264 video. You can see here that the Arc A770's h264 encoder yields better quality per bit than the h264 encoder that's part of the iGPU on my 13700k. The fixed function hardware Intel is deploying for these two implementations is distinct, and Arc's is clearly better. The software x264 encoder that runs directly on the CPU is certainly slower, but you can see how that pays off in spades.
  • SVT-AV1 is in a class of its own. Preset 8 dominates the hardware encoders across the board by a not insignificant margin, & Preset 6 pulls ahead even further. While hardware encoders are very fast and a gem for streaming, software encoders still take home the efficiency crown.
  • NVENC AV1 is more efficient than QSV AV1 on Arc, even if the advantage is slight. Even when comparing 8 bit NVENC to 10 bit QSV, NVENC still manages to pull ahead ever so slightly.
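If you'd rather check comparisons like these against the CSV than squint at overlapping lines, a head-to-head count is easy to script. This is a sketch only: I'm assuming a layout with encoder, target_kbps, and ssimu2 columns, which may not match the attached file, so rename the keys to whatever its header actually uses.

```python
import csv
import io
from collections import defaultdict


def load_scores(csv_text):
    """Return {encoder: {target_kbps: ssimu2 score}} from CSV text.

    The column names are assumptions about the file layout.
    """
    scores = defaultdict(dict)
    for row in csv.DictReader(io.StringIO(csv_text)):
        kbps = int(row["target_kbps"])
        scores[row["encoder"]][kbps] = float(row["ssimu2"])
    return dict(scores)


def head_to_head(scores, a, b):
    """Return (wins, shared): how often `a` outscores `b` at shared bitrates."""
    shared = sorted(set(scores[a]) & set(scores[b]))
    wins = sum(1 for kbps in shared if scores[a][kbps] > scores[b][kbps])
    return wins, len(shared)
```

Read the file's text with open(path).read(), pass it to load_scores, and call head_to_head with the two encoder names exactly as they appear in the file.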

If you came here to rub salt in Arc users' wounds, I wouldn't say the advantage is significant enough to warrant purchasing a 40-series card over an Arc card solely for encoding, especially considering the price discrepancy. However, the advantage is still there, and shouldn't be ignored. My takeaway is if you have more money than you know what to do with, don't buy an Arc GPU for encoding if you have a 40-series card from Nvidia already. Spend it on a nice bowl of ramen instead (or a couple nice bowls of ramen, realistically).

Minecraft

Let's see what else we can find looking at the Minecraft results:

minecraft_results

Here's another log scale version:

minecraft_results_log

This one is more stripped down, & everything is easier to see. Here are my takeaways:

  • Now that we have x265 in the mix, we can see NVENC beat it across the board & QSV came close. That's pretty impressive, as x265 medium isn't a fast setting by any stretch of the imagination.
  • SVT-AV1 preset 4's performance leaves us with some clues about our final conclusion, seeing as software AV1 encoding (even with SVT-AV1, which doesn't produce as good quality per bit as AOM encoder forks like aom-av1-lavish) continues to pull ahead significantly the more time & effort we allow it to use. It may be possible to stream using SVT-AV1 preset 8 on higher end systems right now, but more likely you'd be using preset 9 or 10, which is where I think SVT-AV1 will be on par with hardware encoders. As CPUs become more powerful, this will become easier, & it is clear that dedicated streaming hardware may continue to provide value for streamers who can afford it.

Conclusion

Who Won?

Nvidia's hardware acceleration for AV1 video encoding is superior in these two clips. It is a tiny margin, and it is important to judge subjective visual quality for yourself when comparing tiny discrepancies like this, but SSIMULACRA2 says that Nvidia has the better solution. I don't think that's the most important takeaway here, though, considering our peek at the potential software encoders have.

Software vs hardware?

Software encoders are run on the CPU without any fixed function hardware designed around any specific codec. Hardware accelerated video encoding takes place on specialized ASICs that are attached to most consumer GPUs, & encoding with hardware acceleration means the encoder can also make use of a GPU's highly parallel compute to further accelerate certain video encoding functions. Developing these ASICs is expensive, and dedicating valuable silicon real estate on GPUs to such features must prove to be well worth a company's time and money. Nvidia & Intel both seem to agree that AV1 has enough of a future to make it financially sane to put fast dedicated hardware in the hands of every consumer who has one of their latest GPUs, which is pretty cool.

The issue regarding efficiency is that this fixed function hardware is designed to be fast above all else, & implementing less important codec features into this hardware can inflate costs without helping efficiency meaningfully. In software, while slower, encoders can pick and choose to enable/disable whatever features they like based on user specified parameters or internal defaults for each preset. This makes the use cases for both solutions very different.

When should I use hardware encoders?

As I mentioned earlier, livestreaming is where hardware accelerated video encoding is most popular. Nvidia has turned their NVENC hardware into a differentiating feature because of the popularity of livestreaming, & as sites like Twitch & YouTube roll out AV1 for the masses, livestreamers will appreciate having a strong hardware implementation of the more efficient codec to stream with, provided their hardware readily supports it.

Hardware encoding is also great for screen recording, capturing clips in games, and transcoding media on the fly for a media server or something similar. When speed is the priority over coding efficiency, hardware encoders are a fantastic choice.

Here are some places where you probably shouldn't use hardware encoders:

  • Transcoding media you legally own for later consumption (unless you are burdened by a time limit or it is too painful to encode at less than real time)
  • Re-encoding home video that needs to take up less space somewhere or needs to be shared easily
  • Producing a final copy of a video project of some kind to be shared with others
  • Illegally releasing pirated media that has been transcoded to achieve a smaller filesize for viewer convenience (I can't recommend this regardless)

You should never transcode something that has already been transcoded unless you have to; you should look for the source. If you have a BluRay collection that you transcoded to h264 ten years ago, don't go re-encoding it to AV1 now. Find the source & do it over, else you risk preserving artifacts from the original encode. And do it with a software encoder. If you're exporting a video for a project or something similar, it may be worth encoding to a lossless or near lossless file & reencoding as efficiently as possible later when you can afford the time so you can optimize for quality per bit.

SVT-AV1 is great for that sweet spot balance of speed & efficiency for AV1 video. Rav1e, my personal favorite software AV1 encoder, isn't the best, but it's great for high fidelity & should never segfault. aom-av1-lavish is the reigning champ for coding efficiency in a software encoder. Use it at speed 4 or lower for excellent results; they will be results you'll have to wait very long for, though. Tools like av1an can help.

You can try my SVT-AV1 GUI, Aviator, if you'd like to mess around with SVT-AV1 without using the command line. rAV1ator is its cousin, powered by av1an & rav1e instead of SVT-AV1 & FFmpeg. Please give them a look if you're interested in getting started with AV1. If you'd like to dive down the rabbit hole, feel free to look into this article for a detailed guide written by a talented encoder. Thanks for reading!

Mastodon | Matrix | The Radix Project
