gg's subs, 5.1 128kbps opus
ffmpeg -i file -vcodec libaom-av1 -pix_fmt yuv420p12le -crf 29 -cpu-used 2 -row-mt 0 -b:v 0 -acodec copy -strict -2 outfile.mkv
Why 12-bit? It had the same size and higher measured quality (SSIM/PSNR) than 10-bit, and no current hardware decodes AV1 anyway, so why not.
Run time was ~120hrs on a 9880H (NAS) averaging 3.5-3.7GHz. The file was split into 6 parts and 6 ffmpeg instances were run in parallel.
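For anyone curious, a rough sketch of that kind of split/parallel workflow with ffmpeg (filenames, timestamps, and part count here are assumed, not necessarily what was actually done): cut the source at keyframes with stream copy, encode each part in its own instance, then stitch them back with the concat demuxer and remux the audio:
ffmpeg -i file -ss 00:00:00 -to 00:20:00 -c copy part1.mkv
ffmpeg -i part1.mkv -vcodec libaom-av1 -pix_fmt yuv420p12le -crf 29 -cpu-used 2 -b:v 0 -an part1-av1.mkv
printf "file 'part1-av1.mkv'\nfile 'part2-av1.mkv'\n" > parts.txt
ffmpeg -f concat -safe 0 -i parts.txt -c copy video.mkv
ffmpeg -i video.mkv -i file -map 0:v -map 1:a -c copy outfile.mkv
(Repeat the first two steps per part; note that -c copy can only cut on keyframes, so split points won't be frame-exact.)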
Let me explain to you a bit about bits and how they work:
8-bit is per channel, so in total you get 24 bits per pixel, or 16'777'216 colors
10-bit gives you 30 bits per pixel, or 1'073'741'824 colors
12-bit gives you 36 bits per pixel, or 68'719'476'736 colors
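(For reference, these are just 2^(3 x bit depth); a quick sanity check in bash, purely illustrative:)
echo $((2**24)) $((2**30)) $((2**36))
# prints 16777216 1073741824 68719476736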
The problem is that current hardware will never get hardware decoding; only hardware that comes out in 2021 (maybe) will have it. And when we do get hardware decoding, it is definitely not going to support 12-bit.
It is inherently harder to decode higher-bit-depth content, so even if decoders get optimized, the data is still going to be more complex to decode, and since no one is using 12-bit except pro cinema equipment, no one is going to bother optimizing 12-bit decoding.
12-bit is not more efficient than 10-bit. I know because I tried. It only results in unnecessary data which does not improve the quality (the whole HAV1T team tried to see differences), nor does it improve efficiency (BD-rate actually increases by 1-2%).
Also, 10-bit is already way above the human eye's color perception level, and finding hardware that can display 12-bit color is near impossible outside pro equipment. Most monitors get better quality video from a 10-bit encode even when they are only 8-bit panels, because 10-bit gives more accurate colors (especially in the luma range); for 12-bit, current consumer monitors physically lack the colors to show any difference.
Another point is this: encoding times are going to suffer too, a lot. While it will improve, more bit data = more complexity = more processing power required.
5.1 audio at 128kbps is, hmm, apparently feasible with Opus (https://people.xiph.org/~xiphmont/demo/opus/demo3.shtml), did not expect that.
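For reference, re-encoding the audio that way with ffmpeg is roughly the following (filenames assumed, not the exact command used for this release):
ffmpeg -i input.mkv -map 0 -c:v copy -c:s copy -c:a libopus -b:a 128k output.mkv
If libopus complains about a 5.1(side) channel layout, remapping to plain 5.1 with -af aformat=channel_layouts=5.1 is a commonly used workaround.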
Another point is that it took you 120 hours to encode on a 9880H, while I can output 4 episodes per 24 hours at cpu-used 3, 1080p, with an i7-6700.
Another footnote, it is recommended to use VMAF rather than PSNR and SSIM.
VMAF is tuned to human perception and gives more consistent data compared to the other two.
It is not perfect, but it is definitely better for measuring quality in quantifiable ways.
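For anyone who wants to try it: ffmpeg can compute VMAF directly if it was built with --enable-libvmaf. A minimal example (filenames assumed; first input is the encode, second is the reference):
ffmpeg -i encode.mkv -i source.mkv -lavfi libvmaf -f null -
It prints a VMAF score at the end of the run.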
> current hardware will never get hardware decoding
I don't think that's true, I do recall NVIDIA having hardware video decoders. Unless you mean an AV1 hardware decoder, then yes.
Current is the important word here. To have hardware decoding capability you have to physically have it in the computer; that is why it's called hardware acceleration. There will be adoption, and maybe new GPUs will have built-in decoders, but no current hardware (with very few exceptions) has an AV1 decoder. You have to do it in software.
Edit: Yes, I meant AV1; there are hardware decoders for older codecs. Newer GPUs have HEVC hardware decoding, some even have NVENC hardware encoders, but none have AV1.
> 12-bit is not more efficient than 10-bit. I know because I tried. It only results in unnecessary data which does not improve the quality (the whole HAV1T team tried to see differences), nor does it improve efficiency (BD-rate actually increases by 1-2%).
That's possible. I got a small PSNR/SSIM improvement by using 12-bit, at no bitrate increase. It's not so much perceivable colors as how the gradients work, especially on anime. The difference was also dramatic on Macross Frontier.
10-bit is all around a better option, I agree. 12-bit also slightly slowed down encoding.
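(For anyone reproducing the comparison: in the command at the top, the only change between the two encodes is the pixel format, i.e. -pix_fmt yuv420p10le for 10-bit vs -pix_fmt yuv420p12le for 12-bit; everything else stays the same.)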
> Another footnote, it is recommended to use VMAF rather than PSNR and SSIM.
I'm aware of VMAF, but VMAF also behaves kind of weirdly if you are running filters, as it prefers oversharpening from what I understand.
> Another point is that it took you 120 hours to encode on a 9880H, while I can output 4 episodes per 24 hours at cpu-used 3, 1080p, with an i7-6700.
From my tests, cpu-used 3 was subjectively worse (as well as quantitatively), and was about 2.5x as fast. I'm doing this more to amuse myself; if I was gung-ho about encoding I'd just get a Threadripper or something. I'm not doing live releases and competing with groups.
Also AV1 will get faster over time, remember how horrendous x265 was at the beginning?
@2ndfire Oh yeah, AV1 is getting faster every day; every week I see improvements of 2%-3% in speed.
And you are correct about VMAF favoring sharpening; it is one of its quirks. As far as I understand, Netflix is trying to fix it, but the issue is still present.
Correct me if I'm wrong, but I expected 12-bit encoding to be a lot slower than 10-bit. How much of a speed difference do you get with the same settings when comparing 10-bit vs 12-bit?
Maybe it would be more beneficial to use a slower setting on 10-bit compared to a faster setting on 12-bit? (Btw, there is a dramatic jump in subjective quality when using speed 0, but it is VERY slow)
> Correct me if I'm wrong, but I expected 12-bit encoding to be a lot slower than 10-bit. How much of a speed difference do you get with the same settings when comparing 10-bit vs 12-bit?
From memory it was approximately 10% slower for a small (~0.1%) SSIM/PSNR improvement. I did that test on Akira at cpu-used 4 though, so it could be invalid for a no-grain (digital) cpu-used 2 encode.
> Maybe it would be more beneficial to use a slower setting on 10-bit compared to a faster setting on 12-bit? (Btw, there is a dramatic jump in subjective quality when using speed 0, but it is VERY slow)
The cpu-used toggle is WAAAAY coarser than changing between 10 and 12 bit. I remember trying it with CRF, and raising CRF harmed the PSNR/SSIM significantly compared to increasing bit depth.
cpu1:[Parsed_psnr_0 @ 0x55753138a580] PSNR y:28.352120 u:41.138806 v:38.510453 average:29.954134 min:11.308108 max:55.222888 17.7hrs 20secs 9.2MB
cpu4:[Parsed_psnr_0 @ 0x55c6402efc80] PSNR y:28.351391 u:41.123410 v:38.504420 average:29.953095 min:11.307653 max:54.992161 10.0MB
cpu3:[Parsed_psnr_0 @ 0x557e76ce8340] PSNR y:28.351231 u:41.130022 v:38.505762 average:29.953055 min:11.307634 max:55.013905 9.6MB
cpu2:[Parsed_psnr_0 @ 0x556fb1e62b40] PSNR y:28.352588 u:41.141290 v:38.512864 average:29.954672 min:11.308799 max:55.235064 5.6hrs 20secs 9.5MB
cpu4crf2912b:[Parsed_psnr_0 @ 0x55cb1ba80380] PSNR y:28.377576 u:41.108454 v:38.514176 average:29.978371 min:11.343550 max:54.604746 7.5MB
cpu4crf2612b:[Parsed_psnr_0 @ 0x55a85017dd00] PSNR y:28.383055 u:41.161495 v:38.543877 average:29.985022 min:11.343070 max:55.150660 8.2MB
cpu4crf2910b:[Parsed_psnr_0 @ 0x562b3e504c40] PSNR y:28.370361 u:41.090817 v:38.501484 average:29.970894 min:11.336218 max:54.522439 7.5MB
cpu4crf2910b:[Parsed_ssim_0 @ 0x559b15a4a9c0] SSIM Y:0.977146 (16.410326) U:0.988582 (19.424073) V:0.988302 (19.319060) All:0.980911 (17.192225)
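Those numbers come from ffmpeg's psnr and ssim filters; running them against the source looks roughly like this (filenames assumed):
ffmpeg -i encode.mkv -i source.mkv -lavfi "[0:v][1:v]psnr" -f null -
ffmpeg -i encode.mkv -i source.mkv -lavfi "[0:v][1:v]ssim" -f null -
First input is the encode, second is the reference.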
Using 12-bit just because you can is a terrible idea. Keep it at 10-bit, where the chances of eventual hardware support are high, instead of going overboard here.
Otherwise, good job.
You are aware that H.264 10-bit, the most common anime fan-encode format right now, doesn't have hardware decoding support?
I understand your point, but I don't really care about hardware support. Having to constrain the bitstream that much isn't worth it.
I just checked out the new nightly ffmpeg builds; encode speed is massively increased for libaom, on the order of maybe a 20-50x speedup depending on cpu-used level.
I remember QuickTime -> XviD/DivX -> H.264 (the difference was immense) -> H.264 10-bit. Everything now is just shrinking sizes or making streaming quality less terrible.
They also fixed film grain synthesis (AV1 models the film grain, removes it from the encode, and re-synthesizes it on playback, massively reducing file size).
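With the ffmpeg libaom-av1 wrapper, grain synthesis is driven by the denoiser options; a sketch, assuming your build exposes -denoise-noise-level (option availability varies by version, so check ffmpeg -h encoder=libaom-av1 first):
ffmpeg -i grainy.mkv -c:v libaom-av1 -crf 29 -b:v 0 -cpu-used 2 -pix_fmt yuv420p10le -denoise-noise-level 25 -c:a copy grain-synth.mkv
Higher values strip more grain before encoding and let the decoder re-synthesize it from the model.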
I'm thinking maybe Akira, Princess Mononoke, and Spirited Away next. No one is using AV1 yet (at all?) as far as I can see on AniDB.
I'm working on some Takeshi's Castle at the moment.
Can't. Both my machines have 16GB of RAM, and for 4K, SVT-AV1 takes over 16GB of actual reserved memory. I can make a 1080p version from the 4K source though.
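A sketch of the downscale step (filenames and settings assumed); scale=-2:1080 keeps the aspect ratio and an even width:
ffmpeg -i source_4k.mkv -vf scale=-2:1080 -c:v libaom-av1 -crf 29 -b:v 0 -cpu-used 2 -pix_fmt yuv420p10le -c:a copy out_1080p.mkv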
The scan is better, though the colors are... off. The colors are fixable.
And my 8700k couldn't play back the file fast enough. That's AV1 for you.
@Chilled Yeah, with current optimizations, ARM64 devices with powerful A72+ cores actually play back AV1 10-bit content more smoothly than x86 CPUs, so if you really want to get the most out of AV1 for now, get the latest VLC beta on your Android phone. :D
I'm not sure that's true.
From what I've read, nothing short of a 12+ core CPU can decode 4K 10-bit AV1.
Hell, even my 8700 @ 5GHz is having a hard time with 1080p yuv444p10le AKIRA.
Mind you, it will get better as the decoders get better, but I don't think current ARM can decode faster than current x86-64.
It is true, actually.
I have no problem decoding every 1080p 10-bit AV1 stream I've ever encountered on my Snapdragon 730 phone, better than even my 3700X machine.
There have recently been a lot of assembly optimizations on ARM64 CPUs, so that's about it.
1080p 10-bit 4:2:0 is fine.
1080p 10-bit 4:4:4 and 4K 10-bit 4:2:0 aren't.
> better than even my 3700X machine.
A 3700X should have zero problem decoding 1080p 10-bit 4:2:0.
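If anyone wants to check their own machine, a quick way to benchmark pure decode speed with ffmpeg (no display, just decoding as fast as it can; filename assumed) is:
ffmpeg -benchmark -i file.mkv -f null -
Watch the speed= figure in the status line; 1x or above means real-time playback is feasible. Which AV1 decoder the build uses (libaom vs dav1d) makes a huge difference, and dav1d is where the ARM64 assembly work mentioned above landed.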