I haven't used this seriously, but a simple test I did six years ago is http://canonical.org/~kragen/sw/dev3/vecalpha.c, which I compiled for AMD64 with SSE and for ARM with NEON. I imagine you can do better using intrinsics or assembly, but those are architecture-specific.
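For a feel of what an architecture-neutral test of this kind can look like, here is a minimal sketch of an alpha blend. It assumes the GCC/Clang `vector_size` extension is the mechanism in question, which the comment above does not actually say. It is not the contents of vecalpha.c, and the names `u16x8` and `blend` are made up for illustration.

```c
#include <stdint.h>

/* Hypothetical type: 8 lanes of 16-bit unsigned integers in a 16-byte vector.
   Assumes a compiler with the GCC vector_size extension (GCC or Clang). */
typedef uint16_t u16x8 __attribute__((vector_size(16)));

/* Blend fg over bg, lane by lane, with alpha in 0..255:
   result = (fg*alpha + bg*(255 - alpha) + 127) / 255.
   Inputs are expected to be in 0..255, so the largest intermediate value is
   255*255 + 127 = 65152, which still fits in 16 bits. */
static u16x8 blend(u16x8 fg, u16x8 bg, u16x8 alpha)
{
    const u16x8 k255 = {255, 255, 255, 255, 255, 255, 255, 255};
    const u16x8 k127 = {127, 127, 127, 127, 127, 127, 127, 127};
    u16x8 sum = fg * alpha + bg * (k255 - alpha) + k127;
    return sum / k255; /* the compiler may scalarize the division */
}
```

The same source should build for AMD64 and for ARM without changes, leaving instruction selection to the compiler. Hand-written intrinsics would probably replace the division with a multiply-and-shift trick, at the cost of portability.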
AV1's poor encoding performance in ffmpeg has been a major reason I haven't been using it. It does seem to provide slightly better bandwidth/quality tradeoffs than H.264 or H.265, but if it's 30× slower to encode, it's usually not worth it. Add to that the possibility of patent claims.