You've already forked FFmpeg
mirror of
https://github.com/FFmpeg/FFmpeg.git
synced 2025-12-25 22:17:24 +02:00
The shader needs ~3 loads per DCT coeff. This data was not observed to get efficiently stored in the upper cached levels, loading it explicitely in shared memory fixes that. Also reduce code size by moving the bitstream initialization outside of the switch/case.