The AV_INPUT_BUFFER_PADDING_SIZE has been increased to 64, but the value is still 32
in function ff_hevc_sao_edge_filter_8_msa. So, use AV_INPUT_BUFFER_PADDING_SIZE directly.
Also, use MAX_PB_SIZE directly instead of 64. Fate tests passed.
Reviewed-by: Derek Buitenhuis <derek.buitenhuis@gmail.com>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
(cherry picked from commit f652c7a45c60427db0a89fae665e63b546af6ebb)
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
Load the specific bytes instead of MSA load.
Signed-off-by: Kaustubh Raste <kaustubh.raste@imgtec.com>
Reviewed-by: Manojkumar Bhosale <Manojkumar.Bhosale@imgtec.com>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
Load the specific destination bytes instead of MSA load and pack.
Pack the data to half word before clipping.
Use immediate unsigned saturation for clip to max saving one vector register.
Signed-off-by: Kaustubh Raste <kaustubh.raste@imgtec.com>
Reviewed-by: Manojkumar Bhosale <Manojkumar.Bhosale@imgtec.com>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
Remove loops and unroll as block sizes are known.
Signed-off-by: Kaustubh Raste <kaustubh.raste@imgtec.com>
Reviewed-by: Manojkumar Bhosale <Manojkumar.Bhosale@imgtec.com>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
Replace generic with block size specific function.
Signed-off-by: Kaustubh Raste <kaustubh.raste@imgtec.com>
Reviewed-by: Manojkumar Bhosale <Manojkumar.Bhosale@imgtec.com>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
Replace generic with block size specific function.
Signed-off-by: Kaustubh Raste <kaustubh.raste@imgtec.com>
Reviewed-by: Manojkumar Bhosale <Manojkumar.Bhosale@imgtec.com>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
Remove loops and unroll as block sizes are known.
Signed-off-by: Kaustubh Raste <kaustubh.raste@imgtec.com>
Reviewed-by: Manojkumar Bhosale <Manojkumar.Bhosale@imgtec.com>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
Replace generic with block size specific function.
Signed-off-by: Kaustubh Raste <kaustubh.raste@imgtec.com>
Reviewed-by: Manojkumar Bhosale <Manojkumar.Bhosale@imgtec.com>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
Replace generic with block size specific function.
Signed-off-by: Kaustubh Raste <kaustubh.raste@imgtec.com>
Reviewed-by: Manojkumar Bhosale <Manojkumar.Bhosale@imgtec.com>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
Replace generic with block size specific function.
Signed-off-by: Kaustubh Raste <kaustubh.raste@imgtec.com>
Reviewed-by: Manojkumar Bhosale <Manojkumar.Bhosale@imgtec.com>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
Load the specific destination bytes instead of MSA load and pack.
Pack the data to half word before clipping.
Use immediate unsigned saturation for clip to max saving one vector register.
Signed-off-by: Kaustubh Raste <kaustubh.raste@imgtec.com>
Reviewed-by: Manojkumar Bhosale <Manojkumar.Bhosale@imgtec.com>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
Preload data in band filter 0-8 for better pipeline parallelization.
Signed-off-by: Kaustubh Raste <kaustubh.raste@imgtec.com>
Reviewed-by: Manojkumar Bhosale <Manojkumar.Bhosale@imgtec.com>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
Remove loops and unroll as block sizes are known.
Signed-off-by: Kaustubh Raste <kaustubh.raste@imgtec.com>
Reviewed-by: Manojkumar Bhosale <Manojkumar.Bhosale@imgtec.com>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
Optimize luma intra case by reducing conditional cases.
Signed-off-by: Kaustubh Raste <kaustubh.raste@imgtec.com>
Reviewed-by: Manojkumar Bhosale <Manojkumar.Bhosale@imgtec.com>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
Seperate the filter processing in all strong, all weak and strong + weak cases.
Signed-off-by: Kaustubh Raste <kaustubh.raste@imgtec.com>
Reviewed-by: Manojkumar Bhosale <Manojkumar.Bhosale@imgtec.com>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
Load the specific destination bytes instead of MSA load and pack.
Signed-off-by: Kaustubh Raste <kaustubh.raste@imgtec.com>
Reviewed-by: Manojkumar Bhosale <Manojkumar.Bhosale@imgtec.com>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
Updated VP9_LPF_FILTER4_4W macro to process on 8 bit data.
Replaced VP9_LPF_FILTER4_8W with VP9_LPF_FILTER4_4W.
Signed-off-by: Kaustubh Raste <kaustubh.raste@imgtec.com>
Reviewed-by: Manojkumar Bhosale <Manojkumar.Bhosale@imgtec.com>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
* commit '2caa93b813adc5dbb7771dfe615da826a2947d18':
mpegaudiodsp: Change type of array stride parameters to ptrdiff_t
Merged-by: James Almer <jamrial@gmail.com>
* commit 'e4a94d8b36c48d95a7d412c40d7b558422ff659c':
h264chroma: Change type of stride parameters to ptrdiff_t
Merged-by: James Almer <jamrial@gmail.com>
* commit '2ec9fa5ec60dcd10e1cb10d8b4e4437e634ea428':
idct: Change type of array stride parameters to ptrdiff_t
Merged-by: James Almer <jamrial@gmail.com>
* commit '52730e0f867fe77b7d2353d8b44e92edb7079ca5':
iir_filter: Change type of array stride parameters to ptrdiff_t
The merge also updates the MIPS code and drop the extra log.h include.
Merged-by: Clément Bœsch <u@pkh.me>
* commit '1bd890ad173d79e7906c5e1d06bf0a06cca4519d':
hevc: Separate adding residual to prediction from IDCT
This commit should be a noop but isn't because of the following renames:
- transform_add → add_residual
- transform_skip → dequant
- idct_4x4_luma → transform_4x4_luma
Merged-by: Clément Bœsch <cboesch@gopro.com>