clSetKernelArg can return an error due to lack of memory (for instance):
https://www.khronos.org/registry/cl/sdk/1.1/docs/man/xhtml/clSetKernelArg.html.
Thus this error must be propagated.
Currently should not trigger warnings, but adds robustness.
Untested.
Reviewed-by: Michael Niedermayer <michael@niedermayer.cc>
Signed-off-by: Ganesh Ajjanagadde <gajjanagadde@gmail.com>
This simplifies and cleans up the code.
Furthermore, it is much faster due to absence of the slow log computation.
Reviewed-by: Michael Niedermayer <michael@niedermayer.cc>
Signed-off-by: Ganesh Ajjanagadde <gajjanagadde@gmail.com>
Current code is fine, this just adds robustness.
Reviewed-by: Michael Niedermayer <michael@niedermayer.cc>
Signed-off-by: Ganesh Ajjanagadde <gajjanagadde@gmail.com>
av_gcd is now always defined regardless of input. This documents this
change in the "documented API". Two benefits (closely related):
1. The function is robust, and there is no need to worry about INT64_MIN, etc.
2. Clients of av_gcd, like av_reduce, can now be made fully correct. Currently,
av_reduce can trigger undefined behavior if e.g num is INT64_MIN due to
integer overflow in the FFABS. Furthermore, this undefined behavior is
completely undocumented, and could be a fuzzer's paradise. The FFABS was needed in the past as
av_gcd was undefined for negative inputs. In order to make av_reduce
robust, it is essential to guarantee that av_gcd works for all int64_t.
Reviewed-by: Michael Niedermayer <michael@niedermayer.cc>
Signed-off-by: Ganesh Ajjanagadde <gajjanagadde@gmail.com>
This ensures that no undefined behavior is invoked, while retaining
identical return values in all cases and at no loss of performance
(identical asm on clang and gcc).
Essentially, this patch exchanges undefined behavior with implementation
defined behavior, a strict improvement.
Rationale:
1. The ideal solution is to have the return type a uint64_t. This
unfortunately requires an API change.
2. The only pathological behavior happens if both arguments are
INT64_MIN, to the best of my knowledge. In such a case, the
implementation defined behavior is invoked in the sense that UINT64_MAX
is interpreted as INT64_MIN, which any reasonable implementation will
do. In any case, any usage where both arguments are INT64_MIN is a
fuzzer anyway.
3. Alternatives of checking, etc require branching and lose performance
for no concrete gain - no client cares about av_gcd's actual value when
both args are INT64_MIN. Even if it did, on sane platforms (e.g all the
ones FFmpeg cares about), it produces a correct gcd, namely INT64_MIN.
Reviewed-by: Michael Niedermayer <michael@niedermayer.cc>
Signed-off-by: Ganesh Ajjanagadde <gajjanagadde@gmail.com>
This one should not trigger any warnings, but will be useful for future
robustness.
Strictly speaking, one could check the size after the call by examining
the structure instead of the return value. Such a use case is highly
unusual, and this commit may be easily reverted if there is a legitimate
need of such use.
Reviewed-by: Michael Niedermayer <michael@niedermayer.cc>
Signed-off-by: Ganesh Ajjanagadde <gajjanagadde@gmail.com>
This ensures that the macro remains correct in the sense of allowing
expressions for value and bits, by placing the value and bits expressions within
parentheses.
Reviewed-by: James Almer <jamrial@gmail.com>
Signed-off-by: Ganesh Ajjanagadde <gajjanagadde@gmail.com>
Commit 7c8fcbbde3 introduced some warnings
that get triggered on the test build. This should fix them.
Signed-off-by: Ganesh Ajjanagadde <gajjanagadde@gmail.com>
libc's qsort comparator has a const qualifier on both arguments. This
adds a missing const qualifier to exactly match the comparator API.
Existing usages of av_tree_find, av_tree_insert are appropriately
modified: type signature changes of the comparators, and removal of
unnecessary void * casts of function pointers.
Reviewed-by: Henrik Gramner <henrik@gramner.com>
Reviewed-by: wm4 <nfxjfg@googlemail.com>
Signed-off-by: Ganesh Ajjanagadde <gajjanagadde@gmail.com>
This documents the additional constness, and provides a useful libc
reference for the API specification of the comparator.
Reviewed-by: Henrik Gramner <henrik@gramner.com>
Reviewed-by: wm4 <nfxjfg@googlemail.com>
Signed-off-by: Ganesh Ajjanagadde <gajjanagadde@gmail.com>
Also replace the last two usages of avpriv_float_dsp_init with
avpriv_float_dsp_alloc.
Reviewed-by: James Almer <jamrial@gmail.com>
Signed-off-by: Andreas Cadhalpun <Andreas.Cadhalpun@googlemail.com>
These functions return an error typically when the key size is an
incorrect number. AVERROR(EINVAL) is more specific than -1.
Reviewed-by: Ronald S. Bultje <rsbultje@gmail.com>
Reviewed-by: Michael Niedermayer <michael@niedermayer.cc>
Signed-off-by: Ganesh Ajjanagadde <gajjanagadde@gmail.com>
Silences warnings regarding `clCreateCommandQueue` being deprecated.
Only a very limited number of products support 2.0. Since the
replacement API (`clCreateCommandQueueWithProperties`) is only available
in 2.0, we should not update it just yet.
This adds av_warn_unused_result to functions whose return codes need to
be checked.
Reviewed-by: Michael Niedermayer <michael@niedermayer.cc>
Signed-off-by: Ganesh Ajjanagadde <gajjanagadde@gmail.com>
It's been argued that the benefits of the current implementation far outweight
those of making the structs opaque.
This deprecation is not present in any release, so it can be safely removed.
Reviewed-by: Ronald S. Bultje <rsbultje@gmail.com>
Signed-off-by: James Almer <jamrial@gmail.com>
The return code here should be checked.
Reviewed-by: Michael Niedermayer <michael@niedermayer.cc>
Signed-off-by: Ganesh Ajjanagadde <gajjanagadde@gmail.com>
The open syscall can obviously fail, and its return code needs to be
checked.
Reviewed-by: Michael Niedermayer <michael@niedermayer.cc>
Signed-off-by: Ganesh Ajjanagadde <gajjanagadde@gmail.com>
It has already been demonstrated that the de Bruijn method has benefits
over the current implementation: commit 971d12b7f9.
That commit implemented it for long long, this extends it to the int version.
Tested with FATE.
Signed-off-by: Ganesh Ajjanagadde <gajjanagadde@gmail.com>
Signed-off-by: Ronald S. Bultje <rsbultje@gmail.com>
changing the context state and restoring it is not safe if another
thread writes data into the fifo
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
This reduces the memory & cache need from 256 to 64 bytes
the code also seems faster with this change
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
This uses Stein's binary GCD algorithm:
https://en.wikipedia.org/wiki/Binary_GCD_algorithm
to get a roughly 4x speedup over Euclidean GCD on standard architectures
with a compiler intrinsic for ctzll, and a roughly 2x speedup otherwise.
At the moment, the compiler intrinsic is used on GCC and Clang due to
its easy availability.
Quick note regarding overflow: yes, subtractions on int64_t can, but the
llabs takes care of that. The llabs is also guaranteed to be safe, with
no annoying INT64_MIN business since INT64_MIN being a power of 2, is
shifted down before being sent to llabs.
The binary GCD needs ff_ctzll, an extension of ff_ctz for long long (int64_t). On
GCC, this is provided by a built-in. On Microsoft, there is a
BitScanForward64 analog of BitScanForward that should work; but I can't confirm.
Apparently it is not available on 32 bit builds; so this may or may not
work correctly. On Intel, per the documentation there is only an
intrinsic for _bit_scan_forward and people have posted on forums
regarding _bit_scan_forward64, but often their documentation is
woeful. Again, I don't have it, so I can't test.
As such, to be safe, for now only the GCC/Clang intrinsic is added, the rest
use a compiled version based on the De-Bruijn method of Leiserson et al:
http://supertech.csail.mit.edu/papers/debruijn.pdf.
Tested with FATE, sample benchmark (x86-64, GCC 5.2.0, Haswell)
with a START_TIMER and STOP_TIMER in libavutil/rationsl.c, followed by a
make fate.
aac-am00_88.err:
builtin:
714 decicycles in av_gcd, 4095 runs, 1 skips
de-bruijn:
1440 decicycles in av_gcd, 4096 runs, 0 skips
previous:
2889 decicycles in av_gcd, 4096 runs, 0 skips
Signed-off-by: Ganesh Ajjanagadde <gajjanagadde@gmail.com>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>