mirror of https://github.com/open-telemetry/opentelemetry-go.git synced 2026-06-03 18:35:08 +02:00
opentelemetry-go/exporters
Tyler Yahn d03b03395d Check context prior to delaying retry in OTLP exporters (#7678)
Fix #7673

[Issue being
addressed](https://github.com/open-telemetry/opentelemetry-go/issues/7673#issuecomment-3618325229):

> 1.
[`fn`](https://github.com/open-telemetry/opentelemetry-go/blob/1bc9713ac6dc8cbe2fd04fd6dc716d316059eb90/exporters/otlp/otlplog/otlploggrpc/internal/retry/retry_test.go#L163-L165)
is
[called](https://github.com/open-telemetry/opentelemetry-go/blob/1bc9713ac6dc8cbe2fd04fd6dc716d316059eb90/exporters/otlp/otlplog/otlploggrpc/internal/retry/retry.go#L87)
> 2. It [returns an
error](https://github.com/open-telemetry/opentelemetry-go/blob/1bc9713ac6dc8cbe2fd04fd6dc716d316059eb90/exporters/otlp/otlplog/otlploggrpc/internal/retry/retry_test.go#L165)
> 3. The code [checks if the error is
retryable](https://github.com/open-telemetry/opentelemetry-go/blob/1bc9713ac6dc8cbe2fd04fd6dc716d316059eb90/exporters/otlp/otlplog/otlploggrpc/internal/retry/retry.go#L92),
it [always
is](https://github.com/open-telemetry/opentelemetry-go/blob/1bc9713ac6dc8cbe2fd04fd6dc716d316059eb90/exporters/otlp/otlplog/otlploggrpc/internal/retry/retry_test.go#L149)
> 4. [Time delay is
checked](https://github.com/open-telemetry/opentelemetry-go/blob/1bc9713ac6dc8cbe2fd04fd6dc716d316059eb90/exporters/otlp/otlplog/otlploggrpc/internal/retry/retry.go#L97-L108)
> - [Max elapsed
time](https://github.com/open-telemetry/opentelemetry-go/blob/1bc9713ac6dc8cbe2fd04fd6dc716d316059eb90/exporters/otlp/otlplog/otlploggrpc/internal/retry/retry_test.go#L156-L157)
is 10 ms
> - Initial [delay is
1ms](https://github.com/open-telemetry/opentelemetry-go/blob/1bc9713ac6dc8cbe2fd04fd6dc716d316059eb90/exporters/otlp/otlplog/otlploggrpc/internal/retry/retry_test.go#L151)
>    - Delay is determined to be 1ms
>    - The program proceeds to waiting
> 5. [Wait is
called](https://github.com/open-telemetry/opentelemetry-go/blob/1bc9713ac6dc8cbe2fd04fd6dc716d316059eb90/exporters/otlp/otlplog/otlploggrpc/internal/retry/retry.go#L110-L112)
> 6. The [wait select statement is
evaluated](https://github.com/open-telemetry/opentelemetry-go/blob/1bc9713ac6dc8cbe2fd04fd6dc716d316059eb90/exporters/otlp/otlplog/otlploggrpc/internal/retry/retry.go#L127-L138)
>    - On slow systems both `case`s are true
>       - [Non-deterministically](https://go.dev/ref/spec#:~:text=If%20one%20or,communications%20can%20proceed.)
the [timer channel
`case`](https://github.com/open-telemetry/opentelemetry-go/blob/1bc9713ac6dc8cbe2fd04fd6dc716d316059eb90/exporters/otlp/otlplog/otlploggrpc/internal/retry/retry.go#L137)
is selected
>       - The retry function is re-run and a second iteration is recorded,
causing the failure
>    - On fast systems only the context cancel is true
>       - The retry stops here with only `1` execution

Do not rely on the non-deterministic `select` statement to catch an ended
context prior to waiting for a retry delay. Explicitly check the context
prior to entering the wait.

This resolves the flaky test and ensures that, in normal operation,
requests with a canceled context are ended without having to wait for any
additional delays.
2025-12-07 10:11:36 -08:00