1
0
mirror of https://github.com/woodpecker-ci/woodpecker.git synced 2026-06-03 16:35:37 +02:00
Files
woodpecker/pipeline/backend
Simon C. Kemper a765cb885a fix(kubernetes): retry WaitStep when container terminated state not yet finalized (#6672)
## Problem

Kubelet sets `pod.Status.Phase = Succeeded` before finalizing `containerStatuses[0].state.terminated`. When the informer sees the phase change and `WaitStep` calls `Get()`, the container status may still show `Terminated == nil`, causing a hard error:

```
no terminated state found for container wp-XXX/wp-XXX
```

This is a known race in the Kubernetes API server/kubelet eventually-consistent model. The window is normally milliseconds but widens to seconds under load (apiserver latency spikes, ResourceQuota admission storms, node pressure).

## Fix

Wrap the post-informer `Get()` + `Terminated == nil` check in `backoff.Retry` with exponential backoff (200ms initial, 5s max interval, 15s total budget). This mirrors the retry pattern already used for `TailStep` log stream recovery (#5550).
2026-05-30 12:35:53 +02:00
..
2024-12-22 10:44:34 +01:00