## Problem
Kubelet sets `pod.Status.Phase = Succeeded` before finalizing `containerStatuses[0].state.terminated`. When the informer sees the phase change and `WaitStep` calls `Get()`, the container status may still show `Terminated == nil`, causing a hard error:
```
no terminated state found for container wp-XXX/wp-XXX
```
This is a known race in the Kubernetes API server/kubelet eventually-consistent model. The window is normally milliseconds but widens to seconds under load (apiserver latency spikes, ResourceQuota admission storms, node pressure).
## Fix
Wrap the post-informer `Get()` + `Terminated == nil` check in `backoff.Retry` with exponential backoff (200ms initial, 5s max interval, 15s total budget). This mirrors the retry pattern already used for `TailStep` log stream recovery (#5550).
Extends `depends_on` to accept objects with `name` and `optional` fields, at both workflow and step level. When `optional: true`, the dependency is silently dropped if the referenced workflow/step is not part of the pipeline (e.g. filtered out by `when` conditions). If present, it is enforced as usual.
Co-authored-by: 6543 <6543@obermui.de>
As described in #6616, services are occasionally not terminated when the pipeline completes. It appears that if the service pods are terminated before `WaitStep` sets up its hooks for them, the delete events are lost and the services keep running until the pipeline timeout is exceeded.
This PR adds a delete event handler and a check for the special case where the pod is already terminated when the event handlers are set.
Extract the `step_builder` from the server to the pipeline package.
This cleans up the interfaces and structure, and will allow us to reuse it in the CLI to correctly support pipeline execution (things like `depends_on` support).
Co-authored-by: Anton Bracke <anton.bracke@fastleansmart.com>
Co-authored-by: 6543 <6543@obermui.de>
### Problem
When the working directory is set to a directory that doesn't exist (for example, as `plugin-git` does), kubelet will pre-create it with ownership set to `root:root` and permissions `0755`. This makes pods running as non-root unable to write to it, causing permission errors.
### Solution
Added a `podInitContainer` function that conditionally creates an init container to pre-create the working directory with the correct permissions before the main step container starts.
### Behavior
- If the pod runs as root (`RunAsUser == 0` or unset), no init container is created. Kubelet handles directory creation automatically.
- If the working directory matches a volume mount path exactly, no init container is needed. `FSGroupChangePolicy` handles permissions.
- An init container is only created when the working directory is nested within a volume mount path.
- The init container uses `busybox:stable-musl` with minimal resource limits (5m CPU, 5Mi memory) and drops all capabilities.
### Related issues and PRs
- Solves the error mentioned in https://github.com/woodpecker-ci/woodpecker/issues/5346#issuecomment-3211408746 without requiring a previous step.
- In addition to #6307 and #6310, this will make it easier to run Woodpecker CI workloads in a namespace that enforces [Pod Security Standards](https://kubernetes.io/docs/concepts/security/pod-security-standards/).
Because we now wait for all steps to report their trace status back before we return, the `defer` no longer tore down services.
We now explicitly tear down services and steps after all stages have executed.
Also adds tests to check for that and updates the dummy backend to fulfill the interface contract of killing all "running" steps with `DestroyWorkflow`.