typos
@@ -143,7 +143,7 @@ Each frame lookup begins with the current `pc` which is passed to [`findfunc()`]
The [_func](https://github.com/golang/go/blob/9baddd3f21230c55f0ad2a10f5f20579dcf0a0bb/src/runtime/runtime2.go#L825) metadata that we just retrieved contains a `pcsp` offset into the `pctab` table that maps program counters to stack pointer deltas. To decode this information, we call [`funcspdelta()`](https://github.com/golang/go/blob/go1.16.3/src/runtime/symtab.go#L903) which does an `O(N)` linear search over all program counters of the function until it finds the (`pc`, `sp delta`) pair we are looking for. For stacks with recursive call cycles, a tiny program counter cache is used to avoid doing lots of duplicated work.
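For illustration, here is a rough sketch of what that lookup amounts to. The `pcDelta` type and `spDeltaForPC` function are invented for this example and only model the search behaviour, not the runtime's actual compact `pctab` encoding or API:

```go
package main

import "fmt"

// pcDelta is a hypothetical, already-decoded entry of a pcsp-style table:
// the stack pointer delta applies from startPC until the next entry's
// startPC. The real pctab uses a compact varint delta encoding; this sketch
// only models the lookup behaviour.
type pcDelta struct {
	startPC uintptr
	spDelta int32
}

// spDeltaForPC mirrors the O(N) linear search funcspdelta() performs over a
// function's program counters: scan the pairs until the target pc falls into
// an entry's range.
func spDeltaForPC(entries []pcDelta, pc uintptr) (int32, bool) {
	for i, e := range entries {
		end := ^uintptr(0) // the last entry covers the rest of the function
		if i+1 < len(entries) {
			end = entries[i+1].startPC
		}
		if pc >= e.startPC && pc < end {
			return e.spDelta, true
		}
	}
	return 0, false
}

func main() {
	// Made-up table for a function starting at 0x4000: the frame grows to
	// 0x48 bytes after the prologue ends at 0x4010.
	table := []pcDelta{{0x4000, 0}, {0x4010, 0x48}}
	delta, ok := spDeltaForPC(table, 0x4023)
	fmt.Println(delta, ok) // 72 true
}
```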
Now that that we have the stack pointer delta, we we are almost ready to locate the next `return address (pc)` value of the caller and do the same lookup for it until we reach the "bottom" of the stack. However, we before that we need to check if the current `pc` is part of one or more inlined function calls. This is done by checking the `_FUNCDATA_InlTree` data for the current `_func` and doing another linear search over the (`pc`, `inline index`) pairs that exist. Any inlined call found this way gets virtual stack frame `pc` added to the list. Then we continue with `return address (pc)` as mentioned in the beginning of the paragraph.
|
||||
Now that that we have the stack pointer delta, we we are almost ready to locate the next `return address (pc)` value of the caller and do the same lookup for it until we reach the "bottom" of the stack. But before that, we need to check if the current `pc` is part of one or more inlined function calls. This is done by checking the `_FUNCDATA_InlTree` data for the current `_func` and doing another linear search over the (`pc`, `inline index`) pairs that exist. Any inlined call found this way gets its own virtual stack frame `pc` added to the list. Then we continue with `return address (pc)` as mentioned in the beginning of the paragraph.
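Purely as an illustration of that inline expansion step, here is a sketch. The `inlinedCall` type and `expandInlineFrames` function are invented names; the real `_FUNCDATA_InlTree` encoding and the exact frame ordering differ:

```go
package main

import "fmt"

// inlinedCall is a hypothetical stand-in for an entry of the function's
// inline tree data: a pc range of the physical function that belongs to an
// inlined callee, plus a synthetic pc used as its virtual stack frame.
type inlinedCall struct {
	startPC, endPC uintptr
	virtualPC      uintptr
}

// expandInlineFrames does another linear search, in the spirit of the step
// described above: every inlined call whose pc range covers the current pc
// contributes a virtual frame pc, and the physical frame's pc is appended
// last, before the unwinder moves on to the caller's return address.
func expandInlineFrames(tree []inlinedCall, pc uintptr, frames []uintptr) []uintptr {
	for _, ic := range tree {
		if pc >= ic.startPC && pc < ic.endPC {
			frames = append(frames, ic.virtualPC)
		}
	}
	return append(frames, pc)
}

func main() {
	tree := []inlinedCall{
		{startPC: 0x5000, endPC: 0x5040, virtualPC: 0x9001}, // outer inlined call
		{startPC: 0x5010, endPC: 0x5020, virtualPC: 0x9002}, // nested inlined call
	}
	fmt.Printf("%#x\n", expandInlineFrames(tree, 0x5014, nil))
}
```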
Putting it all together, for non-recursive call stacks without inlining, the complexity for `gopclntab` unwinding is `O(N*M)` where `N` is the number of frames on the stack, and `M` is the average size of the generated machine code per function. This can be validated [experimentally](https://github.com/DataDog/go-profiler-notes/tree/main/examples/stack-unwind-overhead), but in the real world I'd expect the average `N` and `M` to be fairly similar for most non-trivial Go applications, so unwinding a stack (without symbolization) will generally cost `1-10µs`. That being said, naive frame pointer unwinding appears to be [50x faster](https://github.com/felixge/gounwind), and does less cache thrashing, so high-resolution profiling and tracing use cases would likely benefit from seeing [support for it in the core](https://github.com/golang/go/issues/16638).
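To make the contrast concrete, below is a toy sketch of the frame pointer unwinding idea, similar in spirit to the linked gounwind experiment but not its API. The stack layout assumed here (caller's frame pointer at `[fp]`, return address at `[fp+8]`) is the amd64-style convention, and the memory-reader callback and addresses are made up for the example:

```go
package main

import "fmt"

// unwindByFramePointer walks a frame pointer chain: each frame stores the
// caller's frame pointer at [fp] and the return address at [fp+8]. Every
// frame costs two reads, independent of the size of the function's machine
// code, which is why this approach avoids the per-frame table searches that
// gopclntab unwinding needs.
func unwindByFramePointer(fp uintptr, readWord func(addr uintptr) uintptr, max int) []uintptr {
	var pcs []uintptr
	for fp != 0 && len(pcs) < max {
		pcs = append(pcs, readWord(fp+8)) // return address of this frame
		fp = readWord(fp)                 // saved caller frame pointer
	}
	return pcs
}

func main() {
	// Fake stack memory for demonstration: the frame at 0x1000 links to
	// 0x1040, which links to 0x1080, which terminates the chain with fp == 0.
	mem := map[uintptr]uintptr{
		0x1000: 0x1040, 0x1008: 0x400100, // frame 0: saved fp, return pc
		0x1040: 0x1080, 0x1048: 0x400200, // frame 1
		0x1080: 0x0000, 0x1088: 0x400300, // frame 2 ("bottom" of the stack)
	}
	read := func(addr uintptr) uintptr { return mem[addr] }
	for _, pc := range unwindByFramePointer(0x1000, read, 64) {
		fmt.Printf("pc=%#x\n", pc)
	}
}
```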