The Go runtime keeps track of all goroutines in a simple slice called [allgs](https://github.com/golang/go/blob/3a778ff50f7091b8a64875c8ed95bfaacf3d334c/src/runtime/proc.go#L500). It contains both active and dead goroutines. The latter are kept around for reuse when new goroutines are spawned.
Go has various APIs to inspect the active goroutines in `allgs` along with their current stack trace, as well as various other properties. Some APIs expose this information as statistical summaries, while other APIs provide information for each individual goroutine.
Despite the differences between the APIs, the [common](https://github.com/golang/go/blob/9b955d2d3fcff6a5bc8bce7bafdc4c634a28e95b/src/runtime/mprof.go#L729) [definition](https://github.com/golang/go/blob/9b955d2d3fcff6a5bc8bce7bafdc4c634a28e95b/src/runtime/traceback.go#L931) of an "active" goroutine seems to be:
- It's not [`dead`](https://github.com/golang/go/blob/go1.15.6/src/runtime/runtime2.go#L65-L71)
- It's not a [system goroutine](https://github.com/golang/go/blob/9b955d2d3fcff6a5bc8bce7bafdc4c634a28e95b/src/runtime/traceback.go#L1013-L1021) or the finalizer goroutine.
In other words, goroutines that are running as well as those waiting on I/O, locks, channels, scheduling, etc. are all considered to be "active", even though one might naively not think of the latter as such.
All goroutine profiling available in Go requires an `O(N)` **stop-the-world** phase where `N` is the number of allocated goroutines. A [naive benchmark](https://github.com/felixge/fgprof/blob/fe01e87ceec08ea5024e8168f88468af8f818b62/fgprof_test.go#L35-L78) [indicates](https://github.com/felixge/fgprof/blob/master/BenchmarkProfilerGoroutines.txt) that the world is stopped for ~1µs per goroutine when using the [runtime.GoroutineProfile()](https://golang.org/pkg/runtime/#GoroutineProfile) API. But this number is likely to fluctuate in response to factors such as the average stack depth of the program, the number of dead goroutines, etc.
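To get a rough feel for this cost on your own machine, a minimal sketch along the following lines may help. It is not the linked fgprof benchmark: the helper names `spawnParked` and `timeGoroutineProfile` are made up for illustration, and the measured wall time of the call is only an upper bound on the stop-the-world phase, since it also includes work done outside of it.

```go
package main

import (
	"fmt"
	"runtime"
	"time"
)

// spawnParked (hypothetical helper) starts n goroutines that block on a
// channel until the returned stop function is called.
func spawnParked(n int) (stop func()) {
	done := make(chan struct{})
	for i := 0; i < n; i++ {
		go func() { <-done }()
	}
	return func() { close(done) }
}

// timeGoroutineProfile (hypothetical helper) returns the number of records
// and the wall time of one successful runtime.GoroutineProfile call.
func timeGoroutineProfile() (int, time.Duration) {
	// Leave some headroom: goroutines may be created between the
	// NumGoroutine call and the GoroutineProfile call.
	records := make([]runtime.StackRecord, runtime.NumGoroutine()+64)
	for {
		start := time.Now()
		n, ok := runtime.GoroutineProfile(records)
		elapsed := time.Since(start)
		if ok {
			return n, elapsed
		}
		// The slice was too small; n is the size that was needed.
		records = make([]runtime.StackRecord, n+64)
	}
}

func main() {
	for _, n := range []int{100, 1000, 10000} {
		stop := spawnParked(n)
		count, elapsed := timeGoroutineProfile()
		fmt.Printf("goroutines=%d total=%v per-goroutine=%v\n",
			count, elapsed, elapsed/time.Duration(count))
		stop()
	}
}
```

Goroutines from an earlier iteration may not have exited yet when the next measurement runs, so the reported counts can be higher than the number just spawned.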
As a rule of thumb, applications that are extremely latency sensitive and make use of thousands of active goroutines might want to be a little careful with goroutine profiling in production. That being said, a large number of goroutines, and perhaps even Go itself, might not be a good idea for such applications to begin with.
Most applications that don't spawn crazy amounts of goroutines and can tolerate a few ms of occasional extra latency should have no issues with continuous goroutine profiling in production.
Goroutines have a lot of [properties](https://github.com/golang/go/blob/go1.15.6/src/runtime/runtime2.go#L406-L486) that can help to debug Go applications. The ones below are particularly interesting and are exposed to varying degrees via the APIs described later in this document.
- [`goid`](https://github.com/golang/go/blob/go1.15.6/src/runtime/runtime2.go#L428): The unique id of the goroutine, the main goroutine has id `1`.
- [`atomicstatus`](https://github.com/golang/go/blob/go1.15.6/src/runtime/runtime2.go#L14-L105): The status of the goroutine, one of the following:
  - `idle`: just got allocated
  - `runnable`: on a run queue, waiting to be scheduled
  - `running`: executing on an OS thread
  - `syscall`: blocked on a syscall
  - `waiting`: parked by the scheduler, see `g.waitreason`
  - `dead`: just exited or being reinitialized
  - `copystack`: stack is currently being moved
  - `preempted`: just preempted itself
- [`waitreason`](https://github.com/golang/go/blob/go1.15.6/src/runtime/runtime2.go#L996-L1024): The reason a goroutine is in `waiting` status, e.g. sleep, channel operations, i/o, gc, etc.
- [`waitsince`](https://github.com/golang/go/blob/go1.15.6/src/runtime/runtime2.go#L430): The approximate timestamp at which a goroutine entered `waiting` or `syscall` status, as determined by the first gc after the wait started.
- [`labels`](https://github.com/golang/go/blob/go1.15.6/src/runtime/runtime2.go#L472): A set of key/value [profiler labels](https://rakyll.org/profiler-labels/) that can be attached to goroutines.
- `stack trace`: The function that is currently being executed as well as its callers. This is exposed as either a plain text output of filenames, function names and line numbers or a slice of program counter addresses (pcs). 🚧 *Research more details on this, e.g. can func/file/line text be converted to pcs?*
- [`gopc`](https://github.com/golang/go/blob/go1.15.6/src/runtime/runtime2.go#L466): The program counter address (pc) of the `go ...` call that caused this goroutine to be created. Can be converted to the file, function name and line number.
- [`lockedm`](https://github.com/golang/go/blob/go1.15.6/src/runtime/runtime2.go#L460): The thread this goroutine is locked to, if any.
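Most of these properties are internal to the runtime, but labels can be set from regular application code. The sketch below shows one way to attach a label with `pprof.Do` and read it back with `pprof.Label`. The helper name `labelInside` and the `request_id` key are assumptions chosen for illustration.

```go
package main

import (
	"context"
	"fmt"
	"runtime/pprof"
)

// labelInside (hypothetical helper) runs a function with the given profiler
// label attached and returns the label value that is visible inside it.
func labelInside(key, value string) (string, bool) {
	var got string
	var ok bool
	// pprof.Do attaches the labels to the current goroutine for the
	// duration of the callback and restores the previous labels afterwards.
	pprof.Do(context.Background(), pprof.Labels(key, value), func(ctx context.Context) {
		got, ok = pprof.Label(ctx, key)
	})
	return got, ok
}

func main() {
	got, ok := labelInside("request_id", "42")
	fmt.Println(got, ok)
}
```

Goroutines started inside the callback inherit the labels of the goroutine that created them.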
The feature matrix below gives you a quick idea of the current availability of these properties through the various APIs. Also available as a [Google Sheet](https://docs.google.com/spreadsheets/d/1txMRjhDA0NC9eSNRRUMMFI5uWJ3FBnACGVjXYT1gKig/edit?usp=sharing).
- The output format is also text based, but looks very different from `debug=2`.
Below is a truncated example of the returned output, see [2.pprof.lookup.goroutine.debug1.txt](./examples/goroutine/2.pprof.lookup.goroutine.debug1.txt) for a full example.
This profiling method is invoked the same way as `pprof.Lookup(debug=1)`, and produces the same data. The only difference is that the data format is the [pprof](./pprof.md) protocol buffer format.
Below is a truncated example of the returned output as reported by `go tool pprof -raw`, see [2.pprof.lookup.goroutine.debug0.pb.gz](./examples/goroutine/2.pprof.lookup.goroutine.debug0.pb.gz) for a full example.
This function essentially returns a slice of all active goroutines and their current stack trace. The stack traces are given in the form of program addresses which can be resolved to function names using [`runtime.CallersFrames()`](https://golang.org/pkg/runtime/#CallersFrames).
This method is used by [fgprof](https://github.com/felixge/fgprof) to implement wall clock profiling.
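A minimal sketch of calling `runtime.GoroutineProfile()` and resolving the program counters with `runtime.CallersFrames()` could look like this. The helper name `goroutineStacks` and the headroom of 16 extra records are assumptions for illustration, not how fgprof does it.

```go
package main

import (
	"fmt"
	"runtime"
)

// goroutineStacks (hypothetical helper) returns the function names on the
// stack of every active goroutine, one slice per goroutine.
func goroutineStacks() [][]string {
	// The record slice must be allocated by the caller. If it turns out
	// to be too small, ok is false and n reports the required size.
	records := make([]runtime.StackRecord, runtime.NumGoroutine()+16)
	for {
		n, ok := runtime.GoroutineProfile(records)
		if ok {
			records = records[:n]
			break
		}
		records = make([]runtime.StackRecord, n+16)
	}

	stacks := make([][]string, 0, len(records))
	for _, r := range records {
		// Stack() returns the program counters; CallersFrames resolves
		// them to function, file and line information.
		frames := runtime.CallersFrames(r.Stack())
		var names []string
		for {
			frame, more := frames.Next()
			if frame.Function != "" {
				names = append(names, frame.Function)
			}
			if !more {
				break
			}
		}
		stacks = append(stacks, names)
	}
	return stacks
}

func main() {
	for i, stack := range goroutineStacks() {
		fmt.Printf("goroutine record %d: %v\n", i, stack)
	}
}
```

Note that the records carry no goroutine id or status, only the stack.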
The following features are not available, but might be interesting to propose to the Go project in the future:
- Include goroutine properties outlined above that are not available yet, especially labels.
- Filter by pprof labels. This could reduce stop-the-world time, but would require additional bookkeeping by the runtime.
- Limit the number of returned goroutines to a random subset. This could also reduce stop-the-world time and might be easier to implement than filtering by label.
Below is a truncated example of the returned output, see [2.runtime.goroutineprofile.json](./examples/goroutine/2.runtime.goroutineprofile.json) for a full example.
This package exposes the [`pprof.Lookup("goroutine")`](https://golang.org/pkg/runtime/pprof/#Lookup) profiles described above via HTTP endpoints. The output is identical.
Goroutine profiling was [implemented](https://codereview.appspot.com/5687076/) by [Russ Cox](https://github.com/rsc) and first appeared in the [weekly.2012-02-22](https://golang.org/doc/devel/weekly.html#2012-02-22) release prior to go1.
I'm [felixge](https://github.com/felixge) and work at [Datadog](https://www.datadoghq.com/) on [Continuous Profiling](https://www.datadoghq.com/product/code-profiling/) for Go. You should check it out. We're also [hiring](https://www.datadoghq.com/jobs-engineering/#all&all_locations) : ).