Add observer metric (#474)

* wip: observers
* wip: float observers
* fix copy pasta
* wip: rework observers in sdk
* small fix in global meter
* wip: aggregators and selectors
* wip: monotonicity option for observers
* some refactor
* wip: docs
  needs more package docs (especially for api/metric and sdk/metric)
* fix ci
* Fix copy-pasta in docs
  Co-Authored-By: Mauricio Vásquez <mauricio@kinvolk.io>
* recycle unused recorders in observers
  if a recorder for a labelset is unused for a second collection cycle in a row, drop it
* unregister
* thread-safe set callback
* Fix docs
* Revert "wip: aggregators and selectors"
  This reverts commit 37b7d05aed5dc90f6d5593325b6eb77494e21736.
* update selector
* tests
* Rework number equality
  Compare concrete numbers, so we can get actual numbers in the error message when they are not equal, not some uint64 representation. This also uses InDelta for comparing floats.
* Ensure that Observers are registered in the same order
* Run observers in fixed order
  So the tests can be reproducible - iterating a map made the order of measurements random.
* Ensure the proper alignment of the delegates
  This wasn't checked at all. After adding the checks, the test-386 failed.
* Small tweaks to the global meter test
* Ensure proper alignment of the callback pointer
  test-386 was complaining about it
* update docs
* update a TODO
* address review issues
* drop SetCallback

Co-authored-by: Mauricio Vásquez <mauricio@kinvolk.io>
Co-authored-by: Rahul Patel <rghetia@yahoo.com>
2025-11-27 22:49:15 +02:00 · 2020-03-05 12:15:30 -08:00
parent 547d584da8
commit a202f16100
16 changed files with 1072 additions and 124 deletions
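
Editor's sketch (not part of the commit): the commit message and the doc.go text in this change describe observer recorders that are kept per label set, stamped with the epoch of their last update, and dropped once they go unused for a second collection cycle in a row. The Go program below is a minimal illustration of that idea only. The names `recorder`, `observerRecord`, `observe` and `collect` are hypothetical, label sets are reduced to plain strings, and the "two missed cycles" threshold is one reading of the commit message rather than the SDK's actual code.

```go
package main

import (
	"fmt"
	"sort"
)

// recorder is a hypothetical stand-in for the per-label-set state of an
// observer. The real SDK types are not shown in this diff.
type recorder struct {
	value       float64
	lastUpdated int64 // epoch in which the callback last reported this label set
}

// observerRecord keeps one recorder for every label set used in the callback.
type observerRecord struct {
	recorders map[string]*recorder
}

func newObserverRecord() *observerRecord {
	return &observerRecord{recorders: make(map[string]*recorder)}
}

// observe is what an observer callback would invoke for each observation.
// It creates the recorder on first use and stamps it with the current epoch.
func (o *observerRecord) observe(labels string, v float64, epoch int64) {
	r, ok := o.recorders[labels]
	if !ok {
		r = &recorder{}
		o.recorders[labels] = r
	}
	r.value = v
	r.lastUpdated = epoch
}

// collect returns the label sets updated in this epoch, sorted so the
// result does not depend on map iteration order, and drops recorders
// that were not updated in this epoch or the previous one.
func (o *observerRecord) collect(epoch int64) []string {
	var exported []string
	for labels, r := range o.recorders {
		switch {
		case r.lastUpdated == epoch:
			exported = append(exported, labels)
		case epoch-r.lastUpdated >= 2:
			// Deleting the current key while ranging over a map is allowed in Go.
			delete(o.recorders, labels)
		}
	}
	sort.Strings(exported)
	return exported
}

func main() {
	o := newObserverRecord()

	// Epoch 1: the callback reports two label sets.
	o.observe("host=a", 1, 1)
	o.observe("host=b", 2, 1)
	fmt.Println(o.collect(1), len(o.recorders))

	// Epoch 2: host=b is no longer reported, but its recorder is kept.
	o.observe("host=a", 3, 2)
	fmt.Println(o.collect(2), len(o.recorders))

	// Epoch 3: host=b has missed two cycles in a row and is dropped.
	o.observe("host=a", 4, 3)
	fmt.Println(o.collect(3), len(o.recorders))
}
```

Sorting the exported label sets mirrors the "Run observers in fixed order" note in the commit message: ranging over a Go map yields an unspecified order, so a test that compares collected measurements needs some fixed ordering imposed on top.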
--- a/sdk/metric/doc.go
+++ b/sdk/metric/doc.go
@@ -18,10 +18,10 @@ supports configurable metrics export behavior through a collection of
 export interfaces that support various export strategies, described below.

 The metric.Meter API consists of methods for constructing each of the
-basic kinds of metric instrument.  There are six types of instrument
-available to the end user, comprised of three basic kinds of metric
-instrument (Counter, Gauge, Measure) crossed with two kinds of number
-(int64, float64).
+basic kinds of metric instrument.  There are eight types of instrument
+available to the end user, comprised of four basic kinds of metric
+instrument (Counter, Gauge, Measure, Observer) crossed with two kinds
+of number (int64, float64).

 The API assists the SDK by consolidating the variety of metric instruments
 into a narrower interface, allowing the SDK to avoid repetition of
@@ -31,17 +31,25 @@ numerical value.

 To this end, the API uses a core.Number type to represent either an int64
 or a float64, depending on the instrument's definition.  A single
-implementation interface is used for instruments, metric.InstrumentImpl,
-and a single implementation interface is used for handles,
-metric.HandleImpl.
+implementation interface is used for counter, gauge and measure
+instruments, metric.InstrumentImpl, and a single implementation interface
+is used for their handles, metric.HandleImpl. For observers, the API
+defines interfaces, for which the SDK provides an implementation.

-There are three entry points for events in the Metrics API: via instrument
-handles, via direct instrument calls, and via BatchRecord.  The SDK is
-designed with handles as the primary entry point, the other two entry
-points are implemented in terms of short-lived handles.  For example, the
-implementation of a direct call allocates a handle, operates on the
-handle, and releases the handle. Similarly, the implementation of
-RecordBatch uses a short-lived handle for each measurement in the batch.
+There are four entry points for events in the Metrics API - three for
+synchronous instruments (counters, gauges and measures) and one for
+asynchronous instruments (observers). The entry points for synchronous
+instruments are: via instrument handles, via direct instrument calls, and
+via BatchRecord.  The SDK is designed with handles as the primary entry
+point, the other two entry points are implemented in terms of short-lived
+handles.  For example, the implementation of a direct call allocates a
+handle, operates on the handle, and releases the handle. Similarly, the
+implementation of RecordBatch uses a short-lived handle for each
+measurement in the batch.  The entry point for asynchronous instruments is
+via observer callbacks.  Observer callbacks behave like a set of instrument
+handles - one for each observation for a distinct label set.  The observer
+handles are alive as long as they are used.  If the callback stops
+reporting values for a certain label set, the associated handle is dropped.

 Internal Structure

@@ -51,6 +59,10 @@ user-level code or a short-lived device, there exists an internal record
 managed by the SDK.  Each internal record corresponds to a specific
 instrument and label set combination.

+Each observer also has its own kind of record stored in the SDK. This
+record contains a set of recorders for every specific label set used in the
+callback.
+
 A sync.Map maintains the mapping of current instruments and label sets to
 internal records.  To create a new handle, the SDK consults the Map to
 locate an existing record, otherwise it constructs a new record.  The SDK
@@ -61,31 +73,18 @@ from the user's perspective.
 Metric collection is performed via a single-threaded call to Collect that
 sweeps through all records in the SDK, checkpointing their state.  When a
 record is discovered that has no references and has not been updated since
-the prior collection pass, it is marked for reclamation and removed from
-the Map.  There exists, at this moment, a race condition since another
-goroutine could, in the same instant, obtain a reference to the handle.
-
-The SDK is designed to tolerate this sort of race condition, in the name
-of reducing lock contention.  It is possible for more than one record with
-identical instrument and label set to exist simultaneously, though only
-one can be linked from the Map at a time.  To avoid lost updates, the SDK
-maintains two additional linked lists of records, one managed by the
-collection code path and one managed by the instrumentation code path.
+the prior collection pass, it is removed from the Map.

 The SDK maintains a current epoch number, corresponding to the number of
-completed collections.  Each record contains the last epoch during which
-it was collected and updated.  These variables allow the collection code
-path to detect stale records while allowing the instrumentation code path
-to detect potential reclamations.  When the instrumentation code path
-detects a potential reclamation, it adds itself to the second linked list,
-where records are saved from reclamation.
+completed collections.  Each recorder of an observer record contains the
+last epoch during which it was updated.  This variable allows the collection
+code path to detect stale recorders and remove them.

-Each record has an associated aggregator, which maintains the current
-state resulting from all metric events since its last checkpoint.
-Aggregators may be lock-free or they may use locking, but they should
-expect to be called concurrently.  Because of the tolerated race condition
-described above, aggregators must be capable of merging with another
-aggregator of the same type.
+Each record of a handle and recorder of an observer has an associated
+aggregator, which maintains the current state resulting from all metric
+events since its last checkpoint.  Aggregators may be lock-free or they may
+use locking, but they should expect to be called concurrently.  Aggregators
+must be capable of merging with another aggregator of the same type.

 Export Pipeline