Fixes https://github.com/open-telemetry/opentelemetry-go/issues/5769

Related spec PR: https://github.com/open-telemetry/opentelemetry-specification/pull/4203

Remark: A follow-up in contrib is required, and afterwards here, to fix the `example/dice`.

Benchstat results for `sdk/log` (`log` has no benchmarks related to Enabled):

```
goos: linux
goarch: amd64
pkg: go.opentelemetry.io/otel/sdk/log
cpu: Intel(R) Core(TM) i9-10885H CPU @ 2.40GHz
                                   │   old.txt    │               new.txt                │
                                   │    sec/op    │    sec/op     vs base                │
BatchProcessorOnEmit-16              398.8n ± 10%   395.6n ± 12%        ~ (p=0.971 n=10)
Processor/Simple-16                  882.2n ±  8%   869.8n ±  9%        ~ (p=0.811 n=10)
Processor/Batch-16                   1.478µ ±  3%   1.485µ ±  5%        ~ (p=0.646 n=10)
Processor/SetTimestampSimple-16      847.8n ±  1%   844.6n ±  3%        ~ (p=0.247 n=10)
Processor/SetTimestampBatch-16       1.480µ ±  3%   1.473µ ±  4%        ~ (p=0.700 n=10)
Processor/AddAttributesSimple-16     930.0n ±  1%   933.8n ±  1%        ~ (p=0.172 n=10)
Processor/AddAttributesBatch-16      1.624µ ±  2%   1.639µ ±  2%        ~ (p=0.839 n=10)
Processor/SetAttributesSimple-16     903.4n ±  1%   895.1n ±  1%        ~ (p=0.190 n=10)
Processor/SetAttributesBatch-16      1.554µ ±  4%   1.529µ ±  3%        ~ (p=0.159 n=10)
LoggerNewRecord/5_attributes-16      346.0n ±  2%   343.3n ±  2%        ~ (p=0.448 n=10)
LoggerNewRecord/10_attributes-16     1.608µ ±  6%   1.503µ ±  2%   -6.53% (p=0.007 n=10)
LoggerEnabled-16                    34.305n ±  8%   6.706n ±  1%  -80.45% (p=0.000 n=10)
LoggerProviderLogger-16              636.9n ± 10%   605.8n ±  3%        ~ (p=0.105 n=10)
WalkAttributes/1_attributes-16       5.363n ±  3%   4.540n ± 14%  -15.34% (p=0.002 n=10)
WalkAttributes/10_attributes-16      5.436n ±  7%   4.461n ±  2%  -17.95% (p=0.000 n=10)
WalkAttributes/100_attributes-16     5.126n ±  9%   4.465n ±  1%  -12.90% (p=0.000 n=10)
WalkAttributes/1000_attributes-16    5.316n ±  9%   4.502n ±  5%  -15.32% (p=0.002 n=10)
SetAddAttributes/SetAttributes-16    220.5n ± 18%   192.6n ± 11%  -12.67% (p=0.007 n=10)
SetAddAttributes/AddAttributes-16    165.3n ± 21%   127.3n ± 22%  -22.96% (p=0.011 n=10)
SimpleProcessorOnEmit-16             2.159n ±  9%   2.167n ±  9%        ~ (p=0.739 n=10)
geomean                              178.3n         154.5n        -13.31%

                          │   old.txt    │               new.txt                │
                          │     B/s      │     B/s       vs base                │
BatchProcessorOnEmit-16     76.52Mi ± 11%   77.14Mi ± 14%       ~ (p=0.971 n=10)

                                   │   old.txt    │               new.txt                │
                                   │     B/op     │     B/op      vs base                │
BatchProcessorOnEmit-16              0.000 ± 0%      0.000 ± 0%        ~ (p=1.000 n=10) ¹
Processor/Simple-16                  417.0 ± 0%      417.0 ± 0%        ~ (p=1.000 n=10) ¹
Processor/Batch-16                 1.093Ki ± 1%    1.088Ki ± 1%        ~ (p=0.254 n=10)
Processor/SetTimestampSimple-16      417.0 ± 0%      417.0 ± 0%        ~ (p=1.000 n=10) ¹
Processor/SetTimestampBatch-16     1.095Ki ± 1%    1.084Ki ± 2%        ~ (p=0.361 n=10)
Processor/AddAttributesSimple-16     417.0 ± 0%      417.0 ± 0%        ~ (p=1.000 n=10) ¹
Processor/AddAttributesBatch-16    1.085Ki ± 1%    1.086Ki ± 1%        ~ (p=1.000 n=10)
Processor/SetAttributesSimple-16     465.0 ± 0%      465.0 ± 0%        ~ (p=1.000 n=10) ¹
Processor/SetAttributesBatch-16    1.129Ki ± 1%    1.125Ki ± 1%        ~ (p=0.084 n=10)
LoggerNewRecord/5_attributes-16      0.000 ± 0%      0.000 ± 0%        ~ (p=1.000 n=10) ¹
LoggerNewRecord/10_attributes-16     610.0 ± 0%      610.0 ± 0%        ~ (p=1.000 n=10) ¹
LoggerEnabled-16                     0.000 ± 0%      0.000 ± 0%        ~ (p=1.000 n=10) ¹
LoggerProviderLogger-16              359.0 ± 6%      346.0 ± 3%        ~ (p=0.117 n=10)
WalkAttributes/1_attributes-16       0.000 ± 0%      0.000 ± 0%        ~ (p=1.000 n=10) ¹
WalkAttributes/10_attributes-16      0.000 ± 0%      0.000 ± 0%        ~ (p=1.000 n=10) ¹
WalkAttributes/100_attributes-16     0.000 ± 0%      0.000 ± 0%        ~ (p=1.000 n=10) ¹
WalkAttributes/1000_attributes-16    0.000 ± 0%      0.000 ± 0%        ~ (p=1.000 n=10) ¹
SetAddAttributes/SetAttributes-16    48.00 ± 0%      48.00 ± 0%        ~ (p=1.000 n=10) ¹
SetAddAttributes/AddAttributes-16    0.000 ± 0%      0.000 ± 0%        ~ (p=1.000 n=10) ¹
SimpleProcessorOnEmit-16             0.000 ± 0%      0.000 ± 0%        ~ (p=1.000 n=10) ¹
geomean                                         ²                 -0.27%               ²
¹ all samples are equal
² summaries must be >0 to compute geomean

                                   │   old.txt    │               new.txt                │
                                   │  allocs/op   │  allocs/op    vs base                │
BatchProcessorOnEmit-16              0.000 ± 0%      0.000 ± 0%        ~ (p=1.000 n=10) ¹
Processor/Simple-16                  1.000 ± 0%      1.000 ± 0%        ~ (p=1.000 n=10) ¹
Processor/Batch-16                   1.000 ± 0%      1.000 ± 0%        ~ (p=1.000 n=10) ¹
Processor/SetTimestampSimple-16      1.000 ± 0%      1.000 ± 0%        ~ (p=1.000 n=10) ¹
Processor/SetTimestampBatch-16       1.000 ± 0%      1.000 ± 0%        ~ (p=1.000 n=10) ¹
Processor/AddAttributesSimple-16     1.000 ± 0%      1.000 ± 0%        ~ (p=1.000 n=10) ¹
Processor/AddAttributesBatch-16      1.000 ± 0%      1.000 ± 0%        ~ (p=1.000 n=10) ¹
Processor/SetAttributesSimple-16     2.000 ± 0%      2.000 ± 0%        ~ (p=1.000 n=10) ¹
Processor/SetAttributesBatch-16      2.000 ± 0%      2.000 ± 0%        ~ (p=1.000 n=10) ¹
LoggerNewRecord/5_attributes-16      0.000 ± 0%      0.000 ± 0%        ~ (p=1.000 n=10) ¹
LoggerNewRecord/10_attributes-16     4.000 ± 0%      4.000 ± 0%        ~ (p=1.000 n=10) ¹
LoggerEnabled-16                     0.000 ± 0%      0.000 ± 0%        ~ (p=1.000 n=10) ¹
LoggerProviderLogger-16              1.000 ± 0%      1.000 ± 0%        ~ (p=1.000 n=10) ¹
WalkAttributes/1_attributes-16       0.000 ± 0%      0.000 ± 0%        ~ (p=1.000 n=10) ¹
WalkAttributes/10_attributes-16      0.000 ± 0%      0.000 ± 0%        ~ (p=1.000 n=10) ¹
WalkAttributes/100_attributes-16     0.000 ± 0%      0.000 ± 0%        ~ (p=1.000 n=10) ¹
WalkAttributes/1000_attributes-16    0.000 ± 0%      0.000 ± 0%        ~ (p=1.000 n=10) ¹
SetAddAttributes/SetAttributes-16    1.000 ± 0%      1.000 ± 0%        ~ (p=1.000 n=10) ¹
SetAddAttributes/AddAttributes-16    0.000 ± 0%      0.000 ± 0%        ~ (p=1.000 n=10) ¹
SimpleProcessorOnEmit-16             0.000 ± 0%      0.000 ± 0%        ~ (p=1.000 n=10) ¹
geomean                                         ²                 +0.00%               ²
¹ all samples are equal
² summaries must be >0 to compute geomean
```
# Logs Bridge API

## Abstract

`go.opentelemetry.io/otel/log` provides the Logs Bridge API.
The prototype was created in #4725.
## Background

The key challenge is to create a performant API that is compliant with the specification and has an intuitive, user-friendly design. Performance is seen as one of the most important characteristics of logging libraries in Go.

## Design
This proposed design aims to:

- be specification compliant,
- be similar to the Trace and Metrics API,
- take advantage of both OpenTelemetry and `slog` experience to achieve acceptable performance.
### Module structure

The API is published as a single `go.opentelemetry.io/otel/log` Go module. The module name is compliant with Artifact Naming and the package structure is the same as for the Trace API and Metrics API.

The Go module consists of the following packages:

- `go.opentelemetry.io/otel/log`
- `go.opentelemetry.io/otel/log/embedded`
- `go.opentelemetry.io/otel/log/noop`
Rejected alternative: [Reuse slog](#reuse-slog)
### LoggerProvider

The `LoggerProvider` abstraction is defined as the `LoggerProvider` interface in `provider.go`.

The specification may add new operations to `LoggerProvider`. The interface may have methods added without a package major version bump. It embeds `embedded.LoggerProvider` to help inform API implementation authors about this non-standard API evolution. This approach is already used in the Trace API and Metrics API.
#### LoggerProvider.Logger

The `Logger` method implements the Get a Logger operation. The required `name` parameter is accepted as a `string` method argument. The `LoggerOption` options are defined to support optional parameters.
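The options pattern can be sketched with Go functional options. The names below mirror the API, but the internal layout (unexported fields, an `optFunc` adapter, a hypothetical `NewLoggerConfig` resolver) is an illustrative assumption, not the actual source:

```go
package main

import "fmt"

// LoggerConfig holds the optional Logger parameters (sketch).
type LoggerConfig struct {
	version   string
	schemaURL string
}

// LoggerOption applies a configuration value to a LoggerConfig.
type LoggerOption interface {
	apply(LoggerConfig) LoggerConfig
}

// optFunc adapts a plain function to the LoggerOption interface.
type optFunc func(LoggerConfig) LoggerConfig

func (f optFunc) apply(c LoggerConfig) LoggerConfig { return f(c) }

// WithInstrumentationVersion sets the instrumentation version.
func WithInstrumentationVersion(v string) LoggerOption {
	return optFunc(func(c LoggerConfig) LoggerConfig {
		c.version = v
		return c
	})
}

// NewLoggerConfig resolves the variadic options into a config,
// as an implementation of Logger would do internally.
func NewLoggerConfig(opts ...LoggerOption) LoggerConfig {
	var c LoggerConfig
	for _, o := range opts {
		c = o.apply(c)
	}
	return c
}

func main() {
	c := NewLoggerConfig(WithInstrumentationVersion("1.0.0"))
	fmt.Println(c.version) // 1.0.0
}
```

This shape lets new options be added later without breaking existing callers, which is why the design favors it over a config struct parameter.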
Implementation requirements:
- The specification requires the method to be safe to be called concurrently.
- The method should use some default name if the passed name is empty, in order to meet the specification's SDK requirement to return a working logger when an invalid name is passed, as well as to resemble the behavior of getting tracers and meters.
`Logger` can be extended by adding new `LoggerOption` options and adding new exported fields to the `LoggerConfig` struct. This design is already used in the Trace API for getting tracers and in the Metrics API for getting meters.
Rejected alternative: [Passing struct as parameter to LoggerProvider.Logger](#passing-struct-as-parameter-to-loggerproviderlogger)
### Logger

The `Logger` abstraction is defined as the `Logger` interface in `logger.go`.

The specification may add new operations to `Logger`. The interface may have methods added without a package major version bump. It embeds `embedded.Logger` to help inform API implementation authors about this non-standard API evolution. This approach is already used in the Trace API and Metrics API.
#### Logger.Emit

The `Emit` method implements the Emit a LogRecord operation.

`Context` associated with the `LogRecord` is accepted as a `context.Context` method argument.

Calls to `Emit` are supposed to be on the hot path. Therefore, in order to reduce the number of heap allocations, the `LogRecord` abstraction is defined as the `Record` struct in `record.go`.
`Timestamp` is accessed using the following methods:

```go
func (r *Record) Timestamp() time.Time
func (r *Record) SetTimestamp(t time.Time)
```

`ObservedTimestamp` is accessed using the following methods:

```go
func (r *Record) ObservedTimestamp() time.Time
func (r *Record) SetObservedTimestamp(t time.Time)
```

`SeverityNumber` is accessed using the following methods:

```go
func (r *Record) Severity() Severity
func (r *Record) SetSeverity(s Severity)
```
The `Severity` type is defined in `severity.go`. The constants are based on the Displaying Severity recommendation. Additionally, `Severity[Level]` constants are defined to make the API more readable and user friendly.
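A minimal sketch of such a severity enumeration, using the base severity numbers from the OpenTelemetry Logs Data Model (the actual API defines a richer constant set, including fine-grained variants per level):

```go
package main

import "fmt"

// Severity is the log record severity number (sketch).
type Severity int

// Base severity numbers per the Logs Data Model: each level
// spans four values (TRACE 1-4, DEBUG 5-8, INFO 9-12, ...).
const (
	SeverityTrace Severity = 1
	SeverityDebug Severity = 5
	SeverityInfo  Severity = 9
	SeverityWarn  Severity = 13
	SeverityError Severity = 17
	SeverityFatal Severity = 21
)

func main() {
	// Severity numbers are ordered, so levels can be compared.
	fmt.Println(SeverityInfo < SeverityError) // true
}
```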
`SeverityText` is accessed using the following methods:

```go
func (r *Record) SeverityText() string
func (r *Record) SetSeverityText(s string)
```

`Body` is accessed using the following methods:

```go
func (r *Record) Body() Value
func (r *Record) SetBody(v Value)
```

Log record attributes are accessed using the following methods:

```go
func (r *Record) WalkAttributes(f func(KeyValue) bool)
func (r *Record) AddAttributes(attrs ...KeyValue)
```
`Record` has an `AttributesLen` method that returns the number of attributes, to allow slice preallocation when converting records to a different representation:

```go
func (r *Record) AttributesLen() int
```

The record attributes design and implementation are based on `slog.Record`. It allows achieving high-performance access and manipulation of the attributes while keeping the API user friendly. It relieves the user from implementing their own optimizations to reduce the number of allocations when passing attributes.
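The `slog.Record`-inspired storage can be sketched as a small fixed-size front array that captures the common case without heap allocation, spilling into a slice only for attribute-heavy records. The front-array size of 5 and the string-valued attributes here are assumptions for illustration:

```go
package main

import "fmt"

// KeyValue is a simplified attribute pair (the real API uses Value).
type KeyValue struct {
	Key   string
	Value string
}

// Record stores the first few attributes inline; only records with
// many attributes pay for a heap-allocated overflow slice.
type Record struct {
	front  [5]KeyValue // inline storage, no allocation
	nFront int         // number of used entries in front
	back   []KeyValue  // overflow, allocated on demand
}

// AddAttributes fills the inline array first, then spills the rest.
func (r *Record) AddAttributes(attrs ...KeyValue) {
	var i int
	for i = 0; i < len(attrs) && r.nFront < len(r.front); i++ {
		r.front[r.nFront] = attrs[i]
		r.nFront++
	}
	r.back = append(r.back, attrs[i:]...)
}

// WalkAttributes calls f on each attribute, stopping if f returns false.
func (r *Record) WalkAttributes(f func(KeyValue) bool) {
	for i := 0; i < r.nFront; i++ {
		if !f(r.front[i]) {
			return
		}
	}
	for _, a := range r.back {
		if !f(a) {
			return
		}
	}
}

// AttributesLen returns the total attribute count.
func (r *Record) AttributesLen() int { return r.nFront + len(r.back) }

func main() {
	var r Record
	r.AddAttributes(KeyValue{"a", "1"}, KeyValue{"b", "2"})
	fmt.Println(r.AttributesLen()) // 2
}
```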
The abstractions described in the specification are defined in `keyvalue.go`.

- `Value` represents `any`.
- `KeyValue` represents a key(string)-value(`any`) pair.
- `Kind` is an enumeration used for specifying the underlying value type.
- `KindEmpty` is used for an empty (zero) value.
- `KindBool` is used for a boolean value.
- `KindFloat64` is used for a double precision floating point (IEEE 754-1985) value.
- `KindInt64` is used for a signed integer value.
- `KindString` is used for a string value.
- `KindBytes` is used for a slice of bytes (in spec: a byte array).
- `KindSlice` is used for a slice of values (in spec: an array (a list) of any values).
- `KindMap` is used for a slice of key-value pairs (in spec: `map<string, any>`).

These types are defined in the `go.opentelemetry.io/otel/log` package, as they are tightly coupled with the API and different from common attributes.
The internal implementation of `Value` is based on `slog.Value` and the API is mostly inspired by `attribute.Value`. The benchmarks[^1] show that the implementation is more performant than `attribute.Value`.
The value accessors (`func (v Value) As[Kind]` methods) must not panic, as that would violate the specification:

> API methods MUST NOT throw unhandled exceptions when used incorrectly by end users. The API and SDK SHOULD provide safe defaults for missing or invalid arguments. [...] Whenever the library suppresses an error that would otherwise have been exposed to the user, the library SHOULD log the error using language-specific conventions.
Therefore, the value accessors should return a zero value and log an error when a bad accessor is called.
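A sketch of such non-panicking accessors, assuming a simplified kind-tagged `Value` with only two kinds (the real type covers all kinds listed above and would additionally log an error on mismatch):

```go
package main

import "fmt"

// Kind tags the underlying value type (sketch with two kinds).
type Kind int

const (
	KindEmpty Kind = iota
	KindInt64
	KindString
)

// Value is a kind-tagged union (simplified field layout).
type Value struct {
	kind Kind
	num  int64
	str  string
}

func IntValue(v int64) Value     { return Value{kind: KindInt64, num: v} }
func StringValue(v string) Value { return Value{kind: KindString, str: v} }

func (v Value) Kind() Kind { return v.kind }

// AsInt64 returns the int64 value, or 0 on a mismatched kind
// instead of panicking (a real implementation would also log).
func (v Value) AsInt64() int64 {
	if v.kind != KindInt64 {
		return 0
	}
	return v.num
}

// AsString returns the string value, or "" on a mismatched kind.
func (v Value) AsString() string {
	if v.kind != KindString {
		return ""
	}
	return v.str
}

func main() {
	v := StringValue("hello")
	fmt.Println(v.AsString(), v.AsInt64()) // hello 0
}
```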
The `Severity`, `Kind`, `Value`, and `KeyValue` types may implement the `fmt.Stringer` interface. However, it is not needed for the first stable release and the `String` methods can be added later.

The caller must not subsequently mutate the record passed to `Emit`. This allows the implementation to not clone the record, but simply retain, modify or discard it. The implementation may still choose to clone the record or copy its attributes if it needs to retain or modify it, e.g. in case of asynchronous processing to eliminate the possibility of data races, because the user can technically reuse the record and add new attributes after the call (even though the documentation says that the caller must not do it).
Implementation requirements:
- The specification requires the method to be safe to be called concurrently.
- The method must not interrupt the record processing if the context is canceled, per the "ignoring context cancellation" guideline.
- The specification requires using the current time as the observed timestamp if the passed one is empty.
- The method should handle the trace context passed via the `ctx` argument in order to meet the specification's SDK requirement to populate the trace context fields from the resolved context.
`Emit` can be extended by adding new exported fields to the `Record` struct.
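Putting the pieces together, a bridge might drive `Emit` roughly as follows. `Logger`, `Record`, and the console implementation here are simplified stand-ins for illustration, not the real API types:

```go
package main

import (
	"context"
	"fmt"
	"time"
)

// Record is a simplified stand-in for the API's Record struct.
type Record struct {
	timestamp time.Time
	severity  int
	body      string
}

func (r *Record) SetTimestamp(t time.Time) { r.timestamp = t }
func (r *Record) SetSeverity(s int)        { r.severity = s }
func (r *Record) SetBody(b string)         { r.body = b }

// Logger mirrors the shape of the API: Emit takes the record by value.
type Logger interface {
	Emit(ctx context.Context, r Record)
}

// consoleLogger is a toy implementation for the sketch.
type consoleLogger struct{}

func (consoleLogger) Emit(_ context.Context, r Record) {
	fmt.Println(r.severity, r.body)
}

func main() {
	var r Record
	r.SetTimestamp(time.Now())
	r.SetSeverity(9) // INFO
	r.SetBody("hello")
	// After this call the caller must not mutate r; the
	// implementation is free to retain it without cloning.
	consoleLogger{}.Emit(context.Background(), r)
}
```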
Rejected alternatives:
- Record as interface
- Options as parameter to Logger.Emit
- Passing record as pointer to Logger.Emit
- Logger.WithAttributes
- Record attributes as slice
- Use any instead of defining Value
- Severity type encapsulating number and text
- Reuse attribute package
- Mix receiver types for Record
- Add XYZ method to Logger
- Rename KeyValue to Attr
#### Logger.Enabled

The `Enabled` method implements the Enabled operation.

`Context` associated with the `LogRecord` is accepted as a `context.Context` method argument.

Calls to `Enabled` are supposed to be on the hot path and the list of arguments can be extended in the future. Therefore, in order to reduce the number of heap allocations and to make it possible to handle new arguments, `Enabled` accepts an `EnabledParameters` struct, defined in `logger.go`, as the second method argument.
The `EnabledParameters` getters return values using the `(value, ok)` idiom in order to indicate whether the values were actually set by the caller or are unspecified.
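The `(value, ok)` idiom can be sketched as follows; the severity field shown is an illustrative assumption about the parameter set:

```go
package main

import "fmt"

// Severity is a stand-in for the API's severity type.
type Severity int

// EnabledParameters tracks which parameters were explicitly set,
// so getters can distinguish "set to zero" from "unspecified".
type EnabledParameters struct {
	severity    Severity
	severitySet bool
}

// SetSeverity records the value and marks it as set.
func (p *EnabledParameters) SetSeverity(s Severity) {
	p.severity = s
	p.severitySet = true
}

// Severity returns the parameter and whether the caller set it.
func (p EnabledParameters) Severity() (Severity, bool) {
	return p.severity, p.severitySet
}

func main() {
	var p EnabledParameters
	if _, ok := p.Severity(); !ok {
		fmt.Println("severity unspecified")
	}
	p.SetSeverity(9)
	s, ok := p.Severity()
	fmt.Println(s, ok) // 9 true
}
```

Because the struct is passed by value and fields stay unexported, new parameters can be added later without breaking callers or forcing allocations.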
### noop package

The `go.opentelemetry.io/otel/log/noop` package provides the Logs Bridge API No-Op Implementation.
### Trace context correlation

The bridge implementation should do its best to pass the `ctx` containing the trace context from the caller so it can later be passed via `Logger.Emit`.
It is not expected that users (callers or bridge implementations) reconstruct a `context.Context`. Reconstructing a `context.Context` with `trace.ContextWithSpanContext` and `trace.NewSpanContext` would usually involve more memory allocations.
The logging libraries which have recording methods that accept a `context.Context`, such as `slog`, `logrus`, and `zerolog`, make passing the trace context trivial.
However, some libraries do not accept a `context.Context` in their recording methods. Structured logging libraries, such as `logr` and `zap`, offer passing `any` type as a log attribute/field. Therefore, their bridge implementations can define a "special" log attribute/field that will be used to capture the trace context.
The prototype has bridge implementations that handle trace context correlation efficiently.
### Benchmarking

The benchmarks take inspiration from `slog`, because for the Go team it was also critical to create an API that would be fast and interoperable with existing logging packages.[^2][^3]
The benchmark results can be found in the prototype.
## Rejected alternatives

### Reuse slog

The API must not be coupled to `slog`, nor to any other logging library. The API needs to evolve orthogonally to `slog`.

`slog` is not compliant with the Logs Bridge API, and we cannot expect the Go team to make `slog` compliant with it. The interoperability can be achieved using a log bridge.
You can read more about OpenTelemetry Logs design on opentelemetry.io.
### Record as interface

`Record` is defined as a struct for the following reasons.
Log record is a value object without any behavior. It is used as data input for Logger methods.
The log record resembles the instrument config structs like `metric.Float64CounterConfig`.
Using a `struct` instead of an `interface` improves performance as, e.g., indirect calls are less optimized and usage of interfaces tends to increase heap allocations.[^3]
### Options as parameter to Logger.Emit

One of the initial ideas was to have:

```go
type Logger interface {
	embedded.Logger
	Emit(ctx context.Context, options ...RecordOption)
}
```
The main reason was that this design would be similar to the Meter API for creating instruments. However, passing `Record` directly, instead of using options, is more performant as it reduces heap allocations.[^4] Another advantage of passing `Record` is that the API would not have functions like `NewRecord(options...)`, which would be used by the SDK and not by the users. Finally, the definition would be similar to `slog.Handler.Handle`, which was designed to provide optimization opportunities.[^2]
### Passing record as pointer to Logger.Emit

So far, the benchmarks do not show differences that would favor passing the record via pointer (and vice versa). Passing by value feels safer for the following reasons:

- The user would not be able to pass `nil`, which reduces the possibility of a nil pointer dereference.
- It should reduce the possibility of a heap allocation.
- It follows the design of `slog.Handler`.
- It follows one of Google's Go Style Decisions to prefer passing values.
### Passing struct as parameter to LoggerProvider.Logger

Similarly to `Logger.Emit`, we could have something like:

```go
type LoggerProvider interface {
	embedded.LoggerProvider
	Logger(name string, config LoggerConfig)
}
```

The drawback of this idea is that it would be a different design from the Trace and Metrics API.
The performance of acquiring a logger is not as critical as the performance of emitting a log record. While a single HTTP/RPC handler could write hundreds of logs, it should not create a new logger for each log entry. The bridge implementation should reuse loggers whenever possible.
### Logger.WithAttributes

We could add `WithAttributes` to the `Logger` interface. Then `Record` could be a simple struct with only exported fields.
The idea was that the SDK would implement the performance improvements
instead of doing it in the API.
This would allow having different optimization strategies.
During the analysis[^5], it turned out that the main problem with this proposal is that a variadic slice passed to an interface method is always heap allocated. Moreover, the logger returned by `WithAttributes` was allocated on the heap.
Lastly, the proposal was not specification compliant.
### Record attributes as slice

One of the proposals[^6] was to have `Record` as a simple struct:
```go
type Record struct {
	Timestamp         time.Time
	ObservedTimestamp time.Time
	Severity          Severity
	SeverityText      string
	Body              Value
	Attributes        []KeyValue
}
```
The bridge implementations could use a `sync.Pool` for reducing the number of allocations when passing attributes. The benchmark results were better. In such a design, most bridges would have a `sync.Pool` to reduce the number of heap allocations.
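For illustration, a bridge-side `sync.Pool` under this rejected design might have looked like the hypothetical sketch below, which also shows why an implementation retaining the record is hazardous:

```go
package main

import (
	"fmt"
	"sync"
)

// KeyValue is a simplified attribute pair for the sketch.
type KeyValue struct{ Key, Value string }

// Record is the rejected plain-struct shape (subset of fields).
type Record struct {
	Body       string
	Attributes []KeyValue
}

// recordPool recycles records across emit calls.
var recordPool = sync.Pool{New: func() any { return &Record{} }}

// emitPooled builds a record from a pooled instance, hands it to
// emit, and returns it to the pool.
func emitPooled(emit func(Record), body string, attrs ...KeyValue) {
	r := recordPool.Get().(*Record)
	r.Body = body
	// Reuse the pooled slice's backing array where possible.
	r.Attributes = append(r.Attributes[:0], attrs...)
	emit(*r)
	// Hazard: the copy passed to emit shares the Attributes
	// backing array. If the implementation retained it, the next
	// Get/append would overwrite the retained data (use after free
	// in spirit), which is exactly the problem described above.
	recordPool.Put(r)
}

func main() {
	emitPooled(func(r Record) { fmt.Println(r.Body) }, "hi",
		KeyValue{Key: "k", Value: "v"})
}
```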
However, the `sync.Pool` will not work correctly with API implementations that take ownership of the record (e.g. implementations that do not copy records for asynchronous processing). The current design, even in the case of an improper API implementation, has lower chances of encountering a bug, as most bridges would create a record, pass it, and forget about it.
For reference, here is the reason why `slog` does not use `sync.Pool`[^3] as well:

> We can use a sync pool for records though we decided not to. You can but it's a bad idea for us. Why? Because users have control of Records. Handler writers can get their hands on a record and we'd have to ask them to free it or try to free it magically at some point. But either way, they could get themselves in trouble by freeing it twice or holding on to one after they free it. That's a use after free bug and that's why `zerolog` was problematic for us. `zerolog`, as part of its speed, exposes a pool allocated value to users. If you use `zerolog` the normal way, that you'll see in all the examples, you will never encounter a problem. But if you do something a little out of the ordinary you can get use after free bugs and we just didn't want to put that in the standard library.
Therefore, we decided not to follow the proposal as it is less user friendly (users and bridges would need e.g. a `sync.Pool` to reduce the number of heap allocations), less safe (more prone to use-after-free bugs and race conditions), and the benchmark differences were not significant.
### Use any instead of defining Value

The Logs Data Model defines `Body` to be `any`. One could propose to define `Body` (and attribute values) as `any` instead of defining a new type (`Value`).
First of all, the `any` type defined in the specification is not the same as `any` (`interface{}`) in Go. Moreover, using `any` as a field would decrease the performance.[^7]

Notice that it will still be possible to add the following kind and factories in a backwards compatible way:

```go
const KindMap Kind

func AnyValue(value any) KeyValue
func Any(key string, value any) KeyValue
```
However, currently, it would not be specification compliant.
### Severity type encapsulating number and text

We could combine severity into a single field defining a type:

```go
type Severity struct {
	Number SeverityNumber
	Text   string
}
```
However, the Logs Data Model defines them as independent fields. It should be more user friendly to have them separated. Especially when having getter and setter methods, setting one value when the other is already set would be unpleasant.
### Reuse attribute package

It was tempting to reuse the existing `go.opentelemetry.io/otel/attribute` package for defining log attributes and body. However, this would be wrong because the log attribute definition is different from the common attribute definition. Moreover, there is nothing telling that the body definition has anything in common with a common attribute value.
Therefore, we define new types representing the abstract types defined in the Logs Data Model.
### Mix receiver types for Record

Methods of `slog.Record` have different receiver types. In the `log/slog` GitHub issue we can only find that the reason is:[^8]

> some receiver of Record struct is by value
>
> Passing Records by value means they incur no heap allocation. That improves performance overall, even though they are copied.
However, the benchmarks do not show any noticeable differences.[^9]

The compiler is smart enough to not make a heap allocation for any of these methods. The use of a pointer receiver does not cause any heap allocation. From the Go FAQ:[^10]

> In the current compilers, if a variable has its address taken, that variable is a candidate for allocation on the heap. However, a basic escape analysis recognizes some cases when such variables will not live past the return from the function and can reside on the stack.
The Understanding Allocations: the Stack and the Heap presentation by Jacob Walker describes the escape analysis with details.
Moreover, also from the Go FAQ:[^10]

> Also, if a local variable is very large, it might make more sense to store it on the heap rather than the stack.

Therefore, even if we use a value receiver and the value is very large, it may be heap allocated.
Both Go Code Review Comments and Google's Go Style Decisions highly recommend making the methods for a type either all pointer methods or all value methods. Google's Go Style Decisions even goes further and says:
> There is a lot of misinformation about whether passing a value or a pointer to a function can affect performance. The compiler can choose to pass pointers to values on the stack as well as copying values on the stack, but these considerations should not outweigh the readability and correctness of the code in most circumstances. When the performance does matter, it is important to profile both approaches with a realistic benchmark before deciding that one approach outperforms the other.
Because the benchmarks[^9] do not prove any performance difference and the general recommendation is to not mix receiver types, we decided to use pointer receivers for all `Record` methods.
### Add XYZ method to Logger

The `Logger` does not have methods like `SetSeverity`, etc., as the Bridge API needs to follow (be compliant with) the specification.
Moreover, the Bridge API is intended to be used to implement bridges. Applications should not use it directly. The applications should use logging packages such as `slog`, `logrus`, `zap`, `zerolog`, and `logr`.
### Rename KeyValue to Attr

There was a proposal to rename `KeyValue` to `Attr` (or `Attribute`).[^11] New developers may not intuitively know that `log.KeyValue` is an attribute in the OpenTelemetry parlance.

During the discussion, we agreed to keep the `KeyValue` name.
The type is used in two semantics:
- as a log attribute
- as a map item
As for the map item semantics, this type is a key-value pair, not an attribute. Naming the type `Attr` would convey a semantic meaning that would not be correct for a map.
We expect that most of the Bridge API users will be OpenTelemetry contributors. We plan to implement bridges for the most popular logging libraries ourselves. Given we will all have the context needed to disambiguate these overlapping names, developers' confusion should not be an issue.
For bridges not developed by us, developers will likely look at our existing bridges for inspiration. Our correct use of these types will be a reference to them.
Lastly, we could consider a design defining both types: `KeyValue` and `Attr`. However, in this approach we would need to have factory functions for both types. It would make the API surface unnecessarily big, and we may even have problems naming the functions.