1
0
mirror of https://github.com/open-telemetry/opentelemetry-go.git synced 2024-12-28 21:09:17 +02:00
opentelemetry-go/sdk/log/DESIGN.md
Robert Pająk 8e8ad092cc
sdk/log: Processor.OnEmit accetps a Record pointer (#5636)
## What

Change `Processor.OnEmit` methods to accept a record pointer.

## Why

Fixes https://github.com/open-telemetry/opentelemetry-go/issues/5219

This would be specification compliant according to discussions around
https://github.com/open-telemetry/opentelemetry-specification/pull/4067

This is inline of how processors Go span processors works and how log
processors work in other languages.

If the performance (an additional heap allocation during log processing)
would occur to be a significant problem for some users, we have at few
possible solutions:

1. Utilize PGO which may also lead to decreasing heap allocations
(sources:
https://landontclipp.github.io/blog/2023/08/25/profile-guided-optimizations-in-go/#devirtualization,
https://andrewwphillips.github.io/blog/pgo.html). Currently it does not
but I expect it may change in future.
2. Improve the Go compilers escape analysis (related to previous point)
3. introduce new "chaining processor" which can be also desirable in
other languages

## Benchstat

`old` is from `main`.
`new` is from current branch.

`new-pgo` is from current branch with PGO optimization. I first run
benchmarks to generate a CPU profile using `go test -run=^$ -bench=.
-count=10 -cpuprofile default.pgo` and then I rerun the tests with PGO.
Currently, the number of heap allocations is the same.

```
goos: linux
goarch: amd64
pkg: go.opentelemetry.io/otel/sdk/log
cpu: Intel(R) Core(TM) i9-10885H CPU @ 2.40GHz
                                  │    old.txt    │                new.txt                │              new-pgo.txt              │
                                  │    sec/op     │    sec/op     vs base                 │    sec/op     vs base                 │
BatchProcessorOnEmit-16              402.7n ± 18%   382.0n ±  7%         ~ (p=0.247 n=10)   376.7n ± 14%         ~ (p=0.210 n=10)
Processor/Simple-16                  350.9n ±  9%   782.5n ±  6%  +123.00% (p=0.000 n=10)   755.1n ±  5%  +115.19% (p=0.000 n=10)
Processor/Batch-16                   1.333µ ± 15%   1.497µ ± 11%   +12.27% (p=0.000 n=10)   1.528µ ±  8%   +14.63% (p=0.000 n=10)
Processor/SetTimestampSimple-16      329.5n ± 15%   711.6n ±  4%  +115.93% (p=0.000 n=10)   721.9n ±  5%  +119.04% (p=0.000 n=10)
Processor/SetTimestampBatch-16       1.163µ ±  2%   1.524µ ±  3%   +31.03% (p=0.000 n=10)   1.461µ ±  5%   +25.57% (p=0.000 n=10)
Processor/AddAttributesSimple-16     408.7n ±  3%   810.1n ±  4%   +98.21% (p=0.000 n=10)   830.1n ±  4%  +103.11% (p=0.000 n=10)
Processor/AddAttributesBatch-16      1.270µ ±  2%   1.623µ ±  4%   +27.71% (p=0.000 n=10)   1.597µ ±  7%   +25.66% (p=0.000 n=10)
Processor/SetAttributesSimple-16     436.2n ± 10%   796.1n ±  3%   +82.50% (p=0.000 n=10)   817.6n ±  4%   +87.43% (p=0.000 n=10)
Processor/SetAttributesBatch-16      1.202µ ±  2%   1.552µ ±  2%   +29.06% (p=0.000 n=10)   1.659µ ± 11%   +37.96% (p=0.000 n=10)
LoggerNewRecord/5_attributes-16      366.6n ±  3%   363.7n ±  7%         ~ (p=0.952 n=10)   426.2n ±  7%   +16.27% (p=0.000 n=10)
LoggerNewRecord/10_attributes-16     1.711µ ±  2%   1.909µ ± 18%   +11.57% (p=0.000 n=10)   2.077µ ± 10%   +21.39% (p=0.000 n=10)
LoggerProviderLogger-16              650.1n ±  4%   690.1n ±  8%    +6.15% (p=0.019 n=10)   737.6n ± 13%   +13.47% (p=0.004 n=10)
WalkAttributes/1_attributes-16       5.264n ± 12%   5.510n ±  8%         ~ (p=0.812 n=10)   5.865n ±  5%   +11.41% (p=0.011 n=10)
WalkAttributes/10_attributes-16      5.440n ±  8%   5.881n ±  7%    +8.12% (p=0.004 n=10)   6.104n ±  7%   +12.21% (p=0.005 n=10)
WalkAttributes/100_attributes-16     5.403n ±  9%   5.894n ±  9%    +9.10% (p=0.029 n=10)   5.783n ±  6%         ~ (p=0.052 n=10)
WalkAttributes/1000_attributes-16    5.196n ±  4%   5.860n ±  8%   +12.79% (p=0.000 n=10)   5.981n ± 13%   +15.13% (p=0.002 n=10)
SetAddAttributes/SetAttributes-16    181.2n ± 14%   208.1n ± 12%   +14.85% (p=0.005 n=10)   209.9n ± 11%   +15.87% (p=0.007 n=10)
SetAddAttributes/AddAttributes-16    156.7n ± 14%   161.1n ± 16%         ~ (p=0.190 n=10)   165.5n ± 15%         ~ (p=0.315 n=10)
SimpleProcessorOnEmit-16            11.775n ± 10%   9.027n ± 17%   -23.33% (p=0.000 n=10)   9.389n ± 18%   -20.26% (p=0.002 n=10)
geomean                              169.1n         209.6n         +23.98%                  215.5n         +27.48%

                        │     old.txt     │               new.txt                │              new-pgo.txt              │
                        │       B/s       │     B/s       vs base                │      B/s       vs base                │
BatchProcessorOnEmit-16   1004.39Mi ± 15%   79.88Mi ± 7%  -92.05% (p=0.000 n=10)   81.06Mi ± 12%  -91.93% (p=0.000 n=10)

                                  │   old.txt    │                new.txt                 │               new-pgo.txt               │
                                  │     B/op     │    B/op      vs base                   │     B/op      vs base                   │
BatchProcessorOnEmit-16             0.000 ± 0%      0.000 ± 0%         ~ (p=1.000 n=10) ¹    0.000 ±  0%         ~ (p=1.000 n=10) ¹
Processor/Simple-16                   0.0 ± 0%      417.0 ± 0%         ? (p=0.000 n=10)      417.0 ±  0%         ? (p=0.000 n=10)
Processor/Batch-16                  621.5 ± 2%     1057.5 ± 1%   +70.15% (p=0.000 n=10)     1064.5 ±  1%   +71.28% (p=0.000 n=10)
Processor/SetTimestampSimple-16       0.0 ± 0%      417.0 ± 0%         ? (p=0.000 n=10)      418.0 ±  0%         ? (p=0.000 n=10)
Processor/SetTimestampBatch-16      626.5 ± 3%     1049.5 ± 1%   +67.52% (p=0.000 n=10)     1057.5 ±  2%   +68.79% (p=0.000 n=10)
Processor/AddAttributesSimple-16      0.0 ± 0%      417.0 ± 0%         ? (p=0.000 n=10)      417.0 ±  0%         ? (p=0.000 n=10)
Processor/AddAttributesBatch-16     616.5 ± 3%     1053.0 ± 2%   +70.80% (p=0.000 n=10)     1048.5 ±  2%   +70.07% (p=0.000 n=10)
Processor/SetAttributesSimple-16    48.00 ± 0%     466.00 ± 0%  +870.83% (p=0.000 n=10)     466.00 ±  0%  +870.83% (p=0.000 n=10)
Processor/SetAttributesBatch-16     648.0 ± 3%     1089.5 ± 1%   +68.13% (p=0.000 n=10)     1087.5 ±  4%   +67.82% (p=0.000 n=10)
LoggerNewRecord/5_attributes-16     0.000 ± 0%      0.000 ± 0%         ~ (p=1.000 n=10) ¹    0.000 ±  0%         ~ (p=1.000 n=10) ¹
LoggerNewRecord/10_attributes-16    610.0 ± 0%      610.0 ± 0%         ~ (p=1.000 n=10) ¹    610.0 ±  0%         ~ (p=1.000 n=10) ¹
LoggerProviderLogger-16             354.5 ± 6%      368.0 ± 7%         ~ (p=0.288 n=10)      391.0 ± 29%         ~ (p=0.239 n=10)
WalkAttributes/1_attributes-16      0.000 ± 0%      0.000 ± 0%         ~ (p=1.000 n=10) ¹    0.000 ±  0%         ~ (p=1.000 n=10) ¹
WalkAttributes/10_attributes-16     0.000 ± 0%      0.000 ± 0%         ~ (p=1.000 n=10) ¹    0.000 ±  0%         ~ (p=1.000 n=10) ¹
WalkAttributes/100_attributes-16    0.000 ± 0%      0.000 ± 0%         ~ (p=1.000 n=10) ¹    0.000 ±  0%         ~ (p=1.000 n=10) ¹
WalkAttributes/1000_attributes-16   0.000 ± 0%      0.000 ± 0%         ~ (p=1.000 n=10) ¹    0.000 ±  0%         ~ (p=1.000 n=10) ¹
SetAddAttributes/SetAttributes-16   48.00 ± 0%      48.00 ± 0%         ~ (p=1.000 n=10) ¹    48.00 ±  0%         ~ (p=1.000 n=10) ¹
SetAddAttributes/AddAttributes-16   0.000 ± 0%      0.000 ± 0%         ~ (p=1.000 n=10) ¹    0.000 ±  0%         ~ (p=1.000 n=10) ¹
SimpleProcessorOnEmit-16            0.000 ± 0%      0.000 ± 0%         ~ (p=1.000 n=10) ¹    0.000 ±  0%         ~ (p=1.000 n=10) ¹
geomean                                        ²                ?                       ²                 ?                       ²
¹ all samples are equal
² summaries must be >0 to compute geomean

                                  │   old.txt    │                new.txt                │              new-pgo.txt              │
                                  │  allocs/op   │ allocs/op   vs base                   │ allocs/op   vs base                   │
BatchProcessorOnEmit-16             0.000 ± 0%     0.000 ± 0%         ~ (p=1.000 n=10) ¹   0.000 ± 0%         ~ (p=1.000 n=10) ¹
Processor/Simple-16                 0.000 ± 0%     1.000 ± 0%         ? (p=0.000 n=10)     1.000 ± 0%         ? (p=0.000 n=10)
Processor/Batch-16                  0.000 ± 0%     1.000 ± 0%         ? (p=0.000 n=10)     1.000 ± 0%         ? (p=0.000 n=10)
Processor/SetTimestampSimple-16     0.000 ± 0%     1.000 ± 0%         ? (p=0.000 n=10)     1.000 ± 0%         ? (p=0.000 n=10)
Processor/SetTimestampBatch-16      0.000 ± 0%     1.000 ± 0%         ? (p=0.000 n=10)     1.000 ± 0%         ? (p=0.000 n=10)
Processor/AddAttributesSimple-16    0.000 ± 0%     1.000 ± 0%         ? (p=0.000 n=10)     1.000 ± 0%         ? (p=0.000 n=10)
Processor/AddAttributesBatch-16     0.000 ± 0%     1.000 ± 0%         ? (p=0.000 n=10)     1.000 ± 0%         ? (p=0.000 n=10)
Processor/SetAttributesSimple-16    1.000 ± 0%     2.000 ± 0%  +100.00% (p=0.000 n=10)     2.000 ± 0%  +100.00% (p=0.000 n=10)
Processor/SetAttributesBatch-16     1.000 ± 0%     2.000 ± 0%  +100.00% (p=0.000 n=10)     2.000 ± 0%  +100.00% (p=0.000 n=10)
LoggerNewRecord/5_attributes-16     0.000 ± 0%     0.000 ± 0%         ~ (p=1.000 n=10) ¹   0.000 ± 0%         ~ (p=1.000 n=10) ¹
LoggerNewRecord/10_attributes-16    4.000 ± 0%     4.000 ± 0%         ~ (p=1.000 n=10) ¹   4.000 ± 0%         ~ (p=1.000 n=10) ¹
LoggerProviderLogger-16             1.000 ± 0%     1.000 ± 0%         ~ (p=1.000 n=10) ¹   1.000 ± 0%         ~ (p=1.000 n=10) ¹
WalkAttributes/1_attributes-16      0.000 ± 0%     0.000 ± 0%         ~ (p=1.000 n=10) ¹   0.000 ± 0%         ~ (p=1.000 n=10) ¹
WalkAttributes/10_attributes-16     0.000 ± 0%     0.000 ± 0%         ~ (p=1.000 n=10) ¹   0.000 ± 0%         ~ (p=1.000 n=10) ¹
WalkAttributes/100_attributes-16    0.000 ± 0%     0.000 ± 0%         ~ (p=1.000 n=10) ¹   0.000 ± 0%         ~ (p=1.000 n=10) ¹
WalkAttributes/1000_attributes-16   0.000 ± 0%     0.000 ± 0%         ~ (p=1.000 n=10) ¹   0.000 ± 0%         ~ (p=1.000 n=10) ¹
SetAddAttributes/SetAttributes-16   1.000 ± 0%     1.000 ± 0%         ~ (p=1.000 n=10) ¹   1.000 ± 0%         ~ (p=1.000 n=10) ¹
SetAddAttributes/AddAttributes-16   0.000 ± 0%     0.000 ± 0%         ~ (p=1.000 n=10) ¹   0.000 ± 0%         ~ (p=1.000 n=10) ¹
SimpleProcessorOnEmit-16            0.000 ± 0%     0.000 ± 0%         ~ (p=1.000 n=10) ¹   0.000 ± 0%         ~ (p=1.000 n=10) ¹
geomean                                        ²               ?                       ²               ?                       ²
¹ all samples are equal
² summaries must be >0 to compute geomean
```
2024-08-01 10:13:43 +02:00

8.2 KiB

Logs SDK

Abstract

go.opentelemetry.io/otel/sdk/log provides Logs SDK compliant with the specification.

The prototype was created in #4955.

Background

The goal is to design the exported API of the SDK would have low performance overhead. Most importantly, have a design that reduces the amount of heap allocations and even make it possible to have a zero-allocation implementation. Eliminating the amount of heap allocations reduces the GC pressure which can produce some of the largest improvements in performance.1

The main and recommended use case is to configure the SDK to use an OTLP exporter with a batch processor.2 Therefore, the implementation aims to be high-performant in this scenario. Some users that require high throughput may also want to use e.g. an user_events, LLTng or ETW exporter with a simple processor. Users may also want to use OTLP File or Standard Output exporter in order to emit logs to standard output/error or files.

Modules structure

The SDK is published as a single go.opentelemetry.io/otel/sdk/log Go module.

The exporters are going to be published as following Go modules:

  • go.opentelemetry.io/otel/exporters/otlp/otlplog/otlploggrpc
  • go.opentelemetry.io/otel/exporters/otlp/otlplog/otlploghttp
  • go.opentelemetry.io/otel/exporters/stdout/stdoutlog

LoggerProvider

The LoggerProvider is implemented as LoggerProvider struct in provider.go.

LogRecord limits

The LogRecord limits can be configured using following options:

func WithAttributeCountLimit(limit int) LoggerProviderOption
func WithAttributeValueLengthLimit(limit int) LoggerProviderOption

The limits can be also configured using the OTEL_LOGRECORD_* environment variables as defined by the specification.

Processor

The LogRecordProcessor is defined as Processor interface in processor.go.

The user set processors for the LoggerProvider using func WithProcessor(processor Processor) LoggerProviderOption.

The user can configure custom processors and decorate built-in processors.

The specification may add new operations to the LogRecordProcessor. If it happens, CONTRIBUTING.md describes how the SDK can be extended in a backwards-compatible way.

SimpleProcessor

The Simple processor is implemented as SimpleProcessor struct in simple.go.

BatchProcessor

The Batching processor is implemented as BatchProcessor struct in batch.go.

The Batcher can be also configured using the OTEL_BLRP_* environment variables as defined by the specification.

Exporter

The LogRecordExporter is defined as Exporter interface in exporter.go.

The slice passed to Export must not be retained by the implementation (like e.g. io.Writer) so that the caller can reuse the passed slice (e.g. using sync.Pool) to avoid heap allocations on each call.

The specification may add new operations to the LogRecordExporter. If it happens, CONTRIBUTING.md describes how the SDK can be extended in a backwards-compatible way.

Record

The ReadWriteLogRecord is defined as Record struct in record.go.

The Record is designed similarly to log.Record in order to reduce the number of heap allocations when processing attributes.

The SDK does not have have an additional definition of ReadableLogRecord as the specification does not say that the exporters must not be able to modify the log records. It simply requires them to be able to read the log records. Having less abstractions reduces the API surface and makes the design simpler.

Benchmarking

The benchmarks are supposed to test end-to-end scenarios and avoid I/O that could affect the stability of the results.

The benchmark results can be found in the prototype.

Rejected alternatives

Represent both LogRecordProcessor and LogRecordExporter as Expoter

Because the LogRecordProcessor and the LogRecordProcessor abstractions are so similar, there was a proposal to unify them under single Expoter interface.3

However, introducing a Processor interface makes it easier to create custom processor decorators4 and makes the design more aligned with the specification.

Embed log.Record

Because Record and log.Record are very similar, there was a proposal to embed log.Record in Record definition.

log.Record supports only adding attributes. In the SDK, we also need to be able to modify the attributes (e.g. removal) provided via API.

Moreover it is safer to have these abstraction decoupled. E.g. there can be a need for some fields that can be set via API and cannot be modified by the processors.

Processor.OnEmit to accept Record values

There was a proposal to make the Processor's OnEmit to accept a Record value instead of a pointer to reduce allocations as well as to have design similar to slog.Handler.

There have been long discussions within the OpenTelemetry Specification SIG5 about whether such a design would comply with the specification. The summary was that the current processor design flaws are present in other languages as well. Therefore, it would be favorable to introduce new processing concepts (e.g. chaining processors) in the specification that would coexist with the current "mutable" processor design.

The performance disadvantages caused by using a pointer (which at the time of writing causes an additional heap allocation) may be mitigated by future versions of the Go compiler, thanks to improved escape analysis and profile-guided optimization (PGO)6.

On the other hand, Processor's Enabled is fine to accept a Record value as the processors should not mutate the passed parameters.