Alternative to https://github.com/open-telemetry/opentelemetry-go/pull/7380

This uses a sync.Map and atomics for the sum's counter value. This intentionally introduces a new race condition that didn't previously exist:

* It is possible for the exemplar to be recorded in the batch of metrics after the add() for cumulative sum aggregations. For cumulative, this isn't a huge issue since exemplars are expected to persist across collection cycles. This is difficult to fix because we can't manage the internal storage of an exemplar.Reservoir (to atomically swap between hot and cold storage). If we are able to make assumptions about how exemplar reservoirs are managed (i.e. that the number and order of exemplars returned is always the same), then we could possibly fix this by merging at export time.

### Alternatives Considered

#### RWLock for the map instead of sync.Map

This is significantly less performant.

#### Single sync.Map without hotColdWaitGroup

Deleting keys from the sync.Map concurrently with measurements (during Clear() of the sync.Map) can cause measurements to be made to a counter that has already been read, exported and deleted. This can produce incorrect sums when delta is used. Instead, atomically switching writes to a completely empty sync.Map and waiting for writes to the previous sync.Map to complete eliminates this issue.

#### Use two sync.Map for cumulative sums

One idea I explored was doing a hot-cold swap for cumulative sums just like we do for delta sums. We would swap the hot and cold sync.Maps, then wait for writes to the cold sync.Map to complete while new writes go to the hot map. Then, once we are done reading the cold map, we could merge the contents of the cold map back into the new hot map.

This approach has two issues:

* It isn't possible to "merge" one exemplar reservoir into another. This is an issue for persistent exemplars that aren't overwritten in a collection interval.
* We can't keep a consistent set of keys in overflow scenarios.
  Measurements that are made to the hot map before the merge of the cold into hot that should have been overflows will be added as new attribute sets. That, in turn, means we will need to change previously-exported attribute sets to the overflow set, which will cause issues for users.

### Benchmarks

Parallel:

```
goos: linux
goarch: amd64
pkg: go.opentelemetry.io/otel/sdk/metric
cpu: AMD EPYC 7B12
│ main24.txt │ new24_new.txt │
│ sec/op │ sec/op   vs base │
SyncMeasure/NoView/ExemplarsDisabled/Int64Counter/Attributes/0-24   255.65n ± 13%   68.06n ± 3%   -73.38% (p=0.002 n=6)
SyncMeasure/NoView/ExemplarsDisabled/Int64Counter/Attributes/1-24   286.70n ± 8%   67.66n ± 4%   -76.40% (p=0.002 n=6)
SyncMeasure/NoView/ExemplarsDisabled/Int64Counter/Attributes/10-24   287.15n ± 14%   69.90n ± 3%   -75.66% (p=0.002 n=6)
SyncMeasure/NoView/ExemplarsDisabled/Float64Counter/Attributes/0-24   244.75n ± 9%   68.83n ± 4%   -71.88% (p=0.002 n=6)
SyncMeasure/NoView/ExemplarsDisabled/Float64Counter/Attributes/1-24   267.20n ± 14%   65.86n ± 3%   -75.35% (p=0.002 n=6)
SyncMeasure/NoView/ExemplarsDisabled/Float64Counter/Attributes/10-24   291.50n ± 13%   66.59n ± 11%   -77.15% (p=0.002 n=6)
SyncMeasure/NoView/ExemplarsDisabled/Int64UpDownCounter/Attributes/0-24   247.85n ± 7%   66.06n ± 3%   -73.34% (p=0.002 n=6)
SyncMeasure/NoView/ExemplarsDisabled/Int64UpDownCounter/Attributes/1-24   286.75n ± 10%   68.52n ± 2%   -76.10% (p=0.002 n=6)
SyncMeasure/NoView/ExemplarsDisabled/Int64UpDownCounter/Attributes/10-24   289.50n ± 20%   67.45n ± 4%   -76.70% (p=0.002 n=6)
SyncMeasure/NoView/ExemplarsDisabled/Float64UpDownCounter/Attributes/0-24   246.25n ± 14%   66.69n ± 2%   -72.92% (p=0.002 n=6)
SyncMeasure/NoView/ExemplarsDisabled/Float64UpDownCounter/Attributes/1-24   289.55n ± 9%   65.54n ± 5%   -77.36% (p=0.002 n=6)
SyncMeasure/NoView/ExemplarsDisabled/Float64UpDownCounter/Attributes/10-24   286.05n ± 14%   67.55n ± 2%   -76.39% (p=0.002 n=6)
SyncMeasure/NoView/ExemplarsDisabled/Int64Gauge/Attributes/0-24   254.8n ± 23%   225.9n ± 17%   -11.32% (p=0.026 n=6)
SyncMeasure/NoView/ExemplarsDisabled/Int64Gauge/Attributes/1-24   304.4n ± 13%   234.4n ± 19%   -23.01% (p=0.004 n=6)
SyncMeasure/NoView/ExemplarsDisabled/Int64Gauge/Attributes/10-24   308.9n ± 20%   217.6n ± 10%   -29.56% (p=0.002 n=6)
SyncMeasure/NoView/ExemplarsDisabled/Float64Gauge/Attributes/0-24   267.8n ± 14%   220.1n ± 19%   -17.80% (p=0.004 n=6)
SyncMeasure/NoView/ExemplarsDisabled/Float64Gauge/Attributes/1-24   274.1n ± 21%   226.5n ± 5%   -17.38% (p=0.024 n=6)
SyncMeasure/NoView/ExemplarsDisabled/Float64Gauge/Attributes/10-24   239.0n ± 14%   236.1n ± 18%   ~ (p=0.589 n=6)
SyncMeasure/NoView/ExemplarsDisabled/Int64Histogram/Attributes/0-24   223.7n ± 11%   234.8n ± 7%   ~ (p=0.240 n=6)
SyncMeasure/NoView/ExemplarsDisabled/Int64Histogram/Attributes/1-24   253.9n ± 10%   244.8n ± 11%   ~ (p=0.240 n=6)
SyncMeasure/NoView/ExemplarsDisabled/Int64Histogram/Attributes/10-24   272.6n ± 7%   250.0n ± 12%   -8.33% (p=0.002 n=6)
SyncMeasure/NoView/ExemplarsDisabled/Float64Histogram/Attributes/0-24   232.6n ± 4%   232.2n ± 8%   ~ (p=0.937 n=6)
SyncMeasure/NoView/ExemplarsDisabled/Float64Histogram/Attributes/1-24   276.7n ± 20%   249.2n ± 11%   ~ (p=0.485 n=6)
SyncMeasure/NoView/ExemplarsDisabled/Float64Histogram/Attributes/10-24   265.9n ± 18%   246.4n ± 9%   ~ (p=0.240 n=6)
SyncMeasure/NoView/ExemplarsDisabled/ExponentialInt64Histogram/Attributes/0-24   294.0n ± 11%   269.0n ± 5%   -8.47% (p=0.015 n=6)
SyncMeasure/NoView/ExemplarsDisabled/ExponentialInt64Histogram/Attributes/1-24   314.6n ± 10%   268.8n ± 6%   -14.54% (p=0.002 n=6)
SyncMeasure/NoView/ExemplarsDisabled/ExponentialInt64Histogram/Attributes/10-24   303.9n ± 11%   285.4n ± 4%   ~ (p=0.180 n=6)
SyncMeasure/NoView/ExemplarsDisabled/ExponentialFloat64Histogram/Attributes/0-24   274.7n ± 13%   262.9n ± 7%   ~ (p=0.145 n=6)
SyncMeasure/NoView/ExemplarsDisabled/ExponentialFloat64Histogram/Attributes/1-24   296.1n ± 6%   288.9n ± 9%   ~ (p=0.180 n=6)
SyncMeasure/NoView/ExemplarsDisabled/ExponentialFloat64Histogram/Attributes/10-24   276.0n ± 14%   299.4n ± 12%   ~ (p=0.240 n=6)
SyncMeasure/NoView/ExemplarsEnabled/Int64Counter/Attributes/0-24   191.4n ± 4%   176.0n ± 3%   -8.05% (p=0.002 n=6)
SyncMeasure/NoView/ExemplarsEnabled/Int64Counter/Attributes/1-24   223.2n ± 8%   172.8n ± 3%   -22.54% (p=0.002 n=6)
SyncMeasure/NoView/ExemplarsEnabled/Int64Counter/Attributes/10-24   265.7n ± 19%   172.2n ± 2%   -35.21% (p=0.002 n=6)
SyncMeasure/NoView/ExemplarsEnabled/Float64Counter/Attributes/0-24   179.4n ± 18%   171.0n ± 3%   -4.74% (p=0.009 n=6)
SyncMeasure/NoView/ExemplarsEnabled/Float64Counter/Attributes/1-24   209.1n ± 16%   175.4n ± 5%   -16.07% (p=0.002 n=6)
SyncMeasure/NoView/ExemplarsEnabled/Float64Counter/Attributes/10-24   222.5n ± 17%   175.6n ± 4%   -21.08% (p=0.002 n=6)
SyncMeasure/NoView/ExemplarsEnabled/Int64UpDownCounter/Attributes/0-24   194.4n ± 11%   176.9n ± 5%   -9.03% (p=0.004 n=6)
SyncMeasure/NoView/ExemplarsEnabled/Int64UpDownCounter/Attributes/1-24   207.5n ± 13%   175.1n ± 2%   -15.66% (p=0.002 n=6)
SyncMeasure/NoView/ExemplarsEnabled/Int64UpDownCounter/Attributes/10-24   243.7n ± 13%   172.6n ± 3%   -29.15% (p=0.002 n=6)
SyncMeasure/NoView/ExemplarsEnabled/Float64UpDownCounter/Attributes/0-24   218.3n ± 10%   177.6n ± 2%   -18.67% (p=0.002 n=6)
SyncMeasure/NoView/ExemplarsEnabled/Float64UpDownCounter/Attributes/1-24   193.5n ± 10%   176.1n ± 2%   -8.99% (p=0.004 n=6)
SyncMeasure/NoView/ExemplarsEnabled/Float64UpDownCounter/Attributes/10-24   192.8n ± 11%   173.7n ± 2%   -9.91% (p=0.002 n=6)
SyncMeasure/NoView/ExemplarsEnabled/Int64Gauge/Attributes/0-24   185.1n ± 9%   204.8n ± 9%   +10.61% (p=0.004 n=6)
SyncMeasure/NoView/ExemplarsEnabled/Int64Gauge/Attributes/1-24   218.8n ± 14%   229.7n ± 16%   ~ (p=0.310 n=6)
SyncMeasure/NoView/ExemplarsEnabled/Int64Gauge/Attributes/10-24   242.7n ± 8%   209.1n ± 18%   -13.84% (p=0.041 n=6)
SyncMeasure/NoView/ExemplarsEnabled/Float64Gauge/Attributes/0-24   182.8n ± 42%   255.2n ± 8%   +39.67% (p=0.015 n=6)
SyncMeasure/NoView/ExemplarsEnabled/Float64Gauge/Attributes/1-24   198.0n ± 7%   280.6n ± 22%   +41.72% (p=0.002 n=6)
SyncMeasure/NoView/ExemplarsEnabled/Float64Gauge/Attributes/10-24   236.3n ± 18%   261.7n ± 8%   ~ (p=0.065 n=6)
SyncMeasure/NoView/ExemplarsEnabled/Int64Histogram/Attributes/0-24   223.2n ± 9%   226.9n ± 4%   ~ (p=0.965 n=6)
SyncMeasure/NoView/ExemplarsEnabled/Int64Histogram/Attributes/1-24   270.1n ± 10%   280.2n ± 6%   ~ (p=0.143 n=6)
SyncMeasure/NoView/ExemplarsEnabled/Int64Histogram/Attributes/10-24   257.2n ± 7%   252.0n ± 7%   ~ (p=0.485 n=6)
SyncMeasure/NoView/ExemplarsEnabled/Float64Histogram/Attributes/0-24   277.0n ± 5%   310.4n ± 12%   ~ (p=0.065 n=6)
SyncMeasure/NoView/ExemplarsEnabled/Float64Histogram/Attributes/1-24   287.3n ± 9%   271.2n ± 12%   ~ (p=0.699 n=6)
SyncMeasure/NoView/ExemplarsEnabled/Float64Histogram/Attributes/10-24   281.8n ± 9%   316.5n ± 22%   +12.29% (p=0.041 n=6)
SyncMeasure/NoView/ExemplarsEnabled/ExponentialInt64Histogram/Attributes/0-24   289.1n ± 9%   297.1n ± 12%   ~ (p=0.310 n=6)
SyncMeasure/NoView/ExemplarsEnabled/ExponentialInt64Histogram/Attributes/1-24   277.8n ± 6%   353.1n ± 11%   +27.11% (p=0.002 n=6)
SyncMeasure/NoView/ExemplarsEnabled/ExponentialInt64Histogram/Attributes/10-24   281.8n ± 11%   352.2n ± 16%   +24.94% (p=0.009 n=6)
SyncMeasure/NoView/ExemplarsEnabled/ExponentialFloat64Histogram/Attributes/0-24   294.1n ± 7%   317.5n ± 9%   ~ (p=0.065 n=6)
SyncMeasure/NoView/ExemplarsEnabled/ExponentialFloat64Histogram/Attributes/1-24   281.7n ± 10%   332.1n ± 8%   +17.89% (p=0.002 n=6)
SyncMeasure/NoView/ExemplarsEnabled/ExponentialFloat64Histogram/Attributes/10-24   238.9n ± 12%   318.1n ± 9%   +33.13% (p=0.002 n=6)
geomean   251.9n   184.4n   -26.77%
```

Single-threaded:

```
goos: linux
goarch: amd64
pkg: go.opentelemetry.io/otel/sdk/metric
cpu: Intel(R) Xeon(R) CPU @ 2.20GHz
│ main1.txt │ sync1.txt │
│ sec/op │ sec/op   vs base │
SyncMeasure/NoView/ExemplarsDisabled/Int64Counter/Attributes/0   109.8n ± 7%   113.4n ± 23%   ~ (p=1.000 n=6)
SyncMeasure/NoView/ExemplarsDisabled/Int64Counter/Attributes/1   115.0n ± 4%   113.3n ± 20%   ~ (p=0.729 n=6)
SyncMeasure/NoView/ExemplarsDisabled/Int64Counter/Attributes/10   177.1n ± 34%   110.2n ± 16%   -37.78% (p=0.009 n=6)
SyncMeasure/NoView/ExemplarsDisabled/Float64Counter/Attributes/0   110.5n ± 42%   109.2n ± 19%   ~ (p=0.457 n=6)
SyncMeasure/NoView/ExemplarsDisabled/Float64Counter/Attributes/1   118.8n ± 2%   118.4n ± 5%   ~ (p=0.619 n=6)
SyncMeasure/NoView/ExemplarsDisabled/Float64Counter/Attributes/10   119.0n ± 2%   116.8n ± 42%   ~ (p=0.699 n=6)
SyncMeasure/NoView/ExemplarsDisabled/Int64UpDownCounter/Attributes/0   106.9n ± 1%   102.5n ± 5%   -4.16% (p=0.030 n=6)
SyncMeasure/NoView/ExemplarsDisabled/Int64UpDownCounter/Attributes/1   117.2n ± 2%   116.9n ± 7%   ~ (p=1.000 n=6)
SyncMeasure/NoView/ExemplarsDisabled/Int64UpDownCounter/Attributes/10   115.4n ± 1%   115.1n ± 5%   ~ (p=0.937 n=6)
SyncMeasure/NoView/ExemplarsDisabled/Float64UpDownCounter/Attributes/0   109.5n ± 5%   104.2n ± 8%   -4.84% (p=0.041 n=6)
SyncMeasure/NoView/ExemplarsDisabled/Float64UpDownCounter/Attributes/1   118.7n ± 14%   113.8n ± 35%   ~ (p=0.240 n=6)
SyncMeasure/NoView/ExemplarsDisabled/Float64UpDownCounter/Attributes/10   116.6n ± 1%   116.8n ± 8%   ~ (p=0.968 n=6)
SyncMeasure/NoView/ExemplarsDisabled/Int64Gauge/Attributes/0   106.6n ± 4%   109.4n ± 5%   ~ (p=0.093 n=6)
SyncMeasure/NoView/ExemplarsDisabled/Int64Gauge/Attributes/1   114.7n ± 4%   117.9n ± 4%   ~ (p=0.240 n=6)
SyncMeasure/NoView/ExemplarsDisabled/Int64Gauge/Attributes/10   115.2n ± 4%   114.5n ± 1%   ~ (p=0.162 n=6)
SyncMeasure/NoView/ExemplarsDisabled/Float64Gauge/Attributes/0   109.4n ± 5%   107.5n ± 3%   ~ (p=0.132 n=6)
SyncMeasure/NoView/ExemplarsDisabled/Float64Gauge/Attributes/1   118.3n ± 2%   117.9n ± 3%   ~ (p=0.589 n=6)
SyncMeasure/NoView/ExemplarsDisabled/Float64Gauge/Attributes/10   117.7n ± 2%   120.8n ± 14%   ~ (p=0.093 n=6)
SyncMeasure/NoView/ExemplarsDisabled/Int64Histogram/Attributes/0   96.78n ± 1%   99.37n ± 3%   ~ (p=0.065 n=6)
SyncMeasure/NoView/ExemplarsDisabled/Int64Histogram/Attributes/1   103.0n ± 3%   116.5n ± 26%   +13.16% (p=0.002 n=6)
SyncMeasure/NoView/ExemplarsDisabled/Int64Histogram/Attributes/10   102.8n ± 1%   107.6n ± 22%   +4.67% (p=0.002 n=6)
SyncMeasure/NoView/ExemplarsDisabled/Float64Histogram/Attributes/0   93.95n ± 22%   99.88n ± 18%   +6.32% (p=0.041 n=6)
SyncMeasure/NoView/ExemplarsDisabled/Float64Histogram/Attributes/1   102.7n ± 5%   106.2n ± 6%   ~ (p=0.089 n=6)
SyncMeasure/NoView/ExemplarsDisabled/Float64Histogram/Attributes/10   104.1n ± 4%   108.3n ± 27%   +4.03% (p=0.026 n=6)
SyncMeasure/NoView/ExemplarsDisabled/ExponentialInt64Histogram/Attributes/0   146.3n ± 1%   154.0n ± 24%   +5.23% (p=0.002 n=6)
SyncMeasure/NoView/ExemplarsDisabled/ExponentialInt64Histogram/Attributes/1   154.8n ± 3%   161.2n ± 2%   +4.20% (p=0.004 n=6)
SyncMeasure/NoView/ExemplarsDisabled/ExponentialInt64Histogram/Attributes/10   155.5n ± 1%   164.0n ± 4%   +5.43% (p=0.002 n=6)
SyncMeasure/NoView/ExemplarsDisabled/ExponentialFloat64Histogram/Attributes/0   145.9n ± 2%   159.7n ± 12%   +9.42% (p=0.002 n=6)
SyncMeasure/NoView/ExemplarsDisabled/ExponentialFloat64Histogram/Attributes/1   155.2n ± 0%   164.0n ± 6%   +5.70% (p=0.002 n=6)
SyncMeasure/NoView/ExemplarsDisabled/ExponentialFloat64Histogram/Attributes/10   219.3n ± 29%   159.5n ± 3%   ~ (p=0.065 n=6)
SyncMeasure/NoView/ExemplarsEnabled/Int64Counter/Attributes/0   263.6n ± 36%   177.2n ± 1%   ~ (p=0.065 n=6)
SyncMeasure/NoView/ExemplarsEnabled/Int64Counter/Attributes/1   189.1n ± 8%   190.4n ± 12%   ~ (p=0.589 n=6)
SyncMeasure/NoView/ExemplarsEnabled/Int64Counter/Attributes/10   184.3n ± 3%   189.4n ± 6%   ~ (p=0.065 n=6)
SyncMeasure/NoView/ExemplarsEnabled/Float64Counter/Attributes/0   180.7n ± 1%   182.7n ± 2%   ~ (p=0.457 n=6)
SyncMeasure/NoView/ExemplarsEnabled/Float64Counter/Attributes/1   192.8n ± 9%   192.0n ± 1%   ~ (p=1.000 n=6)
SyncMeasure/NoView/ExemplarsEnabled/Float64Counter/Attributes/10   192.3n ± 4%   190.2n ± 4%   ~ (p=0.093 n=6)
SyncMeasure/NoView/ExemplarsEnabled/Int64UpDownCounter/Attributes/0   176.5n ± 2%   181.7n ± 4%   +2.95% (p=0.002 n=6)
SyncMeasure/NoView/ExemplarsEnabled/Int64UpDownCounter/Attributes/1   184.0n ± 4%   192.0n ± 1%   +4.32% (p=0.015 n=6)
SyncMeasure/NoView/ExemplarsEnabled/Int64UpDownCounter/Attributes/10   184.4n ± 1%   195.2n ± 3%   +5.83% (p=0.002 n=6)
SyncMeasure/NoView/ExemplarsEnabled/Float64UpDownCounter/Attributes/0   183.0n ± 3%   177.4n ± 5%   -3.06% (p=0.048 n=6)
SyncMeasure/NoView/ExemplarsEnabled/Float64UpDownCounter/Attributes/1   194.4n ± 4%   188.1n ± 5%   ~ (p=0.084 n=6)
SyncMeasure/NoView/ExemplarsEnabled/Float64UpDownCounter/Attributes/10   193.0n ± 5%   194.1n ± 5%   ~ (p=0.699 n=6)
SyncMeasure/NoView/ExemplarsEnabled/Int64Gauge/Attributes/0   178.4n ± 14%   185.6n ± 29%   ~ (p=0.240 n=6)
SyncMeasure/NoView/ExemplarsEnabled/Int64Gauge/Attributes/1   189.0n ± 8%   193.2n ± 2%   ~ (p=0.132 n=6)
SyncMeasure/NoView/ExemplarsEnabled/Int64Gauge/Attributes/10   197.7n ± 5%   198.8n ± 2%   ~ (p=0.619 n=6)
SyncMeasure/NoView/ExemplarsEnabled/Float64Gauge/Attributes/0   185.5n ± 3%   188.8n ± 4%   ~ (p=0.310 n=6)
SyncMeasure/NoView/ExemplarsEnabled/Float64Gauge/Attributes/1   191.2n ± 3%   190.2n ± 7%   ~ (p=0.732 n=6)
SyncMeasure/NoView/ExemplarsEnabled/Float64Gauge/Attributes/10   186.8n ± 2%   197.1n ± 6%   +5.54% (p=0.004 n=6)
SyncMeasure/NoView/ExemplarsEnabled/Int64Histogram/Attributes/0   224.2n ± 4%   227.3n ± 2%   ~ (p=0.394 n=6)
SyncMeasure/NoView/ExemplarsEnabled/Int64Histogram/Attributes/1   232.5n ± 3%   242.5n ± 5%   ~ (p=0.132 n=6)
SyncMeasure/NoView/ExemplarsEnabled/Int64Histogram/Attributes/10   232.5n ± 3%   237.1n ± 5%   +2.00% (p=0.045 n=6)
SyncMeasure/NoView/ExemplarsEnabled/Float64Histogram/Attributes/0   227.5n ± 2%   238.5n ± 5%   +4.81% (p=0.017 n=6)
SyncMeasure/NoView/ExemplarsEnabled/Float64Histogram/Attributes/1   239.4n ± 8%   250.1n ± 6%   ~ (p=0.240 n=6)
SyncMeasure/NoView/ExemplarsEnabled/Float64Histogram/Attributes/10   241.5n ± 4%   254.0n ± 2%   +5.18% (p=0.004 n=6)
SyncMeasure/NoView/ExemplarsEnabled/ExponentialInt64Histogram/Attributes/0   231.1n ± 5%   239.2n ± 3%   ~ (p=0.084 n=6)
SyncMeasure/NoView/ExemplarsEnabled/ExponentialInt64Histogram/Attributes/1   260.2n ± 16%   253.8n ± 4%   ~ (p=0.190 n=6)
SyncMeasure/NoView/ExemplarsEnabled/ExponentialInt64Histogram/Attributes/10   234.3n ± 1%   246.8n ± 2%   +5.29% (p=0.002 n=6)
SyncMeasure/NoView/ExemplarsEnabled/ExponentialFloat64Histogram/Attributes/0   221.8n ± 6%   232.0n ± 4%   +4.58% (p=0.037 n=6)
SyncMeasure/NoView/ExemplarsEnabled/ExponentialFloat64Histogram/Attributes/1   228.2n ± 7%   240.6n ± 1%   +5.41% (p=0.041 n=6)
SyncMeasure/NoView/ExemplarsEnabled/ExponentialFloat64Histogram/Attributes/10   228.6n ± 7%   244.7n ± 1%   +7.04% (p=0.015 n=6)
geomean   158.1n   158.1n   +0.00%
```

---------

Co-authored-by: Tyler Yahn <MrAlias@users.noreply.github.com>
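
### Illustrative sketch

For readers unfamiliar with the hot-cold swap described for delta sums, the following is a minimal sketch of the idea, not the code in this change. The names `deltaSum`, `add`, and `collect` are hypothetical, string keys stand in for attribute sets, and the sketch assumes only one `collect` runs at a time. It packs the hot map index and a started-writes count into one atomic word so a writer can register itself and learn the hot map in a single operation.

```go
package main

import (
	"fmt"
	"runtime"
	"sync"
	"sync/atomic"
)

// deltaSum is a hypothetical delta sum aggregator used for illustration only.
type deltaSum struct {
	// startedAndHot packs the hot map index (top bit) with the number of
	// writes started against that map (low 63 bits).
	startedAndHot atomic.Uint64
	ended         [2]atomic.Uint64 // writes finished, per map
	maps          [2]sync.Map      // key -> *atomic.Int64
}

// add records v under key in whichever map is currently hot.
func (s *deltaSum) add(key string, v int64) {
	// A single atomic add registers the write and reads the hot index.
	idx := s.startedAndHot.Add(1) >> 63
	defer s.ended[idx].Add(1)

	c, ok := s.maps[idx].Load(key)
	if !ok {
		c, _ = s.maps[idx].LoadOrStore(key, new(atomic.Int64))
	}
	c.(*atomic.Int64).Add(v)
}

// collect redirects writers to the empty map, waits for in-flight writes to
// the old map, then reads and clears it. Assumes a single collector.
func (s *deltaSum) collect() map[string]int64 {
	cold := s.startedAndHot.Load() >> 63 // hot now, cold after the swap
	// Flip the hot bit and reset the started count in one step.
	prev := s.startedAndHot.Swap((cold ^ 1) << 63)
	started := prev & (1<<63 - 1)
	for s.ended[cold].Load() != started {
		runtime.Gosched() // writers to the old map are still finishing
	}
	s.ended[cold].Store(0)

	out := make(map[string]int64)
	s.maps[cold].Range(func(k, v any) bool {
		out[k.(string)] = v.(*atomic.Int64).Load()
		return true
	})
	s.maps[cold].Clear() // safe: no writer can reach this map now (Go 1.23+)
	return out
}

func main() {
	var s deltaSum
	var wg sync.WaitGroup
	for i := 0; i < 8; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for j := 0; j < 1000; j++ {
				s.add("requests", 1)
			}
		}()
	}
	var total int64
	total += s.collect()["requests"] // a collection racing with writers
	wg.Wait()
	total += s.collect()["requests"] // everything written since the swap
	fmt.Println("total:", total)     // prints "total: 8000"
}
```

Because every write lands in exactly one map and a map is only read after its writers have finished, the deltas from successive collections add up to the number of measurements made. A cumulative sum, as described above, would skip the swap and read a single map in place.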