Add system.cpu.load_average.1m and system.processes.count#2014
Add system.cpu.load_average.1m and system.processes.count#2014strawgate wants to merge 1 commit into
system.cpu.load_average.1m and system.processes.count#2014Conversation
Upstream `SystemMetricsInstrumentor` doesn't include either metric.
Both are one-line psutil calls so the Logfire wrapper can emit them
via the same callback pattern as `system.cpu.simple_utilization`.
Both go into `FULL_CONFIG` so the Hosts page populates Load 1m + Procs
columns when the customer uses `base='full'`. Available individually
via `config={…}` for callers picking metrics by hand.
On Windows, `psutil.getloadavg()` is an emulated polling shim — first
call returns 0 and values update every ~5s.
| yield Observation(len(psutil.pids())) | ||
|
|
||
| logfire_instance.metric_gauge_callback( | ||
| 'system.processes.count', |
There was a problem hiding this comment.
I think this should probably be system.process.count as in https://opentelemetry.io/docs/specs/semconv/system/system-metrics/#metric-systemprocesscount, which is an updowncounter. If including the process.state attribute isn't easy it's probably fine to leave out.
There was a problem hiding this comment.
So the point is — rename the attribute to system.process.count, and make it an updowncounter not a gauge. If convenient, we can add the process.state attribute, but note that could increase the number of series significantly, which we probably wouldn't want to do in default config if it costs a decent amount. @strawgate do you have thoughts about whether we should add that attribute?
Alternatively we could change the name more significantly, I don't think it should be close to the semantic convention one though if we do that (like it is now). But I'm okay using this name and not having the attribute if we think this is good.
| yield Observation(psutil.getloadavg()[0]) | ||
|
|
||
| logfire_instance.metric_gauge_callback( | ||
| 'system.cpu.load_average.1m', |
There was a problem hiding this comment.
https://opentelemetry.io/docs/specs/semconv/system/system-metrics/ mentions a hypothetical system.linux.cpu.load_1m. Not a real semconv, but worth mimicking. If we don't want to include the OS prefix, then system.cpu.load_1m.
There was a problem hiding this comment.
@strawgate Can we just change the name to this? Is there a reason you used the existing name?
| 'system.cpu.utilization': CPU_FIELDS, | ||
| # The Hosts page reads these — upstream `_DEFAULT_CONFIG` doesn't include | ||
| # them and they aren't expensive (one number per scrape each), so add them | ||
| # so `base='full'` covers the page without a follow-up `config={…}` knob. |
There was a problem hiding this comment.
If these are so cheap, and they're used in a default UI view, then they should be in the basic config.
| 'cpython.gc.uncollectable_objects': None, | ||
| 'system.cpu.load_average.1m': None, | ||
| 'system.processes.count': None, | ||
| }, 'Docs and the MetricName type need to be updated if this test fails' |
There was a problem hiding this comment.
(the test explains that if it fails the docs need to be updated, we're guessing the AI fixed the test pre-emptively but didn't realize it also needed to update the docs.)
Upstream
SystemMetricsInstrumentordoesn't emit either metric. Both are one-line psutil calls, so the Logfire wrapper adds them via the same callback pattern assystem.cpu.simple_utilizationand includes them inFULL_CONFIG.Closes the gap where
instrument_system_metrics(base='full')left the Hosts page's Load 1m and Procs columns permanently empty.On Windows,
psutil.getloadavg()is an emulated polling shim — first call returns 0 and values update every ~5s. Documented in the helper's docstring.