feat: capture Gemini cache and thinking tokens in instrument_google_genai by JonathanTsen · Pull Request #1961 · pydantic/logfire

JonathanTsen · 2026-05-23T11:40:57Z

Summary

- Patches upstream `_GenerateContentInstrumentationHelper` to extract `cached_content_token_count`, `thoughts_token_count` and `tool_use_prompt_token_count` from `response.usage_metadata` and emit them as `gen_ai.usage.cache_read.input_tokens`, `gen_ai.usage.details.thoughts_tokens` and `gen_ai.usage.details.tool_use_prompt_tokens`.
- Computes `operation.cost` via `genai-prices` when available (fails silently if the package is missing or the model is unknown).
- Closes the gap where direct `google.genai` users (not going through pydantic-ai) were missing cache and thinking metrics in their spans.
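The "silent failure" behaviour for `operation.cost` can be illustrated with a minimal sketch. It does not call `genai-prices` itself: `calculate_cost` is a hypothetical callable standing in for the real lookup, and the sketch assumes a missing package surfaces as `ImportError` and an unknown model as `LookupError` (the latter is what a later commit message in this PR describes).

```python
def with_optional_cost(attributes, response_data, calculate_cost):
    """Return a copy of attributes, adding operation.cost only when pricing succeeds.

    calculate_cost is a hypothetical stand-in for the genai-prices lookup;
    it is not part of that package's API.
    """
    result = dict(attributes)
    try:
        result["operation.cost"] = calculate_cost(response_data)
    except (ImportError, LookupError):
        # Assumed failure modes: package not installed, or model pricing unknown.
        # The span simply keeps its other attributes.
        pass
    return result


def price_known(_data):
    return 0.00042  # made-up figure, for illustration only


def price_unknown(_data):
    raise LookupError("unknown model")


def package_missing(_data):
    raise ImportError("genai-prices is not installed")


base = {"gen_ai.usage.cache_read.input_tokens": 40}
priced = with_optional_cost(base, {}, price_known)
unpriced = with_optional_cost(base, {}, price_unknown)
uninstalled = with_optional_cost(base, {}, package_missing)
```

In this sketch only `priced` gains an `operation.cost` key; the other two results equal `base`, which is why the attribute cannot be relied on to be present.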

Implementation notes

- Defensive monkey-patch of two upstream methods (`_maybe_update_token_counts`, `create_final_attributes`), following the same pattern already used elsewhere in `logfire/_internal/integrations/google_genai.py`. The patch is wrapped in try/except at module load so that a future upstream rename leaves the base instrumentor working.
- Both methods are confirmed present in `opentelemetry-instrumentation-google-genai` 0.7b0; the current pin in `pyproject.toml` is `>= 0.4b0`.
- Streaming uses the upstream "keep last non-zero" semantics: the `if value := ...` check treats `None` and `0` as falsy, so partial chunks don't overwrite a final value.
- Unlike Anthropic, the Gemini API's `prompt_token_count` already includes cached tokens, so we expose `cache_read` separately rather than summing.
- `genai-prices` is invoked with `response.model_dump(by_alias=True)` because the extractor expects camelCase JSON keys (`usageMetadata`, `modelVersion`).
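The defensive patch and the "keep last non-zero" streaming behaviour described in these notes can be sketched as follows. This is an illustration under stated assumptions, not the PR's code: `UpstreamHelper` is a simplified stand-in for the upstream helper class, `patch_helper` is a hypothetical name, and the stream chunks are plain `SimpleNamespace` objects.

```python
from types import SimpleNamespace


class UpstreamHelper:
    """Simplified stand-in for the upstream helper; not the real class."""

    def __init__(self):
        self.input_tokens = 0

    def _maybe_update_token_counts(self, response):
        metadata = getattr(response, "usage_metadata", None)
        if metadata is None:
            return
        if prompt := getattr(metadata, "prompt_token_count", None):
            self.input_tokens = prompt


def patch_helper(helper_cls):
    """Wrap _maybe_update_token_counts; return False if the method is gone."""
    try:
        original = helper_cls._maybe_update_token_counts
    except AttributeError:
        # Upstream renamed or removed the method: skip the patch entirely.
        return False

    def wrapped(self, response):
        original(self, response)
        metadata = getattr(response, "usage_metadata", None)
        # None and 0 are both falsy, so only non-zero counts are stored.
        if cached := getattr(metadata, "cached_content_token_count", None):
            self._lf_cached = cached
        if thoughts := getattr(metadata, "thoughts_token_count", None):
            self._lf_thoughts = thoughts

    helper_cls._maybe_update_token_counts = wrapped
    return True


patched = patch_helper(UpstreamHelper)

chunks = [
    SimpleNamespace(usage_metadata=SimpleNamespace(
        prompt_token_count=100, cached_content_token_count=40, thoughts_token_count=0)),
    SimpleNamespace(usage_metadata=None),  # partial chunk without usage data
    SimpleNamespace(usage_metadata=SimpleNamespace(
        prompt_token_count=100, cached_content_token_count=None, thoughts_token_count=12)),
]

helper = UpstreamHelper()
for chunk in chunks:
    helper._maybe_update_token_counts(chunk)


class RenamedHelper:
    """Simulates a future upstream release without the patched method."""


patched_renamed = patch_helper(RenamedHelper)
```

After the loop, `helper._lf_cached` still holds the value from the first chunk even though later chunks reported `None`, and `patched_renamed` is `False` without raising.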

Test plan

- `uv run pytest tests/otel_integrations/test_google_genai.py`: 8 passing (3 new + 5 existing; existing VCR snapshots updated to include the new attributes that real Gemini responses already carry).
- `make lint`
- `make typecheck`

…enai Patches the upstream OTel GoogleGenAiSdkInstrumentor to also extract cached_content_token_count, thoughts_token_count and tool_use_prompt_token_count from response.usage_metadata, and computes operation.cost via genai-prices when available. Closes the gap where direct google.genai users (not going through pydantic-ai) were missing cache and thinking metrics in their spans.

cubic-dev-ai

No issues found across 4 files

Confidence score: 5/5

Automated review surfaced no issues in the provided summaries.
No files require special attention.


codecov · 2026-05-23T11:50:14Z

Codecov Report

✅ All modified and coverable lines are covered by tests.


Adds a test that exercises the early-return branch when the Gemini response has no usage_metadata, and marks the defensive `usage_data.model is not None` check (unreachable in practice: the google extractor raises LookupError for unknown models rather than returning model=None) with `# pragma: no branch`. Restores 100% coverage broken by the previous commit.

References the official Gemini API pages for context caching, thinking tokens, function calling, and pricing, plus the python-genai field description that documents prompt_token_count already including cached tokens.

cubic-dev-ai

1 issue found across 1 file (changes from recent commits).

Prompt for AI agents (unresolved issues)


Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.


<file name="docs/integrations/llms/google-genai.md">

<violation number="1" location="docs/integrations/llms/google-genai.md:59">
P2: Documentation for `operation.cost` should clarify that the attribute is only present when `genai-prices` is installed and the model pricing is known, since the implementation silently skips cost calculation otherwise.</violation>
</file>


cubic-dev-ai · 2026-05-23T12:16:00Z

+- `gen_ai.usage.cache_read.input_tokens` — tokens served from [context cache](https://ai.google.dev/gemini-api/docs/caching) (cache hit)
+- `gen_ai.usage.details.thoughts_tokens` — [reasoning tokens](https://ai.google.dev/gemini-api/docs/thinking) (Gemini 2.5 / 3.x)
+- `gen_ai.usage.details.tool_use_prompt_tokens` — tokens used for [tool definitions](https://ai.google.dev/gemini-api/docs/function-calling)
+- `operation.cost` — calculated price in USD using the [official Gemini pricing tables](https://ai.google.dev/gemini-api/docs/pricing) via [`genai-prices`](https://pypi.org/project/genai-prices/)


P2: Documentation for operation.cost should clarify that the attribute is only present when genai-prices is installed and the model pricing is known, since the implementation silently skips cost calculation otherwise.

Prompt for AI agents

Check if this issue is valid; if so, understand the root cause and fix it. At docs/integrations/llms/google-genai.md, line 59:

<comment>Documentation for `operation.cost` should clarify that the attribute is only present when `genai-prices` is installed and the model pricing is known, since the implementation silently skips cost calculation otherwise.</comment>

<file context>
@@ -53,10 +53,13 @@ following attributes may appear depending on the response:
+- `gen_ai.usage.cache_read.input_tokens` — tokens served from [context cache](https://ai.google.dev/gemini-api/docs/caching) (cache hit)
+- `gen_ai.usage.details.thoughts_tokens` — [reasoning tokens](https://ai.google.dev/gemini-api/docs/thinking) (Gemini 2.5 / 3.x)
+- `gen_ai.usage.details.tool_use_prompt_tokens` — tokens used for [tool definitions](https://ai.google.dev/gemini-api/docs/function-calling)
+- `operation.cost` — calculated price in USD using the [official Gemini pricing tables](https://ai.google.dev/gemini-api/docs/pricing) via [`genai-prices`](https://pypi.org/project/genai-prices/)
Note that, unlike Anthropic, the Gemini API's `prompt_token_count` already includes
</file context>

Suggested change

- `operation.cost` — calculated price in USD using the [official Gemini pricing tables](https://ai.google.dev/gemini-api/docs/pricing) via [`genai-prices`](https://pypi.org/project/genai-prices/)

- `operation.cost` — calculated price in USD using the [official Gemini pricing tables](https://ai.google.dev/gemini-api/docs/pricing) via [`genai-prices`](https://pypi.org/project/genai-prices/) (only present when the package is installed and model pricing is known)

alexmojaki · 2026-06-19T14:25:22Z

+Note that, unlike Anthropic, the Gemini API's `prompt_token_count` already includes
+the cached tokens; Logfire does not sum them again. This is documented in the
+[`GenerateContentResponseUsageMetadata.prompt_token_count`](https://googleapis.github.io/python-genai/genai.html#genai.types.GenerateContentResponseUsageMetadata.prompt_token_count)
+field description: *"When `cached_content` is set, this also includes the number
+of tokens in the cached content."*


I don't think this is needed

alexmojaki · 2026-06-19T14:26:05Z

+                self._lf_thoughts = thoughts
+            if tool_use := getattr(metadata, 'tool_use_prompt_token_count', None):
+                self._lf_tool_use_prompt = tool_use
+            self._lf_response = response


it looks like _lf_response is the only thing that needs to be stored, and the rest can be retrieved in _wrapped_create_final_attributes
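One possible reading of this suggestion, as a minimal sketch: keep only the last response on the helper and derive every extra attribute from it when the final attributes are built. The function name `extra_usage_attributes` and the `SimpleNamespace` responses are hypothetical; the field and attribute names are the ones listed in the PR summary. For streaming, this assumes the stored response is the chunk that carries the final usage metadata.

```python
from types import SimpleNamespace

# usage_metadata field -> span attribute name, both as given in the PR summary.
USAGE_ATTRIBUTES = {
    "cached_content_token_count": "gen_ai.usage.cache_read.input_tokens",
    "thoughts_token_count": "gen_ai.usage.details.thoughts_tokens",
    "tool_use_prompt_token_count": "gen_ai.usage.details.tool_use_prompt_tokens",
}


def extra_usage_attributes(response):
    """Build the extra span attributes from a stored response (hypothetical helper)."""
    metadata = getattr(response, "usage_metadata", None)
    if metadata is None:
        # Early return: the response carries no usage data at all.
        return {}
    attributes = {}
    for field, attribute in USAGE_ATTRIBUTES.items():
        # Skip fields that are missing, None, or zero.
        if value := getattr(metadata, field, None):
            attributes[attribute] = value
    return attributes


with_usage = SimpleNamespace(usage_metadata=SimpleNamespace(
    prompt_token_count=100, cached_content_token_count=40, thoughts_token_count=0))
without_usage = SimpleNamespace(usage_metadata=None)

attrs = extra_usage_attributes(with_usage)
empty = extra_usage_attributes(without_usage)
```

With this shape, the patched token-count method would only need to assign `self._lf_response = response`, and the wrapped `create_final_attributes` would merge the returned dict into the upstream attributes.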

alexmojaki · 2026-06-19T14:27:45Z

+    return _generate
+
+
+def _build_fake_genai_response(


is extra mocking needed? can we stick to vcr for the new tests?

alexmojaki · 2026-06-19T14:28:08Z

please don't edit this file

cubic-dev-ai Bot reviewed May 23, 2026


JonathanTsen added 2 commits May 23, 2026 08:58

cubic-dev-ai Bot reviewed May 23, 2026


alexmojaki reviewed Jun 19, 2026

JonathanTsen wants to merge 3 commits into pydantic:main from JonathanTsen:feat/google-genai-cache-tokens
