Set up generator class and auth for Data Engineering Agent #404
Set up generator class and auth for Data Engineering Agent #404minzznguyen wants to merge 8 commits into
Conversation
… with GCP ADC Task 1.1: Define DataEngineeringAgentGenerator inheriting from QueryGenerator in data_engineering_agent.py. Task 1.2: Integrate A2A SDK and configure GcpAdcCredentialService to use GCP Application Default Credentials (ADC) for authentication. TAG=agy CONV=aa927cc7-418a-41e3-b658-9b82915e18eb
IsmailMehdi
left a comment
There was a problem hiding this comment.
Code review
Scaffolds a new DataEngineeringAgentGenerator plus a GcpAdcCredentialService for the A2A SDK's AuthInterceptor. Per the WIP label, auth is the focus; messaging in generate_internal is stubbed.
Correctness & functional issues
- Generator is not registered.
evalbench/generators/models/__init__.py'sget_generator()dispatch has nodata_engineering_agentbranch and the class isn't imported. Even with a config pointing at it, instantiation falls through toValueError("Unknown Generator ..."). Either add the branch now or call this out as a follow-up in the PR description. - Missing dependency declaration.
a2a(the A2A SDK) is not inpyproject.toml. The repo recently migrated touv(d4501e4) — pleaseuv add a2aso the lockfile is updated. Without this, the import fails anywhere outside the author's local env. - Blocking sync call inside
async def.credentials.refresh(Request())does a synchronous HTTPS round-trip and is called fromasync def get_credentials. Inside an asyncio event loop this blocks every other task. Useawait asyncio.to_thread(credentials.refresh, Request()), or an async-compatible auth flow. - Token fetched on every call, no caching. ADC tokens are valid ~1 hour.
google.auth.default()+refresh()runs on everyget_credentialsinvocation — wasteful and slow on the hot path. Cachecredentialsonselfand only refresh whennot credentials.valid(the library exposescredentials.expired/valid). - Unused import.
from a2a.types import a2a_pb2 as pb—pbis never referenced. Drop untilgenerate_internalactually needs it. self.loggeris set but never used. All output goes throughprint(...). Pick one — and given the rest of the codebase, that should belogging(seegemini.py,claude.py,generator.py).
Security
- Logging token bytes, even truncated.
print(f\"... GCP ADC token: {token[:10]}...\")writes the first 10 chars of an access token to stdout. Even prefixes are sensitive and end up in CI logs / shared terminals. Replace withself.logger.debug(\"Retrieved GCP ADC token\")— no bytes of the token at all. security_scheme_nameis ignored. The service hands the same bearer token back regardless of scheme. If multiple schemes are ever registered, this is silently wrong. At minimum add a comment that this provider intentionally only services one scheme; ideally branch (or raise) on unknown names.- No error handling around
google.auth.default()/refresh(). Both can raise (DefaultCredentialsError,RefreshError). Wrap with try/except and log clearly — saves real debugging time later.
Style / conventions
- Emoji-laden
printstatements (🔑,✅) don't match the rest of the codebase, which uses standardlogging. Remove emoji and route through the logger. - Trailing whitespace on the line after
self.logger = logging.getLogger(__name__). - The closest existing pattern to mirror is
claude.py(concise constructor, singleclientattribute, plainlogging).
Tests
None added. Understandable for a WIP that doesn't do anything yet, but at least one unit test that constructs the generator with a fake CredentialService (no network call) would catch the registration gap and the unused-import problem in CI. Plan to add tests alongside Task 1.3.
Recommended order
- Replace the token-prefix
printwithlogger.debug— no token bytes. - Move
credentials.refresh(Request())off the event loop (asyncio.to_thread) and cache the credentials object. - Add
a2aviauv addso lockfile/pyproject.tomlare updated. - Register
DataEngineeringAgentGeneratoringenerators/models/__init__.pyonce the class is real — or leave a TODO in the PR body. - Remove the unused
a2a_pb2 as pbimport; pickloggingoverprint. - Wrap
google.auth.default()/refresh()with try/except.
Risk summary
Low blast radius today — the generator can't be selected via config and generate_internal raises. Main risks are landing dead/insecure scaffolding into main: (a) token-prefix logging, (b) missing dep declaration breaking imports if anything reaches it, (c) blocking sync auth call that will silently hurt throughput once Task 1.3 lands. Address (a) and (c) before this merges even as WIP scaffolding.
- Add `a2a-sdk>=1.0.3` to dependencies in `pyproject.toml` and update `uv.lock`. - Update `DataEngineeringAgentGenerator` to configure `endpoint`, `target_workspace` and clean up GCP ADC Credential Service logic with event loop safety (async refresh). - Register `DataEngineeringAgentGenerator` in the generators factory. - Add `evalbench/test/data_engineering_agent_test.py` to verify correct generator setup and configurations. TAG=agy CONV=aa927cc7-418a-41e3-b658-9b82915e18eb
| self.name = "data_engineering_agent" | ||
| self.endpoint = querygenerator_config.get( | ||
| "endpoint", | ||
| "https://geminidataanalytics.googleapis.com/v1/a2a/projects/bq-dataworkeragent-test/locations/us-west4/agents/dataengineeringagent" |
There was a problem hiding this comment.
no hardcoded endpoints, please have a config with defaults.
| ) | ||
| self.target_workspace = querygenerator_config.get( | ||
| "target_workspace", | ||
| "projects/bq-dataworkeragent-test/locations/us-west4/repositories/agent_demo_dataform/workspaces/james_test_123" |
There was a problem hiding this comment.
- Remove hardcoded test URLs for `endpoint` and `target_workspace` from `DataEngineeringAgentGenerator`. - Raise `ValueError` if either configuration key is missing or empty. TAG=agy CONV=aa927cc7-418a-41e3-b658-9b82915e18eb
TAG=agy CONV=aa927cc7-418a-41e3-b658-9b82915e18eb
- Add a new unit test `test_get_credentials_invalid_scheme` verifying that `GcpAdcCredentialService` raises `ValueError` for unsupported auth schemes. - Replace absolute generator import in `evalbench/generators/models/__init__.py` with a relative one. - Clean up unused imports in `data_engineering_agent.py`. TAG=agy CONV=aa927cc7-418a-41e3-b658-9b82915e18eb
- Revert relative import of `QueryGenerator` back to absolute import `generators.models.generator.QueryGenerator` in `evalbench/generators/models/__init__.py` to prevent runtime package resolution issues. TAG=agy CONV=aa927cc7-418a-41e3-b658-9b82915e18eb
| "Credentials: %s", | ||
| e, | ||
| ) | ||
| return None |
There was a problem hiding this comment.
are we able to talk to the endpoint if we fail to get the credentials? if not, let's allow the error propagate to the caller instead of suppressing it here (same for the general exception below). i.e. remove the try-except
| ) | ||
|
|
||
|
|
||
| def test_data_engineering_agent_generator_setup(): |
There was a problem hiding this comment.
let's also add tests for the bad paths, i.e. when endpoint and/or target_workspace are missing and ensure the value error is being raised
similarly, let's also try to cover the bad paths in GcpAdcCredentialService.get_credentials
- Update `GcpAdcCredentialService` to propagate `DefaultCredentialsError` and `RefreshError` up instead of catching them silently and returning `None`. - Add `test_generator_setup_missing_endpoint` and `test_generator_setup_missing_workspace` in `data_engineering_agent_test.py`. - Add `test_get_credentials_error_resiliency_default` and `test_get_credentials_error_resiliency_refresh` to verify error propagation. TAG=agy CONV=aa927cc7-418a-41e3-b658-9b82915e18eb
No description provided.