
docs: search agent dev note #350

Open
dhruvnathawani wants to merge 3 commits into main from dhruv/devnotes/search-agent


@dhruvnathawani dhruvnathawani commented Feb 23, 2026

Summary

Add a dev note documenting the search agent SFT data pipeline used to generate tool-use trajectories for training Nemotron Super's web browsing capabilities.

What's in the post

  1. Motivation: Why training search agents requires trajectory data capturing the full thought-action-observation loop, not just QA pairs. BrowseComp benchmark context.
  2. Pipeline walkthrough: Wikidata KG random walks (50k seeds) -> two-stage riddle generation (draft + BrowseComp-style obfuscation) via Data Designer LLM columns -> search trajectory rollouts with live Tavily web search via MCP -> post-processing to SFT-ready JSONL
  3. ASCII pipeline diagram showing the 4-stage flow (seed data -> riddle generation -> trajectory rollouts -> SFT dataset)
  4. Example Wikidata paths (NVIDIA -> Jensen Huang -> Oregon State -> Benton County -> Thomas Hart Benton) with draft-to-obfuscated question transformation
  5. Seed filtering heuristics: anti-meta filters, hop range constraints (4-8), generic entity removal
  6. Full trajectory example in OpenAI-messages format showing system prompt, tool calls, tool responses, and final answer
  7. Production yield analysis: 50k seeds -> 37k valid seeds (74%) -> 24k valid questions (65%) -> 7k valid trajectories (29%), 14% end-to-end yield
  8. Correctness challenge: multi-answer validity, stale Wikidata ground truth (U.S. Steel/Nippon Steel example), 27.5% raw accuracy before filtering
  9. Data Designer MCP integration walkthrough: LocalStdioMCPProvider, ToolConfig (allowlists, turn budgets, timeouts), tool_alias + with_trace=TraceType.ALL_MESSAGES for full conversation capture
  10. Collapsible full source script with Tavily MCP server, 3-column DAG (draft -> obfuscated -> agent trajectory), and trace capture
  11. Next steps: scale to 25k questions, push difficulty higher, explore fresher knowledge bases, search RL environment
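The yield figures in item 7 can be checked with a few lines of arithmetic. The sketch below is illustrative only: the stage counts are the rounded thousands quoted above, and the names `stages` and `stage_yields` are hypothetical, not taken from the pipeline's source.

```python
# Stage counts as quoted in item 7 (rounded to thousands in the summary).
stages = [
    ("seeds", 50_000),
    ("valid seeds", 37_000),
    ("valid questions", 24_000),
    ("valid trajectories", 7_000),
]


def stage_yields(stages):
    """Return (name, count, fraction of previous stage) for each stage after the first."""
    rows = []
    for (_, prev_count), (name, count) in zip(stages, stages[1:]):
        rows.append((name, count, count / prev_count))
    return rows


rows = stage_yields(stages)

# End-to-end yield: final stage count divided by the initial seed count.
end_to_end = stages[-1][1] / stages[0][1]

for name, count, frac in rows:
    print(f"{name:>20}: {count:>6,} ({frac:.0%} of previous stage)")
print(f"{'end-to-end':>20}: {end_to_end:.0%}")
```

With these rounded counts, the per-stage yields come out to 74%, 65%, and 29%, and the end-to-end yield to 14%, matching the figures in the summary. The largest drop is at the trajectory stage, where roughly seven in ten valid questions do not produce a valid trajectory.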

Files changed

  1. docs/devnotes/posts/search-agent.md (updated with code dropdown)
  2. docs/devnotes/.authors.yml (pulled latest from main with dnathawani)
  3. mkdocs.yml (added Search Agent to Dev Notes nav)

@dhruvnathawani dhruvnathawani changed the title from "search agent dev notes" to "docs: search agent dev note" Feb 26, 2026
@dhruvnathawani dhruvnathawani marked this pull request as ready for review February 26, 2026 20:30
@dhruvnathawani dhruvnathawani requested a review from a team as a code owner February 26, 2026 20:30
greptile-apps bot commented Feb 26, 2026

Greptile Summary

This PR adds a comprehensive dev note documenting the search agent SFT data pipeline used to generate tool-use trajectories for training Nemotron Super's web browsing capabilities. The document covers the complete four-stage pipeline from Wikidata knowledge graph walks to SFT-ready training data, including detailed explanations, code examples, production statistics, and implementation guidance.

  • Well-structured documentation with clear sections covering motivation, architecture, implementation details, and key takeaways
  • Includes working code examples using Data Designer's API with proper import patterns and MCP integration
  • Provides ASCII diagrams for pipeline visualization and concrete examples throughout
  • Documents production metrics and yield analysis (50k seeds → 7k trajectories at 14% yield)
  • Navigation properly updated in mkdocs.yml to include the new dev note

Confidence Score: 5/5

  • This PR is safe to merge with no risk
  • This is a documentation-only PR: it adds a new dev note and updates the navigation, with no code changes and no runtime behavior modifications. The documentation is well written, and its code examples are accurate and follow project patterns.
  • No files require special attention

Important Files Changed

| Filename | Overview |
| --- | --- |
| docs/devnotes/posts/search-agent.md | New comprehensive dev note documenting the search agent SFT data pipeline with detailed explanations, code examples, and ASCII diagrams |
| mkdocs.yml | Added navigation entry for the new Search Agent dev note |

Last reviewed commit: 0620a60
