Description
Problem
Many question concepts either have no Khan Academy course or link to spurious matches (e.g., similar keywords but wrong theme/topic). When a user clicks the Khan Academy button after answering a question, they may land on irrelevant or empty search results.
Based on exploration, it doesn't appear possible to automatically count Khan Academy search results client-side (no public API for search result counts, and CORS prevents scraping).
Proposed Solution
1. Local validation script
Write a script (scripts/validate_khan_links.py) that:
- Reads all domain JSON files and extracts unique concepts_tested values
- For each concept, runs a search on Khan Academy (e.g., https://www.khanacademy.org/search?page_search_query=<concept>)
- Counts the number of search results (using a headless browser or Khan Academy's internal API, if discoverable)
- Outputs a report: concept → result count → recommended action
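A minimal sketch of the script's core, for discussion rather than as a final implementation. It assumes each domain file looks like {"questions": [{"concepts_tested": [...]}]}, which may not match the real schema. The function names, and the rule that zero results means "generic", are also placeholders. Counting results is left as a pluggable count_results callable, since how to count (headless browser vs. internal API) is still open.

```python
import json
from pathlib import Path
from typing import Callable, Iterable
from urllib.parse import urlencode

SEARCH_URL = "https://www.khanacademy.org/search"


def extract_concepts(domain_files: Iterable[Path]) -> list[str]:
    """Collect the unique concepts_tested values across all domain files."""
    concepts: set[str] = set()
    for path in domain_files:
        data = json.loads(Path(path).read_text(encoding="utf-8"))
        # Assumed layout: {"questions": [{"concepts_tested": [...]}, ...]}
        for question in data.get("questions", []):
            concepts.update(question.get("concepts_tested", []))
    return sorted(concepts)


def search_url(concept: str) -> str:
    """Build the Khan Academy search URL for one concept."""
    return f"{SEARCH_URL}?{urlencode({'page_search_query': concept})}"


def recommend(count: int) -> str:
    """Placeholder rule: any results -> keep search, none -> generic link."""
    return "search" if count > 0 else "generic"


def build_report(
    concepts: Iterable[str], count_results: Callable[[str], int]
) -> list[dict]:
    """Return one row per concept: concept, result count, recommended action."""
    rows = []
    for concept in concepts:
        count = count_results(search_url(concept))
        rows.append(
            {"concept": concept, "results": count, "action": recommend(count)}
        )
    return rows
```

A nonzero result count does not rule out spurious matches, so concepts flagged "search" would likely still need a manual spot check.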
2. Add khan_academy_mode flag to question JSON
For each question, add a field:
{
  "khan_academy_mode": "search" | "generic"
}
- "search": The Khan Academy button initiates a search for the question's specific concept(s) (current behavior)
- "generic": The Khan Academy button links to a generic course page for that sub-domain or domain (e.g., https://www.khanacademy.org/science/physics for physics questions)
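One way the flag could be filled in, sketched in Python on the assumption that the generation pipeline can run a Python step. assign_mode and the result_counts mapping (concept → number of search results, e.g. taken from the validation report) are hypothetical names. The policy shown, "search" only when every concept of the question has at least one result, is just one possible choice.

```python
def assign_mode(question: dict, result_counts: dict[str, int]) -> dict:
    """Return a copy of the question with khan_academy_mode set.

    Assumed policy: use "search" only if the question lists at least one
    concept and every listed concept has a nonzero result count.
    Concepts missing from result_counts are treated as having 0 results.
    """
    concepts = question.get("concepts_tested", [])
    searchable = bool(concepts) and all(
        result_counts.get(concept, 0) > 0 for concept in concepts
    )
    mode = "search" if searchable else "generic"
    return {**question, "khan_academy_mode": mode}


# Example usage with made-up counts
counts = {"entropy": 12, "quantum tunneling": 0}
q1 = assign_mode({"concepts_tested": ["entropy"]}, counts)
q2 = assign_mode({"concepts_tested": ["entropy", "quantum tunneling"]}, counts)
print(q1["khan_academy_mode"], q2["khan_academy_mode"])
```

Defaulting unknown concepts to "generic" keeps the button from landing on an empty search page when the report is out of date.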
3. Update quiz.js to respect the flag
In src/ui/quiz.js, when building the Khan Academy link:
- If khan_academy_mode === "search" → use the current search URL
- If khan_academy_mode === "generic" → link to the pre-configured domain course URL
4. Define generic fallback URLs per domain
In domain JSON or a config file, define fallback Khan Academy URLs:
{
"quantum-physics": "https://www.khanacademy.org/science/physics/quantum-physics",
"astrophysics": "https://www.khanacademy.org/science/cosmology-and-astronomy",
"biology": "https://www.khanacademy.org/science/biology"
}
Tasks
- Write scripts/validate_khan_links.py: local script to check all concepts against Khan Academy search
- Generate report of concepts with 0 results vs. valid results
- Add khan_academy_mode field to question generation pipeline
- Update quiz.js to use the flag when building Khan Academy URLs
- Define generic fallback URLs for each domain
- Re-run validation after question generation is complete
Notes
This should be done after all 50 questions per domain are generated, since the concept list needs to be finalized first.