Haystack integration #89
Merged
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,3 @@ | ||
| MOSS_PROJECT_ID=your-project-id | ||
| MOSS_PROJECT_KEY=your-project-key | ||
| GEMINI_API_KEY=your-gemini-key |
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,140 @@ | ||
| # Haystack + Moss Cookbook Example | ||
| | ||
| Use [Moss](https://moss.dev) for real-time semantic search in [Haystack](https://haystack.deepset.ai/) RAG pipelines. Moss provides sub-10ms semantic search, and Haystack orchestrates the retrieval-to-generation pipeline. | ||
| | ||
| > **Note:** This is a cookbook example, not a packaged integration. `moss_haystack.py` is a self-contained module you can adapt into your own project. | ||
| | ||
| ## Installation | ||
| | ||
| ```bash | ||
| pip install haystack-ai moss python-dotenv | ||
| ``` | ||
| | ||
| ## Setup | ||
| | ||
| Set your credentials in a `.env` file (see `.env.example`): | ||
| | ||
| ```bash | ||
| MOSS_PROJECT_ID=your-project-id | ||
| MOSS_PROJECT_KEY=your-project-key | ||
| GEMINI_API_KEY=your-gemini-key | ||
| ``` | ||
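Before running the example, it can help to confirm that all three variables are actually set. The following is a minimal sketch using only the standard library; the helper name `missing_credentials` is hypothetical and not part of `moss_haystack.py`. It assumes the variables have already been loaded into the environment (for instance by `python-dotenv`, which the installation step pulls in).

```python
import os

# Variable names taken from .env.example in this PR.
REQUIRED_VARS = ("MOSS_PROJECT_ID", "MOSS_PROJECT_KEY", "GEMINI_API_KEY")


def missing_credentials(env=None):
    """Return the names of required variables that are unset or empty.

    `env` defaults to os.environ; passing a plain dict makes this easy to test.
    """
    env = os.environ if env is None else env
    return [name for name in REQUIRED_VARS if not env.get(name)]


if __name__ == "__main__":
    missing = missing_credentials()
    if missing:
        print("Missing credentials:", ", ".join(missing))
    else:
        print("All credentials are set.")
```

Checking up front gives a clearer error than a failed API call later in the pipeline.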
| | ||
| ## Quick Start | ||
| | ||
| ```python | ||
| from haystack import Document | ||
| from moss_haystack import MossDocumentStore, MossRetriever | ||
| | ||
| store = MossDocumentStore(index_name="knowledge-base") | ||
| store.write_documents([ | ||
|     Document(id="1", content="I wake up at 6:30 AM on weekdays."), | ||
|     Document(id="2", content="Cold showers improve circulation and alertness."), | ||
| ]) | ||
| | ||
| retriever = MossRetriever(document_store=store, top_k=3) | ||
| retriever.load_index() | ||
| result = retriever.run(query="when do I wake up?") | ||
| | ||
| for doc in result["documents"]: | ||
|     print(f"[{doc.score:.2f}] {doc.content}") | ||
| ``` | ||
| | ||
| ## Demo: Multi-Index Life Assistant | ||
| | ||
| The included `example_usage.py` runs an interactive CLI life assistant with **keyword-based routing** across two Moss indexes: | ||
| | ||
| ``` | ||
| User Question | ||
|      | | ||
|      v | ||
| Keyword Router | ||
|      | | ||
|      +-- personal ("my", "I", "me")  --> MossRetriever (life-personal) | ||
|      |                                        | | ||
|      +-- general ("how to", "tips")  --> MossRetriever (life-general) | ||
|      |                                        | | ||
|      +-- combined (both or neither)  --> Both retrievers → DocumentJoiner | ||
|                                               | | ||
|                                               v | ||
|                                      PromptBuilder → Gemini LLM | ||
|                                               | | ||
|                                               v | ||
|                                         Final Answer | ||
| ``` | ||
| | ||
| ### How it works | ||
| | ||
| 1. **Two Moss indexes** with synthetic data: | ||
|    - `life-personal` (15 docs): daily routines, fitness schedule, diet, sleep habits | ||
|    - `life-general` (15 docs): tips, research, and advice on health, fitness, productivity | ||
| | ||
| 2. **Keyword router** classifies queries: | ||
|    - Personal pronouns ("my", "I", "me") → search personal index | ||
|    - General keywords ("how to", "benefits", "tips") → search general index | ||
|    - Both or neither → search both indexes and join results | ||
| | ||
| 3. **Haystack RAG pipeline** retrieves docs → builds prompt → generates answer via Gemini | ||
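The routing and joining steps above can be sketched in plain Python. This is an illustrative approximation, not the code in `example_usage.py`: the function names `route` and `join_results` are hypothetical, the keyword lists contain only the words quoted in this README, and results are modelled as `(doc_id, score)` tuples rather than Haystack `Document` objects. The join keeps the highest score per document ID, which is one plausible way to merge two result lists.

```python
import re

# Keyword lists copied from the README; the real script may use more.
PERSONAL_WORDS = {"my", "i", "me"}
GENERAL_PHRASES = ("how to", "benefits", "tips")


def route(query: str) -> str:
    """Classify a query as 'personal', 'general', or 'combined'."""
    text = query.lower()
    # Match whole words so that "time" does not count as "me".
    words = set(re.findall(r"[a-z']+", text))
    is_personal = bool(words & PERSONAL_WORDS)
    is_general = any(phrase in text for phrase in GENERAL_PHRASES)
    if is_personal and not is_general:
        return "personal"
    if is_general and not is_personal:
        return "general"
    return "combined"  # both kinds of keyword, or neither


def join_results(*result_lists):
    """Merge (doc_id, score) lists, keeping the best score per document."""
    best = {}
    for results in result_lists:
        for doc_id, score in results:
            if doc_id not in best or score > best[doc_id]:
                best[doc_id] = score
    # Highest-scoring documents first.
    return sorted(best.items(), key=lambda item: item[1], reverse=True)
```

With only these keywords, "Should I change my morning routine?" would be classified as `personal`, whereas the demo transcript shows it routed to `combined`, so the actual router presumably matches a broader keyword set.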
| | ||
| ### Run the demo | ||
| | ||
| ```bash | ||
| cd examples/cookbook/haystack | ||
| python example_usage.py | ||
| ``` | ||
| | ||
| ``` | ||
| === Life Assistant (Haystack + Moss) === | ||
| Ask about your habits or get general advice. | ||
| Type 'quit' to exit. | ||
| | ||
| You: What is my gym routine? | ||
| [Routed to: personal] | ||
| Assistant: You go to the gym Monday, Wednesday, and Friday... | ||
| | ||
| You: What are the benefits of cold showers? | ||
| [Routed to: general] | ||
| Assistant: Cold exposure therapy benefits include improved circulation... | ||
| | ||
| You: Should I change my morning routine? | ||
| [Routed to: combined] | ||
| Assistant: Your current morning routine includes yoga and lemon water... | ||
| ``` | ||
| | ||
| ## Components | ||
| | ||
| ### MossDocumentStore | ||
| | ||
| Implements Haystack's `DocumentStore` protocol. Creates its own `MossClient` from credentials. | ||
| | ||
| | Method | Description | | ||
| |--------|-------------| | ||
| | `write_documents(docs, policy)` | Write documents. The first call creates the index; subsequent calls upsert. | | ||
| | `count_documents()` | Return document count | | ||
| | `delete_documents(ids)` | Delete documents by ID | | ||
| | `load_index()` | Download index for fast local queries | | ||
| | ||
| ### MossRetriever | ||
| | ||
| Haystack `@component` for semantic search. | ||
| | ||
| | Parameter | Default | Description | | ||
| |-----------|---------|-------------| | ||
| | `document_store` | required | MossDocumentStore instance | | ||
| | `top_k` | 5 | Number of results | | ||
| | `alpha` | 0.8 | Hybrid search balance (0=keyword, 1=semantic) | | ||
| | ||
| | Method | Description | | ||
| |--------|-------------| | ||
| | `load_index()` | Load Moss index for fast local queries | | ||
| | `run(query, top_k)` | Search and return `{"documents": list[Document]}` | | ||
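The `alpha` parameter is described only as a balance between keyword (0) and semantic (1) search. One common way to realise such a balance is linear interpolation between the two scores. The sketch below illustrates that convention; it is an assumption for explanation only, since this PR does not document how Moss combines scores internally, and `blend_scores` is a hypothetical name.

```python
def blend_scores(semantic: float, keyword: float, alpha: float = 0.8) -> float:
    """Linearly interpolate between a keyword score and a semantic score.

    alpha=0 returns the keyword score alone and alpha=1 the semantic score
    alone. This is an assumed formula, not Moss's documented behavior.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must be between 0 and 1")
    return alpha * semantic + (1.0 - alpha) * keyword
```

Under this reading, the default of 0.8 weights semantic similarity four times as heavily as keyword overlap.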
| | ||
| ## Files | ||
| | ||
| | File | Description | | ||
| |------|-------------| | ||
| | `moss_haystack.py` | MossDocumentStore + MossRetriever implementation | | ||
| | `example_usage.py` | Multi-index life assistant with keyword routing | | ||
| | `data/` | Synthetic data: `personal_habits.json`, `general_knowledge.json` | | ||
| | `test_live.py` | Live platform tests | | ||
| | `.env.example` | Template for required environment variables | | ||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,77 @@ | ||
| [ | ||
| { | ||
| "id": "gen-001", | ||
| "text": "The Pomodoro Technique was invented by Francesco Cirillo in the late 1980s. It uses a timer to break work into 25-minute intervals separated by short breaks. Named after a tomato-shaped kitchen timer.", | ||
| "metadata": {"category": "productivity", "topic": "techniques"} | ||
| }, | ||
| { | ||
| "id": "gen-002", | ||
| "text": "Intermittent fasting 16:8 means fasting for 16 hours and eating within an 8-hour window. Studies show it can improve insulin sensitivity, reduce inflammation, and support weight management.", | ||
| "metadata": {"category": "health", "topic": "nutrition"} | ||
| }, | ||
| { | ||
| "id": "gen-003", | ||
| "text": "A good 5K time for a beginner runner is 30-35 minutes. Intermediate runners aim for 22-25 minutes. Elite runners complete it under 17 minutes. Consistent training can improve times by 1-2 minutes per month.", | ||
| "metadata": {"category": "fitness", "topic": "running"} | ||
| }, | ||
| { | ||
| "id": "gen-004", | ||
| "text": "Cold exposure therapy benefits include improved circulation, reduced inflammation, enhanced immune function, and increased mental resilience. Start with 30 seconds and gradually increase to 2-5 minutes.", | ||
| "metadata": {"category": "wellness", "topic": "recovery"} | ||
| }, | ||
| { | ||
| "id": "gen-005", | ||
| "text": "To build a consistent meditation habit, start with just 5 minutes daily. Guided apps like Headspace, Calm, or Insight Timer can help. Morning meditation pairs well with a morning routine for consistency.", | ||
| "metadata": {"category": "wellness", "topic": "meditation"} | ||
| }, | ||
| { | ||
| "id": "gen-006", | ||
| "text": "The ideal daily water intake is approximately 3.7 liters for men and 2.7 liters for women, including water from food. Active people and those in hot climates need more. Signs of dehydration include dark urine and fatigue.", | ||
| "metadata": {"category": "health", "topic": "hydration"} | ||
| }, | ||
| { | ||
| "id": "gen-007", | ||
| "text": "Meal prepping saves an average of 5-7 hours per week on cooking. Best containers are glass with snap-lock lids. Most prepped meals stay fresh for 4-5 days in the fridge. Chicken, rice, and vegetables are the most popular combo.", | ||
| "metadata": {"category": "nutrition", "topic": "meal-prep"} | ||
| }, | ||
| { | ||
| "id": "gen-008", | ||
| "text": "The best time to exercise for muscle growth is between 2-6 PM when body temperature peaks and testosterone levels are higher. Morning exercise is better for fat burning and establishing consistency.", | ||
| "metadata": {"category": "fitness", "topic": "timing"} | ||
| }, | ||
| { | ||
| "id": "gen-009", | ||
| "text": "Sleep hygiene tips: keep your room at 65-68F (18-20C), avoid screens 1 hour before bed, maintain a consistent schedule, and limit caffeine after noon. 7-9 hours of sleep is optimal for adults.", | ||
| "metadata": {"category": "health", "topic": "sleep"} | ||
| }, | ||
| { | ||
| "id": "gen-010", | ||
| "text": "Walking after meals for 15-30 minutes can lower blood sugar by up to 30%. It also improves digestion and reduces bloating. Even a slow-paced walk is effective.", | ||
| "metadata": {"category": "health", "topic": "walking"} | ||
| }, | ||
| { | ||
| "id": "gen-011", | ||
| "text": "The weekly review is a core practice from Getting Things Done (GTD) by David Allen. It involves reviewing all projects, clearing inboxes, and planning the next week. Best done on Sunday evenings.", | ||
| "metadata": {"category": "productivity", "topic": "planning"} | ||
| }, | ||
| { | ||
| "id": "gen-012", | ||
| "text": "Yoga for beginners: start with 15-20 minutes of basic poses — downward dog, warrior I and II, child's pose, and cat-cow. Consistency matters more than duration. Morning yoga improves flexibility and reduces stress.", | ||
| "metadata": {"category": "fitness", "topic": "yoga"} | ||
| }, | ||
| { | ||
| "id": "gen-013", | ||
| "text": "Oat milk has become the most popular plant-based milk alternative. It's naturally sweet, froths well for coffee, and has more fiber than almond milk. However, it's higher in calories and carbs than almond milk.", | ||
| "metadata": {"category": "nutrition", "topic": "alternatives"} | ||
| }, | ||
| { | ||
| "id": "gen-014", | ||
| "text": "Reading before bed improves sleep quality and reduces stress by 68% according to a University of Sussex study. Physical books are better than screens. Even 6 minutes of reading can lower heart rate and ease tension.", | ||
| "metadata": {"category": "wellness", "topic": "reading"} | ||
| }, | ||
| { | ||
| "id": "gen-015", | ||
| "text": "The two-alarm technique: set alarms 5 minutes apart. The first alarm starts your wake-up process, the second confirms it. Place the second alarm across the room to force getting out of bed.", | ||
| "metadata": {"category": "productivity", "topic": "wake-up"} | ||
| } | ||
| ] |
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,77 @@ | ||
| [ | ||
| { | ||
| "id": "habit-001", | ||
| "text": "I wake up at 6:30 AM on weekdays and 8:00 AM on weekends. I set two alarms 5 minutes apart.", | ||
| "metadata": {"category": "routine", "time": "morning"} | ||
| }, | ||
| { | ||
| "id": "habit-002", | ||
| "text": "My morning routine: wake up, drink a glass of warm lemon water, 20 minutes of yoga, then shower. Takes about 45 minutes total.", | ||
| "metadata": {"category": "routine", "time": "morning"} | ||
| }, | ||
| { | ||
| "id": "habit-003", | ||
| "text": "I do intermittent fasting 16:8. My eating window is 12 PM to 8 PM. I skip breakfast and have black coffee instead.", | ||
| "metadata": {"category": "diet", "time": "daily"} | ||
| }, | ||
| { | ||
| "id": "habit-004", | ||
| "text": "I go to the gym Monday, Wednesday, and Friday. Upper body on Monday, legs on Wednesday, full body on Friday. Each session is 45-60 minutes.", | ||
| "metadata": {"category": "fitness", "time": "weekly"} | ||
| }, | ||
| { | ||
| "id": "habit-005", | ||
| "text": "I run 5K every Tuesday and Thursday morning at 7 AM in the park near my house. My current best time is 24 minutes.", | ||
| "metadata": {"category": "fitness", "time": "weekly"} | ||
| }, | ||
| { | ||
| "id": "habit-006", | ||
| "text": "I read for 30 minutes before bed every night. Currently reading non-fiction books about psychology and habits.", | ||
| "metadata": {"category": "routine", "time": "evening"} | ||
| }, | ||
| { | ||
| "id": "habit-007", | ||
| "text": "I meditate for 10 minutes every morning using the Headspace app. I prefer guided meditation focused on focus and clarity.", | ||
| "metadata": {"category": "wellness", "time": "morning"} | ||
| }, | ||
| { | ||
| "id": "habit-008", | ||
| "text": "I meal prep every Sunday for the week. Usually cook chicken, rice, and roasted vegetables. Takes about 2 hours.", | ||
| "metadata": {"category": "diet", "time": "weekly"} | ||
| }, | ||
| { | ||
| "id": "habit-009", | ||
| "text": "I drink at least 3 liters of water daily. I use a marked water bottle to track intake throughout the day.", | ||
| "metadata": {"category": "health", "time": "daily"} | ||
| }, | ||
| { | ||
| "id": "habit-010", | ||
| "text": "My sleep schedule is 10:30 PM to 6:30 AM. I use night mode on all devices after 9 PM and avoid screens 30 minutes before bed.", | ||
| "metadata": {"category": "routine", "time": "evening"} | ||
| }, | ||
| { | ||
| "id": "habit-011", | ||
| "text": "I take a 15-minute walk after lunch every day. It helps with digestion and gives me a mental break from work.", | ||
| "metadata": {"category": "wellness", "time": "afternoon"} | ||
| }, | ||
| { | ||
| "id": "habit-012", | ||
| "text": "I do a weekly review every Sunday evening. I plan the upcoming week, review goals, and journal about what went well and what to improve.", | ||
| "metadata": {"category": "productivity", "time": "weekly"} | ||
| }, | ||
| { | ||
| "id": "habit-013", | ||
| "text": "I limit coffee to 2 cups per day, both before noon. I switched from regular milk to oat milk six months ago.", | ||
| "metadata": {"category": "diet", "time": "daily"} | ||
| }, | ||
| { | ||
| "id": "habit-014", | ||
| "text": "I have taken cold showers every morning for the last 3 months. Started with 30 seconds, now up to 2 minutes. Great for alertness.", | ||
| "metadata": {"category": "wellness", "time": "morning"} | ||
| }, | ||
| { | ||
| "id": "habit-015", | ||
| "text": "I use the Pomodoro technique for work: 25 minutes focused work, 5 minute break. I do 8 pomodoros on a productive day.", | ||
| "metadata": {"category": "productivity", "time": "daily"} | ||
| } | ||
| ] |
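Both data files store each record with `id`, `text`, and `metadata` keys, while the Quick Start builds Haystack documents with `id` and `content`. A loader therefore has to rename fields. The sketch below shows one way to do that with the standard library; the function name is hypothetical, and mapping `metadata` to a `meta` keyword is an assumption about how `example_usage.py` constructs its `Document` objects.

```python
import json


def records_to_document_kwargs(raw_json: str) -> list[dict]:
    """Convert the JSON records into keyword arguments for a Document.

    Each returned dict is intended to be used as Document(**kwargs);
    that usage is assumed here and not taken from the PR's code.
    """
    records = json.loads(raw_json)
    return [
        {
            "id": record["id"],
            "content": record["text"],           # "text" becomes "content"
            "meta": record.get("metadata", {}),  # default to empty metadata
        }
        for record in records
    ]
```

Returning plain dicts keeps the conversion testable without installing Haystack.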
Can you please add a `pyproject.toml`?
Done