ARIA — Open Source Voice Assistant

Built for Linux · CPU-Optimized · Fully Offline · Privacy-First

🎯 Overview

ARIA is a lightweight, fully offline voice assistant designed to run on CPU-constrained systems. It combines state-of-the-art open-source models for speech recognition, natural language understanding, and text-to-speech synthesis—all without cloud dependencies.

Key Features:

🔇 Fully Offline – No cloud API calls or internet required
⚡ CPU-Optimized – Runs efficiently on modest hardware
🔒 Privacy-First – Your voice and data stay local
🛠️ Extensible – Easy-to-add custom skills
🚀 Quick Setup – Automated installation script
🐧 Linux Native – Optimized for Linux environments

📦 Technology Stack

Component	Technology	Purpose
Wake Word Detection	OpenWakeWord	Always-listening wake word detection
Speech-to-Text	OpenAI Whisper (base model)	CPU-friendly speech recognition
Language Model	Ollama + Mistral 7B	Reasoning and intent understanding
Text-to-Speech	Piper TTS	Natural voice synthesis
Voice Activity Detection	Silero VAD	Efficient speech endpoint detection
Skills Framework	Python modules	Extensible task execution
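
The components above suggest a linear flow: wake word, voice activity detection, transcription, reasoning, then synthesis. The sketch below shows one way that flow could be wired together. The `Pipeline` class and its field names are illustrative assumptions, not the actual interfaces in core/, and the stages are stand-in lambdas so the example runs without any models installed.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Pipeline:
    """Hypothetical wiring of the stages; each field stands in for a core/ module."""
    detect_wake_word: Callable[[bytes], bool]   # e.g. OpenWakeWord
    trim_speech: Callable[[bytes], bytes]       # e.g. Silero VAD
    transcribe: Callable[[bytes], str]          # e.g. Whisper
    think: Callable[[str], str]                 # e.g. Mistral 7B via Ollama
    speak: Callable[[str], bytes]               # e.g. Piper TTS

    def handle(self, audio: bytes) -> Optional[bytes]:
        """Run one utterance through every stage; return synthesized audio or None."""
        if not self.detect_wake_word(audio):
            return None  # Stay idle until the wake word is heard
        speech = self.trim_speech(audio)
        text = self.transcribe(speech)
        reply = self.think(text)
        return self.speak(reply)


# Stand-in stages (plain byte/string operations instead of real models)
demo = Pipeline(
    detect_wake_word=lambda audio: audio.startswith(b"hey aria"),
    trim_speech=lambda audio: audio.strip(),
    transcribe=lambda audio: audio.decode("utf-8"),
    think=lambda text: f"You said: {text}",
    speak=lambda reply: reply.encode("utf-8"),
)
```

Passing each stage in as a callable keeps the stages swappable, which is convenient when trying a smaller Whisper model or a different TTS voice.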

🚀 Quick Start

Prerequisites

Linux (Ubuntu 20.04+ recommended)
Python 3.8+
~2GB free disk space

Installation

# Clone the repository
git clone https://github.com/HaroonKhalidA/AI-Assistant.git
cd AI-Assistant

# Run automated setup
chmod +x setup.sh
./setup.sh

# Start the assistant
python main.py

That's it! The setup script handles all dependencies, model downloads, and configuration.

📁 Project Structure

AI-Assistant/
├── main.py                      # Application entry point
├── setup.sh                     # Automated installation & setup
├── requirements.txt             # Python dependencies
│
├── config/
│   └── settings.yaml            # Global configuration (models, API keys, features)
│
├── core/                        # Core voice processing pipeline
│   ├── wake_word.py            # Wake word detection engine
│   ├── stt.py                  # Speech-to-text processor
│   ├── tts.py                  # Text-to-speech synthesizer
│   ├── brain.py                # LLM reasoning engine
│   ├── vad.py                  # Voice activity detection
│   └── router.py               # Intent routing & skill dispatcher
│
├── skills/                      # Extensible skill modules
│   ├── __init__.py
│   ├── timer.py                # Timer & alarm functionality
│   ├── weather.py              # Weather information (offline fallback)
│   ├── system_control.py       # System commands (volume, brightness, etc.)
│   ├── web_search.py           # Local search capabilities
│   └── general.py              # General conversation
│
├── models/                      # Auto-downloaded ML models (gitignored)
├── logs/                        # Application logs
└── README.md                    # This file

⚙️ Configuration

Edit config/settings.yaml to customize:

# Wake word
wake_word: "hey aria"
sensitivity: 0.5

# Model selection
stt_model: "base"        # Whisper model size
tts_voice: "en_US"       # TTS voice
brain_model: "mistral"   # Local LLM

# Features
enable_vad: true
enable_web_search: false
log_level: "INFO"
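
Once the YAML file is parsed into a dictionary (for example with PyYAML's `yaml.safe_load`), it can help to validate it before the pipeline starts. The following is a minimal sketch, not ARIA's actual loader: the `Settings` dataclass and `load_settings` function are hypothetical names, the defaults simply mirror the example values above, and the 0 to 1 range for `sensitivity` is an assumption.

```python
from dataclasses import dataclass, fields


@dataclass
class Settings:
    """Typed view of the keys shown in the example settings file."""
    wake_word: str = "hey aria"
    sensitivity: float = 0.5
    stt_model: str = "base"
    tts_voice: str = "en_US"
    brain_model: str = "mistral"
    enable_vad: bool = True
    enable_web_search: bool = False
    log_level: str = "INFO"


def load_settings(raw: dict) -> Settings:
    """Build Settings from a parsed YAML mapping, rejecting unknown keys."""
    known = {f.name for f in fields(Settings)}
    unknown = set(raw) - known
    if unknown:
        raise ValueError(f"Unknown settings: {sorted(unknown)}")
    settings = Settings(**raw)
    # Assumption: sensitivity is a 0-1 threshold; adjust if ARIA uses another scale
    if not 0.0 <= settings.sensitivity <= 1.0:
        raise ValueError("sensitivity must be between 0.0 and 1.0")
    return settings


# `raw` stands in for what yaml.safe_load() would return for a partial file
settings = load_settings({"stt_model": "tiny", "sensitivity": 0.7})
```

Rejecting unknown keys catches typos such as `stt_modle` early, instead of silently falling back to a default.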

🎤 Usage

Start ARIA

python main.py

Voice Commands

Wake word: "Hey ARIA"
Examples:
    "What's the weather?"
    "Set a timer for 10 minutes"
    "Control my system brightness"
    "Tell me about Python"

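As an illustration of how a command such as "Set a timer for 10 minutes" might be turned into something actionable, the sketch below extracts a duration from the transcribed text. This is an assumption about how a timer skill could work, not the code in skills/timer.py; the function name `parse_timer_seconds` and the supported units are hypothetical.

```python
import re
from typing import Optional

# Seconds per unit; the supported units are an assumption for this sketch
UNIT_SECONDS = {"second": 1, "minute": 60, "hour": 3600}


def parse_timer_seconds(text: str) -> Optional[int]:
    """Return the requested duration in seconds, or None if none is found."""
    match = re.search(r"(\d+)\s*(second|minute|hour)s?\b", text.lower())
    if match is None:
        return None
    amount, unit = int(match.group(1)), match.group(2)
    return amount * UNIT_SECONDS[unit]
```

Only digit-based durations are handled here; spelled-out numbers ("ten minutes") would need an extra normalization step.
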
Command Line Options

python main.py --config custom_config.yaml   # Use custom config
python main.py --debug                       # Enable debug logging
python main.py --offline                     # Force offline mode
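
For readers adding their own flags, the three options above map naturally onto Python's standard `argparse` module. The sketch below is one plausible way to declare them; it is not taken from main.py, and the default config path is an assumption based on the project layout.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Declare the three documented flags; help texts mirror the README."""
    parser = argparse.ArgumentParser(prog="main.py", description="ARIA voice assistant")
    # Assumed default: the config file shown in the project structure
    parser.add_argument("--config", default="config/settings.yaml",
                        help="Path to a YAML config file")
    parser.add_argument("--debug", action="store_true",
                        help="Enable debug logging")
    parser.add_argument("--offline", action="store_true",
                        help="Force offline mode")
    return parser


# Parse an explicit argument list so the example runs outside a shell
args = build_parser().parse_args(["--config", "custom_config.yaml", "--debug"])
```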

🛠️ Extending with Custom Skills

Create a new skill in the skills/ directory:

# skills/my_skill.py

class MySkill:
    def __init__(self):
        self.name = "my_skill"
        self.keywords = ["keyword1", "keyword2"]
    
    def execute(self, text: str) -> str:
        """Process the user request and return response"""
        return "Response to user"
    
    def is_applicable(self, text: str) -> bool:
        """Determine if this skill should handle the request"""
        return any(kw in text.lower() for kw in self.keywords)

Then register it in core/router.py:

from skills.my_skill import MySkill

self.skills.append(MySkill())
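
To make the registration step concrete, here is a minimal sketch of how a router could dispatch a request to the first skill whose `is_applicable` returns true. The `Router` and `EchoSkill` classes are hypothetical stand-ins written for this example; the real core/router.py may order, score, or fall back differently.

```python
from typing import Optional


class EchoSkill:
    """Hypothetical skill following the same interface as MySkill."""

    def __init__(self):
        self.name = "echo"
        self.keywords = ["repeat", "echo"]

    def is_applicable(self, text: str) -> bool:
        return any(kw in text.lower() for kw in self.keywords)

    def execute(self, text: str) -> str:
        return f"You said: {text}"


class Router:
    """Minimal dispatcher: the first applicable skill handles the request."""

    def __init__(self):
        self.skills = []

    def route(self, text: str) -> Optional[str]:
        for skill in self.skills:
            if skill.is_applicable(text):
                return skill.execute(text)
        return None  # No skill matched; a real router might fall back to the LLM


router = Router()
router.skills.append(EchoSkill())
```

Because keyword matching is a plain substring test, registration order matters: if two skills share a keyword, the one appended first wins.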

📊 Performance Metrics

Metric	Value
Wake word detection latency	< 100ms
Speech-to-text processing	~5-10s for 10s audio
LLM response generation	2-5s (Mistral 7B, CPU)
Total pipeline latency	~10-20s end-to-end
Memory footprint	~1.5GB (base model)
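
To check how the figures above compare with your own hardware, per-stage timings can be collected with a small context manager. This is a generic sketch using only the standard library; the `timed` helper and the stage names are assumptions, and the `time.sleep` calls are placeholders for real STT and LLM calls.

```python
import time
from contextlib import contextmanager
from typing import Dict, Iterator

timings: Dict[str, float] = {}


@contextmanager
def timed(stage: str) -> Iterator[None]:
    """Record the wall-clock duration of a pipeline stage in `timings`."""
    start = time.perf_counter()
    try:
        yield
    finally:
        timings[stage] = time.perf_counter() - start


# Stand-in workloads; replace the sleeps with real STT / LLM calls
with timed("stt"):
    time.sleep(0.02)
with timed("llm"):
    time.sleep(0.01)

total = sum(timings.values())
for stage, seconds in timings.items():
    print(f"{stage}: {seconds:.3f}s ({seconds / total:.0%} of total)")
```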

Metrics vary based on hardware and model configuration.

🐛 Troubleshooting

Audio input not detected

# List available audio devices
arecord -l

# Update config to use correct device

Slow responses

Reduce model size in settings (e.g., tiny for Whisper)
Lower sensitivity for faster wake word detection
Disable unnecessary skills

Model download fails

# Manually download models
python -m pip install ollama
ollama pull mistral

📚 Dependencies

Key Python packages:

openai-whisper – Speech recognition
piper-tts – Text-to-speech
openWakeWord – Wake word detection
silero-vad – Voice activity detection
pyyaml – Configuration management

See requirements.txt for complete list.

🤝 Contributing

Contributions are welcome! Please:

Fork the repository
Create a feature branch (git checkout -b feature/amazing-feature)
Commit your changes (git commit -m 'Add amazing feature')
Push to the branch (git push origin feature/amazing-feature)
Open a Pull Request

🙏 Acknowledgments

Built with love using:

OpenWakeWord by Tend
Whisper by OpenAI
Ollama for local LLM serving
Piper by Rhasspy
Silero VAD by Snakers4
