Vocalize is an enterprise-grade speech analysis platform that provides real-time fluency scoring, words-per-minute (WPM) tracking, and high-fidelity transcription.
# Clone the repository
git clone <your-repo-url>
cd TTS
# Install Python backend dependencies
pip install -r requirements.txt
# Install Frontend dependencies
cd frontend
npm install
cd ..
Create a .env file in the root directory:
GOOGLE_API_KEY=your_google_cloud_stt_api_key
Start the backend from the root directory:
python -m uvicorn backend.main:app --reload
In a second terminal, start the frontend:
cd frontend
npm run dev
Visit http://localhost:3000 to start analyzing.
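The .env file above is a plain list of KEY=VALUE lines. The sketch below shows one way a backend could read GOOGLE_API_KEY from it using only the standard library. It is an illustration under assumptions, not the project's actual loading code: `load_env_file` and `get_google_api_key` are hypothetical helper names, and the real backend may rely on a library such as python-dotenv instead.

```python
import os
from pathlib import Path


def load_env_file(path=".env"):
    """Parse KEY=VALUE lines from a .env file into a dict (hypothetical helper)."""
    values = {}
    env_path = Path(path)
    if not env_path.exists():
        return values
    for raw_line in env_path.read_text(encoding="utf-8").splitlines():
        line = raw_line.strip()
        # Skip blank lines, comments, and lines without an assignment.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop surrounding whitespace and optional quotes around the value.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def get_google_api_key(path=".env"):
    """Return GOOGLE_API_KEY from the environment, falling back to the .env file."""
    key = os.environ.get("GOOGLE_API_KEY") or load_env_file(path).get("GOOGLE_API_KEY")
    if not key:
        # Fail early with a clear message instead of failing on the first API call.
        raise RuntimeError("GOOGLE_API_KEY is not set; add it to .env or the environment")
    return key
```

Checking the process environment first means the same code works in production, where the key is set as an environment variable and no .env file is present.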
- Real-time Recording: Capture high-quality audio directly from the browser.
- Fluency Scoring: Automated algorithm to evaluate speech flow and pace.
- WPM Analytics: Instant calculation of speaking rate in words per minute.
- Minimalist UI: Clean, Notion-inspired monochrome interface for a professional demo experience.
- Auto-Conversion: Built-in audio processing to ensure compatibility with Google Cloud STT.
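To make the WPM and fluency features more concrete, here is a minimal sketch of how a speaking rate could be derived from a transcript and a recording length, and how it could feed a simple pace score. The function names, the 120 to 160 WPM target band, and the linear penalty are all assumptions chosen for illustration. The scoring logic in evaluation_engine/ is not described here and may differ.

```python
def words_per_minute(transcript: str, duration_seconds: float) -> float:
    """Speaking rate: whitespace-separated words divided by minutes of audio."""
    if duration_seconds <= 0:
        raise ValueError("duration_seconds must be positive")
    word_count = len(transcript.split())
    return word_count * 60.0 / duration_seconds


def pace_score(wpm: float, target_low: float = 120.0, target_high: float = 160.0) -> float:
    """Hypothetical 0-100 score: 100 inside the target band, minus 1 point per WPM outside it."""
    if target_low <= wpm <= target_high:
        return 100.0
    # Distance from the nearest edge of the target band.
    distance = target_low - wpm if wpm < target_low else wpm - target_high
    return max(0.0, 100.0 - distance)


if __name__ == "__main__":
    rate = words_per_minute("one two three four", 2.0)
    print(rate, pace_score(rate))
```

A real fluency metric would likely also account for pauses and filler words; this sketch covers pace only.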
Project structure:
- frontend/: Next.js 14 web application.
- backend/: FastAPI server for audio processing.
- evaluation_engine/: Core logic for speech-to-text and fluency metrics.
- recordings/: Temporary storage for processed audio.
- CODE_FLOW.md: Detailed guide on code flow and architecture.
- DEPLOYMENT_HISTORY.md: Troubleshooting log for deployment errors.
Command-line scripts:
- record_live.py: Record and analyze directly from the terminal.
- simple_test.py: Test the pipeline with pre-recorded audio files.
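When testing with pre-recorded files, it can help to check whether a WAV file already matches the format sent to the speech API or still needs conversion. The sketch below inspects a WAV header with Python's standard `wave` module. It assumes a target of mono, 16-bit PCM at 16 kHz, a commonly used configuration for Google Cloud STT; the format this project actually converts to is not stated here, and `describe_wav` and `needs_conversion` are hypothetical helpers.

```python
import wave


def describe_wav(path):
    """Read basic format information from a WAV file header."""
    with wave.open(path, "rb") as wav_file:
        frames = wav_file.getnframes()
        rate = wav_file.getframerate()
        return {
            "channels": wav_file.getnchannels(),
            "sample_width_bytes": wav_file.getsampwidth(),
            "sample_rate_hz": rate,
            "duration_seconds": frames / rate,
        }


def needs_conversion(info, channels=1, sample_width_bytes=2, sample_rate_hz=16000):
    """Return True if the file differs from the assumed target format."""
    return (
        info["channels"] != channels
        or info["sample_width_bytes"] != sample_width_bytes
        or info["sample_rate_hz"] != sample_rate_hz
    )
```

The `duration_seconds` value is also the denominator needed for a words-per-minute calculation, so a check like this could run once per file before transcription.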
The frontend is optimized for Vercel. Connect your GitHub repository and set the project's Root Directory to the frontend folder, or keep the default root if you deploy the repository as a monorepo.
Deploy the FastAPI backend to services like Heroku, Render, or Railway. Ensure the GOOGLE_API_KEY environment variable is set in your production environment.