The Clio Voice Notes API provides a RESTful interface for an AI-powered voice transcription and note-taking platform. The API lets users upload recorded audio, have it transcribed automatically by OpenAI's Whisper with intelligent text formatting, manage notes with tags, search and filter notes, and retry failed transcriptions.
- Development: http://localhost:8011/api/
- Production: https://your-domain.com/api/
The API uses JWT (JSON Web Token) authentication with access and refresh tokens.
- Register or log in to get tokens
- Include the access token in the Authorization header: Bearer <access_token>
- Refresh tokens when the access token expires
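The authentication steps above can be sketched in Python with only the standard library. The helper name `authed_request` and the placeholder token are illustrative assumptions, not part of the API; the request is only built here, not sent.

```python
import urllib.request

# Development base URL taken from this document.
BASE_URL = "http://localhost:8011/api"


def authed_request(path: str, access_token: str, method: str = "GET") -> urllib.request.Request:
    """Build (but do not send) a request that carries the JWT access token."""
    return urllib.request.Request(
        BASE_URL + path,
        method=method,
        headers={"Authorization": f"Bearer {access_token}"},
    )


# Hypothetical usage: the token would come from /auth/login/ or /auth/register/.
req = authed_request("/notes/", "<access_token>")
```

Passing `req` to `urllib.request.urlopen` would perform the call; a 401 response is the signal to refresh the access token and retry.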
POST /auth/register/
Request Body:
{
"username": "john_doe",
"email": "john@example.com",
"first_name": "John",
"last_name": "Doe",
"password": "SecurePass123!",
"password_confirm": "SecurePass123!"
}
Response (201 Created):
{
"success": true,
"message": "User registered successfully",
"data": {
"user": {
"id": 1,
"username": "john_doe",
"email": "john@example.com",
"first_name": "John",
"last_name": "Doe",
"date_joined": "2024-01-15T10:30:00Z",
"profile": {
"username": "john_doe",
"email": "john@example.com",
"preferred_language": "en-US",
"audio_quality": "high",
"storage_quota_mb": 1000,
"storage_used_mb": 0.0,
"storage_percentage": 0.0,
"created_at": "2024-01-15T10:30:00Z"
}
},
"tokens": {
"refresh": "eyJ0eXAiOiJKV1QiLCJhbGciOiJIUzI1NiJ9...",
"access": "eyJ0eXAiOiJKV1QiLCJhbGciOiJIUzI1NiJ9..."
}
}
}
POST /auth/login/
Request Body:
{
"username": "john_doe",
"password": "SecurePass123!"
}
Response (200 OK):
{
"refresh": "eyJ0eXAiOiJKV1QiLCJhbGciOiJIUzI1NiJ9...",
"access": "eyJ0eXAiOiJKV1QiLCJhbGciOiJIUzI1NiJ9..."
}
POST /auth/refresh/
Request Body:
{
"refresh": "eyJ0eXAiOiJKV1QiLCJhbGciOiJIUzI1NiJ9..."
}
POST /auth/logout/
Request Body:
{
"refresh_token": "eyJ0eXAiOiJKV1QiLCJhbGciOiJIUzI1NiJ9..."
}
GET /auth/profile/
PUT /auth/profile/
PATCH /auth/profile/
GET Response:
{
"success": true,
"data": {
"username": "john_doe",
"email": "john@example.com",
"first_name": "John",
"last_name": "Doe",
"preferred_language": "en-US",
"audio_quality": "high",
"storage_quota_mb": 1000,
"storage_used_mb": 45.7,
"storage_percentage": 4.6,
"created_at": "2024-01-15T10:30:00Z"
}
}
GET /notes/
POST /notes/
GET Parameters:
- page - Page number (default: 1)
- page_size - Items per page (default: 20)
- search - Search in title and transcription
- status - Filter by status (processing, completed, failed)
- language_detected - Filter by detected language
- is_favorite - Filter favorites (true/false)
- tags - Filter by tag IDs (comma-separated)
- ordering - Sort by field (created_at, updated_at, title, duration)
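The query parameters above can be combined into a list URL as in the following sketch. The function name `notes_list_url` is hypothetical; the encoding of booleans as true/false and of tag IDs as a comma-separated string follows the parameter descriptions.

```python
from urllib.parse import urlencode


def notes_list_url(base_url: str, **filters) -> str:
    """Build a GET /notes/ URL from the documented filter parameters."""
    params = {}
    for key, value in filters.items():
        if value is None:
            continue  # skip filters that were not set
        if isinstance(value, bool):
            value = "true" if value else "false"
        elif isinstance(value, (list, tuple)):
            value = ",".join(str(v) for v in value)  # tags=1,2
        params[key] = value
    # Keep commas readable instead of percent-encoding them.
    query = urlencode(params, safe=",")
    return f"{base_url}/notes/?{query}" if query else f"{base_url}/notes/"


url = notes_list_url(
    "http://localhost:8011/api",
    status="completed",
    is_favorite=True,
    tags=[1, 2],
    ordering="created_at",
)
```

The paginated response carries `next` and `previous` links, so a client can follow `next` until it is null rather than computing page numbers itself.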
GET Response (200 OK):
{
"count": 25,
"next": "http://localhost:8011/api/notes/?page=2",
"previous": null,
"results": [
{
"id": 1,
"title": "Meeting notes from client call",
"username": "john_doe",
"status": "completed",
"duration": "00:05:30",
"file_size_mb": 8.5,
"language_detected": "en",
"confidence_score": 0.95,
"is_favorite": true,
"tags": [
{
"id": 1,
"name": "work",
"color": "#3B82F6",
"created_at": "2024-01-15T10:30:00Z"
}
],
"created_at": "2024-01-15T14:22:00Z",
"updated_at": "2024-01-15T14:25:00Z"
}
]
}
POST Request (multipart/form-data):
{
"audio_file": "<binary_audio_file>",
"title": "Meeting notes",
"tag_ids": [1, 2]
}
POST Response (201 Created):
{
"success": true,
"message": "Voice note created successfully. Transcription in progress.",
"data": {
"id": 1,
"title": "Meeting notes",
"transcription": "",
"username": "john_doe",
"audio_file": "/media/audio/1/meeting_notes.wav",
"audio_url": "http://localhost:8011/media/audio/1/meeting_notes.wav",
"duration": null,
"file_size_mb": 8.5,
"language_detected": "auto",
"confidence_score": null,
"status": "processing",
"error_message": "",
"is_favorite": false,
"tags": [],
"segments": [],
"created_at": "2024-01-15T14:22:00Z",
"updated_at": "2024-01-15T14:22:00Z"
}
}
GET /notes/{id}/
PUT /notes/{id}/
PATCH /notes/{id}/
DELETE /notes/{id}/
GET Response (200 OK):
{
"id": 1,
"title": "Meeting notes from client call",
"transcription": "Hello everyone, this is our weekly client meeting.\n\nWe discussed the project timeline and the key deliverables for next month. The client seemed satisfied with our progress and approved the next phase of development.",
"username": "john_doe",
"audio_file": "/media/audio/1/meeting_notes.wav",
"audio_url": "http://localhost:8011/media/audio/1/meeting_notes.wav",
"duration": "00:05:30",
"file_size_mb": 8.5,
"language_detected": "en",
"confidence_score": 0.95,
"status": "completed",
"error_message": "",
"is_favorite": true,
"tags": [
{
"id": 1,
"name": "work",
"color": "#3B82F6",
"created_at": "2024-01-15T10:30:00Z"
}
],
"segments": [
{
"id": 1,
"start_time": 0.0,
"end_time": 3.2,
"duration": 3.2,
"text": "Hello everyone, this is our weekly client meeting.",
"confidence": 0.98,
"speaker_id": ""
},
{
"id": 2,
"start_time": 3.2,
"end_time": 8.5,
"duration": 5.3,
"text": "We discussed the project timeline and the key deliverables for next month.",
"confidence": 0.94,
"speaker_id": ""
}
],
"created_at": "2024-01-15T14:22:00Z",
"updated_at": "2024-01-15T14:25:00Z"
}
PUT/PATCH Request:
{
"title": "Updated meeting notes",
"transcription": "Updated transcription text",
"is_favorite": true,
"tag_ids": [1, 3]
}
DELETE Response (200 OK):
{
"success": true,
"message": "Voice note deleted successfully"
}
POST /notes/{id}/retranscribe/
This endpoint allows users to retry transcription for failed voice notes or re-transcribe with different language settings.
Request Body:
{
"language": "en"
}
Response (200 OK):
{
"success": true,
"message": "Re-transcription started successfully",
"data": {
"id": 1,
"title": "Meeting notes from client call",
"transcription": "",
"status": "processing",
"language_detected": "en",
"error_message": "",
"updated_at": "2024-01-15T15:30:00Z"
}
}
Error Response (400 Bad Request):
{
"success": false,
"message": "Cannot retranscribe note that is currently processing",
"errors": {
"status": ["Note must be in 'failed' or 'completed' status to retranscribe"]
}
}
GET /tags/
POST /tags/
GET Response (200 OK):
[
{
"id": 1,
"name": "work",
"color": "#3B82F6",
"created_at": "2024-01-15T10:30:00Z"
},
{
"id": 2,
"name": "personal",
"color": "#EF4444",
"created_at": "2024-01-15T10:31:00Z"
}
]
POST Request:
{
"name": "meetings",
"color": "#10B981"
}
GET /tags/{id}/
PUT /tags/{id}/
PATCH /tags/{id}/
DELETE /tags/{id}/
POST /transcribe/
Request (multipart/form-data):
{
"audio_file": "<binary_audio_file>",
"language": "en"
}
Response (200 OK):
{
"success": true,
"data": {
"transcription": "This is the transcribed text from the audio file with intelligent formatting.\n\nParagraphs are automatically created based on natural speech patterns and timing.",
"language": "en",
"duration": 30.5,
"confidence": 0.92,
"segments": [
{
"start_time": 0.0,
"end_time": 15.2,
"text": "This is the transcribed text from the audio file with intelligent formatting.",
"confidence": 0.95
},
{
"start_time": 16.1,
"end_time": 30.5,
"text": "Paragraphs are automatically created based on natural speech patterns and timing.",
"confidence": 0.89
}
]
}
}
GET /stats/
Response (200 OK):
{
"success": true,
"data": {
"total_notes": 15,
"completed_notes": 12,
"processing_notes": 2,
"failed_notes": 1,
"favorite_notes": 5,
"total_duration_seconds": 1850.5,
"languages_used": ["en", "es", "fr"],
"storage_used_mb": 245.7,
"storage_quota_mb": 1000,
"storage_percentage": 24.6,
"success_rate": 85.7,
"average_confidence": 0.91
}
}
{
"id": "integer",
"user": "foreign_key",
"title": "string(255)",
"transcription": "text",
"audio_file": "file_field",
"duration": "duration",
"file_size_bytes": "positive_big_integer",
"language_detected": "string(10)",
"confidence_score": "float(0.0-1.0)",
"status": "string(processing|completed|failed)",
"error_message": "text",
"tags": "many_to_many",
"is_favorite": "boolean",
"created_at": "datetime",
"updated_at": "datetime"
}
{
"id": "integer",
"name": "string(50)",
"color": "string(7)",
"created_at": "datetime"
}
{
"id": "integer",
"voice_note": "foreign_key",
"start_time": "float",
"end_time": "float",
"text": "text",
"confidence": "float",
"speaker_id": "string(50)"
}
{
"id": "integer",
"user": "one_to_one",
"preferred_language": "string(10)",
"audio_quality": "string(10)",
"storage_quota_mb": "positive_integer",
"storage_used_mb": "float",
"created_at": "datetime",
"updated_at": "datetime"
}
Clio uses OpenAI's Whisper API with configurable settings:
Environment Variables:
WHISPER_MODEL=whisper-1
WHISPER_TEMPERATURE=0
WHISPER_FORMAT_TEXT=true
WHISPER_PARAGRAPH_BREAK_SECONDS=2.0
WHISPER_MAX_SENTENCE_LENGTH=150
Features:
- Model Selection: Choose between Whisper models
- Temperature Control: Adjust randomness in transcription
- Text Formatting: Automatic paragraph creation from wall-of-text
- Timing Analysis: Use segment timing for intelligent formatting
- Confidence Scoring: Track transcription accuracy
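As a rough illustration of how these environment variables might be read, the sketch below parses them into a typed settings object. The `WhisperSettings` class and `load_whisper_settings` function are hypothetical names, and using the values listed above as fallbacks is an assumption; Clio's actual settings loader may differ.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class WhisperSettings:
    model: str
    temperature: float
    format_text: bool
    paragraph_break_seconds: float
    max_sentence_length: int


def load_whisper_settings(env=None) -> WhisperSettings:
    """Read the WHISPER_* variables, falling back to the values shown above."""
    env = os.environ if env is None else env
    truthy = {"1", "true", "yes"}
    return WhisperSettings(
        model=env.get("WHISPER_MODEL", "whisper-1"),
        temperature=float(env.get("WHISPER_TEMPERATURE", "0")),
        format_text=env.get("WHISPER_FORMAT_TEXT", "true").strip().lower() in truthy,
        paragraph_break_seconds=float(env.get("WHISPER_PARAGRAPH_BREAK_SECONDS", "2.0")),
        max_sentence_length=int(env.get("WHISPER_MAX_SENTENCE_LENGTH", "150")),
    )


# Override a single value; everything else uses the fallbacks.
settings = load_whisper_settings({"WHISPER_PARAGRAPH_BREAK_SECONDS": "1.5"})
```

Accepting a plain mapping instead of always reading `os.environ` keeps the loader easy to exercise in tests.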
Clio automatically formats transcription output:
Before (Raw Whisper Output):
Hello everyone this is our weekly meeting we need to discuss the project timeline and deliverables the client wants to see progress on the frontend development and we should also talk about the database optimization issues that came up last week
After (Clio Formatting):
Hello everyone, this is our weekly meeting. We need to discuss the project timeline and deliverables.
The client wants to see progress on the frontend development and we should also talk about the database optimization issues that came up last week.
Formatting Rules:
- Paragraph breaks based on speech pauses (configurable threshold)
- Proper sentence capitalization and punctuation
- Maximum sentence length limits
- Natural language flow preservation
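The pause-based paragraph rule can be illustrated with a minimal sketch. This is not Clio's formatter: it only demonstrates the idea of starting a new paragraph when the silence between two segments reaches a threshold, assuming segments shaped like the `start_time`, `end_time`, and `text` fields shown in the API responses. The function name `join_into_paragraphs` is hypothetical.

```python
def join_into_paragraphs(segments, pause_threshold=2.0):
    """Group segment texts into paragraphs, breaking on long pauses."""
    paragraphs = []
    current = []
    previous_end = None
    for seg in segments:
        # Silence between the previous segment's end and this one's start.
        gap = 0.0 if previous_end is None else seg["start_time"] - previous_end
        if current and gap >= pause_threshold:
            paragraphs.append(" ".join(current))
            current = []
        current.append(seg["text"].strip())
        previous_end = seg["end_time"]
    if current:
        paragraphs.append(" ".join(current))
    return "\n\n".join(paragraphs)


segments = [
    {"start_time": 0.0, "end_time": 3.2, "text": "Hello everyone."},
    {"start_time": 3.4, "end_time": 6.0, "text": "Let's begin."},
    {"start_time": 8.5, "end_time": 12.0, "text": "First topic: the timeline."},
]
text = join_into_paragraphs(segments)
```

Here the 2.5-second pause before the third segment exceeds the 2.0-second threshold, so it starts a second paragraph, while the 0.2-second gap does not.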
{
"success": false,
"message": "Error description",
"errors": {
"field_name": ["Error message"],
"non_field_errors": ["General error message"]
}
}
- 200 OK - Successful GET, PUT, PATCH, DELETE
- 201 Created - Successful POST
- 400 Bad Request - Invalid request data
- 401 Unauthorized - Authentication required
- 403 Forbidden - Permission denied
- 404 Not Found - Resource not found
- 413 Payload Too Large - File size exceeds limit
- 422 Unprocessable Entity - Validation errors
- 500 Internal Server Error - Server error
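A client might flatten the error envelope into readable lines along these lines. The function name `summarize_error` is hypothetical, and the fallback for bodies that are not valid JSON objects is an assumption about how a client would want to degrade.

```python
import json


def summarize_error(status_code: int, body_text: str) -> list:
    """Turn an error response body into a list of human-readable lines."""
    try:
        body = json.loads(body_text)
    except json.JSONDecodeError:
        body = None
    if not isinstance(body, dict):
        return [f"HTTP {status_code}: unparseable error body"]
    lines = [f"HTTP {status_code}: {body.get('message', 'Unknown error')}"]
    # "errors" maps a field name (or non_field_errors) to a list of messages.
    for field, messages in body.get("errors", {}).items():
        for message in messages:
            lines.append(f"{field}: {message}")
    return lines


example_body = json.dumps({
    "success": False,
    "message": "Error description",
    "errors": {"field_name": ["Error message"]},
})
summary = summarize_error(400, example_body)
```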
Common Error Scenarios:
{
"success": false,
"message": "Transcription failed",
"errors": {
"transcription": ["Audio file format not supported"],
"audio_file": ["File size exceeds 50MB limit"]
}
}
Retry Mechanism:
- Failed transcriptions can be retried via /notes/{id}/retranscribe/
- Different language settings can be applied
- Automatic error logging for debugging
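The retry flow can be sketched as a small decision helper on the client. The names `next_action` and `retranscribe_request` are hypothetical; the status values and the `language` request field come from this document, and the rule that a note still in processing cannot be retranscribed follows the 400 error shown for the retranscribe endpoint.

```python
def next_action(note: dict) -> str:
    """Decide what a client should do with a note based on its status."""
    status = note.get("status")
    if status == "processing":
        return "wait"          # retranscribing now would return 400
    if status == "failed":
        return "retranscribe"
    if status == "completed":
        return "done"
    raise ValueError(f"unexpected status: {status!r}")


def retranscribe_request(note_id: int, language: str):
    """Return the path and JSON body for a retranscription call."""
    return f"/notes/{note_id}/retranscribe/", {"language": language}


path, body = retranscribe_request(1, "es")
```

A polling loop would fetch GET /notes/{id}/, call `next_action`, and either sleep, POST the retranscription, or stop.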
- Max Size: 50MB
- Supported Formats: WAV, MP3, M4A, OGG, WebM, FLAC
- Recommended Quality: 16kHz or 44.1kHz sample rate
- Content-Type: Proper MIME type required
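Checking a file on the client before uploading avoids a round trip that would end in a 413 or validation error. The sketch below is an assumption about how such a pre-check could look, not the server's validator: the name `validate_upload` is hypothetical, 50MB is interpreted as 50 * 1024 * 1024 bytes, and the 1KB minimum is the one listed in the security notes of this document.

```python
from pathlib import Path

MAX_BYTES = 50 * 1024 * 1024  # 50MB limit (binary megabytes assumed)
MIN_BYTES = 1024              # 1KB minimum
SUPPORTED_EXTENSIONS = {".wav", ".mp3", ".m4a", ".ogg", ".webm", ".flac"}


def validate_upload(filename: str, size_bytes: int) -> list:
    """Return a list of problems; an empty list means the file looks acceptable."""
    problems = []
    extension = Path(filename).suffix.lower()
    if extension not in SUPPORTED_EXTENSIONS:
        problems.append(f"unsupported format: {extension or '(none)'}")
    if size_bytes < MIN_BYTES:
        problems.append("file is smaller than 1KB")
    elif size_bytes > MAX_BYTES:
        problems.append("file exceeds 50MB limit")
    return problems


ok = validate_upload("memo.WAV", 2_000_000)
bad = validate_upload("memo.txt", 10)
```

An extension check is only a hint; the server still inspects the upload, so its response remains the authority.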
- Audio files stored in /media/audio/{user_id}/ directory
- Secure URLs with authentication required for access
- HTTP Range request support for streaming playback
- Custom MIME type handling for cross-browser compatibility
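Because audio URLs require authentication and support HTTP Range requests, a client can fetch part of a file by sending both headers. The sketch below builds them using standard HTTP byte-range syntax; the helper name `range_headers` is hypothetical.

```python
from typing import Optional


def range_headers(access_token: str, start: int, end: Optional[int] = None) -> dict:
    """Build headers for an authenticated partial download of an audio file."""
    if start < 0 or (end is not None and end < start):
        raise ValueError("invalid byte range")
    # "bytes=0-1023" requests the first 1024 bytes; "bytes=1024-" requests the rest.
    byte_range = f"bytes={start}-" if end is None else f"bytes={start}-{end}"
    return {
        "Authorization": f"Bearer {access_token}",
        "Range": byte_range,
    }


first_kib = range_headers("<access_token>", 0, 1023)
remainder = range_headers("<access_token>", 1024)
```

A server that honors the range answers with 206 Partial Content and a Content-Range header; a 200 response means the whole file was returned.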
DRF throttle classes enforce the following default limits:
- Anonymous requests: 10 requests/minute
- Authenticated requests: 60 requests/minute
Nginx additionally rate-limits auth endpoints to 5 requests/second with burst=10.
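To stay under these limits, a client can throttle itself. The following sliding-window sketch is one possible approach and is not part of the API; the class name `SlidingWindowLimiter` is hypothetical, and the default of 60 requests per 60 seconds mirrors the authenticated limit above.

```python
import time
from collections import deque


class SlidingWindowLimiter:
    """Track recent request times and report how long to wait before the next one."""

    def __init__(self, max_requests=60, window_seconds=60.0, clock=time.monotonic):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self.clock = clock
        self._sent = deque()  # timestamps of requests inside the window

    def wait_time(self) -> float:
        """Seconds to wait before another request fits in the window."""
        now = self.clock()
        while self._sent and now - self._sent[0] >= self.window_seconds:
            self._sent.popleft()  # drop requests that left the window
        if len(self._sent) < self.max_requests:
            return 0.0
        return self.window_seconds - (now - self._sent[0])

    def record(self) -> None:
        """Note that a request was just sent."""
        self._sent.append(self.clock())


limiter = SlidingWindowLimiter()
```

Before each call, a client would sleep for `limiter.wait_time()` seconds and then call `limiter.record()`. The injectable `clock` exists so the behavior can be tested without real delays.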
GET /api/health/ - No authentication required. Returns {"status": "ok"}.
- HTTPS required in production (automatic redirect via nginx)
- JWT tokens with 60-minute access lifetime, 7-day refresh lifetime
- Refresh tokens are blacklisted after rotation
- File upload validation and size limits (1KB min, 50MB max)
- Transport security headers (HSTS, CSP, X-Frame-Options) when DEBUG=False
- Error responses never expose internal exception details
- CORS headers configured for specific origins
- Rate limiting and request size limits
- Input validation and sanitization
- Audio file streaming with authentication
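Given the 60-minute access token lifetime, a client can refresh proactively instead of waiting for a 401. The sketch below reads the standard `exp` claim from the token payload; it assumes the tokens carry that claim, does not verify the signature, and uses the hypothetical names `jwt_expires_at` and `needs_refresh`.

```python
import base64
import json
import time


def jwt_expires_at(token: str) -> int:
    """Return the exp claim (Unix seconds) from a JWT without verifying it."""
    payload_b64 = token.split(".")[1]
    # JWT segments are base64url without padding; restore it before decoding.
    padded = payload_b64 + "=" * (-len(payload_b64) % 4)
    payload = json.loads(base64.urlsafe_b64decode(padded))
    return int(payload["exp"])


def needs_refresh(token: str, now=None, leeway_seconds: int = 30) -> bool:
    """True when the token has expired or will expire within the leeway."""
    now = time.time() if now is None else now
    return now >= jwt_expires_at(token) - leeway_seconds
```

When `needs_refresh` returns True, the client would POST the refresh token to /auth/refresh/ and replace its stored access token. Since refresh tokens are blacklisted after rotation, any new refresh token in the response should replace the old one as well.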
Use the interactive API documentation available at:
- Swagger UI: http://localhost:8011/api/docs/
- ReDoc: http://localhost:8011/api/redoc/
- OpenAPI Schema: http://localhost:8011/api/schema/
For API support and bug reports:
- GitHub Issues: Create an issue
- Documentation: This API specification
- Community: GitHub Discussions
Clio - AI-Powered Voice Transcription Platform