Command-line interface for otari, the OpenAI-compatible LLM gateway you own and run yourself.
otari-cli is a thin command-line wrapper over the otari Python client SDK. It talks to a self-hosted otari gateway or the hosted platform at otari.ai.
- Python 3.11 or newer
pip install otari-cli

This installs the otari console command.
otari-cli reads the same environment variables as the otari SDK, so it works in two modes: against the hosted platform or against a self-hosted gateway. Flags always override the environment.
| Variable | Mode | Purpose |
|---|---|---|
| OTARI_AI_TOKEN | Platform | Bearer token; base URL defaults to https://api.otari.ai. |
| GATEWAY_API_BASE | Self-hosted | Gateway base URL (required for self-hosted). |
| GATEWAY_API_KEY | Self-hosted | Virtual API key (sent via the Otari-Key header). |
| GATEWAY_ADMIN_KEY | Either | Admin key for control-plane commands (keys, usage). |
Equivalent flags: --token, --api-base, --api-key, --admin-key.
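The precedence rule (a flag wins, otherwise the environment variable is used) can be sketched in a few lines of Python. This is an illustration only, not otari-cli's actual implementation: `resolve_settings` and `effective_base_url` are hypothetical helpers, and the rule that an explicit API base takes priority over the platform default is an assumption inferred from the table.

```python
import os
from typing import Dict, Mapping, Optional

# Flag name -> environment variable, as listed in the table above.
FLAG_TO_ENV = {
    "token": "OTARI_AI_TOKEN",
    "api_base": "GATEWAY_API_BASE",
    "api_key": "GATEWAY_API_KEY",
    "admin_key": "GATEWAY_ADMIN_KEY",
}

PLATFORM_DEFAULT_BASE = "https://api.otari.ai"


def resolve_settings(
    flags: Mapping[str, Optional[str]],
    env: Mapping[str, str] = os.environ,
) -> Dict[str, Optional[str]]:
    """Hypothetical helper: a flag value wins; otherwise fall back to the env var."""
    settings: Dict[str, Optional[str]] = {}
    for name, env_var in FLAG_TO_ENV.items():
        flag_value = flags.get(name)
        settings[name] = flag_value if flag_value is not None else env.get(env_var)
    return settings


def effective_base_url(settings: Mapping[str, Optional[str]]) -> Optional[str]:
    """Assumed rule: an explicit base URL wins; a token alone implies the platform default."""
    if settings.get("api_base"):
        return settings["api_base"]
    if settings.get("token"):
        return PLATFORM_DEFAULT_BASE
    return None
```

For example, with GATEWAY_API_BASE set in the environment and --api-base passed on the command line, the sketch returns the flag's value.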
# Show help and the available commands
otari --help
# Check that the configured gateway is reachable
otari --api-base http://localhost:8000 health
# List the models the gateway can route to
otari models
# Create a chat completion
otari completion -m openai:gpt-4o-mini "Write a haiku about gateways."
# Stream the response token by token
otari completion -m openai:gpt-4o-mini --stream "Tell me a short story."
# Emit machine-readable JSON instead of formatted output
otari --json models

otari completion -m openai:gpt-4o-mini "Hello" # chat completions (+ --stream)
otari message -m anthropic:claude-3-5-sonnet "Hello" # Anthropic-style messages (+ --stream)
otari response -m openai:gpt-4o-mini "Hello" # Responses API (+ --stream)
otari embedding -m openai:text-embedding-3-small "a sentence"
otari moderation -m openai:omni-moderation-latest "some text"
otari rerank -m cohere:rerank-v3.5 -q "query" "doc one" "doc two"
otari models
otari batches create -m openai:gpt-4o-mini --input requests.jsonl
otari batches list --provider openai
otari batches results <batch-id> --provider openai

The --json and --stream flags compose: with both set, streaming commands emit one JSON event object per chunk (newline-delimited) rather than a single document.
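Newline-delimited JSON is easy to consume from a script: each non-blank line is parsed on its own. A minimal sketch follows, assuming only that every line is a complete JSON value. `iter_ndjson` is a hypothetical helper, and the fields inside each event are not specified here, so the code does not inspect them.

```python
import json
from typing import Any, Iterable, Iterator


def iter_ndjson(lines: Iterable[str]) -> Iterator[Any]:
    """Yield one parsed JSON value per non-blank line of input."""
    for line in lines:
        line = line.strip()
        if line:  # tolerate blank separator lines
            yield json.loads(line)
```

Passing `sys.stdin` as `lines` lets a script handle events as they arrive when the CLI's output is piped into it.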
The control-plane commands below require an admin credential and a self-hosted gateway:
# Keys
otari keys list
otari keys create --name prod --user u_123 --metadata '{"team": "ml"}'
otari keys update <key-id> --inactive
otari keys delete <key-id>
# Users, budgets, pricing
otari users create u_123 --alias "ML team" --budget b_1
otari budgets create --max-budget 100 --duration-sec 86400
otari pricing set openai:gpt-4o-mini --input-price 0.15 --output-price 0.60
# Usage
otari usage list --user u_123 --start 2026-01-01 --end 2026-01-31
otari users usage u_123

otari-cli uses uv.
uv sync --extra dev # install with dev dependencies
uv run otari --help # run the CLI from source
uv run ruff check . # lint
uv run mypy src/ # type check (strict)
uv run pytest # tests

See CONTRIBUTING.md and AGENTS.md for the full workflow and conventions.
| Group | Commands |
|---|---|
| Generation | completion, message, response (each with --stream), embedding, moderation, rerank, models |
| Batches | batches create, batches retrieve, batches list, batches cancel, batches results |
| Control plane | keys, users, budgets, pricing (CRUD), usage list, users usage |
| Diagnostics | health |
Run otari <command> --help for the full options of any command.
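For scripting, the CLI can be driven from Python with the standard `subprocess` module. The sketch below is a hedged example, not part of otari-cli: `otari_argv` and `run_otari_json` are hypothetical helpers, the global flags are placed before the subcommand as in the examples above, and it assumes the CLI exits non-zero on failure and prints a single JSON document for non-streaming commands.

```python
import json
import subprocess
from typing import Any, List, Optional, Sequence


def otari_argv(
    command: Sequence[str],
    *,
    api_base: Optional[str] = None,
    json_output: bool = True,
) -> List[str]:
    """Build an argv list with global flags placed before the subcommand."""
    argv = ["otari"]
    if api_base is not None:
        argv += ["--api-base", api_base]
    if json_output:
        argv.append("--json")
    argv += list(command)
    return argv


def run_otari_json(command: Sequence[str], api_base: Optional[str] = None) -> Any:
    """Run a non-streaming command with --json and parse its stdout.

    Raises subprocess.CalledProcessError if the process exits non-zero
    (assumed failure signal) and json.JSONDecodeError on unparsable output.
    """
    result = subprocess.run(
        otari_argv(command, api_base=api_base, json_output=True),
        capture_output=True,
        text=True,
        check=True,
    )
    return json.loads(result.stdout)
```

With the CLI installed and configured, `run_otari_json(["models"])` would run `otari --json models` and return the parsed result.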
otari-cli is licensed under the Apache License 2.0. See the LICENSE file for details.