Skip to content

rangehow/ToFu

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

5 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

Tofu logo
Tofu
豆腐 — Self-Hosted AI Assistant

CI

Multi-model chat · Autonomous agents · Project co-pilot · Multi-agent swarm
Daily reports & to-do · Browser extension · Desktop agent · Feishu bot
🔀 CLI backend switching — use Claude Code or Codex as the agent engine

🇨🇳 中文文档

Python PostgreSQL License Platform


What is Tofu?

Tofu is a fully self-hosted AI assistant built with a Flask backend and vanilla JS frontend. It connects to any OpenAI-compatible LLM API and gives you autonomous tool-calling agents, a project co-pilot for any codebase, multi-agent swarm orchestration, a browser extension, and a desktop agent — all from a single python server.py.


Features

Multi-Model Chat

  • 20+ LLM models — OpenAI, Anthropic, Google Gemini, Qwen, DeepSeek, MiniMax, Doubao, GLM, Mistral, Grok, and any OpenAI-compatible API
  • Smart dispatch — multi-key, multi-provider routing with real-time latency scoring, error-rate tracking, and per-key rate-limit cooldowns
  • Streaming responses with per-model cost tracking (input/output/cache tokens × tiered pricing)
  • Multi-model comparison — send the same prompt to several models side-by-side
  • Auto-translation — bidirectional Chinese ↔ English translation per conversation

Tool Calling & Agent Mode

  • Built-in tools — web search (multi-engine: DDG, Brave, Bing, SearXNG in parallel), URL fetching, PDF parsing (text + VLM), image upload & generation, shell commands, Python execution
  • Project co-pilot — point it at any codebase for file browsing, grep search, code editing, git operations, and AI-powered file indexing
  • Autonomous task execution — multi-step tool chains with automatic retry, 3-layer context compaction, and configurable model fallback chains
  • Endpoint mode — Planner → Worker → Critic review loop for long-running tasks with iterative refinement
  • Scheduled tasks — cron-like recurring or one-shot tasks (shell, Python, LLM prompts)
  • Image generation — multi-model dispatch across Gemini and GPT image models with automatic 429-retry cycling

Multi-Agent Swarm

  • Swarm orchestration — MasterOrchestrator plans and delegates to multiple SubAgents running in parallel
  • Streaming scheduler — reactive task scheduling with no wave barriers, artifact sharing between agents
  • Review & synthesis — automatic review of agent outputs and final synthesis into a coherent result

Browser Extension

  • Chrome extension bridging the assistant with your browser for real-time page reading and interaction
  • Per-client command routing — multiple browsers connect simultaneously with independent command queues
  • Navigate, screenshot, click, type, extract content

Desktop Agent

  • Runs on your local machine, connects back to the server
  • File system operations, clipboard, screenshots, GUI automation (pyautogui), system info
  • Security-gated with --allow-write / --allow-exec flags

Feishu (Lark) Bot

  • Full Feishu bot integration via WebSocket — multi-turn LLM conversations with tool support directly in team chat
  • Slash commands, model/mode switching, conversation management

Daily Reports & To-Do

Click the ☑️ My Day button in the sidebar header to open the daily dashboard — a personal work journal powered by LLM analysis.

  • Auto-generated work streams — the LLM reads all your conversations for the day, clusters them into 5–15 coherent work streams (e.g. "修复图片回显", "项目部署调试"), and marks each as done, in progress, or blocked
  • Calendar view — month-at-a-glance calendar with per-day conversation counts and cost heatmap; click any date to view or generate its report
  • Tomorrow's plan — the LLM synthesizes 3–8 actionable TODO items from unfinished work, each with a detailed prompt and recommended tool configuration
  • One-click launch — click the ▶ button on any TODO to instantly open a new conversation pre-filled with the task prompt and the right tools enabled (search, code, browser, project, etc.)
  • To-do carry-forward — uncompleted TODOs automatically carry over to the next day as "今日待办" (Today's Tasks); the LLM tracks which ones you addressed and marks them done
  • Manual TODOs — add your own to-do items via the + input at the bottom; toggle done/undone, delete, or launch as conversations
  • Cost tracking — per-day and per-conversation cost breakdown in CNY, calculated from token usage and model pricing
  • Auto-backfill — a background scheduler automatically generates yesterday's report on server boot and daily at midnight if it's missing
  • Motivational quotes — a random quote appears at the top of each report ("人生苦短,我用 AI" 🧈)

Scheduled Tasks

  • Proactive agent scheduler — create cron-like recurring or one-shot tasks (Shell, Python, LLM prompts) via conversation or the scheduler panel
  • SCHEDULER badge — appears in the top status bar; click to see all active proactive agents and their recent run logs
  • Enable via the 🕐 Scheduler toggle in the tool submenu

🔀 CLI Backend Switching (NEW)

Switch between Tofu's built-in agent, Claude Code, or OpenAI Codex as the coding agent backend — right from the UI.

  • Pure frontend mode — when using Claude Code or Codex, Tofu acts as a pure web UI; the external CLI handles all LLM calls, tool execution, and context management with its own authentication
  • Zero config for external agents — install the CLI, log in once in your terminal, and Tofu auto-detects it
  • Capabilities-driven UI — the interface automatically adapts: model selector, thinking depth, and Tofu-only features (image gen, browser, swarm…) are hidden when using an external backend
  • Session persistence — multi-turn conversations are maintained across page refreshes via backend session ID mapping
  • One-click switching — click the backend selector in the top bar to switch between agents; each conversation remembers its backend

More

  • Skills system — persistent reusable knowledge (Markdown files) — the assistant learns project conventions, bug patterns, and workflows across sessions
  • 3-layer context compaction — micro-compact → structural truncation → LLM summary for very long conversations
  • IndexedDB caching — read-through conversation cache with LRU eviction for fast page loads
  • Error tracking — universal project error tracker with fingerprinting, resolution tracking, and digest reports
  • Dark theme UI with responsive layout, syntax highlighting, LaTeX rendering, and image previews
  • Cross-platform — runs on Linux, macOS, and Windows (see Platform Support below)
  • Mobile-friendly — responsive layout with compact topbar, swipe-open sidebar, and bottom-sheet tool toggles for touch screens
  • Auto-dependency repairbootstrap.py auto-installs missing pip packages via LLM diagnosis

Quick Start

Option A: One-Command Install (recommended)

Works on Linux, macOS, and Windows. Requires only Python 3.10+ and Git — no conda, no admin/root.

Linux / macOS:

curl -fsSL https://raw.githubusercontent.com/rangehow/ToFu/main/install.sh | bash

Windows (PowerShell):

irm https://raw.githubusercontent.com/rangehow/ToFu/main/install.ps1 | iex

Or run the cross-platform installer directly (any OS with Python 3.10+):

git clone https://github.com/rangehow/ToFu.git && cd ToFu
python install.py

This automatically creates a virtual environment, installs all dependencies, locates/installs PostgreSQL, and starts the server. Open http://localhost:15000 when it's ready.

With options:

python install.py --api-key sk-xxx --port 8080   # Pre-configure API key
python install.py --no-launch                     # Install only
python install.py --docker                        # Use Docker instead
python install.py --skip-playwright               # Skip browser automation

Option B: Docker (zero dependencies)

git clone https://github.com/rangehow/ToFu.git && cd ToFu
docker compose up -d

Or without cloning (when the image is published):

docker run -d -p 15000:15000 -v tofu-data:/app/data --name tofu ghcr.io/rangehow/tofu:latest

Open http://localhost:15000 — done. All data persists in Docker volumes.

Option C: Manual Install

Step-by-step for full control

Prerequisites: Python 3.10+, PostgreSQL 18+, ripgrep & fd-find (recommended)

git clone https://github.com/rangehow/ToFu.git
cd ToFu

# Create environment (pick one)
python -m venv .venv && source .venv/bin/activate   # Standard venv
# OR: conda create -n tofu python=3.12 -y && conda activate tofu

# Install PostgreSQL (if not already installed)
# macOS:   brew install postgresql@18
# Ubuntu:  sudo apt install postgresql
# Windows: https://www.postgresql.org/download/windows/
# conda:   conda install -c conda-forge postgresql>=18

# Install ripgrep & fd-find (recommended — faster code search & file finding)
# macOS:   brew install ripgrep fd
# Ubuntu:  sudo apt install ripgrep fd-find
# Windows: winget install BurntSushi.ripgrep.MSVC sharkdp.fd
# conda:   conda install -c conda-forge ripgrep fd-find

# Install Python dependencies
pip install -r requirements.txt

# Optional: browser automation for advanced page fetching
pip install playwright && playwright install chromium

# Run
python server.py

Open http://localhost:15000 in your browser — that's it! Configure everything from the Settings UI.

PostgreSQL runs as a local userspace process — no sudo, no system service. On first python server.py, the database auto-bootstraps (initdb, schema creation, port selection).

Automatic Dependency Repair

If any Python package is missing, server.py automatically delegates to bootstrap.py:

  1. Detects the ImportError and hands off to bootstrap.py
  2. Opens a live status page in your browser on the same port
  3. The LLM API diagnoses the traceback and determines which packages to install
  4. Packages are pip installed automatically, with up to 10 retry rounds
  5. Once everything resolves, the real server starts

This uses only Python stdlib — works even when every pip package is missing.


Configuration

All configuration is done through the Settings UI — click the ⚙️ gear icon in the top-right corner of the chat interface. Changes are saved to the server instantly, no restart needed (unless noted).

The Settings panel has 7 tabs, each with a dedicated icon in the left sidebar:

⚙️ General

Core model parameters and global preferences.

  • Theme — Dark, Light, or Tofu (豆腐) theme
  • Temperature — controls response randomness (0 = deterministic, 1 = creative)
  • Max tokens — maximum output token limit
  • Image max width — auto-compress uploaded images (0 = no compression)
  • PDF max pages — limit page count when parsing PDFs
  • Thinking depth — default thinking budget for new conversations (Off / Medium / High / Max)
  • System prompt — custom instructions prepended to every conversation

🔗 Providers

Multi-provider API management — this is where you add your LLM API keys.

  • ⚡ Add from template — one-click setup for OpenAI, Anthropic, Google Gemini, DeepSeek, Qwen, MiniMax, GLM, Doubao, Mistral, Grok, OpenRouter, Azure, Ollama, and more
  • Custom provider — add any OpenAI-compatible endpoint with custom base URL
  • Per-provider settings — each provider has its own API key(s), base URL, and model list
  • Auto-discover models — fetches available models from the provider's /v1/models endpoint
  • Multi-key rotation — add multiple API keys per provider for automatic rate-limit rotation

📦 Display

Control what appears in the model selector and image generation picker.

  • Image generation models — show/hide specific models in the image gen selector
  • Model dropdown — show/hide models in the main chat model switcher
  • Fallback model — auto-switch to this model when the primary model fails
  • Default model — override the default model for new conversations

🔍 Search & Fetch

Web search and content fetching behavior.

  • LLM content filter — use the model to strip navigation/ads from fetched pages (disable for speed)
  • Fetch top N — how many search results to auto-fetch (default: 6)
  • Fetch timeout — per-page timeout in seconds (default: 15)
  • Max characters — separate limits for search results, direct URL fetch, and PDF files
  • Max download size — byte limit for fetched content (default: 20 MB)
  • Blocked domains — domains the fetcher will never visit (one per line)

🌐 Network

Proxy configuration for all outbound requests.

  • HTTP / HTTPS proxy — proxy URL for LLM API calls, search, and page fetching
  • Bypass domains — domain suffixes that skip the proxy entirely (one per line, suffix-matched)
  • 💡 Tip: Add your LLM API domains here if your corporate/VPN proxy silently drops SSE long-connections, causing BrokenPipeError

🐦 Feishu (Lark)

Feishu bot integration settings.

  • Connection status — live indicator showing bot connection state
  • App ID / App Secret — credentials from open.feishu.cn (restart required after change)
  • Default project path — project co-pilot root for Feishu conversations
  • Workspace root — base directory for project switching
  • Allowed users — restrict bot access to specific Feishu user IDs (blank = allow all)

</> Advanced

Pricing and cache management.

  • Price overrides — customize per-model pricing (USD per million tokens) as JSON
  • Local cache — view IndexedDB cache stats and clear cached conversations
  • Server info — server status and version information

Environment Variables (fallback)

For first-time setup, headless servers, or Docker deployments, you can also configure via environment variables. The Settings UI always takes priority — env vars are only used as initial fallback values.

cp .env.example .env
Variable Description Example
LLM_API_KEY LLM provider API key (fallback) sk-abc123...
LLM_BASE_URL Chat completions endpoint (fallback) https://api.openai.com/v1
LLM_MODEL Default model (fallback) gpt-4o
PORT Server port 15000
BIND_HOST Bind address 0.0.0.0
PROXY_BYPASS_DOMAINS Comma-separated proxy bypass domains .corp.net,.internal.com
FEISHU_APP_ID Feishu bot app ID cli_xxxx
FEISHU_APP_SECRET Feishu bot app secret

💡 After first launch, we recommend configuring everything through the Settings UI. It's more intuitive, changes take effect immediately, and supports features like provider templates and model auto-discovery that env vars cannot.


Project Structure

├── server.py                  Flask app entry, middleware, logging
├── bootstrap.py               Auto-dependency repair (LLM-guided)
├── index.html                 Main chat UI (single-page app)
├── .env.example               Environment variable template
│
├── lib/                       Core libraries
│   ├── agent_backends/        Multi-backend agent switching (builtin/CC/Codex)
│   ├── llm_client.py          LLM API client (streaming, retry)
│   ├── llm_dispatch/          Multi-key multi-model dispatcher
│   ├── database.py            PostgreSQL (auto-bootstrap)
│   ├── tasks_pkg/             Task orchestration & compaction
│   │   ├── orchestrator.py    Main LLM ↔ tool loop
│   │   ├── executor.py        Tool execution engine
│   │   ├── endpoint.py        Planner → Worker → Critic loop
│   │   └── compaction.py      3-layer context compaction
│   ├── tools/                 Tool definitions & schemas
│   ├── swarm/                 Multi-agent orchestration
│   ├── fetch/                 Content fetching & extraction
│   ├── search/                Multi-engine web search
│   ├── browser/               Browser extension bridge
│   ├── project_mod/           Project co-pilot (scan, edit, undo)
│   ├── skills/                Skill accumulation system
│   ├── feishu/                Feishu bot integration
│   └── ...
│
├── routes/                    Flask Blueprints (21 modules)
├── static/                    CSS, JS, icons
├── browser_extension/         Chrome extension (MV3)
├── tests/                     Test suite (unit, API, E2E)
└── data/                      Runtime data (git-ignored)

Advanced Usage

CLI Backend Switching — Claude Code / Codex

Tofu can act as a pure web frontend for external coding agents. Instead of using Tofu's built-in orchestrator, you can delegate to Claude Code or OpenAI Codex — they handle LLM calls, tool execution, and context management with their own authentication.

Install Claude Code

# Install via npm
npm install -g @anthropic-ai/claude-code

# Log in (one-time)
claude auth login
# Follow the browser prompt to authenticate with your Claude account

# Verify
claude --version

Install Codex

# Install via npm
npm install -g @openai/codex

# Log in (one-time) — requires OpenAI API key or ChatGPT Plus subscription
codex auth login

# Verify
codex --version

Use in Tofu

  1. Start Tofu: python server.py
  2. Click the backend selector (🤖) in the top bar
  3. Available backends show a ✅ badge; unavailable ones show ❌
  4. Select Claude Code or Codex — the UI automatically adapts:
    • Model selector, thinking depth, and search toggle are hidden (the CLI handles these)
    • Tofu-only features (image gen, browser extension, swarm, scheduler) are greyed out
  5. Send a message — Tofu spawns the CLI subprocess, streams its output, and renders it in the chat UI

Feature availability by backend

Feature Built-in (Tofu) Claude Code Codex
Chat & streaming
Web search ✅ (CC's) ✅ (Codex's)
File operations ✅ (CC's) ✅ (Codex's)
Code execution ✅ (Bash) ✅ (exec)
Model selection — (CC decides) — (Codex decides)
Image generation
Browser extension
Multi-agent swarm
Desktop agent

Note: The CLI must be installed on the same machine as the Tofu server. Tofu spawns the agent as a subprocess.

Project Co-Pilot

Click Project in the sidebar, enter the path to any codebase. The assistant can browse files, search code, edit files, run commands, and track modifications with per-round undo.

Multi-Agent Swarm

For complex tasks, the assistant automatically plans sub-tasks and delegates to specialist agents running in parallel. Results are reviewed and synthesized into a coherent output.

Browser Extension

  1. chrome://extensions → Enable Developer Mode
  2. Load unpacked → select browser_extension/
  3. Click the extension icon → enter your server URL
  4. The assistant can now read and interact with your browser tabs

Desktop Agent

pip install pyautogui pillow psutil
python lib/desktop_agent.py --server http://your-server:15000 --allow-write --allow-exec

Feishu Bot

  1. Create an app at open.feishu.cn, enable Bot capability
  2. Open Settings → 🐦 Feishu tab → enter App ID and App Secret
  3. The bot auto-connects on server restart

Scheduled Tasks

Ask the assistant to "create a scheduled task" or "set up a daily cron job" — it will create a proactive agent that runs on your specified schedule. Manage all tasks from the SCHEDULER badge in the status bar.

Healthcheck

python healthcheck.py

Platform Support

Tofu runs on Linux, macOS, and Windows. All platform-specific code is isolated in lib/compat.py.

Feature Linux macOS Windows
Core chat & tools
PostgreSQL auto-bootstrap ✅ (PG bin/ must be in PATH)
Project co-pilot (file tools)
run_command (basic) ✅ (uses cmd.exe)
run_command interactive stdin ✅ (via /proc) ❌ (non-interactive) ❌ (non-interactive)
FUSE keepalive daemon ✅ (DolphinFS) — (not needed) — (not needed)
Desktop agent
Browser extension
Dangerous command blocking ✅ (Unix + Windows patterns)

Smoke test: python debug/test_cross_platform.py validates the compat layer on any platform.


Testing

# All tests
python tests/run_all.py

# Individual suites
python -m pytest tests/test_backend_unit.py
python -m pytest tests/test_api_integration.py
python -m pytest tests/test_visual_e2e.py
python -m pytest tests/test_db_bug_regressions.py

Security

  • No secrets in source — all credentials loaded from environment variables or Settings UI
  • Single-user mode — no multi-tenant auth; deploy behind a VPN or reverse proxy
  • Tool execution — the assistant can run shell commands and edit files; use with caution
  • Desktop agent — requires explicit --allow-write / --allow-exec flags

Contributing

See CONTRIBUTING.md for the full guide. Quick version:

  1. Fork → feature branch
  2. python healthcheck.py && python tests/run_all.py
  3. Submit a pull request

License

MIT

About

No description, website, or topics provided.

Resources

Contributing

Stars

Watchers

Forks

Packages

 
 
 

Contributors