Ansible collection for deploying Haidra authored services, AI Horde workers, monitoring infrastructure, and supporting applications.
New here? Start with the Quick Start guide — test an AI-Horde code change in ~4 minutes or run the full stack locally in ~6. Want to contribute? See CONTRIBUTING.md.
This collection is intentionally opinionated and is not a general-purpose Ansible toolkit. It targets three audiences:
- The AI Horde team operating the stack.
- Developers contributing to AI Horde services.
- External groups adopting or vendoring the AI Horde stack and seeking reference deployment patterns.
It deliberately does not aim to provide:
- Generic, vendor-neutral deployment abstractions for arbitrary software.
- Replacing mature community roles for broad infrastructure concerns.
- Hiding stack assumptions required by AI Horde topology and workflows.
In short: if you are looking for a general-purpose Ansible collection for deploying this software anywhere, this is not it. If you are looking for a reference deployment of the AI Horde stack, this is exactly it.
Install Ansible (Linux only):
python -m pip install ansible

Ensure your control host can SSH to targets using key-based authentication via
an ssh-agent. If the remote user requires a sudo password, append -K to all
ansible-playbook commands.
Install this collection and its dependencies:
wget https://raw.githubusercontent.com/Haidra-Org/deployments/main/examples/requirements.yml
ansible-galaxy collection install -r requirements.yml

Each role provides its own README with full variable documentation and examples.
Adjust an example inventory with your hostnames, then run the
corresponding example playbook — or build your own site.yml.
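As a sketch, an adjusted inventory might look like the following. The group names, hostnames, and remote user here are illustrative placeholders, not the actual names used by the shipped examples:

```ini
; Hypothetical inventory — substitute your own hosts and the group names
; expected by the example playbook you are running.
[horde_backend]
horde-api.example.com

[horde_workers]
gpu-worker-01.example.com

[all:vars]
ansible_user=deploy
```

You would then run, for example, `ansible-playbook -i inventory.ini site.yml` (adding `-K` if sudo needs a password, as noted above).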
| Role | Description |
|---|---|
| ai_horde | AI Horde backend (Flask + Postgres + Redis) |
| aihorde_frontpage | AiHordeFrontpage (Angular SSR website) |
| horde_model_reference | FastAPI service for AI Horde model metadata |
| artbot | Web frontend for AI Horde |
| artbot_revproxy | HAProxy reverse proxy for Artbot |
| horde_regen_worker | AI Horde worker (Dreamer, Scribe, Alchemist) |
| amd_gpu_drivers | AMD GPU driver and ROCm setup |
| Role | Description |
|---|---|
| horde_monitoring | Mimir + Grafana + S3 storage monitoring stack (Docker Compose) |
| horde_stats_exporter | AI Horde API → Prometheus metrics exporter |
| horde_alloy | Grafana Alloy telemetry collector for app hosts |
See MONITORING.md for the architecture overview, quick start, and how the monitoring roles work together.
| Document | Contents |
|---|---|
| Quick Start | Get running in minutes — 4 tiers from code change to production |
| Contributing | Dev setup, test conventions, PR guidelines |
| Monitoring Guide | Architecture, quick start, troubleshooting |
| Observability Stack | Loki, Tempo, and Alloy deep-dive |
| Backup & Restore | RPO/RTO, backup configuration, restore procedures |
| Credentials | Credential management and rotation |
| Upgrading | Component version upgrade procedures |
| Migration | Host migration runbook (planned and forced) |
The collection ships a two-tier test suite under tests/.
Validate Ansible template rendering, variable defaults, and negative (expected-failure) cases. Run entirely in check mode — no Docker daemon required for the test playbooks themselves.
# All render tests (builds a Docker systemd container per test):
./tests/run_tests.sh
# List all discoverable tests without running them:
./tests/run_tests.sh --list
# By role:
./tests/run_tests.sh monitoring
./tests/run_tests.sh ai_horde
./tests/run_tests.sh regen_worker
./tests/run_tests.sh artbot
./tests/run_tests.sh frontpage
./tests/run_tests.sh full_stack
# Specific test:
./tests/run_tests.sh monitoring/test_full_stack

Every run_tests.sh invocation writes per-test log files and a structured
summary under tests/test-results/<YYYYMMDD-HHMMSS>/:
tests/test-results/20260325-143012/
├── monitoring__test_full_stack.log # full Ansible output
├── monitoring__test_full_stack__idempotency.log # idempotency re-run
├── monitoring__test_runtime_services.log
├── ai_horde__test_deploy.log
└── summary.txt # machine-readable results
The runner prints a colour-coded summary table at the end with one-line failure reasons extracted from the Ansible output:
TEST STATUS DETAILS
────────────────────────────────────────────────────────────────────────────
monitoring/test_full_stack PASS
ai_horde/test_deploy FAIL {"msg": "No package matching 'python3-venv'"}
────────────────────────────────────────────────────────────────────────────
summary.txt is pipe-delimited for scripted analysis:
# FORMAT: STATUS | LABEL | LOG_FILE | REASON
PASS | monitoring/test_full_stack | monitoring__test_full_stack.log |
FAIL | ai_horde/test_deploy | ai_horde__test_deploy.log | {"msg": "No package matching..."}
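Since the format is pipe-delimited, standard text tools work on it directly. A minimal sketch (assuming only the `FORMAT` line documented above) that lists the labels of failed tests:

```shell
# Print the LABEL field of every FAIL line in summary.txt,
# trimming the surrounding padding spaces. The comment line
# ("# FORMAT: ...") never matches /FAIL/ and is skipped.
awk -F'|' '$1 ~ /FAIL/ { gsub(/^ +| +$/, "", $2); print $2 }' summary.txt
```

This could feed a CI step that re-runs only the failed tests, e.g. by looping over the printed labels.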
Every playbook (except runtime and local_deploy tests) is automatically
re-run after the first pass; the idempotency check fails the test if any
task reports changed on the second run.
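Conceptually, the check amounts to scanning the second run's PLAY RECAP for any non-zero changed count. A minimal sketch of that logic (an assumption about the approach, not the runner's exact implementation):

```shell
# Assumed sketch: inspect a second-pass PLAY RECAP line for changed tasks.
recap='web1 : ok=12 changed=0 unreachable=0 failed=0 skipped=3'
if printf '%s\n' "$recap" | grep -Eq 'changed=[1-9]'; then
  echo "idempotency: FAIL"   # at least one task changed on the re-run
else
  echo "idempotency: PASS"   # changed=0 everywhere
fi
```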
Test playbooks support YAML comment markers near the top of the file (within the first 5 lines) to control runner behaviour:
| Marker | Effect |
|---|---|
| # idempotency: skip | Skip the idempotency re-run for this test |
| # requires: docker-daemon | Skip the entire test when the target container has no Docker daemon |
Multi-play tests that intentionally overwrite the same files with different
variable sets (e.g. test_alloy_role.yml) should declare # idempotency: skip.
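For instance, such a test playbook might open like this — both markers land within the first 5 lines, as required. The filename and play content are illustrative, not an actual test in the suite:

```yaml
# idempotency: skip
# requires: docker-daemon
# tests/monitoring/test_hypothetical.yml (illustrative name)
- name: Render configs with variable set A
  hosts: all
  gather_facts: false
  tasks:
    - name: Placeholder task standing in for the real role invocation
      ansible.builtin.debug:
        msg: "set A"
```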
Exercise cross-role coherence and optionally spin up live services.
# Smoke test — config-only, CI-friendly:
./tests/run_tests.sh integration
# Local deploy — starts AI-Horde in Docker:
./tests/integration/local_deploy.sh up
./tests/integration/local_deploy.sh down
# With GPU worker (requires NVIDIA GPU + nvidia-container-toolkit):
./tests/integration/local_deploy.sh up --with-worker

| Role | Render | Negative | Integration | Full-stack |
|---|---|---|---|---|
| horde_monitoring | ✅ | ✅ | — | ✅ |
| ai_horde | ✅ | ✅ | ✅ | ✅ |
| aihorde_frontpage | ✅ | — | — | ✅ |
| horde_regen_worker | ✅ | — | ✅ | — |
| artbot / revproxy | ✅ | — | — | — |
| horde_stats_exporter | — | — | — | ✅ |
| horde_alloy | — | — | — | — |
See also the Quick Start for a use-case driven introduction.
Spins up the complete Horde business stack on one machine: Backend (AI-Horde + Postgres + Redis), Frontend (AiHordeFrontpage), Stats Exporter, and HAProxy as the unified edge router. Monitoring and the GPU worker are optional tiers.
# Core stack (backend + frontpage + exporter + HAProxy):
./tests/full_stack/local_deploy.sh up
# With monitoring (Grafana, Mimir, Prometheus, Alertmanager, Alloy):
./tests/full_stack/local_deploy.sh up --with-monitoring
# With GPU worker (requires NVIDIA GPU):
./tests/full_stack/local_deploy.sh up --with-worker
# With Artbot on a separate port (8080):
./tests/full_stack/local_deploy.sh up --with-artbot
# Everything:
./tests/full_stack/local_deploy.sh up --all
# Tear down (unconditional — stops all tiers):
./tests/full_stack/local_deploy.sh down
# Status:
./tests/full_stack/local_deploy.sh status
# Logs for a specific tier:
./tests/full_stack/local_deploy.sh logs backend
./tests/full_stack/local_deploy.sh logs frontpage
./tests/full_stack/local_deploy.sh logs haproxy
./tests/full_stack/local_deploy.sh logs monitoring
./tests/full_stack/local_deploy.sh logs artbot

Local-deploy layout:
- local-deploy/static/ contains committed overlays/config files used by local deploy scripts.
- local-deploy/runtime/ contains generated configs, cloned sources, and runtime data.
Reset local deploy state safely:
rm -rf local-deploy/runtime

Port assignments (full-stack local deploy):
| Service | Port | Notes |
|---|---|---|
| HAProxy (main) | 80 | Unified edge router |
| HAProxy stats | 8404 | http://localhost:8404/stats |
| AiHordeFrontpage | 8006 | Angular SSR (also via HAProxy on 80) |
| AI-Horde API | 7001 | Direct; also via /api on port 80 |
| Stats Exporter | 9109 | Prometheus metrics |
| Grafana | 3000 | Monitoring dashboards |
| Prometheus | 9090 | Metrics collection |
| Artbot HAProxy | 8080 | Artbot site (--with-artbot) |