Self-hosting


Run Oz cloud agents on your own infrastructure. Choose between a managed worker daemon orchestrated by Oz and unmanaged CLI-based execution you control.

Self-hosting lets your team run Oz cloud agent workloads on your own infrastructure instead of Warp-managed servers. You control the execution environment, compute resources, and network access. Repositories are cloned and stored only on your machines, and agents can reach services behind your VPN or firewall.

New to self-hosting? Start with the Self-hosting quickstart to get a managed worker running on Docker in under 10 minutes.

Want a CLI-only path with no Docker requirement? Jump straight to the Unmanaged quickstart to run oz agent run directly on any host.

Self-hosting has two architectures. The core distinction is who orchestrates agent runs — not who owns the compute. Both models keep code and execution on your infrastructure.

  • Managed — Oz orchestrates agent runs. You run the oz-agent-worker daemon on your infrastructure; it connects to Oz and waits for work. Slack mentions, Linear comments, schedules, API calls, and oz agent run-cloud commands all route tasks to your worker, which executes them in isolated Docker containers, Kubernetes Jobs, or directly on the host. Similar to a GitHub self-hosted runner.
  • Unmanaged — You orchestrate agent runs. You invoke oz agent run directly from your existing CI pipeline, Kubernetes pod, VM, or dev box. Oz provides session tracking and observability for each run, but does not start or stop agents for you.
| Aspect | Managed | Unmanaged |
| --- | --- | --- |
| Who triggers runs | Oz (Slack, Linear, schedules, API, run-cloud) | Your system (CI, cron, scripts) |
| What runs on your infra | Long-lived oz-agent-worker daemon | One-shot oz agent run invocations |
| OS support | Linux (macOS/Windows coming) | Linux, macOS, Windows |
| Execution isolation | Docker container, Kubernetes Job, or direct host | Whatever your host provides |
| Automatic environment setup | Yes (via Warp environments) | No (you manage it) |
| Session tracking and steering | Yes | Yes |

The two architectures are not mutually exclusive. Some teams run managed workers for integration-triggered work and unmanaged agents in CI pipelines.
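
On the command line, the split looks like this. A minimal sketch, not a definitive invocation: the daemon is assumed to accept the --worker-id flag referenced in the routing examples later on this page, and --prompt is assumed to work for oz agent run as it does for run-cloud; check each binary's --help for specifics.

```sh
# Managed: start a long-lived worker that connects to Oz and waits for
# tasks. (Sketch -- --worker-id is referenced later on this page; other
# daemon flags will vary by version.)
oz-agent-worker --worker-id my-worker

# Unmanaged: a one-shot run your own system triggers, e.g. from CI or cron.
# (--prompt is assumed to work for `run` as it does for `run-cloud`.)
oz agent run --prompt "Run dead code cleanup"
```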

Warp uses a split-plane architecture: execution happens on your infrastructure, while orchestration, session management, and LLM inference route through Warp’s backend. Agent interactions — including code context in session transcripts and LLM prompts — transit Warp’s control plane under Zero Data Retention (ZDR) agreements. Warp does not persistently store your source code or train on it.

[Diagram: Self-hosted Oz architecture showing customer-managed execution with Oz orchestration]

With any self-hosted architecture:

  • Agent runs are tracked and steerable — View status, metadata, and session transcripts in the Oz dashboard, the Warp app, or via the API/SDK. Authorized teammates can attach to running sessions to monitor or steer agents.
  • Connectivity to Warp’s backend is required — Agents need outbound access to Warp for orchestration, session storage, and LLM inference. No inbound ports need to be opened (a quick connectivity check follows this list).
  • Resource limits are controlled by your infrastructure — Concurrency and compute are only limited by the machines you provision, not by Warp.
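
A quick way to verify outbound reachability from a host. This only checks HTTPS egress to app.warp.dev (the API host used later on this page); the full set of endpoints agents need may be broader:

```sh
# Print the HTTP status from Warp's API host to confirm outbound HTTPS
# works; a timeout here usually means your egress rules block it.
curl -sS -o /dev/null -w '%{http_code}\n' https://app.warp.dev
```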

Use these questions to decide between managed and unmanaged:

  1. Do you need agents to run on Windows or macOS?
    • Yes → Use the unmanaged architecture. Managed is Linux-only today.
    • No, Linux works → Continue to the next question.
  2. Do you want Oz to handle starting and stopping agents (from Slack, the web interface, the Warp app, schedules, or the API)?
    • Yes → Use the managed architecture.
    • No, you have your own triggering mechanism → Use the unmanaged architecture.

Two further signals can tip the decision: a development environment that runs cleanly in a Docker container or Kubernetes pod points toward managed, while an existing orchestrator (CI/CD, Kubernetes, internal job scheduler) that starts agents on demand points toward unmanaged.

The managed architecture supports three backends for task execution:

  1. Are you deploying the worker into a Kubernetes cluster?
    • Yes → Use the Kubernetes backend. Each task runs as a Kubernetes Job in your cluster; install with the included Helm chart (sketched after this list).
    • No → Continue.
  2. Is Docker available on your worker host?
    • Yes → Use the Docker backend (default). Tasks run in isolated containers.
    • No → Use the Direct backend. Tasks run directly on the host.

Two further signals: needing container-level isolation between tasks rules out the Direct backend, and needing Kubernetes-native scheduling, resource management, or policy enforcement points to the Kubernetes backend.
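
A hedged sketch of the Helm install mentioned above. The chart path, namespace, and value name here are placeholders, not the real chart coordinates; use the chart that ships with oz-agent-worker and its documented values:

```sh
# Placeholder Helm install of the managed worker into a cluster.
# ./oz-agent-worker-chart and workerId are assumptions for illustration.
helm install oz-agent-worker ./oz-agent-worker-chart \
  --namespace oz-agents --create-namespace \
  --set workerId=my-worker
```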

With the managed architecture, you run the oz-agent-worker daemon on your infrastructure. The daemon connects to Oz’s backend, waits for tasks to be assigned to it, and executes those tasks on its host using one of three backends:

  • Docker backend (default) — Runs each task in an isolated Docker container.
  • Kubernetes backend — Runs each task as a Kubernetes Job in your cluster.
  • Direct backend — Runs each task directly on the host without a container runtime.

The managed architecture enables full orchestration by Oz — it can remotely start agents via Slack, Linear, the Oz web app, the API/SDK, and the oz agent run-cloud command. Agents can access host resources through volume mounts (Docker), Kubernetes-native configuration (Kubernetes), and injected environment variables.

With the unmanaged architecture, you run oz agent run inside your own orchestrator or dev environment. This works on any platform Warp supports (Linux, macOS, Windows), with no dependency on Docker or any other sandboxing platform.

You’re responsible for executing oz agent run on your infrastructure — similar to how you’d integrate Claude Code or Codex CLI. The agent runs directly on the host, which could itself be a Kubernetes pod, VM, container, or CI runner.
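
As a concrete (hypothetical) example, an unmanaged run can be a single step in a CI job, with the runner itself acting as the execution host:

```sh
#!/bin/sh
# Minimal CI step sketch: the pipeline has already checked out the repo;
# the agent executes directly on this runner while Oz tracks the session.
# (--prompt is assumed to work for `run` as it does for `run-cloud`.)
set -e
oz agent run --prompt "Fix the failing lint warnings in src/"
```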


This section applies to all managed backends (Docker, Kubernetes, and Direct). Once a worker is connected, route Oz cloud agent runs to it by specifying the --host flag (or equivalent) with your worker ID. The --host value must match the --worker-id of a connected worker exactly.

```sh
oz agent run-cloud --prompt "Refactor the authentication module" --host "my-worker"
```

You can combine --host with any other run-cloud flags, such as --environment, --model, --mcp, --skill, --computer-use, and --attach.
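
For example, to pin an environment and attach to the session while routing to your worker (ENV_ID is a placeholder):

```sh
oz agent run-cloud \
  --prompt "Refactor the authentication module" \
  --host "my-worker" \
  --environment ENV_ID \
  --attach
```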

When creating or updating a schedule, specify the host:

```sh
oz schedule create --name "daily-cleanup" \
  --cron "0 9 * * *" \
  --prompt "Run dead code cleanup" \
  --environment ENV_ID \
  --host "my-worker"

oz schedule update SCHEDULE_ID --host "my-worker"
```

When creating or updating an integration, specify the host:

```sh
oz integration create slack --host "my-worker" ...
oz integration update linear --host "my-worker" ...
```

All tasks created through that integration route to your self-hosted worker.

When creating a run via the Oz API, include worker_host in the config:

```sh
curl -X POST https://app.warp.dev/api/v1/agent/run \
  --header 'Authorization: Bearer YOUR_API_KEY' \
  --header 'Content-Type: application/json' \
  --data '{
    "prompt": "Refactor the authentication module",
    "config": {
      "environment_id": "ENV_ID",
      "worker_host": "my-worker"
    }
  }'
```

When creating a run, schedule, or integration in the Oz web app, select your self-hosted worker from the host dropdown.


Self-hosted workers fully support environments. When a task specifies an environment, the worker resolves the Docker image, clones the repositories, runs setup commands, and executes the agent inside the prepared container or Kubernetes Job.

The same environment can be used for both Warp-hosted and self-hosted runs without modification. See Environments for details on creating and configuring them.
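
Concretely, moving a run from Warp-hosted to self-hosted is just a matter of adding --host; the environment reference is unchanged (ENV_ID is a placeholder):

```sh
# Warp-hosted run:
oz agent run-cloud --prompt "Run the test suite" --environment ENV_ID

# Same environment on your own worker:
oz agent run-cloud --prompt "Run the test suite" --environment ENV_ID --host "my-worker"
```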

Self-hosted runs have the same observability as Warp-hosted runs:

  • Oz dashboard — View task status, history, and metadata at oz.warp.dev.
  • Session sharing — Authorized teammates can attach to running tasks to monitor progress.
  • APIs and SDKs — Query task history and build monitoring using the Oz API (a hedged sketch follows this list).
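
This page only shows the POST /api/v1/agent/run endpoint, so the read path below is an assumption: a hypothetical GET for run history to illustrate the shape of an API-based monitor. Check the API reference for the real endpoint:

```sh
# Hypothetical: list recent runs. Host and auth header match the earlier
# example; the /api/v1/agent/runs path is an assumption, not documented here.
curl -sS https://app.warp.dev/api/v1/agent/runs \
  --header 'Authorization: Bearer YOUR_API_KEY'
```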

For infrastructure-level observability, the oz-agent-worker daemon can export OpenTelemetry metrics (worker health, task throughput, capacity saturation) to Prometheus, an OTLP collector, or the console. See Monitoring for setup, the full metric catalog, and sample PromQL queries.
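
If you enable the Prometheus exporter, a first smoke test is scraping the worker's metrics endpoint by hand. The address below is a placeholder, since the real port and path come from your Monitoring configuration:

```sh
# Placeholder scrape of the worker's Prometheus endpoint; substitute the
# address configured in Monitoring. Filters for worker-related metric names.
curl -sS http://localhost:9090/metrics | grep -i oz
```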