Bring Your Own LLM

Route Warp's agents through your AWS Bedrock models for billing control and infrastructure flexibility.

Warp supports Bring Your Own LLM (BYOLLM) for enterprise teams that need to run inference on their own cloud infrastructure. With BYOLLM, your team can use Warp’s agents while routing inference through models hosted in your AWS Bedrock environment.

This gives you control over cloud spend and model hosting, without changing how your team works in Warp.

  • Cloud-native credentials - No long-lived API keys. Interactive terminal sessions use each user’s AWS CLI session credentials; cloud agent runs assume an IAM role in your AWS account via OIDC.
  • Admin-controlled IAM - Admins define which IAM role(s) Warp can assume in your AWS account.
  • Admin-enforced routing - Team admins configure which models are available to users in AWS Bedrock, with the ability to disable non-Bedrock model access entirely.
  • Consolidated billing - Inference costs are billed directly to your AWS account, leveraging existing cloud commitments.

When BYOLLM is enabled, Warp redirects inference calls to your AWS Bedrock environment instead of using model providers’ direct APIs.

Here’s the high-level flow:

Interactive terminal flow

  1. Admin configures routing - Your team admin sets routing policies in Warp’s admin settings (e.g., “Route Claude Opus 4.7 through AWS Bedrock; disable direct Anthropic API”).
  2. Team members authenticate - Each team member authenticates to AWS locally using the AWS CLI (aws login).
  3. Warp routes requests - When a team member uses an interactive agent in the terminal, Warp uses their short-lived session credentials to authenticate requests to your configured AWS Bedrock API endpoint.
  4. Inference executes in your cloud - The model runs in your AWS account. Responses return to the Warp client.

Cloud agent flow

  1. Admin configures routing - Your team admin configures BYOLLM in the Admin Panel and provides an IAM role ARN that Warp can assume. See Enabling BYOLLM for Cloud Agents for setup details.
  2. Warp assumes the role - At run start, Warp mints an OIDC token and assumes the configured IAM role in your AWS account to obtain temporary credentials.
  3. Warp routes requests - The cloud agent uses those temporary credentials to call your configured AWS Bedrock endpoint.
  4. Inference executes in your cloud - The model runs in your AWS account. Responses return to the cloud agent worker.

BYOLLM uses cloud-native IAM authentication, not long-lived API keys:

  • Automatic refresh - With auto-refresh enabled, session tokens refresh about every 15 minutes, and sessions can run uninterrupted for up to 12 hours (depending on your AWS admin configuration). Users can enable auto-refresh by opening Settings and searching for AWS Bedrock, or when prompted the first time their credentials expire.
  • Per-user credentials - Credentials are not shared across the organization. Your cloud provider’s default credential provider chain (e.g., AWS CLI) provisions and refreshes them locally.
  • No storage or logging - Warp never stores or logs your cloud session tokens on its servers.

This approach ensures access management stays with your cloud provider, giving admins member-by-member control.

BYOLLM supports the intersection of models that Warp supports and models available on AWS Bedrock. Currently, only Claude models (Anthropic) are available through BYOLLM on AWS Bedrock. The OpenAI and Google models that Warp supports are not available on Bedrock.

To determine which models you can use with BYOLLM, compare the models Warp supports with the models available in AWS Bedrock. A model must appear on both lists to be available through BYOLLM.
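
The "both lists" rule is a set intersection. The following is a minimal sketch in Python; the model names are hypothetical placeholders, not entries from Warp's or AWS's actual catalogs.

```python
def byollm_available(warp_supported: set, bedrock_enabled: set) -> list:
    """Return the models usable through BYOLLM: those present on both lists."""
    return sorted(warp_supported & bedrock_enabled)


# Hypothetical placeholder names, for illustration only.
warp_supported = {"claude-model-a", "claude-model-b", "other-provider-model"}
bedrock_enabled = {"claude-model-a", "claude-model-c"}

print(byollm_available(warp_supported, bedrock_enabled))
```

Only `claude-model-a` is on both lists, so it is the only model this sketch reports as available.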

Before configuring BYOLLM, confirm the following:

  • Your organization has the desired models enabled in AWS Bedrock.
  • You have admin access to both Warp’s Admin Panel and your AWS IAM settings.
  • Team members have the AWS CLI installed locally.

In the Admin Panel, configure which models should route through AWS Bedrock:

  1. From the Admin Panel, navigate to the Models page.
  2. Select which models should use your cloud provider (e.g., “Claude Opus 4.7 via AWS Bedrock”).
  3. Optionally, disable direct API access to enforce provider-only routing.

Grant your team members the necessary permissions in AWS. Use least-privilege IAM policies.

Example: AWS Bedrock minimum IAM policy

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "BedrockModelAccess",
      "Effect": "Allow",
      "Action": [
        "bedrock:InvokeModel",
        "bedrock:InvokeModelWithResponseStream"
      ],
      "Resource": [
        "arn:aws:bedrock:*::foundation-model/*",
        "arn:aws:bedrock:*:*:inference-profile/*",
        "arn:aws:bedrock:*:*:application-inference-profile/*"
      ]
    }
  ]
}
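
The example policy above uses wildcards for region, account, and model. If you want a tighter scope, one option is to generate the Resource list from the regions and models you actually use. The sketch below is an assumption about how you might do that, not a policy Warp publishes: the function name, the account ID `123456789012`, and the `<model-id>` placeholder are all illustrative. If requests use cross-region inference profiles, the foundation-model ARNs for each destination region may also need to be listed.

```python
import json


def bedrock_invoke_policy(account_id, regions, model_ids):
    """Build an invoke-only policy limited to the given account, regions, and models."""
    resources = []
    for region in regions:
        for model_id in model_ids:
            # Foundation-model ARNs have an empty account field.
            resources.append(f"arn:aws:bedrock:{region}::foundation-model/{model_id}")
        # Inference profiles are scoped to the account and region.
        resources.append(f"arn:aws:bedrock:{region}:{account_id}:inference-profile/*")
        resources.append(
            f"arn:aws:bedrock:{region}:{account_id}:application-inference-profile/*"
        )
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "BedrockModelAccess",
                "Effect": "Allow",
                "Action": [
                    "bedrock:InvokeModel",
                    "bedrock:InvokeModelWithResponseStream",
                ],
                "Resource": resources,
            }
        ],
    }


# Placeholder values; substitute your own account, region, and model ID.
policy = bedrock_invoke_policy("123456789012", ["us-east-1"], ["<model-id>"])
print(json.dumps(policy, indent=2))
```

Test any narrowed policy with a real request before rolling it out, since a missing resource shows up as an insufficient-permissions error.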

Each team member authenticates to AWS using the AWS CLI:

aws login

Confirm your AWS environment and region are correctly configured before using Warp.
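
One quick way to check the region is to look at the standard AWS environment variables. The sketch below assumes the usual AWS CLI precedence, where `AWS_REGION` overrides `AWS_DEFAULT_REGION`; how Warp itself resolves the region is not stated here, and the function name is illustrative.

```python
import os
from typing import Mapping, Optional


def effective_region(env: Mapping[str, str]) -> Optional[str]:
    """Return the region set through environment variables, if any.

    AWS_REGION takes precedence over AWS_DEFAULT_REGION. A region set only in
    the shared config file (~/.aws/config) is not detected by this sketch.
    """
    return env.get("AWS_REGION") or env.get("AWS_DEFAULT_REGION")


if __name__ == "__main__":
    region = effective_region(os.environ)
    profile = os.environ.get("AWS_PROFILE", "default")
    print(f"profile={profile} region={region or 'not set via environment'}")
```

Running `aws sts get-caller-identity` is a common complementary check that the current session resolves to the identity you expect.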

Run a test prompt in Warp using a model configured for BYOLLM routing. Verify:

  • The request completes successfully.
  • Logs appear in AWS CloudWatch.

Cloud agents authenticate to AWS Bedrock differently from the local terminal flow above. Instead of relying on each user’s AWS CLI session, Warp assumes an IAM role you provision in your AWS account using OIDC identity federation.

Before configuring BYOLLM for cloud agents, confirm the following:

  • You have admin access to both Warp’s Admin Panel and your AWS IAM settings.

1. Set up Warp as an OIDC identity provider in AWS (cloud admin)

Before AWS can trust tokens issued by Warp, register Warp as an OpenID Connect (OIDC) identity provider in IAM. This is a one-time setup per AWS account.

  1. Open the Identity providers page in the AWS IAM console.
  2. Click Add provider.
  3. For Provider type, choose OpenID Connect.
  4. For Provider URL, enter https://app.warp.dev.
  5. For Audience, enter sts.amazonaws.com.
  6. Click Add provider.

After the provider is created, copy its ARN — it will look like arn:aws:iam::<aws-account-id>:oidc-provider/app.warp.dev. You’ll reference this ARN in the trust policy in the next step.

For more detail, see AWS’s Create an OpenID Connect (OIDC) identity provider in IAM guide.

2. Provision an assumable IAM role (cloud admin)

Create an IAM role that Warp can assume via OIDC, then attach the minimum Bedrock permissions policy. Use least-privilege IAM policies.

The role setup has two parts:

  1. A trust policy that allows Warp’s OIDC identity to call sts:AssumeRoleWithWebIdentity.
  2. A permissions policy that grants the minimum Bedrock inference permissions.

This trust policy authorizes any cloud-hosted run from your team. The sub claim Warp signs has the shape scoped_principal:<team-uid>/<actor-type>:<principal-uid>, where <actor-type> is user for user-triggered runs or service_account for cloud agent runs. The <team-uid>/* pattern below covers both.

Example trust policy

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "Federated": "arn:aws:iam::<aws-account-id>:oidc-provider/app.warp.dev"
      },
      "Action": "sts:AssumeRoleWithWebIdentity",
      "Condition": {
        "StringLike": {
          "app.warp.dev:sub": "scoped_principal:<team-uid>/*"
        },
        "StringEquals": {
          "app.warp.dev:aud": "sts.amazonaws.com"
        }
      }
    }
  ]
}

Replace the account ID, issuer host, and team UID with values for your environment.
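
To see why the `<team-uid>/*` pattern covers both actor types, the sketch below approximates IAM's `StringLike` matching with Python's `fnmatchcase`. This is an approximation: `fnmatchcase` also gives `[` a special meaning, which IAM does not, so it is only suitable for patterns like the ones shown. The principal UID `abc123` is hypothetical, and the narrower `service_account:*` pattern is an assumption derived from the claim shape described above, so verify it before relying on it.

```python
from fnmatch import fnmatchcase


def sub_matches(pattern: str, sub: str) -> bool:
    """Approximate IAM StringLike matching ('*' and '?' wildcards, case-sensitive)."""
    return fnmatchcase(sub, pattern)


team = "HzjUdNkg8Uiq8gp6FMgfxe"  # team UID from the Admin Panel URL example
user_sub = f"scoped_principal:{team}/user:abc123"  # hypothetical principal UID
agent_sub = f"scoped_principal:{team}/service_account:abc123"

team_wide = f"scoped_principal:{team}/*"
agents_only = f"scoped_principal:{team}/service_account:*"

print(sub_matches(team_wide, user_sub), sub_matches(team_wide, agent_sub))
print(sub_matches(agents_only, user_sub), sub_matches(agents_only, agent_sub))
```

The team-wide pattern matches both subjects, while the narrower pattern matches only the `service_account` subject.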

The <team-uid> is the Warp team UID for the team that will be allowed to assume this role. You can find it in your team’s Admin Panel URL as the path segment after /admin/. For example, in https://app.warp.dev/admin/HzjUdNkg8Uiq8gp6FMgfxe/models, the team UID is HzjUdNkg8Uiq8gp6FMgfxe.
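
If you script the role setup, the placeholders can be filled in programmatically. The following sketch extracts the team UID from an Admin Panel URL and renders the trust policy shown above. The function names and the account ID `123456789012` are illustrative assumptions, not part of any Warp tooling.

```python
import json
from urllib.parse import urlparse


def team_uid_from_admin_url(url: str) -> str:
    """Return the path segment that follows /admin/ in an Admin Panel URL."""
    segments = [s for s in urlparse(url).path.split("/") if s]
    try:
        return segments[segments.index("admin") + 1]
    except (ValueError, IndexError):
        raise ValueError(f"no team UID found after /admin/ in {url!r}") from None


def trust_policy(account_id: str, team_uid: str, issuer_host: str = "app.warp.dev") -> dict:
    """Render the example trust policy with the placeholders filled in."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {
                    "Federated": f"arn:aws:iam::{account_id}:oidc-provider/{issuer_host}"
                },
                "Action": "sts:AssumeRoleWithWebIdentity",
                "Condition": {
                    "StringLike": {f"{issuer_host}:sub": f"scoped_principal:{team_uid}/*"},
                    "StringEquals": {f"{issuer_host}:aud": "sts.amazonaws.com"},
                },
            }
        ],
    }


uid = team_uid_from_admin_url("https://app.warp.dev/admin/HzjUdNkg8Uiq8gp6FMgfxe/models")
print(json.dumps(trust_policy("123456789012", uid), indent=2))
```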

Attach the minimum Bedrock invoke permissions policy to the role:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "BedrockModelAccess",
      "Effect": "Allow",
      "Action": [
        "bedrock:InvokeModel",
        "bedrock:InvokeModelWithResponseStream"
      ],
      "Resource": [
        "arn:aws:bedrock:*::foundation-model/*",
        "arn:aws:bedrock:*:*:inference-profile/*",
        "arn:aws:bedrock:*:*:application-inference-profile/*"
      ]
    }
  ]
}

After you create the role, copy its ARN. You’ll paste it into the Models page in the next step.

Attach the IAM role from Step 2 to your team or to a specific named agent.

Team-wide: apply the OIDC role to all cloud agent runs on the team.

  1. In the Admin Panel, navigate to the Models page.
  2. Under the AWS Bedrock host configuration, paste the IAM role ARN from Step 2 into the Role ARN field.
  3. Select which models should route through AWS Bedrock.

Named agent: apply the OIDC role only to runs from a specific named agent.

In the Oz web app:

  1. Create a new agent or edit an existing one.
  2. In the agent form, expand the AWS Bedrock section.
  3. Choose Custom and paste the IAM role ARN from Step 2.
  4. Ensure the agent’s default model is one that’s enabled for Bedrock under the Admin Panel Models page.

New runs for this agent will authenticate to Bedrock using the configured role.

Start a test cloud agent run using a model configured for BYOLLM routing. Verify:

  • The request completes successfully.
  • Logs appear in AWS CloudWatch.

When a request routes through BYOLLM:

  • Warp does not consume AI credits for that request.
  • Cloud agent runs still consume platform and compute credits for orchestration and the cloud agent’s compute.

See The three credit buckets for more on credit types.

Warp’s agents automatically select the best model for your task while respecting your admin’s routing policies. If you configure a model for BYOLLM, requests for that model route to AWS Bedrock.

If a BYOLLM request fails (e.g., due to role assumption errors, insufficient permissions, or provider quota limits), Warp attempts to fall back to the next available model your admin has enabled.

For example, if Claude Opus 4.7 on Bedrock fails but your admin also enabled it via direct API, Warp falls back to the direct API to avoid disruption. If a fallback uses a direct API model, that request consumes Warp AI credits.

If no fallback is available (e.g., the admin disabled all non-Bedrock models), Warp displays a clear error message.
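
The fallback behavior described above can be sketched as follows. This is an illustrative model only, not Warp's implementation: every name here (`RoutingPolicy`, `run_with_fallback`, `NoRouteAvailable`) is hypothetical, and the sketch covers only the simplest case, where the same model falls back from Bedrock to the direct API.

```python
from dataclasses import dataclass, field


class NoRouteAvailable(Exception):
    """Raised when neither Bedrock nor a direct API route can serve the request."""


@dataclass
class RoutingPolicy:
    bedrock_models: set = field(default_factory=set)  # models routed through Bedrock
    direct_models: set = field(default_factory=set)   # models with direct API enabled


def run_with_fallback(model, policy, call_bedrock, call_direct):
    """Try Bedrock first; on failure, use the direct API if the admin enabled it.

    Returns a (route, response) tuple.
    """
    bedrock_error = None
    if model in policy.bedrock_models:
        try:
            return "bedrock", call_bedrock(model)
        except Exception as exc:  # e.g. role assumption, permissions, or quota errors
            bedrock_error = exc
    if model in policy.direct_models:
        return "direct", call_direct(model)  # this route would consume Warp credits
    raise NoRouteAvailable(f"no enabled route for {model}") from bedrock_error


def failing_bedrock(model):
    raise RuntimeError("simulated quota error")


policy = RoutingPolicy(bedrock_models={"model-x"}, direct_models={"model-x"})
print(run_with_fallback("model-x", policy, failing_bedrock, lambda m: "ok"))
```

With direct API access disabled for the model, the same call raises `NoRouteAvailable`, which corresponds to the error case described above.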

  • No long-lived API keys — BYOLLM uses cloud-native IAM with short-lived session tokens.
  • Per-user authentication — Each team member authenticates individually; credentials are not shared.
  • No storage or logging — Warp never stores or logs your cloud session tokens on its servers.

Warp maintains SOC 2 compliance and has Zero Data Retention (ZDR) agreements with its contracted LLM providers.

However, when using BYOLLM:

  • Your cloud account settings determine data retention policies.
  • Warp cannot enforce ZDR for requests routed through your infrastructure.
  • If your cloud account does not have ZDR enabled, your provider may retain data according to their terms.

  • Warp keeps all runs fully steerable and logged within Warp.
  • Your cloud account retains provider-side logs (usage, latency, errors).

  • Missing or expired local credentials (interactive terminal use) - Re-authenticate using aws login. To avoid interruptions, enable auto-refresh by opening Settings and searching for AWS Bedrock, or when prompted during credential expiration.
  • Role assumption failed (cloud agent runs) - Verify the IAM trust policy, issuer host, team UID restriction, and the configured role ARN in Warp.
  • Missing OIDC provider (cloud agent runs) - Confirm the OIDC provider exists in your AWS account for the issuer host referenced in the trust policy.
  • Insufficient permissions - Verify your IAM policy includes the required Bedrock actions and any needed resources.
  • Region or model mismatch - Confirm the model is enabled in your AWS region and that your environment is configured for the correct region.
  • Provider quota limits - Check your AWS Bedrock quota and request increases if needed.

  1. Confirm the configured role ARN is the one you intended Warp to assume.
  2. Check the IAM trust policy and verify the issuer host, sub, and aud conditions match your Warp configuration.
  3. Check the attached IAM policy for the required Bedrock permissions.
  4. Confirm the model ID and region match your Warp configuration.
  5. Inspect AWS CloudWatch logs for request details and errors.
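
The trust policy check in step 2 can be partly automated. The sketch below compares a trust policy document against the values used in the example trust policy earlier on this page. The function names are hypothetical, and the check covers only the fields shown in that example; it does not replace AWS's own policy evaluation.

```python
def _as_list(value):
    """IAM allows a single string or a list for Action; normalize to a list."""
    if value is None:
        return []
    return value if isinstance(value, list) else [value]


def trust_policy_problems(policy, account_id, team_uid, issuer_host="app.warp.dev"):
    """Return human-readable mismatches; an empty list means none were found."""
    expected_provider = f"arn:aws:iam::{account_id}:oidc-provider/{issuer_host}"
    statements = policy.get("Statement", [])
    if isinstance(statements, dict):
        statements = [statements]
    stmt = next(
        (s for s in statements
         if "sts:AssumeRoleWithWebIdentity" in _as_list(s.get("Action"))),
        None,
    )
    if stmt is None:
        return ["no statement allows sts:AssumeRoleWithWebIdentity"]

    problems = []
    if stmt.get("Principal", {}).get("Federated") != expected_provider:
        problems.append(f"Principal.Federated should be {expected_provider}")
    condition = stmt.get("Condition", {})
    sub = condition.get("StringLike", {}).get(f"{issuer_host}:sub", "")
    if not sub.startswith(f"scoped_principal:{team_uid}/"):
        problems.append("sub condition does not reference the expected team UID")
    if condition.get("StringEquals", {}).get(f"{issuer_host}:aud") != "sts.amazonaws.com":
        problems.append("aud condition should be sts.amazonaws.com")
    return problems
```

One way to obtain the document to check is `aws iam get-role`, which returns the role's trust policy in the `AssumeRolePolicyDocument` field.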

BYOK (Bring Your Own API Key) lets individual users add their own API keys for direct model provider access (e.g., Anthropic, OpenAI, Google). Warp stores keys locally on the user’s device.

BYOLLM (Bring Your Own LLM) routes inference through your organization’s cloud infrastructure (AWS Bedrock) using cloud-native IAM. Admins configure it at the admin level and it applies to the entire team.

Feature             | BYOK                    | BYOLLM
Configuration level | User                    | Admin/Team
Authentication      | API keys (local)        | IAM role assumed by Warp via OIDC
Billing             | Direct to provider      | Your cloud account
Data locality       | Provider infrastructure | Your cloud infrastructure

Auto model selection is disabled if an admin disables any Direct API model, regardless of AWS Bedrock configuration.

When Direct API models remain enabled and BYOLLM is configured, Auto picks the best model for the task. If the selected model is also enabled for AWS Bedrock, the request routes through Bedrock; otherwise it routes through the Direct API.
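
The two Auto rules above reduce to a small decision function. This sketch is only a restatement of the described behavior under assumed inputs; the function name, parameters, and model names are hypothetical.

```python
def auto_route(selected_model, all_direct_models, enabled_direct_models, bedrock_models):
    """Return 'bedrock', 'direct', or None when Auto is unavailable."""
    if set(all_direct_models) - set(enabled_direct_models):
        return None  # at least one Direct API model is disabled, so Auto is off
    return "bedrock" if selected_model in bedrock_models else "direct"


all_direct = ["model-a", "model-b"]

print(auto_route("model-a", all_direct, all_direct, {"model-a"}))
print(auto_route("model-b", all_direct, all_direct, {"model-a"}))
print(auto_route("model-a", all_direct, ["model-a"], {"model-a"}))
```

The three calls cover a model routed through Bedrock, a model routed through the Direct API, and the case where Auto is disabled because one Direct API model was turned off.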

Inference runs in your AWS account, which AWS bills directly. Warp does not consume AI credits for BYOLLM-routed inference. Cloud agent runs continue to consume platform and compute credits for orchestration. See The three credit buckets for more.

What data does Warp store? Do you store our cloud credentials?

Warp does not store or log your cloud credentials.

  • Interactive terminal use — Credentials are used transiently to sign requests and are never persisted on Warp servers.
  • Cloud agent runs — Temporary AWS credentials are used only for the duration of the run and are not retained after it ends.

Can admins enforce provider-only routing and disable Warp-managed models?

Yes. Admins can configure routing policies to require specific models to use BYOLLM and disable direct API access to Warp-managed model endpoints.