Limits and pricing

Before work starts, an admission gate checks session, concurrency, runtime, provider, rate, and ambient-trigger limits. Usage is metered separately for each dimension.

Hosted agents use the same workspace boundary as the rest of Akua. Sessions are cheap to keep open, but model usage, retained runtimes, filesystem storage, and automated triggers need explicit limits.

Limits that apply to agents

Limit	Scope	Canonical source
Active agent sessions	Workspace quota	Quotas →
Running turns	Workspace concurrency limit	Quotas → Concurrency limits →
Retained runtimes	Workspace quota and retention policy	Sessions and turns →
Provider usage	Model budget and rate limit	Configure an agent →
API and MCP calls	Rate limit	Quotas → Rate limits →
Ambient triggers	Workspace policy and cooldown	Ambient agents →

Agent limits use per-workspace admission control before expensive work starts, so an agent cannot silently exceed runtime, provider, or automation limits. When a limit blocks work before a turn is accepted, the API should return a structured error and create no turn. If work was accepted but a later runtime or budget gate blocks progress, the turn should expose a structured admission error that names the limit, scope, current usage, maximum, and retry timing when a retry is useful. Usage summaries should separate provider cost, retained runtime compute, retained filesystem storage, API/MCP calls, and ambient trigger runs.
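The admission check described above can be sketched as follows. This is a minimal illustration, not Akua's implementation: the names `Limit`, `AdmissionError`, and `admit`, the field names, and the limit keys are all assumptions made for the example. It shows a gate that returns a structured error naming the limit, scope, current usage, maximum, and optional retry timing.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Limit:
    """One configured limit (hypothetical shape, not an Akua type)."""
    name: str
    scope: str
    maximum: int
    retry_after_seconds: Optional[int] = None  # set only when a retry is useful


@dataclass(frozen=True)
class AdmissionError:
    """Structured error naming the limit that blocked work (assumed fields)."""
    limit: str
    scope: str
    current: int
    maximum: int
    retry_after_seconds: Optional[int] = None


def admit(usage: dict, limits: list) -> Optional[AdmissionError]:
    """Return None if one more unit of work fits under every limit,
    otherwise an error describing the first limit that is already full."""
    for limit in limits:
        current = usage.get(limit.name, 0)
        if current >= limit.maximum:
            return AdmissionError(
                limit=limit.name,
                scope=limit.scope,
                current=current,
                maximum=limit.maximum,
                retry_after_seconds=limit.retry_after_seconds,
            )
    return None


# Illustrative limits; the numbers are made up for the example.
limits = [
    Limit("active_agent_sessions", "workspace", maximum=10),
    Limit("running_turns", "workspace", maximum=2, retry_after_seconds=30),
]

admitted = admit({"active_agent_sessions": 3, "running_turns": 1}, limits)
blocked = admit({"active_agent_sessions": 3, "running_turns": 2}, limits)
```

In this sketch a caller would create a turn only when `admit` returns `None`, which matches the rule that a blocked request creates no turn.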

Pricing model

Agents support two billing paths:

Billing path	How it works
Akua-managed billing	Akua routes provider calls and charges usage according to the workspace plan.
Workspace BYOK	The workspace supplies provider credentials through secrets and pays the provider directly.

Retained runtimes are separate from model usage. A long-running coding task can consume compute, storage, and provider budget, while a read-only triage turn may only use platform tools and a small model budget.

Usage dimensions

Agent usage is reported separately so teams can see what drove cost or limits:

Dimension	What it measures
Provider tokens	Tokens sent to and returned from model providers.
Provider spend	Provider cost attributed to the workspace, agent, session, and turn.
Runtime compute	Seconds a retained sandbox spent running compute.
Retained storage	GB-hours for retained session filesystems.
API calls	Attributed platform API calls made by generated widgets, snippets, or Code Mode.
MCP calls	Tool calls made through MCP/Code Mode bridges.
Ambient triggers	Signal-triggered runs admitted by ambient policy.
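As a rough illustration of per-dimension reporting, the sketch below totals usage events by dimension and derives GB-hours from a filesystem size and a retention time. The event shape, the dimension keys, and the helper names `gb_hours` and `summarize` are assumptions for the example, not part of any documented Akua API.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class UsageEvent:
    """One metered quantity attributed to a turn (hypothetical shape)."""
    dimension: str
    quantity: float
    turn_id: str


def gb_hours(size_gb: float, retained_seconds: float) -> float:
    """Storage usage in GB-hours: size in GB times retention time in hours."""
    return size_gb * retained_seconds / 3600.0


def summarize(events: list) -> dict:
    """Total each dimension separately so cost drivers stay visible."""
    totals = defaultdict(float)
    for event in events:
        totals[event.dimension] += event.quantity
    return dict(totals)


# Illustrative events; the numbers are made up for the example.
events = [
    UsageEvent("provider_tokens", 1200, "turn-1"),
    UsageEvent("provider_tokens", 800, "turn-2"),
    UsageEvent("runtime_compute_seconds", 90, "turn-2"),
    # A 2 GB filesystem retained for 2 hours.
    UsageEvent("retained_storage_gb_hours", gb_hours(2.0, 7200), "turn-2"),
]

summary = summarize(events)
```

Keeping dimensions as separate keys, rather than collapsing them into one cost figure, is what lets a team see whether tokens, compute, or storage drove a bill.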

Cost controls

Use agent policy to keep spend predictable:

Set model budgets for agents that run automatically.
Prefer Code Mode for platform operations that do not need files or shell access.
Start retained runtimes only when repository, package manager, browser, or test work is required.
Use cooldowns for ambient triggers.
Expire retained filesystems unless the session is pinned.
Require approval before shell commands, network egress, accepting repository change requests, or secret access.
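One of the controls above, a cooldown for ambient triggers, can be sketched as a small gate that admits a trigger only when enough time has passed since the last admitted run. The class name `Cooldown`, the trigger key format, and the explicit `now` parameter are assumptions for the example. How Akua actually enforces cooldowns is not described here.

```python
class Cooldown:
    """Admit a trigger at most once per `seconds` window (illustrative only)."""

    def __init__(self, seconds: float) -> None:
        self.seconds = seconds
        self._last_admitted = {}  # trigger key -> time of last admitted run

    def allow(self, trigger_key: str, now: float) -> bool:
        """Return True and record the run if the cooldown has elapsed."""
        last = self._last_admitted.get(trigger_key)
        if last is not None and now - last < self.seconds:
            return False  # still cooling down; rejected attempts do not reset the timer
        self._last_admitted[trigger_key] = now
        return True


# Hypothetical trigger key; times are seconds on any monotonic clock.
cooldown = Cooldown(seconds=60)
first = cooldown.allow("deploy-failed", now=0)
too_soon = cooldown.allow("deploy-failed", now=30)
after_window = cooldown.allow("deploy-failed", now=60)
other_trigger = cooldown.allow("disk-pressure", now=30)
```

Passing `now` explicitly keeps the sketch deterministic. A real caller would likely pass a monotonic clock reading such as `time.monotonic()`.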

Plan comparison

Platform plans and included limits.

Quota model

How resource, concurrency, and rate limits work.

Configure an agent

Set model policy, runtime policy, grants, and triggers.

Permissions and security

Review approvals, grants, and audit behavior.
