Use these runbooks when an agent workflow is slow, stuck, noisy, or blocked by infrastructure. Start from the workspace, agent, session, turn, and operation IDs visible in the dashboard or API response.

Stuck session

Symptoms:
  • A session stays active but no new events arrive.
  • A submitted turn is queued longer than expected.
  • The dashboard reconnects but shows no state change.
Checks:
  1. Get the session and latest turns with GET /v1/agent_sessions/{id} and GET /v1/agent_turns?session=....
  2. Stream events with GET /v1/agent_events:stream?session=... to confirm whether the event cursor is advancing.
  3. Check for pending approvals with GET /v1/approval_requests?session=...&state=PENDING.
  4. If the turn is blocked by quota or runtime policy, surface the admission error to the user instead of retrying blindly.
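
The first three checks can be scripted. The following is a minimal sketch, not an official client: the base URL is a placeholder, it assumes the `session` query parameter takes the session ID, and it treats the event cursor as an opaque string sampled a few times from the stream. The helper names are hypothetical.

```python
from urllib.parse import quote, urlencode

# Hypothetical base URL; substitute the real API host for your deployment.
BASE_URL = "https://api.example.com"


def session_check_urls(session_id: str, base_url: str = BASE_URL) -> dict[str, str]:
    """Build the GET URLs for checks 1-3.

    Assumption: the `session` query parameter is the session ID.
    """
    query = urlencode({"session": session_id})
    return {
        "session": f"{base_url}/v1/agent_sessions/{quote(session_id, safe='')}",
        "turns": f"{base_url}/v1/agent_turns?{query}",
        "events_stream": f"{base_url}/v1/agent_events:stream?{query}",
        "pending_approvals": f"{base_url}/v1/approval_requests?{query}&state=PENDING",
    }


def cursor_is_advancing(cursor_samples: list[str]) -> bool:
    """Return True if the newest sampled cursor differs from the oldest.

    Cursors are compared only for equality because their format is not
    documented here.
    """
    if len(cursor_samples) < 2:
        return False
    return cursor_samples[-1] != cursor_samples[0]
```

If the cursor does not advance and there are no pending approvals, the admission error in check 4 is the next thing to look at.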

Stuck sandbox

Symptoms:
  • A turn requires a retained runtime, but the runtime does not become active.
  • A retained filesystem exists, but resume does not complete.
  • A sandbox stays in CREATING, STARTING, STOPPING, or DELETING past its expected deadline.
Checks:
  1. Inspect the turn’s runtime_decision and resolved_execution_mode.
  2. Check the sandbox state and last update time in the internal runtime view or operations dashboard.
  3. Confirm workspace sandbox quota and retained PVC quota are not exhausted.
  4. If a sandbox is stuck deleting, run the cleanup workflow. Do not remove PVCs manually unless the retention policy explicitly allows it.
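
To make "past its expected deadline" concrete, here is a minimal sketch that flags a sandbox as stuck from its state and last update time. The four transitional states come from the symptoms above; the deadline values and the function name are illustrative assumptions, not documented defaults.

```python
from datetime import datetime, timedelta, timezone

# Transitional states named in the symptoms above.
TRANSITIONAL_STATES = {"CREATING", "STARTING", "STOPPING", "DELETING"}

# Assumed deadlines for illustration only; use the values your platform documents.
DEADLINES = {
    "CREATING": timedelta(minutes=5),
    "STARTING": timedelta(minutes=5),
    "STOPPING": timedelta(minutes=10),
    "DELETING": timedelta(minutes=15),
}


def is_stuck(state: str, last_updated: datetime, now: datetime) -> bool:
    """Return True if a sandbox has sat in a transitional state past its deadline."""
    if state not in TRANSITIONAL_STATES:
        return False  # steady states are never reported as stuck
    return now - last_updated > DEADLINES[state]
```

Passing `now` explicitly, for example `datetime.now(timezone.utc)`, keeps the check easy to test and avoids mixing naive and timezone-aware timestamps.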

Provider outage

Symptoms:
  • Provider requests fail with repeated upstream errors.
  • Turns fail before producing useful assistant output.
  • Provider spend or token counters stop updating while requests continue.
Checks:
  1. Check provider exchange metadata for upstream error codes and response timing.
  2. Confirm whether the agent uses Akua-managed billing or workspace BYOK.
  3. For BYOK, verify the referenced secret exists and has an enabled version.
  4. Switch model/provider only through agent policy so the decision remains auditable.
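
One way to turn "repeated upstream errors" into a signal is to count the trailing run of failures. This is a minimal sketch that assumes the provider exchange metadata exposes an HTTP status code per request, and that 5xx and 429 responses count as upstream failures; both the field and the threshold are assumptions.

```python
def trailing_error_streak(status_codes: list[int]) -> int:
    """Count consecutive upstream failures (5xx or 429) at the end of the list."""
    streak = 0
    for code in reversed(status_codes):
        if code >= 500 or code == 429:
            streak += 1
        else:
            break  # the most recent success ends the streak
    return streak


def looks_like_outage(status_codes: list[int], threshold: int = 3) -> bool:
    """Return True when the trailing failure streak reaches the assumed threshold."""
    return trailing_error_streak(status_codes) >= threshold
```

A streak-based check ignores older failures that were followed by a success, which avoids flagging a provider that has already recovered.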

Runaway ambient trigger

Symptoms:
  • Many sessions start from the same signal.
  • Ambient trigger usage spikes.
  • Users see duplicate investigations for one resource.
Checks:
  1. Check the agent’s trigger severity threshold and cooldown.
  2. Search active sessions for the same resource reference before starting new work.
  3. Disable the trigger or raise the minimum severity while investigating.
  4. Review the trigger source for repeated identical events.
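
The severity threshold, cooldown, and duplicate-session checks can be combined into a single admission decision. The sketch below assumes a numeric severity and a per-resource record of the last session start; the `Signal` type and its fields are hypothetical, not the platform's schema.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass(frozen=True)
class Signal:
    """Hypothetical shape of an ambient trigger signal."""
    resource_ref: str
    severity: int
    received_at: datetime


def should_start_session(
    signal: Signal,
    min_severity: int,
    cooldown: timedelta,
    active_resource_refs: set[str],
    last_started: dict[str, datetime],
) -> bool:
    """Decide whether a signal should start a new session."""
    if signal.severity < min_severity:
        return False  # below the trigger's severity threshold
    if signal.resource_ref in active_resource_refs:
        return False  # an active session already covers this resource
    previous = last_started.get(signal.resource_ref)
    if previous is not None and signal.received_at - previous < cooldown:
        return False  # still inside the cooldown window
    return True
```

Raising `min_severity` while investigating corresponds to check 3 and needs no other change to the decision.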

Retained filesystem cleanup

Symptoms:
  • Retained storage grows unexpectedly.
  • Old sessions still have resumable filesystems.
  • PVC cleanup reports failures.
Checks:
  1. Check whether the session is pinned.
  2. Confirm the workspace tier and retention policy.
  3. Prefer the retained cleanup workflow so state transitions, audit events, and quota counters stay consistent.
  4. Preserve summaries, repository change requests, and git history before deleting expensive runtime state.
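
As an illustration of how these checks combine, here is a minimal sketch of a pre-cleanup eligibility test. The record fields and the retention window are assumptions. The actual deletion should still go through the retained cleanup workflow, so this only decides which filesystems to hand to it.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass(frozen=True)
class RetainedFilesystem:
    """Hypothetical record for a retained filesystem."""
    session_id: str
    pinned: bool
    last_used: datetime
    artifacts_preserved: bool  # summaries, change requests, git history saved


def eligible_for_cleanup(
    fs: RetainedFilesystem, retention: timedelta, now: datetime
) -> bool:
    """Return True if the filesystem can be handed to the cleanup workflow."""
    if fs.pinned:
        return False  # pinned sessions keep their filesystem
    if not fs.artifacts_preserved:
        return False  # preserve cheap artifacts before deleting runtime state
    return now - fs.last_used > retention
```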
