Version: 2026.1

Troubleshooting

Symptoms, likely causes, and where to look. Start with the first matching row; the "Diagnose" column tells you which signal rules the cause in or out.

Agent server won't start or seems unhealthy

Symptom	Likely cause	Diagnose
Container restarts repeatedly	Missing required env var (e.g. `AGENT_SERVER_ADMIN_TOKEN`, BYOK API key)	`docker compose logs agent-server` — error lines name the missing variable.
Startup succeeds but `/agent-server/api/health` returns 500	Session store cannot reach Pimcore DB	Check that the PHP container is up; see `docker compose logs php`.
`/agent-server/api/agents` returns `[]` after a fresh start	Initial registry fetch failed; background retry is running	Logs show `Agent registry retry` on an exponential schedule (5 s → 10 s → … → 60 s). Wait, or hit the manual reload endpoint.
Running with a BYOK provider, dropdown shows no models	`available_models` not defined in the provider block	Add an `available_models` map to the provider in `pimcore_agent.inference.providers.<name>`. See Inference Providers.
Session creation fails with "unknown provider" error	Agent's `provider:` field references a name not in the providers map	Check that the provider name in the agent YAML matches a key in `pimcore_agent.inference.providers`. Provider names are case-sensitive.

See Architecture → Configuration System → Reload & recovery paths for the retry schedule and manual reload command.
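
To get a feel for how long the background retry may take before the registry fills, the sketch below generates a capped exponential schedule. The doubling factor and the helper name `retry_delays` are assumptions made for illustration; the table only states that delays run from 5 s through 10 s up to 60 s.

```python
from itertools import islice
from typing import Iterator


def retry_delays(initial: float = 5.0, cap: float = 60.0,
                 factor: float = 2.0) -> Iterator[float]:
    """Yield retry delays in seconds, growing exponentially up to `cap`.

    Hypothetical helper: the factor of 2 is an assumption, not taken
    from the agent-server source.
    """
    delay = initial
    while True:
        yield min(delay, cap)
        # Clamp the stored value too, so it never grows without bound.
        delay = min(delay * factor, cap)


# First six delays under these assumptions.
print(list(islice(retry_delays(), 6)))
```

Under these assumptions the delay settles at the 60 s cap after four attempts, so if waiting is not an option, the manual reload endpoint is the faster route.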

Agent not responding

Symptom	Likely cause	Diagnose
No text appears after sending a message	Task is still running; the SSE stream is live but slow	Run `docker compose logs -f agent-server` and look for `Task started` / timing entries. Set `AGENT_SERVER_LOG_LEVEL=debug` for per-event detail.
HTTP 409 on `POST /chat/:sessionId`	A task for this session is already running	Either wait for completion, or `POST /chat/:sessionId/cancel` the current task.
HTTP 401 on every agent-server call	Session cookie is missing or invalid	Log in to Pimcore Studio again. See Architecture → Authentication.
Anthropic BYOK mode, response appears all-at-once instead of streaming	Known SDK limitation: no `message_delta` events from Anthropic	Expected behaviour, tracked in copilot-sdk#637. The UI still appears to stream because the server synthesises deltas, but they start only once the full response has arrived.
`streamThinking: true` ignored in BYOK mode	SDK does not expose Anthropic thinking events in BYOK	Expected, no workaround.

Tool calls fail or are refused

Symptom	Likely cause	Diagnose
`This tool is not available in the agent environment`	Always-denied tool (`bash`, `task`, etc.)	Expected — see Architecture → Tool Security. Not recoverable, not configurable.
`Access denied: path is outside the allowed scope`	Path sandbox violation (SDK file tool)	Confirm the path is inside `/app/uploads/{sessionId}/{uploaded,staged}/` or a specific file in `/tmp/`. Directory listings on `/tmp/` are denied.
MCP tool returns permission error	Pimcore user permissions	The agent runs as the logged-in user. Grant permissions in Pimcore user settings.
Chat-scoped tool (`stage_asset`, `propose_*`, …) returns "no chat session context"	Tool was called outside the agent-server chat flow (e.g. directly from a PAT-authenticated MCP client)	Expected — chat-scoped tools require the bearer-authenticated request that binds the chat session id. They cannot be called from stand-alone MCP clients.
Adding a new MCP tool — tool not found at runtime	Compiler pass hasn't picked up the tag	`docker compose exec php bin/console cache:clear` and restart the PHP container.
First tool call after a long idle returns "Session not found" / 404 from an MCP server, then the next user turn works again	Upstream MCP transport session was garbage-collected; the agent-server detects it and resets the SDK session on the next turn	Expected — see Architecture → MCP Integration → Transport-session recovery. The user only sees a small recovery delay; conversation history is preserved.
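
The path-scope rule in the table can be approximated as follows when checking a path before re-prompting the agent. This is a sketch only: `is_path_allowed` is a hypothetical helper, and it assumes that only absolute paths are accepted, that the scope directories themselves are excluded, and that "a specific file in `/tmp/`" means a direct child of `/tmp`. The real sandbox may differ, for example in how it treats symlinks or nested paths.

```python
import posixpath


def is_path_allowed(path: str, session_id: str) -> bool:
    """Approximate the SDK file-tool scope rule (hypothetical helper)."""
    # Collapse "." and ".." segments so "../" cannot escape the scope.
    normalized = posixpath.normpath(path)
    if not normalized.startswith("/"):
        return False  # assumption: relative paths are rejected

    # Anything strictly inside the session's uploaded/ or staged/ directory.
    for sub in ("uploaded", "staged"):
        root = f"/app/uploads/{session_id}/{sub}/"
        if normalized.startswith(root):
            return True

    # In /tmp only a direct child is accepted, never the directory itself.
    # A pure string check cannot tell files from directories.
    parent, name = posixpath.split(normalized)
    return parent == "/tmp" and name != ""


print(is_path_allowed("/app/uploads/abc/uploaded/a.png", "abc"))
print(is_path_allowed("/app/uploads/abc/staged/../../other/x", "abc"))
print(is_path_allowed("/tmp/", "abc"))
```

The second call is rejected because normalisation resolves the `..` segments to a path outside the session directory, and the third because it names the `/tmp/` directory rather than a file in it.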

Proposals

Symptom	Likely cause	Diagnose
Proposal widget shows "Proposal not found" on approve	Stored payload was lost (session deleted?)	Check `bundle_agent_proposal_statuses`. Sessions cascade-delete proposals.
Proposal approve fails with a permission error	User permissions changed between propose and approve	Expected — resolvers re-check permissions. Reject, fix permissions, re-prompt the agent.
Proposal approve fails with "stale data"	The element was modified after proposal creation	Expected — `modificationDate` mismatch prevents silent overwrites. Reject, re-prompt the agent with the latest state.
Proposal card renders with empty element paths	Bulk fetch failed or returned incomplete data	Check `/pimcore-studio/api/bundle/agent/proposals/{sid}/data` in DevTools. The bundle-fetched payload is the single source of truth; LLM-supplied metadata is ignored intentionally.

See Features → HITL Proposals for the expected lifecycle and Extending → Custom Proposal Types for custom flows.
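
The "stale data" rejection is an optimistic-concurrency check. A minimal sketch of the idea follows, assuming the proposal stores the element's `modificationDate` as an integer timestamp at creation time; the `Proposal` class and `is_stale` function are hypothetical names, not the bundle's API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Proposal:
    """Hypothetical stand-in for a stored proposal."""
    element_id: int
    modification_date: int  # element's modificationDate when proposed


def is_stale(proposal: Proposal, current_modification_date: int) -> bool:
    """True when the element changed after the proposal was created."""
    return proposal.modification_date != current_modification_date


proposal = Proposal(element_id=42, modification_date=1_700_000_000)
print(is_stale(proposal, 1_700_000_000))  # element untouched since proposal
print(is_stale(proposal, 1_700_000_360))  # element edited in the meantime
```

Any mismatch counts as stale in this sketch, which is why the recommended path is to reject and re-prompt the agent rather than retry the approval.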

Sessions and reconnection

Symptom	Likely cause	Diagnose
Sessions disappear after container restart	Session data is stored in Pimcore DB — not the agent-server	Verify DB connectivity and that the bundle is installed (`pimcore:bundle:install PimcoreAgentBundle`).
After a container recreate, resuming an existing chat makes the agent "start over" — re-runs the same tool calls, ignores earlier results	The Copilot runtime session store (`events.jsonl`, `session.db`, checkpoints) is not on a durable mount, so it was wiped and `resumeSession` reloaded an empty conversation. The PHP chat transcript still shows the old messages, but the model lost its working context.	Confirm `AGENT_SERVER_COPILOT_STATE_DIR` (default `/app/.copilot-state`) is bind-mounted (`./var/tmp/copilot-state`) and that `events.jsonl` files appear under it — not in the container's `~/.copilot`. See Session Storage → Copilot runtime session state.
Reconnect to `/stream?seq=N` returns 204	No active task for that session	The task finished before you reconnected. This seq-based endpoint is server-to-server / eval-CLI only; the browser uses `GET /sessions/:id` catch-up instead.
Reconnect replays nothing	TaskRunner in-memory buffer TTL expired (5 min after completion)	The buffer is only the live tail; fetch the record via `GET /sessions/:id`. The assistant message was persisted incrementally and finalized by `onComplete`.
Internal MCP calls return 401 mid-conversation	Bearer was reminted but the cached SDK session still has the old one baked in	The next user turn auto-rebuilds the SDK session (`tokenReminted: true`). If it persists, check `pimcore_agent.chat_session_token.ttl` and confirm the maintenance task isn't GC-ing rows mid-turn.
Long overnight run completed but result missing from chat	Should not occur after the `mcp-token-authentication` change (bearer-bound persistence survives cookie expiry)	If it does: (1) check that `security.yaml` has `pimcore_agent_bundle_api: '%pimcore_agent.bundle_api_firewall_settings%'` placed before the `pimcore_studio` firewall; (2) check that `AGENT_SERVER_MCP_TOKEN_TTL` matches `pimcore_agent.chat_session_token.ttl`; (3) check the agent-server logs for `Token refresh tick failed` warnings (the server-driven refresh timer fires every `max(60s, ttl/2)`).
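
When reading the logs for `Token refresh tick failed`, it helps to know how often a tick should appear. The sketch below works out the `max(60s, ttl/2)` interval and flags a TTL mismatch between the two settings. Both function names are hypothetical, and the assumption that both TTLs are expressed in seconds should be verified against your configuration.

```python
def refresh_interval_seconds(ttl_seconds: float) -> float:
    """Interval of the server-driven refresh timer: max(60s, ttl/2)."""
    return max(60.0, ttl_seconds / 2)


def ttl_mismatch(agent_server_ttl: float, bundle_ttl: float) -> bool:
    """True when AGENT_SERVER_MCP_TOKEN_TTL and
    pimcore_agent.chat_session_token.ttl disagree (both assumed seconds)."""
    return agent_server_ttl != bundle_ttl


print(refresh_interval_seconds(3600))  # one-hour TTL
print(refresh_interval_seconds(90))    # short TTL hits the 60 s floor
print(ttl_mismatch(3600, 1800))
```

With a one-hour TTL, a refresh tick would be expected roughly every 30 minutes, so a gap much longer than that in the logs of an overnight run is worth investigating.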

Real-time / multi-client sync

Symptom	Likely cause	Diagnose
Live updates do not arrive on the first login after a fresh auth (work in another tab is invisible until reload)	Studio's `GlobalMessageBus` opened its Mercure subscription before the Mercure cookie was set, so the hub did not authorise the user topic for private delivery	Reload the page — it re-runs `fetchMercureCookie()` and the subscription is re-authorised for the user topic. PHP catch-up still reconstructs the session, so no data is lost.
No live updates on any tab, but chat works and reload shows the result	Mercure publisher disabled — `MERCURE_JWT_KEY` unset/blank	Logs show `MERCURE_JWT_KEY not set — live cross-client chat sync disabled` at startup. Set the shared key (≥ 32 chars; the same secret the hub validates against) and forward it into the `agent-server` service. See Architecture → Real-time Sync.
Logs show `Mercure publish non-OK` with status `401`	Publisher JWT rejected by the hub — blank/short/mismatched `MERCURE_JWT_KEY`, or a wrong publish selector	The key must be ≥ 32 chars and identical to the hub's `MERCURE_PUBLISHER_JWT_KEY`. The publish selector is the URI Template `studio-backend-default/user/{id}` (a trailing-`/*` glob is rejected 401).
A reopened session shows an assistant bubble stuck "streaming" forever	Either an agent-server restart interrupted the turn (out of scope to recover — surfaced as interrupted), or the terminal `complete` flush never reached PHP	Check agent-server logs around the turn for `Incremental message flush failed` / `Persist sink finalize failed`. A reconnect-refetch (reload, or `online`/`visibilitychange`) re-reads PHP.
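
The key requirements from the two Mercure rows above can be checked before restarting anything. The following is a sketch under stated assumptions: `check_publisher_key` is a hypothetical helper, and it assumes you can read both the value forwarded to `agent-server` and the hub's `MERCURE_PUBLISHER_JWT_KEY` in the same place.

```python
import hmac
from typing import List, Optional

MIN_KEY_LENGTH = 32  # minimum key length stated in the table


def check_publisher_key(agent_key: Optional[str], hub_key: str) -> List[str]:
    """Return a list of problems with the agent-server's MERCURE_JWT_KEY."""
    if agent_key is None or not agent_key.strip():
        return ["MERCURE_JWT_KEY is unset or blank (live sync disabled)"]

    problems = []
    if len(agent_key) < MIN_KEY_LENGTH:
        problems.append(f"key is shorter than {MIN_KEY_LENGTH} characters")
    # Constant-time comparison avoids leaking the secret through timing.
    if not hmac.compare_digest(agent_key.encode(), hub_key.encode()):
        problems.append("key differs from the hub's MERCURE_PUBLISHER_JWT_KEY")
    return problems


hub = "x" * 40
print(check_publisher_key(None, hub))
print(check_publisher_key("short", hub))
print(check_publisher_key(hub, hub))
```

An empty list means the key passes both checks; a remaining `401` would then point at the publish selector instead.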

Frontend

Symptom	Likely cause	Diagnose
Frontend plugin does not appear in Studio	Build output not picked up	`npm run build` in `assets/`, then `bin/console cache:clear`.
Widget renders as plain text	Widget type not registered in the renderer registry	Verify `container.get('AgentChat/RichChatWidgetRegistry').register(...)` ran.
SSE stream closes early	Nginx buffering	Nginx needs `proxy_buffering off` and `chunked_transfer_encoding off` on the `/agent-server/api/` location. See Installation → Configure the Nginx Proxy.

Config changes not taking effect

Symptom	Likely cause	Diagnose
Agent YAML edit not visible after save	No reload was triggered	Studio UI auto-reloads on save (`AgentServerProxyService::triggerReload()`). For code-level edits, `POST /agent-server/api/admin/reload-agents`.
Added a new `pimcore_agent.agents.paths` entry — presets still missing	Path list is a compiled container parameter	`docker compose exec php bin/console cache:clear`. Subsequent edits at an already-registered path do not need `cache:clear`.
Skill content change not visible	Agent reload required	Hit the reload endpoint. Skill files are materialized on every reload.
Env var change not picked up	`env_file` is read only at container start	`docker compose restart agent-server`.

Reading the logs

Every request logs a timing summary at `info` level:

{ "msg": "Request timing summary",
  "data": { "totalMs": 179111,
            "timeToFirstEventMs": 6,
            "modelMs": 10579,
            "totalToolMs": 86643,
            "toolCallCount": 21,
            "askUserPausedMs": 81889,
            "slowTools": [{"tool": "ask_user", "ms": 78027}] } }

See Architecture → Agent Framework → Performance instrumentation for field meanings.
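
As an example of reading such an entry, the sketch below parses the summary shown above and separates time spent waiting on the user from the rest. It assumes that `askUserPausedMs` is included in `totalMs`; this is an interpretation for illustration, so check the field meanings in the Architecture section before relying on it. The `summarize` function is a hypothetical helper.

```python
import json

# The timing summary from the example above, as a single log line.
LOG_LINE = (
    '{"msg": "Request timing summary", "data": {"totalMs": 179111, '
    '"timeToFirstEventMs": 6, "modelMs": 10579, "totalToolMs": 86643, '
    '"toolCallCount": 21, "askUserPausedMs": 81889, '
    '"slowTools": [{"tool": "ask_user", "ms": 78027}]}}'
)


def summarize(line: str) -> dict:
    """Extract a few derived figures from a timing summary (sketch)."""
    data = json.loads(line)["data"]
    total = data["totalMs"]
    paused = data.get("askUserPausedMs", 0)
    slowest = max(data.get("slowTools", []), key=lambda t: t["ms"], default=None)
    return {
        # Assumption: paused time is part of totalMs.
        "active_ms": total - paused,
        "paused_share": paused / total if total else 0.0,
        "slowest_tool": slowest["tool"] if slowest else None,
    }


print(summarize(LOG_LINE))
```

Read this way, a long `totalMs` in the example is dominated by the `ask_user` pause rather than by model or tool latency.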

Audit events (authentication, admin actions, tool denials) are logged at "level": "audit". See Architecture → Authentication → Audit log.

Still stuck?

`docker compose logs -f agent-server php nginx`: watch all three services at once.
`curl http://localhost/agent-server/api/health`: is the server up?
`curl -H "Authorization: Bearer $AGENT_SERVER_ADMIN_TOKEN" http://localhost/agent-server/api/admin/models`: do the configured models validate?
Read the Architecture section for the subsystem you suspect is failing.
