Version: 2026.1

Troubleshooting

Symptoms, likely causes, and where to look. Start with the first matching row; the "diagnose" column tells you which signal rules the cause in or out.

Agent server won't start or seems unhealthy

| Symptom | Likely cause | Diagnose |
| --- | --- | --- |
| Container restarts repeatedly | Missing required env var (e.g. AGENT_SERVER_ADMIN_TOKEN, BYOK API key) | docker compose logs agent-server: error lines name the missing variable. |
| Startup succeeds but /agent-server/api/health returns 500 | Session store cannot reach Pimcore DB | Check that the PHP container is up; docker compose logs php. |
| /agent-server/api/agents returns [] after a fresh start | Initial registry fetch failed; background retry is running | Logs show Agent registry retry on an exponential schedule (5 s → 10 s → … → 60 s). Wait, or hit the manual reload endpoint. |
| Running with a BYOK provider, dropdown shows no models | available_models not defined in the provider block | Add an available_models map to the provider in pimcore_agent.inference.providers.<name>. See Inference Providers. |
| Session creation fails with "unknown provider" error | Agent's provider: field references a name not in the providers map | Check that the provider name in the agent YAML matches a key in pimcore_agent.inference.providers. Provider names are case-sensitive. |

See Architecture → Configuration System → Reload & recovery paths for the retry schedule and manual reload command.
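
To estimate how long the background retry may take before the agent list fills in, the schedule can be sketched as follows. The source only states 5 s → 10 s → … → 60 s; the doubling factor and the function name `retry_delays` are assumptions made for illustration, not the agent-server's actual code.

```python
from itertools import islice
from typing import Iterator


def retry_delays(initial: float = 5.0, cap: float = 60.0, factor: float = 2.0) -> Iterator[float]:
    """Yield retry delays in seconds, growing by `factor` and never exceeding `cap`.

    Assumption: the documented "5 s -> 10 s -> ... -> 60 s" schedule doubles each step.
    """
    delay = initial
    while True:
        yield min(delay, cap)
        delay = min(delay * factor, cap)


if __name__ == "__main__":
    # First six delays under the doubling assumption.
    print(list(islice(retry_delays(), 6)))  # [5.0, 10.0, 20.0, 40.0, 60.0, 60.0]
```

Under that assumption the fifth attempt starts roughly 75 seconds after the first failure, so waiting a couple of minutes before reaching for the manual reload is usually reasonable.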

Agent not responding

| Symptom | Likely cause | Diagnose |
| --- | --- | --- |
| No text appears after sending a message | Task is still running; the SSE stream is live but slow | docker compose logs -f agent-server, then look for Task started / timing entries. Set AGENT_SERVER_LOG_LEVEL=debug for per-event detail. |
| HTTP 409 on POST /chat/:sessionId | A task for this session is already running | Either wait for completion, or POST /chat/:sessionId/cancel to cancel the current task. |
| HTTP 401 on every agent-server call | Session cookie is missing or invalid | Log in to Pimcore Studio again. See Architecture → Authentication. |
| Anthropic BYOK mode, response appears all at once instead of streaming | Known SDK limitation: no message_delta events from Anthropic | Expected behaviour, tracked in copilot-sdk#637. The UI appears to stream because the server synthesises deltas, but the text only becomes available at the end of the response. |
| streamThinking: true ignored in BYOK mode | SDK does not expose Anthropic thinking events in BYOK | Expected, no workaround. |

Tool calls fail or are refused

| Symptom | Likely cause | Diagnose |
| --- | --- | --- |
| This tool is not available in the agent environment | Always-denied tool (bash, task, etc.) | Expected; see Architecture → Tool Security. Not recoverable, not configurable. |
| Access denied: path is outside the allowed scope | Path sandbox violation (SDK file tool) | Confirm the path is inside /app/uploads/{sessionId}/{uploaded,staged}/ or a specific file in /tmp/. Directory listings on /tmp/ are denied. |
| MCP tool returns permission error | Pimcore user permissions | The agent runs as the logged-in user. Grant permissions in Pimcore user settings. |
| Chat-scoped tool (stage_asset, propose_*, …) returns "no chat session context" | Tool was called outside the agent-server chat flow (e.g. directly from a PAT-authenticated MCP client) | Expected: chat-scoped tools require the bearer-authenticated request that binds the chat session id. They cannot be called from stand-alone MCP clients. |
| Newly added MCP tool is not found at runtime | Compiler pass hasn't picked up the tag | docker compose exec php bin/console cache:clear and restart the PHP container. |
| First tool call after a long idle returns "Session not found" / 404 from an MCP server, then the next user turn works again | Upstream MCP transport session was garbage-collected; the agent-server detects it and resets the SDK session on the next turn | Expected; see Architecture → MCP Integration → Transport-session recovery. The user only sees a small recovery delay; conversation history is preserved. |
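
When an "Access denied" row is the match, it can help to test a path against the documented scope before re-prompting. The sketch below approximates the rule from the table and is not the agent-server's real check: the function name `is_path_allowed` is hypothetical, and treating any path below /tmp/ as "a specific file" is an assumption, since a string check cannot tell files from directories.

```python
import posixpath


def is_path_allowed(path: str, session_id: str) -> bool:
    """Approximate the documented sandbox scope for SDK file tools."""
    # normpath collapses "..", "." and repeated slashes, so traversal tricks
    # such as ".../staged/../../other" are judged by where they really point.
    normalized = posixpath.normpath(path)

    upload_roots = (
        f"/app/uploads/{session_id}/uploaded",
        f"/app/uploads/{session_id}/staged",
    )
    for root in upload_roots:
        if normalized == root or normalized.startswith(root + "/"):
            return True

    # /tmp itself (a directory listing) is refused; only paths below it pass.
    return normalized.startswith("/tmp/")


if __name__ == "__main__":
    print(is_path_allowed("/app/uploads/s1/uploaded/report.pdf", "s1"))  # True
    print(is_path_allowed("/tmp/", "s1"))                                # False
```

A path that fails this check will most likely be refused by the sandbox as well; a path that passes may still be refused if the real rule is stricter than this approximation.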

Proposals

| Symptom | Likely cause | Diagnose |
| --- | --- | --- |
| Proposal widget shows "Proposal not found" on approve | Stored payload was lost (session deleted?) | Check bundle_agent_proposal_statuses. Sessions cascade-delete proposals. |
| Proposal approve fails with a permission error | User permissions changed between propose and approve | Expected: resolvers re-check permissions. Reject, fix permissions, then re-prompt the agent. |
| Proposal approve fails with "stale data" | The element was modified after proposal creation | Expected: a modificationDate mismatch prevents silent overwrites. Reject, then re-prompt the agent with the latest state. |
| Proposal card renders with empty element paths | Bulk fetch failed or returned incomplete data | Check /pimcore-studio/api/bundle/agent/proposals/{sid}/data in DevTools. The bundle-fetched payload is the single source of truth; LLM-supplied metadata is ignored intentionally. |

See Features → HITL Proposals for the expected lifecycle and Extending → Custom Proposal Types for custom flows.
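
The "stale data" rejection is an optimistic-concurrency check: the modification date seen at proposal time must still match at approval time. A minimal sketch of that idea follows. The class and function names are hypothetical, and the bundle's actual resolvers are not shown in this page, so treat this as an illustration of the principle only.

```python
from dataclasses import dataclass


class StaleProposalError(Exception):
    """Raised when the element changed after the proposal was created."""


@dataclass(frozen=True)
class Proposal:
    element_id: int
    # modificationDate of the element, captured when the proposal was created.
    modification_date_at_creation: int


def ensure_fresh(proposal: Proposal, current_modification_date: int) -> None:
    """Refuse to apply a proposal whose target has changed since it was proposed."""
    if current_modification_date != proposal.modification_date_at_creation:
        raise StaleProposalError(
            f"element {proposal.element_id} was modified after the proposal was created"
        )


if __name__ == "__main__":
    proposal = Proposal(element_id=42, modification_date_at_creation=1_700_000_000)
    ensure_fresh(proposal, 1_700_000_000)      # unchanged element: passes silently
    try:
        ensure_fresh(proposal, 1_700_000_500)  # element edited in the meantime
    except StaleProposalError as err:
        print(err)
```

This is why rejecting and re-prompting works: the agent builds a new proposal against the element's current state, so the dates match again.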

Sessions and reconnection

| Symptom | Likely cause | Diagnose |
| --- | --- | --- |
| Sessions disappear after container restart | Session data is stored in the Pimcore DB, not in the agent-server | Verify DB connectivity and that the bundle is installed (pimcore:bundle:install PimcoreAgentBundle). |
| After a container recreate, resuming an existing chat makes the agent "start over": it re-runs the same tool calls and ignores earlier results | The Copilot runtime session store (events.jsonl, session.db, checkpoints) is not on a durable mount, so it was wiped and resumeSession reloaded an empty conversation. The PHP chat transcript still shows the old messages, but the model lost its working context. | Confirm AGENT_SERVER_COPILOT_STATE_DIR (default /app/.copilot-state) is bind-mounted (./var/tmp/copilot-state) and that events.jsonl files appear under it, not in the container's ~/.copilot. See Session Storage → Copilot runtime session state. |
| Reconnect to /stream?seq=N returns 204 | No active task for that session | The task finished before you reconnected. This seq-based endpoint is server-to-server / eval-CLI only; the browser uses GET /sessions/:id catch-up instead. |
| Reconnect replays nothing | TaskRunner in-memory buffer TTL expired (5 min after completion) | The buffer is only the live tail; fetch the record via GET /sessions/:id. The assistant message was persisted incrementally and finalized by onComplete. |
| Internal MCP calls return 401 mid-conversation | Bearer was reminted but the cached SDK session still has the old one baked in | The next user turn auto-rebuilds the SDK session (tokenReminted: true). If it persists, check pimcore_agent.chat_session_token.ttl and confirm the maintenance task isn't GC-ing rows mid-turn. |
| Long overnight run completed but result missing from chat | Should not occur after the mcp-token-authentication change (bearer-bound persistence survives cookie expiry). | If it does occur: (1) check that security.yaml has pimcore_agent_bundle_api: '%pimcore_agent.bundle_api_firewall_settings%' placed before the pimcore_studio firewall; (2) check that AGENT_SERVER_MCP_TOKEN_TTL matches pimcore_agent.chat_session_token.ttl; (3) check agent-server logs for Token refresh tick failed warnings (the server-driven refresh timer fires every max(60s, ttl/2)). |
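
To judge how often Token refresh tick failed warnings could appear in the logs, the refresh interval from the last row can be worked out directly. This sketch assumes the TTL is expressed in seconds, which the max(60s, ttl/2) formula suggests but this page does not state; the function name is hypothetical.

```python
def refresh_interval_seconds(ttl_seconds: float) -> float:
    """Interval of the server-driven refresh timer: max(60 s, ttl / 2).

    Assumption: the configured TTL is given in seconds.
    """
    return max(60.0, ttl_seconds / 2)


if __name__ == "__main__":
    for ttl in (3600, 600, 90):
        print(ttl, "->", refresh_interval_seconds(ttl))
    # 3600 -> 1800.0
    # 600 -> 300.0
    # 90 -> 60.0
```

For any TTL below 120 seconds the 60-second floor applies, so the timer no longer fires at half the TTL.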

Real-time / multi-client sync

| Symptom | Likely cause | Diagnose |
| --- | --- | --- |
| Live updates do not arrive on the first login after a fresh auth (work in another tab is invisible until reload) | Studio's GlobalMessageBus opened its Mercure subscription before the Mercure cookie was set, so the hub did not authorise the user topic for private delivery | Reload the page: it re-runs fetchMercureCookie() and the subscription is re-authorised for the user topic. PHP catch-up still reconstructs the session, so no data is lost. |
| No live updates on any tab, but chat works and reload shows the result | Mercure publisher disabled: MERCURE_JWT_KEY unset/blank | Logs show MERCURE_JWT_KEY not set — live cross-client chat sync disabled at startup. Set the shared key (≥ 32 chars; the same secret the hub validates against) and forward it into the agent-server service. See Architecture → Real-time Sync. |
| Logs show Mercure publish non-OK with status 401 | Publisher JWT rejected by the hub: blank/short/mismatched MERCURE_JWT_KEY, or a wrong publish selector | The key must be ≥ 32 chars and identical to the hub's MERCURE_PUBLISHER_JWT_KEY. The publish selector is the URI Template studio-backend-default/user/{id} (a trailing /* glob is rejected with 401). |
| A reopened session shows an assistant bubble stuck "streaming" forever | Either an agent-server restart interrupted the turn (out of scope to recover; surfaced as interrupted), or the terminal complete flush never reached PHP | Check agent-server logs around the turn for Incremental message flush failed / Persist sink finalize failed. A reconnect-refetch (reload, or online/visibilitychange) re-reads PHP. |
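
The key requirements in the two MERCURE_JWT_KEY rows (set, at least 32 characters, identical to the hub's publisher key) can be checked mechanically before restarting anything. A minimal sketch, assuming both values can be read into the same process; the function name `mercure_key_problems` is hypothetical.

```python
from typing import List, Optional

MIN_KEY_LENGTH = 32  # minimum length stated in the table above


def mercure_key_problems(agent_key: Optional[str], hub_key: Optional[str]) -> List[str]:
    """Return human-readable problems with the agent-server's MERCURE_JWT_KEY."""
    if agent_key is None or not agent_key.strip():
        # Nothing else is worth checking when the key is missing entirely.
        return ["MERCURE_JWT_KEY is unset or blank"]

    problems = []
    if len(agent_key) < MIN_KEY_LENGTH:
        problems.append(f"MERCURE_JWT_KEY is shorter than {MIN_KEY_LENGTH} characters")
    if agent_key != hub_key:
        problems.append("MERCURE_JWT_KEY differs from the hub's MERCURE_PUBLISHER_JWT_KEY")
    return problems


if __name__ == "__main__":
    import os

    issues = mercure_key_problems(
        os.environ.get("MERCURE_JWT_KEY"),
        os.environ.get("MERCURE_PUBLISHER_JWT_KEY"),
    )
    print(issues or "key looks consistent")
```

An empty result does not rule out a wrong publish selector, which produces the same 401.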

Frontend

| Symptom | Likely cause | Diagnose |
| --- | --- | --- |
| Frontend plugin does not appear in Studio | Build output not picked up | npm run build in assets/, then bin/console cache:clear. |
| Widget renders as plain text | Widget type not registered in the renderer registry | Verify container.get('AgentChat/RichChatWidgetRegistry').register(...) ran. |
| SSE stream closes early | Nginx buffering | Nginx needs proxy_buffering off and chunked_transfer_encoding off on the /agent-server/api/ location. See Installation → Configure the Nginx Proxy. |

Config changes not taking effect

| Symptom | Likely cause | Diagnose |
| --- | --- | --- |
| Agent YAML edit not visible after save | No reload was triggered | Studio UI auto-reloads on save (AgentServerProxyService::triggerReload()). For code-level edits, POST /agent-server/api/admin/reload-agents. |
| Added a new pimcore_agent.agents.paths entry, but presets are still missing | Path list is a compiled container parameter | docker compose exec php bin/console cache:clear. Subsequent edits at an already-registered path do not need cache:clear. |
| Skill content change not visible | Agent reload required | Hit the reload endpoint. Skill files are materialized on every reload. |
| Env var change not picked up | env_file is read only at container start | docker compose restart agent-server. |

Reading the logs

Every request logs a timing summary at the info level:

{
  "msg": "Request timing summary",
  "data": {
    "totalMs": 179111,
    "timeToFirstEventMs": 6,
    "modelMs": 10579,
    "totalToolMs": 86643,
    "toolCallCount": 21,
    "askUserPausedMs": 81889,
    "slowTools": [{"tool": "ask_user", "ms": 78027}]
  }
}

See Architecture → Agent Framework → Performance instrumentation for field meanings.
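
When a request feels slow, it is often quicker to turn a timing summary into percentages than to read the raw milliseconds. The sketch below parses the sample line above and reports where the time went. It assumes each log line is a single JSON object with at least the msg and data keys shown; the function name `summarize_timing` is hypothetical, and the authoritative field meanings are in the Architecture section.

```python
import json

# The sample entry from this page, joined onto a single line.
SAMPLE_LINE = (
    '{"msg": "Request timing summary", "data": {"totalMs": 179111, '
    '"timeToFirstEventMs": 6, "modelMs": 10579, "totalToolMs": 86643, '
    '"toolCallCount": 21, "askUserPausedMs": 81889, '
    '"slowTools": [{"tool": "ask_user", "ms": 78027}]}}'
)


def summarize_timing(line: str) -> dict:
    """Express the main timing fields as a share of totalMs."""
    record = json.loads(line)
    if record.get("msg") != "Request timing summary":
        raise ValueError("not a timing summary line")
    data = record["data"]
    total_ms = data["totalMs"]

    def share(field: str) -> float:
        # Percentage of the total request time, rounded to one decimal place.
        return round(100 * data.get(field, 0) / total_ms, 1)

    slow_tools = sorted(data.get("slowTools", []), key=lambda t: t["ms"], reverse=True)
    return {
        "total_s": round(total_ms / 1000, 1),
        "model_pct": share("modelMs"),
        "tool_pct": share("totalToolMs"),
        "ask_user_paused_pct": share("askUserPausedMs"),
        "slowest_tool": slow_tools[0]["tool"] if slow_tools else None,
    }


if __name__ == "__main__":
    print(summarize_timing(SAMPLE_LINE))
```

For the sample, this reports about 5.9 % model time, 48.4 % tool time, and 45.7 % paused on ask_user. In this particular entry modelMs, totalToolMs, and askUserPausedMs happen to add up exactly to totalMs; whether that always holds is not stated here.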

Audit events (authentication, admin actions, tool denials) are logged at "level": "audit". See Architecture → Authentication → Audit log.

Still stuck?

  • docker compose logs -f agent-server php nginx — watch all three at once.
  • curl http://localhost/agent-server/api/health — is the server up?
  • curl -H "Authorization: Bearer $AGENT_SERVER_ADMIN_TOKEN" http://localhost/agent-server/api/admin/models — do the configured models validate?
  • Read the Architecture section for the subsystem you suspect is failing.