How Threads & Persistence Work
Architecture and mental model behind CopilotKit threads — how persistent conversations work, how reconnection replays history, and what to expect from thread lifecycle operations.
What are threads?
A thread is a persistent, server-side container for a multi-turn conversation between a user and an agent. Unlike ephemeral chat sessions that disappear when the page reloads, threads store the full event history — every message, tool call, and state change — so conversations can be paused, resumed, and replayed across sessions and devices.
Threads are a platform-level concept, not tied to any specific agent framework. Whether your backend uses LangGraph, Mastra, CrewAI, or any other framework, threads work the same way.
Key concepts
Thread vs. Run
A thread is the durable container. A run is a single agent execution within that thread. One thread can have many runs — each time the user sends a message and the agent responds, that's a new run. The thread accumulates events across all its runs.
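The relationship can be sketched as a small data model. The shapes below (`ThreadEvent`, `ThreadRecord`, `recordRun`, `runIdsOf`) are hypothetical and exist only to illustrate the idea; they are not the CopilotKit schema.

```typescript
// Hypothetical shapes for illustration; this is not the CopilotKit schema.
type ThreadEvent =
  | { kind: "message"; runId: string; role: "user" | "assistant"; text: string }
  | { kind: "tool_call"; runId: string; tool: string }
  | { kind: "state_change"; runId: string; state: Record<string, unknown> };

interface ThreadRecord {
  id: string;
  events: ThreadEvent[]; // accumulates across every run
}

// Append one run's events; the thread outlives the run that produced them.
function recordRun(thread: ThreadRecord, runEvents: ThreadEvent[]): ThreadRecord {
  return { id: thread.id, events: thread.events.concat(runEvents) };
}

// Distinct run IDs, in the order they first appear in the thread.
function runIdsOf(thread: ThreadRecord): string[] {
  const ids = thread.events.map((e) => e.runId);
  return ids.filter((id, i) => ids.indexOf(id) === i);
}

let demoThread: ThreadRecord = { id: "t1", events: [] };
demoThread = recordRun(demoThread, [
  { kind: "message", runId: "r1", role: "user", text: "Hi" },
  { kind: "message", runId: "r1", role: "assistant", text: "Hello!" },
]);
demoThread = recordRun(demoThread, [
  { kind: "message", runId: "r2", role: "user", text: "Weather in Paris?" },
  { kind: "tool_call", runId: "r2", tool: "getWeather" },
]);
```

After two user turns, the one thread holds four events that belong to two separate runs.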
How the pieces fit together
From a developer's perspective, threads involve three things:
| What you use | What it does |
|---|---|
| useThreads hook | Lists, renames, archives, and deletes threads. Pagination via hasMoreThreads / fetchMoreThreads. Stays in sync across tabs and devices via WebSocket. |
| CopilotChat with threadId | Connects to a specific thread, loads its history, and streams new events in realtime. |
| CopilotRuntime | Server-side layer that executes agents, stores thread data on the Enterprise Intelligence Platform, and relays events to connected clients. |
You interact with the first two. The runtime and platform handle persistence and sync behind the scenes.
Auto-naming
When a new thread is created and the first run completes, the runtime automatically generates a short name (2–5 words) using the LLM. This runs asynchronously — it doesn't block thread creation or the agent's response. The generated name appears in useThreads via the realtime sync.
Auto-naming is enabled by default. Disable it with generateThreadNames: false on the runtime. Users can always override the generated name via renameThread().
Archive vs. delete
Threads support two removal operations with different semantics:
- Archive: a soft delete. The thread remains stored but disappears from the default list. Show archived threads by passing includeArchived: true to useThreads. Threads can also be unarchived, which restores them to the active list.
- Delete: permanent and irreversible. The thread and its history are removed entirely.
Neither operation has a built-in confirmation dialog — your application should implement its own if needed.
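One way to add that confirmation is a thin wrapper around the delete call. This is a minimal sketch: `deleteWithConfirmation` is a hypothetical helper, and both `confirm` and `deleteThread` are passed in as parameters so the example does not assume the exact signature exposed by useThreads.

```typescript
// Hypothetical helper. `confirm` and `deleteThread` are injected so the sketch
// does not depend on a specific UI library or on the exact useThreads signature.
async function deleteWithConfirmation(
  threadId: string,
  confirm: (message: string) => boolean | Promise<boolean>,
  deleteThread: (threadId: string) => Promise<void>,
): Promise<boolean> {
  const ok = await confirm("Delete this thread permanently? This cannot be undone.");
  if (!ok) return false; // user cancelled; nothing is sent to the server
  await deleteThread(threadId); // rejects if the server cannot complete the delete
  return true;
}
```

In a browser, `confirm` could be as simple as `(message) => window.confirm(message)`; a modal dialog that resolves a promise works the same way.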
How it works
Starting a new conversation
When a user sends their first message on a new thread:
- Your app renders CopilotChat (with or without a threadId; if omitted, a new thread is created automatically).
- The runtime creates the thread on the Enterprise Intelligence Platform and begins executing the agent.
- Events stream back to the client via WebSocket in realtime — messages, tool calls, and state updates appear as they happen.
- Once the first run completes, the runtime auto-generates a thread name (if enabled).
Resuming a conversation
When a user returns to an existing thread (e.g., by clicking a thread in the sidebar), the client needs to catch up on any events it missed:
- CopilotChat receives the new threadId and requests the thread's history from the platform.
- The platform checks whether the thread has a run in progress:
  - No active run: the platform returns historical events only. The client replays them to reconstruct the conversation.
  - Active run: the platform returns historical events plus opens a WebSocket connection. The client replays the history, then receives live events as they stream in.
- In either case, the transition from replayed history to live updates is seamless.
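A plausible reason the hand-off can be seamless is that replayed and live events go through the same reducer. The sketch below illustrates that idea with hypothetical event and message shapes (`ChatEvent`, `ChatMessage`); the real event format is not shown in this document.

```typescript
// Hypothetical event and message shapes, for illustration only.
type ChatEvent =
  | { type: "message_start"; messageId: string; role: "user" | "assistant" }
  | { type: "message_delta"; messageId: string; text: string };

interface ChatMessage {
  id: string;
  role: "user" | "assistant";
  text: string;
}

// One reducer handles both replayed and live events.
function applyChatEvent(messages: ChatMessage[], ev: ChatEvent): ChatMessage[] {
  if (ev.type === "message_start") {
    const started: ChatMessage = { id: ev.messageId, role: ev.role, text: "" };
    return messages.concat([started]);
  }
  const { messageId, text } = ev;
  return messages.map((m) =>
    m.id === messageId ? { id: m.id, role: m.role, text: m.text + text } : m,
  );
}

// Replay the stored history first, then keep folding in live events.
function resumeThread(historical: ChatEvent[], live: ChatEvent[]): ChatMessage[] {
  const initial: ChatMessage[] = [];
  const replayed = historical.reduce(applyChatEvent, initial);
  return live.reduce(applyChatEvent, replayed);
}
```

If the history ends partway through an assistant message, the live deltas simply continue appending to that same message.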
Switching threads
When the threadId prop on CopilotChat changes:
- Any active run on the current thread is detached.
- All messages and agent state are cleared.
- The new thread's history is fetched and replayed.
- A WebSocket connection is established for live updates on the new thread.
The UI briefly shows an empty chat before the history loads. This is by design — it prevents stale messages from the previous thread appearing in the new one.
Safe during tool execution: If a tool call from the old thread completes during a switch, its result is discarded rather than inserted into the new thread's messages.
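The guard against stale tool results can be pictured as a check of the result's originating thread against the currently active one. `ChatSession` below is a hypothetical client-side model, not CopilotChat's implementation.

```typescript
// Hypothetical client-side session; names are illustrative only.
class ChatSession {
  private activeThreadId: string;
  messages: string[] = [];

  constructor(threadId: string) {
    this.activeThreadId = threadId;
  }

  switchThread(threadId: string): void {
    this.activeThreadId = threadId;
    this.messages = []; // cleared before the new thread's history is replayed
  }

  // `originThreadId` is the thread that issued the tool call.
  // Returns false when the result is stale and was discarded.
  onToolResult(originThreadId: string, result: string): boolean {
    if (originThreadId !== this.activeThreadId) return false;
    this.messages.push(result);
    return true;
  }
}
```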
Realtime sync
The useThreads hook maintains a WebSocket subscription for thread metadata changes. When any client creates, renames, archives, or deletes a thread, the update is pushed to all connected clients automatically. This is how a thread created on one tab appears in the sidebar on another tab without polling.
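Conceptually, each pushed update is folded into the local thread list. The sketch below assumes a hypothetical `MetaEvent` format with four event types; the actual wire format is not documented here.

```typescript
// Hypothetical metadata shapes; the real wire format may differ.
interface ThreadMeta {
  id: string;
  name: string;
  archived: boolean;
}

type MetaEvent =
  | { type: "created"; thread: ThreadMeta }
  | { type: "renamed"; id: string; name: string }
  | { type: "archived"; id: string }
  | { type: "deleted"; id: string };

// Fold one pushed update into the list every connected client keeps.
function applyMetaEvent(list: ThreadMeta[], ev: MetaEvent): ThreadMeta[] {
  if (ev.type === "created") {
    return [ev.thread].concat(list); // newest thread first
  }
  if (ev.type === "deleted") {
    const deletedId = ev.id;
    return list.filter((t) => t.id !== deletedId);
  }
  if (ev.type === "archived") {
    const archivedId = ev.id;
    return list.map((t) =>
      t.id === archivedId ? { id: t.id, name: t.name, archived: true } : t,
    );
  }
  const { id, name } = ev; // remaining case: "renamed"
  return list.map((t) => (t.id === id ? { id: t.id, name, archived: t.archived } : t));
}
```

Because every tab applies the same events in the same order, their sidebars converge without polling.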
Pessimistic updates
Thread mutations (rename, archive, delete) use a pessimistic update model — the client waits for the server to confirm via WebSocket before updating the thread list. This means:
- The thread list doesn't change until the server confirms the operation
- If the server rejects the mutation, the UI never shows an incorrect state
- The returned promise resolves only after server confirmation, or rejects on failure
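The following sketch models these three properties for a rename. `PessimisticThreadList` and its request IDs are hypothetical; the point is only that the local list changes in the reply handler, never at request time.

```typescript
type PendingRename = {
  threadId: string;
  name: string;
  resolve: () => void;
  reject: (error: Error) => void;
};

// Hypothetical store: names change only when the server's reply arrives.
class PessimisticThreadList {
  names: Record<string, string>;
  private pending: Record<string, PendingRename> = {};

  constructor(initial: Record<string, string>) {
    this.names = { ...initial };
  }

  // Registers the request. The local list is NOT touched here.
  rename(requestId: string, threadId: string, name: string): Promise<void> {
    return new Promise<void>((resolve, reject) => {
      this.pending[requestId] = { threadId, name, resolve, reject };
    });
  }

  // Called when the server confirms (no error) or rejects (error message).
  onServerReply(requestId: string, error?: string): void {
    const request = this.pending[requestId];
    if (!request) return;
    delete this.pending[requestId];
    if (error) {
      request.reject(new Error(error)); // list stays as it was
      return;
    }
    this.names[request.threadId] = request.name; // apply only after confirmation
    request.resolve();
  }
}
```

The cost of this model is a short delay before the UI reflects a change; the benefit is that the UI never has to roll anything back.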
Error handling
Mutation failures
All mutation methods (renameThread, archiveThread, deleteThread) return promises that reject with an Error if the server cannot complete the operation. Common causes:
- Network failure — the client can't reach the runtime
- Thread not found — another client deleted the thread before your mutation arrived
- Authorization failure — the user doesn't have permission to modify the thread
- Timeout — the server didn't respond within 15 seconds
The error field on useThreads always reflects the most recent error. It resets to null on the next successful operation.
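The "most recent error, reset on success" behavior can be modeled in a few lines. `MutationErrorTracker` is a hypothetical stand-in used to show the semantics; it is not how useThreads is implemented.

```typescript
// Hypothetical model of an `error` field that tracks the latest failure
// and resets to null on the next successful operation.
class MutationErrorTracker {
  error: Error | null = null;

  // Runs any promise-returning mutation; returns true on success.
  async run(mutate: () => Promise<void>): Promise<boolean> {
    try {
      await mutate();
      this.error = null; // reset on success
      return true;
    } catch (e) {
      this.error = e instanceof Error ? e : new Error(String(e));
      return false;
    }
  }
}
```

In application code the same try/catch shape applies when awaiting a mutation directly: catch the rejection and show the error's message to the user.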
WebSocket disconnection
If the WebSocket connection drops (network change, server restart, laptop sleep):
- Thread list: useThreads stops receiving realtime updates. The list becomes stale until the connection is re-established. Reconnection is automatic with exponential backoff.
- Active conversation: if CopilotChat loses its WebSocket mid-run, the agent's output may be interrupted. Reloading the page or switching away and back to the thread triggers the reconnection flow, which replays any missed events.
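For readers unfamiliar with exponential backoff, the delay between reconnection attempts doubles each time, up to a cap. The base delay (500 ms) and cap (30 s) below are assumptions chosen for illustration; the values CopilotKit actually uses are not stated here.

```typescript
// Delay before reconnection attempt `attempt` (0-based).
// baseMs and capMs are illustrative assumptions, not documented defaults.
function backoffDelayMs(attempt: number, baseMs: number = 500, capMs: number = 30000): number {
  return Math.min(capMs, baseMs * Math.pow(2, attempt));
}

const firstDelays = [0, 1, 2, 3].map((attempt) => backoffDelayMs(attempt));
// firstDelays is [500, 1000, 2000, 4000]
```

Real clients often add random jitter to each delay so that many clients do not reconnect at the same instant after a server restart.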
Thread locked
If a thread already has an active run and another client tries to start a new run on the same thread, the request is rejected with a 409 Conflict. This prevents two agent runs from interleaving events on the same thread. The existing run must complete or be stopped before a new one can begin.
The runtime acquires a Redis-backed lock on the thread for the duration of each run. You can tune this behavior on the runtime:
| Option | Default | Max | Description |
|---|---|---|---|
| lockTtlSeconds | 20 | 3600 (1 hour) | How long the lock is held before it expires automatically. |
| lockHeartbeatIntervalSeconds | 15 | 3000 (50 min) | How often the runtime renews the lock during a run. The heartbeat always runs; you only need to adjust the interval. |
| lockKeyPrefix | - | - | Custom Redis key prefix for the thread lock. Useful when multiple apps share a Redis instance. |
If a run completes normally, the lock is released immediately. The TTL is a safety net for cases where the runtime crashes without releasing the lock.
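The interaction between the TTL, the heartbeat, and the 409 response can be illustrated with an in-memory model. `ThreadLockTable` is a hypothetical sketch that takes the current time as an argument; the real runtime uses Redis, and its implementation is not shown in this document.

```typescript
// In-memory model of a per-thread lock with a TTL. Illustrative only.
class ThreadLockTable {
  private ttlSeconds: number;
  private expiresAt: Record<string, number> = {};

  constructor(ttlSeconds: number) {
    this.ttlSeconds = ttlSeconds;
  }

  private isHeld(threadId: string, nowSeconds: number): boolean {
    const expiry = this.expiresAt[threadId];
    return expiry !== undefined && expiry > nowSeconds;
  }

  // Start a run: 409 if another run holds an unexpired lock.
  acquire(threadId: string, nowSeconds: number): 200 | 409 {
    if (this.isHeld(threadId, nowSeconds)) return 409;
    this.expiresAt[threadId] = nowSeconds + this.ttlSeconds;
    return 200;
  }

  // Renew the lock while the run is still alive.
  heartbeat(threadId: string, nowSeconds: number): void {
    if (this.isHeld(threadId, nowSeconds)) {
      this.expiresAt[threadId] = nowSeconds + this.ttlSeconds;
    }
  }

  // Normal completion releases the lock immediately.
  release(threadId: string): void {
    delete this.expiresAt[threadId];
  }
}
```

With the defaults from the table, a heartbeat every 15 seconds renews a 20-second lock before it lapses. If the runtime crashes, heartbeats stop and the thread becomes available again once the TTL runs out.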
Design decisions
Why event replay instead of message snapshots?
Threads store the raw event stream rather than a snapshot of the final message list. This enables:
- Partial replay — when reconnecting, the client only fetches events it missed rather than reloading the entire history
- Faithful reproduction — streaming tokens, tool calls, and state changes replay exactly as they originally occurred
The trade-off is that replay is more complex than loading a message array. The platform handles this complexity so your application doesn't have to.
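Partial replay is straightforward if each stored event carries a monotonically increasing sequence number, which is an assumption made for this sketch. `StoredEvent`, `eventsAfter`, and `ReplayCursor` are hypothetical names.

```typescript
// Assumes each stored event has an increasing sequence number (hypothetical).
interface StoredEvent {
  seq: number;
  payload: string;
}

// Server side: return only what the client has not seen yet.
function eventsAfter(log: StoredEvent[], lastSeenSeq: number): StoredEvent[] {
  return log.filter((e) => e.seq > lastSeenSeq);
}

// Client side: remember the last applied event and skip duplicates.
class ReplayCursor {
  lastSeq = 0;
  payloads: string[] = [];

  apply(events: StoredEvent[]): void {
    events.forEach((e) => {
      if (e.seq <= this.lastSeq) return; // already applied
      this.payloads.push(e.payload);
      this.lastSeq = e.seq;
    });
  }
}
```

A client that applied events 1 through 3 before disconnecting asks for `eventsAfter(log, 3)` on reconnect and receives only the events it missed.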
When threads are the wrong tool
- Ephemeral interactions: if your users don't need conversation history (e.g., a one-shot Q&A widget), threads add unnecessary complexity. Use CopilotChat without a threadId.
- Client-only state: if you need local-only chat history without server persistence, manage messages in React state or localStorage instead.
Next steps
- Step-by-step guide: Threads — set up thread management in your app
- API reference: useThreads — parameters, return values, types
- Tutorial: Build a Multi-Conversation Chat App — end-to-end walkthrough building a chat app with thread history
