How Threads & Persistence Work

Architecture and mental model behind CopilotKit threads — how persistent conversations work, how reconnection replays history, and what to expect from thread lifecycle operations.


Early access: Threads and the Enterprise Intelligence Platform are in early access. APIs may change before general availability.

Want to see threads in your own app?

Persistent threads ship with the Enterprise Intelligence Platform on the free Developer tier.


What are threads?#

A thread is a persistent, server-side container for a multi-turn conversation between a user and an agent. Unlike ephemeral chat sessions that disappear when the page reloads, threads store the full event history — every message, tool call, and state change — so conversations can be paused, resumed, and replayed across sessions and devices.

Threads are a platform-level concept, not tied to any specific agent framework. Whether your backend uses LangGraph, Mastra, CrewAI, or any other framework, threads work the same way.

Key concepts#

Thread vs. Run#

A thread is the durable container. A run is a single agent execution within that thread. One thread can have many runs — each time the user sends a message and the agent responds, that's a new run. The thread accumulates events across all its runs.
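The relationship can be pictured with a small data model. This is an illustrative sketch only: the type names, field names, and event shapes below are assumptions made for the example, not CopilotKit's actual schema.

```typescript
// Hypothetical event shapes, loosely mirroring "message, tool call, state change".
type ThreadEvent =
  | { kind: "message"; role: "user" | "assistant"; text: string }
  | { kind: "tool_call"; name: string }
  | { kind: "state_change"; key: string };

// A run is one agent execution and owns the events it produced.
interface Run {
  runId: string;
  events: ThreadEvent[];
}

// A thread is the durable container that holds every run.
interface Thread {
  threadId: string;
  runs: Run[];
}

// Each user message leads to a new run appended to the same thread.
function recordRun(thread: Thread, runId: string, events: ThreadEvent[]): Thread {
  return { ...thread, runs: [...thread.runs, { runId, events }] };
}

// The thread's history is every run's events, concatenated in order.
function threadHistory(thread: Thread): ThreadEvent[] {
  return thread.runs.reduce(
    (all, run) => all.concat(run.events),
    [] as ThreadEvent[]
  );
}
```

Two user turns therefore produce two runs, while `threadHistory` still returns a single ordered event list for the thread.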

How the pieces fit together#

From a developer's perspective, threads involve three things:

What you use	What it does
`useThreads` hook	Lists, renames, archives, and deletes threads. Pagination via `hasMoreThreads` / `fetchMoreThreads`. Stays in sync across tabs and devices via WebSocket.
`CopilotChat` with `threadId`	Connects to a specific thread, loads its history, and streams new events in realtime.
`CopilotRuntime`	Server-side layer that executes agents, stores thread data on the Enterprise Intelligence Platform, and relays events to connected clients.

You interact with the first two. The runtime and platform handle persistence and sync behind the scenes.

Auto-naming#

When a new thread is created and the first run completes, the runtime automatically generates a short name (2–5 words) using the LLM. This runs asynchronously: it doesn't block thread creation or the agent's response. The generated name appears in `useThreads` via the realtime sync.

Auto-naming is enabled by default. Disable it with `generateThreadNames: false` on the runtime. Users can always override the generated name via `renameThread()`.

Archive vs. delete#

Threads support two removal operations with different semantics:

- Archive: a soft delete. The thread remains stored but disappears from the default list. Show archived threads by passing `includeArchived: true` to `useThreads`. Archived threads can also be unarchived, which restores them to the active list.
- Delete: permanent and irreversible. The thread and its history are removed entirely.

Neither operation has a built-in confirmation dialog — your application should implement its own if needed.
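The difference in semantics can be sketched with an in-memory store. This is a toy model under stated assumptions: `ThreadStore` and its methods are hypothetical names for illustration, and only the `includeArchived` option name comes from the description above.

```typescript
interface ThreadRecord {
  id: string;
  name: string;
  archived: boolean;
}

// Hypothetical in-memory stand-in for the platform's thread storage.
class ThreadStore {
  private threads: ThreadRecord[] = [];

  create(id: string, name: string): void {
    this.threads.push({ id, name, archived: false });
  }

  // Archive is a soft delete: the record stays, only a flag changes.
  archive(id: string): void {
    this.setArchived(id, true);
  }

  unarchive(id: string): void {
    this.setArchived(id, false);
  }

  // Delete is permanent: the record and its history are gone.
  delete(id: string): void {
    this.threads = this.threads.filter((t) => t.id !== id);
  }

  // Archived threads are hidden unless the caller opts in.
  list(options: { includeArchived?: boolean } = {}): ThreadRecord[] {
    return this.threads.filter((t) => options.includeArchived === true || !t.archived);
  }

  private setArchived(id: string, archived: boolean): void {
    this.threads = this.threads.map((t) => (t.id === id ? { ...t, archived } : t));
  }
}
```

In this model an archived thread can always be brought back with `unarchive`, while nothing can restore a thread after `delete`, which is why a confirmation step matters most for deletion.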

How it works#

Starting a new conversation#

When a user sends their first message on a new thread:

1. Your app renders `CopilotChat` (with or without a `threadId`; if omitted, a new thread is created automatically).
2. The runtime creates the thread on the Enterprise Intelligence Platform and begins executing the agent.
3. Events stream back to the client via WebSocket in realtime: messages, tool calls, and state updates appear as they happen.
4. Once the first run completes, the runtime auto-generates a thread name (if enabled).

Resuming a conversation#

When a user returns to an existing thread (e.g., by clicking a thread in the sidebar), the client needs to catch up on any events it missed:

1. `CopilotChat` receives the new `threadId` and requests the thread's history from the platform.
2. The platform checks whether the thread has a run in progress:
   - No active run: the platform returns historical events only. The client replays them to reconstruct the conversation.
   - Active run: the platform returns historical events and opens a WebSocket connection. The client replays the history, then receives live events as they stream in.

In either case, the client ends up with the full conversation. When a run is active, the transition from replayed history to live updates is seamless.
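One way to picture replay is as a reducer that folds events into a message list, with the same reducer handling both stored history and live events. The sketch below is a simplified model: the event names (`message_start`, `message_delta`, `tool_call`) and their fields are assumptions for illustration, not the platform's actual event format.

```typescript
// Hypothetical event shapes for the sketch.
type ReplayEvent =
  | { type: "message_start"; messageId: string; role: "user" | "assistant" }
  | { type: "message_delta"; messageId: string; text: string }
  | { type: "tool_call"; messageId: string; toolName: string };

interface ChatMessage {
  id: string;
  role: "user" | "assistant";
  text: string;
  toolCalls: string[];
}

// Applies one event to the current message list and returns a new list.
function applyEvent(messages: ChatMessage[], event: ReplayEvent): ChatMessage[] {
  switch (event.type) {
    case "message_start":
      return [...messages, { id: event.messageId, role: event.role, text: "", toolCalls: [] }];
    case "message_delta": {
      const { messageId, text } = event;
      return messages.map((m) => (m.id === messageId ? { ...m, text: m.text + text } : m));
    }
    case "tool_call": {
      const { messageId, toolName } = event;
      return messages.map((m) =>
        m.id === messageId ? { ...m, toolCalls: [...m.toolCalls, toolName] } : m
      );
    }
  }
}

// Replaying history is folding every stored event through the reducer.
function replay(history: ReplayEvent[]): ChatMessage[] {
  return history.reduce(applyEvent, [] as ChatMessage[]);
}
```

Because live events go through the same `applyEvent` function after `replay` finishes, there is no separate code path where the two could disagree, which is what makes the handoff seamless in this model.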

Switching threads#

When the `threadId` prop on `CopilotChat` changes:

1. Any active run on the current thread is detached.
2. All messages and agent state are cleared.
3. The new thread's history is fetched and replayed.
4. A WebSocket connection is established for live updates on the new thread.

The UI briefly shows an empty chat before the history loads. This is by design — it prevents stale messages from the previous thread appearing in the new one.

Safe during tool execution: If a tool call from the old thread completes during a switch, its result is discarded rather than inserted into the new thread's messages.
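A common way to get this behavior is to tag each in-flight result with the thread it started on and compare that tag against the active thread when the result arrives. The sketch below models that idea; `ChatSession` and its methods are hypothetical names, and the actual mechanism inside `CopilotChat` may differ.

```typescript
// Hypothetical client-side session that tracks which thread is active.
class ChatSession {
  private activeThreadId: string;
  private messages: string[] = [];

  constructor(threadId: string) {
    this.activeThreadId = threadId;
  }

  // Switching clears messages first, so nothing stale is shown.
  switchThread(threadId: string): void {
    this.activeThreadId = threadId;
    this.messages = [];
  }

  // History that arrives for a thread we already left is ignored.
  loadHistory(threadId: string, history: string[]): void {
    if (threadId !== this.activeThreadId) return;
    this.messages = history.slice();
  }

  // A tool result is accepted only if it belongs to the active thread.
  // Returns false when the result was discarded.
  acceptToolResult(originThreadId: string, result: string): boolean {
    if (originThreadId !== this.activeThreadId) return false;
    this.messages.push(result);
    return true;
  }

  getMessages(): string[] {
    return this.messages.slice();
  }
}
```

The empty list right after `switchThread` corresponds to the brief empty chat described above.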

Realtime sync#

The `useThreads` hook maintains a WebSocket subscription for thread metadata changes. When any client creates, renames, archives, or deletes a thread, the update is pushed to all connected clients automatically. This is how a thread created in one tab appears in the sidebar in another tab without polling.

Pessimistic updates#

Thread mutations (rename, archive, delete) use a pessimistic update model — the client waits for the server to confirm via WebSocket before updating the thread list. This means:

- The thread list doesn't change until the server confirms the operation.
- If the server rejects the mutation, the UI never shows an incorrect state.
- The returned promise resolves only after server confirmation, or rejects on failure.
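The pessimistic model can be sketched as a list that holds a pending change aside until the server answers. This is a simplified, synchronous illustration: `PessimisticThreadList` and its callback methods are hypothetical, and the real hook exposes promises rather than explicit confirm and reject callbacks.

```typescript
interface ThreadSummary {
  id: string;
  name: string;
}

// Hypothetical model of a thread list that only changes on server confirmation.
class PessimisticThreadList {
  private threads: ThreadSummary[];
  private pendingRename: { threadId: string; newName: string } | null = null;

  constructor(initial: ThreadSummary[]) {
    this.threads = initial.slice();
  }

  // The request is recorded, but the visible list is left untouched.
  requestRename(threadId: string, newName: string): void {
    this.pendingRename = { threadId, newName };
  }

  // Called when the server's confirmation arrives: apply the change now.
  onServerConfirmed(): void {
    const pending = this.pendingRename;
    if (pending === null) return;
    this.threads = this.threads.map((t) =>
      t.id === pending.threadId ? { ...t, name: pending.newName } : t
    );
    this.pendingRename = null;
  }

  // Called when the server rejects: there is nothing to roll back.
  onServerRejected(): void {
    this.pendingRename = null;
  }

  snapshot(): ThreadSummary[] {
    return this.threads.slice();
  }
}
```

The rejection path is the point of the design: because the list never showed the new name, no rollback logic is needed, at the cost of the UI updating slightly later than it would with optimistic updates.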

Error handling#

Mutation failures#

All mutation methods (`renameThread`, `archiveThread`, `deleteThread`) return promises that reject with an `Error` if the server cannot complete the operation. Common causes:

- Network failure: the client can't reach the runtime.
- Thread not found: another client deleted the thread before your mutation arrived.
- Authorization failure: the user doesn't have permission to modify the thread.
- Timeout: the server didn't respond within 15 seconds.

The `error` field on `useThreads` always reflects the most recent error. It resets to `null` on the next successful operation.
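The lifecycle of that field can be modeled as a tiny state transition. The sketch below is an assumption-laden illustration of the rule just described: `MutationOutcome`, `ThreadsErrorState`, and `recordOutcome` are hypothetical names, not part of the `useThreads` API.

```typescript
// Hypothetical outcome of one mutation attempt.
type MutationOutcome = { ok: true } | { ok: false; error: Error };

interface ThreadsErrorState {
  error: Error | null;
}

// A failure replaces whatever error was stored; a success clears it.
function recordOutcome(state: ThreadsErrorState, outcome: MutationOutcome): ThreadsErrorState {
  return outcome.ok ? { error: null } : { error: outcome.error };
}
```

One consequence worth noting: an older error is overwritten by a newer one, so code that needs the failure of a specific call should catch that call's rejected promise rather than read the shared field later.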

WebSocket disconnection#

If the WebSocket connection drops (network change, server restart, laptop sleep):

- Thread list: `useThreads` stops receiving realtime updates. The list becomes stale until the connection is re-established. Reconnection is automatic with exponential backoff.
- Active conversation: if `CopilotChat` loses its WebSocket mid-run, the agent's output may be interrupted. Reloading the page or switching away and back to the thread triggers the reconnection flow, which replays any missed events.
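Exponential backoff means each failed reconnection attempt waits roughly twice as long as the previous one, up to a cap. The sketch below shows the arithmetic only; the 1-second base and 30-second cap are assumed values for illustration, since the actual delays used by the client are not stated here.

```typescript
// Delay before reconnection attempt number `attempt` (0-based).
// baseMs and maxMs are illustrative defaults, not documented values.
function reconnectDelayMs(attempt: number, baseMs: number = 1000, maxMs: number = 30000): number {
  const exponential = baseMs * Math.pow(2, attempt);
  return Math.min(exponential, maxMs);
}

// Delays for the first `count` attempts, useful for logging or tests.
function reconnectSchedule(count: number): number[] {
  const delays: number[] = [];
  for (let attempt = 0; attempt < count; attempt++) {
    delays.push(reconnectDelayMs(attempt));
  }
  return delays;
}
```

With these assumed values the first attempts wait 1, 2, 4, 8, and 16 seconds, after which every attempt waits 30 seconds. Production implementations often add random jitter so that many clients do not reconnect at the same instant; that is omitted here to keep the example deterministic.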

Thread locked#

If a thread already has an active run and another client tries to start a new run on the same thread, the request is rejected with a 409 Conflict. This prevents two agent runs from interleaving events on the same thread. The existing run must complete or be stopped before a new one can begin.

The runtime acquires a Redis-backed lock on the thread for the duration of each run. You can tune this behavior on the runtime:

Option	Default	Max	Description
`lockTtlSeconds`	`20`	`3600` (1 hour)	How long the lock is held before it expires automatically.
`lockHeartbeatIntervalSeconds`	`15`	`3000` (50 min)	How often the runtime renews the lock during a run. The heartbeat always runs — you only need to adjust the interval.
`lockKeyPrefix`	—	—	Custom Redis key prefix for the thread lock. Useful when multiple apps share a Redis instance.

If a run completes normally, the lock is released immediately. The TTL is a safety net for cases where the runtime crashes without releasing the lock.
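The interplay of TTL, heartbeat, and release can be sketched with an in-memory lock and an injectable clock. This is a model of the behavior described above, not the runtime's Redis implementation: the class, its methods, the key format, and the rule that the heartbeat interval must be shorter than the TTL are this sketch's own assumptions. Only the three option names and the 409 status come from the text.

```typescript
interface ThreadLockOptions {
  lockTtlSeconds: number;
  lockHeartbeatIntervalSeconds: number;
  lockKeyPrefix: string;
}

// Hypothetical in-memory stand-in for the Redis-backed thread lock.
class InMemoryThreadLock {
  private options: ThreadLockOptions;
  private nowMs: () => number;
  private expiresAtMs: Record<string, number | undefined> = {};

  constructor(options: ThreadLockOptions, nowMs: () => number) {
    // Sanity check made by this sketch: a heartbeat slower than the TTL
    // would let the lock expire in the middle of a healthy run.
    if (options.lockHeartbeatIntervalSeconds >= options.lockTtlSeconds) {
      throw new Error("heartbeat interval must be shorter than the lock TTL");
    }
    this.options = options;
    this.nowMs = nowMs;
  }

  // Returns 200 if the run may start, 409 if another run holds the lock.
  tryStartRun(threadId: string): 200 | 409 {
    if (this.isHeld(threadId)) return 409;
    this.expiresAtMs[this.key(threadId)] = this.nowMs() + this.options.lockTtlSeconds * 1000;
    return 200;
  }

  // Called every lockHeartbeatIntervalSeconds while the run is alive.
  heartbeat(threadId: string): void {
    if (!this.isHeld(threadId)) return;
    this.expiresAtMs[this.key(threadId)] = this.nowMs() + this.options.lockTtlSeconds * 1000;
  }

  // Normal completion releases the lock immediately.
  finishRun(threadId: string): void {
    delete this.expiresAtMs[this.key(threadId)];
  }

  private isHeld(threadId: string): boolean {
    const expiry = this.expiresAtMs[this.key(threadId)];
    return expiry !== undefined && expiry > this.nowMs();
  }

  // Key format is an assumption made for this sketch.
  private key(threadId: string): string {
    return this.options.lockKeyPrefix + ":" + threadId;
  }
}
```

With the defaults from the table (TTL 20 seconds, heartbeat every 15 seconds), each heartbeat pushes the expiry out before the previous TTL runs out. If the runtime crashes and heartbeats stop, the thread becomes available again at most 20 seconds after the last renewal.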

Design decisions#

Why event replay instead of message snapshots?#

Threads store the raw event stream rather than a snapshot of the final message list. This enables:

- Partial replay: when reconnecting, the client only fetches events it missed rather than reloading the entire history.
- Faithful reproduction: streaming tokens, tool calls, and state changes replay exactly as they originally occurred.

The trade-off is that replay is more complex than loading a message array. The platform handles this complexity so your application doesn't have to.
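Partial replay is straightforward when every stored event carries a position in the stream. The sketch below assumes a monotonically increasing sequence number per event and a client that remembers the last one it saw; both the `seq` field and the `eventsAfter` helper are assumptions for illustration, not the platform's documented protocol.

```typescript
// Hypothetical stored event with a position in the thread's stream.
interface StoredEvent {
  seq: number;
  payload: string;
}

// Returns only the events the client has not seen yet.
function eventsAfter(log: StoredEvent[], lastSeenSeq: number): StoredEvent[] {
  return log.filter((event) => event.seq > lastSeenSeq);
}

// The cursor to remember after processing a batch of events.
function latestSeq(events: StoredEvent[], fallback: number): number {
  return events.reduce((max, event) => Math.max(max, event.seq), fallback);
}
```

A message snapshot offers no equivalent cursor: a client holding a half-streamed message cannot ask for only the missing part, so it would have to reload the full message list instead.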

When threads are the wrong tool#

- Ephemeral interactions: if your users don't need conversation history (e.g., a one-shot Q&A widget), threads add unnecessary complexity. Use `CopilotChat` without a `threadId`.
- Client-only state: if you need local-only chat history without server persistence, manage messages in React state or `localStorage` instead.

Next steps#

- Step-by-step guide: Threads. Set up thread management in your app.
- API reference: `useThreads`. Parameters, return values, types.
- Tutorial: Build a Multi-Conversation Chat App. End-to-end walkthrough building a chat app with thread history.
