Sub-Agents
Decompose work across multiple specialized agents with a visible delegation log.
What is this?
Sub-agents are the canonical multi-agent pattern: a top-level supervisor LLM orchestrates one or more specialized sub-agents by exposing each of them as a tool. The supervisor decides what to delegate, the sub-agents do their narrow job, and their results flow back up to the supervisor's next step.
This is fundamentally the same shape as tool-calling, but each "tool" is itself a full-blown agent with its own system prompt and (often) its own tools, memory, and model.
When should I use this?
Reach for sub-agents when a task has distinct specialized sub-tasks that each benefit from their own focus:
- Research → Write → Critique pipelines, where each stage needs a different system prompt and temperature.
- Router + specialists, where one agent classifies the request and dispatches to the right expert.
- Divide-and-conquer — any problem that fits cleanly into parallel or sequential sub-problems.
The walkthrough below uses the Research → Write → Critique pipeline throughout.
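Before wiring in real models, it helps to see the control flow in isolation. The sketch below is a framework-free toy, not part of any library: each "sub-agent" is a plain callable, and a supervisor dispatches to them in sequence while recording every delegation. All names here (`run_pipeline`, the specialist functions) are illustrative.

```python
# Framework-free sketch of the supervisor shape: each "sub-agent" is a
# narrow callable, and the supervisor delegates to them by name,
# recording every delegation in a log. Illustrative names only.
from typing import Callable


def research(topic: str) -> str:
    return f"- key fact about {topic}"


def write(brief: str) -> str:
    return f"Draft based on: {brief}"


def critique(draft: str) -> str:
    return f"Tighten the opening of: {draft[:20]}..."


SUB_AGENTS: dict[str, Callable[[str], str]] = {
    "research_agent": research,
    "writing_agent": write,
    "critique_agent": critique,
}


def run_pipeline(topic: str) -> list[dict]:
    """Supervisor loop: delegate, record, feed results forward."""
    log: list[dict] = []

    def delegate(name: str, task: str) -> str:
        result = SUB_AGENTS[name](task)
        log.append({"sub_agent": name, "task": task, "result": result})
        return result

    facts = delegate("research_agent", topic)
    draft = delegate("writing_agent", f"{topic}\n{facts}")
    delegate("critique_agent", draft)
    return log


log = run_pipeline("black holes")
# One delegation per stage, in pipeline order.
assert [d["sub_agent"] for d in log] == [
    "research_agent", "writing_agent", "critique_agent"
]
```

In the real implementation below, the callables become LLM agents, the dispatch becomes tool-calling by a supervisor LLM, and the log becomes shared agent state.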
Setting up sub-agents

Each sub-agent is a full create_agent(...) call with its own model, its own system prompt, and (optionally) its own tools. They don't share memory or tools with the supervisor; the supervisor only ever sees what the sub-agent returns.
```python
import operator
import uuid
from typing import Annotated, Literal, TypedDict

from langchain.agents import AgentState as BaseAgentState, create_agent
from langchain.tools import ToolRuntime, tool
from langchain_core.messages import AIMessage, HumanMessage, ToolMessage
from langchain_openai import ChatOpenAI
from langgraph.types import Command

from copilotkit import CopilotKitMiddleware

# ---------------------------------------------------------------------------
# Shared state
# ---------------------------------------------------------------------------


class Delegation(TypedDict):
    id: str
    sub_agent: Literal["research_agent", "writing_agent", "critique_agent"]
    task: str
    status: Literal["completed"]
    result: str


# Cap the supervisor → critique sub-agent loop at a single iteration.
# Without this, the supervisor LLM occasionally re-calls `critique_agent`
# repeatedly on the same draft (visible as stacking 🧐 cards in the
# chat). The critic only adds value once per draft, so we hard-stop
# after `_MAX_CRITIQUE_ITERATIONS` invocations and return a no-op
# result.
_MAX_CRITIQUE_ITERATIONS = 1


class AgentState(BaseAgentState):
    """Shared state. `delegations` is rendered as a live log in the UI.

    `delegations` uses an `operator.add` reducer so that concurrent
    sub-agent emissions in the same supervisor step accumulate into a
    single list instead of conflicting (LangGraph would otherwise raise
    `INVALID_CONCURRENT_GRAPH_UPDATE` — "Can receive only one value per
    step. Use an Annotated key to handle multiple values.").
    """

    delegations: Annotated[list[Delegation], operator.add]


# ---------------------------------------------------------------------------
# Sub-agents (real LLM agents under the hood)
# ---------------------------------------------------------------------------
# Each sub-agent is a full-fledged `create_agent(...)` with its own
# system prompt. They don't share memory or tools with the supervisor —
# the supervisor only sees their return value.

_sub_model = ChatOpenAI(model="gpt-5.4")

_research_agent = create_agent(
    model=_sub_model,
    tools=[],
    system_prompt=(
        "You are a research sub-agent. Given a topic, produce a concise "
        "bulleted list of 3-5 key facts. No preamble, no closing."
    ),
)

_writing_agent = create_agent(
    model=_sub_model,
    tools=[],
    system_prompt=(
        "You are a writing sub-agent. Given a brief and optional source "
        "facts, produce a polished 1-paragraph draft. Be clear and "
        "concrete. No preamble."
    ),
)

_critique_agent = create_agent(
    model=_sub_model,
    tools=[],
    system_prompt=(
        "You are an editorial critique sub-agent. Given a draft, give "
        "2-3 crisp, actionable critiques. No preamble."
    ),
)
```

Keep sub-agent system prompts narrow and focused. The point of this pattern is that each one does one thing well. If a sub-agent needs to know the whole user context to do its job, that's a signal the boundary is wrong.
Exposing sub-agents as tools

The supervisor delegates by calling tools. Each tool is a thin wrapper around sub_agent.invoke(...) that:
- Runs the sub-agent synchronously on the supplied task string.
- Records the delegation into a delegations slot in shared agent state (so the UI can render a live log).
- Returns the sub-agent's final message as a ToolMessage, which the supervisor sees as a normal tool result on its next turn.
```python
# Sentinel surfaced when a sub-agent run produces no usable text. Kept
# as a module-level constant so the harness probe (and any UI fallback)
# can match the exact phrase. The leading/trailing angle brackets keep
# it out of plausible LLM phrasing.
SUB_AGENT_EMPTY_SENTINEL = "<sub-agent produced no output>"


def _invoke_sub_agent(agent, task: str) -> str:
    """Run a sub-agent on `task` and return its final prose message."""
    result = agent.invoke({"messages": [HumanMessage(content=task)]})
    messages = result.get("messages", [])
    # Walk newest -> oldest so we pick the answer for THIS task, not a stale
    # intro. Skip empty AIMessages that only carry tool_calls.
    for msg in reversed(messages):
        if isinstance(msg, AIMessage):
            content = msg.content
            if isinstance(content, str) and content.strip():
                return content
            # Some providers stream content as a list of content blocks
            # (e.g. {"type": "text", "text": "..."}); concatenate the text.
            # The `isinstance(block.get("text"), str)` guard rejects
            # `{"type": "text", "text": null}` payloads — a known provider
            # quirk — that would otherwise crash `"".join(...)` with
            # `TypeError: sequence item N: expected str instance, NoneType found`.
            if isinstance(content, list):
                parts = [
                    block["text"]
                    for block in content
                    if isinstance(block, dict)
                    and block.get("type") == "text"
                    and isinstance(block.get("text"), str)
                ]
                joined = "".join(parts).strip()
                if joined:
                    return joined
    return SUB_AGENT_EMPTY_SENTINEL


def _delegation_update(
    sub_agent: str,
    task: str,
    result: str,
    tool_call_id: str,
) -> Command:
    """Append a completed delegation entry to shared state.

    Returns just the new entry (a one-element list). The reducer on
    `AgentState.delegations` is `operator.add`, which concatenates the
    new list with the prior state — so we must NOT echo back the
    existing delegations here, or they would be duplicated each step.
    """
    entry: Delegation = {
        "id": str(uuid.uuid4()),
        "sub_agent": sub_agent,  # type: ignore[typeddict-item]
        "task": task,
        "status": "completed",
        "result": result,
    }
    return Command(
        update={
            "delegations": [entry],
            "messages": [
                ToolMessage(
                    content=result,
                    name=sub_agent,
                    id=str(uuid.uuid4()),
                    tool_call_id=tool_call_id,
                )
            ],
        }
    )


# ---------------------------------------------------------------------------
# Supervisor tools (each tool delegates to one sub-agent)
# ---------------------------------------------------------------------------
# Each @tool wraps a sub-agent invocation. The supervisor LLM "calls"
# these tools to delegate work; each call synchronously runs the
# matching sub-agent, records the delegation into shared state, and
# returns the sub-agent's output as a ToolMessage the supervisor can
# read on its next step.


@tool
def research_agent(task: str, runtime: ToolRuntime) -> Command:
    """Delegate a research task to the research sub-agent.

    Use for: gathering facts, background, definitions, statistics.
    Returns a bulleted list of key facts.
    """
    result = _invoke_sub_agent(_research_agent, task)
    return _delegation_update("research_agent", task, result, runtime.tool_call_id)


@tool
def writing_agent(task: str, runtime: ToolRuntime) -> Command:
    """Delegate a drafting task to the writing sub-agent.

    Use for: producing a polished paragraph, draft, or summary. Pass
    relevant facts from prior research inside `task`.
    """
    result = _invoke_sub_agent(_writing_agent, task)
    return _delegation_update("writing_agent", task, result, runtime.tool_call_id)


@tool
def critique_agent(task: str, runtime: ToolRuntime) -> Command:
    """Delegate a critique task to the critique sub-agent.

    Use for: reviewing a draft and suggesting concrete improvements.
    Capped at `_MAX_CRITIQUE_ITERATIONS` invocations per supervisor run
    — the supervisor LLM occasionally re-calls the critic in a loop and
    each rerun produces near-identical output, so additional calls are
    short-circuited with a no-op result that nudges the supervisor to
    finish.
    """
    state: AgentState = runtime.state  # type: ignore[assignment]
    delegations = state.get("delegations") or []
    prior_critiques = sum(
        1 for d in delegations if d.get("sub_agent") == "critique_agent"
    )
    if prior_critiques >= _MAX_CRITIQUE_ITERATIONS:
        # Short-circuit without appending another delegation entry — the
        # UI renders one card per delegation and we want exactly one
        # critic card per supervisor run, even if the LLM ignores the
        # system prompt and re-issues the call.
        skip_message = (
            "Critique already produced for this run. "
            "Stop calling critique_agent and return your final answer "
            "to the user now."
        )
        return Command(
            update={
                "messages": [
                    ToolMessage(
                        content=skip_message,
                        name="critique_agent",
                        id=str(uuid.uuid4()),
                        tool_call_id=runtime.tool_call_id,
                    )
                ],
            }
        )
    result = _invoke_sub_agent(_critique_agent, task)
    return _delegation_update("critique_agent", task, result, runtime.tool_call_id)
```
This is where CopilotKit's shared-state channel earns its keep: the
supervisor's tool calls mutate delegations as they happen, and the
frontend renders every new entry live.
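The reason `_delegation_update` returns a one-element list is the reducer semantics: LangGraph applies `operator.add` between the channel's existing value and each incoming update, so concurrent one-element updates concatenate instead of conflicting. A stand-alone illustration of that folding, in plain Python with no LangGraph involved:

```python
import operator
from functools import reduce

# Existing channel value plus two updates emitted in the same step —
# e.g. two sub-agent tools returning concurrently. The reducer folds
# each one-element update into the list instead of overwriting it.
prior = [{"sub_agent": "research_agent"}]
updates = [
    [{"sub_agent": "writing_agent"}],
    [{"sub_agent": "critique_agent"}],
]

merged = reduce(operator.add, updates, prior)
assert [d["sub_agent"] for d in merged] == [
    "research_agent", "writing_agent", "critique_agent"
]

# If a tool echoed back the prior entries alongside its new one, they
# would be concatenated a second time — the duplication the
# `_delegation_update` docstring warns about.
duplicated = operator.add(prior, prior + [{"sub_agent": "writing_agent"}])
assert len(duplicated) == 3  # one real entry, plus a duplicate of prior
```

This is why each tool must return only the new entry, never the full accumulated log.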
Rendering a live delegation log

On the frontend, the delegation log is just a reactive render of the delegations slot. Subscribe with useAgent({ updates: [OnStateChanged, OnRunStatusChanged] }), read agent.state.delegations, and render one card per entry.
```tsx
/**
 * Live delegation log — renders the `delegations` slot of agent state.
 *
 * Each entry corresponds to one invocation of a sub-agent. The list
 * grows in real time as the supervisor fans work out to its children.
 * The parent header shows how many sub-agents have been called and
 * whether the supervisor is still running.
 */

// Fixed list of the three sub-agent roles the supervisor can call.
// Rendered as always-visible indicator chips at the top of the log
// (regardless of whether the supervisor has delegated yet) so the user
// — and the e2e suite — can see at a glance which sub-agents exist and
// which are currently active.
const INDICATOR_ROLES: ReadonlyArray<{
  role: "researcher" | "writer" | "critic";
  subAgent: SubAgentName;
}> = [
  { role: "researcher", subAgent: "research_agent" },
  { role: "writer", subAgent: "writing_agent" },
  { role: "critic", subAgent: "critique_agent" },
];

export function DelegationLog({ delegations, isRunning }: DelegationLogProps) {
  const calledRoles = new Set<SubAgentName>(
    delegations.map((d) => d.sub_agent),
  );

  return (
    <div
      data-testid="delegation-log"
      className="w-full h-full flex flex-col bg-white rounded-2xl shadow-sm border border-[#DBDBE5] overflow-hidden"
    >
      <div className="flex items-center justify-between px-6 py-3 border-b border-[#E9E9EF] bg-[#FAFAFC]">
        <div className="flex items-center gap-3">
          <span className="text-lg font-semibold text-[#010507]">
            Sub-agent delegations
          </span>
          {isRunning && (
            <span
              data-testid="supervisor-running"
              className="inline-flex items-center gap-1.5 px-2 py-0.5 rounded-full border border-[#BEC2FF] bg-[#BEC2FF1A] text-[#010507] text-[10px] font-semibold uppercase tracking-[0.12em]"
            >
              <span className="w-1.5 h-1.5 rounded-full bg-[#010507] animate-pulse" />
              Supervisor running
            </span>
          )}
        </div>
        <span
          data-testid="delegation-count"
          className="text-xs font-mono text-[#838389]"
        >
          {delegations.length} calls
        </span>
      </div>
      <div
        data-testid="subagent-indicators"
        className="flex items-center gap-2 border-b border-[#E9E9EF] bg-white px-6 py-2"
      >
        {INDICATOR_ROLES.map(({ role, subAgent }) => {
          const style = SUB_AGENT_STYLE[subAgent];
          const fired = calledRoles.has(subAgent);
          return (
            <span
              key={role}
              data-testid={`subagent-indicator-${role}`}
              data-role={role}
              data-fired={fired ? "true" : "false"}
              className={`inline-flex items-center gap-1 px-2 py-0.5 rounded-full text-[10px] font-semibold uppercase tracking-[0.1em] border ${style.color} ${
                fired ? "" : "opacity-60"
              }`}
            >
              <span aria-hidden>{style.emoji}</span>
              <span>{style.label}</span>
            </span>
          );
        })}
      </div>
      <div className="flex-1 overflow-y-auto p-4 space-y-3">
        {delegations.length === 0 ? (
          <p className="text-[#838389] italic text-sm">
            Ask the supervisor to complete a task. Every sub-agent it calls will
            appear here.
          </p>
        ) : (
          delegations.map((d, idx) => {
            const style = SUB_AGENT_STYLE[d.sub_agent];
            return (
              <div
                key={d.id}
                data-testid="delegation-entry"
                className="border border-[#E9E9EF] rounded-xl p-3 bg-[#FAFAFC]"
              >
                <div className="flex items-center justify-between mb-2">
                  <div className="flex items-center gap-2">
                    <span className="text-xs font-mono text-[#AFAFB7]">
                      #{idx + 1}
                    </span>
                    <span
                      className={`inline-flex items-center gap-1 px-2 py-0.5 rounded-full text-[10px] font-semibold uppercase tracking-[0.1em] border ${style.color}`}
                    >
                      <span>{style.emoji}</span>
                      <span>{style.label}</span>
                    </span>
                  </div>
                  <span className="text-[10px] uppercase tracking-[0.12em] font-semibold text-[#189370]">
                    {d.status}
                  </span>
                </div>
                <div className="text-xs text-[#57575B] mb-2">
                  <span className="font-semibold text-[#010507]">Task: </span>
                  {d.task}
                </div>
                <div className="text-sm text-[#010507] whitespace-pre-wrap bg-white rounded-lg p-2.5 border border-[#E9E9EF]">
                  {d.result}
                </div>
              </div>
            );
          })
        )}
      </div>
    </div>
  );
}
```

The result: as the supervisor fans work out to its sub-agents, the log grows in real time, giving the user visibility into a process that would otherwise be a long opaque spinner.
Related
- Shared State — the channel that makes the delegation log live.
- State streaming — stream individual sub-agent outputs token-by-token inside each log entry.
