OpenAI Agents SDK

Overview

| Attribute | Detail |
| --- | --- |
| Repository | openai/openai-agents-python |
| Stars | 20,220 |
| Language | Python, TypeScript |
| License | MIT |
| Last Push | 2026-03-23 |
| Maturity | Production-ready |
| Use Case Fit | Multi-agent workflows, agent handoffs, production agent systems |

The OpenAI Agents SDK is a lightweight, open-source framework for building multi-agent workflows. Launched in March 2025 as the production-ready successor to the experimental Swarm project, it provides four core primitives — Agents, Handoffs, Guardrails, and Tools — that handle the key building blocks of agent systems without the overhead of heavier orchestration frameworks. While optimized for OpenAI models, it works with 100+ LLMs via the Chat Completions API.

Architecture

The SDK's architecture is intentionally minimal — a small set of composable primitives rather than a sprawling framework:

┌─────────────────────────────────────────────────┐
│                    Runner                        │
│         (Manages the agent execution loop)       │
├─────────────────────────────────────────────────┤
│                                                  │
│  ┌──────────┐  handoff()  ┌──────────────────┐  │
│  │ Triage   │────────────>│ Specialist       │  │
│  │ Agent    │             │ Agent            │  │
│  │          │  handoff()  │                  │  │
│  │          │<────────────│                  │  │
│  └──────────┘             └──────────────────┘  │
│       │                          │               │
│       ▼                          ▼               │
│  ┌──────────┐             ┌──────────────────┐  │
│  │ Tools    │             │ Guardrails       │  │
│  │ (func,   │             │ (input, output,  │  │
│  │  MCP,    │             │  tool)           │  │
│  │  hosted) │             │                  │  │
│  └──────────┘             └──────────────────┘  │
│                                                  │
│  ┌──────────────────────────────────────────┐   │
│  │              Tracing                      │   │
│  │  (Automatic spans, custom spans, export)  │   │
│  └──────────────────────────────────────────┘   │
└─────────────────────────────────────────────────┘

Core Primitives

| Primitive | Description |
| --- | --- |
| Agents | Instruction-driven entities with access to tools and models. Defined by a system prompt, a model, and a set of tools. |
| Handoffs | Native mechanism for delegating tasks between agents. One agent transfers control (and conversation context) to another specialized agent. Stays within a single run. |
| Guardrails | Input, output, and tool-level validation. Can run in parallel with agent execution (for latency) or blocking (for cost/safety). |
| Tools | Python/TypeScript functions auto-converted to tool schemas. Also supports hosted tools (WebSearch, FileSearch, CodeInterpreter, ImageGeneration) and MCP servers. |
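The "functions auto-converted to tool schemas" row can be made concrete with a minimal pure-Python sketch. This is not the SDK's implementation — the real `@function_tool` decorator also parses docstrings and handles complex types — but it shows the basic idea of deriving a JSON-schema-style tool definition from a signature and type hints; `lookup_invoice` is a hypothetical example function.

```python
import inspect
from typing import get_type_hints

# Illustrative mapping from Python annotations to JSON-schema types.
PY_TO_JSON = {str: "string", int: "integer", float: "number", bool: "boolean"}

def to_tool_schema(fn):
    """Sketch of turning a function signature into a tool schema."""
    hints = get_type_hints(fn)
    params = inspect.signature(fn).parameters
    properties = {
        name: {"type": PY_TO_JSON.get(hints.get(name, str), "string")}
        for name in params
    }
    return {
        "name": fn.__name__,
        "description": (fn.__doc__ or "").strip(),
        "parameters": {
            "type": "object",
            "properties": properties,
            "required": list(params),
        },
    }

def lookup_invoice(invoice_id: str, include_items: bool) -> str:
    """Fetch an invoice by its ID."""
    ...

schema = to_tool_schema(lookup_invoice)
```

The appeal of this approach is that the function itself stays ordinary Python; the schema the model sees is generated, not hand-written.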

The Agent Loop

The Runner manages the iterative agent execution loop: the LLM receives input, decides on an action, invokes a tool, receives the output, and feeds it back for the next reasoning step — continuing until the task is complete or a stopping condition is met. This automates the core orchestration that developers would otherwise build manually.
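The loop described above can be sketched in a few lines of plain Python, with the LLM stubbed out as a callable that returns either a tool call or a final answer. This is a toy illustration, not the Runner's actual code — the real loop also handles handoffs, guardrails, and streaming.

```python
# Toy version of the agent execution loop the Runner automates.
def run_loop(model, tools, user_input, max_turns=10):
    messages = [{"role": "user", "content": user_input}]
    for _ in range(max_turns):
        action = model(messages)                  # LLM decides the next step
        if action["type"] == "final":             # stopping condition reached
            return action["content"]
        result = tools[action["tool"]](**action["args"])   # invoke the tool
        messages.append({"role": "tool", "content": str(result)})  # feed back
    raise RuntimeError("max_turns exceeded")

# Stub "model": request one tool call, then finish with the tool's output.
def stub_model(messages):
    if messages[-1]["role"] == "tool":
        return {"type": "final", "content": messages[-1]["content"]}
    return {"type": "tool_call", "tool": "add", "args": {"a": 2, "b": 3}}

answer = run_loop(stub_model, {"add": lambda a, b: a + b}, "What is 2 + 3?")
```

The `max_turns` cap mirrors the stopping conditions the SDK enforces so a confused model cannot loop forever.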

Multi-Agent Patterns

The SDK supports two complementary patterns for multi-agent collaboration, as described in the official documentation:

  1. Handoffs — Peer-to-peer delegation. Agent A transfers the conversation to Agent B. Flexible for open-ended or conversational workflows, but can make it harder to maintain a global view.

  2. Agents as Tools — Hierarchical orchestration. A manager agent calls specialist agents as tools, retaining overall control. Keeps a single thread of control and tends to simplify coordination.
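The control-flow difference between the two patterns can be shown with a toy sketch (plain Python, no SDK): in a handoff, the triage agent relinquishes control and the specialist's reply goes straight back to the user; with agents-as-tools, the manager keeps control and can combine or post-process specialist outputs. The agent functions here are hypothetical stand-ins.

```python
# Stand-in "specialist agents" for illustration.
def billing_agent(msg):
    return f"billing: refund issued for '{msg}'"

def technical_agent(msg):
    return f"technical: status checked for '{msg}'"

def triage_with_handoff(msg):
    # Handoff: control (and the conversation) transfers to one specialist,
    # whose answer is returned directly — triage has no further say.
    target = billing_agent if "charge" in msg else technical_agent
    return target(msg)

def manager_with_agent_tools(msg):
    # Agents as tools: the manager invokes specialists like functions and
    # retains control, so it can consult several and merge their answers.
    parts = [billing_agent(msg), technical_agent(msg)]
    return " | ".join(parts)

handoff_reply = triage_with_handoff("I was charged twice")
manager_reply = manager_with_agent_tools("app is down")
```

In the real SDK the first pattern uses the `handoffs=[...]` parameter and the second wraps an agent as a callable tool for the manager agent.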

Execution Support

| Mode | Support |
| --- | --- |
| Local | Full support. Runs on your own infrastructure. |
| Remote | Client-side framework; deploy to containers, edge, or cloud. |
| Languages | Python and TypeScript with feature parity. |
| Streaming | Full token streaming support. |
| Tracing | Automatic tracing of agent runs without custom instrumentation. Custom spans supported. OpenTelemetry-compatible. |
| MCP | Full Model Context Protocol integration via the openai-agents-mcp package. |
| Models | Optimized for OpenAI (GPT-4o-class); works with 100+ LLMs via the Chat Completions API. |
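The streaming row describes the usual async pattern: the model yields tokens as they arrive and the caller renders them incrementally. A minimal sketch of that consumption pattern, with the model stubbed out as an async generator (not the SDK's streaming API):

```python
import asyncio

async def stream_tokens(text):
    """Stand-in for a streaming model response."""
    for token in text.split():
        await asyncio.sleep(0)          # simulate tokens arriving over time
        yield token

async def main():
    received = []
    async for token in stream_tokens("Tokens arrive one at a time"):
        received.append(token)          # render/flush each token as it lands
    return received

tokens = asyncio.run(main())
```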

Code Example: Multi-Agent with Handoffs and Guardrails

from agents import Agent, Runner, InputGuardrail, GuardrailFunctionOutput

# Define a guardrail to block off-topic requests
async def topic_guardrail(ctx, agent, input) -> GuardrailFunctionOutput:
    # Use a fast/cheap model to check if the input is on-topic
    result = await Runner.run(
        Agent(
            name="Topic Checker",
            instructions="Return 'off_topic' if the user is asking about unrelated subjects.",
            model="gpt-4o-mini",
        ),
        input,
    )
    is_off_topic = "off_topic" in result.final_output.lower()
    return GuardrailFunctionOutput(
        output_info={"decision": result.final_output},
        tripwire_triggered=is_off_topic,
    )

# Define specialist agents. (lookup_invoice, process_refund, search_docs, and
# check_status are assumed to be function tools defined elsewhere.)
billing_agent = Agent(
    name="Billing Specialist",
    instructions="You help users with billing questions, refunds, and payment issues.",
    tools=[lookup_invoice, process_refund],
)

technical_agent = Agent(
    name="Technical Support",
    instructions="You help users debug technical issues with the product.",
    tools=[search_docs, check_status],
)

# Define triage agent with handoffs and guardrails
triage_agent = Agent(
    name="Triage",
    instructions="Route users to the appropriate specialist based on their request.",
    handoffs=[billing_agent, technical_agent],
    input_guardrails=[
        InputGuardrail(guardrail_function=topic_guardrail),
    ],
)

# Run the agent (inside an async function, or wrap the call with asyncio.run)
result = await Runner.run(triage_agent, "I was charged twice for my subscription")
print(result.final_output)

Guardrails in Depth

Guardrails are a key differentiator of the Agents SDK. They operate at three levels:

| Level | When It Runs | Scope |
| --- | --- | --- |
| Input guardrails | Before the first agent processes input | First agent in the chain only |
| Output guardrails | After the final agent produces output | Last agent in the chain only |
| Tool guardrails | Before and after each tool invocation | Every custom function-tool call |

Guardrails can run in parallel (default — best latency, but agent may consume tokens before guardrail fails) or blocking (guardrail completes before agent starts — prevents wasted tokens and side effects).

import json

from agents import function_tool, tool_input_guardrail, ToolGuardrailFunctionOutput

@tool_input_guardrail
def block_secrets(data):
    args = json.loads(data.context.tool_arguments or "{}")
    if "sk-" in json.dumps(args):
        return ToolGuardrailFunctionOutput.reject_content(
            "Remove secrets before calling this tool."
        )
    return ToolGuardrailFunctionOutput.allow()

@function_tool(tool_input_guardrails=[block_secrets])
def classify_text(text: str) -> str:
    """Classify text for internal routing."""
    return f"length:{len(text)}"
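The parallel-versus-blocking tradeoff described above can be sketched with plain asyncio (a toy illustration, not SDK code): in parallel mode the agent starts alongside the guardrail, so latency is lower but the agent burns tokens even when the guardrail trips; in blocking mode the agent never starts unless the guardrail passes.

```python
import asyncio

async def guardrail():
    await asyncio.sleep(0.01)   # stand-in for a cheap classification call
    return True                 # input allowed

async def agent():
    await asyncio.sleep(0.01)   # stand-in for the (expensive) agent run
    return "answer"

async def parallel_mode():
    # Guardrail and agent start together: lowest latency, but the agent
    # has already consumed tokens by the time a tripwire fires.
    ok, answer = await asyncio.gather(guardrail(), agent())
    return answer if ok else None

async def blocking_mode():
    # Guardrail completes first: no wasted tokens or side effects,
    # at the cost of serialized latency.
    if not await guardrail():
        return None
    return await agent()

parallel = asyncio.run(parallel_mode())
blocking = asyncio.run(blocking_mode())
```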

From Swarm to Agents SDK

The Agents SDK is the direct successor to OpenAI Swarm (21k stars, archived March 2025). Swarm was an educational, experimental framework that demonstrated the handoff pattern in under 100 lines of code, but was explicitly not intended for production use — no built-in memory, no tracing, no guardrails.

The Agents SDK carries forward Swarm's core handoff abstraction while adding everything needed for production:

| Capability | Swarm | Agents SDK |
| --- | --- | --- |
| Agent handoffs | Yes (basic) | Yes (with input filters, context management) |
| Guardrails | No | Yes (input, output, tool-level) |
| Tracing | No | Yes (automatic, OpenTelemetry-compatible) |
| Streaming | No | Yes (full token streaming) |
| MCP support | No | Yes (via openai-agents-mcp) |
| Hosted tools | No | Yes (WebSearch, FileSearch, CodeInterpreter, etc.) |
| TypeScript | No | Yes (full feature parity) |
| Provider support | OpenAI only | 100+ LLMs via Chat Completions API |
| Maintenance | Archived | Active development (pushed daily) |

Ecosystem Context

In late 2025, OpenAI expanded the agent platform further:

  • AgentKit — Higher-level building blocks for orchestrating agents, complementing the Agents SDK.
  • Conversations API — Durable threads with replayable state for persistent agent conversations.
  • Connectors and MCP Servers — Standardized external context and tool access.
  • Apps SDK — Extends MCP to let developers build UIs alongside MCP servers.

Strengths and Limitations

Strengths:

  • Minimalist design — four primitives cover most agent patterns
  • Production-ready tracing out of the box (strongest among lightweight frameworks)
  • Clean, minimal handoff model compared with heavier orchestration frameworks
  • Dual Python/TypeScript support with feature parity
  • Provider-agnostic (100+ LLMs) despite OpenAI optimization
  • Active daily development (20k stars, growing)
  • Strong guardrail system with input/output/tool-level validation
  • MCP integration for standardized tool access

Limitations:

  • No built-in persistent memory — developers must implement their own state management
  • Handoffs stay within a single run, limiting long-running multi-session workflows
  • Best performance requires OpenAI models; other providers may offer a degraded experience
  • Less flexible than LangGraph for complex graph-based workflows with conditional routing
  • No built-in human-in-the-loop patterns (must be implemented manually)
  • Newer framework — smaller ecosystem than LangGraph or CrewAI

When to Use the OpenAI Agents SDK

Choose the Agents SDK when you want a lightweight, production-ready framework for multi-agent systems with minimal abstraction overhead. It is the best choice for teams already in the OpenAI ecosystem, projects that need strong tracing and guardrails out of the box, and scenarios where the handoff pattern (triage → specialist delegation) maps naturally to the problem. For complex graph-based workflows with conditional branching and checkpointing, consider LangGraph instead.