OpenAI Agents SDK

Overview

| Attribute | Detail |
| --- | --- |
| Repository | openai/openai-agents-python |
| Stars | 20,220 |
| Language | Python, TypeScript |
| License | MIT |
| Last Push | 2026-03-23 |
| Maturity | Production-ready |
| Use Case Fit | Multi-agent workflows, agent handoffs, production agent systems |

The OpenAI Agents SDK is a lightweight, open-source framework for building multi-agent workflows. Launched in March 2025 as the production-ready successor to the experimental Swarm project, it provides four core primitives — Agents, Handoffs, Guardrails, and Tools — that handle the key building blocks of agent systems without the overhead of heavier orchestration frameworks. While optimized for OpenAI models, it works with 100+ LLMs via the Chat Completions API.

Architecture

The SDK's architecture is intentionally minimal — a small set of composable primitives rather than a sprawling framework:

┌─────────────────────────────────────────────────┐
│                    Runner                        │
│         (Manages the agent execution loop)       │
├─────────────────────────────────────────────────┤
│                                                  │
│  ┌──────────┐  handoff()  ┌──────────────────┐  │
│  │ Triage   │────────────>│ Specialist       │  │
│  │ Agent    │             │ Agent            │  │
│  │          │  handoff()  │                  │  │
│  │          │<────────────│                  │  │
│  └──────────┘             └──────────────────┘  │
│       │                          │               │
│       ▼                          ▼               │
│  ┌──────────┐             ┌──────────────────┐  │
│  │ Tools    │             │ Guardrails       │  │
│  │ (func,   │             │ (input, output,  │  │
│  │  MCP,    │             │  tool)           │  │
│  │  hosted) │             │                  │  │
│  └──────────┘             └──────────────────┘  │
│                                                  │
│  ┌──────────────────────────────────────────┐   │
│  │              Tracing                      │   │
│  │  (Automatic spans, custom spans, export)  │   │
│  └──────────────────────────────────────────┘   │
└─────────────────────────────────────────────────┘

Core Primitives

| Primitive | Description |
| --- | --- |
| Agents | Instruction-driven entities with access to tools and models. Defined by a system prompt, a model, and a set of tools. |
| Handoffs | Native mechanism for delegating tasks between agents. One agent transfers control (and conversation context) to another specialized agent. Stays within a single run. |
| Guardrails | Input, output, and tool-level validation. Can run in parallel with agent execution (for latency) or blocking (for cost/safety). |
| Tools | Python/TypeScript functions auto-converted to tool schemas. Also supports hosted tools (WebSearch, FileSearch, CodeInterpreter, ImageGeneration) and MCP servers. |
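The "functions auto-converted to tool schemas" row can be made concrete with a minimal pure-Python sketch. This is not the SDK's implementation — the real `@function_tool` decorator also parses docstrings and handles complex types — but it shows the basic idea of deriving a JSON-schema-style tool definition from a signature and type hints; `lookup_invoice` is a hypothetical example function.

```python
import inspect
from typing import get_type_hints

# Illustrative mapping from Python annotations to JSON-schema types.
PY_TO_JSON = {str: "string", int: "integer", float: "number", bool: "boolean"}

def to_tool_schema(fn):
    """Sketch of turning a function signature into a tool schema."""
    hints = get_type_hints(fn)
    params = inspect.signature(fn).parameters
    properties = {
        name: {"type": PY_TO_JSON.get(hints.get(name, str), "string")}
        for name in params
    }
    return {
        "name": fn.__name__,
        "description": (fn.__doc__ or "").strip(),
        "parameters": {
            "type": "object",
            "properties": properties,
            "required": list(params),
        },
    }

def lookup_invoice(invoice_id: str, include_items: bool) -> str:
    """Fetch an invoice by its ID."""
    ...

schema = to_tool_schema(lookup_invoice)
```

The appeal of this approach is that the function itself stays ordinary Python; the schema the model sees is generated, not hand-written.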

The Agent Loop

The Runner manages the iterative agent execution loop: the LLM receives input, decides on an action, invokes a tool, receives the output, and feeds it back for the next reasoning step — continuing until the task is complete or a stopping condition is met. This automates the core orchestration that developers would otherwise build manually.
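The loop described above can be sketched in a few lines of plain Python, with the LLM stubbed out as a callable that returns either a tool call or a final answer. This is a toy illustration, not the Runner's actual code — the real loop also handles handoffs, guardrails, and streaming.

```python
# Toy version of the agent execution loop the Runner automates.
def run_loop(model, tools, user_input, max_turns=10):
    messages = [{"role": "user", "content": user_input}]
    for _ in range(max_turns):
        action = model(messages)                  # LLM decides the next step
        if action["type"] == "final":             # stopping condition reached
            return action["content"]
        result = tools[action["tool"]](**action["args"])   # invoke the tool
        messages.append({"role": "tool", "content": str(result)})  # feed back
    raise RuntimeError("max_turns exceeded")

# Stub "model": request one tool call, then finish with the tool's output.
def stub_model(messages):
    if messages[-1]["role"] == "tool":
        return {"type": "final", "content": messages[-1]["content"]}
    return {"type": "tool_call", "tool": "add", "args": {"a": 2, "b": 3}}

answer = run_loop(stub_model, {"add": lambda a, b: a + b}, "What is 2 + 3?")
```

The `max_turns` cap mirrors the stopping conditions the SDK enforces so a confused model cannot loop forever.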

Multi-Agent Patterns

The SDK supports two complementary patterns for multi-agent collaboration, as described in the official documentation:

  1. Handoffs — Peer-to-peer delegation. Agent A transfers the conversation to Agent B. Flexible for open-ended or conversational workflows, but can make it harder to maintain a global view.

  2. Agents as Tools — Hierarchical orchestration. A manager agent calls specialist agents as tools, retaining overall control. Keeps a single thread of control and tends to simplify coordination.
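The control-flow difference between the two patterns can be shown with a toy sketch (plain Python, no SDK): in a handoff, the triage agent relinquishes control and the specialist's reply goes straight back to the user; with agents-as-tools, the manager keeps control and can combine or post-process specialist outputs. The agent functions here are hypothetical stand-ins.

```python
# Stand-in "specialist agents" for illustration.
def billing_agent(msg):
    return f"billing: refund issued for '{msg}'"

def technical_agent(msg):
    return f"technical: status checked for '{msg}'"

def triage_with_handoff(msg):
    # Handoff: control (and the conversation) transfers to one specialist,
    # whose answer is returned directly — triage has no further say.
    target = billing_agent if "charge" in msg else technical_agent
    return target(msg)

def manager_with_agent_tools(msg):
    # Agents as tools: the manager invokes specialists like functions and
    # retains control, so it can consult several and merge their answers.
    parts = [billing_agent(msg), technical_agent(msg)]
    return " | ".join(parts)

handoff_reply = triage_with_handoff("I was charged twice")
manager_reply = manager_with_agent_tools("app is down")
```

In the real SDK the first pattern uses the `handoffs=[...]` parameter and the second wraps an agent as a callable tool for the manager agent.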

Execution Support

| Mode | Support |
| --- | --- |
| Local | Full support. Runs on your own infrastructure. |
| Remote | Client-side framework; deploy to containers, edge, or cloud. |
| Languages | Python and TypeScript with feature parity. |
| Streaming | Full token streaming support. |
| Tracing | Automatic tracing of agent runs without custom instrumentation. Custom spans supported. OpenTelemetry-compatible. |
| MCP | Full Model Context Protocol integration via the openai-agents-mcp package. |
| Models | Optimized for OpenAI (GPT-4o-class); works with 100+ LLMs via the Chat Completions API. |
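The streaming row describes the usual async pattern: the model yields tokens as they arrive and the caller renders them incrementally. A minimal sketch of that consumption pattern, with the model stubbed out as an async generator (not the SDK's streaming API):

```python
import asyncio

async def stream_tokens(text):
    """Stand-in for a streaming model response."""
    for token in text.split():
        await asyncio.sleep(0)          # simulate tokens arriving over time
        yield token

async def main():
    received = []
    async for token in stream_tokens("Tokens arrive one at a time"):
        received.append(token)          # render/flush each token as it lands
    return received

tokens = asyncio.run(main())
```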

Code Example: Multi-Agent with Handoffs and Guardrails

from agents import Agent, Runner, InputGuardrail, GuardrailFunctionOutput

# Define a guardrail to block off-topic requests
async def topic_guardrail(ctx, agent, input) -> GuardrailFunctionOutput:
    # Use a fast/cheap model to check if the input is on-topic
    result = await Runner.run(
        Agent(
            name="Topic Checker",
            instructions="Return 'off_topic' if the user is asking about unrelated subjects.",
            model="gpt-4o-mini",
        ),
        input,
    )
    is_off_topic = "off_topic" in result.final_output.lower()
    return GuardrailFunctionOutput(
        output_info={"decision": result.final_output},
        tripwire_triggered=is_off_topic,
    )

# Define specialist agents. (lookup_invoice, process_refund, search_docs, and
# check_status are assumed to be function tools defined elsewhere.)
billing_agent = Agent(
    name="Billing Specialist",
    instructions="You help users with billing questions, refunds, and payment issues.",
    tools=[lookup_invoice, process_refund],
)

technical_agent = Agent(
    name="Technical Support",
    instructions="You help users debug technical issues with the product.",
    tools=[search_docs, check_status],
)

# Define triage agent with handoffs and guardrails
triage_agent = Agent(
    name="Triage",
    instructions="Route users to the appropriate specialist based on their request.",
    handoffs=[billing_agent, technical_agent],
    input_guardrails=[
        InputGuardrail(guardrail_function=topic_guardrail),
    ],
)

# Run the agent (inside an async function, or wrap the call with asyncio.run)
result = await Runner.run(triage_agent, "I was charged twice for my subscription")
print(result.final_output)

Guardrails in Depth

Guardrails are a key differentiator of the Agents SDK. They operate at three levels:

| Level | When It Runs | Scope |
| --- | --- | --- |
| Input guardrails | Before the first agent processes input | First agent in the chain only |
| Output guardrails | After the final agent produces output | Last agent in the chain only |
| Tool guardrails | Before and after each tool invocation | Every custom function-tool call |

Guardrails can run in parallel (default — best latency, but agent may consume tokens before guardrail fails) or blocking (guardrail completes before agent starts — prevents wasted tokens and side effects).

import json

from agents import function_tool, tool_input_guardrail, ToolGuardrailFunctionOutput

@tool_input_guardrail
def block_secrets(data):
    args = json.loads(data.context.tool_arguments or "{}")
    if "sk-" in json.dumps(args):
        return ToolGuardrailFunctionOutput.reject_content(
            "Remove secrets before calling this tool."
        )
    return ToolGuardrailFunctionOutput.allow()

@function_tool(tool_input_guardrails=[block_secrets])
def classify_text(text: str) -> str:
    """Classify text for internal routing."""
    return f"length:{len(text)}"
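The parallel-versus-blocking tradeoff described above can be sketched with plain asyncio (a toy illustration, not SDK code): in parallel mode the agent starts alongside the guardrail, so latency is lower but the agent burns tokens even when the guardrail trips; in blocking mode the agent never starts unless the guardrail passes.

```python
import asyncio

async def guardrail():
    await asyncio.sleep(0.01)   # stand-in for a cheap classification call
    return True                 # input allowed

async def agent():
    await asyncio.sleep(0.01)   # stand-in for the (expensive) agent run
    return "answer"

async def parallel_mode():
    # Guardrail and agent start together: lowest latency, but the agent
    # has already consumed tokens by the time a tripwire fires.
    ok, answer = await asyncio.gather(guardrail(), agent())
    return answer if ok else None

async def blocking_mode():
    # Guardrail completes first: no wasted tokens or side effects,
    # at the cost of serialized latency.
    if not await guardrail():
        return None
    return await agent()

parallel = asyncio.run(parallel_mode())
blocking = asyncio.run(blocking_mode())
```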

From Swarm to Agents SDK

The Agents SDK is the direct successor to OpenAI Swarm (21k stars, archived March 2025). Swarm was an educational, experimental framework that demonstrated the handoff pattern in under 100 lines of code, but was explicitly not intended for production use — no built-in memory, no tracing, no guardrails.

The Agents SDK carries forward Swarm's core handoff abstraction while adding everything needed for production:

| Capability | Swarm | Agents SDK |
| --- | --- | --- |
| Agent handoffs | Yes (basic) | Yes (with input filters, context management) |
| Guardrails | No | Yes (input, output, tool-level) |
| Tracing | No | Yes (automatic, OpenTelemetry-compatible) |
| Streaming | No | Yes (full token streaming) |
| MCP support | No | Yes (via openai-agents-mcp) |
| Hosted tools | No | Yes (WebSearch, FileSearch, CodeInterpreter, etc.) |
| TypeScript | No | Yes (full feature parity) |
| Provider support | OpenAI only | 100+ LLMs via Chat Completions API |
| Maintenance | Archived | Active development (pushed daily) |

Ecosystem Context

In late 2025, OpenAI expanded the agent platform further:

  • AgentKit — Higher-level building blocks for orchestrating agents, complementing the Agents SDK.
  • Conversations API — Durable threads with replayable state for persistent agent conversations.
  • Connectors and MCP Servers — Standardized external context and tool access.
  • Apps SDK — Extends MCP to let developers build UIs alongside MCP servers.

Strengths and Limitations

Strengths:

  • Minimalist design — four primitives cover most agent patterns
  • Production-ready tracing out of the box (strongest among lightweight frameworks)
  • Clean, minimal handoff model compared with heavier orchestration frameworks
  • Dual Python/TypeScript support with feature parity
  • Provider-agnostic (100+ LLMs) despite OpenAI optimization
  • Active daily development (20k stars, growing)
  • Strong guardrail system with input/output/tool-level validation
  • MCP integration for standardized tool access

Limitations:

  • No built-in persistent memory — developers must implement their own state management
  • Handoffs stay within a single run, limiting long-running multi-session workflows
  • Best performance requires OpenAI models; other providers may offer a degraded experience
  • Less flexible than LangGraph for complex graph-based workflows with conditional routing
  • No built-in human-in-the-loop patterns (must be implemented manually)
  • Newer framework — smaller ecosystem than LangGraph or CrewAI

When to Use the OpenAI Agents SDK

Choose the Agents SDK when you want a lightweight, production-ready framework for multi-agent systems with minimal abstraction overhead. It is the best choice for teams already in the OpenAI ecosystem, projects that need strong tracing and guardrails out of the box, and scenarios where the handoff pattern (triage → specialist delegation) maps naturally to the problem. For complex graph-based workflows with conditional branching and checkpointing, consider LangGraph instead.