SWE-agent¶
Overview¶
| Attribute | Detail |
|---|---|
| Repository | SWE-agent/SWE-agent |
| Stars | 18,821 |
| Language | Python |
| License | MIT |
| Last Push | 2026-03-23 |
| Maturity | Production-ready (research-originated) |
| Use Case Fit | Automated bug fixing, issue resolution, cybersecurity |
SWE-agent is a system that enables LLM agents to autonomously solve software engineering tasks by interacting with codebases through a custom Agent-Computer Interface (ACI). Published at NeurIPS 2024, it pioneered the idea that LLM agents are a new category of end users that benefit from specially designed interfaces to the software they use.
Architecture¶
┌─────────────────────────────────────┐
│ SWE-agent │
│ │
│ ┌──────────┐ ┌───────────────┐ │
│ │ LLM │───>│ Agent-Comp. │ │
│ │ (Brain) │ │ Interface │ │
│ └──────────┘ │ (ACI) │ │
│ └───────┬───────┘ │
│ │ │
│ ┌────────────────┼─────┐ │
│ ▼ ▼ ▼ │
│ ┌──────────┐ ┌──────┐ ┌──────┐ │
│ │File Edit │ │Repo │ │Test │ │
│ │Commands │ │Nav │ │Exec │ │
│ └──────────┘ └──────┘ └──────┘ │
└─────────────────────────────────────┘
Agent-Computer Interface (ACI)¶
The ACI is SWE-agent's key contribution. Rather than giving the LLM raw bash access, it provides a curated set of commands optimized for how LLMs interact with code:
- File operations: Create, edit, view files with LLM-friendly output formatting
- Repository navigation: Search across files, jump to definitions, explore directory structures
- Test execution: Run tests and parse results in a structured format
- Contextual feedback: Error messages and outputs formatted for LLM comprehension
The ACI design significantly impacts performance — the paper shows that interface design choices can swing benchmark scores by 20+ percentage points.
Benchmarks¶
| Benchmark | Score | Notes |
|---|---|---|
| SWE-bench Lite | Strong baseline | First major autonomous SE agent |
| SWE-bench Verified | Competitive | Via Live-SWE-agent variant (77.4%) |
| HumanEvalFix | Evaluated | Code repair tasks |
SWE-agent Variants¶
- SWE-agent (original): The NeurIPS 2024 system. Custom ACI for code interaction.
- SWE-agent-LM-32B: Trained on SWE-smith data (50k instances from 128 repos). Achieves 40.2% on SWE-bench Verified — SOTA among open-source models.
- Live-SWE-agent: Self-evolving variant that autonomously improves its own scaffold at runtime. 77.4% on SWE-bench Verified — outperforms all existing agents including proprietary solutions.
Execution Support¶
| Mode | Support |
|---|---|
| Local | Full support. Runs locally with Docker. |
| Remote | Docker-based sandboxed execution. |
| Models | Model-agnostic — works with any LLM via API. |
| Security | Also used for offensive cybersecurity research. |
Code Example¶
# Install SWE-agent
pip install swe-agent
# Run on a GitHub issue
swe-agent run \
--model gpt-4 \
--data-path https://github.com/user/repo/issues/42 \
--config-file config/default.yaml
# Run on a local repository
swe-agent run \
--model claude-3-sonnet \
--repo-path /path/to/repo \
--problem-statement "Fix the authentication bypass in login.py"
Key Papers¶
- SWE-agent: Agent-Computer Interfaces Enable Automated Software Engineering (Yang et al., NeurIPS 2024). The foundational paper introducing ACI design.
- SWE-smith: Scaling Data for Software Engineering Agents (Yang et al., NeurIPS 2025). Data generation pipeline; SWE-agent-LM-32B achieves 40.2% SOTA.
- Live-SWE-agent: Can Software Engineering Agents Self-Evolve on the Fly? (Xia et al., 2025). Self-evolving agent achieving 77.4% on SWE-bench Verified.
- SWE-Debate: Competitive Multi-Agent Debate for Software Issue Resolution (Li et al., 2025). Multi-agent debate + MCTS for patch generation.
Related: Agentless¶
Agentless (Xia et al., 2024) is a counterpoint to SWE-agent's approach. It uses a simple three-phase process — localize, repair, validate — without autonomous agent behavior. Despite its simplicity, Agentless achieved competitive results at $0.70 per issue and was adopted by OpenAI and DeepSeek for model evaluation.
Strengths and Limitations¶
Strengths:
- Pioneered ACI design — purpose-built interfaces for LLM agents
- Strong academic grounding (NeurIPS publications)
- Active research ecosystem (SWE-smith, Live-SWE-agent, SWE-Debate)
- Model-agnostic
- Also applicable to cybersecurity and competitive coding
- Open-source training pipeline (SWE-smith)
Limitations:
- Primarily research-focused — less polished developer experience than Aider or OpenHands
- Performance heavily dependent on ACI design choices
- Docker required for safe execution
- Not designed for general-purpose multi-agent orchestration (focused on SE tasks)
When to Use SWE-agent¶
Choose SWE-agent when you need a research-grade tool for automated software engineering tasks, especially bug fixing and issue resolution. It is the go-to choice for academic research on SE agents and for teams wanting to train custom coding models (via SWE-smith).