Changelog & Sources

Changelog

v1.3.0 — March 23, 2026

New section: Production Systems — reverse-engineers two closed-source agentic systems using a five-layer methodology (Observable Behavior → Inferred Architecture → Published Information → OSS Analog Mapping → DIY Replication Path).
New page: Claude Code (production-systems/claude-code.md) — deep dive into Anthropic's agentic coding assistant. Covers the single-agent + tool loop architecture, 17 exposed tools, extended thinking, context compaction, CLAUDE.md convention, SWE-bench benchmarks, and DIY replication with open-source models (Qwen2.5-Coder, DeepSeek, Devstral).
New page: Perplexity Computer (production-systems/perplexity-computer.md) — deep dive into Perplexity's multi-agent research and task assistant. Covers the orchestrator + subagent architecture, Firecracker VM isolation, integrated search pipeline (200B+ URLs, Vespa AI), skill system, connector ecosystem, and DIY replication with LangGraph + open-source search.
New page: Production Systems Overview (production-systems/index.md) — methodology explanation and system comparison table.
Updated Learning Path (learning-path.md) — added Week 0 ("Production Systems — What You're Actually Using") before existing weeks. Curriculum is now 5 weeks. Existing Weeks 1–4 unchanged.
Updated navigation: added Production Systems section between Internals and Getting Started.
Updated Executive Overview in index.md with reference to Production Systems.
Sources: 50+ new web sources, engineering blogs, system prompt analyses, and benchmark data referenced.

v1.0.0 — March 23, 2026

Initial research compilation covering 10 frameworks and tools across research/orchestration and software development agents.
Deep dives for: LangGraph, AutoGen, CrewAI, OpenAI Agents SDK, Agency Swarm, SWE-agent, OpenHands, Aider, Patchwork, Devin.
Comparison tables, decision matrix, and maturity assessment.
4-week learning path for engineers.
Sources: 15+ academic papers, 10+ GitHub repositories, industry reports, and framework documentation.

v1.1.0 — March 23, 2026

Replaced OpenAI Swarm deep dive with OpenAI Agents SDK deep dive (Swarm's production successor).
Added site_url to mkdocs.yml for canonical URLs.
Created root README.md with project description and live-docs link.
Expanded Day 7 of the learning path with agentic security research use cases: vulnerability discovery (SWE-agent), multi-agent SOC operations (arXiv:2511.15755), and hardware/firmware analysis agents.
Updated comparison tables and decision matrix to reflect Agents SDK.

v1.2.0 — March 23, 2026

New page: Internals (internals.md) — raw agent loop, tool calling wire format, state serialization patterns (full replay vs. typed state vs. role-scoped memory), handoff mechanics, framework philosophy tradeoffs, and the orchestration tax with empirical data.
New page: Evaluation (evals.md) — why evals are hard, what to evaluate (5 dimensions), evaluation approaches (LLM-as-judge, deterministic checks, trajectory eval, human eval), benchmark landscape (SWE-bench, GAIA, HumanEval, AgentBench, WebArena, ToolBench, BFCL, TAU-bench), eval tools comparison, and a minimal 3-layer eval setup with code.
New page: Full Workflow (workflow.md) — end-to-end annotated walkthroughs for research and software development workflows with concrete data payloads, failure tables, and Mermaid comparison diagram.
Rewrote Learning Path (learning-path.md) — surveyed 11 external resources (DeepLearning.AI, HuggingFace, Microsoft GitHub, UC Berkeley, Maven, Anthropic, Lilian Weng), restructured Week 1 to start with raw internals before frameworks, added eval day (Week 2 Day 7), added framework comparison day (Week 3 Day 7), preserved v1.1.0 Day 7 security frontier content.
Updated navigation: added Internals, Full Workflow, and Evaluation to mkdocs.yml nav.
Updated Executive Overview in index.md with references to new Internals and Evaluation pages.
Sources: 40+ new web sources, 5+ new academic papers referenced.

Academic Sources

#	Paper	Authors	Venue / Year	DOI / Link	Code Available
1	SWE-agent: Agent-Computer Interfaces Enable Automated Software Engineering	Yang et al.	NeurIPS 2024	arXiv:2405.15793	Yes — GitHub
2	Agentless: Demystifying LLM-based Software Engineering Agents	Xia et al.	Proc. ACM Softw. Eng. (FSE) 2025	DOI:10.1145/3715754	Yes — GitHub
3	Live-SWE-agent: Can Software Engineering Agents Self-Evolve on the Fly?	Xia et al.	arXiv 2025	arXiv:2511.13646	Yes — GitHub
4	HALO: Hierarchical Autonomous Logic-Oriented Orchestration	Hou et al.	arXiv 2025	arXiv:2505.13516	Yes — GitHub
5	SagaLLM: Context Management, Validation, and Transaction Guarantees	Chang & Geng	VLDB 2025	DOI:10.14778/3750601.3750611	No
6	LEGOMem: Modular Procedural Memory for Multi-agent LLM Systems	Han et al.	arXiv 2025	arXiv:2510.04851	No
7	OSC: Cognitive Orchestration through Dynamic Knowledge Alignment	Zhang et al.	arXiv 2025	arXiv:2509.04876	No
8	MASAI: Modular Architecture for Software-engineering AI Agents	Arora et al.	arXiv 2024	arXiv:2406.11638	No
9	ChatDev: Communicative Agents for Software Development	Qian et al.	arXiv 2024	arXiv:2307.07924	Yes — GitHub
10	HyperAgent: Generalist Software Engineering Agents	Phan et al.	arXiv 2024	arXiv:2409.16299	No
11	The OpenHands Software Agent SDK	OpenHands Team	arXiv 2025	arXiv:2511.03690	Yes — GitHub
12	SWE-smith: Scaling Data for Software Engineering Agents	Yang et al.	NeurIPS 2025	NeurIPS Poster	Yes — swesmith.com
13	SWE-Debate: Competitive Multi-Agent Debate for Software Issue Resolution	Li et al.	arXiv 2025	arXiv:2507.23348	No
14	SWE-Dev: Evaluating and Training Autonomous Feature-Driven Software Development	Du et al.	arXiv 2025	arXiv:2505.16975	Yes — GitHub
15	Large Language Model-Based Agents for Software Engineering: A Survey	Liu et al.	arXiv 2024	arXiv:2409.02977	No
16	Multi-Agent Real-time Chat Orchestration (MARCO)	Shrimal et al.	arXiv 2024	arXiv:2410.21784	No
17	Multi-Agent LLM Orchestration Achieves Deterministic Decision Support for Incident Response	Drammeh	arXiv 2025	arXiv:2511.15755	Yes
18	OmniNova: A General Multimodal Agent Framework	Du	arXiv 2025	arXiv:2503.20028	No
19	A Multi-Agent LLM Framework for Design Space Exploration in Autonomous Driving	Shih et al.	arXiv 2025	arXiv:2512.08476	No
20	VoCare AI: A Multi-Agent LLM Workflow for Improved Clinic Operational Efficiency	Han et al.	IEEE TENCON 2025	IEEE:11375163	Yes — GitHub

GitHub Repositories Referenced

Repository	Stars	Language	License	Last Active
langchain-ai/langgraph	27,241	Python	MIT	2026-03-23
microsoft/autogen	56,069	Python	CC-BY-4.0	2026-03-21
crewAIInc/crewAI	46,965	Python	MIT	2026-03-23
openai/swarm	21,211	Python	MIT	2025-03-11
VRSEN/agency-swarm	4,108	Python	MIT	2026-03-23
SWE-agent/SWE-agent	18,821	Python	MIT	2026-03-23
All-Hands-AI/OpenHands	69,606	Python	Custom	2026-03-23
Aider-AI/aider	42,286	Python	Apache-2.0	2026-03-17
openai/openai-agents-python	20,220	Python	MIT	2026-03-23
patched-codes/patchwork	1,548	Python	AGPL-3.0	2025-04-18
OpenAutoCoder/Agentless	2,022	Python	MIT	2024-12-22
OpenAutoCoder/live-swe-agent	339	Python	MIT	2026-01-19

Web Sources

All star counts and last-active dates were collected on March 23, 2026 via the GitHub API.
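
The note above says only that the figures came from the GitHub API; the exact collection script is not part of this page. As a minimal sketch of how one repository row could be reproduced, the following uses the public `GET /repos/{owner}/{repo}` endpoint and its `stargazers_count`, `language`, `license.spdx_id`, and `pushed_at` fields. The helper names (`repo_row`, `fetch_repo`, `format_row`), the mapping of "Last Active" to `pushed_at`, and the display of unclassified licenses as "Custom" are assumptions made for illustration, not the method actually used.

```python
import json
import urllib.request
from typing import Any

API_ROOT = "https://api.github.com/repos/"


def repo_row(payload: dict[str, Any]) -> dict[str, Any]:
    """Reduce a GitHub repository payload to the columns of the table above."""
    license_info = payload.get("license") or {}
    spdx = license_info.get("spdx_id")
    # Assumption: a missing or unclassified license (GitHub reports
    # "NOASSERTION") is displayed as "Custom" in this sketch.
    license_name = "Custom" if spdx in (None, "NOASSERTION") else spdx
    return {
        "Repository": payload["full_name"],
        "Stars": payload["stargazers_count"],
        "Language": payload.get("language") or "",
        "License": license_name,
        # Assumption: "Last Active" is the date part of `pushed_at`.
        "Last Active": (payload.get("pushed_at") or "")[:10],
    }


def fetch_repo(full_name: str) -> dict[str, Any]:
    """Fetch repository metadata; unauthenticated, so rate limits apply."""
    request = urllib.request.Request(
        API_ROOT + full_name,
        headers={"Accept": "application/vnd.github+json"},
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        return json.load(response)


def format_row(row: dict[str, Any]) -> str:
    """Render one row as tab-separated text with a thousands separator."""
    return "\t".join(
        [
            row["Repository"],
            f"{row['Stars']:,}",
            row["Language"],
            row["License"],
            row["Last Active"],
        ]
    )
```

Calling `format_row(repo_row(fetch_repo("langchain-ai/langgraph")))` would print a current row for that repository. The numbers will differ from the table, which is a snapshot from the collection date.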