Phase 14

Agent Engineering

42 hands-on lessons building AI from first principles in the browser. Reading is free; graded exercises and a certificate come with lifetime access.

The Agent Loop: Observe, Think, Act
ReWOO and Plan-and-Execute: Decoupled Planning (graded)
Reflexion: Verbal Reinforcement Learning (graded)
Tree of Thoughts and LATS: Deliberate Search
Self-Refine and CRITIC: Iterative Output Improvement
Tool Use and Function Calling
Memory: Virtual Context and MemGPT
Memory Blocks and Sleep-Time Compute (Letta)
Hybrid Memory: Vector + Graph + KV (Mem0)
Skill Libraries and Lifelong Learning (Voyager)
Planning with HTN and Evolutionary Search (graded)
Anthropic's Workflow Patterns: Simple Over Complex
LangGraph: Stateful Graphs and Durable Execution
AutoGen v0.4: Actor Model and Agent Framework
CrewAI: Role-Based Crews and Flows
OpenAI Agents SDK: Handoffs, Guardrails, Tracing
Claude Agent SDK: Subagents and Session Store
Agno and Mastra: Production Runtimes
Benchmarks: SWE-bench, GAIA, AgentBench
Benchmarks: WebArena and OSWorld
Computer Use: Claude, OpenAI CUA, Gemini
Voice Agents: Pipecat and LiveKit
OpenTelemetry GenAI Semantic Conventions
Agent Observability: Langfuse, Phoenix, Opik
Multi-Agent Debate and Collaboration
Failure Modes: Why Agents Break
Prompt Injection and the PVE Defense
Orchestration Patterns: Supervisor, Swarm, Hierarchical
Production Runtimes: Queue, Event, Cron
Eval-Driven Agent Development
Agent Workbench Engineering: Why Capable Models Still Fail
The Minimal Agent Workbench
Agent Instructions as Executable Constraints
Repo Memory and Durable State
Initialization Scripts for Agents
Scope Contracts and Task Boundaries (graded)
Runtime Feedback Loops
Verification Gates
Reviewer Agent: Separate Builder from Marker
Multi-Session Handoff
The Workbench on a Real Repo
Capstone: Ship a Reusable Agent Workbench Pack