Phase 14
Agent Engineering
42 hands-on lessons that build AI from first principles in the browser. Reading is free; graded exercises and the certificate come with lifetime access.
- The Agent Loop: Observe, Think, Act
- ReWOO and Plan-and-Execute: Decoupled Planning (graded)
- Reflexion: Verbal Reinforcement Learning (graded)
- Tree of Thoughts and LATS: Deliberate Search
- Self-Refine and CRITIC: Iterative Output Improvement
- Tool Use and Function Calling
- Memory: Virtual Context and MemGPT
- Memory Blocks and Sleep-Time Compute (Letta)
- Hybrid Memory: Vector + Graph + KV (Mem0)
- Skill Libraries and Lifelong Learning (Voyager)
- Planning with HTN and Evolutionary Search (graded)
- Anthropic's Workflow Patterns: Simple Over Complex
- LangGraph: Stateful Graphs and Durable Execution
- AutoGen v0.4: Actor Model and Agent Framework
- CrewAI: Role-Based Crews and Flows
- OpenAI Agents SDK: Handoffs, Guardrails, Tracing
- Claude Agent SDK: Subagents and Session Store
- Agno and Mastra: Production Runtimes
- Benchmarks: SWE-bench, GAIA, AgentBench
- Benchmarks: WebArena and OSWorld
- Computer Use: Claude, OpenAI CUA, Gemini
- Voice Agents: Pipecat and LiveKit
- OpenTelemetry GenAI Semantic Conventions
- Agent Observability: Langfuse, Phoenix, Opik
- Multi-Agent Debate and Collaboration
- Failure Modes: Why Agents Break
- Prompt Injection and the PVE Defense
- Orchestration Patterns: Supervisor, Swarm, Hierarchical
- Production Runtimes: Queue, Event, Cron
- Eval-Driven Agent Development
- Agent Workbench Engineering: Why Capable Models Still Fail
- The Minimal Agent Workbench
- Agent Instructions as Executable Constraints
- Repo Memory and Durable State
- Initialization Scripts for Agents
- Scope Contracts and Task Boundaries (graded)
- Runtime Feedback Loops
- Verification Gates
- Reviewer Agent: Separate Builder from Marker
- Multi-Session Handoff
- The Workbench on a Real Repo
- Capstone: Ship a Reusable Agent Workbench Pack