The AI tooling stack

A lot of AI confusion is really naming confusion. Once each piece has a place — model access, tools, protocols, agent loops, interfaces, workflows, controls — the ecosystem stops feeling like one giant blur.

The short version

The stack is a vocabulary list disguised as a diagram.

The stack is not here to look architectural. It is here so you can stop lumping five different things into the word "tool" or "agent."

You do not need every edge case up front. Rough placement is enough for most of the site to start making sense.

That's the core idea. Ready for one important layer? The protocols page zooms in on MCP and shows how one layer makes tools portable across hosts.

Example first

A local CLI agent is not one layer. That is the point.

A terminal agent feels like one product because that is how you meet it. Under the hood, it is a stack of decisions: which model path it uses, which tools it can reach, how it handles context, what approvals it asks for, and whether it speaks a protocol like MCP. Goose, an open-source AI CLI, is one example of that shape.

So the stack is less like a neat pile of boxes and more like a cutaway drawing. You are seeing what the product is made of.
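
One way to make the cutaway concrete is to write the decisions down as a record. This is a minimal Python sketch, not a description of Goose or any other real product's configuration: the class name, the field names, and the example values are all hypothetical.

```python
from dataclasses import dataclass, field, fields


@dataclass
class TerminalAgent:
    """Hypothetical record of the decisions hiding inside one CLI agent."""
    model_path: str                                  # which model path it uses
    tools: list[str] = field(default_factory=list)   # which tools it can reach
    context_policy: str = "unspecified"              # how it handles context
    approvals: str = "unspecified"                   # what approvals it asks for
    speaks_mcp: bool = False                         # whether it speaks a protocol like MCP


def cutaway(agent: TerminalAgent) -> list[str]:
    """List each decision as 'name: value' so one product reads as its parts."""
    return [f"{f.name}: {getattr(agent, f.name)}" for f in fields(agent)]


# Illustrative values only; they are not taken from any real tool.
example = TerminalAgent(
    model_path="provider API",
    tools=["git", "rg", "pytest"],
    context_policy="summarize old turns",
    approvals="ask before shell commands",
    speaks_mcp=True,
)

for line in cutaway(example):
    print(line)
```

Five fields, one product. Each field is a place where two agents that look alike in the terminal can differ.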

How the layers interact

The stack is vertical, but the work still runs in a loop.

The layer list tells you where a thing lives. This sketch answers a second question: how the pieces usually work together once you start building.

When I get lost, this is the question I come back to: what job is this piece doing right now? Model access, connection, execution, packaging, loop control, human mediation, coordination, or risk control?
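
To see the loop rather than the list, here is a small Python sketch in which each piece does exactly one of those jobs. Everything in it is a stand-in: `fake_model`, `TOOLS`, `approve`, and `run` are hypothetical names, and the "model" is a hard-coded function rather than a real provider call.

```python
from typing import Callable


# Execution: a plain function standing in for a real tool (layer 02).
def word_count(text: str) -> int:
    return len(text.split())


# Connection: a tiny registry mapping tool names to callables (layer 03).
TOOLS: dict[str, Callable[[str], int]] = {"word_count": word_count}


# Model access: a hard-coded stand-in for a model call (layer 01).
def fake_model(goal: str, observations: list[str]) -> dict:
    if not observations:
        return {"action": "call", "tool": "word_count", "arg": goal}
    return {"action": "stop", "answer": f"The goal has {observations[-1]} words."}


# Risk control: an approval check plus an audit trail (layer 08).
AUDIT_LOG: list[str] = []


def approve(tool: str) -> bool:
    allowed = tool in TOOLS
    AUDIT_LOG.append(f"{tool}: {'approved' if allowed else 'denied'}")
    return allowed


# Loop control: keep asking for the next step until the model says stop (layer 05).
def run(goal: str, max_steps: int = 5) -> str:
    observations: list[str] = []
    for _ in range(max_steps):
        step = fake_model(goal, observations)
        if step["action"] == "stop":
            return step["answer"]
        if not approve(step["tool"]):
            return "stopped: tool call was not approved"
        result = TOOLS[step["tool"]](step["arg"])
        observations.append(str(result))
    return "stopped: step limit reached"


print(run("rename the config file"))
```

The layers stay put while control passes through them: the loop asks the model, the model names a tool, the approval check records the decision, the tool runs, and the result feeds the next turn.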

From hardware to agents

The stack layers, listed from top (08) to bottom (01)

08

Governance, evaluation, and observability

Plain English: The safety rails and dashboard.

Technical view: Approvals, audit logs, traces, evals, policy, rate limits, cost controls, secret scanning, reproducibility, and deployment gates.

Examples: promptfoo (eval testing), LangSmith (traces and evals), OpenTelemetry (shared tracing standard), CI checks.

07

Orchestration and multi-agent coordination

Plain English: The project manager that splits and schedules work.

Technical view: Workflow engines, graph execution, background jobs, subagent delegation, retries, queues, routing, and long-running task state.

Examples: LangGraph (graph-based agent flows), Temporal (durable workflows), Airflow, GitHub Actions, n8n.

06

Hosts and user interfaces

Plain English: The place where you talk to the AI and approve its work.

Technical view: Hosts own the conversation, context window policy, permission prompts, tool result display, and user interaction model.

Examples: Copilot CLI, Claude Code, Cursor, VS Code, chat apps, web dashboards.

05

Agent runtime

Plain English: The loop that keeps asking, "what should I do next?"

Technical view: Planning, tool selection, state management, memory retrieval, error handling, reflection, retry logic, and stop conditions.

Examples: custom loops, LangChain (LLM app framework), OpenAI Agents SDK, Microsoft Agent Framework, coding agents.

Framework note: LangChain mainly fits here; LangGraph pushes higher into orchestration.

04

Packaging and behavior extension

Plain English: Reusable recipes and add-ons.

Technical view: Skills, plugins, prompt packs, hooks, slash commands, project instruction files, templates, and reusable workflows.

Examples: skills, project instruction files, pre-tool hooks, code review prompt packs.

03

Protocols and adapters

Plain English: Standard plugs that connect the AI to tools and information.

Technical view: Schemas and transports for discovering context, invoking tools, streaming results, authenticating, and adapting existing systems.

Examples: MCP, function calling, OpenAPI, LSP (editor code-intelligence protocol), DAP (debugger protocol), browser automation protocols.

02

Executable capabilities

Plain English: The actual tools that do work.

Technical view: CLIs, APIs, shell commands, scripts, test runners, package managers, databases, browsers, and cloud services.

Examples: git, rg, jq, curl, gh, npm, pytest, SQL clients, REST APIs.

01

Model access, data, and compute

Plain English: The engine, the fuel, and the workshop floor.

Technical view: Hosted subscriptions, provider APIs, aggregate routers, local endpoints, model artifacts, embedding models, filesystems, databases, vector stores, containers, sandboxes, and credentials.

Examples: provider APIs, local hosting runtimes, quantized model files, Docker, E2B, local files.

Start here: model access paths.
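
Since the stack is really a vocabulary list, it can be written as a lookup table. In the Python sketch below, the layer numbers and names are the ones listed above, and the placements are copied from each layer's Examples line. The dictionary names and the `place` function are hypothetical, and a single "primary" layer per tool is a simplification.

```python
# Layer numbers and names as listed in this section.
LAYERS = {
    8: "Governance, evaluation, and observability",
    7: "Orchestration and multi-agent coordination",
    6: "Hosts and user interfaces",
    5: "Agent runtime",
    4: "Packaging and behavior extension",
    3: "Protocols and adapters",
    2: "Executable capabilities",
    1: "Model access, data, and compute",
}

# Primary placement only, taken from the Examples lines above.
# Real products usually reach into neighboring layers too.
PRIMARY_LAYER = {
    "promptfoo": 8, "LangSmith": 8, "OpenTelemetry": 8,
    "LangGraph": 7, "Temporal": 7, "Airflow": 7,
    "Claude Code": 6, "Cursor": 6,
    "LangChain": 5,
    "MCP": 3, "OpenAPI": 3, "LSP": 3,
    "git": 2, "rg": 2, "pytest": 2,
    "Docker": 1, "E2B": 1,
}


def place(name: str) -> str:
    """Return a one-line placement for a tool name, or say it is unplaced."""
    layer = PRIMARY_LAYER.get(name)
    if layer is None:
        return f"{name}: not placed yet"
    return f"{name}: layer {layer:02d}, {LAYERS[layer]}"


print(place("MCP"))
print(place("LangGraph"))
```

Rough placement is the goal here. A name that returns "not placed yet" is a prompt to ask what job the thing does, not a gap in the table.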

Layer coverage

Where real tool shapes sit on the stack

Most products do not belong to one neat box. This is the useful part: it shows where a tool is primarily living and which neighboring layers it still reaches into.

Legend: primary fit / touches this layer / no meaningful fit

Layer | Local CLI agent | Protocol adapter | Task-memory graph | Workspace coordinator | Persistent assistant
Governance | Approvals | - | - | Watchdogs | Sandboxing
Orchestration | - | - | Ready tasks | Multi-agent work | Routing + schedules
Host / UI | CLI + desktop | - | - | Workspace manager | Chat + channels
Agent runtime | Agent loop | - | - | Launches agents | Persistent loop
Packaging | Extensions | - | - | Hooks + roles | Skills
Protocols | Uses MCP | Protocol layer | Adapter package | - | MCP / gateway
Capabilities | Calls tools | Exposes tools | Task CLI | Coordinator CLI | Built-in toolsets
Foundation | Model access + files | - | Database state | Project state | Workspace + memory

How it got this shape

Why people keep reaching for layered diagrams

Layered diagrams are an old engineering habit for a reason. Networking had OSI. Software teams talk about application, data, and infrastructure layers. Cloud systems added gateways, queues, runtimes, and control planes.

AI tooling grew into the same shape. First you mostly had model access and chat. Then tools let models reach outward into files, APIs, shells, and browsers, which meant the interesting boundary was no longer just the model. After that, agent loops and protocols made the middle of the system much more visible.

So yes, this page borrows old architecture instincts on purpose. A coding assistant is not just a model call. It is also tools, hosts, permissions, packaging, orchestration, and observability. The layered view is still the cleanest way I know to say where each part lives.

Go deeper: place a real task-memory graph on the stack

1. It has a CLI. A command such as bd is an executable capability, so it touches layer 02.

2. It stores durable state. The task graph acts like memory for agents and humans, so it also belongs near the foundation (layer 01).

3. It shapes workflow. Ready-task detection, claiming, and dependencies push it upward into coordination (layer 07).
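
Ready-task detection is the step that lifts a task store into coordination, and it is small enough to sketch. The Python below is an illustration under stated assumptions, not how bd or any particular tool is implemented: the `Task` class, its fields, and both functions are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Task:
    """Hypothetical task record; a real task graph stores more than this."""
    id: str
    done: bool = False
    claimed_by: Optional[str] = None
    depends_on: list[str] = field(default_factory=list)


def ready_tasks(tasks: dict[str, Task]) -> list[str]:
    """A task is ready when it is open, unclaimed, and every dependency is done."""
    return [
        t.id
        for t in tasks.values()
        if not t.done
        and t.claimed_by is None
        and all(tasks[dep].done for dep in t.depends_on)
    ]


def claim(tasks: dict[str, Task], task_id: str, agent: str) -> bool:
    """Claim a task only if it is currently ready; report whether it worked."""
    if task_id not in ready_tasks(tasks):
        return False
    tasks[task_id].claimed_by = agent
    return True


tasks = {
    "design": Task("design", done=True),
    "build": Task("build", depends_on=["design"]),
    "test": Task("test", depends_on=["build"]),
}

print(ready_tasks(tasks))                # ['build']
print(claim(tasks, "build", "agent-a"))  # True
print(ready_tasks(tasks))                # []
```

The stored state is foundation, and the functions are plain capabilities. What makes it feel like coordination is that several agents can ask "what is ready?" and get answers that do not collide.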

Quick diagnostic

I'm stuck. What layer is the problem on?

When something feels broken, do not start by blaming the whole stack. Start by asking which boundary is failing.

Model problem

If the answer is weak or wrong before tools even enter the picture, start at model access.

Think prompt, model choice, or raw API behavior. Start with lab 00.

Tool-shape problem

If the agent can do the job in theory but keeps tripping over messy inputs or outputs, the tool interface is the problem.

That usually points to lab 01 or lab 02.

Protocol problem

If the host cannot discover the tool, call it reliably, or pass structured results around, check the adapter layer.

That is usually a lab 03 problem.

Host problem

If approvals, context loading, file permissions, or the overall UX feel wrong, the host is shaping the behavior.

That is the territory of lab 09.

If the tool call works but the agent still chooses bad next steps, that is usually a runtime problem instead. See lab 06.
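
The same diagnostic fits in a small lookup table, which is handy to keep next to the labs. In this Python sketch the symptoms and lab numbers are copied from the list above, while the dictionary layout and the function name are hypothetical and not part of any lab tooling.

```python
# Symptom and starting lab for each boundary, copied from the diagnostic above.
DIAGNOSTIC = {
    "model": ("Weak or wrong answer before tools are involved", "lab 00"),
    "tool shape": ("Agent trips over messy tool inputs or outputs", "lab 01 or lab 02"),
    "protocol": ("Host cannot discover or reliably call the tool", "lab 03"),
    "runtime": ("Tool calls work but the next steps are bad", "lab 06"),
    "host": ("Approvals, context loading, or permissions feel wrong", "lab 09"),
}


def where_to_look(problem: str) -> str:
    """Return the symptom and starting lab for a named boundary."""
    symptom, lab = DIAGNOSTIC[problem]
    return f"{problem} problem: {symptom}. Start with {lab}."


print(where_to_look("protocol"))
```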

Ready to build

The stack makes more sense once you've touched each layer.

Reading the stack helps. Building through it helps more. The labs take you from model access up through tools, protocols, skills, hooks, the agent loop, memory, coordination, and governance, one layer at a time.

Start at the bootstrap step, then move through the main arc. Come back here whenever something in the labs needs a name.