Autonomy

Agents and agent systems

People use the word agent for almost anything that feels a little autonomous. I think it gets clearer fast once you separate the loop, the framework, the host, and the long-running system.

The short version

An agent is a model running in a loop.

The important shift is that it does not stop after one answer. It looks around, picks a next step, does something, checks what happened, and keeps going. That continuing loop is the part that earns the name.

Products pile on memory, approvals, dashboards, subagents, and all the rest. Fine. The loop is still the thing to anchor on first.

That's the core idea. Want to try the pieces? The labs turn these concepts into small buildable steps instead of one big abstract category.

Core shape

An agent is a model-driven loop.

An agent observes the current situation, decides what to do, takes an action, reads the result, and repeats until the task is done or it needs help.

To be clear, the defining trait is not mystical intelligence. It is persistence across steps. A chatbot answers once; an agent keeps working the problem.

Minimal loop

Observe -> decide -> act -> evaluate -> repeat or stop.
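
The loop above can be sketched in a few lines of Python. This is a minimal illustration, not any product's implementation: in a real agent `decide` would be a model call, and the `run_loop` function, the counter task, and `max_steps` are all made up for the example.

```python
from typing import Any, Callable

def run_loop(observe: Callable[[], Any],
             decide: Callable[[Any], str],
             act: Callable[[str], Any],
             evaluate: Callable[[Any], bool],
             max_steps: int = 10) -> list[tuple[str, Any]]:
    """Observe -> decide -> act -> evaluate -> repeat or stop."""
    trace = []
    for _ in range(max_steps):          # hard cap so the loop always ends
        situation = observe()           # observe the current situation
        action = decide(situation)      # decide (a model call in a real agent)
        result = act(action)            # act
        trace.append((action, result))
        if evaluate(result):            # evaluate: done, so stop
            break
    return trace

# Toy task (hypothetical): raise a counter until it reaches 3.
world = {"count": 0}

def observe() -> dict:
    return dict(world)

def decide(situation: dict) -> str:
    return "increment"

def act(action: str) -> int:
    if action == "increment":
        world["count"] += 1
    return world["count"]

def evaluate(result: int) -> bool:
    return result >= 3

trace = run_loop(observe, decide, act, evaluate)
print(trace)  # [('increment', 1), ('increment', 2), ('increment', 3)]
```

The step cap is there because a loop with no stop condition other than "the model says it is done" can run forever.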

In practice

The loop is only one part of what people call an agent.

Some tools sell you a runtime, some sell you a host, and some bundle a long-running assistant platform around both.

Current default

Agent runtime

ELI5: The worker keeps looking around the room, picking the next step, and checking whether the job is done.

What it actually is: The observe-decide-act-evaluate loop plus state, retries, memory retrieval, and stop conditions.

Try it: Start with lab 06.

Real tools: custom loops you write yourself, LangChain (an LLM app framework), OpenAI Agents SDK, and Microsoft Agent Framework.
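
To make the "loop plus state, retries, memory retrieval, and stop conditions" description concrete, here is a rough sketch of those extra pieces around a bare loop. Everything in it is an assumption for illustration: `RunState`, `with_retries`, `retrieve`, and the flaky tool are invented names, and the keyword-match retrieval is deliberately naive. None of this reflects how the listed frameworks work internally.

```python
from dataclasses import dataclass, field

@dataclass
class RunState:
    goal: str
    steps: int = 0
    notes: list[str] = field(default_factory=list)  # state carried across steps
    done: bool = False

def with_retries(tool, arg, attempts=3):
    """Call a tool, retrying on RuntimeError up to `attempts` times."""
    last_error = None
    for _ in range(attempts):
        try:
            return tool(arg)
        except RuntimeError as err:
            last_error = err
    raise RuntimeError(f"gave up after {attempts} attempts") from last_error

def retrieve(memory: dict[str, str], goal: str) -> list[str]:
    """Naive memory retrieval: notes whose key appears in the goal text."""
    return [note for key, note in memory.items() if key in goal]

def run(goal, tool, memory, max_steps=5):
    state = RunState(goal=goal, notes=retrieve(memory, goal))
    # Stop conditions: the task is done, or the step budget is spent.
    while not state.done and state.steps < max_steps:
        state.steps += 1
        result = with_retries(tool, goal)
        state.notes.append(result)
        state.done = result == "ok"
    return state

# Hypothetical tool that fails once, then succeeds.
calls = {"n": 0}

def flaky_tool(arg):
    calls["n"] += 1
    if calls["n"] == 1:
        raise RuntimeError("transient failure")
    return "ok"

memory = {"tests": "project uses pytest"}
state = run("fix failing tests", flaky_tool, memory)
print(state.steps, state.done, state.notes)
```

The retry is invisible to the loop itself: the first tool call fails, the second succeeds, and the run still counts as one step.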

Current default, absorbs other layers

CLI or IDE agent

ELI5: A worker plus the front desk plus the toolbox all bundled into one station.

What it actually is: A user-facing assistant that combines runtime, host experience, tools, approvals, and context handling.

Try it: Follow lab 09 and lab 11.

Real tools: Goose, Claude Code, Cursor, Gemini CLI, GitHub Copilot CLI.

Useful niche

Persistent agent system

ELI5: The worker never clocks out; it keeps notes, messages people, and comes back to jobs later.

What it actually is: A long-running assistant with memory, scheduling, skills, channels, and durable task state across sessions.

Try it: Explore the persistent assistant platform lab.

Real tools: Hermes Agent (a research-adjacent open-source assistant), OpenClaw (an early-stage local-first assistant platform), long-running team assistants.

Concrete example: when Claude Code is fixing a failing test, the agent part is the loop that reads the failure, opens the relevant files, reruns the test, and stops once it has a credible answer or needs your approval.

Long-running systems

Where persistent assistant platforms fit

Self-improving persistent agent

This category combines a terminal interface, messaging gateways, skills, memory, scheduling, tool execution, subagents, protocol integration, and multiple execution backends. Hermes Agent, a research-adjacent open-source assistant with early-stage adoption, is one public example.

Category: persistent agent system with CLI and messaging interfaces.

Local-first assistant gateway

This category centers on a long-running assistant you run on your own devices, with a local gateway, many chat channels, skills, toolsets, multi-agent routing, companion apps, and sandbox options. OpenClaw, an early-stage local-first assistant platform, is one public example.

Category: persistent assistant platform. Specific feature claims should be checked against current project docs because this area changes fast.

Area of effect

A skill changes the agent. A sub-agent is a separate agent.

These two get mixed up constantly, and the clearest way to tell them apart is to ask where the effect lands. A skill changes how the agent in front of you behaves. A sub-agent is a second worker the agent hands a job to.

Using a skill

The agent keeps its full context and stays in one reasoning thread. The skill adds know-how and points to the right tools, but the same agent is still doing the work and still deciding, step by step, how to apply it.

A guided path, not a rail. One context window, one chain of reasoning, better informed.

Invoking a sub-agent

The agent spins up a separate reasoning process with its own context window, hands over only what that sub-agent needs, and waits for a result. It gets back an answer to integrate, not a window into how the sub-agent thought.

Delegation. Separate thread, separate context, a result handed back.

The practical tell is simple. With a skill, the agent knows more and does the work itself. With a sub-agent, the agent delegates, gets an answer back, and folds it into its own work.

One nuance worth holding loosely: how much context you hand a sub-agent is a design choice, so the isolation is tunable rather than guaranteed. Some orchestrators pass almost nothing, others pass a lot. And some frameworks build skills as lightweight sub-agents under the hood, which is part of why the words blur. The separation of reasoning threads is the real line; the amount of context crossing it is a dial. For the skill side of this, see "Skills, hooks, and wrappers."
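
A toy model can make the "where does the effect land" test concrete. This sketch assumes a made-up `Agent` class whose context is just a list of strings; `use_skill`, `delegate`, and `work` are hypothetical names, not any framework's API.

```python
from dataclasses import dataclass, field

@dataclass
class Agent:
    name: str
    context: list[str] = field(default_factory=list)

    def use_skill(self, skill_text: str) -> None:
        # A skill lands in this agent's own context; the same thread keeps working.
        self.context.append(f"skill: {skill_text}")

    def delegate(self, task: str, handoff: list[str]) -> str:
        # A sub-agent starts with its own context holding only the handoff.
        sub = Agent(name=f"{self.name}/sub", context=list(handoff))
        result = sub.work(task)
        # Only the result comes back; the sub-agent's context is discarded.
        self.context.append(f"result: {result}")
        return result

    def work(self, task: str) -> str:
        self.context.append(f"task: {task}")
        return f"{self.name} finished '{task}' with {len(self.context)} context items"

main = Agent("main", context=["user request", "repo notes", "prior tool output"])
main.use_skill("how to write release notes")
answer = main.delegate("summarize changelog", handoff=["changelog text"])
print(answer)        # main/sub finished 'summarize changelog' with 2 context items
print(main.context)
```

The `handoff` argument is the dial: pass one item and the sub-agent is nearly isolated, pass the whole parent context and it is not. Either way it remains a separate worker, and the parent only ever sees the returned string.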

How it got this shape

Why the word agent got crowded so quickly

The modern agent story usually starts with ReAct in 2022 because it made the loop legible: reason, act, observe, continue. It was not the first automation pattern of that kind, but it was simple enough that builders could copy it immediately.

Then 2023 happened. LangChain gave developers a convenient way to wire prompts, tools, and memory-like pieces together. AutoGPT made a lot of noise and helped push the word agent into the mainstream, usually with a much looser meaning.

That is how the term got mushy. A loop, a framework, a coding assistant, and a persistent assistant platform all started wearing the same label. More recent systems have moved back toward explicit tool-calling, visible state, and tighter controls, which I think is a healthier direction.

Go deeper: Place familiar frameworks like LangChain and LangGraph

If you already know LangChain

Where frameworks like LangChain and LangGraph fit

LangChain is mostly a framework layer

LangChain is best understood as a developer framework for wiring model calls, prompts, tools, retrieval, memory-like patterns, and agent loops into applications. That places it mostly around the agent-runtime layer rather than the raw model-access layer.

Primary fit: framework around layers 03 through 05.

LangGraph pushes upward into orchestration

Once the same ecosystem starts expressing explicit graphs, stateful workflows, and longer-running coordination, it begins to live higher in the stack as orchestration as well as runtime.

Primary fit: layers 05 and 07, with governance adjacent through LangSmith-style tooling.

The practical rule: frameworks like LangChain usually sit above model access and below user-facing hosts. They help you build the middle of the stack.

Go deeper: See the longer timeline behind today's agent talk

How agent loops evolved

  1. Command-line automation

    Developers already had composable tools: shells, pipes, CLIs, scripts, Makefiles, and CI.

  2. Structured developer protocols

    Language Server Protocol (LSP) and Debug Adapter Protocol (DAP), along with test runners, package managers, and static analysis, made software systems easier for machines to inspect.

  3. LLM APIs and function calling

    Applications began asking models for structured tool calls instead of plain text only.

  4. Early agent loops

    ReAct-style prompting and AutoGPT-style experiments showed models could chain thought, action, and observation.

  5. AI coding assistants

    Assistants moved into editors and terminals, where they could combine code context, shell commands, patches, and approvals.

  6. Protocol and skill ecosystems

    MCP and skill systems made integrations and procedures more portable across hosts.

  7. Persistent agents

    Long-running systems started combining memory, chat gateways, scheduling, subagents, and autonomous workflows.

Go deeper: Review the controls more autonomous agents need

Control pressure

More autonomy means more need for controls.

Capability | Why it is useful | What to watch
Memory | Keeps preferences, project history, and repeated procedures available. | Stale, sensitive, or incorrect memories can mislead the agent.
Scheduling | Lets agents run reports, audits, or maintenance without manual prompting. | Unattended actions need strong permissions and notifications.
Subagents | Parallelizes research, testing, review, and specialized work. | Coordination overhead and unclear ownership can create confusion.
Self-improvement | Can turn repeated experience into better skills or memory. | Generated procedures need review before they become trusted defaults.
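
As one example of what such a control might look like, here is a small sketch of a permission gate for scheduled, unattended actions. It is an assumption-heavy illustration: the `Policy` class, the action names, and the three verdicts are invented here, and a real system would need far more than an allowlist.

```python
from dataclasses import dataclass, field

@dataclass
class Policy:
    allowed: set[str]                                  # actions that may run unattended
    notifications: list[str] = field(default_factory=list)

    def check(self, action: str, attended: bool) -> str:
        if action in self.allowed:
            verdict = "run"
        elif attended:
            verdict = "ask"      # a human is present, so request approval
        else:
            verdict = "block"    # unattended and not pre-approved
        if not attended:
            # Unattended decisions always leave a notification behind.
            self.notifications.append(f"{action}: {verdict}")
        return verdict

policy = Policy(allowed={"run_report"})
verdicts = [
    policy.check("run_report", attended=False),
    policy.check("delete_branch", attended=False),
    policy.check("delete_branch", attended=True),
]
print(verdicts)              # ['run', 'block', 'ask']
print(policy.notifications)  # ['run_report: run', 'delete_branch: block']
```

The same action gets a different verdict depending on whether anyone is watching, which is the point of the scheduling row: autonomy widens what needs to be decided in advance.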

Go deeper: Walk through one loop cycle step by step

Watch the loop happen

  1. Observe: the agent reads the user request, repo files, prior tool output, or memory.

  2. Decide: it chooses whether to inspect more context, edit a file, run a command, or ask for help.

  3. Act: it calls a tool, edits code, creates a task, or delegates work.

  4. Evaluate: it reads the result and decides whether the task is done or the loop should continue.

Ready to build

Build the loop before you theorize about it.

The fastest way to stop treating agents like a mystical category is to build the loop yourself. Lab 06 walks through a minimal working version.

If you want the bigger arc after that, follow the main lab sequence from the loop into memory, coordination, and governance.