Lab 01

Build a dumb CLI tool an AI can call

Start with one deterministic capability that takes clear inputs, prints a clear result, and fails loudly when something is wrong.

What you'll build

A tiny term counter with a stable command-line shape.

In this lab you run a small Python CLI that counts how many times a term appears across one or more files. It prints a per-file count and a total, then exits with a status code that tells you whether the run succeeded.

The deeper point is not the counting. It is the interface. A tool an AI host might call later needs boring inputs, boring outputs, and boring failure modes. That kind of predictability is what makes a tool safe to wrap and automate.

Run it

cd ai_ecosystem_labs
python3 01-cli/term_count.py agent sample_docs/agents.txt

Starting here? Quick setup:

git clone https://github.com/BanditF/ai_ecosystem_labs
cd ai_ecosystem_labs
python3 01-cli/term_count.py agent sample_docs/agents.txt

Requires Python 3. No extra packages needed for this lab.

Time guide. Setup: ~2 min. Working through it: 15–25 min if shell flags and exit codes already feel familiar.

Why this piece exists

Before a host can orchestrate tools, each tool has to be boring.

When people talk about agents using tools, it can sound more magical than it really is. Under the hood, a lot of that work is just a model deciding when to call a deterministic command. If that command is vague, flaky, or inconsistent, the whole layer above it gets shaky fast.

This is why tiny CLIs matter. A command like this has the same basic job as familiar tools such as grep, wc, or git grep: accept predictable arguments, do one thing, print a result, and exit clearly. Ollama plays a similar role at a larger scale for model access: one stable command in front of a more complicated system. The exact task changes, but the principle is the same.

The code

term_count.py

Walk through it

Four things worth noticing.

Exit codes are part of the interface

If a file is missing, the script prints missing file: ... to stderr and returns 1. That matters. A tool that fails silently is worse than no tool, because the caller may keep going as if everything worked.
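To see why the exit code matters from the caller's side, here is a minimal sketch of a wrapper that refuses to continue when a command fails. The run_tool helper and the ToolError name are hypothetical and not part of the lab's repo. The sketch assumes only that the tool signals failure with a non-zero exit status and writes its error message to stderr.

```python
import subprocess
import sys


class ToolError(RuntimeError):
    """Raised when a wrapped command exits with a non-zero status (hypothetical name)."""


def run_tool(cmd: list[str]) -> str:
    """Run a command and return its stdout, or raise ToolError if it failed."""
    result = subprocess.run(cmd, capture_output=True, text=True)
    if result.returncode != 0:
        # Surface the tool's own stderr message instead of carrying on silently.
        raise ToolError(f"exit {result.returncode}: {result.stderr.strip()}")
    return result.stdout


if __name__ == "__main__":
    # Stand-in for a failing tool: prints an error to stderr and exits with 1.
    failing = [
        sys.executable,
        "-c",
        "import sys; print('missing file: nope.txt', file=sys.stderr); sys.exit(1)",
    ]
    try:
        run_tool(failing)
    except ToolError as err:
        print(f"tool failed, stopping: {err}")
```

A wrapper that ignored returncode would hand an empty stdout to the next step as if it were a real result, which is the silent failure described above.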

Positional args keep the shape predictable

The command takes a positional term and one or more positional files. That makes the calling convention easy to remember and easy to validate. Optional flags such as --case-sensitive layer on top of that, but the important thing stays the same: callers need a stable shape they can rely on.
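One way to express that calling convention is with argparse from the standard library. This is a sketch of the shape described above, not the contents of term_count.py; the build_parser name and the help strings are assumptions made for illustration.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser with one positional term and one or more positional files."""
    parser = argparse.ArgumentParser(description="Count a term across files.")
    parser.add_argument("term", help="term to search for")
    # nargs="+" requires at least one file, so an empty file list is rejected up front.
    parser.add_argument("files", nargs="+", help="one or more files to search")
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args(["agent", "sample_docs/agents.txt"])
    print(args.term, args.files)
```

With this shape, a call that omits the files never reaches the counting code: argparse prints a usage message to stderr and exits with status 2.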

Determinism is what makes tools trustworthy

Given the same term and the same files, the script produces the same counts every time. There is no randomness, no hidden state, and no model judgment involved. That repeatability is what lets you trust the result enough to build more layers on top of it.
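A pure counting function makes that property easy to see. The sketch below assumes the matching rule is a non-overlapping substring count that ignores case by default. The lab's script may match differently, and count_term is a hypothetical name.

```python
def count_term(text: str, term: str, case_sensitive: bool = False) -> int:
    """Count non-overlapping occurrences of term in text (assumed matching rule)."""
    if not case_sensitive:
        text, term = text.lower(), term.lower()
    return text.count(term)


if __name__ == "__main__":
    sample = "Agent loops call tools. An agent is only as good as its tools."
    # Same inputs, same answer, every time: the set collapses to a single value.
    results = {count_term(sample, "agent") for _ in range(100)}
    print(results)
```

The function reads nothing but its arguments and writes nothing but its return value, so there is nowhere for hidden state to enter.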

Plain text now, machine output next

The included version prints human-friendly text like sample_docs/agents.txt: 2 and total: 2. That is fine for a person in a terminal. For automation, you would usually want a --json flag with named fields instead of loose text. That machine-facing wrapper is the next step in Lab 02.
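To make the contrast concrete, here is a sketch that renders the same counts both ways. The text shape follows the output shown in this lab. The JSON field names (files, path, count, total) are hypothetical, and Lab 02 may use a different shape.

```python
import json


def format_text(counts: dict[str, int]) -> str:
    """Render counts in the human-friendly shape shown in this lab."""
    lines = [f"{path}: {n}" for path, n in counts.items()]
    lines.append(f"total: {sum(counts.values())}")
    return "\n".join(lines)


def format_json(counts: dict[str, int]) -> str:
    """Render the same counts with named fields (hypothetical field names)."""
    payload = {
        "files": [{"path": path, "count": n} for path, n in counts.items()],
        "total": sum(counts.values()),
    }
    return json.dumps(payload)


if __name__ == "__main__":
    counts = {"sample_docs/agents.txt": 2}
    print(format_text(counts))
    print(format_json(counts))
```

A caller reading the text form has to split lines on a colon and hope no path contains one. A caller reading the JSON form asks for total by name.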

Expected output

What a successful run looks like.

With the included sample file:

sample_docs/agents.txt: 2
total: 2

If you pass a file that does not exist, it fails clearly instead of guessing:

missing file: sample_docs/nope.txt

A successful run prints one count line for each file you passed, followed by a total line that sums them across the whole invocation. There is no --json flag in this lab's included script yet, which is exactly why the next lab exists.

Try this

Three things to try before moving on.

  1. Try different terms and files. Run the script against sample_docs/agents.txt, sample_docs/memory.txt, and sample_docs/protocols.txt. Change the search term and see how the per-file counts and total move together.
  2. Break it on purpose. Pass a filename that does not exist, then run echo $?. You should see a non-zero exit code. That is how shells, wrappers, and hosts know the tool failed.
  3. Try the case-sensitive flag. Run with --case-sensitive and compare the count for a mixed-case term with and without the flag. Notice the difference.

Concepts behind this

If you want the bigger picture for how small local tools get wrapped and extended, read Extensions.

If you want the protocol layer that eventually standardizes how hosts talk to tools, read Protocols.