
Build a workspace coordinator

Run multiple workers against one shared queue without letting them grab the same work twice.

What you'll build

A tiny coordinator that hands tasks to the right worker.

By the end of this lab you will have a simple queue file, a claim step, a complete step, and an append-only event log. One worker asks for a task kind, the coordinator gives it one unclaimed ready task, and the rest of the system can keep moving.

The point is not fancy orchestration; it is to show the shape of claim-based coordination. The coordinator prevents duplicate claims in normal sequential runs, but as you will see, it needs a real lock or database to be truly concurrency-safe.

Run it

cd ai_ecosystem_labs
python3 08-coordinator/coordinator.py claim docs-worker docs
python3 08-coordinator/coordinator.py complete docs-worker inspect-docs

Starting here? Quick setup

git clone https://github.com/BanditF/ai_ecosystem_labs
cd ai_ecosystem_labs
python3 08-coordinator/coordinator.py --help

Requires Python 3.8+. No additional packages needed for this lab.

Time guide. Setup: ~2 min. Working through it: 20–35 min because coordination only clicks once you watch the claims and handoffs.

Why this piece exists

As soon as you have multiple workers, you need a referee.

A single worker can cheat a little. It can just read a list, do the next thing, and assume nothing else is touching that state. The moment you run two workers at the same time, that falls apart. Without a claim step, both workers can see the same task and both can start doing it.

A coordinator is the boring middle layer that fixes that. It tracks which tasks are ready, which worker claimed what, and what happened over time. The real-world analog is a dispatch board in a warehouse: one pile of work, several people pulling from it, and a shared system that says who took which job.

The code

coordinator.py

Walk through it

Four things worth noticing.

claim() is the assignment gate — in a single-process model

The coordinator loads the queue, builds a set of already claimed task IDs, and only hands out a task if it is both ready and not already claimed. This implementation is a teaching model. It reads, updates, and writes queue.json in sequence — safe when workers run one at a time, but not protected against concurrent races. Real coordinators use atomic claims: file locks, database transactions, or atomic renames.
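
The gate described above can be sketched roughly as follows. This is a minimal illustration, not the lab's actual coordinator.py: the function names claim and claim_from_file and their signatures are assumptions, and only the queue.json layout (a tasks list plus a claims list) is taken from the expected output later in this lab.

```python
import json
from pathlib import Path
from typing import Optional


def claim(queue: dict, worker: str, kind: str) -> Optional[dict]:
    """Return one ready, unclaimed task of the requested kind, or None."""
    # IDs that some worker already holds.
    claimed = {c["task_id"] for c in queue["claims"]}
    for task in queue["tasks"]:
        if (
            task["kind"] == kind
            and task["status"] == "ready"
            and task["id"] not in claimed
        ):
            # Record the claim so the next caller skips this task.
            queue["claims"].append({"worker": worker, "task_id": task["id"]})
            return task
    return None


def claim_from_file(path: Path, worker: str, kind: str) -> Optional[dict]:
    """Read, update, write in sequence: fine one at a time, racy in parallel."""
    queue = json.loads(path.read_text(encoding="utf-8"))
    task = claim(queue, worker, kind)
    if task is not None:
        path.write_text(json.dumps(queue, indent=2), encoding="utf-8")
    return task
```

The race lives in claim_from_file: two processes can both read the file before either writes it back, so both see the same task as unclaimed.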

The event log is your reconstruction trail

Every claim and completion gets appended to events.jsonl with a timestamp. That means the queue shows current state, while the event log shows history. If something went wrong, you can read back through the log and see the order things happened.
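
One way to write such a log is sketched below. The helper name append_event is hypothetical, and the field names are copied from the sample events.jsonl lines in the expected output; the real coordinator.py may format its records differently.

```python
import json
from datetime import datetime, timezone
from pathlib import Path


def append_event(log_path: Path, event: str, worker: str, task_id: str) -> dict:
    """Append one JSON record as a single line and return it."""
    record = {
        "event": event,
        "worker": worker,
        "task_id": task_id,
        # UTC timestamp in the same shape as the sample log lines.
        "time": datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ"),
    }
    # Mode "a" only adds to the end of the file; earlier lines are never rewritten.
    with log_path.open("a", encoding="utf-8") as fh:
        fh.write(json.dumps(record) + "\n")
    return record
```

Because each record is one self-contained line, the log can be read back line by line without parsing the whole file as a single JSON document.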

Workers stay stateless on purpose

A worker does not need to know who else exists. It just asks for work of a given kind, does the job, and marks it complete. Scaling the system is mostly running another worker process, not teaching each worker about the rest of the fleet.
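
A stateless worker can be as small as the loop below. This is a sketch under assumptions: run_worker and its callable parameters are hypothetical names, and in this lab the claim and complete steps are CLI commands rather than Python callables. The point is only that the worker holds no knowledge of other workers.

```python
from typing import Callable, Optional


def run_worker(
    name: str,
    kind: str,
    claim: Callable[[str, str], Optional[dict]],
    complete: Callable[[str, str], None],
    handle: Callable[[dict], None],
) -> int:
    """Claim, handle, and complete tasks of one kind until none are left."""
    done = 0
    while True:
        # Ask the coordinator for work; the worker never inspects the queue itself.
        task = claim(name, kind)
        if task is None:
            return done
        handle(task)
        complete(name, task["id"])
        done += 1
```

Everything the worker needs arrives through its arguments, so starting a second worker is just a second call with a different name.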

kind is a routing rule

Tasks carry a kind like docs or validation, and a worker only claims matching kinds. That is the small version of how real pipelines route work between specialized stages. A search worker should pick up search work, not summary work.

Expected output

What a claim and completion look like.

Before, queue.json contains:
{
  "tasks": [
    {"id": "inspect-docs", "kind": "docs", "status": "ready"},
    {"id": "run-validation", "kind": "validation", "status": "ready"}
  ],
  "claims": []
}

$ python3 08-coordinator/coordinator.py claim docs-worker docs
{
  "ok": true,
  "task": {
    "id": "inspect-docs",
    "kind": "docs",
    "status": "ready"
  }
}

After the claim, queue.json includes:
"claims": [{"worker": "docs-worker", "task_id": "inspect-docs"}]

$ python3 08-coordinator/coordinator.py complete docs-worker inspect-docs
{
  "ok": true,
  "task": {
    "id": "inspect-docs",
    "kind": "docs",
    "status": "done"
  }
}

And events.jsonl gets lines like:
{"event": "claimed", "worker": "docs-worker", "task_id": "inspect-docs", "time": "2025-01-01T12:00:00Z"}
{"event": "completed", "worker": "docs-worker", "task_id": "inspect-docs", "time": "2025-01-01T12:00:03Z"}
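
Lines in this shape are easy to filter. The sketch below pulls every event for one task out of the log; trace_task is a hypothetical helper, not part of the lab's code, and it assumes each line is a JSON object with a task_id field, as in the sample above.

```python
import json
from pathlib import Path
from typing import List


def trace_task(log_path: Path, task_id: str) -> List[dict]:
    """Return every logged event for one task, in the order it was written."""
    events = []
    with log_path.open(encoding="utf-8") as fh:
        for line in fh:
            line = line.strip()
            if not line:
                continue  # tolerate blank lines
            record = json.loads(line)
            if record.get("task_id") == task_id:
                events.append(record)
    return events
```

For example, trace_task(Path("events.jsonl"), "inspect-docs") would return the claimed and completed records for that task, assuming the log lives at that path.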

Try this

Three things to try before moving on.

  1. Run two workers at once. Open two terminal windows and use different worker names. First claim different kinds and confirm each worker gets its own task. Then have both workers claim the same kind and check that the same ready task is not claimed twice. With this JSON-based implementation, two truly simultaneous claims can still race, because each process reads, updates, and writes queue.json without a lock. The exercise shows the shape; a production system would add a file lock or database transaction.
  2. Trace one task through the event log. After a run, open events.jsonl and follow a single task from claimed to completed. That is the simplest version of workflow observability.
  3. Add a new task kind. Put another task in queue.json with a new kind, then make a worker that claims that kind. You have just added another stage to the pipeline without changing the coordinator design.
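
If you want to experiment with closing the race from the first exercise, one simple option is a lock file created with an exclusive-create flag. The sketch below is an assumption-laden illustration, not something the lab's coordinator.py does: queue_lock and the lock path are hypothetical, and a crashed process would leave a stale lock file behind that has to be removed by hand.

```python
import os
import time
from contextlib import contextmanager
from pathlib import Path


@contextmanager
def queue_lock(lock_path: Path, timeout: float = 5.0, poll: float = 0.05):
    """Hold an exclusive lock file for the duration of the with-block."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            # O_CREAT | O_EXCL fails if the file already exists, so only one
            # process can create it; that process owns the lock.
            fd = os.open(str(lock_path), os.O_CREAT | os.O_EXCL | os.O_WRONLY)
            break
        except FileExistsError:
            if time.monotonic() >= deadline:
                raise TimeoutError(f"could not acquire {lock_path}")
            time.sleep(poll)
    try:
        yield
    finally:
        # Release: close our handle and delete the file so others can proceed.
        os.close(fd)
        os.unlink(str(lock_path))
```

The idea would be to wrap the whole read, update, write sequence on queue.json in a single with queue_lock(Path("queue.lock")): block, so a second worker waits until the first has written its claim.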

Concepts behind this

Read Agents for the bigger picture. Coordinators are one of the ways multi-agent systems delegate work without every worker needing to know the whole system.

Then go back to Lab 07: task graph. The task graph is the planner. This coordinator is the executor that hands those tasks to actual workers.