Run multiple workers against one shared queue without letting them grab the same work twice.
What you'll build
A tiny coordinator that hands tasks to the right worker.
By the end of this lab you will have a simple queue file, a claim step,
a complete step, and an append-only event log. A worker asks for a task
of a given kind, the coordinator hands it one ready, unclaimed task, and
the rest of the system keeps moving.
The point is not fancy orchestration. It is to show the shape of
claim-based coordination. The coordinator prevents duplicate claims when
workers run one after another, but as you will see, it needs a real lock
or a database to be truly concurrency-safe.
git clone https://github.com/BanditF/ai_ecosystem_labs
cd ai_ecosystem_labs
python3 08-coordinator/coordinator.py --help
Requires Python 3.8+. No additional packages needed for this lab.
Time guide. Setup: ~2 min. Working through it: 20–35 min, because coordination only clicks once you watch the claims and handoffs.
Why this piece exists
As soon as you have multiple workers, you need a referee.
A single worker can cheat a little. It can just read a list, do the next
thing, and assume nothing else is touching that state. The moment you run
two workers at the same time, that falls apart. Without a claim step, both
workers can see the same task and both can start doing it.
A coordinator is the boring middle layer that fixes that. It tracks which
tasks are ready, which worker claimed what, and what happened over time.
The real-world analog is a dispatch board in a warehouse: one pile of work,
several people pulling from it, and a shared system that says who took which job.
The code
coordinator.py
Walk through it
Four things worth noticing.
claim() is the assignment gate — in a single-process model
The coordinator loads the queue, builds a set of already claimed task
IDs, and only hands out a task if it is both ready and not
already claimed. This implementation is a teaching model. It reads,
updates, and writes queue.json as three separate steps with no lock in
between. That is safe when workers run one at a time, but if two workers
read the file before either one writes, both can claim the same task.
Real coordinators use atomic claims: file locks, database transactions,
or atomic renames.
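The read-check-write sequence described above might look roughly like the sketch below. It is not the lab's coordinator.py. The queue layout (a JSON list of task objects with "id", "kind", "status", and "claimed_by" keys) and the function signature are assumptions made for illustration.

```python
import json
from pathlib import Path


def claim(queue_path, worker, kind):
    """Claim one ready, unclaimed task of the given kind, or return None.

    Assumed (hypothetical) queue layout: a JSON list of dicts with
    "id", "kind", "status", and "claimed_by" keys.
    """
    path = Path(queue_path)
    tasks = json.loads(path.read_text())  # step 1: read

    # Build the set of task IDs that already belong to some worker.
    claimed_ids = {t["id"] for t in tasks if t.get("claimed_by")}

    for task in tasks:
        if (
            task["kind"] == kind
            and task["status"] == "ready"
            and task["id"] not in claimed_ids
        ):
            task["claimed_by"] = worker  # step 2: update
            # Step 3: write. Nothing stops another process from writing
            # between step 1 and here, so this is only safe sequentially.
            path.write_text(json.dumps(tasks, indent=2))
            return task
    return None
```

Called twice in a row for the same kind, the second call skips the task the first call took. Called from two processes at the same instant, both may read the file before either writes, which is the race the text warns about.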
The event log is your reconstruction trail
Every claim and completion gets appended to events.jsonl
with a timestamp. That means the queue shows current state, while the
event log shows history. If something went wrong, you can read back
through the log and see the order things happened.
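As a rough illustration of the append-only pattern, the sketch below writes one JSON object per line. The field names ("ts", "event", "task_id", "worker") and the helper name log_event are assumptions, not the lab's actual log format.

```python
import json
from datetime import datetime, timezone


def log_event(log_path, event, task_id, worker):
    """Append one event record as a single JSON line and return it.

    Field names are hypothetical; adjust them to the real log format.
    """
    record = {
        "ts": datetime.now(timezone.utc).isoformat(),
        "event": event,  # e.g. "claimed" or "completed"
        "task_id": task_id,
        "worker": worker,
    }
    # Mode "a" only ever adds to the end of the file, so earlier
    # history is never rewritten.
    with open(log_path, "a", encoding="utf-8") as f:
        f.write(json.dumps(record) + "\n")
    return record
```

Because each record is one line, the file can be read back top to bottom to recover the order of events without parsing the whole file as a single JSON document.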
Workers stay stateless on purpose
A worker does not need to know who else exists. It just asks for work
of a given kind, does the job, and marks it complete. Scaling the system
is mostly running another worker process, not teaching each worker about
the rest of the fleet.
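A stateless worker can be sketched as a loop that knows only its own name, its kind, and how to talk to the coordinator. The function run_worker and its callable parameters (claim, complete, handle) are hypothetical names chosen for this example.

```python
def run_worker(name, kind, claim, complete, handle):
    """Claim and finish tasks of one kind until none are left.

    claim(name, kind) -> task dict or None   (assumed coordinator call)
    complete(name, task_id) -> None          (assumed coordinator call)
    handle(task) -> None                     (the actual job)
    """
    finished = []
    while True:
        task = claim(name, kind)
        if task is None:
            break  # nothing ready for this kind right now
        handle(task)
        complete(name, task["id"])
        finished.append(task["id"])
    # The worker never inspects other workers or the queue itself.
    return finished
```

Nothing in the loop refers to other workers, so adding capacity under this design means starting another process with a different name.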
kind is a routing rule
Tasks carry a kind like docs or
validation, and a worker only claims matching kinds. That is
the small version of how real pipelines route work between specialized
stages. A search worker should pick up search work, not summary work.
Run two workers at once.
Open two terminal windows and use different worker names. Try claiming
different kinds at the same time, then the same kind, and check whether
any ready task gets claimed twice. With this JSON-based implementation,
two truly simultaneous claims can race, so a duplicate is possible. The
exercise shows the shape; a production system would add a file lock or
database transaction.
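One way to close that race with only the standard library is a lock file created with O_CREAT | O_EXCL, plus an atomic rename for the write. The sketch below is one possible approach, not the lab's implementation. The names file_lock and locked_claim, the ".lock" and ".tmp" suffixes, and the queue layout (a JSON list of tasks with "id", "kind", "status", "claimed_by") are all assumptions.

```python
import json
import os
import time
from contextlib import contextmanager


@contextmanager
def file_lock(lock_path, timeout=5.0, poll=0.05):
    """Hold an exclusive lock by atomically creating lock_path."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            # O_CREAT | O_EXCL fails if the file already exists, so only
            # one caller at a time can succeed in creating it.
            fd = os.open(lock_path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
            break
        except FileExistsError:
            if time.monotonic() >= deadline:
                raise TimeoutError(f"could not acquire {lock_path}")
            time.sleep(poll)
    try:
        yield
    finally:
        os.close(fd)
        os.remove(lock_path)


def locked_claim(queue_path, worker, kind):
    """Claim one ready, unclaimed task of `kind` while holding the lock."""
    with file_lock(queue_path + ".lock"):
        with open(queue_path, encoding="utf-8") as f:
            tasks = json.load(f)
        for task in tasks:
            if (
                task["kind"] == kind
                and task["status"] == "ready"
                and not task.get("claimed_by")
            ):
                task["claimed_by"] = worker
                # Write to a temp file, then swap it in atomically so a
                # reader never sees a half-written queue.
                tmp_path = queue_path + ".tmp"
                with open(tmp_path, "w", encoding="utf-8") as f:
                    json.dump(tasks, f, indent=2)
                os.replace(tmp_path, queue_path)
                return task
        return None
```

A known weakness of this sketch is that a worker that crashes while holding the lock leaves the lock file behind, and every later claim times out until someone removes it. A database transaction avoids that cleanup problem.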
Trace one task through the event log.
After a run, open events.jsonl and follow a single task from
claimed to completed. That is the simplest version
of workflow observability.
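If reading the log by eye gets tedious, a small filter can pull out one task's history. This sketch assumes each line of events.jsonl is a JSON object with a "task_id" field. That field name and the helper name trace_task are assumptions about the log format.

```python
import json


def trace_task(log_path, task_id):
    """Return the events for one task, in the order they were written."""
    events = []
    with open(log_path, encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            if not line:
                continue  # tolerate blank lines
            record = json.loads(line)
            if record.get("task_id") == task_id:
                events.append(record)
    return events
```

Because the log is append-only, file order is event order, so no sorting is needed.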
Add a new task kind.
Put another task in queue.json with a new kind,
then make a worker that claims that kind. You have just added another
stage to the pipeline without changing the coordinator design.
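Editing queue.json by hand works fine for this exercise. For a scripted version, the sketch below appends a task with a new kind. It assumes the queue is a JSON list of task objects with "id", "kind", "status", and "claimed_by" keys, and add_task is a hypothetical helper, so check the real file's layout first.

```python
import json


def add_task(queue_path, task_id, kind):
    """Append a ready, unclaimed task to the queue file and return it."""
    with open(queue_path, encoding="utf-8") as f:
        tasks = json.load(f)
    # Refuse duplicate IDs so a claim can never be ambiguous.
    if any(t["id"] == task_id for t in tasks):
        raise ValueError(f"task id already exists: {task_id}")
    task = {"id": task_id, "kind": kind, "status": "ready", "claimed_by": None}
    tasks.append(task)
    with open(queue_path, "w", encoding="utf-8") as f:
        json.dump(tasks, f, indent=2)
    return task
```

The coordinator logic does not change. A worker that asks for the new kind is the only other piece needed.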
Concepts behind this
Read Agents for the bigger picture.
Coordinators are one of the ways multi-agent systems delegate work
without every worker needing to know the whole system.
Then go back to Lab 07: task graph.
The task graph is the planner. This coordinator is the executor that
hands those tasks to actual workers.