Labs / Lab 04
Separate the recipe around the tool from the tool itself so the behavior becomes teachable and reusable.
What you'll build
By the end of this lab you will have a real YAML skill definition that tells the runner what the skill is called, which inputs matter, what procedure it follows, what output it returns, and what validation rules it should enforce.
That is the teaching point here. The skill contract is externalized into structured data: name, inputs, defaults, output shape, and validation all live in the YAML. The Python runner still owns the implementation, but another runner could implement the same contract. That separation is the design move.
git clone https://github.com/BanditF/ai_ecosystem_labs
cd ai_ecosystem_labs
cat 04-skill/term-count.skill.yaml
python3 04-skill/skill_runner.py --term agent --path 04-skill/sample.txt
Artifacts: 04-skill/term-count.skill.yaml and 04-skill/skill_runner.py
This version is meant to be read and run: the YAML carries the contract, while the runner supplies the implementation behind it.
Time guide. Setup: ~2 min. Working through it: 20–35 min, most of it spent understanding how procedure files differ from tools.
Why this piece exists
The underlying tool here is still just a term counter. Left on its own, it does one narrow thing and depends on the caller to know when to use it, how to format the call, and how to decide whether the result is useful. The YAML skill file moves that judgment into a reusable, machine-readable artifact.
This is roughly how real systems scale agent behavior. Copilot spaces, AGENTS.md files, custom instructions, and Fabric skills all play a similar role: they do not replace the underlying tool or model, but they tell the host how to apply it in context. Once that guidance lives in structured data, you can swap or improve the runner without changing the schema. Guidance of this kind is not implementation; it is closer to operational memory.
The artifacts
First look at the skill definition itself. Then look at the runner that parses that YAML and turns it into actual term-count behavior.
Walk through it
The YAML names the contract
The top-level fields give the skill an identity: name, version, and description. That makes the file more than notes for a human. It becomes structured data a host can load.
action bridges schema to code
The YAML also includes action: term_count. The runner reads
that field and dispatches to the matching implementation. If you wanted
one runner to support multiple skills, you would add another YAML file
with a different action value and teach the runner how to
handle it.
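The dispatch step described above can be sketched roughly as follows. This is a minimal illustration, not the code in skill_runner.py: the ACTIONS registry, the dispatch helper, and the choice to return the matched substrings in matches are all assumptions made for the example.

```python
import re
from typing import Callable, Dict


def term_count(text: str, term: str, case_sensitive: bool = False) -> dict:
    """Count non-overlapping occurrences of `term` in `text`."""
    flags = 0 if case_sensitive else re.IGNORECASE
    # Assumption: `matches` holds the matched substrings as they appear in the text.
    matches = [m.group(0) for m in re.finditer(re.escape(term), text, flags)]
    return {"term": term, "count": len(matches), "matches": matches}


# Hypothetical registry: maps the YAML `action` value to a Python callable.
ACTIONS: Dict[str, Callable[..., dict]] = {"term_count": term_count}


def dispatch(skill: dict, **kwargs) -> dict:
    """Look up the skill's `action` and call the matching implementation."""
    action = skill["action"]
    if action not in ACTIONS:
        raise ValueError(f"no implementation registered for action {action!r}")
    return ACTIONS[action](**kwargs)


# A parsed skill definition, reduced to the fields this sketch needs.
skill = {"name": "term_count", "action": "term_count"}
result = dispatch(skill, text="Agent meets agent.", term="agent")
print(result)
```

Supporting a second skill in this sketch would mean adding one more entry to ACTIONS and one more YAML file whose action value names it.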
The inputs block declares term, path,
and case_sensitive with types, required flags, defaults, and
descriptions. The runner uses that information to build its CLI behavior
instead of hard-coding every argument by hand.
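One way a runner could derive its CLI from the inputs block is sketched below with argparse. The INPUTS dict mirrors the YAML after parsing; the helper names (parse_bool, build_parser) and the --case_sensitive flag spelling are assumptions for illustration, and the real runner may expose its flags differently.

```python
import argparse

# The `inputs` block from the skill file, as it would look once parsed into a dict.
INPUTS = {
    "term": {"type": "string", "required": True},
    "path": {"type": "string", "required": True},
    "case_sensitive": {"type": "boolean", "default": False},
}


def parse_bool(value: str) -> bool:
    """Convert a command-line string into a boolean."""
    lowered = value.lower()
    if lowered in {"true", "1", "yes"}:
        return True
    if lowered in {"false", "0", "no"}:
        return False
    raise argparse.ArgumentTypeError(f"expected a boolean, got {value!r}")


# Hypothetical mapping from schema type names to argparse converters.
TYPE_PARSERS = {"string": str, "boolean": parse_bool}


def build_parser(inputs: dict) -> argparse.ArgumentParser:
    """Create one --flag per declared input, using its type, required, and default."""
    parser = argparse.ArgumentParser(description="Run a skill from its declared inputs.")
    for name, spec in inputs.items():
        parser.add_argument(
            f"--{name}",
            type=TYPE_PARSERS[spec["type"]],
            required=spec.get("required", False),
            default=spec.get("default"),
        )
    return parser


args = build_parser(INPUTS).parse_args(["--term", "agent", "--path", "04-skill/sample.txt"])
print(vars(args))
```

Because the parser is generated from the schema, adding an input to the YAML adds a flag without touching the argument-handling code.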
The file does not just say what to call. It also records the intended procedure, the JSON fields the skill should return, and validation rules like making sure the match list length equals the reported count.
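The validation rules can be checked mechanically once the result is a plain dict. The sketch below assumes the two rule names used in the skill file (count_non_negative and matches_length_equals_count); the validate function and its error messages are hypothetical.

```python
def validate(result: dict, rules: dict) -> list:
    """Return a list of rule violations; an empty list means the result passed."""
    errors = []
    if rules.get("count_non_negative") and result["count"] < 0:
        errors.append("count must be non-negative")
    if rules.get("matches_length_equals_count") and len(result["matches"]) != result["count"]:
        errors.append("len(matches) must equal count")
    return errors


RULES = {"count_non_negative": True, "matches_length_equals_count": True}

good = {"term": "agent", "count": 2, "matches": ["Agent", "agent"]}
bad = {"term": "agent", "count": 3, "matches": ["Agent", "agent"]}

print(validate(good, RULES))  # no violations
print(validate(bad, RULES))   # count and matches disagree
```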
Today the host is skill_runner.py. Later you could replace it
with another implementation, keep the same YAML, and preserve the same
external contract. The contract lives in data; the implementation still
lives in code.
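To make the swap concrete, here is a rough sketch of two interchangeable implementations that honor the same output shape. Neither is taken from skill_runner.py; both function names are made up for the example, and the second one assumes that lowercasing does not change the length of the text (true for ASCII input).

```python
import re


def count_with_regex(text: str, term: str, case_sensitive: bool = False) -> dict:
    """Implementation A: regular-expression search."""
    if not term:
        raise ValueError("term must be non-empty")
    flags = 0 if case_sensitive else re.IGNORECASE
    matches = [m.group(0) for m in re.finditer(re.escape(term), text, flags)]
    return {"term": term, "count": len(matches), "matches": matches}


def count_with_find(text: str, term: str, case_sensitive: bool = False) -> dict:
    """Implementation B: plain str.find loop (assumes lower() preserves length)."""
    if not term:
        raise ValueError("term must be non-empty")
    haystack = text if case_sensitive else text.lower()
    needle = term if case_sensitive else term.lower()
    matches, start = [], 0
    while True:
        index = haystack.find(needle, start)
        if index == -1:
            break
        # Slice the original text so the reported match keeps its casing.
        matches.append(text[index:index + len(needle)])
        start = index + len(needle)
    return {"term": term, "count": len(matches), "matches": matches}


sample = "Agent meets agent. AGENT!"
a = count_with_regex(sample, "agent")
b = count_with_find(sample, "agent")
print(a == b, a["count"])
```

A caller that only relies on the term, count, and matches fields cannot tell which of the two produced the result, which is the property the YAML contract is meant to protect.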
Expected output
The file is now structured YAML, so a runner can parse it directly:
name: term_count
version: "1.0"
description: Count occurrences of a term in a document.
action: term_count
inputs:
  term:
    type: string
    required: true
  path:
    type: string
    required: true
  case_sensitive:
    type: boolean
    default: false
procedure:
  - Read the file at `path`.
  - Search for `term`, respecting `case_sensitive`.
output:
  format: json
  fields:
    term: string
    count: integer
    matches: array
validation:
  count_non_negative: true
  matches_length_equals_count: true
name, version, and description identify the skill. action is the bridge into the runner: it tells the host which implementation to dispatch to. inputs defines the arguments and defaults. procedure captures the intended behavior. output describes the JSON shape the runner should return. validation gives the runner rules it can check automatically. That is more rigid than prose, which is exactly why it is useful.
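The output block is rigid enough to check a result against it. The following sketch assumes the fields mapping has already been parsed into a dict; the PYTHON_TYPES table and the check_output helper are hypothetical names, and the mapping from schema type names to Python types is an assumption.

```python
# The `output.fields` mapping from the skill file, as a parsed dict.
OUTPUT_FIELDS = {"term": "string", "count": "integer", "matches": "array"}

# Assumed correspondence between schema type names and Python types.
PYTHON_TYPES = {"string": str, "integer": int, "array": list}


def check_output(result: dict, fields: dict) -> list:
    """Return a list of problems where `result` departs from the declared fields."""
    problems = []
    for name, type_name in fields.items():
        if name not in result:
            problems.append(f"missing field: {name}")
            continue
        value = result[name]
        expected = PYTHON_TYPES[type_name]
        # bool is a subclass of int in Python, so reject it explicitly for integers.
        wrong_type = not isinstance(value, expected)
        if type_name == "integer" and isinstance(value, bool):
            wrong_type = True
        if wrong_type:
            problems.append(f"{name}: expected {type_name}, got {type(value).__name__}")
    return problems


print(check_output({"term": "agent", "count": 2, "matches": ["Agent", "agent"]}, OUTPUT_FIELDS))
print(check_output({"term": "agent", "count": "2"}, OUTPUT_FIELDS))
```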
Try this
Add max_matches to term-count.skill.yaml, update the runner to read it, and see how far you can push the idea that the schema drives behavior.
Flip case_sensitive. Change default: false to true, run python3 04-skill/skill_runner.py --term agent --path 04-skill/sample.txt, and notice how the output changes when Agent and agent stop counting as the same thing.
Write a second .skill.yaml file. Keep the same runner idea, but define a different operation, give it a new action value, and wire the runner to it. The point is to feel how the schema can stay stable while the implementation behind it changes.
Read Extensions: skills, hooks, and wrappers for the bigger picture on why this layer exists and where it fits.
Then read Protocols for the complementary layer: protocols define how tools are exposed, while skills describe how to use them well.