Lab 04: YAML Skill — AI Tooling Field Guide

What you'll build

An included YAML skill schema plus a runner that actually uses it.

By the end of this lab you will have a real YAML skill definition that tells the runner what the skill is called, which inputs matter, what procedure it follows, what output it returns, and what validation rules it should enforce.

That is the teaching point here. The skill contract is externalized into structured data: name, inputs, defaults, output shape, and validation all live in the YAML. The Python runner still owns the implementation, but another runner could implement the same contract. That separation is the design move.

Open it

cd ai_ecosystem_labs
cat 04-skill/term-count.skill.yaml
python3 04-skill/skill_runner.py --term agent --path 04-skill/sample.txt

Artifacts: 04-skill/term-count.skill.yaml and 04-skill/skill_runner.py

Starting here? Quick setup

git clone https://github.com/BanditF/ai_ecosystem_labs
cd ai_ecosystem_labs
python3 04-skill/skill_runner.py --term agent --path 04-skill/sample.txt

This version is meant to be read and run: the YAML carries the contract, while the runner supplies the implementation behind it.

Time guide. Setup: ~2 min. Working through it: 20–35 min, most of it spent understanding how procedure files differ from tools.

Why this piece exists

A tool exposes capability. A skill schema exposes judgment.

The underlying tool here is still just a term counter. Left on its own, it does one narrow thing and depends on the caller to know when to use it, how to format the call, and how to decide whether the result is useful. The YAML skill file moves that judgment into a reusable, machine-readable artifact.

This is roughly how real systems scale agent behavior. Copilot spaces, AGENTS.md files, custom instructions, and Fabric skills all play a similar role: they do not replace the underlying tool or model, but they tell the host how to apply it in context. Once that guidance lives in structured data, you can swap or improve the runner without changing the schema. The guidance is not implementation; it is closer to operational memory.

The artifacts

The YAML schema and the runner that reads it.

First look at the skill definition itself. Then look at the runner that parses that YAML and turns it into actual term-count behavior.

Walk through it

Five things worth noticing.

The YAML names the contract

The top-level fields give the skill an identity: name, version, and description. That makes the file more than notes for a human. It becomes structured data a host can load.

`action` bridges schema to code

The YAML also includes action: term_count. The runner reads that field and dispatches to the matching implementation. If you wanted one runner to support multiple skills, you would add another YAML file with a different action value and teach the runner how to handle it.
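
The dispatch step described above can be sketched in a few lines of Python. This is a minimal illustration, not the code in skill_runner.py: the ACTIONS table, the dispatch helper, and the term_count signature are all hypothetical names, and the skill dictionary stands in for whatever the runner gets back from parsing the YAML.

```python
# Hypothetical implementation behind the `term_count` action.
def term_count(term, text, case_sensitive=False):
    """Count non-overlapping occurrences of `term` in `text`."""
    if not case_sensitive:
        term, text = term.lower(), text.lower()
    return text.count(term)


# Hypothetical dispatch table: YAML `action` value -> Python callable.
ACTIONS = {"term_count": term_count}


def dispatch(skill, **kwargs):
    """Look up the skill's `action` and call the matching implementation."""
    action = skill["action"]
    try:
        handler = ACTIONS[action]
    except KeyError:
        raise ValueError(f"unsupported action: {action!r}") from None
    return handler(**kwargs)


# Stands in for the parsed YAML (assumption: the runner sees a plain dict).
skill = {"name": "term_count", "action": "term_count"}
print(dispatch(skill, term="agent", text="Agent meets agent."))  # prints 2
```

Supporting a second skill under this design would mean adding one more entry to the table, which is why the action field is worth keeping separate from name.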

Inputs are machine-readable

The inputs block declares term, path, and case_sensitive with types, required flags, defaults, and descriptions. The runner uses that information to build its CLI behavior instead of hard-coding every argument by hand.
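
One way a runner could derive its CLI from the inputs block is sketched below with the standard argparse module. Everything here is an assumption about how such a runner might work rather than a copy of skill_runner.py: the INPUTS dictionary stands in for the parsed YAML, build_parser is a hypothetical helper, and the --case-sensitive / --no-case-sensitive flag names are illustrative.

```python
import argparse

# Stands in for the parsed `inputs` block (assumption: a plain dict).
INPUTS = {
    "term": {"type": "string", "required": True},
    "path": {"type": "string", "required": True},
    "case_sensitive": {"type": "boolean", "default": False},
}


def build_parser(inputs):
    """Create one CLI option per declared input."""
    parser = argparse.ArgumentParser(description="Schema-driven skill CLI (sketch)")
    for name, spec in inputs.items():
        flag = "--" + name.replace("_", "-")
        if spec["type"] == "boolean":
            # BooleanOptionalAction (Python 3.9+) adds both --flag and --no-flag,
            # so the schema default can be overridden in either direction.
            parser.add_argument(
                flag,
                dest=name,
                action=argparse.BooleanOptionalAction,
                default=spec.get("default", False),
            )
        else:
            # Only string inputs are handled in this sketch.
            parser.add_argument(
                flag,
                dest=name,
                type=str,
                required=spec.get("required", False),
                default=spec.get("default"),
            )
    return parser


# An explicit argv list keeps the example independent of the real command line.
args = build_parser(INPUTS).parse_args(
    ["--term", "agent", "--path", "04-skill/sample.txt"]
)
print(vars(args))
```

Because the parser is generated from data, adding an input to the schema adds a flag without touching the argument-handling code.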

Procedure, output, and validation travel together

The file does not just say what to call. It also records the intended procedure, the JSON fields the skill should return, and validation rules like making sure the match list length equals the reported count.
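
To make the validation idea concrete, here is a small sketch of a checker for the two rules the skill file declares, count_non_negative and matches_length_equals_count. The validate function and the sample results are hypothetical, and the sketch assumes matches is a list of matched strings, which the schema itself does not specify.

```python
def validate(result, rules):
    """Return a list of validation failures; an empty list means the result passed."""
    errors = []
    if rules.get("count_non_negative") and result["count"] < 0:
        errors.append("count must be non-negative")
    if (
        rules.get("matches_length_equals_count")
        and len(result["matches"]) != result["count"]
    ):
        errors.append("len(matches) must equal count")
    return errors


# Stands in for the parsed `validation` block.
RULES = {"count_non_negative": True, "matches_length_equals_count": True}

good = {"term": "agent", "count": 2, "matches": ["Agent", "agent"]}
bad = {"term": "agent", "count": 3, "matches": ["Agent", "agent"]}

print(validate(good, RULES))  # prints []
print(validate(bad, RULES))   # prints ['len(matches) must equal count']
```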

You can swap runners without rewriting the schema

Today the host is skill_runner.py. Later you could replace it with another implementation, keep the same YAML, and preserve the same external contract. The contract lives in data; the implementation still lives in code.

Expected output

What the YAML skill contains.

The file is structured YAML, so a runner can parse it directly:

name: term_count
version: "1.0"
description: Count occurrences of a term in a document.
action: term_count
inputs:
  term:
    type: string
    required: true
  path:
    type: string
    required: true
  case_sensitive:
    type: boolean
    default: false
procedure:
  - Read the file at `path`.
  - Search for `term`, respecting `case_sensitive`.
output:
  format: json
  fields:
    term: string
    count: integer
    matches: array
validation:
  count_non_negative: true
  matches_length_equals_count: true

name, version, and description identify the skill. action is the bridge into the runner: it tells the host which implementation to dispatch to. inputs defines the arguments and defaults. procedure captures the intended behavior. output describes the JSON shape the runner should return. validation gives the runner rules it can check automatically. That is more rigid than prose, which is exactly why it is useful.
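
As an illustration of what "more rigid than prose" buys you, the sketch below produces a result in the declared output shape and then checks it against the output.fields map. It is an assumption-laden example: run_term_count, conforms, and TYPE_MAP are hypothetical names, the text is passed in directly instead of being read from path, and matches is assumed to hold the matched substrings.

```python
import json
import re

# Assumed mapping from schema type names to Python types.
TYPE_MAP = {"string": str, "integer": int, "array": list}

# Stands in for the parsed `output.fields` block.
FIELDS = {"term": "string", "count": "integer", "matches": "array"}


def run_term_count(term, text, case_sensitive=False):
    """Search already-loaded text for `term` and build the declared output."""
    flags = 0 if case_sensitive else re.IGNORECASE
    matches = re.findall(re.escape(term), text, flags)
    return {"term": term, "count": len(matches), "matches": matches}


def conforms(result, fields):
    """True if `result` has exactly the declared fields with the declared types."""
    return set(result) == set(fields) and all(
        isinstance(result[name], TYPE_MAP[type_name])
        for name, type_name in fields.items()
    )


result = run_term_count("agent", "An Agent calls another agent.")
print(json.dumps(result))
# prints {"term": "agent", "count": 2, "matches": ["Agent", "agent"]}
print(conforms(result, FIELDS))  # prints True
```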

Try this

Three things to try before moving on.

Add a new YAML input. Add something like max_matches to term-count.skill.yaml, update the runner to read it, and see how far you can push the idea that the schema drives behavior.
Flip the default on case_sensitive. Change default: false to true, run python3 04-skill/skill_runner.py --term agent --path 04-skill/sample.txt, and notice how the output changes when Agent and agent stop counting as the same thing.
Write a second .skill.yaml file. Keep the same runner idea, but define a different operation, give it a new action value, and wire the runner to it. The point is to feel how the schema format can stay stable while the implementation behind it changes.
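
For the first two exercises, it may help to see how schema defaults could be merged with caller-supplied values. This is only a sketch under stated assumptions: resolve_inputs is a hypothetical helper, SCHEMA stands in for the parsed inputs block, and the max_matches entry with its default of 10 is invented for illustration.

```python
import copy

SCHEMA = {
    "term": {"type": "string", "required": True},
    "path": {"type": "string", "required": True},
    "case_sensitive": {"type": "boolean", "default": False},
    "max_matches": {"type": "integer", "default": 10},  # hypothetical new input
}


def resolve_inputs(schema, provided):
    """Merge caller-provided values with schema defaults."""
    unknown = set(provided) - set(schema)
    if unknown:
        raise ValueError(f"unknown inputs: {sorted(unknown)}")
    resolved = {}
    for name, spec in schema.items():
        if name in provided:
            resolved[name] = provided[name]
        elif "default" in spec:
            resolved[name] = spec["default"]
        elif spec.get("required"):
            raise ValueError(f"missing required input: {name}")
    return resolved


call = {"term": "agent", "path": "04-skill/sample.txt"}
print(resolve_inputs(SCHEMA, call))

# Flipping the default in the schema changes behavior without touching the call.
flipped = copy.deepcopy(SCHEMA)
flipped["case_sensitive"]["default"] = True
print(resolve_inputs(flipped, call)["case_sensitive"])  # prints True
```

One thing to watch if you try max_matches: truncating the match list would make matches_length_equals_count fail whenever the true count exceeds the cap, so either the rule or the meaning of count would need revisiting.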

Concepts behind this

Read "Extensions: skills, hooks, and wrappers" for the bigger picture on why this layer exists and where it fits.

Then read "Protocols" for the complementary layer: protocols define how tools are exposed, while skills describe how to use them well.

Next lab

Lab 05: add a hook or lifecycle automation layer →