Labs / Lab 00

Start with model access

Turn a model endpoint into one boring local command so the rest of the stack has a concrete surface to wrap.

What you'll build

A CLI that wraps any model endpoint.

By the end of this lab you will have a single command that sends a prompt to a model (or a stand-in) and returns structured JSON. That surface is what every other lab in this sequence wraps, validates, and extends.

The script uses a toy response by default so you can focus on interface shape without needing an API key. Swapping in a real provider is one environment variable.

Run it

cd ai_ecosystem_labs
python3 00-model-access/model_cli.py "Hello from the lab" --json
Starting here? Quick setup
git clone https://github.com/BanditF/ai_ecosystem_labs
cd ai_ecosystem_labs
python3 00-model-access/model_cli.py "Hello from the lab" --json

Requires Python 3.8+. No additional packages needed for this lab.

Time guide. Setup: ~2 min. Working through it: 15–25 min if you are mostly focused on the interface shape.

Why this piece exists

The rest of the stack needs something stable to call.

Without this layer, every tool, hook, and agent loop in your stack has to re-decide how to reach a model: which provider, which API shape, which credentials. That decision creeps into code that shouldn't care about it.

A single CLI surface fixes this. Callers send a prompt and get a response. The details — endpoint, API key, model name — live in environment variables and stay out of the tools themselves. This is exactly what Ollama does at a larger scale: one local command that anything can call.

The code

model_cli.py

Walk through it

Four things worth noticing.

call_model() is a seam, not a detail

The function signature — prompt, endpoint, api_key, model — is the stable contract. Right now the body returns a toy string. To connect a real model, you replace only the body, not the signature. Everything that calls call_model() keeps working.

Environment variables keep secrets out of code

os.getenv("TOY_MODEL_ENDPOINT", "local://toy-model") means the script works with no setup, but a real deployment just sets an env var — no code change. This is the pattern every real model client uses. Credentials never get committed.

--json makes output machine-readable

Without --json, the script prints a human string. With it, it prints a JSON object a tool, agent, or test can parse reliably. The shape — ok, endpoint, prompt, response — stays the same every run. That predictability is what later labs depend on.

argparse gives you stable flags for free

Using argparse instead of reading sys.argv directly gives you --help, type checking, and consistent error messages at no cost. A tool an AI calls should always have a stable, documented flag interface — not "it works if you pass things in the right order."

Expected output

What a successful run looks like.

Without --json:

Toy model response to: Hello from the lab

With --json:

{
  "ok": true,
  "endpoint": "local://toy-model",
  "api_key": "missing",
  "model": "toy-v1",
  "prompt": "Hello from the lab",
  "response": "Toy model response to: Hello from the lab"
}

If you see this shape, the lab is working. The api_key field shows "missing" by default — that is expected. It will show "set" once you point it at a real provider.

Try this

Three things to try before moving on.

  1. Change the model name. Run with --model gpt-4o and check the JSON output. The model field changes, but nothing else does. This is the point — callers can request different models without the rest of the stack caring.
  2. Set a fake endpoint via environment variable. Run TOY_MODEL_ENDPOINT=myhost://custom python3 00-model-access/model_cli.py "test" --json. The endpoint field in the output reflects the env var. No code change needed. This is how you would point the script at Ollama, OpenAI, or any compatible endpoint.
  3. Try the real-call path. Set TOY_MODEL_API_KEY to your OpenAI-compatible key and run the script again. The output shape is identical — only the source of the answer changes. Notice the script does not need to know which provider you are using as long as the response format matches.

What you just built

The plumbing that everything else stands on.

You now have a local command that turns a prompt into a structured response. It does not matter whether that response comes from a toy function, a local Ollama instance, or the OpenAI API — the interface is the same. That is the whole point of this lab: fix the surface so the layers above it do not have to care about the details below.

In production systems, this layer is what Ollama, LiteLLM, and provider SDKs provide. You just built the concept from scratch. Lab 01 adds the first real capability on top of it.

Concepts behind this

The full decision framework for model access — hosted APIs, aggregators, local runners, and when to use each — lives on the model access concept page.

If you are connecting a real provider key, read API key security first.