Model access is the first architecture decision.

Before tools, agents, protocols, and hooks matter, you need a way to talk to a model. That can mean a subscription product, an API key, a managed model platform, an aggregator, a local server, or a downloaded model artifact.

The short version

A model is a program. Model access is everything between you and that program.

A language model does one thing: it takes text in and returns text out. Every tool, agent, protocol, and wrapper in the AI ecosystem is built on top of that basic exchange.

"Model access" is the first question to sort out because the answer shapes everything else: where the model runs, how you call it, and what comes bundled with that path — billing, rate limits, privacy constraints, governance. These decisions feel like plumbing. They are. But the plumbing determines your costs, your privacy exposure, and what you can actually build on top.

That's the core idea Ready to see the whole ecosystem at once? The Stack page's coverage map shows how model access connects to every other layer — come back here once the vocabulary clicks.

Access boundary

Do not collapse all model access into "the model."

A chat subscription, a provider API, a managed platform, a model router, a local runtime, and a model file are different things. They can all lead to a text response, but they create different costs, privacy boundaries, portability constraints, and integration surfaces.

The useful question is not just "which model is best?" It is "where does the model run, how do I call it, and what else comes bundled with that access path?"

Three handles

  1. What are you paying for?
  2. What endpoint or surface do you call?
  3. Which parts can you replace?

The layers people often mix together

Hosted subscriptions

A human-facing product gives you a chat, IDE, CLI, or assistant surface. You are usually buying the experience, account layer, usage policy, and product workflow more than direct model control.

Examples: ChatGPT, Claude, GitHub Copilot, Gemini, Cursor.

Direct provider APIs

A hosted provider gives you programmatic access through API keys, SDKs, rate limits, model names, and usage billing. This is the most common starting point for building your own tools around a model.

Examples: OpenAI, Anthropic, Google, Mistral, Cohere, xAI.

Need credential handling guidance? See API key security.

Aggregate providers and routers

A broker exposes one interface across many model providers. That can make experimentation easier, but it adds another trust, pricing, and routing boundary to understand.

Examples: OpenRouter, LiteLLM, provider comparison layers.

Local model artifacts

The artifact is the actual model checkpoint, weights, or quantized file. It has its own license, size, architecture, context window, hardware needs, and fit for chat, coding, embeddings, or tool use.

Examples: small instruct models, coding models, embedding models, quantized GGUF files.

See also: which part does what.

Client surfaces

A client is what you actually use: chat UI, CLI, SDK, wrapper, IDE extension, agent host, or notebook. It may hide whether the backing model is hosted, local, direct, or routed.

Examples: SDK calls, terminal chat, IDE assistants, local agent hosts.

How it got this shape

The API format is a default that stuck, not a standard that was designed.

Most provider APIs use the same basic shape: a list of messages, each labeled with a role (system, user, assistant) and some content. OpenAI introduced this format in early 2023 with gpt-3.5-turbo, the first widely-accessible model fast enough to feel conversational.

It became the de facto standard not by any official agreement, but because OpenAI went first at scale. When Anthropic, Mistral, and local runtimes like Ollama needed to interoperate, they all built OpenAI-compatible endpoints. Routers like LiteLLM exist largely because this format became the common interface worth bridging to.

This is why a tool written against one provider usually works against another with minimal changes. It is not a formal standard. It is a default that stuck — and understanding that helps explain why the whole ecosystem converged around one shape so quickly.

Common confusion

A subscription is not the same thing as API access.

Access path What you usually get What it is good for Where tooling friction appears
Subscription product App, chat, IDE, account features, usage limits, saved history, product-specific tools. Human workflows, writing, coding assistance, product-integrated context, team adoption. Automation may be limited to product-supported extension points.
Provider API Programmatic endpoint, SDKs, API keys, usage billing, model/version selection. Custom CLIs, agents, evals, internal tools, repeatable workflows, backend services. You own auth handling, retries, logging, cost controls, and safety boundaries.
Local endpoint Runtime process, downloaded model, local address, hardware-dependent performance. Offline experiments, privacy-sensitive prototypes, learning, model swapping, and low-cost testing. You own installation, updates, speed, memory use, and model compatibility.

Common first access paths

Choose the first path that matches what you actually want.

Most beginners do not need every option at once. Pick the path that matches your immediate goal, then add complexity later.

If you already have a hosted CLI agent product but no API key, that still counts as a model surface. Treat it like a subscription-based host, then use the labs bootstrap step to decide where to jump in.

If you are entering through an enterprise AI cloud, see managed model platforms before choosing where to jump into the labs.

If you are about to put a provider key into a shell or tool host, read API key security first.

Go deeper Match the path to the goal — detailed decision matrix

Match the path to the goal

Start from the constraint, not from the hype.

If you want... Start with... Because...
The fastest path to using an assistant Subscription product A polished product removes setup and lets you learn the workflow first.
A hosted CLI agent surface, but no API key Subscription product plus CLI host You already have a usable model-facing interface, even if you do not control a raw API. For the labs, treat that surface as your starting point and skip ahead accordingly.
A scriptable foundation for custom tools or agents Direct provider API You get a stable programmatic surface you can wrap, log, test, and automate.
An enterprise cloud boundary with deployment and policy controls Managed model platform You may need model access plus deployment, org policy, identity, and managed evaluation features in one place.
Easy model comparison across providers Aggregate provider or router You can hold prompts and evals constant while changing the backing model.
Offline experiments or maximum local control Local hosting runtime plus a small model artifact You own the runtime and endpoint, but you also own the setup and performance tradeoffs.
Go deeper Constraints that decide the access path

Constraints that decide the access path

The right access path depends on the constraint.

Privacy boundary

Where do prompts, files, outputs, logs, and embeddings go?

Cost model

Are you paying per seat, per token, through a platform or router, or through local hardware?

Latency and reliability

Is the bottleneck network, provider queueing, local CPU/GPU speed, or model size?

Capability fit

Does the model handle chat, code, tool calls, long context, embeddings, or structured output well enough?

Portability

Can you swap models without rewriting your prompts, tool schemas, evals, client code, or platform-specific deployment assumptions?

License and terms

What are you allowed to run, modify, redistribute, log, fine-tune, or use commercially?

Go deeper Local-first path — building from the bottom up

Local-first path

If you want the full stack, start from one tiny local runtime.

The current labs use a toy model interface so the tooling boundaries stay easy to see. A deeper track can start from the real beginning: pull a small open model, host it locally, expose an endpoint, and then rebuild the same tooling layers on top.

1

Choose the model artifact. Check license, size, hardware needs, context length, and task fit before downloading anything.

2

Run a local host. Use a runtime that can expose a local endpoint, then prove a simple prompt works.

3

Wrap the endpoint. Build the smallest CLI around that endpoint, then add tools, JSON, protocols, hooks, memory, and evals.

Next move

Turn the chosen surface into one boring interface.

1

Pick the access path. Subscription, API, managed platform, router, or local host.

2

Give it a stable interface. Turn it into one boring command, request, or CLI surface that always accepts the same kind of input.

3

Then build upward. Continue into the labs, the stack, and the protocols page once the model surface is clear.