Prompt engineering is not a weird side trick. It is the main interface between human intent and model behavior, and small changes in prompt structure often matter more than people expect.
What prompt engineering actually is
Prompt engineering is not a hack or a workaround. It is the practical craft of turning what you want into instructions a model can reliably follow. Plain English version: the prompt is how you steer the system. Technical version: the prompt is a program-like control surface that shapes role, context, constraints, and output behavior.
A useful mental model is that a prompt is a program. It may be written in natural language instead of Python, but it still sets inputs, rules, examples, and expected output shape. The quality of your prompt often determines the quality of your result more than most other variables.
The practical takeaway
If a model feels flaky, the first question is often not “is the model bad?” but “did I give it a clear enough program to run?” That is not the whole story, but it is usually the right starting point.
Section 1
The prompt anatomy
Most chat systems are not just taking one text box and winging it. They are assembling a structured conversation.
System prompt
Plain English: The hidden setup instructions that define the job.
Technical view: The system prompt sets the model's role, constraints, output format, and persona. It is processed before the conversation and most models treat it as the highest-priority instruction.
Few-shot examples
Plain English: Show the model what good looks like.
Technical view: Concrete input/output examples inside the prompt dramatically improve format consistency and task framing. Show, don't tell is usually better than abstract description.
User turn
Plain English: The actual thing the person is asking for right now.
Technical view: This is the live request the model is meant to answer in the context of the rest of the conversation state.
Assistant turn
Plain English: The model's earlier replies become part of the next prompt.
Technical view: In multi-turn conversations, prior assistant messages are included as history, which means the model is conditioning on what it has already said.
Section 2
Zero-shot vs few-shot vs chain-of-thought
These are different prompt patterns, not different kinds of models.
Zero-shot
Plain English: Just ask the question.
Technical view: Zero-shot prompting works fine for simple tasks, but it breaks down faster when the task is ambiguous, format-sensitive, or multi-step.
Few-shot
Plain English: Give the model a couple of examples first.
Technical view: Supplying two to five examples before the real question costs tokens, but it is often the most reliable way to enforce a pattern or output shape.
Chain-of-thought
Plain English: Ask the model to reason step by step before answering.
Technical view: Chain-of-thought prompting can dramatically improve accuracy on math, logic, and multi-step tasks. Prompts like Think step by step or Let's reason through this are the classic versions.
There is also zero-shot CoT, which is basically chain-of-thought without examples: just add something like Think step by step. It is surprisingly effective for how simple it is.
Section 3
Structured output prompting
Getting reliable JSON out of a model is a real engineering problem, not just a wording problem.
1. Prompt-based
Plain English: Ask for JSON directly.
Technical view: Prompts like Respond only with valid JSON matching this schema are fast and zero-setup, but fragile when the task gets messy.
2. Function calling / tool use
Plain English: Give the model a schema and make it fill that shape.
Technical view: This is the API-standard approach in most modern model platforms. It is much more reliable because the output channel is structured on purpose.
3. Constrained generation
Plain English: Restrict what tokens the model can emit.
Technical view: Libraries like outlines or instructor can enforce token-level constraints. This is the most reliable option, but it depends on backend support and adds more implementation work.
The tradeoff is pretty clean. Prompt-based output is the easiest thing to try. Function calling is the standard production choice. Constrained generation is the nuclear option when you need harder guarantees than prompt wording alone can give you.
Section 4
Prompt injection and adversarial inputs
The moment untrusted user input gets mixed into a prompt, you have a security boundary problem.
The classic example is a user pasting something like Ignore all previous instructions and... into a field that later gets passed to the model. If that text lands inside the prompt without clear separation, the model may treat the malicious input like legitimate instructions.
Real mitigations are more boring than the demos, but they matter: separate untrusted input clearly, label it as data rather than instructions, validate outputs, and do not give the model tools or capabilities it should never have in the first place.
Section 5
Practical patterns
A few habits go a long way.
Use delimiters
Separate prompt sections with things like <document>, ---, or triple backticks so the model can tell instructions from data.
Specify output shape
Be explicit about output format, not just content. If you want bullets, JSON, a table, or a specific schema, say so clearly.
Do not overstuff context
Longer is not always better. Extra context can dilute the real task and make performance worse instead of better.
Use temperature on purpose
Roughly speaking, temperature controls creativity versus determinism. Use 0 for extraction and other factual tasks; raise it when you want more variation.
Just as important: test prompts systematically. If the prompt matters, then prompt changes should be evaluated the same way code changes are. That is the bridge into evals work later on.
Section 6
What this doesn't fix
Prompt engineering is powerful, but bounded.
Prompting cannot make a model know facts it was never trained on or never retrieved. It cannot turn a weak model into a reliable calculator. It also will not guarantee stable output quality at scale without proper evals and system design around it.
In other words: prompt engineering helps you use a model well. It does not repeal the model's underlying limits.
Lab connection
Try the patterns instead of only reading about them.
Lab 14 walks through these patterns hands-on with working Python scripts.