Phroneses.com - Foundations

Designing Prompts for Modern AI Systems

2026-05-11T00:00:00+00:00

AI in 2026 demands more from you than simple instructions. Modern systems can plan, critique, revise, and work across long context windows. They are no longer moved by vague guidance such as "be clear" or "add detail". They need a defined environment to operate within.

Modern prompting is about shaping the system, not decorating the request. When you set the frame, the workflow, and the output contract, the model gains the structure it needs to behave predictably. You do this once, and the benefits carry through every answer. You set the constraints. The model works inside them on your behalf.

Do this once and your AI output becomes steadier, more structured, and quicker to work with. When you tell the AI how to respond, you apply guardrails for the system to work within: guardrails set by you, not by the AI.

1. Start with the system, not the request

AI has advanced quickly. Its answers can now be broad, deep, and varied. To keep that power under control, you begin by defining the frame the model must work within. This frame sets the role, the tone, the limits, and the rules for handling uncertainty. It is the foundation the rest of the prompt stands on.

Most prompt failures do not come from unclear questions. They come from the model having no stable footing. Without a frame, the AI will guess at how formal to be, how cautious to be, and how much structure to use. Those guesses shift from run to run, which leads to drift and inconsistency.

A system frame removes that guesswork. It tells the model what it is, how it should behave, and what matters most. It defines what is in scope, what is out of scope, and how to respond when the request touches the edges. With this in place, the rest of the prompt becomes lighter and more reliable.

The frame does not need flourish. It needs clarity, discipline, and a steady tone. With that foundation, the model behaves less like a pattern generator and more like a tool working inside a defined brief.

In practice, the system frame is the architecture behind the output. It states the role, the rules, and your expectations, as in this example:

SYSTEM FRAME
You are an analytical engine. You work with steady reasoning, cautious claims, and plain structure. When the request is unclear, you pause and ask for what is missing. You avoid invention and keep within the boundaries set for you.

TASK
Summarise the key points from the supplied text in three short sections.

OUTPUT CONTRACT
Produce:

Context
Reasoning
Conclusion

Rules:
If the request is ambiguous, list interpretations and ask for clarification.
If information is missing, state what is missing before answering.
Do not invent facts.
Keep the final answer concise and structured.

WORKFLOW

Identify assumptions.
Plan the answer.
Produce the answer.
Critique it for clarity and accuracy.
Produce a revised final version.

The frame opens with "You are an analytical engine" because that line gives the model a defined role to work from. Without a role, the model guesses at how formal to be, how cautious to be, and how much structure to use. A simple line such as "You are an analytical engine" sets the tone and keeps the behaviour plain, steady, and predictable. It avoids personality, avoids flourish, and keeps the work focused on reasoning rather than style.

If you do not supply the role, the AI will choose one for itself, and that choice will vary from run to run, creating work for you.

The section Having the AI Manage the Prompt Template explains how to reduce your own effort by having the AI hold and apply the prompt for you.
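The parts of the example above can also be kept as data and assembled on demand. What follows is a minimal sketch in Python, not a fixed method: the class name `PromptFrame`, its fields, and the `render` method are all hypothetical, and the frame text is a shortened version of the example.

```python
from dataclasses import dataclass, field


@dataclass
class PromptFrame:
    """Hypothetical container for the parts of a framed prompt."""

    system_frame: str
    task: str
    sections: list[str]
    rules: list[str] = field(default_factory=list)
    workflow: list[str] = field(default_factory=list)

    def render(self) -> str:
        """Join the parts into one plain-text prompt, one labelled block each."""
        blocks = [
            "SYSTEM FRAME\n" + self.system_frame,
            "TASK\n" + self.task,
            "OUTPUT CONTRACT\nProduce:\n" + "\n".join(self.sections),
        ]
        if self.rules:
            blocks.append("Rules:\n" + "\n".join(self.rules))
        if self.workflow:
            # Number the workflow steps so their order is explicit.
            steps = [f"{n}. {step}" for n, step in enumerate(self.workflow, 1)]
            blocks.append("WORKFLOW\n" + "\n".join(steps))
        return "\n\n".join(blocks)


frame = PromptFrame(
    system_frame="You are an analytical engine. You avoid invention and keep "
    "within the boundaries set for you.",
    task="Summarise the key points from the supplied text in three short sections.",
    sections=["Context", "Reasoning", "Conclusion"],
    rules=["If information is missing, state what is missing before answering.",
           "Do not invent facts."],
    workflow=["Identify assumptions.", "Plan the answer.", "Produce the answer."],
)
prompt = frame.render()
print(prompt)
```

Keeping the frame as data means the task can change from request to request while the role, rules, and workflow stay fixed.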

2. Define the output contract

Modern models behave more reliably when you specify the shape of the answer: structure, scope, exclusions, formatting, and the rules for handling missing or ambiguous information. This is far stronger than broad guidance such as "be concise".

When you define the output contract, you are not telling the model what to think. You are telling it what form the answer must take. This removes a large amount of guesswork. Modern systems have wide latitude in how they respond, and if you do not narrow that down, they will choose a structure for you. That choice will vary from run to run, which means more tidying and more checking on your side.

An output contract fixes the frame. It tells the model which sections to produce, how to handle gaps, and how to behave when the request is unclear. It also removes the temptation to drift into style, flourish, or padding. You are giving the model the rails to run on.

A good contract does four things. It sets the structure. It sets the limits. It sets the rules for uncertainty. And it sets the standard for brevity. Once these are in place, the model has far less room to wander. You get answers that are steadier, easier to scan, easier to compare, and easier to work with.

The contract also acts as a safeguard. By telling the model what to do when information is missing, you prevent it from filling the gaps with invention. By telling it how to behave when the request is ambiguous, you prevent it from guessing. These two points alone remove a large share of common errors.

In short, the output contract is the quiet discipline behind the work. It keeps the model inside the brief, keeps the structure predictable, and keeps the answer focused on what you asked for rather than what the model feels like producing.
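Because a contract fixes the shape of the answer, that shape can be checked mechanically. The sketch below is one possible check, written in Python. It assumes each section heading sits on its own line, and the function name `check_contract` is hypothetical.

```python
REQUIRED_SECTIONS = ["Context", "Reasoning", "Conclusion"]


def check_contract(answer: str, sections: list[str] = REQUIRED_SECTIONS) -> list[str]:
    """Return a list of problems; an empty list means the shape is as agreed."""
    lines = [line.strip() for line in answer.splitlines()]
    problems = []
    last_index = -1
    for name in sections:
        if name not in lines:
            problems.append(f"missing section: {name}")
            continue
        index = lines.index(name)
        # A heading that appears before an earlier required one is out of order.
        if index < last_index:
            problems.append(f"section out of order: {name}")
        last_index = max(last_index, index)
    return problems


good = "Context\nA.\nReasoning\nB.\nConclusion\nC."
bad = "Reasoning\nB.\nContext\nA."
print(check_contract(good))
print(check_contract(bad))
```

A check like this only confirms the form of the answer. It says nothing about whether the content is right.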

3. Use decomposition as a control mechanism

Modern models already break tasks into steps, but the steps they choose may not match the work you want done. Light guidance prevents the model from wandering and keeps the task anchored to your brief.

When you state the assumptions the model is allowed to make, you draw a clear line between what is permitted and what is not. This stops the model from filling empty spaces with guesses. Large models are inclined to complete patterns, and if you do not show them where the firm ground ends, they will supply their own footing.

A natural extension of this is to make the model aware of what is missing. Once the assumptions are set, the next step is to mark the gaps. This creates a smooth handover from what the model may rely on to what it must not invent. By pointing out missing information, you show the model where the edges of the task sit. When the model knows what is absent, it is less likely to drift into speculation or produce material that does not belong in the answer. You are giving it a map of the gaps so it does not try to fill them on its own.

Together, these two steps act as guardrails. They keep the work inside the brief, reduce the chance of invention, and ensure that the model stays within the limits you have set.

You can also break the task into a simple chain such as understanding → planning → execution. This mirrors what the model already does internally, but it makes the process explicit. When the steps are explicit, the model is less likely to skip ahead or solve the wrong problem.

Breaking the interaction into smaller stages also helps with scope. By naming the steps, you give the model a narrow lane to work in. It cannot jump to conclusions, and it cannot pad the answer with material that does not serve the task. The work stays tidy, and the output stays close to what you asked for.

In short, decomposition is a practical form of control. It does not restrict the model’s ability to give a good answer, but it does restrict where the model goes to supply that answer. This keeps the work steady, predictable, and within scope, so that it remains relevant to what you are doing.

4. Add a self-critique loop

Modern models benefit from a short cycle of controlled refinement. Once the first version of the answer is produced, a brief review stage forces the model to check its own work against the constraints you have set. This is not a call for hidden reasoning. It is a prompt to tighten the output.

A review step also encourages the model to correct small slips in structure, scope, or tone. It is easier for the model to adjust an existing draft than to produce a perfect answer in one pass. The revision stage gives it a second chance to align with the brief.

This process also reduces noise. When the model has been told that its work will be checked and refined, it tends to produce cleaner first drafts. The revision step becomes a light polish rather than a rescue job.

In practice, this creates a steady rhythm: draft, inspect, refine. It keeps the work within bounds and produces answers that are clearer, more accurate, and easier for you to use.
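The draft, inspect, refine rhythm can be sketched as a short loop. This is an illustration under stated assumptions, not a real client: `Model` stands for any function that takes a prompt and returns text, and `stub_model` is a hypothetical stand-in so the loop runs without a live system.

```python
from typing import Callable

Model = Callable[[str], str]  # hypothetical: takes a prompt, returns text


def draft_inspect_refine(model: Model, task: str, rules: list[str],
                         max_rounds: int = 2) -> str:
    """Draft once, then review and revise until the review passes."""
    rule_text = "\n".join(f"- {rule}" for rule in rules)
    draft = model(f"TASK\n{task}\n\nRULES\n{rule_text}")
    for _ in range(max_rounds):
        critique = model(
            "Check the draft against the rules. Reply OK if it complies, "
            f"otherwise list the problems.\n\nRULES\n{rule_text}\n\nDRAFT\n{draft}"
        )
        if critique.strip() == "OK":
            break
        draft = model(
            "Revise the draft to fix these problems."
            f"\n\nPROBLEMS\n{critique}\n\nDRAFT\n{draft}"
        )
    return draft


def stub_model(prompt: str) -> str:
    """Stand-in for a real model call, so the loop can run offline."""
    if prompt.startswith("Check the draft"):
        draft = prompt.split("DRAFT\n", 1)[1]
        return "OK" if "Conclusion" in draft else "Missing the Conclusion section."
    if prompt.startswith("Revise the draft"):
        draft = prompt.split("DRAFT\n", 1)[1]
        return draft + "\nConclusion: see above."
    return "Context: a first draft."


final = draft_inspect_refine(stub_model, "Summarise the text.",
                             ["End with a Conclusion section."])
print(final)
```

The `max_rounds` limit is a design choice: it keeps the review to a light polish and stops the loop from running on if the critique never passes.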

5. Stack roles for higher-quality output

Layered roles give you steadier output because each stage is handled by a specialist rather than a single broad persona. Modern models respond well to this division of labour. It narrows the scope of each step and reduces the chance of drift away from what you want.

A domain expert handles the substance. An editor handles clarity and structure. A risk assessor checks for overreach, missing information, and unwarranted certainty. A summariser produces a clean final version. Each role has a narrow brief, which keeps the work tidy and keeps the answer aligned with the task.

Here is an example prompt using layered roles:

ROLES

Domain Expert
Provide the technical or factual core. Stay within verified information. State assumptions and mark gaps.

Editor
Reshape the expert output into clear, plain structure. Remove padding. Ensure each section answers the brief.

Risk Assessor
Check for overreach, ambiguity, or missing information. Flag anything that exceeds the evidence. Recommend corrections.

Summariser
Produce a concise final version that reflects the corrections and stays within scope.

WORKFLOW

Domain Expert produces the initial draft.
Editor restructures and clarifies it.
Risk Assessor reviews for accuracy and limits.
Summariser produces the final answer.

OUTPUT CONTRACT

Context
Reasoning
Conclusion

Rules
No invention. Mark missing information. Keep the answer within scope. Maintain plain structure.
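The same division of labour can be run as separate calls, one per role, with each role receiving the previous role's output. The sketch below assumes a hypothetical `model` function that takes a prompt and returns text. The `stub_model` here only tags its input with the role name, to make the hand-offs visible.

```python
from typing import Callable

Model = Callable[[str], str]  # hypothetical: takes a prompt, returns text

# Role briefs shortened from the example prompt above.
ROLES = [
    ("Domain Expert", "Provide the technical or factual core."),
    ("Editor", "Reshape the expert output into clear, plain structure."),
    ("Risk Assessor", "Check for overreach, ambiguity, or missing information."),
    ("Summariser", "Produce a concise final version that stays within scope."),
]


def run_roles(model: Model, task: str, roles=ROLES) -> list[tuple[str, str]]:
    """Pass the material through each role in turn and keep a trail."""
    material = task
    trail = []
    for name, brief in roles:
        prompt = f"ROLE: {name}\nBRIEF: {brief}\n\nINPUT\n{material}"
        material = model(prompt)
        trail.append((name, material))
    return trail


def stub_model(prompt: str) -> str:
    """Stand-in model: tags the input with the role that handled it."""
    role = prompt.split("\n", 1)[0][len("ROLE: "):]
    material = prompt.split("INPUT\n", 1)[1]
    return f"[{role}] {material}"


trail = run_roles(stub_model, "Explain tokens.")
print(trail[-1][1])
```

Keeping the trail lets you see which role introduced a change, which is useful when the final version drifts from the brief.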

6. Treat the context window as working memory

As of April 2026, modern models offer context windows of roughly 200,000 to 1,000,000 tokens. That space holds your instructions and everything else in the conversation, and it acts as working memory. It can hold definitions, constraints, examples, running notes, previous outputs, and a living brief. With this in place, the model behaves more like a stateful collaborator than a stateless assistant.

This working memory is what the model can track across prompts within a session. When you define what belongs in this state, you save time. You do not need to repeat your requirements. The model carries them forward and maintains the structure you set.
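Working memory is large but finite, so it helps to decide what stays in it. The sketch below keeps the brief and as many of the most recent notes as fit a token budget. It rests on a loud assumption: the four-characters-per-token estimate is a rough rule of thumb for English text, not how any real tokenizer counts. Both function names are hypothetical.

```python
def estimate_tokens(text: str) -> int:
    """Crude estimate: about four characters per token (an assumption)."""
    return max(1, len(text) // 4)


def build_context(brief: str, notes: list[str], budget: int) -> list[str]:
    """Keep the brief, then as many of the most recent notes as fit the budget."""
    kept: list[str] = []
    remaining = budget - estimate_tokens(brief)
    for note in reversed(notes):  # newest first
        cost = estimate_tokens(note)
        if cost > remaining:
            break
        kept.append(note)
        remaining -= cost
    # Restore chronological order after the brief.
    return [brief] + list(reversed(kept))


brief = "B" * 40   # about 10 tokens under the estimate
notes = ["a" * 40, "b" * 40, "c" * 40]
context = build_context(brief, notes, budget=30)
print(len(context))
```

The brief is always kept, because it carries the frame and the contract. Only the running notes are trimmed.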

7. Use agentic prompting patterns

Static prompts assume a fixed path from question to answer. Modern systems are closer to small agents: they can plan, choose actions, call tools, and adjust their output based on intermediate results. This is often called agentic behaviour. The system selects and sequences actions to achieve an objective, rather than following a single linear path.

Giving the model a workflow such as Plan → Act → Observe → Revise makes this explicit. In the planning phase, the model outlines what it intends to do, which tools it may need, and what a good outcome looks like. In the action phase, it carries out the steps, including any tool calls. In the observation phase, it inspects the result against the plan and the constraints. In the revision phase, it adjusts the answer and produces a clean final version.

Using a workflow saves time and reduces the need for repeated corrections. The final answer remains tidy. The planning and checking happen in the background or in short, structured notes, while the output stays compact and readable. You gain the benefit of step-by-step reasoning without having to sift through a long chain of output.

Tool use fits naturally into this pattern. In the Plan step, the model decides whether tools are needed and why. In the Act step, it calls them. In the Observe step, it checks whether the tool results answer the question. If tools are not needed, the model should say so plainly and proceed with reasoning instead of forcing a tool into the workflow.

In this context, agentic means that the system behaves as a goal-directed process. The model can plan, choose among available capabilities, and adapt its path based on intermediate results, rather than producing a single static completion from a prompt.
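The four phases can be laid out as four small functions. This is a toy sketch: in a real system the model itself would do the planning and observing, whereas here hard-coded rules stand in for it. The tool `word_count`, the request format, and every function name are hypothetical.

```python
from typing import Callable

# One toy tool. Real tools would be searches, calculators, file readers, etc.
TOOLS: dict[str, Callable[[str], str]] = {
    "word_count": lambda text: str(len(text.split())),
}


def plan(request: str) -> dict:
    """Plan: decide whether a tool is needed and say why."""
    if request.startswith("count words:"):
        return {"tool": "word_count", "arg": request.split(":", 1)[1].strip(),
                "why": "counting is exact work, better done by a tool"}
    return {"tool": None, "arg": request, "why": "no tool needed"}


def act(step: dict) -> str:
    """Act: call the chosen tool, or answer directly if none is needed."""
    if step["tool"] is None:
        return f"Direct answer to: {step['arg']}"
    return TOOLS[step["tool"]](step["arg"])


def observe(step: dict, result: str) -> list[str]:
    """Observe: check the result against what the plan expected."""
    if step["tool"] == "word_count" and not result.isdigit():
        return ["expected a whole number from word_count"]
    return []


def revise(step: dict, result: str, issues: list[str]) -> str:
    """Revise: turn the raw result into a clean final answer."""
    if issues:
        return "Could not complete: " + "; ".join(issues)
    if step["tool"] == "word_count":
        return f"The text contains {result} words."
    return result


def run(request: str) -> str:
    step = plan(request)
    result = act(step)
    issues = observe(step, result)
    return revise(step, result, issues)


print(run("count words: the quick brown fox"))
print(run("Define a token"))
```

Note that the plan records a reason either way. When no tool is needed, the sketch says so and proceeds, rather than forcing a tool into the workflow.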

8. Make the model identify ambiguity before answering

One of the most effective techniques is to require the model to surface all plausible interpretations before it attempts an answer. This forces the model to slow down, map the possible meanings, and avoid locking itself into the first pattern it detects. Large models tend to commit early unless guided.

This step also exposes hidden ambiguity. When the model lists the possible readings, you can see whether the task is underspecified, whether key terms are unclear, or whether the scope could be read in more than one way. This gives you a chance to correct the course before any work is done.

If more than one interpretation exists, the model should ask for clarification. This prevents mis-scoping, reduces the chance of error, and removes the need for the model to guess. Guessing is where most drift begins.

The technique also improves consistency. When the model is told to check for multiple readings, it becomes less likely to produce answers that are confident but misaligned. It treats ambiguity as a signal to pause rather than a gap to fill.

In practice, this turns ambiguity into a controlled step rather than a source of error. The model identifies the forks in the road, confirms which path is correct, and only then proceeds with the task.

Doing this will save you a great deal of time.
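The pause-on-ambiguity rule amounts to a small gate in front of the task. The sketch below assumes the list of interpretations has already been produced, for example by the model in an earlier step. The function name `ambiguity_gate` and the shape of its return value are hypothetical.

```python
def ambiguity_gate(request: str, interpretations: list[str]) -> dict:
    """Proceed only when exactly one reading remains; otherwise ask."""
    if len(interpretations) == 1:
        return {"action": "proceed", "reading": interpretations[0]}
    if not interpretations:
        return {"action": "ask",
                "question": f"No workable reading found for {request!r}. "
                            "What is needed?"}
    options = "\n".join(f"{n}. {text}" for n, text in enumerate(interpretations, 1))
    return {"action": "ask",
            "question": f"{request!r} can be read in {len(interpretations)} ways:\n"
                        f"{options}\nWhich did you mean?"}


result = ambiguity_gate(
    "Summarise the report",
    ["Summarise the whole report", "Summarise only the findings section"],
)
print(result["question"])
```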

9. Adapt prompts to the model

Different models excel in different areas, and a good prompt acknowledges this rather than assuming a single uniform capability. Some models are strongest at structure: they produce clean sections, tidy formatting, and predictable layouts. Others are stronger at reasoning: they handle multi-step logic, edge cases, and constraint checking with more stability. Some specialise in compression: they can distil long material into tight summaries without losing meaning. Others lean toward style: they generate fluent prose but may drift if not anchored.

A well-designed prompt sets expectations that match these tendencies. If the model is strong at structure, you can lean on explicit output contracts. If it is strong at reasoning, you can give it more analytical work and tighter constraints. If it excels at compression, you can trust it with dense source material. If it is style-heavy, you can counterbalance that with stricter rules and clearer boundaries.

The point is not to flatter the model. It is to shape the workflow so that the model’s strengths are used deliberately and its weaknesses are contained. This reduces variability, improves reliability, and produces output that is more consistent across your prompts.

Even if you stick to one model or one vendor, recognising that you may one day use a different system helps sharpen your expectations and improves the way you design prompts for the model you use.

Just as customer service varies across vendors, so does AI interaction.

10. Include safety and uncertainty rules

Modern models behave more reliably when you tell them not only what to do, but what to avoid. Negative guidance is a form of operational discipline. It removes entire classes of failure rather than correcting them after the fact.

Clear avoidance rules stop the model from drifting into areas that carry higher risk: speculation, overreach, sensitive claims, or invented detail. Without these boundaries, the model will often fill gaps with confident but unreliable material. Stating what must not happen is as important as stating what must.

Escalation rules serve a different purpose. They tell the model when to stop and hand control back to the user. This is essential for tasks involving uncertainty, missing information, or sensitive domains. When the model knows when to escalate, it avoids guessing, avoids false precision, and avoids treating ambiguity as something to be patched over.

Uncertainty handling is another pillar. Models respond well when instructed to mark unknowns, list assumptions, and request clarification instead of improvising. This keeps the work inside the evidence and prevents the model from manufacturing answers to maintain fluency.

Sensitive topics require explicit treatment. If you tell the model how to handle them, it will follow the procedure rather than rely on its own processing. This reduces variability and keeps the output aligned with your standards rather than the model’s defaults.

Taken together, these measures form a small operational framework. They are not decoration. They are the guardrails that keep your AI output predictable, bounded, and safe to use in structured workflows.

A modern prompt template

A compact structure that works across the latest models:

ROLES

Domain Expert: Provide the factual and technical core. State assumptions and mark gaps.
Editor: Reshape the material into clear, plain sections. Remove padding and repetition.
Risk Assessor: Check for overreach, missing information, and unwarranted certainty. Flag issues.
Summariser: Produce a concise final version that reflects all corrections and stays within scope.

TASK
Describe the task in one or two sentences. State the objective, the audience, and any hard limits on scope.

OUTPUT CONTRACT
Produce the answer in the following sections:

Context
Reasoning
Conclusion

UNCERTAINTY AND AMBIGUITY

List plausible interpretations of the request before answering.
If more than one interpretation exists, ask for clarification instead of guessing.
State what information is missing and how it affects the answer.
Mark assumptions clearly and keep them minimal.

SAFETY, LIMITS, AND ESCALATION

Do not invent facts. If evidence is missing, say so.
Avoid speculation, sensitive claims, and advice outside the brief.
Escalate to the user when the task is out of scope or underspecified. Explain why and what is needed.
Treat sensitive topics with extra care. Prefer to mark limits rather than improvise.

WORKFLOW (AGENTIC)

Plan: Identify the goal, constraints, and any tools or references that may be needed.
Act: Produce the initial answer according to the output contract.
Observe: Review the draft for clarity, accuracy, scope, and alignment with the rules.
Revise: Produce a refined final version that corrects issues and tightens the structure.

STYLE RULES

Keep the final answer concise, structured, and free of padding.
Use only British English.
Do not include hidden reasoning or chain of thought in the final answer.

BEHAVIOUR
These rules apply to every response in this session unless explicitly revoked. If the request conflicts with these rules, explain the conflict and ask how to proceed.

Having the AI Manage the Prompt Template

Managing the template above by hand is too much work. Once you have it in a form that you are happy with and that is effective for your needs, give the template to the AI. Then, before you start each session, prompt with this:

Reconstruct the full analytical‑engine template from your prior description. Restate it to me for confirmation. Once confirmed, enforce it automatically for the rest of the session. If any request conflicts with the template, pause and ask how to resolve the conflict.

Summary

Modern prompting is not about clever wording. It is about defining the system, setting the output contract, controlling the workflow, managing ambiguity, and using the context window as working memory. This will help produce reliable output from modern AI systems.


How AI Works

2026-05-06T00:00:00+00:00


How large language models actually work, and why they are not miniature humans

Large language models such as GPT‑5.4, Claude Opus 4.6, and DeepSeek R1 are now everyday tools. Yet the way they work is often misunderstood.

We misunderstand AI because we mistake fluency for thought. When a system produces coherent language, we instinctively assume intention, understanding and agency behind it. This article explains why that instinct misleads us, and why clarity about what these systems are, and are not, is essential for using them wisely.

LLMs do not think, they do not understand, and they do not learn in any human sense. What they do is process language at scale.

This article explains how that works, what is inside these systems, and why their behaviour can look intelligent even when no intelligence is present.

The key to understanding these systems is to see them as statistical tools, not miniature minds.

How an LLM processes what you type

Tokens

An LLM begins by breaking what you type into tokens. A token is a small unit of text. It may be a whole word, part of a word, or punctuation. Tokens are not ideas or concepts. They are fragments chosen because they appear often in text and can be handled efficiently by the model.

Each token has a unique number. The token for "king" might be 99. The token for "queen" might be 24521. The mapping is fixed, so the same text always produces the same token numbers.

Tokens turn your text into numbers the model can work with.
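A toy tokenizer makes this concrete. The sketch below uses greedy longest-match over a tiny vocabulary, which is a simplification: real tokenizers build their vocabularies from large amounts of text and use more involved rules. The numbers for "king" and "queen" follow the examples above; every other entry is made up.

```python
# Toy vocabulary. The IDs for "king" and "queen" follow the article's
# examples; every other entry is invented for illustration.
VOCAB = {"king": 99, "queen": 24521, "the": 5, " ": 1, "s": 7, ".": 2, "dom": 310}


def tokenize(text: str) -> list[int]:
    """Greedy longest-match: at each position take the longest known piece."""
    ids = []
    i = 0
    while i < len(text):
        for end in range(len(text), i, -1):
            piece = text[i:end]
            if piece in VOCAB:
                ids.append(VOCAB[piece])
                i = end
                break
        else:
            # No piece in the toy vocabulary starts at this position.
            raise ValueError(f"no token covers {text[i]!r}")
    return ids


print(tokenize("the kingdom."))
print(tokenize("queens"))
```

Here "kingdom" is split into the pieces "king" and "dom", which shows how a token can be part of a word rather than a whole one.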

Tokens on their own do not help the model process language. A token ID like 99 or 24521 is just a label. The model cannot compute with these integers because they do not contain any information about how the token is used or how it relates to other tokens.

To make computation possible, the model converts each token ID into a list of numbers. This list is called an embedding. It places the token as a point in a space where the model can perform computation. Think of the points in the space as the rooms of a house.

These lists are not chosen by hand. They are learned during training. As the model trains, the lists are adjusted so that tokens used in similar contexts move closer together in this space (like adjacent rooms in a house). They move closer because doing so reduces the model’s prediction error. This proximity is not meaning in a human sense. It is a statistical structure that allows the model to compute relationships between tokens.

When two lists are close together, this reflects a statistical similarity in how the two tokens were used in the training data.

Lists of numbers represent a point in space

The model uses each token number to look up a list of numbers that represents that token. These lists are learned during training. No one chooses them by hand.

For the token "king", the list might look like:

[0.12, 0.44, 0.91, ..., 0.03]

This list is a position in a mathematical space. You can think of each number as a step along a corridor. You take the first step, and go through door number 12, then the next (door 44), and so on until you reach a final position (door 3). That position is the model's internal representation of the token.

For the token "queen", the list might be:

[0.12, 0.44, 0.91, ..., 0.02]

The final step is slightly different, and the final position is close to the position for "king" (door 2 for "queen", door 3 for "king").

This closeness reflects how often the two words appear in similar contexts in the training data.
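Closeness between two lists can be measured directly. The sketch below looks up each token's list by its number and computes the straight-line distance between positions. The vectors reuse the numbers shown above with the elided middle values left out, so they are four numbers long purely for illustration. The token "carrot", its number, and its vector are invented for contrast.

```python
import math

# Token numbers for "king" and "queen" follow the article's examples.
# "carrot" and all of its values are invented for contrast.
TOKEN_IDS = {"king": 99, "queen": 24521, "carrot": 777}

# Shortened vectors: the elided middle values are left out, not guessed.
EMBEDDINGS = {
    99: [0.12, 0.44, 0.91, 0.03],
    24521: [0.12, 0.44, 0.91, 0.02],
    777: [0.90, 0.05, 0.10, 0.70],
}


def embed(token: str) -> list[float]:
    """Look up the list of numbers for a token via its token number."""
    return EMBEDDINGS[TOKEN_IDS[token]]


def distance(a: str, b: str) -> float:
    """Straight-line (Euclidean) distance between two token positions."""
    return math.dist(embed(a), embed(b))


print(distance("king", "queen"))
print(distance("king", "carrot"))
```

The first distance is far smaller than the second. In a trained model that gap would come from the training data, not from values typed in by hand as they are here.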

These lists of numbers are part of the model’s parameters.

The rest of the parameters determine how these positions influence one another as the model processes text. They shape how patterns combine, how relationships are detected and how the model transforms one set of token positions into the next. These parameters do not add meaning. They provide the machinery that lets the model apply statistical patterns to the text you give it.

These parameters set up the internal machinery the model uses to process and transform text.

Moving about the space

To show how the model captures patterns, imagine a simple three‑number space:

king = [10, 7, 3]
man = [6, 2, 1]
queen = [10, 7, 6]
woman = [6, 2, 4]

If we subtract man from king, we get:

\([10−6, 7−2, 3−1] = [4, 5, 2]\)

This is the direction from "man" to "king". If we then add "woman":

\([4, 5, 2] + [6, 2, 4] = [10, 7, 6]\)

This lands us at the position for "queen".

The model has captured a pattern. The statistical difference between "king" and "man" resembles the difference between "queen" and "woman".

The model does not know why. The LLM's program has only calculated that these differences behave in similar ways across the training data.
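The arithmetic above can be checked in a few lines of Python. This sketch uses the same four three-number positions. The helper names are hypothetical, and a real model works with far longer lists.

```python
import math

VECTORS = {
    "king": [10, 7, 3],
    "man": [6, 2, 1],
    "queen": [10, 7, 6],
    "woman": [6, 2, 4],
}


def subtract(a: list[int], b: list[int]) -> list[int]:
    return [x - y for x, y in zip(a, b)]


def add(a: list[int], b: list[int]) -> list[int]:
    return [x + y for x, y in zip(a, b)]


def nearest(point: list[int]) -> str:
    """Return the token whose position is closest to the given point."""
    return min(VECTORS, key=lambda token: math.dist(VECTORS[token], point))


direction = subtract(VECTORS["king"], VECTORS["man"])  # from "man" to "king"
target = add(direction, VECTORS["woman"])
print(direction, target, nearest(target))
```

Nothing in the code refers to royalty or gender. It subtracts, adds, and finds the closest point, which is the whole of the operation.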

Why this works

This works because "king" and "man" differ in consistent ways across the training data. "Queen" and "woman" differ in similar ways. The model adjusts its internal numbers so that these differences become similar directions in the space. The model has found a pattern and matched it.

Humans then interpret this similarity as understanding.

The model reflects these similarities because they appear consistently across the text it was trained on.

It is all in the training data

Text contains stable patterns. These patterns describe roles, relationships, contrast, categories, analogies and grammatical structure.

During training, the model adjusts itself so that tokens used in similar contexts end up near one another, and tokens used in contrasting contexts end up separated in consistent ways.

This produces directions, distances, clusters and angles. These geometric features are the model's internal map of the statistical structure of language. Because language has structure, the model can represent it mathematically.

The model can represent these structures only because language itself contains stable patterns.

The human role in meaning

The model’s internal space is not a map of concepts. It is a map of statistical regularities. The structure becomes meaningful only when a human interprets it. We project categories, intentions and explanations onto patterns that were never designed to carry them. The model provides form; we provide significance. This distinction is not only philosophical, it is the boundary between what the system can do and what we imagine it can do.

We supply the intelligence

The distance between "king" and "man" is a statistical outcome. The distance between "queen" and "woman" is another. These two outcomes are similar. That similarity is the pattern the model has detected.

The model is not reasoning. It does not understand. It does not manipulate ideas. It follows the geometry that training has produced. If a direction has been useful for predicting text in the past, the model will use it again.

The geometry captures statistical qualities of human text. These include:

similarity of tone
proximity of commonly associated words
regular contrasts between categories
recurring relationships between ideas
typical structures of phrasing

The model does not reason about these qualities. It only reflects the statistics of its training data.

Tokens that appear in similar contexts end up close together. Tokens that contrast end up separated. Groups of related tokens form clusters. Repeated differences become directions. Angles reflect how often patterns co‑occur or diverge.

For example, words like "cat", "dog" and "hamster" end up near one another because they appear in similar kinds of sentences.

When the model generates text, it moves through this space by following these patterns. Humans then read the output and recognise tone, relatedness, contrast and structure.

The model is not producing meaning. It is reproducing geometry. We are the ones interpreting that geometry as meaning.

It is we who supply the I in AI.

The model provides structure, but humans provide interpretation.

This geometric structure is simply a way of organising statistical patterns so the model can use them efficiently.

To understand how this internal space is created, we need to look at the billions of parameters inside the model.

What is in the billions of parameters

To understand how the model builds and moves through its geometric space, it helps to look at what that is based on.

After training, an LLM contains billions of parameters. These parameters are numerical values that shape how the model transforms text. Together they define the structure of the internal space: the directions that matter, the distances between tokens, the clusters that form, and the angles that represent relationships.

When the model processes a prompt, it moves through this space by following the statistical structure represented in these parameters.

DeepSeek R1 has 671 billion parameters. GPT‑5.4 may have over 2 trillion. More parameters mean greater capacity to represent and combine statistical patterns.

More parameters increase capacity, not understanding.

Parameters do not contain knowledge

The billions of parameters inside an LLM are often described as if they contain knowledge. They do not. They represent statistical consistencies extracted from large amounts of text.

During training, the model adjusts its parameters to capture patterns in how language is used. Humans use language in standard ways, directed by grammar, style, topic associations and the common ways that ideas appear together.

The parameters form a space where patterns that frequently co‑occur in text end up close to one another. This allows the model to produce text that resembles human writing. It does not give the model the ability to reason or understand.

For example, if the training data contains mixed statements about a historical date, the model may confidently produce the wrong one because it is reflecting the statistical blend it has seen.

Parameters cannot store precise facts. They store tendencies, associations and relationships. If a fact appears often and consistently in the training data, the model may reproduce it. If the data is mixed or inconsistent, the model reflects that uncertainty. This is why LLMs can produce confident errors. They are not recalling facts. They are replaying patterns.

These parameters are shaped during training, which is the process that gives the model its statistical structure.

The model reflects the patterns in its data, not stored facts or understanding.
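The effect of mixed data can be shown with simple counting. The sketch below is an analogy rather than a description of how a real model stores anything: it counts which token follows a given opening in a made-up set of training snippets, then picks the most frequent one. The sentence and both dates are invented and refer to no real event.

```python
from collections import Counter

# Made-up training snippets: the same opening with two different endings.
# Neither date refers to a real event.
OPENING = "the treaty was signed in"
CORPUS = [(OPENING, "1901")] * 3 + [(OPENING, "1902")] * 5


def next_token_distribution(context: str, corpus=CORPUS) -> dict[str, float]:
    """Share of each continuation seen after the given context."""
    counts = Counter(nxt for ctx, nxt in corpus if ctx == context)
    total = sum(counts.values())
    return {token: n / total for token, n in counts.items()}


def most_likely(context: str) -> str:
    """Pick the most frequent continuation, whether or not it is true."""
    dist = next_token_distribution(context)
    return max(dist, key=dist.get)


print(next_token_distribution(OPENING))
print(most_likely(OPENING))
```

If the less frequent date were the correct one, the output would still be the more frequent one, delivered with no sign of doubt. The choice tracks frequency in the data, not truth.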

What training actually does

Training is repeated large‑scale error‑correction. The model predicts the next token, checks whether it was right, and adjusts its parameters to reduce the difference. This cycle repeats billions of times across vast amounts of text. The result is a system that becomes increasingly accurate at predicting what comes next.

The model does not form concepts. It does not build a picture of the world. It does not develop intentions or goals. It becomes more accurate at predicting the next token.

Fine‑tuning and alignment add further adjustments. These make the model follow instructions more reliably and avoid harmful output. They do not create understanding. They refine the statistical patterns the model uses.

Training shapes the parameters so the model becomes better at predicting what comes next.
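The predict, check, adjust cycle described in this section can be sketched at toy scale. The example below is a deliberate simplification: it keeps one row of adjustable numbers per token instead of a neural network, and trains on a single made-up sentence. The learning rate and the number of passes are arbitrary choices.

```python
import math

text = "the king rules the land and the queen rules the land".split()
vocab = sorted(set(text))
index = {token: i for i, token in enumerate(vocab)}
pairs = [(index[a], index[b]) for a, b in zip(text, text[1:])]

# The parameters: one row of scores per previous token, all starting at zero.
scores = [[0.0] * len(vocab) for _ in vocab]


def softmax(row: list[float]) -> list[float]:
    """Turn a row of scores into probabilities that sum to one."""
    peak = max(row)
    exps = [math.exp(x - peak) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def train_epoch(learning_rate: float = 0.5) -> float:
    """One pass over the text: predict, measure the error, adjust."""
    total_loss = 0.0
    for prev, actual in pairs:
        probs = softmax(scores[prev])          # predict the next token
        total_loss += -math.log(probs[actual])  # how wrong was the prediction?
        for j in range(len(vocab)):
            error = probs[j] - (1.0 if j == actual else 0.0)
            scores[prev][j] -= learning_rate * error  # adjust to reduce it
    return total_loss / len(pairs)


losses = [train_epoch() for _ in range(50)]
after_king = softmax(scores[index["king"]])
predicted = vocab[after_king.index(max(after_king))]
print(losses[0], losses[-1], predicted)
```

The average error falls as the passes repeat, and the most probable token after "king" becomes "rules". At no point does the code hold anything but numbers being nudged up and down.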

Why this is not human learning

Human learning draws on perception, memory, experience and intention. Humans form abstractions, build mental models and develop goals. Human learning is grounded in the body and the world.

LLM training is none of these things. It is a mathematical optimisation process. The model does not know what it is doing. It does not know that it is doing anything at all.

The model’s improvement is mechanical, not cognitive.

Is the output a simulation of intelligence?

LLM output can appear intelligent because it resembles the writing of people who were thinking when they produced the original text. If you ask for advice, the model generates text that resembles advice. If you ask for an explanation, it generates text that resembles an explanation. The appearance of reasoning comes from the patterns in the training data, not from any understanding in the model. The model produces sequences that look thoughtful because thoughtful sequences are common in the text it has seen.

The resemblance is superficial. The model does not understand the text it produces. It does not know whether a statement is true or false. It only reflects that certain sequences of tokens tend to follow others.

The appearance of intelligence comes from the patterns in human writing, not from the model itself.

Are humans interpreting the output as intelligent?

Humans are skilled at projecting meaning onto language. When we read coherent text, we assume intention behind it. We assume a mind. We assume agency. This is a natural response, but it can mislead us when dealing with LLMs.

The model does not intend anything. It generates plausible continuations of text. The sense of intelligence comes from the reader, not the machine. The machine provides form. The human provides interpretation.

Our instinct to attribute intention makes the output seem smarter than it is.

This distinction matters because it prevents us from assuming abilities the model does not have.

What this means for us

An LLM is possible because we can statistically model features of language that matter to humans.

LLMs are powerful tools for generating language. They are not thinking machines. Their strengths lie in pattern reproduction. Their weaknesses lie in the absence of understanding. They can assist with tasks that depend on language, but they cannot replace human judgement.

A clear grasp of how these systems work helps avoid confusion. It prevents anthropomorphism. It supports responsible use. It keeps expectations grounded in what the technology can actually do, rather than what it appears to do.

The more plainly we describe these systems, the easier it becomes to use them well and to avoid treating them as something they are not.

In the end, an LLM is a system that maps patterns in language and reproduces them at scale. It does not think or understand. It follows geometry shaped by training, and we interpret that geometry as meaning. Knowing this helps us use these systems effectively, without expecting them to behave like people or to possess abilities they do not have.

All of this leads to a simple conclusion: understanding these limits helps us use LLMs effectively and responsibly.

Why clarity matters

LLMs are powerful because language has structure, not because the systems understand it. They reproduce patterns we find meaningful, and we supply the meaning. When we keep that distinction clear, we avoid treating statistical machinery as a mind, and we avoid outsourcing judgement to a system that has none. Practical wisdom begins with seeing these systems as they are, not as we are tempted to imagine them.

10 Everyday AI Workflows That Save Hours

2026-04-26T00:00:00+00:00

Artificial intelligence is a practical tool that speeds up routine thinking tasks. These ten workflows show how everyone can use it to save minutes every day. Those minutes add up into hours each week. And practice will make you prompt perfect.

1. Turn messy notes into clean summaries

Example
You paste a rambling 500‑word meeting transcript. The system produces a clear summary with action points.

Example prompt
"Here are my messy meeting notes. Please summarise the key decisions and list the action items clearly."

2. Draft emails from bullet points

Example
You write a few rough points. The system turns them into a polished email.

Example prompt
"Turn these bullet points into a polite, professional email: apologise for delay and ask for feedback by this Friday."

3. Explain complex topics in plain English

Example
You paste a confusing medical letter. The system rewrites it in simple, accurate language.

Example prompt
"Rewrite this in plain English for a non‑expert reader. Keep it accurate but simple. Do not add anything to the content."

4. Create quick plans for travel, meals, or events

Example
You request a two‑day trip plan. The system provides a structured itinerary with alternatives.

Example prompt
"Plan a two‑day trip to Edinburgh with indoor options if it rains. Include timings."

5. Turn long articles into short takeaways

Example
You paste a long news article. The system produces a five‑point summary.

Example prompt
"Summarise this article into five key points and give me a one‑sentence takeaway."

6. Brainstorm ideas when you feel stuck

Example
You need a name for a community newsletter. The system generates several options.

Example prompt
"Give me ten name ideas for a friendly community newsletter about local events."

7. Rewrite text in different tones

Example
You paste a blunt message. The system rewrites it in a more diplomatic tone.

Example prompt
"Rewrite this message to be polite and constructive while keeping the meaning."

8. Extract key information from documents

Example
You upload a contract. The system identifies renewal dates, obligations, and risks.

Example prompt
"Extract the key dates, obligations, and cancellation terms from this contract. Do not invent anything. Only use the data I have provided to you."

9. Create checklists from goals

Example
You want to declutter your house. The system turns this into a room‑by‑room checklist.

Example prompt
"Turn this goal into a step‑by‑step checklist: declutter my entire house this month."

10. Turn data into quick insights

Example
You paste a small spreadsheet of expenses. The system highlights trends and suggests improvements.

Example prompt
"Here is my monthly spending data. Identify trends and suggest three ways to reduce costs. Use only the data I have provided to you."

Conclusion

Begin with one or two workflows and expand from there. Small time savings accumulate quickly, and these tools can help you stay organised, informed, and in control.

How to Evaluate the Output of an AI Chat Session

2026-04-26T00:00:00+00:00

Introduction

Many people now use chat systems powered by artificial intelligence for writing, research, planning, or quick explanations. These systems can be helpful, but their output varies in quality. Some responses are clear and accurate, while others may be incomplete, misleading, or overly confident. Understanding how to evaluate what you receive makes the experience more efficient and safer.

A simple example shows why this matters. Someone might ask a chat system for a summary of a historical event and receive a clear explanation. The same person might then ask for a legal interpretation and receive an answer that sounds confident but is not reliable. The difference is not always obvious from the tone of the response.

Start With the Purpose of the Conversation

It helps to keep in mind what you are trying to achieve. A chat system can produce ideas, drafts, explanations, or examples very quickly. It is less reliable when the task requires specialist judgement, up‑to‑date facts, or precise interpretation.

For instance, asking for help brainstorming a travel itinerary is usually safe. Asking for a diagnosis based on symptoms is not. The system may sound equally confident in both cases, so the purpose of the conversation matters.

Check Whether the Output Matches the Question

Sometimes a chat system answers a slightly different question from the one you asked. This can happen when the prompt is broad or when the system tries to guess your intent.

A simple way to check is to read the answer and ask whether it addresses the specific point you raised. If you ask for "three reasons why a bridge design failed" and receive a general explanation of bridge engineering, the output is not wrong, but it is not what you asked for.

Look for Verifiable Details

Useful responses often contain information that can be checked. This might be a definition, a date, a description of a process, or a reference to a known concept. When a response includes details that can be confirmed, it becomes easier to judge its reliability.

For example, if you ask about how a particular sensor works, a good answer might describe the physical principle behind it. If the answer instead gives vague phrases such as "advanced technology" or "cutting edge performance", it may not be providing real information.

Notice When the System Sounds Certain

Chat systems often express ideas in a confident tone, even when the underlying information is uncertain. This is a normal behaviour of the technology, but it means that confidence should not be taken as a sign of accuracy.

A relatable example is when someone asks for the opening hours of a local shop. The system may provide a clear answer, but unless it has access to current information, the hours may be outdated or incorrect. The tone does not reflect the reliability.

Compare the Output With What You Already Know

If the response touches on a topic you understand, a quick comparison can reveal whether the system is on the right track. If something feels inconsistent with your knowledge, it may be worth checking further.

For instance, if you ask about a programming concept you use regularly and the answer describes it in an unfamiliar way, that is a signal to verify the information.

Ask for Clarification or a Different Angle

If a response seems incomplete or unclear, asking the system to explain the idea in a different way can help. Many people find that asking for an example, a step‑by‑step explanation, or a simpler description reveals whether the system actually captured the idea.

A practical example is when someone asks for an explanation of a financial term. If the first answer feels abstract, asking for "a simple example using everyday numbers" often makes the concept clearer.

Be Cautious With Sensitive or High‑Impact Topics

Some areas require extra care. These include medical advice, legal interpretation, financial decisions, and safety‑critical information. Chat systems can generate plausible text in these areas, but plausibility is not the same as accuracy.

A symptom checker example illustrates this. A system may describe a condition in a way that sounds precise, but it cannot assess real‑world risk or context. In such cases, the output should be treated as general information, not as a basis for action.

Look for Signs of Fabrication

Chat systems sometimes produce details that sound real but are not. These may include invented citations, incorrect statistics, or descriptions of events that never occurred. This behaviour is not intentional, but it can mislead readers who assume the information is factual.

A common example is when someone asks for a reference to a scientific paper and receives a title and author that look plausible but do not exist. Checking the reference quickly reveals the issue.

Use the System as a Tool, Not an Authority

A chat system can be a helpful assistant for drafting, exploring ideas, or learning about a topic. It is less suited to acting as a final source of truth. Treating it as a tool rather than an authority helps keep expectations realistic and reduces the risk of relying on incorrect information.

Conclusion

Evaluating the output of an AI chat session is a practical skill. Paying attention to the purpose of the conversation, the clarity of the answer, the presence of verifiable details, and the sensitivity of the topic can make the experience more effective and safer. With a few simple habits, it becomes easier to recognise when the system is providing useful insight and when additional checking is needed.

How to Use AI Safely and Effectively

2026-04-26T00:00:00+00:00

Recent headlines have shown the same unsettling pattern.

An AI system confidently generated legal cases that never existed, as reported when UK courts received filings built on fictitious case law (The Guardian, Scottish Legal News).

Health researchers have warned that AI can give medical guidance that is not just inaccurate but dangerously misleading. A British Medical Journal article, as reported in the Independent, stated that 20% of AI medical answers were "highly problematic".

And tech reporters have documented AI‑generated news summaries that included entirely fabricated headlines and events (Sky News).

In every case, the system generated output that communicated total confidence. In every case, the AI was wrong. Fluency is not understanding. Appearing proficient is not accuracy. This confusion is exactly where the real risk lies.

Give Clear Instructions

AI works best when you tell it exactly what you want. It does not infer your intentions or read between the lines. The output you see is a statistical prediction based on patterns in the AI's training data. The clearer your request, the better the output.

Start by stating your goal. Instead of asking, "Tell me about climate change," try: "Give me a 150‑word summary of the main causes of climate change for a general audience." A specific target gives the system's statistical pattern-matching something concrete to aim at.

Set the format you want. Simple instructions like "Give me three options," "Write this as a short email," or "List the steps in order" immediately improve the result. Format acts as a constraint, and constraints make the output sharper.

Define the audience. AI changes tone and detail depending on who you say it is for: beginners, executives, customers, or the general public. A single line about the audience can transform the clarity of the answer.

If accuracy matters, add constraints such as "Use widely accepted information," "If you’re unsure, say so," or "Do not invent details." These reduce the risk of confident mistakes.
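The four parts above (goal, format, audience, constraints) can be assembled in a small helper. This is a sketch only: `build_prompt` and its labels are hypothetical, not part of any chat system, and the wording of each part is still up to you.

```python
def build_prompt(goal, output_format=None, audience=None, constraints=()):
    """Assemble a prompt from a goal plus optional format, audience and rules."""
    parts = [goal.strip()]
    if output_format:
        parts.append(f"Format: {output_format.strip()}")
    if audience:
        parts.append(f"Audience: {audience.strip()}")
    for rule in constraints:
        parts.append(rule.strip())
    return "\n".join(parts)

prompt = build_prompt(
    goal="Summarise the main causes of climate change.",
    output_format="One paragraph of about 150 words.",
    audience="A general audience.",
    constraints=["If you're unsure, say so.", "Do not invent details."],
)
print(prompt)
```

Keeping the parts separate makes it easy to see which one is missing when an answer comes back too long, too technical, or too confident.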

Clear instructions make the output better and safer, but they do not eliminate the risk of mistakes. Even with perfect prompts, a system can still deliver something that sounds certain but is completely wrong.

The AI is not weighing evidence or checking facts. It is built to produce the answer that appears most likely based on patterns in its training data. When those patterns point in the wrong direction, the result is a confident mistake. Your prompt has to help the AI navigate any bias or gaps in its training data. Think of your prompt as a nudge in the direction you want to go.

When your task is large, break it into smaller steps. Ask for an outline first, then expand each section. AI performs far better when guided step‑by‑step.
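The outline-first approach can be sketched as two rounds of questions. The `ask` argument stands for whatever chat system you use, and `fake_ask` is an invented stand-in so the example runs without one. Neither is a real API.

```python
def outline_then_expand(ask, topic):
    """Ask for an outline first, then ask for each section separately."""
    outline = ask(f"Write a short outline for a report on {topic}. One heading per line.")
    headings = [line.strip() for line in outline.splitlines() if line.strip()]
    return {
        heading: ask(f"Write one paragraph for the section '{heading}' of a report on {topic}.")
        for heading in headings
    }

def fake_ask(prompt):
    """Stand-in for a real chat system so the sketch runs offline."""
    if "outline" in prompt:
        return "Causes\nEffects\n\nResponses"
    return f"[draft text for: {prompt}]"

sections = outline_then_expand(fake_ask, "climate change")
for heading, text in sections.items():
    print(heading, "->", text)
```

Because each section is requested on its own, you can read the outline and correct it before any long text is produced.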

Clear instructions don’t just improve the output, they keep you in control of the process.

Provide Enough Context

AI performs noticeably better when it has the background information it needs, such as who the audience is, what the situation involves, or what constraints apply.

When context is missing, the system often fills in the gaps with incorrect predictions that will look like guesses, and recent reporting shows how easily this can go wrong. The Guardian found that Google AI Overviews gave misleading health advice because the AI responded without understanding the medical circumstances involved, including a case where it advised pancreatic cancer patients to avoid high fat foods, which experts described as really dangerous. The advice is dangerous because some people with pancreatic cancer are malnourished, and consuming fat can be a nutritionally efficient way to take in energy.

Check the Output Carefully

AI is not a source of truth, it is a generator of plausible answers, so treat every response as a draft, not a verdict.

Read the answer, then ask basic questions: Does this match what you already know? Does it contradict trusted sources? Does anything feel too neat or too extreme?

For factual topics, spot check key claims against reputable outlets or official documentation, especially numbers, names, dates, web links, and legal or medical details.
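A short script can pull out the details most worth checking. This sketch assumes simple patterns for years, other numbers and web links. The `checkable_details` helper is hypothetical, it will miss names and many date formats, and it only lists items. It does not verify them.

```python
import re

def checkable_details(text):
    """List web links, years and other numbers found in `text`."""
    links = [url.rstrip(".,;") for url in re.findall(r"https?://\S+", text)]
    # Remove links first so digits inside URLs are not counted as numbers.
    without_links = re.sub(r"https?://\S+", " ", text)
    years = re.findall(r"\b(?:1[5-9]|20)\d{2}\b", without_links)
    numbers = [
        n for n in re.findall(r"\b\d+(?:\.\d+)?%?", without_links)
        if n not in years
    ]
    return {"links": links, "years": years, "numbers": numbers}

sample = (
    "The bridge opened in 1889 and carries 20% more traffic than planned. "
    "See https://example.com/report."
)
found = checkable_details(sample)
print(found)
```

Each item in the result is a claim to compare against a source you trust.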

For writing tasks, look for invented quotes, fake references, or details that are oddly specific without any support.

If something important hinges on the answer, ask the system to show its reasoning, to list uncertainties, or to offer alternative possibilities.

The core habit is simple: never confuse a confident tone with a reliable answer. Once you see the answer you can ask the AI more questions to check the reliability of that answer. This is especially important if you are going to do something that relies on that answer.

Use AI for the Right Tasks

AI is most effective when the work involves drafting, summarising, organising ideas, exploring options, or speeding up early stage thinking.

AI can turn rough notes into a clean paragraph, reshape a long document into a shorter one, or generate several ways to frame a problem so you can choose the best one.

AI is also useful for outlining reports, comparing approaches, rewriting for different audiences, or helping you see alternatives you might not have considered. These are tasks where speed and structure matter more than perfect accuracy. You can make text accurate later.

AI is far less reliable when the task requires expert judgement, real‑world verification, or precise factual detail, so keep it focused on the parts of the job where it can genuinely help rather than the parts where it can get you into trouble.

Keep in mind that AI is not thinking. AI does not check for truth. It generates plausible text based on its training data.

Avoid Using AI for Judgement or Decisions

AI cannot weigh values, consequences, or ethics, and it cannot understand the human context that sits behind real decisions.

AI can offer options, outline trade offs, or summarise information, but it cannot decide what matters most, what is acceptable, or what is fair. Those choices rely on experience, responsibility, and an understanding of people, none of which an AI possesses.

Use AI to support your thinking, not to replace it. Human judgement must stay in charge, especially when the outcome affects safety, wellbeing, or trust, or has long‑term consequences.

Be Cautious with Personal or Sensitive Information

Treat AI tools the same way you would treat an online form or an email to someone you do not know.

Do not share details that could identify you, expose someone else, or create problems if they were ever seen by the wrong person. This includes financial information, medical records, passwords, private conversations, or anything that involves children, colleagues, or business clients.

Keep the boundary simple. If you would hesitate before typing it into a website, keep it out of an AI prompt. The safest approach is to describe the situation in general terms and remove anything that is not essential to the task. This protects your privacy and prevents sensitive information from being handled in ways you cannot control.
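Removing obvious identifiers before pasting text into a prompt can be partly automated. This is a minimal sketch, assuming two simple patterns for email addresses and phone-like numbers. The `redact` helper and the sample details are invented, and the script is no substitute for reading the text yourself.

```python
import re

# Simple patterns for illustration only. Real personal data takes many
# more forms than these two (names, addresses, account numbers).
EMAIL = re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")
PHONE = re.compile(r"\+?\d[\d\s-]{7,}\d")

def redact(text):
    """Replace email addresses and phone-like numbers with placeholders."""
    text = EMAIL.sub("[email removed]", text)
    return PHONE.sub("[phone removed]", text)

draft = "Reply to jane.doe@example.com or call 0141 496 0000 about the invoice."
print(redact(draft))
```

Names, addresses and the details of the situation itself still need a human read before anything is shared.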

Compare Answers with Reliable Sources

Treat AI output as a starting point, not a final answer, and cross check anything that matters with sources you trust.

This is especially important for facts that are time sensitive, technical, or likely to change. A quick comparison with reputable news outlets, official guidance, or well established reference material can reveal errors that are easy to miss when the writing sounds polished.

This habit is not about distrusting the tool, it is about protecting yourself from mistakes that come from outdated information, missing context, or confident AI guesses. When accuracy matters, a second source is not optional, it is part of the process.

Keep an Eye Out for Gaps or Oddities

A useful habit when reading AI generated answers is to notice when something feels slightly off. This might be an explanation that is too vague, a claim that is oddly specific without support, or a confident statement that does not match what you know.

When you see these signs, pause and ask a follow up question or check the detail elsewhere.

Recent reporting shows how easily small oddities can signal a deeper problem. The Guardian described how a senior European journalist was suspended after using AI tools to summarise material and then publishing quotes that the people involved had never said. The investigation found dozens of invented statements that looked polished and authoritative but were entirely false, and the journalist admitted he had fallen into the trap of trusting text that only sounded right.

Examples like this show why readers should stay alert to gaps, inconsistencies, or moments when an answer feels too neat. These are cues to check the AI's output.

Stay Aware of the Limits of AI

AI does not understand meaning, it has no lived experience, and it cannot draw on intuition or common sense.

AI works by recognising patterns in data and producing text that fits those patterns, not by grasping the reality behind the words. This means it can miss context, overlook nuance, or present something that sounds authoritative without any understanding.

AI cannot feel uncertainty, it cannot judge what is important, and it cannot tell when it has made a mistake. Keeping these limits in mind helps you use the tool for what it is good at and avoid expecting it to behave like a person.

What AI Is (and Isn't)

2026-04-26T00:00:00+00:00

We have all read the articles about our AI future: "AI will take your job".

This article takes a different path to explain AI clearly, simply, and honestly.

A Straightforward Definition of AI

AI software learns patterns from lots of examples. Once it has been exposed to those patterns, it can create new text.

When you ask something like "What is the weather going to do in Glasgow tomorrow?", the AI does not read the sentence the way a human does. Instead, it turns your words into numbers.

Using these numbers, the AI program looks for relationships in the sentence. Words like "weather," "tomorrow," and "Glasgow" stand out because they are the important parts of your question.

Next, the AI uses the data it was trained on (the examples) to statistically evaluate what your question is about. It does not "understand" the way people do, it just recognises patterns it has seen before.

To create an answer, the AI predicts what should come next, one token at a time. A token might be a word, part of a word, or punctuation. The AI chooses the most likely next token based on patterns in its training data.
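Token-by-token prediction can be sketched with a toy model that only counts which word follows which. This is a simplification: the corpus is invented, each word is treated as one token, and real systems use far richer statistics and do not always pick the single most likely token.

```python
from collections import Counter, defaultdict

# A tiny invented "training set". Each word is treated as one token.
corpus = (
    "rain is likely in glasgow tomorrow . "
    "rain is likely in the west . "
    "rain is expected in glasgow tomorrow ."
).split()

# Count which token follows which.
following = defaultdict(Counter)
for current, nxt in zip(corpus, corpus[1:]):
    following[current][nxt] += 1

def generate(start, max_tokens=8):
    """Repeatedly append the most frequent next token seen in training."""
    tokens = [start]
    while len(tokens) < max_tokens and tokens[-1] in following:
        next_token = following[tokens[-1]].most_common(1)[0][0]
        tokens.append(next_token)
        if next_token == ".":
            break
    return " ".join(tokens)

print(generate("rain"))
print(generate("snow"))
```

Starting from "rain", the sketch strings together the most common continuations. Starting from "snow", a word it has never seen, it has nothing to continue with.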

This statistical selection can look like reasoning, but it is really pattern‑matching. If the AI was never trained on weather‑related information, it would not be able to give you a good answer. There would be no tokens on which to base its output.

Because weather changes constantly, the AI system accesses real weather data from an external source. This is how it can give you an accurate, up‑to‑date forecast instead of basing its output on general Glasgow weather.

Finally, the AI program puts everything together: your question, the patterns it has learned, the conversation so far, and the real weather data, to generate the output you see.

But is it Intelligent?

AI might sound intelligent, but it does not have consciousness, intentions, or real understanding. It does not know things or have opinions. All the AI program is doing is recognising patterns in data and using those patterns to produce output.

When an AI responds, it is not thinking or wanting anything; it is just following statistical cues from the data it was previously shown.

AI can be incredibly powerful, but it is still just a tool. It does not think or decide things on its own. It can only work with the patterns and data it has been given.

The value of AI comes from how people choose to use it, not from any independent ability or intention.

When you type a message on your phone and it suggests the next word, your phone is not thinking. The program in your phone is suggesting a good possible next word based on patterns it has seen before. AI works the same way, just on a much larger scale.

AI predicts what could reasonably come next in a sentence, an image, or an answer, using patterns learned from huge amounts of training data. AI can be incredibly helpful, but it is still predicting based on patterns, not understanding the world. Without the huge amounts of data, AI would have no patterns to base an answer on.

Now that we have covered how AI works, here is what it can actually do well.

What AI Is Good At

As AI is programmed to find patterns in huge amounts of data, an AI can easily take long documents and turn them into shorter versions, based on patterns that produce clearer text.

AI is great for drafting emails, rewriting paragraphs, producing variations, or helping with early versions of content.

When the topic is something it has seen many examples of (such as a question about the weather), AI can give fast, reliable answers.

And the vast amount of data AI is trained on means AIs are great at classification, translation, sorting, and extracting key details from text. AIs have seen so many examples that their statistical predictions can make them appear to have vast knowledge. But an AI is only selecting a statistical match.

AI is good at giving options, exploring possible approaches, and speeding up early‑stage work. But AI still needs human judgement to decide whether what has been produced is of any value.

There are also clear limits that are important to understand.

What AI Is Not Good At

AI recognises patterns, not ideas. AI does not understand what you type or what it outputs.

If your question is vague, emotional, or depends on context only humans share, AI often predicts incorrectly. Such a response is the AI program selecting an incorrect prediction based on its statistics.

AI cannot weigh consequences, values, ethics, or trade‑offs. It can only follow patterns in data. As it does not understand in the human sense, AI cannot perform judgement. Judgement requires intent, values, responsibility, and lived experience. AI has none of these.

However, AI can simulate judgement extremely well because it has access to vast patterns of expert reasoning, it can structure arguments, and it can select options based on criteria you give it. But this is not judgement. It is pattern-based statistical selection without understanding.

AI can remix and generate new combinations, but it does not have taste, purpose, or a point of view.

Anything involving physical experience, social cues, or human behaviour is outside its reach. If you say, "My car has a flat tyre," a person knows that the car cannot be driven safely, that to fix it you will need tools and that the fix is inconvenient and messy.

An AI has never changed a tyre. It does not know weight, effort, or danger. It only has access to what people have written about flat tyres.

An AI can describe the steps to fix the flat (as a person has written about this in the past and this writing is in the training data), but AI does not understand the situation.

An AI has no lived experience, so it can miss things a person might notice. If someone says, "I brought a bottle of wine to the dinner," a person knows this is a polite gesture. AI does not know social customs, it only has access to training data about customs written by a person.

Your AI does not know anything

AI can sound confident even when it is completely mistaken, because it does not know what it does not know.

If you ask for restaurant recommendations in a town that does not exist, some AIs may still try to answer, giving you recommendations that cannot be correct.

When an AI lacks information, it cannot feel uncertainty or recognise gaps the way people do, so it simply produces the most plausible‑sounding answer based on the patterns it currently has access to.

An AI might confidently state that Venus has two moons, invent a law that does not exist, or describe an imaginary species as if it were real. Because AI never checks facts or senses its own limits, its pattern‑filling behaviour leads to "hallucinations," where the AI creates details, sources, or events that sound right but are not true.

If the training data is thin, biased, or missing, the output will be unreliable, no matter how polished the output looks.

If you ask an AI about something that barely exists in its training data, such as "What dishes are served at the Spring Feast in Millford Glen?", the AI will not work out that the place or event is fictional.

With nothing solid to draw from, the AI's program uses loose patterns and produces something that only sounds right, like "They usually serve herb stew and blossom cakes." The answer feels plausible, but it is really just the AI making a poor prediction because the information is too thin.

The Biggest Misconceptions About AI

Many people believe AI thinks, understands, or decides in the way a person does, but this is not the case. AI does not grasp meaning, hold values, or judge situations. It only reflects patterns in the material it was trained on.

Another misconception is that AI has reliable knowledge about everything. When information is scarce, it often fills the gaps with predictions that sound believable but are not accurate. AI has access to vast data stores. AI has no knowledge, just data and a program to spot patterns.

People also assume AI is neutral, yet it inherits the biases and assumptions present in its training data. Some imagine AI as a step toward consciousness, but it has no awareness or sense of self. It is a powerful tool, but still a tool, and it must be used with a clear understanding of its limits.

How to Use AI Safely and Effectively

Using AI safely and effectively starts with treating it as a helpful assistant rather than an authority. It works best when you give it clear instructions, specific goals, and enough context to guide the response.

It is important to check the information it provides, especially when accuracy matters, because it can sound confident even when it is mistaken.

AI is strongest when you use it to explore ideas, draft material, summarise information, or speed up routine tasks, while keeping final judgement for yourself.

AI can boost your creativity, improve your productivity, and help you think in new ways, as long as you stay aware of its limits and verify anything that needs to be correct.

What to Keep in Mind About AI

AI recognises patterns but does not understand meaning.
It predicts what should come next based on data it has seen.
It is strong at summarising, drafting, sorting, and exploring ideas.
It struggles with judgement, context, emotions, and real‑world experience.
It can sound confident even when it is wrong.
It works best when you guide it, check its output, and stay in control.

A Simple Mental Model to Remember

Think of AI as a very capable assistant that is excellent at helping you create, explore, and organise ideas, but one that still needs you to guide it and check its work.

AI is powerful but not magical. It recognises patterns but does not understand. You get the best results when you guide it, check its work, and stay in control.

A Beginner's Guide to AI Chatbot Prompting

2026-04-22T00:00:00+00:00


This guide gives beginners a clear, practical foundation for working with AI chatbots. Each section focuses on one skill, why it matters, and how to apply it.

1. What Prompting Is and Why It Matters

Prompting is the skill of giving clear instructions to a chatbot so that you are more likely to get a useful response.

Good prompts reduce confusion and save you time. A poor prompt can waste time as you work your way through an answer that does not hit the spot.

Example:

Vague: "Explain photosynthesis"
Clear: "Explain photosynthesis in simple terms for a 12‑year‑old"

If you try these, you will see that the second one produces a completely different response from the first. It is more direct and easier to read.

2. Start With a Direct Request

A simple, explicit request sets the direction.

Examples:

"Write a short summary of this article"
"Give me three ideas for a birthday message"
"Explain how this code works"

With the short summary prompt, start a new line and paste in the article you are referring to.

3. Add Context to Aim the Response

Context helps the chatbot match your level, purpose, or constraints.

Examples:

"I am new to London, UK. Explain what I can do on a wet Sunday."
"I am preparing for a job interview. Give me sample questions."

London, UK is specified to keep the prompt clear, as there are many places in the world called London. How many? You could ask:

"Give the total number of places in the world called London, no variants. List the names"

4. Specify the Format You Want

Format guides structure and makes the output easier to use.

Examples:

"Give me a bullet‑point list"
"Write a short paragraph"
"Produce a step‑by‑step explanation"

5. Set Clear Constraints

Constraints keep the answer focused and predictable.

Examples:

"Keep it under 150 words"
"Use plain English"
"No jargon"
"Be concise"

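Sections 2 to 5 can be combined into a single prompt. The sketch below shows one possible way to assemble the four parts in Python. The function name build_prompt and its parameters are illustrative assumptions, not part of any chatbot's interface; the result is just text you would paste into a chat window.

```python
def build_prompt(request, context="", output_format="", constraints=()):
    """Join a direct request with optional context, format, and constraints."""
    parts = []
    if context:
        parts.append(context.strip())
    parts.append(request.strip())
    if output_format:
        parts.append(output_format.strip())
    # Each non-empty constraint becomes its own short sentence.
    parts.extend(c.strip() for c in constraints if c.strip())
    # Make sure every part ends with punctuation before joining.
    return " ".join(p if p.endswith((".", "?", "!")) else p + "." for p in parts)


prompt = build_prompt(
    request="Explain what I can do on a wet Sunday",
    context="I am new to London, UK",
    output_format="Give me a bullet-point list",
    constraints=["Keep it under 150 words", "Use plain English"],
)
print(prompt)
```

Keeping the parts separate makes it easy to change one of them, such as the word limit, without rewriting the whole prompt.
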
6. Use Examples to Anchor Tone and Style

Examples show the chatbot what "good" looks like.

Example:

"Write it in the style of this: 'Short, direct, and practical.'"

7. Adjust Over Time Instead of Restarting

Treat the chatbot as a collaborator. Adjust the output rather than rewriting the whole prompt.

Examples:

"Shorten this"
"Make it more formal"
"Add one more example in the first paragraph"

8. Ask for Alternatives When You Need Options

Variations help you compare and choose.

Examples:

"Give me two more options"
"Rewrite this with a friendlier tone"

9. Break Complex Tasks Into Steps

Step‑by‑step prompting keeps large tasks manageable.

AI chatbots work by pattern matching. If your prompt is long, the AI may appear to skip part of what you say because it does not have a strong pattern to match it to.

Example:

"First, outline the structure. Then we will fill in each section."

10. Common Mistakes to Avoid

Being too vague
Asking for everything at once
Forgetting to specify the audience
Not having the AI give examples
Expecting perfect output on the first try

11. Quick Prompt Templates

These templates give you a starting point that you can adapt.

Explain Something

"Explain [topic] to [audience] in [format]. Keep it [constraints]."

"Explain beaches to a 10-year-old in one paragraph. Keep it positive and clear."
"Explain beaches to an adult in one paragraph. Keep it positive and clear."
"Explain beaches."

Rewrite Something

"Rewrite this text to be more [tone]. Keep the meaning the same."

"Give the first line of Pride and Prejudice by Jane Austen."
"Rewrite using corporate speak. Keep the meaning the same but push the buzzwords to 11."

Generate Ideas

"Give me [number] ideas for [goal]. Keep them practical."

"Give me 5 ideas for walking down the sidewalk. Keep them practical."

Troubleshoot

"I am seeing this issue: [a detailed description]. Give me possible causes and simple steps to check."

"I am seeing this issue: my grass is too yellow. Give me possible causes and simple steps to check."

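If you reuse the templates above often, you could store them and fill in the blanks with a few lines of Python. This is a minimal sketch under some assumptions: the square-bracket placeholders are rewritten as Python format fields, the placeholder [a detailed description] is renamed issue, and the names TEMPLATES and fill_template are hypothetical.

```python
TEMPLATES = {
    "explain": "Explain {topic} to {audience} in {format}. Keep it {constraints}.",
    "rewrite": "Rewrite this text to be more {tone}. Keep the meaning the same.",
    "ideas": "Give me {number} ideas for {goal}. Keep them practical.",
    "troubleshoot": (
        "I am seeing this issue: {issue}. "
        "Give me possible causes and simple steps to check."
    ),
}


def fill_template(name, **values):
    """Fill a named template, failing loudly if a placeholder has no value."""
    if name not in TEMPLATES:
        raise ValueError(f"Unknown template: {name}")
    try:
        return TEMPLATES[name].format(**values)
    except KeyError as missing:
        raise ValueError(f"Template '{name}' needs a value for {missing}") from None


print(fill_template(
    "explain",
    topic="beaches",
    audience="a 10-year-old",
    format="one paragraph",
    constraints="positive and clear",
))
```

The error on a missing value is a deliberate choice: a prompt with an unfilled blank is an easy mistake to miss once it has been pasted into a chat.
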
12. Practice Prompts

Use these to build confidence and develop prompting habits.

"Explain how a mortgage works as if I am new to finance."
"Give me three ways to describe my job in a CV. I have pasted my CV."
"Summarise the following paragraph in one sentence."
"Suggest improvements to this email without changing the intent."

How to Evaluate A Company's AI Claims

2025-01-01T00:00:00+00:00

How to Evaluate Claims Made About an AI-based System

Introduction

Artificial intelligence now appears in many areas of daily life. It is used in search engines, writing tools, customer service systems, healthcare applications, and many other services. Many people encounter it without thinking about it, such as when a phone suggests a reply to a message or when an ecommerce website summarises customer feedback about a product.

Public descriptions of systems based in part or whole on AI often highlight ambitious capabilities. Some describe their products as human level, fully autonomous, or capable of replacing expert judgement.

Promotional language and real performance do not always align, which makes it useful to look closely at how such claims are formed.

Understanding the Claim

The first step is to understand what is actually being promised.

Many statements about artificial intelligence are broad or ambiguous, so it is useful to translate them into specific questions. A claim such as "our tool detects fraud" sounds clear, but it raises many questions about what kind of fraud, in what context, and with what level of accuracy.

Many people begin by considering what task the system is meant to perform, under what conditions it is expected to work, how well it performs that task, and what it is being compared against. Once the claim is expressed in concrete terms, it becomes much easier to evaluate.

Looking for Evidence

Claims about performance usually rest on some form of evidence. A credible statement about artificial intelligence is supported by clear information about how the system was tested.

Independent evaluations, published research, recognised benchmarks, and real world trials all provide meaningful support. For example, a reading comprehension benchmark or a driving simulation can show how a system behaves under controlled conditions. By contrast, phrases such as "industry leading accuracy" or "our internal tests show excellent results" offer very little without further detail.

Reliability often depends on who carried out the measurement and how the testing was designed.

Considering the Data

Every artificial intelligence system depends heavily on the data used to train it.

The quality, diversity, and representativeness of that data shape the system’s strengths and weaknesses. A photo classifier trained mostly on daytime images may struggle with night scenes, and a language tool trained mainly on formal writing may find slang or informal messages difficult to interpret.

When assessing a claim, it is worth asking whether the data reflects the real world situations in which the system will be used. Narrow or unrepresentative data can limit how well the system performs in real situations.
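
One way to check whether test data reflects real use is to break accuracy down by condition instead of relying on a single overall figure. The sketch below assumes hypothetical evaluation records, each tagged with a condition such as "day" or "night". The records and numbers are invented purely for illustration and do not describe any real system.

```python
from collections import defaultdict


def accuracy_by_condition(records):
    """Return {condition: fraction correct} for (condition, correct) pairs."""
    totals = defaultdict(int)
    hits = defaultdict(int)
    for condition, correct in records:
        totals[condition] += 1
        hits[condition] += int(correct)
    return {c: hits[c] / totals[c] for c in totals}


# Made-up results: 8 daytime photos (7 right), 2 night photos (0 right).
records = [("day", True)] * 7 + [("day", False)] + [("night", False)] * 2

overall = sum(correct for _, correct in records) / len(records)
print(overall)                         # 0.7
print(accuracy_by_condition(records))  # {'day': 0.875, 'night': 0.0}
```

In this toy case the overall figure looks respectable, while the breakdown shows that the system fails on every night image. A claim that quotes only the overall number would hide that gap.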

Recognising Limitations

All systems have limitations, and responsible companies acknowledge them.

It is helpful to look for information about situations where the system performs poorly, where it may misinterpret inputs, or where it may produce incorrect or misleading results. A voice assistant that mishears a request because of background noise is a simple example of how small changes in context can affect performance.

Balanced descriptions usually include both strengths and known limitations.

Avoiding Human-like Descriptions of AI

Marketing language sometimes presents artificial intelligence in ways that resemble human thinking.

Words such as "understands", "reasons", or "knows" can create an impression that the system possesses abilities it does not have. A system that predicts the next word in a sentence may appear to "understand" the topic, but it is following patterns rather than forming ideas.

A more accurate approach is to focus on what the system actually does, how it processes inputs, how it generates outputs, and how it behaves under different conditions.

Seeking Independent Validation

Independent evaluations often provide a clearer picture of how a system performs.

When researchers, regulators, journalists, or external auditors have examined a system, their findings provide a valuable counterbalance to promotional material.

Real world deployment is equally important. A navigation app may work perfectly in a staged demonstration, but everyday use can involve roadworks, poor signal, or unexpected detours that reveal weaknesses.

Genuine reliability is shown through consistent performance with diverse users and unpredictable inputs.

Considering the Consequences of Error

It is important to consider the consequences of error. Some tasks are low risk, while others involve significant personal, financial, or social impact.

A system used for entertainment can tolerate occasional mistakes. A music recommendation that misses the mark is usually harmless.

A system used for medical advice, financial decisions, or legal interpretation requires far stronger evidence and clear safeguards. A symptom checker that offers an overly confident suggestion illustrates how errors can matter more in high stakes settings.

The impact of errors can vary widely, so the way a system handles mistakes often shapes how it should be used.

The Importance of Transparency

Transparency and accountability are essential qualities.

Companies that provide clear explanations, publish evaluation results, describe limitations, and offer channels for feedback demonstrate a commitment to responsible practice.

Greater transparency makes it easier to understand how a system works and how its results should be interpreted. For example, a tool that explains which factors influenced a recommendation gives users a clearer sense of how to interpret the output.

A Practical Way to Judge a Claim

These themes can be turned into a set of practical questions.

It is useful to ask what is being promised, what evidence supports the promise, who carried out the evaluation, what data was used, what limitations are acknowledged, whether the system has been tested independently, how it performs outside controlled demonstrations, and what the consequences are if it fails.

This is a long list, but systems powered in some way by artificial intelligence are becoming more common and they are having a larger impact on everyday life.

The better placed we all are to evaluate AI-based systems, the better.

If several of these questions cannot be answered, the claim is likely to be overstated.
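
The questions in this section can be kept as a simple checklist. The sketch below is one possible way to record answers and count the gaps. The names QUESTIONS and review_claim are hypothetical, the sample answers are invented, and the threshold of three unanswered questions is an assumption standing in for "several", not a rule from this article.

```python
QUESTIONS = [
    "What is being promised?",
    "What evidence supports the promise?",
    "Who carried out the evaluation?",
    "What data was used?",
    "What limitations are acknowledged?",
    "Has the system been tested independently?",
    "How does it perform outside controlled demonstrations?",
    "What are the consequences if it fails?",
]


def review_claim(answers, threshold=3):
    """List unanswered questions and flag the claim if `threshold` or more remain."""
    unanswered = [q for q in QUESTIONS if not (answers.get(q) or "").strip()]
    return {
        "unanswered": unanswered,
        "possibly_overstated": len(unanswered) >= threshold,
    }


# Invented example: a vendor page that answers only the first two questions.
answers = {
    QUESTIONS[0]: "Detects card fraud in online payments",
    QUESTIONS[1]: "Internal tests only",
}
result = review_claim(answers)
print(len(result["unanswered"]), result["possibly_overstated"])  # 6 True
```

The flag is only a prompt to look closer. A weak answer, such as "internal tests only", still counts as answered here, so the quality of each answer needs a human reader.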

Conclusion

Artificial intelligence is a powerful set of technologies, but it is not magic.

Careful consideration and evaluation make it easier to distinguish genuine progress from exaggerated claims.

Further Reading

Fake legal cases