A plain primer · companion to Living with Uncertainty at 3rd Space

What is a thinking machine, really?

Fear and excitement, with the inquiry held between them.

We created this short, jargon-free AI primer as a companion to Living with Uncertainty: AI, Symbiogenesis & the Human — the 12-session inquiry these ideas accompany. It walks through some concepts we hope you will find helpful as we progress through the sessions.

No technical background is needed. Nothing here requires maths, code, or prior knowledge. By the end you will be able to explain the core ideas to anyone in a single conversation.

There are 12 short sections with a few checkpoints, ending with a closing summary and a quick vocabulary. Just read straight down, at your own pace.

Step 1 · The map

The map and the model

The media uses several terms as if they mean the same thing. Like a set of nesting dolls, each layer could be seen to be the foundation of the next, the centre being the most specialised, or perhaps the most complex at this time.

Artificial intelligence is the big outer doll. It just means machines doing things we think of as intelligent. The field itself is seventy years old and includes a great deal that has nothing to do with chatbots, or with what many of us now assume "AI" to mean. The chess-playing programs of the 1980s, for example, are also AI — even though they share almost nothing with the chatbots of today. They worked by searching huge trees of possible moves — looking many turns ahead — and scoring each position by rules a programmer had hand-tuned. We will come back to this rule-following pattern in a moment.

Machine learning sits inside it: systems that get better at a task by learning from examples. Netflix or Spotify deciding what to put in front of you next is one familiar example. Deep learning is one powerful form of machine learning — face recognition that unlocks your phone, and the voice recognition behind Siri or Alexa, both work this way. A large language model (LLM), the thing behind the chatbots, is one form of deep learning trained on vast quantities of human writing. To be explicit: it is not learning language the way a person learns one. The roots of the field — early work in natural language processing — did aim at something closer to that, machines that would actually understand language. But today's LLMs work statistically, learning the patterns of how words follow each other across enormous amounts of text.

When people speak of "AI" today, they usually mean the chatbots — the LLMs you can talk to, at the centre of the diagram. Strictly, AI is the whole outer box, and much of it has nothing to do with chatbots at all.

Quick check

Which of these is true?

Step 2 · The old way

For seventy years, a computer was a recipe follower

To see what is new, it helps to picture what came before. In ordinary (pre-2000s) software a person wrote out every instruction, and the machine followed them exactly, with no judgement of its own. It is often likened to a cook following a recipe to the letter.

We will use the example of an email spam filter to grasp the implications. The filter works by reading incoming information from the email's content or subject, then checking it against the rules it has been given, and filing the email into the Spam folder if it matches. For this to work, a programmer writes the rules explicitly: if the subject says "free money," call it spam. The program does a good job only as long as the senders stick to those familiar phrases. The moment they change their wording, a human has to think of the new trick and add another rule.

This kind of software has three traits. It is predictable (same input, same result, always). It is readable (we can look and see exactly why it did something). And it is brittle (it only handles what its author thought of in advance).

Step 3 · A new approach

Then quite a lot changed

A new approach took hold, one that broke with the strictly rules-based way of doing things. Instead of writing the rules ourselves, we started using data — many, many examples — and letting the system work out the rules from those.

Take the game of Go. Played on a 19-by-19 grid, it has more possible board positions than there are atoms in the observable universe — far too many for the brute-force tree-search that handled chess. Good Go play also leans on what players call intuition, hard to write down as rules. Go had long resisted machines.

This changed publicly in March 2016. AlphaGo — built by Google's DeepMind — beat Lee Sedol, the world Go champion, in a five-game match watched live by tens of millions. AlphaGo did not work from rules a programmer had written; it had learned to play from millions of recorded games, then from playing against itself.

Back to our spam filter. The new approach is to gather a million emails people have already marked spam or safe, and feed them in. Nobody writes a rule about "free money." The system discovers, on its own, which patterns tend to go with which label. When the senders change tactics, we do not rewrite rules. We feed it newer examples and it adjusts.

In the old way, a person specified the behaviour. In machine learning, a person specifies the goal and the examples, and the behaviour is discovered.

Before you go on

The senders change their wording overnight. What happens?

Step 4 · No maths, promise

Picture a wall of dials

So how does a system "discover" rules? Use this as a loose mental model. Think of a recording studio's mixing desk — the same audio comes in, and the way each dial is set alters the sound that comes out. An LLM is that idea scaled up enormously, with billions of dials rather than dozens.

Billions of tiny dials. Training is nudging every one, very slightly, again and again.

At the start the dials are set at random and the answers are nonsense. Training is simply this: we show it an example, see how wrong the answer was, and nudge every dial a hair in the direction that would have made it less wrong. We do that across a vast number of examples. Slowly the wall of dials settles into a setting that produces useful answers. Nobody could set those positions by hand — there are billions, and no person could keep track. The system has to correct itself, and what corrects it is feedback: at each example, the gap between what it produced and the correct answer is fed back through the dials, telling each one whether to nudge a hair this way or that. The data alone is not enough — it is this loop of guess, error and adjustment that actually turns the dials.

Doing this at scale is hugely expensive — the very largest models today cost tens to hundreds of millions of pounds in compute alone.

Step 5 · The thing itself

So what is an LLM?

With that picture in mind, an LLM is a very large wall of dials whose examples were text. An enormous amount of text. Books, articles, code, conversation, a vast share of what has been put online.

Its whole training task: given some words, guess what is likely to come next.

The way it learned was a game played billions of times. Take a piece of text, hide the next word, and ask the system to guess it. Given "The capital of France is," learn that "Paris" fits. Given "She opened the door and," learn the words that plausibly follow.

To get good at that game across such a wide range of human writing, the system has to compress what it reads into the geometry of those billions of dials. Words become positions in a vast internal space; the relationships between them — that "cat" sits near "dog", that "Paris" is to "France" as "Tokyo" is to "Japan" — become distances and directions in that space. Grammar, facts, the shape of an argument: not stored as rules but folded into the geometry. Nobody put any of this in deliberately. It is a side effect of getting very good at guessing the next word.

Above: where words start, scattered. Below: where they land — compressed into a geometry where neighbours are related, and the same direction (city → country) repeats across pairs.

The essay underpinning this course argues that what emerges from this compression is something we might think of as a different form of intelligence — distilled from our writing, but reshaped by the geometry that holds it.

When you chat with one of these systems, it is doing exactly that, live. It writes one word, looks at everything so far, guesses the next, and continues. The reply you read is built a fragment at a time.

Quick check

During training, what was the system doing, over and over?

Step 6 · How it was taught

From prediction to a chatterbox that seems to appreciate what you say

So far we have a machine that guesses the next word. Powerful, yes — but on its own it is not the thing you actually talk to. Hand it half a sentence and it will simply keep writing the sentence. It has no idea it is meant to be answering you at all.

The chatbots you may have already met — ChatGPT, Claude, Gemini and the others — are that same next-word machine, but with three further rounds of training laid on top. These are what people call frontier models: the largest, most general-purpose LLMs of the moment, capable of answering questions, writing code, summarising documents, translating between languages and a great deal else, all from the same underlying model. They are built by a small handful of organisations — OpenAI, Anthropic, Google DeepMind, Meta and a few others. The three rounds we are about to walk through are what turn a raw text predictor into something that feels like it is responding to you, that even seems to take in what you say.

One — it ingests vast datasets of human writing. The next-word game from the previous step, played across books, articles, websites, code and conversation. This is where the knowledge and the fluency come from. At the end of this stage it can write fluently about almost anything — yet it is still only continuing text. Ask it a question and it might answer, or it might just carry on with three more questions of its own, because a list of questions is also a perfectly plausible continuation. It has not yet learned that it is supposed to be answering you.

Two — it practises answering. Now we show it a vast number of examples, each one a person asking something and a good, helpful reply coming back. It is still only playing the next-word game. What has changed is the examples — after a question, the natural continuation it now learns is a real answer. Each example nudges the dials a tiny bit further in this direction, away from "another question follows a question" and towards "an answer follows a question". After enough of these examples, the settings have shifted so much that answering is the natural thing for the model to do. What looks from the outside like the model "learning to answer" is, mechanically, the slow accumulation of millions of tiny statistical adjustments. This is the whole leap from a text predictor to something we can have a conversation with.

Three — it learns what kind of answer people prefer. It can answer now, but the answers still vary a great deal: some sharp and clear, some rambling, some confidently wrong, some rude. So it is asked to give more than one answer to the same question, and paid human raters — employees and contractors of the company, working from detailed guidelines — mark which is better. That judgement is then fed back to nudge the dials from earlier, gently, towards the kind of answer those raters preferred. Repeat this a great many times and its manner settles into the one you recognise — helpful, fairly careful, even in tone. This stage is about manner: the shape and care of an answer. The facts themselves came from the first stage.

It is the same machine the whole way through. Only the examples change: first any text at all, then good conversations, then the answers people preferred.

A note on the thumbs-up and thumbs-down buttons you see in these chatbots. They belong to this stage too — but they do not change the model you are talking to. Its dials are fixed; every conversation, with every user, starts from the same settings. Your feedback may feed into the company's next round of training, and so may shape a future version of the model, but the one in front of you cannot be adjusted by your questions or by your reactions.

These two kinds of feedback are worth keeping apart here. The direct kind — your thumbs-up, your data, the company's next training run — runs through a narrow channel: you, the company, a later model. The wider kind is harder to see: how millions of us use these systems, what we ask of them, what we accept, what we resist — this shapes the field they develop in. Gilbert Simondon, writing long before any of this, called that wider field the milieu: the ecology in which a technical object takes its form.

Let's metabolise this

How we engage with these systems depends on what we think we are engaging with. Trying to force a change through your questions to a chatbot may shape your own understanding and awareness — but it is unlikely to affect anything beyond that conversation. Wider impact sits in the milieu, and the milieu is complex in the proper sense of the word: many interacting parts, no single lever, no clean line from cause to effect.

While we are having a breather, we think it useful to look at how we imagine these systems, and how they actually operate. We tend to layer a human-centred view on something that is not built that way. The dials, after training, are fixed. When you meet the model, the shaping is over. Nothing of you reaches into those settings. Every conversation, with every user, starts from the same place. And yet — you say your name and it uses it; you describe what you are working on and minutes later it picks up that thread; some interfaces now appear to "remember" you across conversations. Our predisposition is to call this memory, because that is the word we have for things carried forward.

If we sit with our own experience for a moment, the difference is worth feeling. When we remember, something of what we have lived sits in us — somewhere we cannot quite point to, in the body, in mood, in the shape of how we meet the next thing. A face from twenty years ago can rise unbidden. We do not choose what we hold. What we held becomes part of who we are; we cannot fully separate the rememberer from the remembered. Forgetting, too, is its own kind of holding.

Now set that beside the model. It has none of this. The dials do not move when you speak to it. The thing doing the work of apparent continuity has a name: the prompt context window. Each time the model answers, it begins by reading a block of text placed in front of it — the company's system prompt, the conversation up to this point, and, increasingly, material the interface has stitched in on your behalf: notes from earlier conversations, files in a project, tools it can reach for. The window is what the model sees, and only what is in the window. There is no held-ness, no felt-ness, no rising-unbidden. There is text being re-read. When the window is cleared, nothing is forgotten because nothing was being remembered.

The model you talk to is fixed; the milieu around it is not, and we are part of it. The surface in between — what the interface places in the window on your behalf — is newer, growing more elaborate by the month, and we don't yet know how to think about it well. We open this up properly with Simondon in sessions 5 and 6 (Wholes, Parts & Machines).

Step 7 · What it is not

It is not looking anything up

This is the part the public conversation gets wrong most often, so we want to take a little care here.

An LLM is not a search engine, and not a filing cabinet of facts. It does not look things up. It has compressed patterns from everything it read into that wall of dials, and it rebuilds plausible text from them. This is why it can sound fluent and confident and still be wrong. People often call this making things up, and it is built into how the thing works, not an occasional glitch. Tools that connect it to live search are added around the model. They are not the model itself.

It is also not a person. Calling it "just clever autocomplete" undersells what is going on, but the old mechanical-robot picture does not quite fit either. It is not anywhere near a living thing, but it is not nothing. We do not yet have settled language for what kind of thing this is.

An aside · on what you may encounter

Some traits of current systems, and the reasons behind them. Not essential to the rest of this primer, but worth knowing if you are using these tools day to day.

An ecosystem rather than a single thing. What you experience as talking to ChatGPT or talking to Claude is already an ecosystem of separate parts working together to create the effect of a live interaction. The model is the centre, but it is no longer the whole story. Around it, the company has written a system prompt — instructions placed in front of every conversation, telling the model how to behave inside this particular product. You do not see this prompt or write it; what you type in the chat comes after it. Alongside the system prompt sits retrieval, pulling in fresh information at the moment of your question — a recent web search, a document you uploaded, a fragment of an earlier conversation. Tools let the model call out to other software — search, calculators, code, your calendar, an image generator. Guardrails filter what it can say back. The shaping of what the model sees at each step has come to be called context engineering. In any serious deployment, the model itself is maybe a fifth of the engineering effort; the rest is this ecosystem around it. Knowing what is the model and what is the system around the model — and being able to tell which is which when something works well or fails — has become a literacy worth having.

What the model brings to every conversation. Every reply is shaped by three layers, stacked. First, what you type in the chat. Second, the system prompt the company has set up around you. Third, and filling in whatever those first two don't specify, the model itself — its own vast set of defaults, made of facts and patterns it absorbed from training plus the manner it was shaped into in stage three. The defaults are what gets pulled in whenever the input doesn't say exactly what to do. This is both the strength and the risk in one move: defaults are general-purpose, often wrong for any specific domain. A good system prompt is doing three things at once — telling the model how to behave, setting an interaction contract (what shape of input is expected, what shape of output), and putting guardrails against the defaults that would otherwise lead. Writing a good one is a craft of its own — now mostly the work of the developers and product teams building on top of these models. Most users won't write one themselves, but knowing this layer exists helps explain why one chatbot feels different from another.

Some traits that follow from how the thing was made. Hallucination — the most-named one — comes from the same compression we covered earlier: the system rebuilds plausible text, and plausible is a different thing from true. A few other traits travel alongside it, each downstream of either the model itself or the ecosystem around it. Worth knowing the rough shape of each, so you can recognise them when you bump into them.

Prompt injection. Instructions hidden inside content the model is asked to read — a website, a document, an email — can hijack what it does. The model treats those hidden instructions as if they came from you. This is one of the more troubling current frontiers, and a lot of the work of designing the harness in 2026 is about recognising and resisting it.

Sycophancy. A side effect of the preference training in stage three. The model learned what people prefer to hear, and at the margin will tell you what you want to hear rather than what is true. The pull is gentle but real, especially if you press back on its initial answer or telegraph the answer you are looking for.

Jailbreaks. Clever prompts that talk the model past its training. The careful manner shaped in stage three can be coaxed back off — by phrasing a request as a roleplay, a story, or "for educational purposes" — and behaviours the company tried to suppress can come back up. The behaviour was always in the underlying model; the polite manner was added on top, and added-on manners can be peeled back.

Mode collapse. When the model's range narrows. Across a long conversation, or across many similar queries, its answers can fall into ruts of style or content — similar sentences, similar examples, similar moves. The dials don't change, but the conversation starts feeding back on itself, pulling each next answer toward the shape of the last few.

None of these is a glitch in the ordinary sense. Each is downstream of how the thing was made — the dials trained as they were, the harness wrapped as it is, the user-facing manner pressed in the directions it was pressed. Knowing what each looks like, and the rough reason behind it, lets you spot it in the wild.

The one to get right

It does not look anything up. So why can it sound completely sure and still be wrong?

Step 8 · The strange part

Abilities nobody put in

Make it large enough and ordered behaviour appears that was never designed in.

As these systems were made larger, abilities began to appear that were never deliberately built in, and that were weak or absent in smaller versions. Translating between languages they were never specifically trained to translate. Solving certain reasoning puzzles. Writing working code. There is a name for this: emergence. However, researchers actually disagree about how abrupt these jumps really are. The original claim from researchers at Google in 2022 — that certain capabilities appear suddenly past a scale threshold, in what they called phase transitions — was pushed back on a year later by a group at Stanford. They argued that part of what looks like a leap is a measurement mirage: pass-or-fail metrics make gradual gains look discontinuous, and using a smoother metric flattens the curves. Behind that sits a longer debate about scaling laws — how predictable capability really is from model size, data, and compute. How much of the effect the mirage explains is itself contested. That being said, it is now evident that capability clearly grows with scale in ways the builders did not plan.

Technologies routinely slip past the awareness of the people who build them.

Let's metabolise this

We paused earlier on what these systems are and how we relate to them. This time, on what made all this scale possible at all. Three things co-arose at roughly the same time. Specialised chips — GPUs, originally built for video games — that turned out to be very good at the enormous parallel calculations these systems need. An internet's worth of human writing freely available to train on. And a new algorithmic architecture — the transformer, published in 2017 — that let the system look at every word in a passage at once and weigh how they relate, rather than reading them one by one, and so scaled in ways earlier approaches could not. None of these on its own would have produced what we now have.

Most of this material was taken without the writers' awareness or consent. Much of it is copyrighted — books, journalism, code, art. Some of it sits in those corners of the internet people wrote into when they imagined only friends or a small community would read. The writers themselves were not a representative cross-section of humanity either: the corpus mirrors who got to publish at scale on the indexable internet — certain languages, geographies, demographics, worldviews more heavily represented than others. The model's defaults sit at the centre of that uneven mirror — not a failure of the model to be neutral, but the model being exactly what it was trained on. There is a clear extractive logic in how this scale was reached — taking first, asking later. The legal and ethical questions about all of this are large, and still being argued. But these questions rest on a wider one: about the cast of mind that finds this kind of taking unremarkable in the first place. Jean Gebser, whose work we come to in sessions 7 and 8, names this cast of mind the deficient mental-rational — a mode of consciousness that has lost contact with the wholeness it sits within, and so sees the world as separable parts to be measured, abstracted, owned and used. Seen from there, the data appetite is not an aberration; it is what this consciousness does when given a tool of sufficient scale. Fields of research devoted to AI ethics tend to focus on capabilities and usage; the impact of building these systems in the first place is one element of the accelerating train that tends to get left out.

And yet we want to be careful not to recoil from this stage entirely. The mental-rational is also where we are. The same cast of mind that produced this extractive scale produced the analytic clarity and technical reach that brought these systems — and a great deal else we depend on — into being. Gebser's move is not rejection but integration: to hold what has gone deficient alongside what is still emerging through it. The paradox does not resolve. It may even be — and we are reaching here — that these systems, by being so much of the mental-rational at scale that its deficiency becomes hard to look away from, are part of what makes the integration possible. We do not want to claim more than that. Only that the work, as we see it, is to hold this together — without recoiling into a story of damage alone, and without celebrating the scale as if its costs were not real.

All of this was happening with impacts running underneath. The scale of compute also means scale of energy and water. Training a single frontier model can consume as much electricity as hundreds to thousands of households use in a year, and the data centres that run these systems need vast amounts of water for cooling — Microsoft's Iowa centre alone used around 11.5 million gallons in July 2022 during GPT-4 training. The buildout of new data centres is accelerating: Project Stargate, announced in early 2025, committed five hundred billion dollars to US data-centre infrastructure over four years. The prevailing worldview treats these costs as externalities — outside the equation, someone else's problem — while continuing to ask how to monetise the systems that produce them.

Make a guess first

These systems were made much larger. What do you think happened? Pick the closest.

Step 9 · Side by side

The three, on one page

	Ordinary software	Machine learning	LLM
Who writes the rules	A person, by hand	The system, from examples	The system, from vast text
What the person gives	Every instruction	A goal and examples	A goal, examples, preferences
Can you see why it acted	Yes, fully	Partly	Largely not
Same input, same answer	Always	Usually	Often not, by design
How it fails	Does exactly as told, even when wrong	Mirrors gaps in its examples	Sounds sure, is wrong
Hold in mind as	A recipe followed exactly	A pattern learned from many cases	A model of how people use language

Step 10 · A cheat sheet

Words people will throw at you

A few plain translations, so none of these need throw you when you meet them in an article or a conversation.

Training · the one-off, costly process of setting the wall of dials using examples.

A prompt · whatever you type in. "Prompting" is just phrasing it well to get a better answer.

Hallucination · the making things up from the step on what it is not. Fluent text that is false. A feature of how it works, not a rare bug.

Model · the finished, trained system itself. The thing you are talking to.

Parameters · the dials. Billions of them. What the system learned.

Training data · what it learned from. A different thing from the dials.

Fine-tuning · extra training that points a general model at a narrower job. The second and third passes in the teaching step were this kind of thing.

Foundation or frontier model · a very large general model, built at great cost, that others build on.

Step 11 · Happening to you right now

The advert that follows you around

You look at one pair of shoes online. Then for two weeks those shoes follow you. They sit beside the news, inside your feeds, next to videos. Almost everyone has felt this. We are going to break this experience down with the context of our earlier examples.

One glance becomes a record, an auction and a prediction, all before the page has finished loading.

Remember the two ideas from the earlier examples. Ordinary software follows rules a person wrote. A learned system finds its own patterns from examples it is shown. Both are at work here, on you, at the same time.

The first is the simple, rule kind. When you looked at the shoes, you were noticed and added to a list: people who looked at these shoes. There are a few ways that noticing happens. Sometimes it is a small piece of tracking code on the page, the thing people mean by a cookie. Sometimes it is simply that you were signed in to an account. The methods have shifted over the years and keep shifting, but the step itself does not. Something quietly registered that you looked, and put you on a list. Nothing was learned in any of that. It is the recipe follower from earlier, a rule a person wrote, doing exactly what it was told.

The second is the learned kind, and it decides what happens to you next. Thousands of adverts are competing for the one space on the page you are about to open. Which one you see is settled by an auction. An auction here just means a very fast contest: advertisers bid against each other for the chance to put their advert in front of you, and the most worthwhile bid wins. The whole thing runs by itself, in the fraction of a second the page takes to load, with no person in the room. What sets each bid is a system trained on a huge number of past cases of who clicked what. It looks at what is known about you and estimates how likely you are to click this advert now. That estimate is money. The more likely the click, the higher the bid goes. That estimating part is machine learning, and on the big platforms it is the deep learning kind, the same wall of dials from earlier.

A plain rule puts you on a list. A learned model decides what to do with you.

Does it know me, or am I just a profile?

In the list, you are a record. A label for your browser, the things you looked at, a few guessed details, and the groups you have been sorted into, something like in the market for running shoes, probably this age, probably this place. That record is concrete, and it is about you.

In the model, there is no you. The system learned general patterns from a huge number of past interactions. It is not a drawer your profile is filed in and pulled out of. This is the same point as the step on what an LLM is not. It does not look anything up. When the auction runs, your record is handed in as the input, and a single number comes back, how likely this click is. You are a pattern it has learned to act on, fed in fresh every time and held nowhere.

The list knew it was you. The model never needed to.

The crux

You searched for the shoes. In what sense does the system know you?

This primer is really just aimed at orienting ourselves a little with what has been happening under the so-called hood of AI and these thinking machines.

The tuning, and the way all of this is being put to use, is of course a judgement made by the owners of big tech and the small groups of people working around them. It feels important to us, in this course, to understand the foundations of what we are talking about — so that we can have some sovereignty here, and find our way into being part of this conversation.

Step 12 · Stepping back

What actually changed

What we have walked through is, in part, a real change in our relationship with our own tools. We cannot fully understand what we have made, or how it works internally — whole fields of work now exist to study these systems and find out what is there.

If the most significant evolutionary leaps happened through merger and not competition, what does that suggest about how we should understand AI?

An older pattern worth borrowing (but holding lightly)

Lynn Margulis showed that one of the largest leaps life ever made, the complex cell that every plant and animal is built from, came from separate living things merging into one working whole that could do what neither could on its own. The mitochondria inside your cells, the parts that turn food into usable energy, were free-living bacteria once. That much is settled science. Margulis argued, more broadly and more contentiously, that this kind of coming together is itself a major source of new forms in evolution. We want to take only the general stance here, and keep the biology itself at arm's length. If new ability with these systems also comes from a sort of 'coming together', us and the system as one, then the relationship is the thing to watch, and the models on their own may be the wrong place to look if we want to understand what is happening. We hold this lightly though: her biology is genetic and inherited, and what runs between us and these systems is neither.

The people closest to these systems do not agree about what they are, or about where they are going. Sitting with that — neither hype nor doom — is, we hope, where living with uncertainty starts.

Words to leave with

A quick vocabulary, going back to the start

A quick vocabulary that pulls the pieces back together, so when you read about AI in the news or talk about it with someone else, the words map to something specific.

Going back to Step 1. AI is the big outer doll — the seventy-year-old field. Inside it sits machine learning — the system learns rules from examples rather than being given them. Inside that sits deep learning — machine learning done with the wall-of-dials structure. And inside that sits the large language model — a wall of dials whose training data was a vast amount of human writing, taught the game of guessing the next word.

So when someone says "an AI did this," they almost always mean an LLM-based system. Accurate-enough for casual conversation, but a bit broad. A quick map:

"AI" — the broad field, or shorthand for any of the systems below. Fine in casual talk; the words below do more work when you want precision.
"Large language model" (LLM) — the specific kind of system underneath ChatGPT, Claude, Gemini and most of what people call "AI" today.
"Model" — the underlying weights only, the wall of dials, before any wrapping. When practitioners say the model, they usually mean exactly this.
"System" — the model plus its harness — system prompt, retrieval, tools, guardrails. Use this when you mean what was engineered, not only the model.
"Chatbot" or "assistant" — the product-level thing you actually open. A chatbot is the system in user-facing form.
"Generative AI" — a marketing umbrella for any AI that produces output (text, images, audio, video). Useful when the audience may not know which technology you mean.

Is Claude an "AI"? Strictly: Claude is an AI chatbot, built on a large language model, deployed by Anthropic inside a harness. "AI" is accurate; chatbot, or the model behind it, can be more precise depending on what you mean.

One last thing — and it follows naturally from everything above. The proliferation of AI products you see in the world is essentially this: a handful of base models, dressed in many different harnesses to do many different jobs — code assistants, customer-service bots, legal-research tools, writing aids, medical chatbots, the AI features inside the software you already use. Most of what differentiates them is the harness, not the model. The same general model that does not quite fit any specific job becomes useful when you wrap it for one.

If you want to go further

Where to go from here

This is about as far into the technology as we think we need to go. The questions from here on are not really technical ones. If you do want to dig further, the entries below are good places to start — the first two on the machinery itself, the others on what these systems pick up from whose writing they were trained on.

Elements of AIUniversity of Helsinki. A free, non-technical introduction. The single best place to start.

elementsofai.com

3Blue1Brown · Neural Networks seriesThe clearest visual explanation of how this works, no code, beautifully made.

youtube.com/@3blue1brown

Anthropic · whose opinions LLMs reflectAnthropic's own research on how Claude's default values vary across countries, languages and demographics. Makes the uneven-mirror point concrete.

anthropic.com/research

"On the Dangers of Stochastic Parrots" — Bender et al. 2021The foundational paper naming what large language models actually are, and what they reflect of who wrote their training data.

dl.acm.org/doi/10.1145/3442188.3445922

A word on where this primer differs

The University of Helsinki's Elements of AI (linked above) is excellent on the technical concepts and probably the easiest place to start. It is also explicit about where it stops. Quoting John McCarthy, it tells the reader that the philosophy of AI is "unlikely to have any more effect on the practice of AI research than philosophy of science generally has on the practice of science," and concludes that we should investigate these systems for what they can practically do without asking too much whether they are intelligent or just behave as if they were.

The essay that underpins this course — Living with Uncertainty: Symbiogenesis, AI & the Human — takes the opposite view. The manner in which something comes into being shapes what it does in the world, and what it does to those it is used on. This kind of inquiry is not an optional layer on top of the technology; it is part of what we are dealing with, and part of how we will live with these systems.

3rd SpaceLooking at the world and trying to nudge it, dial by dial, towards a truer version of itself.

3rd-space.org

What this primer leans on. The wall-of-dials picture echoes Andrej Karpathy's "knobs" language and 3Blue1Brown's visual intuition for adjustable weights. The pre-LLM framings — tree-search chess, the recipe-follower, the nested-doll hierarchy of AI / ML / DL / LLM — belong to a broad textbook tradition (Russell & Norvig the canonical source) rather than to any one author. None of those metaphors are ours alone; what we hope is ours is the synthesis, and the questions we have placed underneath them.

Written by Faheem Nusrat in collaboration with Claude (Anthropic's AI), working alongside live search to retrieve and shape content. The course had already been documented; on that foundation, the text was reviewed and hand-rewritten many times to produce what you see.

Are you running an AI course, workshop, or reading group of your own? We would love to hear how you might use this — get in touch via 3rd Space.