AB-730 Episode 1: What is Generative AI?
In this first episode of our AB-730 masterclass series, Simon and Lachlan break down the absolute fundamentals of Generative AI. Learn how Large Language Models (LLMs) interpret our world through 'tokens', how Microsoft Copilot keeps your company's data locked down inside a secure enterprise boundary, and why managing 'hallucinations' requires a human in the loop. Stick around until the end for a play-along trivia quiz to test what you've learned!
Chapter 1
Demystifying LLMs and Tokenization
Simon Carver
Welcome to the show, everybody! I'm Simon Carver, here with Lachlan Reed. And Lachlan, I want to start with a concept that completely flipped how I view AI. When you type a sentence into a large language model, it doesn't actually see words. It chops your text up into small chunks called tokens, which are often pieces of words rather than whole words.
Lachlan Reed
Right, tokens! Instead of reading the word "unbelievable" as one unit, the model might slice it into "un," "believ," and "able." On average, a token is about four characters of English text, or roughly three-quarters of a word.
Simon Carver
Exactly! And it translates those pieces into numbers. So when we talk about these massive LLMs, they aren't reading for meaning the way you and I do in our backyard sheds. They're actually running complex math to calculate probability.
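To make the "chop it up, then turn it into numbers" idea concrete, here is a minimal Python sketch. It assumes a tiny hand-written vocabulary (`TOY_VOCAB` is hypothetical; real models learn vocabularies of tens of thousands of pieces from data) and uses a simple greedy longest-match split, which is a stand-in for illustration and not the algorithm any particular model uses.

```python
# Hypothetical toy vocabulary, invented for this illustration only.
TOY_VOCAB = ["un", "believ", "able", "token", "s"]
PIECE_TO_ID = {piece: i for i, piece in enumerate(TOY_VOCAB)}
UNKNOWN_ID = -1  # assumed placeholder ID for pieces outside the toy vocabulary


def tokenize(text):
    """Greedy longest-match split of text into known vocabulary pieces."""
    pieces = []
    i = 0
    while i < len(text):
        # Try the longest candidate first, shrinking until a piece matches.
        for end in range(len(text), i, -1):
            if text[i:end] in PIECE_TO_ID:
                pieces.append(text[i:end])
                i = end
                break
        else:
            # No known piece starts here: emit the single character as-is.
            pieces.append(text[i])
            i += 1
    return pieces


def encode(text):
    """Map each piece to its integer ID, which is what the model works with."""
    return [PIECE_TO_ID.get(piece, UNKNOWN_ID) for piece in tokenize(text)]


def estimate_tokens(text):
    """Rule of thumb from the episode: about four characters per token."""
    return max(1, round(len(text) / 4))


print(tokenize("unbelievable"))         # ['un', 'believ', 'able']
print(encode("unbelievable"))           # [0, 1, 2]
print(estimate_tokens("unbelievable"))  # 3
```

The pieces join back into the original word, and the four-characters-per-token estimate happens to agree with the toy split here. On real text it is only a rough guide.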
Lachlan Reed
Yeah, they're essentially super-powered prediction engines. If I say "a cup of hot," the model calculates that the next token is highly likely to be "coffee" or "tea," and very unlikely to be "motorcycle." It's predicting the most logical next token based on patterns in the massive datasets it was trained on.
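The prediction step can be sketched in a few lines of Python. The scores below are invented for illustration (a real model computes them from billions of learned parameters), and `softmax` is the standard function for turning raw scores into probabilities that sum to one.

```python
import math


def softmax(scores):
    """Convert raw scores into probabilities that sum to 1."""
    highest = max(scores.values())
    # Subtracting the highest score keeps the exponentials numerically stable.
    exps = {tok: math.exp(s - highest) for tok, s in scores.items()}
    total = sum(exps.values())
    return {tok: e / total for tok, e in exps.items()}


# Hypothetical scores for the token that follows "a cup of hot".
toy_scores = {"coffee": 4.0, "tea": 3.5, "chocolate": 2.0, "motorcycle": -3.0}

probs = softmax(toy_scores)
best = max(probs, key=probs.get)

for token, p in sorted(probs.items(), key=lambda item: -item[1]):
    print(f"{token:>10}: {p:.4f}")
print("most likely next token:", best)
```

With these made-up scores, "coffee" and "tea" take most of the probability and "motorcycle" ends up well under one percent, which mirrors the example in the conversation.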
Simon Carver
Which represents a massive shift from how we used to use computers. Traditional search engines mostly match your keywords against a giant digital index and hand back links. Generative AI, though, acts more like a reasoning partner. It synthesizes and generates fresh, context-aware responses on the fly.
Lachlan Reed
It's the difference between asking a library catalog to find books on a topic, and sitting down with a brilliant assistant who has read all those books and can summarize the key arguments for you right there.
Chapter 2
The Enterprise Security Boundary
Simon Carver
But that brings up a massive concern for businesses. If this assistant has read everything, is it going to take my private company data and blab about it to someone else?
Lachlan Reed
Nah, and that's the big worry, isn't it? People think if they put a sensitive financial report into Copilot, it's going to leak onto the public internet. But Microsoft 365 Copilot uses a strict enterprise security boundary. Your data stays entirely within your company's tenant. It's locked down.
Simon Carver
Right, it never crosses over to train the public models. And the way it pulls in your specific business data securely is through a process called Retrieval-Augmented Generation, or RAG.
Lachlan Reed
RAG is brilliant. Think of the LLM as a super-smart brain that knows how to write beautifully, but doesn't know your specific business. When you ask it a question, RAG goes into your secure company files, retrieves the most relevant content, and hands it to the LLM to use as a reference sheet.
Simon Carver
So the model is only using your data in real-time to answer your specific prompt. Once that session is done, it doesn't retain that information to train its core weights. Your intellectual property remains entirely yours.
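The retrieve-then-hand-over flow described in this chapter can be sketched as a toy Python example. Everything here is an assumption made for illustration: the documents in `DOCS`, the word-overlap scoring, and the prompt wording are all invented, and this is not how Microsoft 365 Copilot is implemented. Production systems typically rely on search indexes, embeddings, and permission checks instead of simple word matching.

```python
import re

# Hypothetical in-memory "company files", invented for this sketch.
DOCS = {
    "q3_report.txt": "Q3 revenue grew 12 percent driven by cloud subscriptions.",
    "travel_policy.txt": "Employees must book travel through the approved portal.",
    "onboarding.txt": "New hires receive a laptop and badge on day one.",
}


def words(text):
    """Lowercase the text and return its set of alphanumeric words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def retrieve(question, docs):
    """Return the name of the document sharing the most words with the question."""
    question_words = words(question)
    return max(docs, key=lambda name: len(question_words & words(docs[name])))


def build_prompt(question, docs):
    """Attach the retrieved document to the question as a reference sheet."""
    name = retrieve(question, docs)
    return (
        "Use only this reference to answer.\n"
        f"Reference ({name}): {docs[name]}\n"
        f"Question: {question}"
    )


print(build_prompt("How much did revenue grow in Q3?", DOCS))
```

The key design point is that the document travels inside the prompt for this one request. Nothing in the sketch changes the model itself, which matches the point that the data is used at answer time and not for training.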
Chapter 3
Hallucinations and the 'Human in the Loop'
Lachlan Reed
Which is a massive relief, but we still have to talk about the elephant in the room: hallucinations. Because these models run on math and probability, not actual "truth" databases, they can sometimes confidently state things that are completely made up.
Simon Carver
Oh, absolutely. Since it's always predicting the next most likely token, it cares about sounding plausible, not necessarily being accurate. It doesn't lie out of malice; it's just following the math to make a coherent-sounding sentence, even if the facts are wrong.
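A small Python simulation shows why "plausible" and "accurate" can come apart. The probabilities below are invented for illustration and do not come from any real model. The point is only that sampling from a probability distribution has no built-in notion of truth.

```python
import random

# Hypothetical probabilities for the token after "The capital of Australia is".
# "Sydney" is wrong but plausible, so it still receives a sizeable share.
next_token_probs = {"Canberra": 0.60, "Sydney": 0.35, "Melbourne": 0.05}


def sample_answers(probs, n, seed=0):
    """Draw n next-token samples according to the given probabilities."""
    rng = random.Random(seed)  # fixed seed so repeated runs match
    tokens = list(probs)
    weights = [probs[t] for t in tokens]
    return rng.choices(tokens, weights=weights, k=n)


answers = sample_answers(next_token_probs, 1000)
wrong = sum(1 for a in answers if a != "Canberra")
print(f"{wrong} of {len(answers)} sampled answers were wrong but fluent")
```

Under these assumed numbers, roughly four in ten samples name the wrong city, and each would read as confidently as a correct one.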
Lachlan Reed
That's why we always need a "human in the loop." You can't just copy-paste a summary of a contract and sign off on it. You have to be the editor. Use the AI to get you from a blank page to a first draft, but never let it make final, authoritative decisions without a human double-checking the facts.
Simon Carver
Exactly. Think of it as a brilliant intern. They're incredibly fast and creative, but you still need to review their work before you send it to the CEO.
Chapter 4
Interactive Quiz - What is Generative AI?
Lachlan Reed
Alright, let's put this knowledge to the test. We've got a quick three-question quiz for everyone listening. Simon, are you ready to play along too?
Simon Carver
I am absolutely ready. Let's do this.
Lachlan Reed
Question one: How do large language models process text? Is it A, word-by-word; B, through numerical tokens; or C, by scanning entire sentences at once? We'll pause for a second so you can think.
Simon Carver
I'm going with B, tokens.
Lachlan Reed
Spot on, Simon! And for everyone listening, remember, tokens are the chunks of text that get translated into numbers. Question two: True or False -- Microsoft 365 Copilot uses your private business data to train its public models. Think about it.
Simon Carver
That has to be False.
Lachlan Reed
Correct! It's absolutely False. The enterprise boundary keeps your data locked in your tenant. Alright, final question: Why do LLMs sometimes hallucinate facts? Is it A, they have a software bug; B, they run on probabilistic math rather than a database of facts; or C, they are pulling from outdated search indexes? Take a moment.
Simon Carver
It's B. They are probability engines, not databases.
Lachlan Reed
Brilliant! Three out of three, mate. You passed with flying colors.
Simon Carver
Well, thank goodness for that! And that is our quick take on the fundamentals of generative AI from the AB-730 course. Thanks for listening, everyone!
Lachlan Reed
See ya next time!
