Explains what AI hallucinations are, why they happen, and how to detect them by asking the same question twice.
Transcript:
Hi, a quick video today in a series of videos covering terminology for artificial intelligence. When an AI chatbot gives us the answer to our question, it's important to understand that the answer is based on a few key things: the data on which the AI was trained, its limitations when attempting to generate new information, assumptions made by the model, and in many cases a lack of common-sense reasoning. That last one's important because many new models are attempting to resolve this reasoning issue right now. When an AI gives us back a made-up response, we call that a hallucination.

Let me give you an example. What's the world record for crossing the English Channel on foot? Now in this case, I'm using a model from last year with limited training data, and the result is that this is extremely challenging due to the cold water and strong currents. Regardless, someone apparently did it in 1994 and it only took them 13 and a half hours. Now of course this is clearly incorrect. If I ask the same question again of a more current reasoning model, you can see that some common sense has been applied. In this case, it's saying that you can't cross the English Channel on foot, and here are the ways that people typically get across the channel.

If you're chatting with an AI and you suspect that the answer might not be accurate, one of the best ways to check is just to ask the same question again. You might get the same result, but if it's hallucinating, you'll often get a completely new answer.
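The "ask the same question twice" tip can be sketched in code as a simple consistency check: collect two answers to the same prompt and measure how similar they are. This is a minimal illustration, not anything the video's chatbots actually run; the example answers and the 0.8 similarity threshold are assumptions for demonstration.

```python
import difflib

def looks_consistent(answer_a: str, answer_b: str, threshold: float = 0.8) -> bool:
    """Compare two answers to the same question.

    Low similarity between the two answers suggests the model may be
    hallucinating. The threshold is an illustrative choice, not a standard.
    """
    similarity = difflib.SequenceMatcher(None, answer_a, answer_b).ratio()
    return similarity >= threshold

# Two hypothetical answers to "What's the record for crossing the
# English Channel on foot?" from the same model:
a = "Someone crossed in 1994; it took 13 and a half hours."
b = "You cannot cross the English Channel on foot; it is open sea."

print(looks_consistent(a, a))  # identical answers -> True
print(looks_consistent(a, b))  # divergent answers -> False, possible hallucination
```

In practice you would swap the hard-coded strings for two separate calls to whatever chatbot you're testing; the comparison step stays the same.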
Provides an overview of Claude as a ChatGPT alternative, showcasing features like writing styles, form prototyping, and data visualization.
Transcript:
Today I wanted to show you around my language model of choice, which is Claude. You've no doubt played with ChatGPT. Claude is an alternative put together by an organization called Anthropic, and most recently they released Claude version 3.7 Sonnet. The names of these language models are all a bit odd, but this is a combined reasoning and generative model. So that basically means it'll do the same sort of things that ChatGPT will do, combined with the sort of things that DeepSeek is doing in terms of reasoning. So it thinks its way through problems.

There are a lot of reasons why I like this tool. Firstly, it's got writing styles. So whenever it's generating content, you can tell it what type of content you'd like to generate, and you can even apply your own writing style, which I'll show you another day.

There are some amazingly useful things, though, that Claude lets you do really easily. So here's one quick example. I was trying to prototype a form for a customer's website and I described the form that I needed. And in real time, Claude was able to actually build out that form for me so I can see what it might look like, even down to the point of having conditional logic here. Claude also works really well with images and screenshots. So for example, this was a problem I was having on a website I was trying to resolve. So I uploaded a picture of the screenshot, asked where I should start with this problem, and it gave me some information as to where I should start looking.

One of the other things Claude's really good at is building out mini programs. So in this case, I had some data from a spreadsheet that I copied and pasted in, and I said, graph these numbers and extrapolate through the next six months. And Claude produced a nice visual for me here that allows me to see the actual data versus the projected data, and it's dropped it onto a graph so I've got it available to look at.
The other great thing about Claude is that it synchronizes between the desktop app, the website and mobile, so you can take all of this with you too. So if you're looking for an alternative to ChatGPT, try Claude.
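The "graph these numbers and extrapolate through the next six months" example above can be sketched as a simple least-squares projection. The video doesn't show how Claude computes its projection, so this is an assumed method (a straight-line fit), and the monthly figures are made up for illustration.

```python
def extrapolate(values, months_ahead=6):
    """Fit a least-squares line to monthly values and project it forward.

    A rough sketch of the 'graph and extrapolate' step; the actual
    method Claude used isn't shown in the video.
    """
    n = len(values)
    xs = range(n)
    mean_x = sum(xs) / n
    mean_y = sum(values) / n
    # Ordinary least-squares slope and intercept
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, values))
             / sum((x - mean_x) ** 2 for x in xs))
    intercept = mean_y - slope * mean_x
    # Continue the fitted line for the next months_ahead points
    return [slope * (n + i) + intercept for i in range(months_ahead)]

monthly_sales = [100, 110, 120, 130, 140, 150]  # hypothetical spreadsheet data
print(extrapolate(monthly_sales))  # continues the +10/month trend
```

To recreate the visual from the video, you could plot `monthly_sales` and the returned projection as two series on the same chart (for example with matplotlib) so actual and projected data are distinguishable.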