E5 - Generative AI, ChatGPT and the Stochastic Parrot

The Technology Sounding Board

The Technology Sounding Board is a forum for thinkers, consumers and evangelists of technology in the Enterprise to get together and discuss new ideas...the good, the bad and the ugly!

June 04, 2023 • Michael R. Gilbert • Episode 5

What if I told you that you could unravel the mysteries of generative AI, and ChatGPT in particular, through the eyes of a Stochastic Parrot? Join me, your host Michael R Gilbert, as we embark on an exciting journey into the world of generative AI, exploring its capabilities and contemplating the potential dangers hidden within this powerful technology.

Together, we'll dive deep into the concept of word distance, essential for encoding context and understanding language. Our AI parrot example will guide us through a fascinating exploration of probability frequency and how it influences the AI's responses to prompts. We'll also discuss the implications of generative AI, its ability to create new content never seen before, and the challenges that come with it. By the end of this episode, you'll have a better grasp of generative AI's power and potential pitfalls, as well as a newfound appreciation for our stochastic feathered friend.

Unless you’ve been hiding under a rock for the past year, you’ll have heard of Generative AI, even if only the most popular implementation at the moment - ChatGPT.

Now, people are getting themselves (and their companies) into real trouble because they don’t actually understand what it is that they are playing with…

There was a CNN article just a few days ago, highlighting a lawyer who needed to apologize to the court because 4 of the 5 cases he presented in support of his client were in fact completely fabricated. He had used ChatGPT to do some research, and got some extremely plausible, but completely fictional responses, which he had then used in his work.

Is that happening to you…or your company…now? Are you sure?

Now, I have read articles claiming GenerativeAI is everything from the second coming of the Lord, to the most insidious work of the Devil himself…

...a new reality that will end civilization as we know it.

But what’s the truth here? What can it actually do? What can’t it do? What should, …and shouldn’t… we do with it?

And what the heck is a Stochastic Parrot? Let’s talk about it…

[Intro Music]

Welcome to The Technology Sounding Board, I’m your host, Michael R. Gilbert and in this episode we’re going to talk about GenerativeAI, and specifically ChatGPT. We’re going to cover what it is able to do and where this may be of real benefit to enterprises today.

And we’re going to explain a model that may help you understand what it’s *really* doing and so, hopefully, you’ll be able to see where it is a help and where it could be a real danger.

Ok, so let’s start at the top, what is GenerativeAI? Well, as the name suggests, it’s the use of AI technologies to “generate” content…to create, if you will. We’ll explore whether or not it’s really creating, in the sense that we humans would, when we get to our Parrot discussion, but for now we’ll take it as read that it is.

So… what can it create? Well, pretty much anything that can be created…

Today, there are applications in the space of pictures…photos, paintings, etc - OpenAI, the company that brought us ChatGPT also created DALL-E which creates new images based on the ideas you give it - “paint me a tree growing upside down in a refrigerator on Mars” perhaps? In fact the parrot picture I used as the Thumbnail for this Podcast episode was generated by DALL-E.

There are applications in the space of video, there are certainly applications now that can create new music - even with vocals to them, being “sung” by virtual people who never existed, or in the style of real people you already know - Google has a good example of this today.

How about physical design…sure, applications exist for that too.

Now for this conversation, we’re going to focus on text based applications, simply because it is easier to imagine what is happening in this space, but everything we’re talking about in the realm of text can translate just as easily to these others.

So let’s talk about ChatGPT…breaking down the name for a minute, we see

Chat tells you that it is essentially a chatbot (designed for interactive conversation) and GPT tells you that it uses a technology called GPT - Generative Pre-trained Transformer.

Generative because it “creates” or generates its responses rather than repeating something that’s pre-scripted.

Pre-trained because, well it is…it’s “learned” how language functions and how to “create” new sentences through being trained on a vast corpus of real documents written by real people in the past.

The transformer bit is just a reference to the latest and greatest generation of AI algorithms, which it uses to track “context”…which we will explain later.

ChatGPT is just one specific application and probably the most famous (or perhaps infamous) at this time, but there are a multitude of others already and more every day…Microsoft has recently introduced a version called Copilot which is designed to help you generate new programming code for example…they’ve also included this technology in their Search engine, Bing, to help you find and research ideas faster.

The amount of investment in this space is enormous, and growing fast.

When people initially interact with ChatGPT, the first thing they often say is, “how does it know that?”…they are immediately reacting as if they were talking to a sentient being, something that “knows things”. They aren’t, but it is extraordinarily convincing.

You can ask ChatGPT things like, “How do I turn on capability X in application Y” and it will come back with something like:

“Select settings from the file menu, switch to the tab labeled whatever, hover over the button labeled thingy and a menu will pop up…you are looking for the third option down from the top”. Obviously the “thingys” and “whatevers” are specific to the application you asked about, but you get my point.

And the thing is, it’ll be right about it.

You can ask it to write you a Pac-man style game in javascript, and it’ll spit out code that works!

You can ask it to write you a short story about a taxman riding a wolverine in the style of Shakespeare, and it does!

How does it know all this, how can it do all this?

Well, let’s introduce our Stochastic Parrot and see if we can make all this magic seem a little less frightening (or slightly more frightening when you see the potential for problems, but we’ll get to that).

Ok Michael, what the heck is a Stochastic Parrot?

Well, we all know what a parrot is, right? It’s a small green bird that sits on your shoulder and repeats phrases that it’s heard in the past…“pieces of eight, pieces of eight” for example (and I am assuming that you are a pirate from the early 16th century in this example, obviously).

The word Stochastic here, we are using to mean “randomly, in accordance with a certain probability distribution”…now don’t worry if that doesn’t make much sense to you yet, it’s going to, I promise.

So imagine with me that you are talking to a parrot that will respond to whatever you say with a random set of words. If it was random in the sense that most of us think of (technically, randomly drawn according to a Uniform distribution, i.e., where absolutely every word is as likely to be picked as any other)…

…you might ask it, “What’s the best way to get to the nearest McDonalds?”…and it might reply, “Lieutenant swimming Sirius green”…not that helpful.
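For anyone reading along in the transcript, here is a minimal Python sketch of that unhelpful, uniformly random parrot. The vocabulary is a tiny made-up list and the function name is just for illustration; the point is only that every word is equally likely and the prompt is ignored completely.

```python
import random

# A tiny, made-up vocabulary; a real system would have many thousands of entries.
VOCABULARY = ["Lieutenant", "swimming", "Sirius", "green", "ball", "threw", "the", "coin"]


def uniform_parrot(num_words, rng):
    """Pick each word with equal probability, ignoring the prompt entirely."""
    return " ".join(rng.choice(VOCABULARY) for _ in range(num_words))


rng = random.Random(0)  # fixed seed so the run is repeatable
reply = uniform_parrot(4, rng)
print(reply)
```

Whatever comes out, it has nothing to do with the question that was asked, which is why the parrot needs the constraints that follow.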

So we are going to introduce three constraints that we want our Parrot to use when deciding which words to respond with, and they are:

A Language Model
Word Distance, and
Context

I will explain each of these in turn, with an example, but let’s start with the idea of a language model. We’re going to restrict ourselves to English here, but you can easily see how this translates to any other language, whether a human spoken language, or a computer coding language, or frankly a language used to describe musical notes…

You see, every language has basic rules that it must follow. We are all taught in school the idea of verbs, nouns, pronouns, adverbs, adjectives and so on…and whether we paid attention to our English lessons or not, we eventually figure out that we can’t just string these together in any particular order, we must put them together in a certain way, otherwise we don’t make sense.

Let’s go back to our Parrot and play a simpler game. We’ll give the parrot a prompt, a series of words, and ask the parrot to give us the next word in the series. We can start with

“Jane threw the …”

Now Jane is a Noun, technically a Proper Noun, and it’s naming the subject of the sentence. Threw is a verb, it’s telling us what the subject is doing. The is an “article” here, so what is the word type that must follow?

Well, it is likely to be a Noun, the object that Jane is throwing…

...it could also be an Adjective - like “green” or “small”, that describes the Noun that Jane is throwing, but let’s go with a Noun for now…so what is our Parrot going to say…?

[Dramatic Pause]

I am pausing dramatically here (you can imagine a drum roll if you like) while you listen to your parrot. Perhaps yours said “ball”? So we’d have the sentence, “Jane threw the ball”. Great. But it could have picked any noun right? And not every possible noun would make sense here…

I’m going to take a small diversion at this point, to quote one of my favorite sentences from a great British comedian, Stephen Fry. In a sketch he was delivering to underline just how many different ideas you can express with the English language, he came up with this one:

“Hold the waiter’s nose squarely Susan, or friendly milk will countermand my trousers”.

Every word in that sentence is simple and common. It obeys every grammatical rule in the language, and yet it’s complete nonsense!

So a grammar based language model is necessary, but clearly not sufficient.
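To see why grammar alone falls short, here is a rough Python sketch of a parrot that only knows word types. The word lists and the single sentence pattern are invented for illustration; a real grammar is far richer than this.

```python
import random

# Hypothetical word lists, grouped by part of speech.
WORDS = {
    "article": ["the", "a"],
    "adjective": ["friendly", "green", "small"],
    "noun": ["waiter", "ball", "bulldozer", "nose", "parrot"],
    "verb": ["threw", "countermanded", "held"],
}

# One simple sentence pattern: article, adjective, noun, verb, article, noun.
PATTERN = ["article", "adjective", "noun", "verb", "article", "noun"]


def grammatical_parrot(rng):
    """Fill each slot of the pattern with a random word of the right type."""
    return " ".join(rng.choice(WORDS[part]) for part in PATTERN)


rng = random.Random(1)  # fixed seed so the run is repeatable
for _ in range(3):
    print(grammatical_parrot(rng))
```

Every sentence it produces follows the pattern, yet nothing stops it from pairing words that make no sense together, which is the gap the next two constraints are meant to close.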

Let’s introduce the other two constraints that I talked about, Word Distance and Context. Now context we’ve hinted at already…it’s the idea behind what we’re saying….so far in our example we have a subject, called Jane, who is throwing something. And we are likely inferring some information from this context.

See, we can’t know everything about every possible Jane in the universe, and even if we did, we don’t know which Jane is being referred to here, so we need to translate the word “Jane” into a conceptual Jane in order to decide what this particular Jane might be throwing. And that’s where we are going to lean on Word Distance.

Let me take another small detour to help clarify that. Let’s assume we are reading a movie review and trying to get a sense of whether the reviewer thinks the movie was good or bad. If I used words like “lame”, “dull” or “boring”…these words don’t mean the same thing, but they are relatively close to “bad”. And I am deliberately being a bit hand wavy when I talk about “close to”, but I think we inherently understand the concept, right? And likewise, “fun”, “exciting”, and “dramatic” also don’t share meanings, but they are close to “good”, or at least much closer to “good” than “bad”, ok?

Well that’s the basis of the technique that we’re using to encode the context for what we’ve heard so far with our prompt…word distance rather than the actual word…we kind of have to…because there are trillions of possible word combinations, frillions to quote a word from the urban dictionary, defined to mean a number so large that it is considered to be obscene.

We can’t hope to encode them all with any computer that exists today, so we have to try to encode a much smaller number of ideas and then map the words in terms of distance from those ideas.
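A toy Python sketch of word distance might look like the following. The two-dimensional coordinates are hand-picked purely for illustration and the function names are hypothetical; real systems learn vectors with hundreds of dimensions from their training text, and they often use other distance measures than the straight-line one used here.

```python
import math

# Hand-picked 2-D coordinates, purely illustrative (not learned from any data).
VECTORS = {
    "bad": (-1.0, 0.0),
    "lame": (-0.8, 0.3),
    "dull": (-0.7, -0.4),
    "boring": (-0.9, -0.2),
    "good": (1.0, 0.0),
    "fun": (0.8, 0.4),
    "exciting": (0.9, 0.5),
    "dramatic": (0.5, 0.7),
}


def word_distance(a, b):
    """Straight-line (Euclidean) distance between two words' coordinates."""
    return math.dist(VECTORS[a], VECTORS[b])


def closest_idea(word, ideas=("good", "bad")):
    """Return whichever idea sits nearest to the given word."""
    return min(ideas, key=lambda idea: word_distance(word, idea))


for word in ["lame", "dull", "boring", "fun", "exciting", "dramatic"]:
    print(word, "->", closest_idea(word))
```

With these made-up coordinates, “lame”, “dull” and “boring” land nearest to “bad”, while “fun”, “exciting” and “dramatic” land nearest to “good”, even though no two of the words share a meaning.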

Alright, so let’s go back to our trivial example and put this in practice with our parrot…that might help make sense of it.

We look at the word Jane, and we can see that this has a relatively short word distance from “human” and “female”…(I can hear alarm bells going off already, but stick with me, I’m going to get to that, I promise). We can see now that the word we’re looking for to describe our object, the thing Jane is throwing, should be something that a female human is likely to throw…and ball has a very short word distance to that idea…hence ball is a good fit.

Great, but our Parrot isn’t just going to pick, “the most likely” answer…

...that wouldn’t be very stochastic of it…,

It’s going to work out the probability distribution over all the words it could choose, and then choose one word from the list according to those probabilities…what do we mean by that? Let’s use a slightly different example for a minute…suppose the prompt had been, “He tossed a coin and it landed …”. We can easily see that the vast majority of times the answer is going to be “heads” or “tails”…let’s say either of these answers is equally probable. We can also see that occasionally it might be a preposition, like “on”, “in”, “down”, “under” etc. Think “it landed under the table”, or “on Wednesday” or whatever. Let’s say that 49% of the time the answer would be heads, 49% of the time it would be tails and then all the other answers would be equally probable in the last 2%. So if we give the parrot the same prompt again and again, let’s say 100 times, then we’d expect that about 49 times it would say heads, about 49 times it would say tails and the other 2 times or so it would give us a preposition. That’s the idea of Stochastic…if it has a non-zero probability of occurring, then it will occur…just if it’s improbable, then it won’t occur very often.
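The coin example can be sketched in a few lines of Python. The probabilities are the ones from the example (49% heads, 49% tails, and the last 2% split evenly across four prepositions); the dictionary and function names are made up for illustration, and the 10,000 repetitions are simply a larger sample than the 100 in the example so that the rare words show up more reliably.

```python
import random
from collections import Counter

# Next-word probabilities for the prompt "He tossed a coin and it landed ...".
NEXT_WORD_PROBABILITIES = {
    "heads": 0.49,
    "tails": 0.49,
    "on": 0.005,
    "in": 0.005,
    "down": 0.005,
    "under": 0.005,
}


def stochastic_parrot(rng):
    """Draw one next word at random, weighted by its probability."""
    words = list(NEXT_WORD_PROBABILITIES)
    weights = list(NEXT_WORD_PROBABILITIES.values())
    return rng.choices(words, weights=weights, k=1)[0]


rng = random.Random(42)  # fixed seed so the run is repeatable
counts = Counter(stochastic_parrot(rng) for _ in range(10_000))
print(counts.most_common())
```

Heads and tails should dominate the tally, with each preposition turning up only a handful of times: rare, but not never.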

So back to our Parrot and Jane…did any of your Parrots say “bulldozer”? Why not? It’s possible right, so if enough of you were asking, eventually one of your parrots is going to. Then what?

We have “Jane threw the bulldozer….”, now what’s the next word? Well, the context has changed, so the word distances have changed. There are three word distances that might be very different now, the relationship between Jane and what we recognize Jane as, the definition of “threw” and likewise what we meant by bulldozer.

Let me give you an example…we could continue the sentence with, “into gear”, right? “Jane threw the bulldozer into gear”. We’ve selected a different relationship with the verb “threw” here.

Alternatively, we could have continued with, “and although it was only a small toy, it still hurt when it hit me!”…and we’ve redefined the bulldozer.

Again, we could have continued with, “Jack always thought that Jane was a stupid name for a 30-foot-tall, psychotic android but there it was just the same”…and in this case we’ve redefined Jane - picked a different point in our word distance encoding.
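Here is a heavily simplified Python sketch of how each chosen word feeds back into the context for the next one. The probability table is invented, and it looks only at the last word; a real transformer conditions on the whole prompt, so treat this as an illustration of the loop rather than of the actual mechanism.

```python
import random

# Hypothetical next-word probabilities keyed by the last word only.
# "<end>" is a made-up marker meaning "stop here".
NEXT_WORD = {
    "the": {"ball": 0.95, "bulldozer": 0.05},
    "ball": {"<end>": 1.0},
    "bulldozer": {"into": 0.5, "<end>": 0.5},
    "into": {"gear": 1.0},
    "gear": {"<end>": 1.0},
}


def continue_prompt(prompt, rng, max_words=10):
    """Extend the prompt one word at a time until the end marker is drawn."""
    words = prompt.split()
    for _ in range(max_words):
        options = NEXT_WORD[words[-1]]
        choice = rng.choices(list(options), weights=list(options.values()), k=1)[0]
        if choice == "<end>":
            break
        words.append(choice)  # the new word becomes part of the context
    return " ".join(words)


rng = random.Random(7)  # fixed seed so the run is repeatable
for _ in range(5):
    print(continue_prompt("Jane threw the", rng))
```

Most runs end with “ball”, but once “bulldozer” has been drawn the table offers a different set of follow-on words, which is the sense in which an unlikely choice changes everything that comes after it.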

Now that we better understand what our Stochastic Parrot is doing, there are a number of really important points to pick up on.

First, GenerativeAI systems do exactly what they say on the tin…they generate new content…if it’s possible, it’s possible they will come up with it. Just because they do, doesn’t mean that it is real, or could exist in the universe as we know it - they aren’t telling you the truth, any more than any human artist is, they are telling you one possible truth within a very wide set of possible worlds. The fact that they “usually” come up with something that people would “usually” say is simply a factor of probability and the distribution of ideas that they found in the material upon which they were trained. If you don’t know the material used, you don’t know what they are capable of saying.

Similarly, the word-distances that it has encoded are those that it learned from the corpus upon which it was trained. Now this has an obvious bear trap waiting for you, doesn’t it? For example, if I give you the word King and ask you for the closest pronoun, you are likely to respond with “he” and you aren’t going to get into much trouble with that…likewise, Queen maps closely to “she”. But what if I ask you for the pronoun closest to “Doctor”, “Scientist”, or “Homemaker”…and very quickly we can see the problem.

The bias that is actually in our literature will be propagated into the Parrot’s responses and that can get you into difficulty very quickly indeed. To be fair, this is well understood by the makers of ChatGPT and they have done a lot to counter these biases, but whereas these ones are very obvious, what about those that you can’t see so easily? What about biases of understanding, politics, religion…what about simple cases of common beliefs which are just plain wrong…all of these things are encoded from the training corpus and you can’t see them…however, they will influence the answers your parrot gives.
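A small Python sketch can show how that happens. The corpus below is a handful of made-up sentences, deliberately skewed, standing in for real training text; the counting approach and function names are assumptions for illustration and not how ChatGPT actually stores its word distances.

```python
from collections import Counter, defaultdict

# A deliberately skewed, made-up corpus standing in for real training text.
TOY_CORPUS = [
    "the doctor said he would call",
    "the doctor said he was late",
    "the doctor said she was ready",
    "the homemaker said she was busy",
    "the homemaker said she would cook",
    "the scientist said he had results",
]
PRONOUNS = {"he", "she"}
ROLES = {"doctor", "scientist", "homemaker"}


def pronoun_counts(corpus):
    """Count how often each pronoun shares a sentence with each role."""
    counts = defaultdict(Counter)
    for sentence in corpus:
        words = sentence.split()
        for role in ROLES.intersection(words):
            for word in words:
                if word in PRONOUNS:
                    counts[role][word] += 1
    return counts


def closest_pronoun(role, counts):
    """The pronoun seen most often alongside the role in the corpus."""
    return counts[role].most_common(1)[0][0]


counts = pronoun_counts(TOY_CORPUS)
for role in sorted(ROLES):
    print(role, "->", closest_pronoun(role, counts))
```

Nothing in the code mentions gender roles, yet with this corpus it pairs “doctor” and “scientist” with “he” and “homemaker” with “she”, simply because that is what the text it counted happened to say.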

One last warning before I sum up where we are, and that’s around the nature of creation. Humans don’t work this way…they don’t start with a set of words and try to find one that fits on the end. They tend to work top down by establishing an idea that they want to convey, and then searching for the words that embody that idea best.

When doing so, they may borrow from other sources, but they know that there are limits to what can be borrowed before it starts to be just plain plagiarism. Even so, cases of copyright violations hit the courts all the time…most recently perhaps, Ed Sheeran’s case, being sued for allegedly copying Marvin Gaye’s work, “Let’s Get It On”, because it was “too similar”. Ed won this case, but many artists lose, and what “too close” means is just not well defined.

If this occurs in cases where clearly the intention wasn’t to copy based on “closeness”, how often will it occur when your algorithm, your Parrot, is actually designed to do exactly this? What about if you have it create something in the style of someone else, or sing, or just talk with a voice similar to someone else…what if you have it create a picture that is in the style of a famous artist…even if this isn’t just plain creepy, how will you know that what your parrot has “created” for you is unique enough in the eyes of the law…when you get hit with the lawsuit? How will you defend against a copyright accusation, when you don’t even know what it learned from in the first place?

This is an area of law that is going to need to be figured out, and quickly, if this isn’t going to be a mess that everyone gets caught in.

So the takeaways here…

GenerativeAI systems, like ChatGPT, don’t “know” anything, they are designed to take a seed idea and extend it by pulling related ideas into the response according to the probability that these ideas would have been connected in the material upon which they were trained.
If they can relate different ideas, they will…just because a relationship between two ideas is tenuous at best doesn’t mean they won’t connect the ideas, just that this particular connection will occur very rarely…but it might occur on your watch! That’s part of what makes them ‘creative’ in the first place.
There is a real danger that they will reflect all sorts of biases that you are not aware of and not ok with.
Copyright concerns may become a real thorn - only time will tell how much so.

Can you do anything to limit the risks and still benefit from the technology…yes, I think so. Firstly, don’t think of this as an intelligent entity that can actually create something unique out of nowhere, treat it as a tool that can amplify human skill.

It could, perhaps, be used to generate boilerplate text or first drafts that are then later edited by humans…this last layer, the human in the circuit if you will, is a sanity check and a bias prevention mechanism. That might not be enough to skirt copyright infringements though, you might want to lean into more technology - like the plagiarism tools schools already use - to help you there.

You could use it the other way around too…run copy that humans create through an AI layer and let it flag things it thinks are questionable and offer edits where it thinks things could be different…much less likelihood of running into copyright issues here, just don’t let it have the final say.

As for is this a game changing invention that will bring ruin to most human endeavors…yes to the first part and no to the second.

It reminds me of another “killer application” that first appeared in 1979 - VisiCalc. For those who haven’t been around quite that long, VisiCalc was the first ever spreadsheet program and it was written for the Apple II home computer - before the PC was even invented. It was such a game changer that almost from the day it was released, the Apple II started being bought by enterprises at a much higher rate than home users ever did…indeed you could say that it was the spark that led to IBM getting into the PC game in the first place.

It later spawned Lotus 1-2-3, and then ultimately Microsoft Excel, without which most enterprises today just couldn’t function. It completely automated and transformed the way we treated tabulated numbers, and the word at the time was that it would completely decimate the Accounting industry…it would empower everyone to be their own analysts and remove the need for such expensive roles.

Instead, it has become an incredibly powerful tool that every Accountant and every Analyst uses to do their job…roles which have become more efficient, more effective and more numerous than ever.

We haven’t figured out what to do with GenerativeAI yet, our Parrots are just babies learning to talk, but I don’t believe it will destroy or replace the creative roles in our world, I believe it will become the equivalent of Excel for naturally creative types, and help more and more of us to add an element of creativity to what we do.

[Outro Music]

Well, with all that said, I hope you’ve got a better insight into GenerativeAI now, and that you’ve had fun with your Stochastic Parrot. As always, the transcript can be found on the website at www.thetechnologysoundingboard.com and if you get a chance, stop by and leave us a review or a comment. Until next time…

Michael R. Gilbert

Host