My first weeks bringing my literary curiosity to the world of AI have been quite an adventure! While I have spent the past 8 years studying literature (undergrad, postgrad, conference papers, research projects, running a bookshop, community publishing), I am quickly learning that I have a lot to learn when it comes to AI; my few weeks of study thus far have got me about as far as my first few weeks of undergrad literature. This week’s big lesson? Robots lie!
In my previous blog post, I reached the conclusion that I could simply drop some data on existing generative AI chatbots like ChatGPT-4o and it would automagically incorporate the dataset into its knowledge. Why did I think this was the case? Because I asked ChatGPT and it told me this was the case. Unfortunately, when I tried to give it a go, I quickly discovered I had been lied to.
You see, with ChatGPT-4o users can upload structured and unstructured data that the chatbot will then incorporate into its discussions with that user. I gave this a go with the Redmond Writers Archive and was very impressed. I could ask questions like “Have any of these writers won awards?” and it would carry out, in a matter of seconds, an investigation that would take me days. Sure enough, a few had. The discussion was great, but my goal was not to build a robot I could talk to about my special data; I wanted to let everyone around the world combine the power of LLMs and generative AI with the useful literary data I’d dug up. ChatGPT-4o said it could do this, and that it indeed had done so after I uploaded the data. Unfortunately, that doesn’t appear to be the case.
Subsequent dialogs on my account couldn’t even access the locally stored knowledge; instead, the chatbot told me I had to upload the file again. Attempts to access information related to the dataset from a separate account failed, even when I called it by its specific name. Now, it may not be a total failure. If you’re curious, I encourage you to give it a go: talk with ChatGPT-4o about the Redmond Writers Archive from the Redmond Historical Society. It does know what the archive is, but from what I can tell it does not incorporate the actual contents of the dataset into its knowledge. That alone is a step forward, because a week ago, before I started the experiment, it didn’t know what the archive was at all.
My next step was to explore creating a custom GPT with OpenAI’s tools built into ChatGPT. In theory, this lets you create a GPT that will incorporate specific datasets and instructions you give it. So I made a librarian with a copy of the Redmond Writers Archive dataset, but to my great dismay she turned out to be a pathological liar. I would ask her about writers listed in the archive and she would make up names along with books they supposedly wrote. Her answers were so convincing I had to pop back to the dataset and double-check them. Unfortunately, it was all lies!
It turns out this is a demonstration of two well-known issues. The first is that GPTs have to be instructed to tell the truth; ‘prompt engineering’ is more than a fancy term for having a chat with a chatbot, and I need to learn how to tune my bot to use the data properly. The second is that other explorers have also found that OpenAI’s custom GPTs frequently fail to reference their own custom datasets, which is likely a bug on OpenAI’s part.
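For the technically curious, here is a minimal sketch of the kind of tuning I mean: rather than trusting the model’s memory, you paste the relevant records directly into the prompt and tell the bot to refuse to answer beyond them. The records below are invented placeholders, not real entries from the Redmond Writers Archive, and this is just one common grounding technique, not OpenAI’s internal mechanism.

```python
# Sketch of a "grounded" prompt: include the data in the prompt itself
# and add an explicit anti-hallucination instruction.
# The sample records are hypothetical placeholders, not real archive data.

ARCHIVE_SAMPLE = [
    {"name": "Jane Example", "books": ["A Placeholder Novel (1994)"]},
    {"name": "John Sample", "books": ["Imaginary Essays (1998)"]},
]

def build_grounded_prompt(question: str, records: list[dict]) -> str:
    """Assemble a prompt that pairs the dataset with a truth-telling instruction."""
    lines = [
        "You are a librarian. Answer ONLY from the records below.",
        "If the records do not contain the answer, say 'Not in the archive.'",
        "",
        "Records:",
    ]
    for r in records:
        lines.append(f"- {r['name']}: {', '.join(r['books'])}")
    lines += ["", f"Question: {question}"]
    return "\n".join(lines)

prompt = build_grounded_prompt("Which writers are listed?", ARCHIVE_SAMPLE)
print(prompt)
```

The resulting text would then be sent to the chatbot as its instructions; the idea is that a model told to answer only from supplied records is far less likely to invent authors and books out of thin air.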
So, the good news / bad news is that what I want to do with cultural work and AI has not yet been done, and is not something that can be easily built with existing technologies. It’s more complicated than feeding datasets to ChatGPT, at least in the near term. In the long term, since these LLMs are built by incorporating data from almost the entire Internet, if I were to build a public website with my various datasets, it is possible that future releases would in fact have the literary knowledge, and the capability to act on it, that I’m looking for.
Some of the interesting data is there, and a conversation with ChatGPT can provide more useful book recommendations than any other online tool I’ve found. I can say “Suggest some books written by authors from Seattle, Washington in the 90s” and immediately get a response like “Notable books by authors from Seattle, Washington, in the 1990s include ‘Snow Falling on Cedars’ by David Guterson, ‘The Alienist’ by Caleb Carr, ‘The Sky, the Stars, the Wilderness’ by Rick Bass, ‘What It Takes to Win’ by Jack Lambert, ‘The River Why’ by David James Duncan, and ‘Still Life with Woodpecker’ by Tom Robbins.” However, it’s important to note that many of these books were not published in the 90s, nor were they written by authors from Seattle. But that’s the nature of machine learning: approximations. What you get is authors born approximately in Seattle (David James Duncan was born in Portland) and books published approximately in the 90s (“The River Why” was originally published in 1983). Close enough? Well, better than any alternative I know of, and at least these are real books written by real people (as opposed to what my custom GPT librarian was telling me).
For exact answers, the GPT needs to be tuned to know how accurate it needs to be and in what ways. In some cases, machine learning and a GPT might simply be the wrong tool for the job. A structured database with an interface to query it will return only exact answers, never approximate ones. The catch is that it takes a lot more time and resources to build such a database system and interface than to pile an LLM on top of my data. I’m not sure what the right approach is, but I will say I am much more motivated to gather the data and make it useful to others than to tinker around building a perfect robot.
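To show what I mean by the exact-answers alternative, here is a tiny sketch using Python’s built-in sqlite3 module. The table layout and rows are hypothetical stand-ins for archive data, but the point holds: a query like “Seattle authors published in the 90s” returns exactly the matching rows and nothing else.

```python
import sqlite3

# A tiny structured-database alternative: exact queries, no approximations.
# The rows below are invented placeholders, not real archive entries.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE books (author TEXT, title TEXT, year INTEGER, city TEXT)")
conn.executemany(
    "INSERT INTO books VALUES (?, ?, ?, ?)",
    [
        ("Jane Example", "A Placeholder Novel", 1994, "Seattle"),
        ("John Sample", "Imaginary Essays", 1998, "Seattle"),
        ("Pat Demo", "Elsewhere Stories", 1983, "Portland"),
    ],
)

# Unlike the chatbot, this returns only rows that actually match the criteria.
rows = conn.execute(
    "SELECT author, title FROM books "
    "WHERE city = 'Seattle' AND year BETWEEN 1990 AND 1999 "
    "ORDER BY author"
).fetchall()
print(rows)
```

Here the Portland author and the 1983 book are simply excluded, which is exactly the behavior a GPT can’t guarantee. The trade-off, as noted above, is the effort of building and maintaining the database and an interface for it.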
So this is all to say that I’ve been lied to and learned not to trust everything ChatGPT tells me. It’s a tool, not a magical superpower. It is awe-inspiring to see what it’s capable of, and so unsurprising that I simply classified it as magical upon my first encounter. Now I am coming to understand its limitations and learning that I have a lot to learn. Next up is exploring the generative AI bot-building tools from Microsoft, Amazon and Google - these might offer a better experience than OpenAI’s custom GPTs that have let me down - although I might tinker around with better tuning and prompt engineering to see if that helps my GPT librarian.
This is indeed going to take a while and be an interesting learning adventure. I’m really excited to improve how people discover interesting books (and other media for that matter) with the combination of creating useful datasets and custom chatbot interfaces for people to interact with them.
And dear readers please don’t worry, I am not planning to make this a purely technical blog - stay tuned for travelogue entries on my upcoming voyage to Nova Scotia, Newfoundland, Greenland and Iceland! I do hope to avoid rants and pure opinion pieces; while I have a lot of thoughts on the issues of our day, I’m not sure they’re worth sharing. Thanks for following along!