Kaj Sotala

Monday, July 21, 2025, 3:14 PM

You can get AIs to say almost anything you want

I think LLMs like ChatGPT and Claude are really useful, but it's also important to realize that you can get them to say almost anything you want - without necessarily even realizing that you're manipulating them toward a particular conclusion!

I saw a blogger who was skeptical of "LLMs make people go crazy" stories claim that it's hard to get ChatGPT to take crazy positions, and share an example of how it would refuse to even consider 9/11 conspiracy theories. I thought that I could do a better job of coaxing it into going against its programming, and on my first try I got ChatGPT to start slipping into a conspiratorial tone on that topic _within two messages_.

I could do that because I know how to get LLMs to say the things I want, and I was consciously manipulating ChatGPT in a certain direction. But I think that there are a lot of people who are _unconsciously_ manipulating LLMs to say specific things while thinking they are talking to a source of objective truth.

You can get AIs to say almost anything you want: And that also tells us something about humans

Kaj Sotala (Kaj’s Substack)
