You can get AIs to say almost anything you want


I think LLMs like ChatGPT and Claude are really useful, but it's also important to realize that you can get them to say almost anything you want - without necessarily even realizing that you're manipulating them toward a particular conclusion!

I saw a blogger skeptical of "LLMs make people go crazy" stories claim that it's hard to get ChatGPT to take crazy positions, sharing an example of how it would refuse to even consider 9/11 conspiracy theories. I thought that I could do a better job of coaxing it into going against its programming, and on my first try I got ChatGPT to start slipping into a conspiratorial tone on that topic _within two messages_.

I could do that because I know how to get LLMs to say what I want, and I was consciously manipulating ChatGPT in a certain direction. But I think that there are a lot of people who are _unconsciously_ manipulating LLMs to say specific things while thinking they are talking to a source of objective truth.

More behind the link.