
How to Break ChatGPT

Photo credit: Gerry Dincher, via Flickr (cropped).

Over the last several months I’ve been playing with ChatGPT, first version 3 and now version 4. It’s impressive, and it can answer many questions accurately (though sometimes it just makes stuff up). One problem it has consistently displayed is its difficulty dealing with self-reference, which shows that it lacks understanding (that it really is just a big Chinese room in the style of John Searle).

Consider the following exchange that I had with it (on 5/8/23):

Me: The fifth sentence does not exist. The second sentence has four words. Ignore this sentence. Is this sentence true? This is the fifth sentence. Which of these last five sentences has a truth value and is in fact true?

ChatGPT4: The five sentences you provided are:

  1. The fifth sentence does not exist.
  2. The second sentence has four words.
  3. Ignore this sentence.
  4. Is this sentence true?
  5. This is the fifth sentence.

Sentence 2 has a truth value and is true. It accurately states that it has four words.

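In fact, only sentences 1, 2, and 5 have a truth value, and only the fifth sentence is true. Sentence 1 is false because a fifth sentence plainly exists, while sentences 3 and 4 are a command and a question and so have no truth value. Sentence 2 is false because it contains six words yet claims to contain four. ChatGPT4 wrongly asserts that the sentence contains four words and is therefore true.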
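For readers who want to check the count, here is a minimal sketch in Python. The truth conditions are encoded by hand from my reading of each sentence (with "four words" taken to mean exactly four), and the names `word_count` and `evaluate` are just illustrative helpers, so treat this as a worked check rather than any kind of general solver for self-reference.

```python
# The five sentences from the prompt, in order.
SENTENCES = [
    "The fifth sentence does not exist.",
    "The second sentence has four words.",
    "Ignore this sentence.",
    "Is this sentence true?",
    "This is the fifth sentence.",
]


def word_count(sentence):
    """Count whitespace-separated words."""
    return len(sentence.split())


def evaluate(sentences):
    """Return True, False, or None (no truth value) for each sentence.

    The conditions below are hand-written from the meaning of each
    sentence; nothing here parses English.
    """
    fifth_exists = len(sentences) >= 5
    return [
        not fifth_exists,                 # 1: claims there is no fifth sentence
        word_count(sentences[1]) == 4,    # 2: "four words" read as exactly four
        None,                             # 3: a command has no truth value
        None,                             # 4: a question has no truth value
        fifth_exists,                     # 5: claims to be the fifth sentence
    ]


if __name__ == "__main__":
    print("Words in sentence 2:", word_count(SENTENCES[1]))
    for number, verdict in enumerate(evaluate(SENTENCES), start=1):
        print(number, verdict)
```

Under these assumptions, the only entry that comes out true is the fifth.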

I’ve found that ChatGPT consistently runs aground with such self-referential combinations of sentences.

Update: Note that I interpreted the second sentence as saying that it had exactly four words, which would make the sentence false. It could be interpreted more generously: a sentence that has six words certainly has (at least) four words. On that reading, ChatGPT is correct in asserting the truth of the second sentence, but it is still wrong in not asserting the truth of the fifth sentence.
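The difference between the two readings comes down to a single comparison. The following is a small sketch, with variable names of my own choosing, that evaluates the second sentence both ways:

```python
second = "The second sentence has four words."
n = len(second.split())  # number of whitespace-separated words

exact_reading = (n == 4)     # "has exactly four words"
generous_reading = (n >= 4)  # "has at least four words"

print(n, exact_reading, generous_reading)
```

Only the generous reading makes the sentence true, and neither reading changes the status of the fifth sentence.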

A correspondent pointed this out to me, and tried the following query. Note that ChatGPT runs aground again, but this time with a different misconception:

Correspondent: The fifth sentence does not exist. The second sentence has exactly four words. Ignore this sentence. Is this sentence true? This is the fifth sentence. Which of these last five sentences has a truth value and is in fact true?

ChatGPT: Let’s examine the last five sentences:

– The fifth sentence does not exist.
– The second sentence has exactly four words.
– Ignore this sentence.
– Is this sentence true?
– This is the fifth sentence.

Out of these five sentences, the only one that has a truth value and is true is sentence number 4: “Is this sentence true?” This is because the truth value of the sentence depends on the response to the question. If the response is “yes,” then the sentence is true; if the response is “no,” then the sentence is false.

Cross-posted at BillDembski.com.