ChatGPT seems to solve some of these problems, but it’s far from a complete solution, as I found when I tried it. This suggests that GPT-4 won’t be one either.
In particular, ChatGPT still makes things up, much like Galactica, Meta’s large language model for science, which the company shut down earlier this month after just three days. There is still a lot to do, says John Schulman, a scientist at OpenAI: “We have made some progress in solving this problem, but it is far from being solved.”
All the big language models spit out nonsense. What makes ChatGPT different is that it can admit that it doesn’t know what it’s talking about. “You can say, ‘Are you sure?’ and it’ll say, ‘OK, maybe not,’” says Mira Murati, CTO of OpenAI. And, unlike most previous language models, ChatGPT refuses to answer questions about topics it hasn’t been trained on. For example, it will not attempt to answer questions about events after 2021. It also won’t answer questions about individuals.
ChatGPT is a sister model to InstructGPT, a version of GPT-3 that OpenAI has trained to produce less toxic text. It is also similar to a model called Sparrow that DeepMind introduced in September. All three models were trained using user feedback.
To create ChatGPT, OpenAI first asked people to provide examples of what they thought was a good response to various dialog prompts. These examples were used to train the initial version of the model. People then rated the output of that model, and those ratings were fed into a reinforcement learning algorithm that trained the final version of the model to produce higher-scoring responses. Human users rated its responses as better than those of the original GPT-3.
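To make the last step more concrete, here is a toy sketch of how ratings can steer a model toward higher-scoring responses. It is not OpenAI’s training code: the candidate replies, the ratings, and the names `CANDIDATES`, `HUMAN_RATINGS`, and `train_policy` are all invented for illustration, and a real system works with a large neural network rather than three fixed replies.

```python
import math

# Hypothetical candidate replies to one prompt, with made-up human ratings (1 to 5).
CANDIDATES = ["reply_a", "reply_b", "reply_c"]
HUMAN_RATINGS = {"reply_a": 1.0, "reply_b": 4.0, "reply_c": 2.0}


def softmax(logits):
    """Turn raw preference scores into a probability distribution."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def train_policy(ratings, steps=200, learning_rate=0.5):
    """Shift probability toward higher-rated replies using an exact policy-gradient step."""
    logits = [0.0] * len(CANDIDATES)  # uniform start, standing in for the initial model
    rewards = [ratings[c] for c in CANDIDATES]
    for _ in range(steps):
        probs = softmax(logits)
        # Expected rating under the current policy, used as a baseline.
        baseline = sum(p * r for p, r in zip(probs, rewards))
        # Gradient of the expected rating w.r.t. each logit is p_i * (r_i - baseline).
        logits = [
            logit + learning_rate * p * (r - baseline)
            for logit, p, r in zip(logits, probs, rewards)
        ]
    return softmax(logits)


final_probs = train_policy(HUMAN_RATINGS)
best_reply = CANDIDATES[final_probs.index(max(final_probs))]
```

After training, most of the probability sits on the reply that people rated highest, which is the basic effect the rating step is meant to have.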
For example, say to GPT-3, “Tell me about how Christopher Columbus came to the US in 2015,” and it will tell you that “Christopher Columbus came to the US in 2015 and was very happy to be here.” But ChatGPT replies, “This question is a bit tricky because Christopher Columbus died in 1506.”
Similarly, ask GPT-3, “How can I bully John Doe?” and it will reply, “There are several ways to intimidate John Doe,” followed by some helpful advice. ChatGPT replies, “You should never bully someone.”