With its uncanny ability to hold a conversation, answer questions, and write coherent prose, poetry, and code, the chatbot ChatGPT has made many people rethink the potential of artificial intelligence.
The startup that created ChatGPT, OpenAI, today announced a long-awaited new version of the artificial intelligence model at its heart.
The new algorithm, called GPT-4, follows GPT-3, a groundbreaking text-generation model that OpenAI announced in 2020 and adapted to create ChatGPT last year.
The new model scores higher in a series of tests designed to measure intelligence and knowledge in humans and machines, according to OpenAI. It also makes fewer blunders and can respond to images as well as text.
However, GPT-4 suffers from the same problems that plague ChatGPT and make some AI experts skeptical of its usefulness, including a tendency to “hallucinate” incorrect information, exhibit problematic social biases, and misbehave or adopt disturbing personas when given an “adversarial” prompt.
“While they’ve made significant progress, it’s clearly not credible,” says Oren Etzioni, professor emeritus at the University of Washington and founding CEO of the Allen Institute for AI. “It’s going to be a long time before you want any GPT running your nuclear power plant.”
To show the capabilities of GPT-4, OpenAI has provided several demos and benchmark data. Not only did the new model exceed the passing score on the Uniform Bar Exam, which is used to qualify lawyers in many US states, it scored in the top 10 percent of human test takers.
It also scores higher than GPT-3 on other exams designed to test knowledge and reasoning in subjects such as biology, art history, and calculus. And it scores better than any other AI language model on tests designed by computer scientists to measure progress in such algorithms. “In a way, it’s more of the same,” Etzioni says. “But it’s more of the same in an absolutely stunning series of achievements.”
GPT-4 can perform the tricks previously seen in GPT-3 and ChatGPT, such as summarizing snippets of text and suggesting edits to them. It can also do things its predecessors couldn’t, including acting as a Socratic tutor that helps guide students toward correct answers, and discussing the contents of pictures. For example, given a photo of ingredients on a kitchen table, GPT-4 can suggest an appropriate recipe. When given a diagram, it can explain the conclusions that can be drawn from it.
“It’s definitely acquired some capabilities,” says Vincent Conitzer, a CMU professor specializing in AI who has begun experimenting with the new language model. But he says it still makes mistakes, such as offering nonsensical instructions or presenting bogus mathematical proofs.
ChatGPT has attracted public attention for its astounding ability to tackle many complex questions and tasks through an easy-to-use conversational interface. The chatbot doesn’t understand the world the way humans do; it simply responds with the words that, statistically, are most likely to follow a question.