OpenAI hosted its highly anticipated Spring Update event on Monday (May 13), with Chief Technology Officer Mira Murati showcasing “magical” updates to the company’s artificial intelligence (AI) technology, including a new ChatGPT desktop app for macOS and a simplified user interface on the desktop version.
However, the main event was the launch of GPT-4o, OpenAI’s newest and most human-like model, featuring improved personalized responsiveness and contextual awareness. “GPT-4o provides GPT-4 intelligence, but it is much faster, and it improves on its capabilities across text, vision and audio,” said Murati during the live webcast.
“For the past couple of years, we have been very focused on improving the intelligence of these models, and they have gotten pretty good. But this is the first time that we are really making a huge step forward when it comes to ease of use. This is incredibly important because we are looking at the future of interaction between ourselves and the machines,” she continued. “We think that GPT-4o is really shifting the paradigm into the future of collaboration.”
During the event, Murati was joined on stage by OpenAI research leads Mark Chen and Barret Zoph, who presented the latest features and improvements to ChatGPT. One key advancement was real-time conversational speech, an improvement over the existing Voice Mode that allows users to interrupt the model mid-conversation.
Additionally, the model’s response time is now significantly faster, and the researchers demonstrated its ability to detect emotions in speech and generate a broader range of emotional responses. They did so by asking ChatGPT to tell a story to the audience while periodically interrupting it to request a change in tone.
The demonstration also showcased ChatGPT’s vision capabilities, which let users interact with the model in real-time video conversation. The team showed how ChatGPT can help users learn to solve linear equations by “looking” at a live image of a simple math problem and walking them through the solution.
The crew then upped the ante by sharing some code with ChatGPT’s Voice Mode via the desktop app. While ChatGPT wasn’t able to “see” the code, it could interpret what the code was requesting. The researchers then ran the code and “showed” ChatGPT the results using its vision capabilities and asked it to describe what it saw. ChatGPT was able to give precise information regarding weather patterns displayed on a graph and answered specific questions such as which months were the hottest and whether the temperature was shown in degrees Celsius or Fahrenheit.
To wrap up, the team demonstrated some of ChatGPT’s most advanced skills: detecting emotions by “looking” at a user’s facial expressions and translating an English-Italian conversation in real time.
The Spring Update event demonstrates OpenAI’s continued progress in creating cutting-edge AI technology and the immense potential of human-machine collaboration. GPT-4o will be incrementally integrated into OpenAI’s existing products over the next few weeks. Some of the features demonstrated are already available, and users can expect more to come as the rollout progresses.
Securities Disclosure: I, Meagen Seatter, hold no direct investment interest in any company mentioned in this article.