OpenAI announced a new AI model, GPT-4o, on Monday.
It can do things like translate in real time, make sense of your physical surroundings, and even sing (kind of).
Here's a look at some of the wildest things it can do.
OpenAI revealed its latest flagship AI model on Monday, GPT-4o, and showed off what ChatGPT can do when powered by it.
The new AI model, with the "o" standing for "omni," can accept any combination of text, audio, and images as input and generate any combination of them in response.
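For readers who code, the text-and-image side of GPT-4o is reachable through OpenAI's standard chat completions API. Here's a minimal sketch using the official Python SDK; the image URL is a placeholder, and note that the real-time audio features in the demos were shown in the ChatGPT app, not the public API, at launch.

    from openai import OpenAI

    client = OpenAI()  # assumes OPENAI_API_KEY is set in your environment

    # One user message that mixes text with an image (placeholder URL)
    response = client.chat.completions.create(
        model="gpt-4o",
        messages=[
            {
                "role": "user",
                "content": [
                    {"type": "text", "text": "What's happening in this photo?"},
                    {"type": "image_url", "image_url": {"url": "https://example.com/street-scene.jpg"}},
                ],
            }
        ],
    )
    print(response.choices[0].message.content)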
But seeing is believing in this case, and thankfully OpenAI did some live onstage demos — with even more examples published on social media.
Here are some of the most impressive demos we've seen so far.
GPT-4o isn't limited to the studio space seen in the other demos: through your phone's camera, it can also make sense of what's around you out in the real world. In this clip, for example, it helps a visually impaired man hail a taxi by telling him one is approaching and when to wave it down.
In this conversation, GPT-4o translates back and forth between English and Spanish in real time. In another clip OpenAI posted, it gave the Spanish names of various objects as they were shown to it.
GPT-4o can attend meetings with you, respond in real time to what colleagues have said, and recap key points at the end. OpenAI showed off its other capabilities in the workplace too. In one demo, GPT-4o was shown code on a screen and suggested changes; in another clip, it summarized a line graph that an OpenAI employee pulled up on his desktop.
GPT-4o can recognize what you're writing as you're working on a math problem and respond accordingly, walking you through individual steps to help you solve it.
GPT-4o made and sang a song based on its environment in this clip, alternating lines with another AI throughout the tune. In other demos OpenAI posted, GPT-4o also sang "Happy Birthday" and a song it created about the prompt "majestic potatoes."