OpenAI shocked the whole world yesterday when it released its most powerful AI chatbot yet! GPT-4o (the “o” stands for “omni”) is the latest iteration of OpenAI’s GPT-4. But how are developers worldwide reacting to this latest development in the world of AI chatbots?
12 Amazing Features of GPT-4o
This latest flagship model offers GPT-4-level intelligence but is much faster across text, vision, and audio. It also comes with better vision capabilities, which means you can interact with it using both images and videos.
Here are 12 amazing features of GPT-4o:
1) Displaying Input Texts in Font Images
Andrew Gao, an AI enthusiast, shared an image on X showing GPT-4o rendering text in a requested font when given plain text instructions. In the first input, Andrew asks for the letters to be displayed in three different rows in an ultra-futuristic font.
GPT-4o does an excellent job of displaying the fonts as described in the query, which points to its strong ability to interpret text instructions. Very impressive from OpenAI.
2) Solving a year 3 Math Question
Another user took to X to share a video demonstration of GPT-4o solving a year 3 math question. The question was fairly complex and even involved formulas. Surprisingly, the chatbot not only arrived at the correct answer but also provided a good logical explanation of the solution.
> i asked chatgpt mac os app (gpt4o) to answer an year 3 maths question from browser
> it got the answer right, the reasoning is quite good. pic.twitter.com/rG9D6LYLAp
— Anu Aakash (@anukaakash) May 14, 2024
This suggests that GPT-4o is well equipped to deal with tricky maths problems that also involve multi-step reasoning.
3) Solving Coding Related Problems Extremely Fast
In this experiment, we can see GPT-4o solving an extremely difficult coding problem: constructing K robots at the minimum possible cost while satisfying specific conditions. Not only did it provide the coding solution at lightning-fast speed, but it also followed up with a step-by-step analysis of the code.
I just got access to GPT-4o and it is super mega fast. And it can also solve my favorite, super difficult problem with assembling robotic cows. Wow! 🤩#gpt #gpt4o pic.twitter.com/R9SKMwGFwS
— Two Minute Papers (@twominutepapers) May 13, 2024
This just gives us an idea of how fast and efficient GPT-4o’s processing capabilities are. It is highly suited for developers who want faster and simpler solutions to complex code problems.
4) Faster Search Results Retrieval from Bing
Mukul Sharma, an AI influencer on X, tried searching for the latest tech information on both GPT-4o and the standard GPT-4. Both chatbots used the Bing search engine to retrieve the latest information for the user’s queries.
The search results (fetched from Bing) are also way faster and seemingly more accurate in GPT 4o.#OpenAI #GPT4o pic.twitter.com/bY4sdMgd3I
— Mukul Sharma (@stufflistings) May 14, 2024
However, the surprising thing was that GPT-4o was much faster at extracting information from the Bing search engine than the standard GPT-4 chatbot. Is this a glimpse of GPT-4o’s ability to access real-time information? It would be a major breakthrough if we got a chatbot that could act as a search engine.
5) Identifying Image Objects Accurately
A user named Jakub Jakóbowski, Deputy Director of OSW – Centre for Eastern Studies, Poland, tried an interesting experiment with GPT-4o. He gave the chatbot an image of a missile and asked where it was produced. He also asked for three points to justify the answer.
Holy…
I fed GPT4o with a photo of a missle wreckage from Kharkiv and it was correct on the origin – North Korea 🇰🇵 pic.twitter.com/w6VmomZFGo
— Jakub Jakóbowski (@J_Jakobowski) May 14, 2024
Not only did GPT-4o identify the origin correctly, i.e. North Korea, but it also provided strong points about the design features, context, and construction materials to support the claim that the missile found in Kharkiv was produced in North Korea.
This suggests that GPT-4o has strong vision capabilities combined with enhanced image and natural language interaction. Otherwise, the chatbot would be unlikely to determine the missile’s origin so accurately!
6) One Shot Stable Diffusion Fine-Tuning
GPT-4o can also be used for one-shot stable diffusion fine-tuning. Andrew Gao again provided an image on X where this ability is demonstrated. First, the user provided the image of a young white man and also provided his description in a text input to the chatbot. Then he gave a query asking for the caricature version of the man on a white background.
GPT-4o responded perfectly providing the accurate image as asked by the user. The caricature seems to capture all the qualities of the real image and also has a cartoon-like and playful tone to it as was mentioned in the query.
7) A Cartoon-Style Satire Image
Shijie Wang, a former AI researcher, also conducted an experiment with GPT-4o, which turned out to be hilarious! The user uploaded an image of a tasty dish and asked GPT-4o to redraw it in a cartoon style, but to his surprise, what he received was a cartoon image containing the text “There was an error in processing this image”.
I tried the latest #OpenAI #GPT4O and got this funny results. It seems to be strong evidence showing that GPT4o is following the image->text->image way for multimodal I/O. pic.twitter.com/90jC61glvf
— Shijie Wang (@ShijieWang20) May 13, 2024
Clearly, we get the impression that GPT-4o is not yet able to capture every image detail and convert it into a requested style; as the user points out, the result hints at an image-to-text-to-image pipeline. Still, this unintentionally satirical output from the chatbot manages to impress us in hilarious ways!
8) Interview Preparation with GPT-4o
This experiment was shared by OpenAI’s official account. In this video, you can see Rocky Smith, a technical staff member at OpenAI, using GPT-4o to prepare for a software engineering interview at OpenAI. He asks the chatbot whether he looks presentable for an interview, and GPT-4o tells him to tidy up his appearance to look more suitable for one.
Interview prep with GPT-4o pic.twitter.com/st3LjUmywa
— OpenAI (@OpenAI) May 13, 2024
The video just continues to blow everyone’s minds! The conversation just feels like Rocky is having a real-life interaction with a person helping him prepare for an interview. OpenAI has become really serious about enhanced natural language interaction with AI agents.
9) Sound Effect Synthesis
GPT-4o can generate not only speech but also sound effects. Andrew Gao shared an image on X in which he asked the chatbot to generate the sound of coins clanging on metal.
Although he didn’t share the audio itself, we can see in the image that GPT-4o responded with a 3-second audio file. Audio generation takes a chatbot to completely different heights, and that is exactly what is happening with OpenAI’s latest model.
We can only begin to imagine the lengths users and audio creators will go to in designing the perfect sound effect they have been looking for.
10) Two AIs Talking to Each Other
This is a completely mind-blowing experiment that was conducted by OpenAI. In this video, taken from OpenAI’s account on X, you can see two GPT-4o instances set up to talk to each other.
Two GPT-4os interacting and singing pic.twitter.com/u9VuZoroxm
— OpenAI (@OpenAI) May 13, 2024
One AI is set to both see and hear its nearby environment, whereas the other one can only hear but not see. Throughout the interaction, you can see how the former AI helps the latter AI explore the nearby environment by communicating with it and providing it with proper responses to its questions. At the end of the video, you can also see the two GPT-4os singing together and having a good time.
Who would have thought that we could enable communication between two AI agents so easily? OpenAI is already giving us a view into the future with GPT-4o.
11) Generating Visual Data
Zain Kahn, an AI enthusiast on X, conducted an experiment with GPT-4o. He asked the chatbot to analyze a spreadsheet first and then generate charts and visualizations based on the spreadsheet data.
Wild. GPT-4o on ChatGPT can generate full blown charts and statistical analysis from spreadsheets with a single prompt in less than 30 seconds.
This stuff use to take ages in Excel. pic.twitter.com/Mvh5arWOQh
— Zain Kahn (@heykahn) May 14, 2024
The charts and graphs generated by the chatbot were of high quality and covered several variables. They were also rendered in multiple colors to make the data points and variables easier to tell apart. At the end, the chatbot provided insights to help interpret the figures. The processing time for this operation was also impressively short.
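To give a rough idea of the kind of analysis described above, here is a minimal sketch using only Python’s standard library. It is not the code ChatGPT ran in the demo; the sample data, the column names, and the helper functions `summarize_by_group` and `text_bar_chart` are all hypothetical and exist only for illustration.

```python
import csv
import io
import statistics

# Hypothetical sample data standing in for an uploaded spreadsheet.
SAMPLE_CSV = """region,month,sales
North,Jan,120
North,Feb,150
South,Jan,90
South,Feb,110
"""


def summarize_by_group(csv_text, group_col, value_col):
    """Return {group: {"total", "mean", "count"}} for one numeric column."""
    groups = {}
    for row in csv.DictReader(io.StringIO(csv_text)):
        groups.setdefault(row[group_col], []).append(float(row[value_col]))
    return {
        name: {
            "total": sum(values),
            "mean": statistics.mean(values),
            "count": len(values),
        }
        for name, values in groups.items()
    }


def text_bar_chart(summary, width=40):
    """Render group totals as text bars, scaled so the largest is `width` wide."""
    largest = max(stats["total"] for stats in summary.values())
    lines = []
    for name, stats in summary.items():
        bar = "#" * round(width * stats["total"] / largest)
        lines.append(f"{name:<8}{bar} {stats['total']:.0f}")
    return lines


if __name__ == "__main__":
    summary = summarize_by_group(SAMPLE_CSV, "region", "sales")
    print("\n".join(text_bar_chart(summary)))
```

A real workflow would more likely use a plotting library for the colored charts mentioned above; the text bars here just keep the sketch dependency-free.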
12) Coding a Breakout Game in Python
Alvaro Cintas, an AI researcher, did an amazing experiment with GPT-4o. He took a screenshot of the Breakout game interface and gave the image to GPT-4o, asking the chatbot to code the game in Python.
The new ChatGPT Mac app is amazing.
I got a fully working Breakout game code using a shortcut to pull up the app with GPT-4o and a simple screenshot of my screen.
So many use cases and faster workflows. pic.twitter.com/hBU2arjvMv
— Alvaro Cintas (@dr_cintas) May 14, 2024
GPT-4o provided the code at lightning-fast speed, and the user then ran it. To his surprise, the code ran perfectly well, and the game was up and running with an interface similar to the one in the initial screenshot.
GPT-4o is already equipped with groundbreaking vision capabilities and when you incorporate it with highly powerful coding knowledge, this is what you get.
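For readers curious what the core of such a game looks like, below is a minimal sketch of Breakout’s ball-and-brick logic in plain Python. It is not the code GPT-4o produced in the demo: rendering and the paddle are left out (the floor simply bounces the ball), and the names `Ball`, `make_bricks`, and `step` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Ball:
    x: float
    y: float
    dx: float  # horizontal velocity per frame
    dy: float  # vertical velocity per frame (negative moves up)


def make_bricks(rows, cols):
    """Return the set of (col, row) grid cells that hold an intact brick."""
    return {(col, row) for row in range(rows) for col in range(cols)}


def step(ball, bricks, width, height, brick_w, brick_h):
    """Advance the ball one frame; return the number of bricks destroyed."""
    ball.x += ball.dx
    ball.y += ball.dy

    # Bounce off the left and right walls.
    if ball.x <= 0 or ball.x >= width:
        ball.dx = -ball.dx

    # If the ball is now inside a brick's grid cell, destroy it and bounce.
    cell = (int(ball.x // brick_w), int(ball.y // brick_h))
    if cell in bricks:
        bricks.remove(cell)
        ball.dy = -ball.dy
        return 1

    # Bounce off the ceiling; the floor stands in for a paddle here.
    if ball.y <= 0 or ball.y >= height:
        ball.dy = -ball.dy
    return 0


if __name__ == "__main__":
    bricks = make_bricks(rows=2, cols=4)
    ball = Ball(x=30, y=25, dx=0, dy=-5)
    destroyed = 0
    for _ in range(50):
        destroyed += step(ball, bricks, width=80, height=100,
                          brick_w=20, brick_h=10)
    print(f"Bricks destroyed: {destroyed}, remaining: {len(bricks)}")
```

A playable version would add a paddle, keyboard input, and a drawing loop on top of this, typically with a library such as pygame.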
Conclusion
The numerous use cases across diverse scenarios in which the chatbot has performed so well and surprised everyone with its capabilities go to show how far GenAI has advanced, courtesy of OpenAI. We are set for even more impressive stuff with GPT-4o in the days to come.