{"id":4792,"date":"2024-05-13T17:37:33","date_gmt":"2024-05-13T17:37:33","guid":{"rendered":"https:\/\/favtutor.com\/articles\/?p=4792"},"modified":"2024-05-13T18:36:51","modified_gmt":"2024-05-13T18:36:51","slug":"openai-releases-gpt-4o","status":"publish","type":"post","link":"https:\/\/favtutor.com\/articles\/openai-releases-gpt-4o\/","title":{"rendered":"OpenAI Releases GPT-4o! Here&#8217;s How You Can Try It"},"content":{"rendered":"\n<p>OpenAI, in their spring update, announced a new model called GPT-4o (\u201co\u201d for \u201comni\u201d). The model is available to all categories of users, both free and paying. This is a huge step by OpenAI towards freely accessible AI.<\/p>\n\n\n\n<div align=\"center\"><blockquote class=\"twitter-tweet\"><p lang=\"en\" dir=\"ltr\">our new model: GPT-4o, is our best model ever. it is smart, it is fast,it is natively multimodal (!), and\u2026<\/p>&mdash; Sam Altman (@sama) <a href=\"https:\/\/twitter.com\/sama\/status\/1790065469296156715?ref_src=twsrc%5Etfw\" target=\"_blank\" rel=\"noopener\">May 13, 2024<\/a><\/blockquote> <script async src=\"https:\/\/platform.twitter.com\/widgets.js\" charset=\"utf-8\"><\/script><\/div>\n\n\n\n<p>GPT-4o provides GPT-4-level intelligence but is much faster and works across audio, image, and text inputs.<\/p>\n\n\n\n<p>The model focuses on understanding tone of voice and providing a real-time audio and vision experience. It is <strong>2x faster, 50% cheaper<\/strong>, and has <strong>5x higher rate limits compared to GPT-4 Turbo<\/strong>.<\/p>\n\n\n\n<p>OpenAI demonstrated this experience with a new voice assistant, in a demo streamed live for users to watch the new developments.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\"><strong>How To Access OpenAI GPT-4o?<\/strong><\/h2>\n\n\n\n<p>GPT-4o has been made available to all ChatGPT users, including those on the free plan. 
Previously, access to GPT-4-class models was restricted to individuals with a paid monthly subscription.<\/p>\n\n\n\n<div align=\"center\"><blockquote class=\"twitter-tweet\"><p lang=\"en\" dir=\"ltr\">it is available to all ChatGPT users, including on the free plan! so far, GPT-4 class models have only been available to people who pay a monthly subscription. this is important to our mission; we want to put great AI tools in the hands of everyone.<\/p>&mdash; Sam Altman (@sama) <a href=\"https:\/\/twitter.com\/sama\/status\/1790065541262032904?ref_src=twsrc%5Etfw\" target=\"_blank\" rel=\"noopener\">May 13, 2024<\/a><\/blockquote> <script async src=\"https:\/\/platform.twitter.com\/widgets.js\" charset=\"utf-8\"><\/script><\/div>\n\n\n\n<h2 class=\"wp-block-heading\"><strong>How exactly is GPT-4o better than previous GPT iterations?<\/strong><\/h2>\n\n\n\n<p>Prior to GPT-4o, Voice Mode could be used to talk to ChatGPT with average latencies of <strong>2.8 seconds (GPT-3.5)<\/strong> and <strong>5.4 seconds (GPT-4)<\/strong>. It relied on a pipeline of three separate models: one that transcribes audio to text, the central GPT model that takes text input and gives text output, and a third that converts the text back to audio.<\/p>\n\n\n\n<p>This process means that the main source of intelligence, GPT-4, loses a lot of information\u2014it can\u2019t directly observe tone, multiple speakers, or background noises, and it can\u2019t output laughter, singing, or express emotion.<\/p>\n\n\n\n<p>GPT-4o is a single end-to-end model trained across text, vision, and audio data; all inputs and outputs are processed by the same neural network. Since this is the first all-encompassing model OpenAI has developed, GPT-4o has barely scratched the surface of its capabilities.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\"><strong>Evaluation and Capabilities<\/strong><\/h3>\n\n\n\n<p>The model was evaluated on traditional industry benchmarks. 
GPT-4o achieves GPT-4 Turbo-level performance on text, reasoning, and coding intelligence while setting new high watermarks on multilingual, audio, and vision capabilities.<\/p>\n\n\n\n<figure class=\"wp-block-image size-full\"><img decoding=\"async\" width=\"899\" height=\"745\" src=\"https:\/\/favtutor.com\/articles\/wp-content\/uploads\/2024\/05\/GPTs-Model-evaluations.jpg\" alt=\"Gpt's Model Evaluation\" class=\"wp-image-4815\" srcset=\"https:\/\/favtutor.com\/articles\/wp-content\/uploads\/2024\/05\/GPTs-Model-evaluations.jpg 899w, https:\/\/favtutor.com\/articles\/wp-content\/uploads\/2024\/05\/GPTs-Model-evaluations-768x636.jpg 768w, https:\/\/favtutor.com\/articles\/wp-content\/uploads\/2024\/05\/GPTs-Model-evaluations-750x622.jpg 750w\" sizes=\"(max-width: 899px) 100vw, 899px\" \/><\/figure>\n\n\n\n<p>The model also uses a new tokenizer that provides better compression across language families.<\/p>\n\n\n\n<figure class=\"wp-block-image size-full\"><img decoding=\"async\" width=\"745\" height=\"814\" src=\"https:\/\/favtutor.com\/articles\/wp-content\/uploads\/2024\/05\/gpt-4o.jpg\" alt=\"Gpt-4o Language tokenization\" class=\"wp-image-4818\"\/><\/figure>\n\n\n\n<p>OpenAI, in their <a href=\"https:\/\/openai.com\/index\/hello-gpt-4o\/\" target=\"_blank\" rel=\"noopener\">release blog<\/a>, gave a detailed explanation of the model\u2019s capabilities with many different samples.<\/p>\n\n\n\n<figure class=\"wp-block-image size-full\"><img decoding=\"async\" width=\"769\" height=\"835\" src=\"https:\/\/favtutor.com\/articles\/wp-content\/uploads\/2024\/05\/gpt-4o-capabilities.jpg\" alt=\"\" class=\"wp-image-4819\" srcset=\"https:\/\/favtutor.com\/articles\/wp-content\/uploads\/2024\/05\/gpt-4o-capabilities.jpg 769w, https:\/\/favtutor.com\/articles\/wp-content\/uploads\/2024\/05\/gpt-4o-capabilities-750x814.jpg 750w\" sizes=\"(max-width: 769px) 100vw, 769px\" \/><\/figure>\n\n\n\n<p>The researchers also discussed the limitations of the model along with the 
safety of the model.<\/p>\n\n\n\n<blockquote class=\"wp-block-quote is-layout-flow wp-block-quote-is-layout-flow\">\n<p>We recognize that GPT-4o\u2019s audio modalities present a variety of novel risks. Today we are publicly releasing text and image inputs and text outputs. Over the upcoming weeks and months, we\u2019ll be working on the technical infrastructure, usability via post-training, and safety necessary to release the other modalities. For example, at launch, audio outputs will be limited to a selection of preset voices and will abide by our existing safety policies. We will share further details addressing the full range of GPT-4o\u2019s modalities in the forthcoming system card.<\/p>\n<cite>OpenAI<\/cite><\/blockquote>\n\n\n\n<p>Previous voice-interactive systems chained together three models (transcription, intelligence, and text-to-speech) to deliver Voice Mode, which brought high latency and broke the immersive experience. With GPT-4o, this all happens seamlessly and natively, with voice modulation and minimal latency. This is truly an incredible tool for all users!<\/p>\n","protected":false},"excerpt":{"rendered":"<p>OpenAI, in their spring update just announced a new model called GPT-4o (\u201co\u201d for \u201comni\u201d). This model is available to all categories of users both free and paying users. This is a huge step by OpenAI towards freely accessible and available AI. our new model: GPT-4o, is our best model ever. 
it is smart, it [&hellip;]<\/p>\n","protected":false},"author":20,"featured_media":4793,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"jnews-multi-image_gallery":[],"jnews_single_post":null,"jnews_primary_category":{"id":"","hide":""},"footnotes":""},"categories":[57],"tags":[60],"class_list":["post-4792","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-ai","tag-openai"],"_links":{"self":[{"href":"https:\/\/favtutor.com\/articles\/wp-json\/wp\/v2\/posts\/4792","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/favtutor.com\/articles\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/favtutor.com\/articles\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/favtutor.com\/articles\/wp-json\/wp\/v2\/users\/20"}],"replies":[{"embeddable":true,"href":"https:\/\/favtutor.com\/articles\/wp-json\/wp\/v2\/comments?post=4792"}],"version-history":[{"count":5,"href":"https:\/\/favtutor.com\/articles\/wp-json\/wp\/v2\/posts\/4792\/revisions"}],"predecessor-version":[{"id":4820,"href":"https:\/\/favtutor.com\/articles\/wp-json\/wp\/v2\/posts\/4792\/revisions\/4820"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/favtutor.com\/articles\/wp-json\/wp\/v2\/media\/4793"}],"wp:attachment":[{"href":"https:\/\/favtutor.com\/articles\/wp-json\/wp\/v2\/media?parent=4792"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/favtutor.com\/articles\/wp-json\/wp\/v2\/categories?post=4792"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/favtutor.com\/articles\/wp-json\/wp\/v2\/tags?post=4792"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}