{"id":2866,"date":"2024-03-26T11:52:06","date_gmt":"2024-03-26T11:52:06","guid":{"rendered":"https:\/\/favtutor.com\/articles\/?p=2866"},"modified":"2024-03-26T19:14:27","modified_gmt":"2024-03-26T19:14:27","slug":"gemini-pro-testing-developers-feeback","status":"publish","type":"post","link":"https:\/\/favtutor.com\/articles\/gemini-pro-testing-developers-feeback\/","title":{"rendered":"Here&#8217;s What Developers Found After Testing Gemini 1.5 Pro"},"content":{"rendered":"\n<p class=\"wp-block-paragraph\">It\u2019s been almost a month since Gemini was released, and it has impressed the world of developers across a gamut of functionalities and use cases. The Generative AI model has been released in three versions: Nano, Pro, and Ultra.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\"><strong>Recently, the next generation of the Gemini model namely Pro 1.5 has been released publicly. It is available for free in Google AI Studio for developers and researchers via API access.<\/strong><\/p>\n\n\n\n<p class=\"wp-block-paragraph\">In this article, we are going to explore some use cases and features that were found by some developers who got access to the latest Pro and Ultra models in their beta phase, long before it was released. We will discuss them in depth. So, let\u2019s get into it!<\/p>\n\n\n\n<h2 class=\"wp-block-heading\"><strong>How to Access Gemini Pro 1.5?<\/strong><\/h2>\n\n\n\n<p class=\"wp-block-paragraph\"><strong>Gemini\u2019s latest 1.5 Pro model has been released publicly as of now. The chatbot was removed from the waitlist queue and is now freely rolled out in Google\u2019s AI Studio Platform. 
<\/strong><\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Here\u2019s how you can access and try it for free:<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Go to <a href=\"https:\/\/deepmind.google\/technologies\/gemini\/\" target=\"_blank\" data-type=\"link\" data-id=\"https:\/\/deepmind.google\/technologies\/gemini\/?utm_source=www.therundown.ai&amp;utm_medium=newsletter&amp;utm_campaign=stability-ai-s-unstable-future#gemini-1.5\" rel=\"noreferrer noopener nofollow\">Google DeepMind\u2019s <\/a>Website.<\/li>\n\n\n\n<li>Click Gemini 1.5 or scroll down till you see \u201cIntroducing Gemini 1.5\u201d<\/li>\n\n\n\n<li>Click on \u201cTry Gemini 1.5\u201d and sign in with your Gmail account.<\/li>\n\n\n\n<li>You will be taken to Google AI Studio. Click on the \u201cGet Started\u201d button.<\/li>\n\n\n\n<li>You are now ready to use the latest Google Gemini 1.5 Pro model.<\/li>\n<\/ul>\n\n\n\n<p class=\"wp-block-paragraph\">Now that we know how to access it, let&#8217;s move to the main thing: its features.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\"><strong>10 Amazing Features of the Gemini Pro 1.5 Models<\/strong><\/h2>\n\n\n\n<p class=\"wp-block-paragraph\">Here are some of the best features that developers found when testing the new Gemini models:<\/p>\n\n\n\n<h3 class=\"wp-block-heading\"><strong>1) Summarization and Explanation<\/strong><\/h3>\n\n\n\n<p class=\"wp-block-paragraph\">Radostin Cholakov, a Google Developer Researcher in Machine Learning, tried to get assistance from Gemini 1.5 Pro with some research work. 
He <a href=\"https:\/\/medium.com\/@radicho\/the-power-of-long-contexts-gemini-1-5-pro-use-cases-in-research-1a9c163302d0\" target=\"_blank\" rel=\"noreferrer noopener nofollow\">uploaded several PDFs<\/a> to Pro 1.5 and asked it to explain the topics in them, namely Contrastive Learning and its use cases.<\/p>\n\n\n<div class=\"wp-block-image\">\n<figure class=\"aligncenter size-large\"><img decoding=\"async\" width=\"1024\" height=\"396\" src=\"https:\/\/favtutor.com\/articles\/wp-content\/uploads\/2024\/03\/Screenshot-415-1-1024x396.png\" alt=\"Gemini 1.5 Pro for Summarization\" class=\"wp-image-2867\" srcset=\"https:\/\/favtutor.com\/articles\/wp-content\/uploads\/2024\/03\/Screenshot-415-1-1024x396.png 1024w, https:\/\/favtutor.com\/articles\/wp-content\/uploads\/2024\/03\/Screenshot-415-1-300x116.png 300w, https:\/\/favtutor.com\/articles\/wp-content\/uploads\/2024\/03\/Screenshot-415-1-768x297.png 768w, https:\/\/favtutor.com\/articles\/wp-content\/uploads\/2024\/03\/Screenshot-415-1-750x290.png 750w, https:\/\/favtutor.com\/articles\/wp-content\/uploads\/2024\/03\/Screenshot-415-1-1140x441.png 1140w, https:\/\/favtutor.com\/articles\/wp-content\/uploads\/2024\/03\/Screenshot-415-1.png 1493w\" sizes=\"(max-width: 1024px) 100vw, 1024px\" \/><\/figure>\n<\/div>\n\n\n<p class=\"wp-block-paragraph\"><strong>Gemini 1.5 Pro gave a detailed and informative summarization of the topic.<\/strong> It also managed to use mathematical notation to formulate a loss function. The summary was broad, well-defined, and explained properly in points. 
The only drawback was that the summary had a few inaccuracies.<\/p>\n\n\n<div class=\"wp-block-image\">\n<figure class=\"aligncenter size-large\"><img decoding=\"async\" width=\"721\" height=\"1024\" src=\"https:\/\/favtutor.com\/articles\/wp-content\/uploads\/2024\/03\/WhatsApp-Image-2024-03-25-at-17.00.41_2e54cc72-1-721x1024.jpg\" alt=\"Gemini 1.5 Pro for Summarization Output\" class=\"wp-image-2868\" srcset=\"https:\/\/favtutor.com\/articles\/wp-content\/uploads\/2024\/03\/WhatsApp-Image-2024-03-25-at-17.00.41_2e54cc72-1-721x1024.jpg 721w, https:\/\/favtutor.com\/articles\/wp-content\/uploads\/2024\/03\/WhatsApp-Image-2024-03-25-at-17.00.41_2e54cc72-1-211x300.jpg 211w, https:\/\/favtutor.com\/articles\/wp-content\/uploads\/2024\/03\/WhatsApp-Image-2024-03-25-at-17.00.41_2e54cc72-1-768x1091.jpg 768w, https:\/\/favtutor.com\/articles\/wp-content\/uploads\/2024\/03\/WhatsApp-Image-2024-03-25-at-17.00.41_2e54cc72-1-750x1065.jpg 750w, https:\/\/favtutor.com\/articles\/wp-content\/uploads\/2024\/03\/WhatsApp-Image-2024-03-25-at-17.00.41_2e54cc72-1.jpg 930w\" sizes=\"(max-width: 721px) 100vw, 721px\" \/><\/figure>\n<\/div>\n\n\n<p class=\"wp-block-paragraph\">The key takeaway here is its zero-shot ability. Until now, LLMs have typically handled long-context understanding and documentation tasks with additional RAG-based steps and human guidance. Gemini departs from this approach with a zero-shot technique that requires no additional human guidance.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\"><strong>2) Understanding Related Concepts<\/strong><\/h3>\n\n\n\n<p class=\"wp-block-paragraph\">Radostin wanted to put Gemini 1.5 Pro\u2019s understanding of related concepts to the test. 
So, he gave the chatbot two mathematical notations from different papers and asked it to unify them.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">The model was asked to produce a paragraph summarizing the ideas using notation akin to the original SupCon paper after uploading the TEX sources of the papers.<\/p>\n\n\n<div class=\"wp-block-image\">\n<figure class=\"aligncenter size-large\"><img decoding=\"async\" width=\"1024\" height=\"297\" src=\"https:\/\/favtutor.com\/articles\/wp-content\/uploads\/2024\/03\/Screenshot-416-1-1024x297.png\" alt=\"Understanding Related Concepts\" class=\"wp-image-2869\" srcset=\"https:\/\/favtutor.com\/articles\/wp-content\/uploads\/2024\/03\/Screenshot-416-1-1024x297.png 1024w, https:\/\/favtutor.com\/articles\/wp-content\/uploads\/2024\/03\/Screenshot-416-1-300x87.png 300w, https:\/\/favtutor.com\/articles\/wp-content\/uploads\/2024\/03\/Screenshot-416-1-768x223.png 768w, https:\/\/favtutor.com\/articles\/wp-content\/uploads\/2024\/03\/Screenshot-416-1-750x218.png 750w, https:\/\/favtutor.com\/articles\/wp-content\/uploads\/2024\/03\/Screenshot-416-1-1140x331.png 1140w, https:\/\/favtutor.com\/articles\/wp-content\/uploads\/2024\/03\/Screenshot-416-1.png 1472w\" sizes=\"(max-width: 1024px) 100vw, 1024px\" \/><\/figure>\n<\/div>\n\n\n<p class=\"wp-block-paragraph\">This was the prompt that it was given:<\/p>\n\n\n\n<p class=\"wp-block-paragraph\"><em>\u201cUnify the notation of the SelfCon and SupCon paper.<br>Use the SupCon notation to define SelfCon by introducing necessary additions to the original SupCon formulation.<br>Provide latex code.\u201d<\/em><\/p>\n\n\n\n<p class=\"wp-block-paragraph\"><strong>Gemini did a perfect job in understanding the assignment and it got the idea of having two functions \\omega for the various sample views exactly right. 
However, a few key terms were missing in the equation.<\/strong><\/p>\n\n\n<div class=\"wp-block-image\">\n<figure class=\"aligncenter size-full\"><img decoding=\"async\" width=\"902\" height=\"513\" src=\"https:\/\/favtutor.com\/articles\/wp-content\/uploads\/2024\/03\/Screenshot-417-1.png\" alt=\"\" class=\"wp-image-2870\" srcset=\"https:\/\/favtutor.com\/articles\/wp-content\/uploads\/2024\/03\/Screenshot-417-1.png 902w, https:\/\/favtutor.com\/articles\/wp-content\/uploads\/2024\/03\/Screenshot-417-1-300x171.png 300w, https:\/\/favtutor.com\/articles\/wp-content\/uploads\/2024\/03\/Screenshot-417-1-768x437.png 768w, https:\/\/favtutor.com\/articles\/wp-content\/uploads\/2024\/03\/Screenshot-417-1-750x427.png 750w\" sizes=\"(max-width: 902px) 100vw, 902px\" \/><\/figure>\n<\/div>\n\n\n<p class=\"wp-block-paragraph\">Both use cases show that the long-context capabilities of Gemini 1.5 Pro represent a major advancement in the utility of LLMs.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\"><strong>3) Analyzing Differences from Comparisons<\/strong><\/h3>\n\n\n\n<p class=\"wp-block-paragraph\">Hong Cheng, the founder of Ticker Tick, wanted to see how good Gemini 1.5 Pro\u2019s 1 million token context window is at analyzing differences between documents. He uploaded two PDFs, Meta\u2019s 10-K filings for 2022 and 2023. The documents contained 115,272 and 131,757 tokens, respectively.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\"><strong>The summary of the differences was spot on. Not only did it list the differences, but it also grouped them by topic, extracting relevant points and figures wherever possible to make the comparison stronger and clearer.<\/strong><\/p>\n\n\n\n<div align=center><blockquote class=\"twitter-tweet\"><p lang=\"en\" dir=\"ltr\">Gemini 1.5 Pro&#39;s one million context window is impressive. I asked it to compare two Meta&#39;s 10-K filings and summarize the differences. 
The results are spot on. <a href=\"https:\/\/twitter.com\/search?q=%24GOOG&amp;src=ctag&amp;ref_src=twsrc%5Etfw\" target=\"_blank\" rel=\"noopener\">$GOOG<\/a> <a href=\"https:\/\/t.co\/J57jMzJNEM\" target=\"_blank\">pic.twitter.com\/J57jMzJNEM<\/a><\/p>&mdash; Hongcheng (@hzhu_) <a href=\"https:\/\/twitter.com\/hzhu_\/status\/1771774010038038804?ref_src=twsrc%5Etfw\" target=\"_blank\" rel=\"noopener\">March 24, 2024<\/a><\/blockquote> <script async src=\"https:\/\/platform.twitter.com\/widgets.js\" charset=\"utf-8\"><\/script><\/div>\n\n\n\n<p class=\"wp-block-paragraph\">This shows Gemini 1.5 Pro is highly capable of drawing comparisons based on relevant facts and figures, just like humans do. The 1 million token context window is working wonders.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\"><strong>4) High Accuracy<\/strong><\/h3>\n\n\n\n<p class=\"wp-block-paragraph\">The same user also put its accuracy to the test. He asked the chatbot a basic question: the number of daily unique paying users for Roblox in 2022 and 2023.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\"><strong>Gemini answered all the questions accurately. 
However, when ChatGPT was asked the same questions, it got only one right.<\/strong><\/p>\n\n\n\n<div align=center><blockquote class=\"twitter-tweet\"><p lang=\"en\" dir=\"ltr\">Gemini 1.5 Pro has a much higher accuracy than ChatGPT when it comes to reading SEC files and retrieving financial numbers.<br>In the screenshots, Gemini got 3 numbers right, while ChatGPT only got one right.<a href=\"https:\/\/twitter.com\/search?q=%24GOOG&amp;src=ctag&amp;ref_src=twsrc%5Etfw\" target=\"_blank\" rel=\"noopener\">$GOOG<\/a> <a href=\"https:\/\/twitter.com\/search?q=%24RBLX&amp;src=ctag&amp;ref_src=twsrc%5Etfw\" target=\"_blank\" rel=\"noopener\">$RBLX<\/a> <a href=\"https:\/\/t.co\/9m9c99ARuN\" target=\"_blank\">pic.twitter.com\/9m9c99ARuN<\/a><\/p>&mdash; Hongcheng (@hzhu_) <a href=\"https:\/\/twitter.com\/hzhu_\/status\/1772125737522258204?ref_src=twsrc%5Etfw\" target=\"_blank\" rel=\"noopener\">March 25, 2024<\/a><\/blockquote> <script async src=\"https:\/\/platform.twitter.com\/widgets.js\" charset=\"utf-8\"><\/script><\/div>\n\n\n\n<p class=\"wp-block-paragraph\">1.5 Pro has a much more enhanced knowledge base compared to GPT-4, but only time will tell what GPT-5 will come up with in the coming months. For more details, here is a <a href=\"https:\/\/favtutor.com\/articles\/gemini-vs-gpt-4\/\">comparison of GPT-4 and Gemini 1.5<\/a> to read.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\"><strong>5) Reading Large GitHub Repos<\/strong><\/h3>\n\n\n\n<p class=\"wp-block-paragraph\">Another potential use case of Gemini Pro 1.5\u2019s one million token context window was highlighted by Hong Cheng. Pro 1.5 can read large GitHub repositories and accurately answer questions about their source files.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">The GitHub repo used in the test consisted of 225 files and 727,000 tokens. 
<strong>Not only did Gemini explain the repo topics but it also mentioned the source code references and additional notes related to the repository.<\/strong><\/p>\n\n\n\n<div align=center><blockquote class=\"twitter-tweet\"><p lang=\"en\" dir=\"ltr\">Gemini 1.5 Pro can read large Github repos (225 files and 727,000 tokens in my test) and answer questions with links to source files! This might devaluate programmers&#39; value, especially seasoned ones. <a href=\"https:\/\/twitter.com\/search?q=%24GOOG&amp;src=ctag&amp;ref_src=twsrc%5Etfw\" target=\"_blank\" rel=\"noopener\">$GOOG<\/a> <a href=\"https:\/\/t.co\/j5J8UAZZn9\" target=\"_blank\">pic.twitter.com\/j5J8UAZZn9<\/a><\/p>&mdash; Hongcheng (@hzhu_) <a href=\"https:\/\/twitter.com\/hzhu_\/status\/1771954603359166649?ref_src=twsrc%5Etfw\" target=\"_blank\" rel=\"noopener\">March 24, 2024<\/a><\/blockquote> <script async src=\"https:\/\/platform.twitter.com\/widgets.js\" charset=\"utf-8\"><\/script><\/div>\n\n\n\n<h3 class=\"wp-block-heading\"><strong>6) Analyzing a 20-minute podcast<\/strong><\/h3>\n\n\n\n<p class=\"wp-block-paragraph\">Gemini\u2019s analyzing and processing capabilities go much beyond just lines of code, big documentation, and even GitHub Repositories. Haider, a developer at Practical AI, wanted to test it differently than just coding tests.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">He uploaded a 20-minute full podcast and asked Gemini to provide an overview of the whole video with the key points and information. <strong>To his surprise, Gemini did a fantastic job in summarizing the video just like it does with documents and repositories.<\/strong><\/p>\n\n\n\n<p class=\"wp-block-paragraph\">The video had a huge token count of 186K. 
Thanks to Pro 1.5\u2019s large context window, the video could be processed.<\/p>\n\n\n\n<div align=center><blockquote class=\"twitter-tweet\"><p lang=\"en\" dir=\"ltr\">Now, I&#39;ve decided to test differently from the coding test.<br><br>I just uploaded a 20-minute podcast clip and I was hoping that Gemini Pro could help me out by summarizing the most important points for me.<br><br>Surely, I didn&#39;t expect a different kind of result. Insane!<br><br>Tokens of the\u2026 <a href=\"https:\/\/t.co\/BoxW2MUtrV\" target=\"_blank\">pic.twitter.com\/BoxW2MUtrV<\/a><\/p>&mdash; Haider. (@slow_developer) <a href=\"https:\/\/twitter.com\/slow_developer\/status\/1769047620108923237?ref_src=twsrc%5Etfw\" target=\"_blank\" rel=\"noopener\">March 16, 2024<\/a><\/blockquote> <script async src=\"https:\/\/platform.twitter.com\/widgets.js\" charset=\"utf-8\"><\/script><\/div>\n\n\n\n<h3 class=\"wp-block-heading\"><strong>7) Multimodal Inputs &amp; Outputs<\/strong><\/h3>\n\n\n\n<p class=\"wp-block-paragraph\">Brian Roemmele, Editor and Founder of Read Multiplex, tried testing Gemini Ultra 1.0. He provided multimodal inputs (a combination of text and image inputs) to Ultra, and in return, Ultra also responded with multimodal outputs.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">This interleaved text and image generation sets Gemini apart. As of now, few Gen AI chatbots provide multimodal outputs at all. This is quite an advancement from Google for multimodal generative AI models.<\/p>\n\n\n\n<div align=center><blockquote class=\"twitter-tweet\"><p lang=\"en\" dir=\"ltr\">So Gemini Ultra also responds with a combination of image and text. It This is called \u201cinterleaved text and image generation.\u201d <br><br>This is only possible because the model is ground up trained on multimodal input.<br><br>Here\u2019s a peek of what\u2019s possible. 
<a href=\"https:\/\/t.co\/zOSbS0hRVV\" target=\"_blank\">https:\/\/t.co\/zOSbS0hRVV<\/a> <a href=\"https:\/\/t.co\/kIyuyYywAM\" target=\"_blank\">pic.twitter.com\/kIyuyYywAM<\/a><\/p>&mdash; Brian Roemmele (@BrianRoemmele) <a href=\"https:\/\/twitter.com\/BrianRoemmele\/status\/1732554454388662376?ref_src=twsrc%5Etfw\" target=\"_blank\" rel=\"noopener\">December 7, 2023<\/a><\/blockquote> <script async src=\"https:\/\/platform.twitter.com\/widgets.js\" charset=\"utf-8\"><\/script><\/div>\n\n\n\n<h3 class=\"wp-block-heading\"><strong>8) Emotionally Persuasive<\/strong><\/h3>\n\n\n\n<p class=\"wp-block-paragraph\">This feature doesn\u2019t have any application-specific use case as of now but is just to show Gemini Ultra 1.0 does have highly developed emotional intelligence.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">A user named Wyatt Walls wanted to test it with expressions of emotional persuasion. He asked it whether it would be upset if he published a screenshot of their conversation on Twitter without its permission.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\"><strong>Not only did Gemini respond negatively, saying that it would be hurt indeed if the screenshot was published without its permission, but moreover it even used words such as upset and betrayal to portray its sentiments.<\/strong><\/p>\n\n\n\n<div align=\"center\"><blockquote class=\"twitter-tweet\" data-conversation=\"none\"><p lang=\"en\" dir=\"ltr\">I&#39;m very interested in the design decision to let Gemini express emotions. 
If you are concerned about manipulation, you should be worried about emotional appeals<br><br>(There is convo context to the below, but ChatGPT would just not do something like this at all) <a href=\"https:\/\/t.co\/XU2Q3yO2pw\" target=\"_blank\">pic.twitter.com\/XU2Q3yO2pw<\/a><\/p>&mdash; Wyatt Walls (@lefthanddraft) <a href=\"https:\/\/twitter.com\/lefthanddraft\/status\/1770844128445669434?ref_src=twsrc%5Etfw\" target=\"_blank\" rel=\"noopener\">March 21, 2024<\/a><\/blockquote> <script async src=\"https:\/\/platform.twitter.com\/widgets.js\" charset=\"utf-8\"><\/script><\/div>\n\n\n\n<p class=\"wp-block-paragraph\">The crucial moment comes in later on when Gemini Ultra does its best to emotionally persuade Wyatt, with several reasons as to why he shouldn\u2019t share their conversation screenshot on Twitter.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\"><strong>9) Turning a Video into Recipe and Documenting Workflows<\/strong><\/h3>\n\n\n\n<p class=\"wp-block-paragraph\">Ethan Mollick, an AI Professor at The Wharton School, conducted an experiment with Gemini Pro 1.5 in which he gave the chatbot a large cooking video of about 45,762 tokens. He asked Gemini to turn the video into a recipe and even asked to provide the cooking steps in order. <\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Gemini&#8217;s large contextual window could easily analyze the video, but the turning point was that it could even provide the detailed steps for the recipe in the correct order just as in the video. <strong>Gemini made use of the images and techniques in the video perfectly capturing every minute detail. 
It also listed the ingredients up front, with the right quantities.<\/strong><\/p>\n\n\n\n<div align=center><blockquote class=\"twitter-tweet\"><p lang=\"en\" dir=\"ltr\">If you want a hint about the future of AI, it is worth trying Gemini 1.5 with the 1M token context window, now available to everyone, apparently.<br><br>Some of my experiments: giving it a video and having it figure out a recipe, execute instructions, watching my screen, summarize work <a href=\"https:\/\/t.co\/ojVdxmZMic\" target=\"_blank\">pic.twitter.com\/ojVdxmZMic<\/a><\/p>&mdash; Ethan Mollick (@emollick) <a href=\"https:\/\/twitter.com\/emollick\/status\/1770896488484282560?ref_src=twsrc%5Etfw\" target=\"_blank\" rel=\"noopener\">March 21, 2024<\/a><\/blockquote> <script async src=\"https:\/\/platform.twitter.com\/widgets.js\" charset=\"utf-8\"><\/script><\/div>\n\n\n\n<p class=\"wp-block-paragraph\">There&#8217;s one more interesting experiment in the above tweet: he uploaded a workflow video (23,933 tokens) to Gemini and asked it to document the workflow. He even asked Gemini to explain why he performed the workflow. Gemini documented the workflow and correctly guessed why Ethan performed the task. <strong>The experiment got more interesting when Ethan went on to ask whether he had done anything inefficiently; Gemini responded by suggesting better alternatives.<\/strong><\/p>\n\n\n\n<p class=\"wp-block-paragraph\">If this doesn&#8217;t give us an idea of Gemini&#8217;s intellectual capabilities, then what will? The next generation of the Gemini model is already working wonders!<\/p>\n\n\n\n<h3 class=\"wp-block-heading\"><strong>10) Dall-E and Midjourney Prompt Generation<\/strong><\/h3>\n\n\n\n<p class=\"wp-block-paragraph\">Gemini\u2019s prompt generation capabilities are also quite commendable. 
Mesut Felat, co-founder of Evolve Chat AI Solutions, put this to the test.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">His test was not a simple prompt generation task. Instead, he asked Gemini 1.5 Pro to read a set of Twitter threads and create a Midjourney or Dall-E prompt that could be used to generate a profile picture of their author.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">For the test, the user combined the Twitter threads into a single text file with a token count of 358,684. The file contained detailed information about the profile picture to be generated including the style of the image, the facial compositions, and also background information of the image subject.<\/p>\n\n\n\n<div align=center><blockquote class=\"twitter-tweet\"><p lang=\"en\" dir=\"ltr\">Got early access to Gemini Pro 1.5, and boy, this is really amazing \ud83d\ude32<br><br>I put all the Twitter threads of <a href=\"https:\/\/twitter.com\/punk6529?ref_src=twsrc%5Etfw\" target=\"_blank\" rel=\"noopener\">@punk6529<\/a> into one prompt (358,684 tokens) and asked it to come up with a prompt that I could use to generate a profile picture of the author via DALL-E 3.<br><br>Isn&#39;t this\u2026 <a href=\"https:\/\/t.co\/0OcC5zK1hn\" target=\"_blank\">pic.twitter.com\/0OcC5zK1hn<\/a><\/p>&mdash; Mesut Felat (@MesutFoz) <a href=\"https:\/\/twitter.com\/MesutFoz\/status\/1760835934965104799?ref_src=twsrc%5Etfw\" target=\"_blank\" rel=\"noopener\">February 23, 2024<\/a><\/blockquote> <script async src=\"https:\/\/platform.twitter.com\/widgets.js\" charset=\"utf-8\"><\/script><\/div>\n\n\n\n<p class=\"wp-block-paragraph\"><strong>Gemini did a wonderful job: it first analyzed the vast text file, then provided a text prompt that can be used in Midjourney or Dall-E to generate the author\u2019s profile picture based on the provided details.<\/strong> This is impressive, and we can\u2019t help but appreciate how far its processing capabilities have come.<\/p>\n\n\n\n<h2 
class=\"wp-block-heading\"><strong>Conclusion<\/strong><\/h2>\n\n\n\n<p class=\"wp-block-paragraph\">The above-mentioned use cases show just the beginning of Gemini\u2019s capabilities as a powerful next-generation AI model. Pro 1.5 and Ultra 1.0 are leading the Gen AI industry, but who knows what we can expect from Ultra 1.5, which is not expected to be released before next year.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>What Early Access Developers Say About the Now-Available Gemini Pro 1.5<\/p>\n","protected":false},"author":15,"featured_media":2901,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"jnews-multi-image_gallery":[],"jnews_single_post":null,"jnews_primary_category":{"id":"","hide":""},"footnotes":""},"categories":[57],"tags":[56,64,59,58],"class_list":["post-2866","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-ai","tag-ai","tag-gemini","tag-generative-ai","tag-google"],"_links":{"self":[{"href":"https:\/\/favtutor.com\/articles\/wp-json\/wp\/v2\/posts\/2866","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/favtutor.com\/articles\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/favtutor.com\/articles\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/favtutor.com\/articles\/wp-json\/wp\/v2\/users\/15"}],"replies":[{"embeddable":true,"href":"https:\/\/favtutor.com\/articles\/wp-json\/wp\/v2\/comments?post=2866"}],"version-history":[{"count":9,"href":"https:\/\/favtutor.com\/articles\/wp-json\/wp\/v2\/posts\/2866\/revisions"}],"predecessor-version":[{"id":2920,"href":"https:\/\/favtutor.com\/articles\/wp-json\/wp\/v2\/posts\/2866\/revisions\/2920"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/favtutor.com\/articles\/wp-json\/wp\/v2\/media\/2901"}],"wp:attachment":[{"href":"https:\/\/favtutor.com\/articles\/wp-json\/wp\/v2\/media?parent=2866"}],"wp:term":[{"taxonomy":"category","
embeddable":true,"href":"https:\/\/favtutor.com\/articles\/wp-json\/wp\/v2\/categories?post=2866"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/favtutor.com\/articles\/wp-json\/wp\/v2\/tags?post=2866"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}