{"id":7273,"date":"2025-03-21T09:03:51","date_gmt":"2025-03-21T09:03:51","guid":{"rendered":"https:\/\/favtutor.com\/articles\/?p=7273"},"modified":"2025-03-21T09:03:53","modified_gmt":"2025-03-21T09:03:53","slug":"mc-bench-minecraft-ai-models","status":"publish","type":"post","link":"https:\/\/favtutor.com\/articles\/mc-bench-minecraft-ai-models\/","title":{"rendered":"People are Testing AI Models with Minecraft Builds"},"content":{"rendered":"\n<p>People love Minecraft and like to see who can create the most beautiful builds there. Now, if AI models want to be like us, they need to be creative too. So, this new website lets you test and compare which AI models are good at Minecraft.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\"><strong>Minecraft Benchmarking for AI Models<\/strong><\/h2>\n\n\n\n<p>MC-Bench, or Minecraft Benchmarking, is a website created by 12th-grader Aditya Singh. On the website (<a href=\"https:\/\/mcbench.ai\/\" target=\"_blank\" rel=\"noreferrer noopener nofollow\">mcbench.ai<\/a>), you can compare two AI models on how well they generate innovative Minecraft creations from the same prompt.<\/p>\n\n\n\n<p><strong>MC-Bench serves as a benchmarking platform specifically designed to evaluate AI models&#8217; capabilities in generating Minecraft builds.<\/strong><\/p>\n\n\n\n<p>Here&#8217;s how it works: when you visit the website, you are shown two creations and asked to vote for the one that looks better. 
For example, here are two tables made by two different AI models:<\/p>\n\n\n<div class=\"wp-block-image\">\n<figure class=\"aligncenter size-full\"><img decoding=\"async\" width=\"1217\" height=\"704\" src=\"https:\/\/favtutor.com\/articles\/wp-content\/uploads\/2025\/03\/Minecraft-builds-by-AI-Models.jpg\" alt=\"Minecraft builds by AI Models\" class=\"wp-image-7275\" srcset=\"https:\/\/favtutor.com\/articles\/wp-content\/uploads\/2025\/03\/Minecraft-builds-by-AI-Models.jpg 1217w, https:\/\/favtutor.com\/articles\/wp-content\/uploads\/2025\/03\/Minecraft-builds-by-AI-Models-768x444.jpg 768w, https:\/\/favtutor.com\/articles\/wp-content\/uploads\/2025\/03\/Minecraft-builds-by-AI-Models-750x434.jpg 750w, https:\/\/favtutor.com\/articles\/wp-content\/uploads\/2025\/03\/Minecraft-builds-by-AI-Models-1140x659.jpg 1140w\" sizes=\"(max-width: 1217px) 100vw, 1217px\" \/><\/figure>\n<\/div>\n\n\n<p>You have to vote for one of them but there is also a &#8220;Tie&#8221; option if you think both are equally good.<\/p>\n\n\n\n<p>After voting, it will reveal the names of the AI models:<\/p>\n\n\n<div class=\"wp-block-image\">\n<figure class=\"aligncenter size-full\"><img decoding=\"async\" width=\"1202\" height=\"711\" src=\"https:\/\/favtutor.com\/articles\/wp-content\/uploads\/2025\/03\/Minecraft-builds-by-AI-Models-Results.jpg\" alt=\"Minecraft builds by AI Models Results\" class=\"wp-image-7276\" srcset=\"https:\/\/favtutor.com\/articles\/wp-content\/uploads\/2025\/03\/Minecraft-builds-by-AI-Models-Results.jpg 1202w, https:\/\/favtutor.com\/articles\/wp-content\/uploads\/2025\/03\/Minecraft-builds-by-AI-Models-Results-768x454.jpg 768w, https:\/\/favtutor.com\/articles\/wp-content\/uploads\/2025\/03\/Minecraft-builds-by-AI-Models-Results-750x444.jpg 750w, https:\/\/favtutor.com\/articles\/wp-content\/uploads\/2025\/03\/Minecraft-builds-by-AI-Models-Results-1140x674.jpg 1140w\" sizes=\"(max-width: 1202px) 100vw, 1202px\" \/><\/figure>\n<\/div>\n\n\n<p>If you are a Minecraft 
player, this is a fun game you must try once. Here you can use your gaming skills to judge the AI models.<\/p>\n\n\n\n<p>Here is another example of building Frosty the Snowman:<\/p>\n\n\n<div class=\"wp-block-image\">\n<figure class=\"aligncenter size-full\"><img decoding=\"async\" width=\"1152\" height=\"695\" src=\"https:\/\/favtutor.com\/articles\/wp-content\/uploads\/2025\/03\/Minecraft-builds-by-AI-Models-2.jpg\" alt=\"Minecraft builds by AI Models 2\" class=\"wp-image-7277\" srcset=\"https:\/\/favtutor.com\/articles\/wp-content\/uploads\/2025\/03\/Minecraft-builds-by-AI-Models-2.jpg 1152w, https:\/\/favtutor.com\/articles\/wp-content\/uploads\/2025\/03\/Minecraft-builds-by-AI-Models-2-768x463.jpg 768w, https:\/\/favtutor.com\/articles\/wp-content\/uploads\/2025\/03\/Minecraft-builds-by-AI-Models-2-750x452.jpg 750w, https:\/\/favtutor.com\/articles\/wp-content\/uploads\/2025\/03\/Minecraft-builds-by-AI-Models-2-1140x688.jpg 1140w\" sizes=\"(max-width: 1152px) 100vw, 1152px\" \/><\/figure>\n<\/div>\n\n\n<p>The builds also include Earth from space:<\/p>\n\n\n\n<div align=\"center\"><blockquote class=\"twitter-tweet\"><p lang=\"en\" dir=\"ltr\">Herein, GPT 4.5 &#8211; Preview (2025-02-27) uses a simple Perlin Noise approximation to &quot;Build our Earth as a sphere viewed from space, as detailed and realistic as possible.&quot;<br><br>Share link below.<br><br>cc: <a href=\"https:\/\/twitter.com\/OpenAI?ref_src=twsrc%5Etfw\" target=\"_blank\" rel=\"noopener\">@OpenAI<\/a> <a href=\"https:\/\/t.co\/8dYfl5GJxi\" target=\"_blank\">pic.twitter.com\/8dYfl5GJxi<\/a><\/p>&mdash; Minecraft Benchmark (@_mcbench) <a href=\"https:\/\/twitter.com\/_mcbench\/status\/1900536821798392023?ref_src=twsrc%5Etfw\" target=\"_blank\" rel=\"noopener\">March 14, 2025<\/a><\/blockquote> <script async src=\"https:\/\/platform.twitter.com\/widgets.js\" charset=\"utf-8\"><\/script><\/div>\n\n\n\n<p>Even unicorns:<\/p>\n\n\n\n<div align=\"center\"><blockquote class=\"twitter-tweet\"><p 
lang=\"en\" dir=\"ltr\">Sometimes a model produces an elegant algorithm for placing blocks.<br><br>Other times it does the calculations &quot;in its head.&quot;<br><br>Herein, GPT 4.5 &#8211; Preview (2025-02-27) just lays down the blocks to create &quot;A fancy colorful Unicorn.&quot;<br><br>Share link below. <a href=\"https:\/\/t.co\/eVUOtwv3hZ\" target=\"_blank\">pic.twitter.com\/eVUOtwv3hZ<\/a><\/p>&mdash; Minecraft Benchmark (@_mcbench) <a href=\"https:\/\/twitter.com\/_mcbench\/status\/1900158647612678636?ref_src=twsrc%5Etfw\" target=\"_blank\" rel=\"noopener\">March 13, 2025<\/a><\/blockquote> <script async src=\"https:\/\/platform.twitter.com\/widgets.js\" charset=\"utf-8\"><\/script><\/div>\n\n\n\n<p><strong>Overall, users vote on the best Minecraft build before discovering which AI created it. This means it is a human preference leaderboard, just like LMArena.<\/strong><\/p>\n\n\n\n<p>Minecraft has achieved remarkable success since its release in 2009, becoming the best-selling video game of all time. As of October 2023, it had sold over 300 million copies worldwide. That&#8217;s why the creator of this website chose Minecraft for benchmarking AI models. He explained the choice to <a href=\"https:\/\/techcrunch.com\/2025\/03\/20\/a-high-schooler-built-a-website-that-lets-you-challenge-ai-models-to-a-minecraft-build-off\/\" target=\"_blank\" rel=\"noreferrer noopener nofollow\">TechCrunch<\/a>:<\/p>\n\n\n\n<blockquote class=\"wp-block-quote is-layout-flow wp-block-quote-is-layout-flow\">\n<p>&#8220;Minecraft allows people to see the progress (of AI development) much more easily. People are used to Minecraft, used to the look and the vibe.&#8221;<\/p>\n\n\n\n<p>-Aditya Singh<\/p>\n<\/blockquote>\n\n\n\n<p>Traditional AI benchmarks typically use complex metrics and programming challenges that are difficult for the average person to understand. 
While valuable for researchers, these benchmarks often lack accessibility.<\/p>\n\n\n\n<p>There is also a leaderboard available on the website. The #1 spot is currently held by Anthropic&#8217;s <a href=\"https:\/\/favtutor.com\/articles\/claude-3-7-sonnet-examples\/\" target=\"_blank\" rel=\"noreferrer noopener nofollow\">Claude 3.7 Sonnet<\/a>. It had a win rate of 86% the last time I checked. The runner-up is also an Anthropic model: Claude 3.5 Sonnet. OpenAI&#8217;s GPT-4.5 Preview is in third place.<\/p>\n\n\n\n<p>According to the creator, the leaderboard reflects his own experience with these models, suggesting that MC-Bench offers an accurate assessment.<\/p>\n\n\n\n<p>People online are also finding it enjoyable. Some are calling it the &#8220;coolest benchmark ever&#8221;.<\/p>\n\n\n\n<p>As of 15 March 2025, over 10,000 individual build samples have been voted on. Another 20,000 builds are yet to be evaluated, according to the latest update from the project&#8217;s X account.<\/p>\n\n\n\n<p>Minecraft&#8217;s open-ended nature makes it an ideal testing ground for AI creativity. Benchmarking AI models in this environment helps determine how well AI can design within Minecraft&#8217;s constraints.<\/p>\n\n\n\n<p>But this is not the first time games have been used for AI research. Classic games like Super Mario Bros, Street Fighter, and Pokemon Red have also been used recently to test LLMs.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\"><strong>Takeaways<\/strong><\/h2>\n\n\n\n<p>There are many ways to test AI models, but this is the most interesting method I have seen so far. It also adds some fun to a technical industry, which might encourage young minds to get started with AI.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>People love Minecraft and like to see who can create the most beautiful builds there. Now, if AI models want to be like us, they need to be creative too. 
So, this new website lets you test and compare which AI models are good at Minecraft. Minecraft Benchmarking for AI Models MC-Bench or Minecraft Benchmarking [&hellip;]<\/p>\n","protected":false},"author":33,"featured_media":7274,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"jnews-multi-image_gallery":[],"jnews_single_post":{"format":"standard"},"jnews_primary_category":[],"footnotes":""},"categories":[57],"tags":[56,108,373],"class_list":["post-7273","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-ai","tag-ai","tag-gaming","tag-minecraft"],"_links":{"self":[{"href":"https:\/\/favtutor.com\/articles\/wp-json\/wp\/v2\/posts\/7273","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/favtutor.com\/articles\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/favtutor.com\/articles\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/favtutor.com\/articles\/wp-json\/wp\/v2\/users\/33"}],"replies":[{"embeddable":true,"href":"https:\/\/favtutor.com\/articles\/wp-json\/wp\/v2\/comments?post=7273"}],"version-history":[{"count":1,"href":"https:\/\/favtutor.com\/articles\/wp-json\/wp\/v2\/posts\/7273\/revisions"}],"predecessor-version":[{"id":7278,"href":"https:\/\/favtutor.com\/articles\/wp-json\/wp\/v2\/posts\/7273\/revisions\/7278"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/favtutor.com\/articles\/wp-json\/wp\/v2\/media\/7274"}],"wp:attachment":[{"href":"https:\/\/favtutor.com\/articles\/wp-json\/wp\/v2\/media?parent=7273"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/favtutor.com\/articles\/wp-json\/wp\/v2\/categories?post=7273"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/favtutor.com\/articles\/wp-json\/wp\/v2\/tags?post=7273"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}