{"id":3736,"date":"2024-04-16T06:19:44","date_gmt":"2024-04-16T06:19:44","guid":{"rendered":"https:\/\/favtutor.com\/articles\/?p=3736"},"modified":"2024-04-16T06:19:45","modified_gmt":"2024-04-16T06:19:45","slug":"google-deepmind-robots-ai-soccer","status":"publish","type":"post","link":"https:\/\/favtutor.com\/articles\/google-deepmind-robots-ai-soccer\/","title":{"rendered":"AI Soccer Evolution: Google Trains Robots with Improved Skills"},"content":{"rendered":"\n<p><strong>Google DeepMind researchers successfully trained 20-inch-tall humanoid robots to play 1v1 soccer matches using a deep reinforcement learning approach. Through this training, the robots learned to run, kick, block, get up from falls, and score goals without any manual programming.<\/strong><\/p>\n\n\n\n<div align=\"center\"><blockquote class=\"twitter-tweet\"><p lang=\"en\" dir=\"ltr\">Soccer players have to master a range of dynamic skills, from turning and kicking to chasing a ball. How could robots do the same? \u26bd<br><br>We trained our AI agents to demonstrate a range of agile behaviors using reinforcement learning.<br><br>Here\u2019s how. \ud83e\uddf5 <a href=\"https:\/\/t.co\/RFBxLG6SMn\" target=\"_blank\">https:\/\/t.co\/RFBxLG6SMn<\/a> <a href=\"https:\/\/t.co\/4B4S2YiVLh\" target=\"_blank\">pic.twitter.com\/4B4S2YiVLh<\/a><\/p>&mdash; Google DeepMind (@GoogleDeepMind) <a href=\"https:\/\/twitter.com\/GoogleDeepMind\/status\/1778377999202541642?ref_src=twsrc%5Etfw\" target=\"_blank\" rel=\"noopener\">April 11, 2024<\/a><\/blockquote> <script async src=\"https:\/\/platform.twitter.com\/widgets.js\" charset=\"utf-8\"><\/script><\/div>\n\n\n\n<h2 class=\"wp-block-heading\"><strong>Robot Soccer Training Approach<\/strong><\/h2>\n\n\n\n<p>Researchers at Google DeepMind have designed a technique to train low-cost, off-the-shelf bipedal robots to play multi-robot soccer with agility and fluency well beyond what is intuitively expected from this type of robot. 
They used a deep reinforcement learning (RL) approach and performed their tests on the widely available Robotis OP3 robots.<\/p>\n\n\n\n<p>The researchers explained that generating robust motor skills in bipedal robots is challenging because current control methods do not generalize well across tasks. They used deep RL to control the full body of the Robotis OP3 robots, allowing them to play one-on-one matches.<\/p>\n\n\n\n<p>The training method aimed to teach the robots a wide variety of skills, such as walking, kicking, getting up from a fall, scoring goals, and defending. They divided the training pipeline into two stages:<\/p>\n\n\n<div class=\"wp-block-image\">\n<figure class=\"aligncenter size-large\"><img decoding=\"async\" width=\"1024\" height=\"414\" src=\"https:\/\/favtutor.com\/articles\/wp-content\/uploads\/2024\/04\/Agent-training-setup-Google-Soccer-1024x414.jpg\" alt=\"Agent training setup Google Soccer\" class=\"wp-image-3739\" srcset=\"https:\/\/favtutor.com\/articles\/wp-content\/uploads\/2024\/04\/Agent-training-setup-Google-Soccer-1024x414.jpg 1024w, https:\/\/favtutor.com\/articles\/wp-content\/uploads\/2024\/04\/Agent-training-setup-Google-Soccer-300x121.jpg 300w, https:\/\/favtutor.com\/articles\/wp-content\/uploads\/2024\/04\/Agent-training-setup-Google-Soccer-768x310.jpg 768w, https:\/\/favtutor.com\/articles\/wp-content\/uploads\/2024\/04\/Agent-training-setup-Google-Soccer-750x303.jpg 750w, https:\/\/favtutor.com\/articles\/wp-content\/uploads\/2024\/04\/Agent-training-setup-Google-Soccer-1140x460.jpg 1140w, https:\/\/favtutor.com\/articles\/wp-content\/uploads\/2024\/04\/Agent-training-setup-Google-Soccer.jpg 1280w\" sizes=\"(max-width: 1024px) 100vw, 1024px\" \/><\/figure>\n<\/div>\n\n\n<p>They first trained two separate skill policies: one for getting up from the ground and another for scoring a goal against an untrained opponent.<\/p>\n\n\n\n<p>For the 
get-up skill, they used a sequence of target poses to bias the policy towards a stable, collision-free trajectory. For the soccer skill, the agent was trained to score as many goals as possible against an untrained opponent.<\/p>\n\n\n\n<p>In the second stage, these skills were distilled into a single agent, which was then trained via self-play against increasingly stronger opponents drawn from a pool of partially trained copies of the agent itself.<\/p>\n\n\n\n<p>They used policy distillation to enable the agent to learn from the skill policies, regularizing the agent&#8217;s policy towards the relevant skill policy depending on the agent&#8217;s state.<\/p>\n\n\n\n<p>This allowed the agent to integrate previously learned skills, refine them for the complete soccer task, and anticipate the opponent&#8217;s actions.<\/p>\n\n\n\n<p>To improve the robustness of the policies, enhance exploration, and facilitate safe transfer to real robots, the researchers employed techniques such as domain randomization, random perturbations during training, and shaped reward terms.<\/p>\n\n\n\n<p>The robots were trained in a simulated environment using the MuJoCo physics engine, and the resulting policy was then deployed directly on real Robotis OP3 miniature humanoid robots without any fine-tuning.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\"><strong>Here Are the Results<\/strong><\/h2>\n\n\n\n<p>The trained robots exhibited impressive performance in real-world tests. They outperformed specialized manually designed controllers in key behaviours such as walking (181% faster), turning (302% faster), getting up (63% less time), and kicking (34% faster with a run-up approach). 
<\/p>\n\n\n\n<p>The robots also demonstrated opponent awareness, adaptive footwork, and the ability to quickly recover from falls:<\/p>\n\n\n\n<div align=\"center\"><blockquote class=\"twitter-tweet\" data-media-max-width=\"560\"><p lang=\"en\" dir=\"ltr\">Our players were able to walk, turn, kick and stand up faster than manually programmed skills on this type of robot. \ud83d\udd01<br><br>They could also combine movements to score goals, anticipate ball movements and block opponent shots &#8211; thereby developing a basic understanding of a 1v1 game. <a href=\"https:\/\/t.co\/1Bty4q9tDN\" target=\"_blank\">pic.twitter.com\/1Bty4q9tDN<\/a><\/p>&mdash; Google DeepMind (@GoogleDeepMind) <a href=\"https:\/\/twitter.com\/GoogleDeepMind\/status\/1778378004814586019?ref_src=twsrc%5Etfw\" target=\"_blank\" rel=\"noopener\">April 11, 2024<\/a><\/blockquote> <script async src=\"https:\/\/platform.twitter.com\/widgets.js\" charset=\"utf-8\"><\/script><\/div>\n\n\n\n<p>During 1v1 soccer matches, the robots showcased a variety of emergent behaviours, including agile movements, recovery from falls, object interaction, and strategic behaviours such as defending and protecting the ball with their bodies. 
<\/p>\n\n\n\n<p>Remarkably, these robots moved faster than robots controlled by traditional scripted methods, suggesting that this framework could orchestrate more intricate interactions among multiple robots.<\/p>\n\n\n\n<figure class=\"wp-block-embed is-type-video is-provider-youtube wp-block-embed-youtube wp-embed-aspect-16-9 wp-has-aspect-ratio\"><div class=\"wp-block-embed__wrapper\">\n<div class=\"jeg_video_container jeg_video_content\"><iframe title=\"Google DeepMind Trained Robots Playing Soccer, Part 1\" width=\"500\" height=\"281\" src=\"https:\/\/www.youtube.com\/embed\/lrlhi1l16Nk?feature=oembed\" frameborder=\"0\" allow=\"accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture; web-share\" referrerpolicy=\"strict-origin-when-cross-origin\" allowfullscreen><\/iframe><\/div>\n<\/div><\/figure>\n\n\n\n<p>They smoothly transitioned between behaviours and adapted their tactics to the game context. This approach allowed emergent behaviours to be discovered and optimized for specific contexts, so the agent could learn and adapt more effectively.<\/p>\n\n\n\n<figure class=\"wp-block-embed is-type-video is-provider-youtube wp-block-embed-youtube wp-embed-aspect-16-9 wp-has-aspect-ratio\"><div class=\"wp-block-embed__wrapper\">\n<div class=\"jeg_video_container jeg_video_content\"><iframe title=\"Google DeepMind Trained Robots Playing Soccer, Part 2\" width=\"500\" height=\"281\" src=\"https:\/\/www.youtube.com\/embed\/MauNtcHQsvQ?feature=oembed\" frameborder=\"0\" allow=\"accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture; web-share\" referrerpolicy=\"strict-origin-when-cross-origin\" allowfullscreen><\/iframe><\/div>\n<\/div><\/figure>\n\n\n\n<p>The results showed that pretraining separate soccer and get-up skills was crucial for success, as attempting to learn end-to-end without these separate skills led to suboptimal solutions. 
By using a minimal set of pre-trained skills, they simplified reward design, improved exploration, and avoided poor locomotion outcomes.<\/p>\n\n\n\n<p>The ability of the robots to adapt to complicated situations, combine different skills, and make strategic decisions during soccer gameplay highlights the potential of deep reinforcement learning in AI-based robotics.<\/p>\n\n\n\n<p>Such an approach advances the development of general-purpose robots, rather than robots trained for specific tasks.<\/p>\n\n\n\n<p>The Google DeepMind team is also working on the <a href=\"https:\/\/favtutor.com\/articles\/tacticai-google-football-assistant\/\">TacticAI System<\/a>, which provides experts with tactical insights, mainly on corner kicks.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\"><strong>Conclusion<\/strong><\/h2>\n\n\n\n<p>The success achieved by researchers at Google DeepMind marks a significant step forward in the creation of intelligent autonomous robots. The deep reinforcement learning approach to training, combined with techniques like domain randomization and self-play, can enable robots to learn complex tasks and adapt to dynamic environments. 
<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Everything about the latest research at Google DeepMind, where they trained robots to play soccer using a deep reinforcement learning approach.<\/p>\n","protected":false},"author":18,"featured_media":3741,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"jnews-multi-image_gallery":[],"jnews_single_post":null,"jnews_primary_category":{"id":"","hide":""},"footnotes":""},"categories":[57],"tags":[56,58,133],"class_list":["post-3736","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-ai","tag-ai","tag-google","tag-research"],"_links":{"self":[{"href":"https:\/\/favtutor.com\/articles\/wp-json\/wp\/v2\/posts\/3736","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/favtutor.com\/articles\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/favtutor.com\/articles\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/favtutor.com\/articles\/wp-json\/wp\/v2\/users\/18"}],"replies":[{"embeddable":true,"href":"https:\/\/favtutor.com\/articles\/wp-json\/wp\/v2\/comments?post=3736"}],"version-history":[{"count":2,"href":"https:\/\/favtutor.com\/articles\/wp-json\/wp\/v2\/posts\/3736\/revisions"}],"predecessor-version":[{"id":3742,"href":"https:\/\/favtutor.com\/articles\/wp-json\/wp\/v2\/posts\/3736\/revisions\/3742"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/favtutor.com\/articles\/wp-json\/wp\/v2\/media\/3741"}],"wp:attachment":[{"href":"https:\/\/favtutor.com\/articles\/wp-json\/wp\/v2\/media?parent=3736"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/favtutor.com\/articles\/wp-json\/wp\/v2\/categories?post=3736"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/favtutor.com\/articles\/wp-json\/wp\/v2\/tags?post=3736"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}