AI is rapidly reshaping many industries, and gaming is one of them. A recent innovation from anonymous researchers allows AI to understand video games from gameplay screenshots alone. Let's find out more about this tool and the concerns surrounding it.
Highlights:
- A new model named VideoGameBunny has been introduced by anonymous researchers.
- It can now comprehend in-game environments from static images and answer questions based on them.
- This model has the potential to transform player experiences, but it also raises concerns about enabling cheating.
Say Hi to VideoGameBunny
AI companies are constantly working on vision models, and video games are one of the best domains in which to test them.
Large Language and Vision Assistant, abbreviated generally as LLaVA, is a model that integrates visual augmentation into large language models.
VideoGameBunny is a LLaVA-style model that helps users understand video game images. It can look at a game screenshot and describe it or answer questions about it within seconds.
This model makes games more accessible. It is built on an architecture called Bunny and trained on an extensive dataset of 185,259 game images from 413 titles, along with 389,565 image-instruction pairs that include image captions, question-answer pairs, and JSON representations of 16 in-game elements for 136,974 of the images.
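To make the dataset description concrete, a single image-instruction record could look roughly like the sketch below. The field names (`image`, `caption`, `qa_pairs`, `game_state`) and the file path are illustrative assumptions, not the project's actual schema:

```python
import json

# Hypothetical structure of one image-instruction pair.
# Field names and the screenshot path are illustrative assumptions,
# not the dataset's real schema.
record = {
    "image": "screenshots/game_00421.png",  # hypothetical path
    "caption": "A knight stands on a cliff overlooking a ruined castle.",
    "qa_pairs": [
        {"question": "What time of day is it?", "answer": "Dusk."},
    ],
    # JSON representation of in-game elements; the dataset covers 16
    # element types, two shown here as examples.
    "game_state": {
        "characters": ["knight"],
        "weather": "overcast",
    },
}

serialized = json.dumps(record, indent=2)
print(sorted(record.keys()))
```

A training pipeline would pair each such record with its screenshot and feed the caption or question-answer text as instruction-tuning targets.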
The anonymous researchers released two VideoGameBunny models, with 4 billion and 8 billion parameters. The code, dataset, training data, and logs are all publicly available on GitHub so that others can build on the research.
How does it work?
Despite being almost 4x smaller, VideoGameBunny is more accurate than comparable models such as Bunny and LLaVA. Because it is trained specifically on gaming data, it understands the nuances of game scenes well. Here is a screenshot of the model comparison in image captioning:
You can see how the other models give inaccurate descriptions (highlighted in red) of the attached image. On the researchers' evaluation dataset, VideoGameBunny achieved an accuracy of 85.1%, whereas Bunny scored 73.3% and LLaVA's 34B model reached 83.9%.
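As a quick sanity check on those reported numbers, the accuracy gaps work out as follows (a trivial sketch using only the figures quoted above):

```python
# Reported evaluation accuracies (percent) from the comparison above.
scores = {"VideoGameBunny": 85.1, "Bunny": 73.3, "LLaVA-34B": 83.9}

best = max(scores, key=scores.get)
gap_vs_bunny = round(scores["VideoGameBunny"] - scores["Bunny"], 1)
gap_vs_llava = round(scores["VideoGameBunny"] - scores["LLaVA-34B"], 1)

print(best, gap_vs_bunny, gap_vs_llava)  # VideoGameBunny 11.8 1.2
```

So the specialized model leads its own base architecture by 11.8 points and edges out a model roughly four times its size by 1.2 points.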
These results show how well VideoGameBunny performs at video game understanding and image captioning. The researchers hope their model paves the way for more advanced AI assistants for understanding, playing, commentating on, and debugging video games.
The best part is that this AI is not limited to a single game; it can serve as a general-purpose assistant for gamers.
While VideoGameBunny might be a useful tool for many, it could also be used for cheating. The developers were aware of this, writing:
“The short-term implications for the gaming industry include enhancing the productivity of game testers and enhancing quality assurance processes. One possible negative impact of such advancement is the facilitation of in-game cheating. As AI models become more adept at understanding game contents, there is a risk that they could be used to create sophisticated cheating tools.”
So, it will be intriguing to see what people build with this new tool, whether good or bad. Only time will tell.
In a similar push into the video game niche, Google has built SIMA, an AI agent trained in various gaming environments to follow natural-language instructions from users.
Conclusion
AI is a powerful tool, and like anything else, it can be used for good or bad. VideoGameBunny is no exception. While it offers exciting prospects for enhancing player experiences and streamlining development, it also risks empowering cheaters. In the end, how it is used is in our hands.