AI for Testing Games
I used to tell my team that “games are just too hard to tackle right now” and would tell potential investors “we don’t test games…” Today, video game teams and investors are glad they ignored me.
Video games are perhaps the toughest pieces of software to test. Unlike apps or web pages, many games have an almost infinite number of states, incredibly custom interaction models, and release updates all the time to a finicky audience. Luckily, in the past year, AI has evolved to address many of these testing problems. Let’s explore the pain points of game testing and how AI is helping in the real world today.
In a world worried about “automation”, video games are primarily tested by humans. Smaller games are tested by the developers themselves or their unfortunate friends and early players. Larger teams hire vendors or, at the extreme, hundreds of internal manual testers to test the biggest games. The biggest games are complex, update frequently, and make so much money that the team cannot afford the risk of shipping a bad version.
If you have been paying attention to ‘nerd news’, you have noticed that AI is starting to play games. Not just chess, but Go, Atari, and even League of Legends. The folks doing this work treat this progress as small, forgettable steps on the path to building thinking machines. Testers can and should look at this intermediate progress as test infrastructure. These projects leverage a complex combination of deep neural networks, reinforcement learning, and classifiers.
Testing Problem: Stores
Most of the leading games today have a store where players can purchase items such as an axe or shirt for their in-game character. This is where the money is made, so it needs to be tested. These stores are often difficult to automate with traditional test automation methods. The items in the store’s user interface are often animated, constantly change position from build to build, and appear in custom orders, since the player can pick anything at any time.
AI is helping today. AI-based testing approaches don’t rely on knowing the custom underlying implementation of the store, as a tool like Selenium does, for example. And because the game renders the axe with varying visual effects at runtime, traditional automation and image-matching or pixel-scanning techniques simply can’t find the axe.
AI-based testing tools can ‘see’ the axe through the video feed alone. Much like a human, the AI-based testing tools scan the screen looking for something that appears to be an axe. But wait, that axe might be slowly rotating, glowing, or have a feather attached to the end of it for artistic effect. AI-based approaches are great at this ‘fuzzy’ matching for something that looks like a two-dimensional (2D) axe despite the rendering variations.
Even better for the AI approach, items like axes in games are rendered from small 3D models of an axe. Loading up that model and generating thousands of different 2D perspectives of the axe makes for perfect training data for the AI — the AI will recognize that axe on the screen, just like a human who has seen many different axes, from almost any angle or under any dynamic lighting effect.
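The idea of turning one 3D model into many 2D training views can be sketched roughly as follows. This is a minimal illustration, not how any particular tool does it: the five-vertex "axe", the function names, and the brightness value standing in for lighting are all assumptions made for the example, and a real pipeline would render full images from the game's actual model.

```python
import math
import random

# Hypothetical stand-in for an axe model: a handful of 3D vertices (x, y, z).
AXE_VERTICES = [
    (0.0, -1.0, 0.0),  # bottom of handle
    (0.0, 1.0, 0.0),   # top of handle
    (0.6, 1.0, 0.0),   # blade, upper corner
    (0.8, 0.7, 0.1),   # blade, cutting edge
    (0.6, 0.4, 0.0),   # blade, lower corner
]


def rotate(vertex, yaw, pitch):
    """Rotate a vertex about the y axis (yaw), then about the x axis (pitch)."""
    x, y, z = vertex
    x, z = (x * math.cos(yaw) + z * math.sin(yaw),
            -x * math.sin(yaw) + z * math.cos(yaw))
    y, z = (y * math.cos(pitch) - z * math.sin(pitch),
            y * math.sin(pitch) + z * math.cos(pitch))
    return (x, y, z)


def project(vertex):
    """Orthographic projection to 2D: drop the depth coordinate."""
    x, y, _ = vertex
    return (x, y)


def generate_views(vertices, count, seed=0):
    """Produce `count` randomly posed 2D outlines of the model."""
    rng = random.Random(seed)
    views = []
    for _ in range(count):
        yaw = rng.uniform(0.0, 2.0 * math.pi)
        pitch = rng.uniform(-math.pi / 4, math.pi / 4)
        brightness = rng.uniform(0.5, 1.5)  # assumed stand-in for lighting
        outline = [project(rotate(v, yaw, pitch)) for v in vertices]
        views.append({"yaw": yaw, "pitch": pitch,
                      "brightness": brightness, "outline": outline})
    return views


views = generate_views(AXE_VERTICES, 1000)
print(len(views), "synthetic views generated")
```

Each generated view would then be labeled ‘axe’ and fed to whatever image classifier the testing tool trains.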
AI makes it possible to identify the dynamically rendered items in video games — something which foils standard testing approaches.
Testing Problem: Game Play
Modern web or mobile app testing is complex, but even that complexity pales in comparison to video games. People and bots can click about 10 steps deep into an average application, and at each step have perhaps 10 different options to click. That works out to roughly ten billion (10^10) distinct paths through an app, far too many to script by hand. Games, however, are even more complex. At any moment the player can choose from many options, every second, and other human and non-human players are also non-deterministically playing the game, which further complicates the game state. Creating a test script using traditional technologies to sequence every tap, swipe, step, scroll, or swing of the axe is a practically unbounded task.
Moreover, the content and maps of games are constantly evolving and changing. Some of the best modern games even let the user change the game map in real time as they play! This makes writing ‘test automation scripts’ even more of a challenge, and in reality they are constantly breaking as their hardcoded assumptions about game state change underneath them.
It is worth noting that some games have a low-level internal scripting interface so developers can write basic unit tests to verify the core game logic. The game logic is often pretty solid, but that internal scripting interface doesn’t drive the user interface and the rendering runtime, so it can’t catch issues that real-world players will run into. The most important bugs found in game testing are often in the rendering, not the game logic layer. There has been some interesting progress testing the underlying game engine with AI, such as that shared by the team behind Candy Crush, but those approaches are custom-crafted to individual games, and many games have far more complex user interfaces that still need to be tested before they can be released to players.
AI here comes to the rescue for test automation within gameplay. Applying reinforcement learning, the human tester can describe test steps as simple ‘goals’ for the AI. The AI is then left to play with the game autonomously until it achieves the goal (aka ‘test case’) of, say, ‘open treasure chest’, ‘score a kill’, ‘equip the axe’, ‘build a ramp’, or ‘exit the spaceship’. The AI can learn how to accomplish these tasks and verify they are possible and working, all without human intervention. No humans are harmed hand-coding test scripts with hundreds of clicks and gamepad movements anymore. Even when the game changes, the AI-based testing approach can autonomously re-learn how to accomplish the same tasks in the new environment.
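As a back-of-the-envelope check on those numbers, the arithmetic can be written out directly. The app figures (10 options, 10 steps) come from the text. The game figure of 10 choices per second is purely an assumption for illustration, and 10^80 is only the commonly cited order-of-magnitude estimate for atoms in the observable universe.

```python
def count_paths(options_per_step: int, depth: int) -> int:
    """Number of distinct input sequences of the given depth."""
    return options_per_step ** depth


# Figures from the text: about 10 options at each of about 10 steps.
app_paths = count_paths(10, 10)

# Assumption for illustration: a game offering 10 choices every second.
one_minute_game_paths = count_paths(10, 60)

# Commonly cited order-of-magnitude estimate, not a precise figure.
ATOMS_IN_UNIVERSE = 10 ** 80


def seconds_to_exceed(limit: int, options_per_second: int = 10) -> int:
    """Seconds of play before the path count first exceeds `limit`."""
    seconds = 0
    while count_paths(options_per_second, seconds) <= limit:
        seconds += 1
    return seconds


print(f"app paths: {app_paths:,}")
print(f"game paths after 60 s: 10^{len(str(one_minute_game_paths)) - 1}")
print(f"seconds of play to exceed 10^80 paths: {seconds_to_exceed(ATOMS_IN_UNIVERSE)}")
```

Under these assumptions, a shallow app has about ten billion paths, while a game passes the atoms-in-the-universe mark in under a minute and a half of play.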
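To make the ‘goal as test case’ idea concrete, here is a minimal sketch using tabular Q-learning, one simple form of reinforcement learning. Everything in it is a toy assumption: the 4x4 grid stands in for a game map, `step` stands in for the real game, and the reward values and hyperparameters are arbitrary. Real game-testing agents work from the screen and are far more sophisticated.

```python
import random

GRID = 4                        # hypothetical 4x4 map
START, CHEST = (0, 0), (3, 3)   # hypothetical start and goal positions
ACTIONS = {"up": (0, -1), "down": (0, 1), "left": (-1, 0), "right": (1, 0)}


def step(state, action):
    """Toy game simulator: move within the grid; the goal is reaching the chest."""
    dx, dy = ACTIONS[action]
    x = min(max(state[0] + dx, 0), GRID - 1)
    y = min(max(state[1] + dy, 0), GRID - 1)
    nxt = (x, y)
    done = nxt == CHEST
    return nxt, (1.0 if done else -0.01), done


def best_action(q, state, rng):
    """Greedy action for a state, breaking ties at random."""
    top = max(q[(state, a)] for a in ACTIONS)
    return rng.choice([a for a in ACTIONS if q[(state, a)] == top])


def train(episodes=500, alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
    """Let the agent play on its own until it learns how to reach the goal."""
    rng = random.Random(seed)
    q = {((x, y), a): 0.0
         for x in range(GRID) for y in range(GRID) for a in ACTIONS}
    for _ in range(episodes):
        state = START
        for _ in range(200):  # cap on episode length
            if rng.random() < epsilon:
                action = rng.choice(list(ACTIONS))  # explore
            else:
                action = best_action(q, state, rng)  # exploit
            nxt, reward, done = step(state, action)
            future = 0.0 if done else gamma * max(q[(nxt, a)] for a in ACTIONS)
            q[(state, action)] += alpha * (reward + future - q[(state, action)])
            state = nxt
            if done:
                break
    return q


def run_test_case(q, max_steps=50, seed=1):
    """Replay the learned policy; return steps taken, or None if the goal was missed."""
    rng = random.Random(seed)
    state = START
    for steps in range(1, max_steps + 1):
        state, _, done = step(state, best_action(q, state, rng))
        if done:
            return steps
    return None


q_table = train()
result = run_test_case(q_table)
print("open treasure chest:", "PASS" if result is not None else "FAIL")
```

Note that the tester only specifies the goal (`CHEST`), never the path. If the map changes, calling `train` again is the sketch's equivalent of the AI re-learning the task in the new environment.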
Testing Problem: App Stores
Today video games top the download lists of most App Stores such as Google Play and the Apple App Store. Most game platforms, such as PlayStation and Xbox, have their own App Stores. These app stores try to ‘test’ all the new versions of games before they are deployed and available to players: they want to double-check the games work before you and I get our hands on them and blame the platform. App Stores also quietly have humans manually testing most new game builds behind the scenes. This is expensive, slow, and doesn’t scale past a few of the very top apps.
AI is helping some of these top App Stores today in a few ways: basic gameplay verification, error recognition, and reuse. AI bots are trained to do basic testing of all apps as above, and while the AI plays the games, the bots also have classifiers watching, ready to detect problems in video rendering quality, text wrapping, or rendering after minimize or maximize operations, etc. Most interestingly, since the AI sees and plays the game like a human player by watching the screen, the same tests can be reused across multiple devices and platforms. This is the first time test automation has scaled to the needs of app stores.
Is it a Fun Game?
We will likely still have to wait a few years for AI testing bots to be smart enough to tell if the game is fun to play, or will be a success. That will still probably be a human assessment for a while. In the meantime, modern AI-based testing approaches are rapidly enabling test automation where it was once impossible for games.
AI Future is Here Today
The technology, techniques, and test scenarios listed above aren’t just fanciful ideas — they are already running and deployed in production today.
If you are interested in leveraging an AI-based approach to testing video games, please chat with our friends over at https://www.pinklion.ai .
— Jason Arbon . CEO @ https://www.test.ai