Testing Games on Android using Deep Reinforcement Learning

DeepMind was able to show that deep learning networks can be effectively used to train agents to play games. This is significant because the learner's only inputs are the raw game screen and the set of controls the agent can perform. In previous work in this area, researchers had to hand-extract features to help the AI agent learn the game. With deep learning, that process is automated: the network learns the important features and the policies that lead to a good score.

Score Detection on a Game Screen

There are four main parts to any Reinforcement Learning (RL) algorithm: states, actions, a transition function, and a reward function. When the game screen is used as input, the state space of the problem is very large. Standard RL approaches do not scale particularly well with the size of the state space and often require extensive engineering by the designer to shrink the search space. To alleviate this problem, we use human demonstrations to guide the bot's exploration. How this RL algorithm works is the subject of another article.
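To make those pieces concrete, here is a minimal sketch of the setup in Python, assuming the state is the raw game screen and the reward is the change in the on-screen score. The class and method names are illustrative, not from an existing library.

```python
# Sketch of the four RL ingredients for an Android game: states (screens),
# actions (taps), a transition function (step) and a reward function.
from dataclasses import dataclass
from typing import List
import numpy as np


@dataclass
class Transition:
    state: np.ndarray       # game screen (H x W x 3)
    action: int             # index into the set of allowed controls
    reward: float           # score delta read from the screen
    next_state: np.ndarray


class GameEnv:
    """Abstract interface for the RL problem described above."""

    def __init__(self, actions: List[str]):
        self.actions = actions  # e.g. ["tap", "no-op"]

    def reset(self) -> np.ndarray:
        """Return the initial game screen."""
        raise NotImplementedError

    def step(self, action: int) -> Transition:
        """Apply an action on the device, observe the next screen and reward."""
        raise NotImplementedError


# Human demonstrations can be stored as the same Transition objects and used
# to seed the agent's replay buffer, which is how they guide exploration here.
```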

In this series I will cover the first part of building a system for automated game testing: reading scores from the game.

The score of the game is the reward for the RL agent. The higher the score, the better the AI agent is playing the game. In DeepMind's Atari experiments, and in most other game interfaces, the score can be queried directly from the game. However, this is not possible when a game is being played on an Android device.

OpenAI solved this problem in its Universe platform by developing an OCR system for reading game scores. In their setup, games run inside a Docker container; they grab an image of the game from the container and process it to find the score. Similarly, we can stream video from any Android device and recognize the score in real time. Our OCR is a supervised classification model that achieves a classification rate of almost 98% and can read the score at 70 fps.
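The sketch below shows roughly what such a pipeline looks like, assuming frames are pulled from the device with `adb exec-out screencap -p` (a real video stream, e.g. via scrcpy, would be needed for high frame rates) and that a digit classifier, here a hypothetical `digit_model`, has already been trained on labelled score crops. The score region coordinates are game-specific placeholders.

```python
import subprocess

import cv2
import numpy as np

SCORE_BOX = (40, 10, 200, 60)  # x1, y1, x2, y2 -- placeholder coordinates


def grab_frame() -> np.ndarray:
    """Capture one screenshot from the connected Android device via adb."""
    png = subprocess.run(
        ["adb", "exec-out", "screencap", "-p"],
        capture_output=True, check=True,
    ).stdout
    return cv2.imdecode(np.frombuffer(png, np.uint8), cv2.IMREAD_COLOR)


def read_score(frame: np.ndarray, digit_model) -> int:
    """Crop the score region, split it into digit patches and classify each one."""
    x1, y1, x2, y2 = SCORE_BOX
    crop = cv2.cvtColor(frame[y1:y2, x1:x2], cv2.COLOR_BGR2GRAY)
    # Split the crop into equal-width digit cells; a real pipeline would
    # segment digits with contours or connected components instead.
    n_digits = 3
    width = crop.shape[1] // n_digits
    digits = []
    for i in range(n_digits):
        patch = cv2.resize(crop[:, i * width:(i + 1) * width], (28, 28))
        digits.append(str(digit_model.predict(patch[None, ..., None]).argmax()))
    return int("".join(digits))
```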

This allows us to feed the score to a reinforcement learning algorithm that runs in parallel and learns to play the game, in this case Flappy Bird. Once the agent has learnt the game sufficiently well, we can run it across multiple phones to carry out stress testing and performance testing.
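The loop that ties the two together looks roughly like this. It is a hedged sketch, not a specific library API: `agent` and `send_action` are assumed to exist, and `grab_frame` / `read_score` are the functions sketched above.

```python
def play_episode(agent, digit_model, max_steps=1000):
    """Feed the OCR score to the agent as its reward signal."""
    prev_score = 0
    frame = grab_frame()
    for _ in range(max_steps):
        action = agent.act(frame)          # e.g. flap / do nothing
        send_action(action)                # inject the tap via adb or a test harness
        next_frame = grab_frame()
        score = read_score(next_frame, digit_model)
        reward = score - prev_score        # reward = change in the on-screen score
        agent.observe(frame, action, reward, next_frame)
        prev_score, frame = score, next_frame
```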

Recent advances in Deep Reinforcement Learning have made it possible to automate game testing. And we are on it!