An essential aspect of software development is assuring the quality of the code one has developed. Testing is an important feedback element for developers, but testing a larger piece of software can be very time-consuming. As developers, we are basically looking for the most efficient approaches – and when it comes to testing, automation helps!
Why testing is important
The consequences of poorly tested software range from blemishes and malfunctions to a non-operable or non-functioning application. And users are merciless in their feedback on poorly tested software – and rightly so!
If the software is already in live operation and you add even a small new feature, you still have to ensure that the entire application functions smoothly and as expected after the update. That means you need to test your app – again and again. Manual and automated testing are both common software testing mechanisms.
The traditional approach is to have everything tested by the QA team, who manually test each part of the application. They will not only find bugs but also expose other problems that automated testing cannot cover, like bad or misleading user experience (UX), missing dialogue or localisation issues. The downside of manual testing is that it takes individual testers hours and hours of work.
In many cases, automated testing can be a very efficient alternative and make your developers’ lives a lot easier – be it API testing or integration testing, where you write code to verify the functionality of certain functions, code parts or entire software modules.
The challenge: testing Catan Universe
We have seen how essential testing is in general. Of course, it becomes even more important for a live product such as the multiplayer online board game Catan Universe, which we developed for our customer United Soft Media GmbH in collaboration with Catan GmbH.
The ‘Universe’ part of the name indicates a whole new level of complexity: the digital version of a whole universe of game variants and modes, including two games in one! Both come with different scenarios or maps, combinations of different game rules, settings for each game etc.
Two games in one: Catan – a board game for up to 6 players (left), Rivals for Catan – a card game for 2 players (right)
On top of that, Catan Universe offers both a single-player and a multiplayer experience, with the latter using two different matchmaking algorithms to connect multiple clients, that is ‘players’. The multiplayer part itself currently features games for two to six players. Behind the scenes, we are also looking into ways to increase this number even further. One day, we might be able to allow 100 (in words: one hundred) or more players to play one game of Catan simultaneously.
Our solution: automated play testing
Catan Universe is a long-running product: it was first released in 2016 and has grown continuously over the years. The sheer variety of game setups and game situations – especially in the endgame phase, which can last up to an hour – could not be covered manually by even the largest QA team. We needed a method to test all game permutations with their different modes and scenarios automatically. So, we built something we call ‘automated play testing’.
What helped us from the start is that we already had an AI (artificial intelligence) implementation for both single-player and multiplayer games: in the game, you can play against well-known Catan characters like Marianne or Louis, who bring their own distinct personalities and game approaches. And while the players enjoy competing against these characters, we also use them as our very own QA testers who can play the game the whole night – without needing pizza and caffeine. Of course, they can never be as clever or creative as human players, but they can cover each and every game situation, and they can do it repeatedly without ever getting bored!
Having our testers in place, what else do we need?
First, the game must be able to start in an autopilot mode in which it simply logs into a user account based on a config file containing the autopilot settings, for example as JSON data.
{
    "autoPilot": true,
    "email": "autopilot@automail.com",
    "password": "123456",
    "autoPlay": false,
    "turboMode": false
}
Above is an example config file for autopilot mode, showing a subset of all the possible parameters that could be used to set up automated play testing.
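To illustrate how a client might read such a file, here is a minimal sketch in Python. It is not the game's actual loader: the class name `AutopilotConfig`, the function `parse_autopilot_config` and the choice of which keys are required are assumptions made for this example. Only the five key names come from the config shown above.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class AutopilotConfig:
    """Typed view of the five settings from the example config (hypothetical class)."""
    auto_pilot: bool
    email: str
    password: str
    auto_play: bool = False
    turbo_mode: bool = False


def parse_autopilot_config(text: str) -> AutopilotConfig:
    """Parse the JSON text and fail early if a key we treat as required is missing."""
    raw = json.loads(text)
    # Assumption: login data and the autoPilot switch are mandatory, the rest optional.
    missing = [key for key in ("autoPilot", "email", "password") if key not in raw]
    if missing:
        raise ValueError(f"autopilot config is missing keys: {missing}")
    return AutopilotConfig(
        auto_pilot=bool(raw["autoPilot"]),
        email=raw["email"],
        password=raw["password"],
        auto_play=bool(raw.get("autoPlay", False)),
        turbo_mode=bool(raw.get("turboMode", False)),
    )
```

Failing early on a missing key means a misconfigured instance stops at start-up instead of sitting idle at the login screen.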
Second, we want to enter the autopilot mode only in certain situations, so we use command line parameters to control which special functions should be enabled. For example, we differentiate between ‘active’ and ‘passive’ game instances where one active player invites other, passive players to a game.
Starting Catan Universe in autopilot mode as user1 with command line parameters:
CatanUniverse.exe /autopilot /config user1.json
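The handling of these two parameters could look roughly like the following Python sketch. The names `LaunchOptions` and `parse_launch_args` are hypothetical, and ignoring unknown parameters is an assumption. Only `/autopilot` and `/config` are taken from the command line above.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class LaunchOptions:
    """Result of reading the command line (hypothetical type)."""
    autopilot: bool = False
    config_path: Optional[str] = None


def parse_launch_args(argv: Sequence[str]) -> LaunchOptions:
    """Read '/autopilot' and '/config <file>' from the argument list."""
    options = LaunchOptions()
    args = iter(argv)
    for arg in args:
        if arg == "/autopilot":
            options.autopilot = True
        elif arg == "/config":
            # '/config' expects the file name as the next argument.
            options.config_path = next(args, None)
            if options.config_path is None:
                raise ValueError("/config requires a file name")
        # Assumption: unknown arguments are ignored so a normal start is unaffected.
    return options
```

Called with `["/autopilot", "/config", "user1.json"]`, the function enables autopilot and stores `user1.json` as the config path. Without any parameters, the game starts normally.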
Once the client can log in automatically, the autopilot needs events to react to, or its own logic for contacting other clients. These could be:
- friend or guild membership requests to build a player base we can interact with,
- lobby invites for ‘custom matches’ (inviting other players that we have in our friend/guild list),
- or the game client could start the ‘auto-match’ matchmaking (searching for any players that match your favourite game settings).
Autopilot mode: automatically logging in and starting a single-player game without any user interaction
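The three kinds of input listed above can be pictured as a small event handler. This Python sketch only records what the autopilot would do. The event names, the `Autopilot` class and the recorded action strings are all hypothetical and stand in for real client calls.

```python
from typing import List


class Autopilot:
    """Reacts to incoming events and records the chosen actions (illustrative only)."""

    def __init__(self) -> None:
        self.actions: List[str] = []

    def on_event(self, kind: str, sender: str) -> None:
        if kind in ("friend_request", "guild_request"):
            # Accepting requests builds the player base we can interact with.
            self.actions.append(f"accept {kind} from {sender}")
        elif kind == "lobby_invite":
            # A custom match: join the lobby of the inviting player.
            self.actions.append(f"join lobby of {sender}")
        # Any other event is ignored in this sketch.

    def on_idle(self) -> None:
        # With nothing to react to, search for opponents via auto-match.
        self.actions.append("start auto-match")
```

Keeping the reactions in one place like this also fits the goal of separating autopilot code from game logic.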
We keep the autopilot code separate from our game logic code so that none of our developers can accidentally break the game logic. Of course, we also want to strip all autopilot-related code from the final release build of the game since we don’t want players to use the autopilot AI – they should play on their own! And finally, we definitely don’t want to open up any security holes by allowing automatic control of multiple game instances – think of DDoS (Distributed Denial of Service) attacks, for example.
Which brings us to the final puzzle piece: to be able to start as many game clients as we want, we need some external tool to allow so-called ‘multi-boxing’. This is a technique that some players of online games use to play with multiple characters simultaneously, for example to get an increased amount of resources or experience – or simply as an ‘extra challenge’, like playing chess against multiple opponents at once.
Multi-boxing software allows us to start an application multiple times on one single computer, giving each running instance a separate set of files to work with – in our case the user config file – so that they don’t interfere with each other.
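The core idea, one private set of files per instance, can be sketched without any particular multi-boxing tool. The following Python example is a simplified stand-in, not the external software mentioned above. It creates one directory and one config file per account and returns the command line for each instance. The function name `prepare_instances`, the directory layout and the account data are assumptions.

```python
import json
from pathlib import Path
from typing import Dict, List, Tuple


def prepare_instances(
    base_dir: Path, accounts: List[Dict[str, str]]
) -> List[Tuple[Path, List[str]]]:
    """Create a private directory and config per instance; return (cwd, command) pairs."""
    launches = []
    for index, account in enumerate(accounts, start=1):
        instance_dir = base_dir / f"instance{index}"
        instance_dir.mkdir(parents=True, exist_ok=True)
        config = {
            "autoPilot": True,
            "email": account["email"],
            "password": account["password"],
        }
        config_path = instance_dir / f"user{index}.json"
        config_path.write_text(json.dumps(config), encoding="utf-8")
        # Each instance is started inside its own directory with its own config.
        command = ["CatanUniverse.exe", "/autopilot", "/config", config_path.name]
        launches.append((instance_dir, command))
    return launches
```

Each returned pair could then be started with `subprocess.Popen(command, cwd=instance_dir)`. Whether the real client tolerates several instances launched this way is exactly what dedicated multi-boxing software takes care of.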
Now that the basic pipeline is running with multiple AI players testing the game, we can consider some more advanced settings. One helpful addition is a ‘turbo mode’ in which the game runs at a greatly increased speed, as the video below shows. This also functions as a stress test for the servers’ multiplayer session handling because the AI sends game actions to the server faster than any human being could ever play the game.
Autopilot mode: mass-testing games in ‘turbo mode’
You may also want to add logging and tracking systems so that you don’t have to visually check the hundreds of instances running. This system should run day and night to cover all game modes and situations, find any crashes or blockers and even identify situations where the AI stops playing the game because it cannot find any valuable actions to perform, thus helping to improve the overall game experience.
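One simple way to spot an AI that has stopped playing is to track when each instance last performed a game action. The sketch below shows that idea in Python, under the assumption that every instance reports a timestamp for its most recent action. The function name and the instance names are hypothetical.

```python
from typing import Dict, List


def find_stalled_instances(
    last_action_at: Dict[str, float], now: float, timeout_seconds: float
) -> List[str]:
    """Return the names of instances whose last action is older than the timeout."""
    return sorted(
        name
        for name, timestamp in last_action_at.items()
        if now - timestamp > timeout_seconds
    )
```

With a timeout of 60 seconds and the current time at 105, an instance that last acted at 40 is reported as stalled, while one that acted at 100 is not. A suitable timeout depends on the game mode and on whether turbo mode is active.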
Finally, you should integrate automated play testing into your CI (continuous integration) pipeline so that it runs like a regular automated test for every new release you are planning.
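A CI system usually decides between pass and fail based on a process exit code, so a play-testing run needs to condense its results into one. The following Python sketch shows one possible rule. The `InstanceResult` type, its fields and the pass criteria are assumptions for illustration.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class InstanceResult:
    """Summary of one game instance after a test run (hypothetical type)."""
    name: str
    games_finished: int
    crashed: bool
    stalled: bool


def exit_code(results: Iterable[InstanceResult]) -> int:
    """Return 0 only if every instance finished a game without crashing or stalling."""
    collected = list(results)
    if not collected:
        # No results at all means the run itself did not work.
        return 1
    passed = all(
        r.games_finished > 0 and not r.crashed and not r.stalled
        for r in collected
    )
    return 0 if passed else 1
```

A wrapper script could end with `sys.exit(exit_code(results))` so that the pipeline marks the build as failed when any instance crashes or gets stuck.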
Conclusion
We developed our automated play testing solution to raise quality and to lower the cost of the extensive manual testing otherwise required to validate the application. The initial implementation was expensive, and maintenance and new features have added moderately to the cost. Even so, the solution is more economical than the manual effort that would have been required to test the product and to ensure and improve its quality.