Understanding players in Tom Clancy’s The Division with data science

Computer games and the gaming industry have seen a massive shift in the last several years. Today’s online games are a far cry from the physical copies of games that could be played on a single device.

Live games are always connected and allow a great deal of interaction between players. As a result, they generate huge amounts of data, which opens up untapped opportunities in data science and analytics for game companies. Ubisoft, the publisher of Tom Clancy’s The Division, analysed the data generated by player interaction to understand player behaviour, enhance recommendations and improve the player experience in general. Alessandro Canossa and Sasha Makarovych from Massive Entertainment, a video game developer owned by Ubisoft, explained in their presentation at last year’s Nordic Data Science and Machine Learning Summit how they use data science and social network analysis to create value for the player community and the gaming industry.

Tom Clancy’s The Division
Photo by Hyperight AB® / All rights reserved.

Social network analysis of gamer behaviour in Tom Clancy’s The Division

For those who are not into gaming, Tom Clancy’s The Division is an online-only RPG video game that was the most successful new IP when it launched. To date, it has more than 20 million players worldwide, says Alessandro. Players can play alone, team up with random players or play with friends.

Group play was the focus of Alessandro and Sasha’s analysis because they wanted to explore the phenomenon of social contagion: whether behaviours spread across gamer micro-communities, which of those behaviours are positive, and how their impact can be measured. They set out to analyse this phenomenon because, as game creators, they are aware that the lifetime of their product, the game, depends entirely on the players.

Social contagion is an established term in social science that refers to people reflecting and mimicking the behaviour of the community around them. Usually, influencers set the behaviour that other people in the community follow.

So, naturally, Alessandro says they set out to discover whether there are any influencers in the gamer community and whether they affect their community as they do in other digital environments.

By influencers, they do not mean Twitch or YouTube streamers, but players that have an influence on others from within the game.

To detect social contagion, they needed to review metrics that describe player behaviour, says Sasha. In particular:

  • Who plays with whom?
  • How often?
  • For how long?
  • Who invites whom?
  • Which games do they play?
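
As a rough illustration of how such metrics could be gathered, the sketch below aggregates a session log into directed edges from inviter to invitee, counting how often and for how long each pair played together. The log layout, the player names and the `build_edges` helper are assumptions made for this example, not the schema or code used by Massive Entertainment.

```python
from collections import defaultdict

# Hypothetical session log: (inviter, invitee, minutes played together).
# This layout is an assumption for illustration only.
sessions = [
    ("ana", "ben", 40),
    ("ana", "cy", 25),
    ("ana", "ben", 35),
    ("ben", "cy", 60),
]


def build_edges(log):
    """Aggregate sessions into directed edges keyed by (inviter, invitee)."""
    edges = defaultdict(lambda: {"sessions": 0, "minutes": 0})
    for inviter, invitee, minutes in log:
        edge = edges[(inviter, invitee)]
        edge["sessions"] += 1        # how often the pair played
        edge["minutes"] += minutes   # for how long in total
    return dict(edges)


edges = build_edges(sessions)
for (inviter, invitee), stats in sorted(edges.items()):
    print(inviter, "->", invitee, stats)
```

Keeping the edge direction (who invited whom) is what later makes it possible to tell in-degree from out-degree.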

The social network analysis process of Tom Clancy’s The Division players

The goal of the analysis was to understand the direction of player interactions. Alessandro and Sasha’s hypothesis was that influencers are people who:

  • are passionate about the game, 
  • vocal in the community, 
  • have a wide network of friends and connections, 
  • and are engaged in the multiplayer aspects.

The process started by isolating a sample of 246,041 players, in which they aimed to find two populations: influencers and power users (players with top-level performance, the highest number of hours played, highest score, highest-rated gear).

To identify power users, simple filters were used, such as number of hours played, number of friends, highest score and performance. For influencers, the parameters were more complex: who creates the most groups, who invites other players most often, whether the invited players are quick matches or friends, and whether the quick matches become friends later on.
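
The influencer parameters described above could be derived from an invite log along the following lines. This is a minimal sketch: the record layout, the `invite_features` helper and the "conversion" ratio (the share of a player's quick-match invitees who later became friends) are assumptions for illustration, not the presenters' actual feature definitions.

```python
from collections import Counter

# Hypothetical invite log:
# (inviter, invitee, was_friend_at_invite, is_friend_later)
invites = [
    ("ana", "ben", False, True),
    ("ana", "cy", False, False),
    ("ana", "dee", True, True),
    ("ben", "cy", True, True),
]


def invite_features(log):
    """Per inviter: invites sent, quick matches invited, and conversion rate."""
    sent, quick, converted = Counter(), Counter(), Counter()
    for inviter, _invitee, was_friend, friend_later in log:
        sent[inviter] += 1
        if not was_friend:            # invitee was a quick match, not a friend
            quick[inviter] += 1
            if friend_later:          # the quick match became a friend later
                converted[inviter] += 1
    return {
        player: {
            "invites_sent": sent[player],
            "quick_matches": quick[player],
            "conversion": converted[player] / quick[player] if quick[player] else 0.0,
        }
        for player in sent
    }


features = invite_features(invites)
```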

To identify influencers, Sasha and Alessandro used several social network metrics of centrality and prestige:

  • In-degree
  • Out-degree
  • Betweenness
  • Eigenvector
  • Closeness
  • PageRank
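
To make three of these measures concrete, here is a small standard-library sketch that computes in-degree, out-degree and PageRank on a toy invite graph. The edge direction (inviter to invitee), the damping factor of 0.85 and the fixed iteration count are assumptions for the example; the presentation does not say which tooling or settings were used, and a graph library would normally be used for the remaining measures.

```python
def degrees(edges):
    """Return (in_degree, out_degree) dicts for a list of directed edges."""
    nodes = {node for edge in edges for node in edge}
    in_deg = {node: 0 for node in nodes}
    out_deg = {node: 0 for node in nodes}
    for src, dst in edges:
        out_deg[src] += 1
        in_deg[dst] += 1
    return in_deg, out_deg


def pagerank(edges, damping=0.85, iterations=100):
    """PageRank by power iteration on a directed, unweighted graph."""
    nodes = sorted({node for edge in edges for node in edge})
    n = len(nodes)
    out_links = {node: [] for node in nodes}
    for src, dst in edges:
        out_links[src].append(dst)
    rank = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        # Rank held by nodes with no outgoing edges is spread evenly.
        dangling = sum(rank[node] for node in nodes if not out_links[node])
        new_rank = {node: (1 - damping) / n + damping * dangling / n for node in nodes}
        for src in nodes:
            for dst in out_links[src]:
                new_rank[dst] += damping * rank[src] / len(out_links[src])
        rank = new_rank
    return rank


# Toy graph: each pair is (inviter, invitee); names are made up.
invite_edges = [("ana", "ben"), ("ana", "cy"), ("ben", "cy"), ("dee", "ana")]
in_deg, out_deg = degrees(invite_edges)
ranks = pagerank(invite_edges)
```

With edges pointing from inviter to invitee, a high out-degree marks players who invite many others, while in-degree and PageRank reward players who are invited often.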

Alessandro explains that they checked the correlation between the measures and found that they were not correlated strongly enough to be redundant. All players were scored on all six measures, and the top-scoring players for each measure were intersected to identify the influencers.
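
The two steps in this paragraph, a correlation check between measures and an intersection of the top scorers, might look like the sketch below. The scores, the cut-off `k` and the helper names are invented for illustration; the presentation does not state which correlation coefficient or cut-off was used, so Pearson correlation and a top-k cut are assumptions.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


def top_k(scores, k):
    """Set of the k highest-scoring players for one measure."""
    return set(sorted(scores, key=scores.get, reverse=True)[:k])


def intersect_top(measures, k):
    """Players who are in the top k on every measure."""
    return set.intersection(*(top_k(scores, k) for scores in measures.values()))


# Made-up scores for four players on three of the measures.
measures = {
    "in_degree": {"ana": 9, "ben": 7, "cy": 2, "dee": 1},
    "out_degree": {"ana": 8, "ben": 1, "cy": 6, "dee": 2},
    "pagerank": {"ana": 0.4, "ben": 0.3, "cy": 0.2, "dee": 0.1},
}
players = sorted(measures["in_degree"])
r = pearson(
    [measures["in_degree"][p] for p in players],
    [measures["out_degree"][p] for p in players],
)
influencers = intersect_top(measures, k=2)
```

Intersecting the top sets is a strict criterion: a player must score highly on every measure, which is consistent with only a small group being labelled influencers.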

After the analysis, they plotted the data on the network supergraph below.

Image source: video presentation of Alessandro Canossa & Sasha Makarovych, Massive Entertainment at Nordic Data Science and Machine Learning Summit 2018.

In the graph, every node represents a player, every colour represents a community and node size reflects the social importance of the player. Generally, all 49 influencers they identified sit near the centre, and their communities are deeply entangled.

The results

Alessandro and Sasha share some statistics they got as a result of the analysis for influencers and power users.

Metrics                                         Influencers   Power users
Playtime (amount of time spent playing)         119 hours     454 hours
No. of friends (in their profiles)              208           27
Extent of reach (people they interacted with)   342           27
Image source: video presentation of Alessandro Canossa & Sasha Makarovych, Massive Entertainment at Nordic Data Science and Machine Learning Summit 2018.

The stats show that influencers on average spent significantly less time playing than power users. In turn, they have a much higher number of friends in their community, and the extent of their reach is significant: on average, each interacted with 342 players. From a broader perspective, the 49 influencers together reach a network of 16,000 people, many of whom turn from quick matches into friends. Power users, by contrast, played only with friends inside their own circle.


To confirm the impact of the 49 influencers, they compared them with 49 power users. The test focused on players who had interacted with influencers and with power users, analysing their daily average playtime and social play ratio (time spent playing with other players) over the two weeks before and the two weeks after the interaction.

Their hypotheses were:

  • H0: influencers have no impact whatsoever on the daily average playtime and social play ratio of the players they interacted with.
  • H1: the metrics change after the interaction, meaning that influencers do affect the population of players in the game.

Image source: video presentation of Alessandro Canossa & Sasha Makarovych, Massive Entertainment at Nordic Data Science and Machine Learning Summit 2018.
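
A before/after comparison of this kind can be framed as a paired test on each player's own change. The sketch below computes the mean difference and a paired t statistic for daily average playtime; the article does not name the statistical test the presenters used, so the paired t-test is an assumption, and the numbers are made up.

```python
from math import sqrt
from statistics import mean, stdev


def paired_t(before, after):
    """Return (mean difference, t statistic) for paired before/after samples."""
    diffs = [a - b for b, a in zip(before, after)]
    mean_diff = mean(diffs)
    std_err = stdev(diffs) / sqrt(len(diffs))  # standard error of the mean difference
    return mean_diff, mean_diff / std_err


# Made-up daily average playtime in minutes for five players,
# two weeks before and two weeks after meeting an influencer.
before = [60, 45, 90, 30, 75]
after = [95, 70, 115, 65, 100]

mean_diff, t_stat = paired_t(before, after)
print(f"mean change: {mean_diff} min, t = {t_stat:.2f}")
```

A large t statistic relative to the critical value for the sample size would count as evidence against H0. Running the same comparison for players who met power users provides the control group.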

What they discovered was that an interaction with an influencer represented a pivotal moment in the player’s journey. On average, players spent 30 minutes more in the game and played more in groups after interacting with an influencer.

For power users, no similar changes were observed.

So Alessandro and Sasha could confidently say that their hypothesis was correct: influencers do affect player behaviour in the game.

The nature of gamer influencers

However, unlike influencers on social networks such as YouTubers or Twitch streamers, who influence people from a position of authority, gamer influencers are often unaware of the role they play in the community, Alessandro points out. They make no conscious effort to present themselves as influencers and are not particularly visible on social media, but they play a fundamental role as the invisible backbone of the player community.

These influencers are at the core of retention and engagement in a live game such as Tom Clancy’s The Division. They actively create a good experience for other players and drive the dynamics of multiplayer play. But Sasha and Alessandro wanted to go further and understand what motivates gamer influencers to act as they do, and whether they fit a particular player profile or any player can become an influencer.

They developed the Ubisoft Perceived Experience Questionnaire to test player motivation. After running it for 8 years with 50,000 respondents, they have a sizeable database of results that allows them to predict players’ motivation from their behaviour, even when those players have not taken the survey.

Future steps for Sasha and Alessandro are creating predictive models for player motivation and behaviours that lead to becoming a gamer influencer.

This approach can apply to other industries apart from the gaming industry – industries that have active user interaction may contain a similar core audience that shapes and influences the behaviour of the entire customer base. And with the help of data science and data analytics, organisations can recognise them and leverage their potential.
