Introduction
I chose this project because video games are an interesting industry. I love video games and I love analyzing data, so what better way to combine two of my hobbies in one project?
Over the years, the video game industry has changed with different generations and trends. It is important to mention that this dataset covers physical copies sold only. With this analysis, I would like to answer the following questions:
· Which video game genre has the most sales? Has that always been the case?
· Is it true that the profitability of a video game genre changes from one generation of players to the next?
· Which company has the most revenue from physical video games?
· Does geographic location affect video game sales by genre?
We will start with the cleaning step, because it is important for avoiding bias and false results. Then we will continue with the visualization step, which helps us obtain insights easily, and finally we will draw our conclusions.
Data Source
This data was obtained from Kaggle, from a user who gathered it from VGChartz, a site that estimated physical video game sales for many years.
https://www.kaggle.com/datasets/ulrikthygepedersen/video-games-sales?resource=download
VGChartz only reported physical retail sales (stores, clubs, cartridges, etc.). Digital sales were not included because Sony, Microsoft, Nintendo, and Valve (Steam) did not share official digital download figures.
This particularly affects PC, because since the mid-2000s most PC sales have shifted to digital platforms (Steam, Battle.net, Origin, Epic, etc.), while many console games were still sold in physical format.
It's important to emphasize that this analysis is based exclusively on physical game sales. A more comprehensive analysis would require information from the companies, which they currently share only partially.
Cleaning process
I decided to clean the data with Power BI, because the data was not dirty enough to need a more robust alternative like SQL.
The first thing I did was replace the platform abbreviations with full names, because not everybody is familiar with every video game platform.
Then I created a new column to group platforms by company (for PC I used PC environment, because it depends on your operating system).
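The renaming and grouping steps above can be sketched outside Power BI as two lookup tables. This is only a minimal Python illustration, not the Power BI steps themselves. The abbreviations, full names, and company labels are a small assumed sample, and the function name describe_platform is hypothetical.

```python
# Hypothetical lookup tables: a small assumed sample, not the full list used in the report.
PLATFORM_NAMES = {
    "PS2": "PlayStation 2",
    "X360": "Xbox 360",
    "GB": "Game Boy",
    "PC": "PC",
}
PLATFORM_COMPANY = {
    "PlayStation 2": "Sony",
    "Xbox 360": "Microsoft",
    "Game Boy": "Nintendo",
    "PC": "PC environment",
}

def describe_platform(abbreviation):
    """Return (full platform name, company group) for a platform abbreviation."""
    # Unknown abbreviations are kept as-is so no row is silently lost.
    full_name = PLATFORM_NAMES.get(abbreviation, abbreviation)
    company = PLATFORM_COMPANY.get(full_name, "Unknown")
    return full_name, company

print(describe_platform("PS2"))
```

Keeping unknown abbreviations unchanged and labeling their company as "Unknown" makes gaps in the mapping easy to spot in a later chart.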
I also grouped the Year column into ranges to make some charts easier to read, because year ranges are easier to read than a long list of individual years.
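As a rough illustration of grouping years into ranges, the sketch below labels each year with a fixed-width range. The starting year (1980) and the 5-year width are assumptions for the example; the ranges in this report were defined in Power BI and do not have to be uniform.

```python
def year_range(year, start=1980, width=5):
    """Label a year with the fixed-width range it falls into."""
    # start and width are assumed values for illustration only.
    offset = (year - start) // width
    low = start + offset * width
    return f"{low}-{low + width - 1}"

print(year_range(1987))
```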
I checked all the data types to confirm they were correct and changed a couple of them. Only the Year column needed a different method: I converted it to Text type (to avoid errors) and then created a custom column that adds a placeholder day and month to each year, so it can be converted to a date type easily.
Then I converted that custom column to Date. This keeps the year intact when the data type changes; without that step, the year value changes its format.
Finally, I renamed the custom column to Years and deleted the original Year column.
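The same idea (year as text, plus a placeholder day and month, then converted to a date) can be sketched in Python. This is an illustration of the technique, not the Power BI transformation itself; using January 1 as the placeholder is an assumption.

```python
from datetime import datetime

def year_to_date(year):
    """Turn a bare year into a date by appending a placeholder month and day."""
    year_text = str(int(year))       # convert to text first, e.g. 2006.0 -> "2006"
    padded = year_text + "-01-01"    # assumed placeholder: January 1
    return datetime.strptime(padded, "%Y-%m-%d").date()

print(year_to_date(2006).isoformat())
```

Because every row gets the same placeholder month and day, only the year part of the resulting date carries information.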
With the table complete, I only had to adjust the format of some columns.
I changed the format of the Sales columns to currency.
DAX
When I started the visualization, I wanted to add cards showing total sales for each region. However, the sales values are stored in millions in abbreviated form, so when I tried to show the total sales for a region, the card displayed the wrong value. I used DAX to fix this by converting the values back to their full, unabbreviated form.
Then I configured the card again to display units in millions.
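The DAX measure itself is not reproduced here, but the arithmetic behind it can be sketched in Python: scale each abbreviated value back to its full form, sum, and then format the total in millions for display. The sample values and function names below are made up for illustration.

```python
MILLION = 1_000_000

def full_total(values_in_millions):
    """Scale values stored in millions back to full numbers and sum them."""
    return sum(round(v * MILLION) for v in values_in_millions)

def millions_label(total):
    """Format a full total the way a card set to display millions would."""
    return f"{total / MILLION:.2f}M"

sample = [1.5, 2.25]  # made-up values, not taken from the dataset
total = full_total(sample)
print(total, millions_label(total))
```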
In this project I only used DAX for this situation, but DAX can be used in many other ways depending on the project requirements and the data.
Visualization and Analysis
With everything ready for the visuals, I created my charts, cards, filters, etc. It is important to mention that I excluded the single video game from 2020, because it is the only record from that year and it made no sense to include it. Rows without a date were excluded as well. All charts show sales in millions, and the values are abbreviated to avoid cluttering the charts. I could have used billions, but in some charts the smaller values would look very low because of the difference in sales.
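The two exclusions described above (the lone 2020 record and rows without a date) amount to a simple row filter. A minimal Python sketch follows; the "year" key and the sample rows are hypothetical.

```python
def keep_row(row):
    """Keep rows that have a year and are not from 2020."""
    year = row.get("year")  # "year" is an assumed column name
    return year is not None and year != 2020

rows = [  # made-up rows for illustration
    {"name": "Game A", "year": 2006},
    {"name": "Game B", "year": None},
    {"name": "Game C", "year": 2020},
]
filtered = [r for r in rows if keep_row(r)]
print([r["name"] for r in filtered])
```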
I will explain each chart I selected:
Regional sales by genre
I chose the stacked bar chart because it shows the total sales for each genre and, at the same time, how much each region sold. This summarizes the information and avoids the need for an extra chart.
Cards for sales in each region
I used cards because they give a quick view of total sales in each region. We can also filter the whole visualization by genre and by grouped years and see the sales for each filter.
Genre sales by year of game release
I grouped the years into ranges specifically for this chart, because a chart with 40 individual years would be unreadable, and grouping avoids cluttering it. I could have grouped them by decade, but that would compress the data too much and make changes over the years less visible.
Global sales by year
A line chart lets us easily see changes by year: how physical video game sales increased over the years and how, at some point, they started to decrease. (I will say more about this in the Conclusions section.)
Sales by company
Again, I chose the stacked bar chart because it helps visualize the total sales by company and how much each company sold in each region. If we filter by grouped years, we can see how some companies fell behind others and how new companies emerged in the console market.
Answering questions
In this section I’m going to answer the questions I asked at the beginning of the project. Two important points to keep in mind:
1.- The charts do not show actual annual sales. The years are the release dates of the video games, so what we see is the total cumulative sales of the games released in each range of years. We can still use those years as a reference, because they show how successful those games were and how public preferences influenced sales.
2.- PC sales are not well represented, because PC users tend to buy digital games, so we cannot see the real values for the PC environment.
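To make the first point concrete, the sketch below sums sales by release period and genre and then picks the genre with the highest total in each period. It is only an illustration of the aggregation behind the charts; the function name and the toy rows are hypothetical and do not come from the dataset.

```python
from collections import defaultdict

def top_genre_by_period(rows):
    """Sum sales per (release period, genre) and return the top genre per period."""
    totals = defaultdict(float)
    for period, genre, sales in rows:
        totals[(period, genre)] += sales
    best = {}
    for (period, genre), total in totals.items():
        if period not in best or total > best[period][1]:
            best[period] = (genre, total)
    return {period: genre for period, (genre, _) in best.items()}

toy_rows = [  # (release period, genre, sales in millions), all made up
    ("1985-1989", "Genre A", 3.0),
    ("1985-1989", "Genre B", 1.0),
    ("2000-2004", "Genre B", 2.5),
    ("2000-2004", "Genre B", 1.5),
    ("2000-2004", "Genre C", 3.0),
]
print(top_genre_by_period(toy_rows))
```

Note that each game's sales are credited to the period in which it was released, no matter when the copies were actually sold.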
· Which video game genre has the most sales? Has that always been the case?
We can determine that Action is currently the dominant genre, because from 2000 to 2019 it has been the genre with the most sales. It was not always like this: from 1985 to 1994, in the early days of games like Mario Bros, the dominant genre was Platform. Since then, Action has held the lead for almost 20 years. We would need digital sales data to see whether the same holds there.
· Is it true that the profitability of a video game genre changes from one generation of players to the next?
It is true. I can see how the profitability of each genre has changed from generation to generation. When video games started (1980-1984), Shooter was the most profitable genre. In the following generations (1985-1994), Platform was the most profitable. In the late 90’s (1995-1999), the most profitable genre was Role-Playing, and after that the trend has favored Action.
· Which company has the most revenue from physical video games?
Sony Corporation (PlayStation consoles) has the most revenue from physical video games, competing for that position with Nintendo, which at one point was the company with the most revenue. When video games started, Atari dominated sales. We would need digital sales data to compare total sales, because that could change the result. Nintendo might have more revenue than PlayStation, but that is another project I would like to do if companies share their digital sales transparently, to see whether the picture has changed in recent years.
· Does geographic location affect video game sales by genre?
Yes. We can see that the Role-Playing genre is preferred in Japan, while in other regions the Action genre is preferred. It would be nice to explore more regions and see if the result is the same. I would also like to understand why users prefer action games.
Conclusions
We can conclude that physical action games are the current trend, so it is profitable to sell that kind of game. Shooter is the second most profitable genre. One might expect Sports to hold that position, but its trend is downward; even so, Sports has remained profitable, followed by Role-Playing. We would need current data to determine how the market has changed and whether the trends are the same or have shifted.
It would be necessary to analyze digital sales data to get the big picture. With both physical and digital sales, we could see which video games are really the most profitable, which would make it easier to identify the genres companies should focus on developing, while still considering market reactions to avoid wrong decisions.