FIFA Ranking Analysis

This project demonstrates how a flat dataset (FIFA rankings) can be transformed into a normalized relational model and some analysis through SQL. All the SQL queries on this page are written in PostgreSQL and executed using sqliteonline.com, with the PostgreSQL engine selected. 

The data has been picked from kaggle: https://www.kaggle.com/datasets/alexisbcook/data-for-datavis/data?select=fifa.csv

Source file has below structure.

Column Name Data Type Column Description
Ranking Date DATE Data type to store tthe date of ranking
ARG INT Ranking for Argentina
BRA INT Ranking for Brazil
ESP INT Ranking for Spain
FRA INT Ranking for France
GER INT Ranking for Germany
ITA ITN Ranking for Italy

Let's decompose the input data into multiple tables - country, rankcal and fifarank.

Country

Column Name Data Type Column Description
countryid(PK) INT Unique identifier for each country(primary key)
countrycode VARCHAR Short country code(e.g. 'BRA' for Brazil)
countryname VARCHAR Full name of the country(e.g. Brazil)

Rankcal

Column Name Data Type Column Description
dateid(PK) INT Unique identifier for each state. serverse as PK
rankdate DATE The full date when FIFA ranking was published
rankyear INT Year extracted from rankdate
rankmonth INT Month extracted from rankdate
rankday INT Day of the month from rankdate
rankquarter INT Quarter of the year in which the ranking was released

Fifarank

Column Name Data Type Column Description
rankid(PK) INT Unique identifier
countryid INT Links to country table
dateid INT Links to rankcal table
rank INT Actual rank value

Relationship diagram between the 3 would look like this.

Cardinality:

One country  (1) -----< many fifarank records

Each country can appear multiple times in the FIFA ranking history — once for each ranking date. For example, Brazil will have a ranking entry for every ranking event it participated in. Therefore, there’s a one-to-many relationship from the country table to the fifarank table.

One rankcal  (1) -----< many fifarank records

Each ranking is calculated on a specific date. That date and its details (year, month, quarter) are stored in the rankcal table. On a given date, many countries will have rankings. So, this forms a one-to-many relationship from rankcal to fifarank.