This project demonstrates how a flat dataset (FIFA rankings) can be transformed into a normalized relational model and some analysis through SQL. All the SQL queries on this page are written in PostgreSQL and executed using sqliteonline.com, with the PostgreSQL engine selected.
The data has been picked from kaggle: https://www.kaggle.com/datasets/alexisbcook/data-for-datavis/data?select=fifa.csv
Source file has below structure.
| Column Name | Data Type | Column Description |
| Ranking Date | DATE | Data type to store tthe date of ranking |
| ARG | INT | Ranking for Argentina |
| BRA | INT | Ranking for Brazil |
| ESP | INT | Ranking for Spain |
| FRA | INT | Ranking for France |
| GER | INT | Ranking for Germany |
| ITA | ITN | Ranking for Italy |
Let's decompose the input data into multiple tables - country, rankcal and fifarank.
Country
| Column Name | Data Type | Column Description |
| countryid(PK) | INT | Unique identifier for each country(primary key) |
| countrycode | VARCHAR | Short country code(e.g. 'BRA' for Brazil) |
| countryname | VARCHAR | Full name of the country(e.g. Brazil) |
Rankcal
| Column Name | Data Type | Column Description |
| dateid(PK) | INT | Unique identifier for each state. serverse as PK |
| rankdate | DATE | The full date when FIFA ranking was published |
| rankyear | INT | Year extracted from rankdate |
| rankmonth | INT | Month extracted from rankdate |
| rankday | INT | Day of the month from rankdate |
| rankquarter | INT | Quarter of the year in which the ranking was released |
Fifarank
| Column Name | Data Type | Column Description |
| rankid(PK) | INT | Unique identifier |
| countryid | INT | Links to country table |
| dateid | INT | Links to rankcal table |
| rank | INT | Actual rank value |
Relationship diagram between the 3 would look like this.
Cardinality:
One country (1) -----< many fifarank records
Each country can appear multiple times in the FIFA ranking history — once for each ranking date. For example, Brazil will have a ranking entry for every ranking event it participated in. Therefore, there’s a one-to-many relationship from the country table to the fifarank table.
One rankcal (1) -----< many fifarank records
Each ranking is calculated on a specific date. That date and its details (year, month, quarter) are stored in the rankcal table. On a given date, many countries will have rankings. So, this forms a one-to-many relationship from rankcal to fifarank.