A Python script that connects to the Spotify API and retrieves data about my library, such as the playlists I've made and details about the tracks and artists within them. The script then inserts this data into a SQLite database organised as a star schema.
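As a rough illustration of the extract-and-load step, the sketch below flattens one playlist item, shaped like an entry in the Spotify Web API's playlist tracks response, into a flat row and inserts it into SQLite. The helper name `flatten_playlist_item`, the `staging_tracks` table and its columns, and the sample values are all assumptions made for illustration; they are not taken from the actual script.

```python
import sqlite3


def flatten_playlist_item(playlist_id, item):
    """Turn one playlist item into a flat row (hypothetical helper)."""
    track = item["track"]
    return {
        "playlist_id": playlist_id,
        "track_id": track["id"],
        "track_name": track["name"],
        "album_id": track["album"]["id"],
        # Assumption: keep only the first listed artist for simplicity.
        "artist_id": track["artists"][0]["id"],
        "added_at": item.get("added_at"),
    }


# Made-up sample data in the shape of a playlist item.
sample_item = {
    "added_at": "2023-01-01T12:00:00Z",
    "track": {
        "id": "track_1",
        "name": "Example Track",
        "album": {"id": "album_1", "name": "Example Album"},
        "artists": [{"id": "artist_1", "name": "Example Artist"}],
    },
}

conn = sqlite3.connect(":memory:")  # in-memory database for the sketch
conn.execute(
    """
    CREATE TABLE staging_tracks (
        playlist_id TEXT,
        track_id    TEXT,
        track_name  TEXT,
        album_id    TEXT,
        artist_id   TEXT,
        added_at    TEXT
    )
    """
)

row = flatten_playlist_item("playlist_1", sample_item)
# Named placeholders are filled from the dictionary keys.
conn.execute(
    """
    INSERT INTO staging_tracks
    VALUES (:playlist_id, :track_id, :track_name,
            :album_id, :artist_id, :added_at)
    """,
    row,
)
conn.commit()
```

Using named placeholders keeps the insert parameterised, so values returned by the API are never interpolated into the SQL string.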


The star schema design (as shown below) follows basic design principles, with each track's entry into a collection (playlist or otherwise) treated as the fact item. Details about the collection and the track are the primary dimensions, and each track also links to its related artist and album. Outside of the schema itself, I’ve also created a logging table to keep track of script runs and any errors encountered, as well as a view that reports how many tracks from each album have been added to a collection.
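To make the layout described above more concrete, here is a minimal sketch of how such a schema could be declared with Python's built-in `sqlite3` module. Every table, column and view name here (`fact_collection_track`, `dim_track`, `run_log`, `album_track_counts` and so on) is an assumption chosen for illustration, and the sample rows are made up; the real project's schema may well differ.

```python
import sqlite3

# Hypothetical schema: one fact table, its dimensions, a log table and a view.
SCHEMA = """
CREATE TABLE dim_artist (
    artist_id   TEXT PRIMARY KEY,
    artist_name TEXT
);
CREATE TABLE dim_album (
    album_id   TEXT PRIMARY KEY,
    album_name TEXT
);
CREATE TABLE dim_track (
    track_id   TEXT PRIMARY KEY,
    track_name TEXT,
    artist_id  TEXT REFERENCES dim_artist(artist_id),
    album_id   TEXT REFERENCES dim_album(album_id)
);
CREATE TABLE dim_collection (
    collection_id   TEXT PRIMARY KEY,
    collection_name TEXT,
    collection_type TEXT
);
-- Fact: one row per track added to a collection.
CREATE TABLE fact_collection_track (
    collection_id TEXT REFERENCES dim_collection(collection_id),
    track_id      TEXT REFERENCES dim_track(track_id),
    added_at      TEXT,
    PRIMARY KEY (collection_id, track_id)
);
-- Log of script runs and errors.
CREATE TABLE run_log (
    run_id  INTEGER PRIMARY KEY AUTOINCREMENT,
    run_at  TEXT DEFAULT CURRENT_TIMESTAMP,
    status  TEXT,
    message TEXT
);
-- View: how many collection entries belong to each album.
CREATE VIEW album_track_counts AS
SELECT a.album_name, COUNT(*) AS track_count
FROM fact_collection_track AS f
JOIN dim_track AS t ON t.track_id = f.track_id
JOIN dim_album AS a ON a.album_id = t.album_id
GROUP BY a.album_id, a.album_name;
"""

conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)

# Made-up sample rows to exercise the view.
conn.execute("INSERT INTO dim_artist VALUES ('ar1', 'Artist One')")
conn.executemany(
    "INSERT INTO dim_album VALUES (?, ?)",
    [("al1", "Album A"), ("al2", "Album B")],
)
conn.executemany(
    "INSERT INTO dim_track VALUES (?, ?, ?, ?)",
    [
        ("t1", "Song 1", "ar1", "al1"),
        ("t2", "Song 2", "ar1", "al1"),
        ("t3", "Song 3", "ar1", "al2"),
    ],
)
conn.execute("INSERT INTO dim_collection VALUES ('c1', 'Favourites', 'playlist')")
conn.executemany(
    "INSERT INTO fact_collection_track VALUES ('c1', ?, '2023-01-01')",
    [("t1",), ("t2",), ("t3",)],
)
conn.execute("INSERT INTO run_log (status, message) VALUES ('ok', 'sample run')")
conn.commit()

album_counts = conn.execute(
    "SELECT album_name, track_count FROM album_track_counts "
    "ORDER BY track_count DESC, album_name"
).fetchall()
```

With the sample rows above, querying the view ranks the albums by how many of their tracks appear in collections, which is the kind of question the view was built to answer.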


Originally, I intended to stop at this view, since it gives me an easy way to see the most popular albums in my collection and helps me decide which physical copies to buy. However, I realised that I had extracted all the data available from Spotify but was using only a portion of it. To make use of the rest, I created a set of dashboards in Power BI, focusing on artists, playlists and albums.


Visit GitHub Repo