Overview

This project analyzes artist similarity using the Semantic Artist Similarity (SAS) Dataset. It builds artist similarity networks with NetworkX and visualizes them with D3.js.

Files

  • SAS_networkx.py: Python script for building and analyzing the artist similarity network using NetworkX.
  • SAS_D3.html: HTML file for visualizing the network using D3.js.
  • network_data/: Folder containing saved edges, nodes, and graph data.
  • plots/: Folder containing saved plots generated from the code.
  • dataset-artist-similarity/: Folder containing data from the SAS dataset.

SAS Dataset

The SAS Dataset consists of two datasets (mirex and lastfmapi). For each artist, it provides a biography text and a list of the top-10 most similar artists within the same dataset, which serves as ground truth.

Files in the Dataset

  • mirex_gold_top10.txt and lastfmapi_gold_top10.txt: Ground-truth top-10 lists of similar artists for every artist in the respective dataset.
  • mb2uri_mirex.txt and mb2uri_lastfmapi.txt: MusicBrainz ID, Last.fm name, and DBpedia URI for each artist.
  • Biography Texts: The biography text of each artist, stored as .txt files.

SAS_networkx.py

This Python script builds and analyzes the artist similarity network using NetworkX.

  • Usage: See the script itself for detailed usage and functionality.
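
As a rough illustration of the kind of network described above, the sketch below builds a directed NetworkX graph from top-10 similarity lists. It is not taken from SAS_networkx.py: the function name build_similarity_graph, the rank edge attribute, and the toy artist names are all assumptions made for this example, and the real script may structure the graph differently.

```python
import networkx as nx


def build_similarity_graph(top10):
    """Return a directed graph with an edge from each artist to each artist
    in its similarity list (hypothetical helper, not from the project)."""
    graph = nx.DiGraph()
    for artist, similar in top10.items():
        graph.add_node(artist)
        # Store the list position as an edge attribute (1 = most similar).
        for rank, other in enumerate(similar, start=1):
            graph.add_edge(artist, other, rank=rank)
    return graph


# Toy data for illustration only; these are not entries from the SAS dataset.
toy_top10 = {
    "artist_a": ["artist_b", "artist_c"],
    "artist_b": ["artist_a"],
    "artist_c": ["artist_a", "artist_b"],
}

graph = build_similarity_graph(toy_top10)

# In-degree counts how often an artist appears in other artists' lists.
in_degree = dict(graph.in_degree())
```

A directed graph is used here because top-10 similarity is not necessarily symmetric: one artist can appear in another's list without the reverse being true.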

SAS_D3.html

This HTML file provides a visualization of the artist similarity network using D3.js.

  • Usage: Open SAS_D3.html in a web browser to interactively explore the network.
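
D3.js force-directed layouts are commonly fed a JSON object with a "nodes" list and a "links" list. The sketch below shows one way to produce that layout from an edge list using only the Python standard library. Whether SAS_D3.html expects exactly this layout is not stated in this README, so the function name to_d3_json, the key names, and the toy edges should all be read as assumptions.

```python
import json


def to_d3_json(edges):
    """Convert (source, target) pairs into a nodes/links dictionary
    (hypothetical helper; the key names are an assumed convention)."""
    # Collect every artist that appears at either end of an edge.
    node_ids = sorted({name for edge in edges for name in edge})
    return {
        "nodes": [{"id": name} for name in node_ids],
        "links": [{"source": source, "target": target} for source, target in edges],
    }


# Toy edges for illustration only; not taken from network_data/.
toy_edges = [("artist_a", "artist_b"), ("artist_b", "artist_c")]

payload = to_d3_json(toy_edges)
json_text = json.dumps(payload, indent=2)
```

The resulting json_text could be written to a file and loaded by a D3.js page, provided the page reads the same key names.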

Example Plots