Overview
This project analyzes artist similarity using data from the Semantic Artist Similarity (SAS) Dataset. It involves building and visualizing artist similarity networks using NetworkX and D3.js.
Files
- SAS_networkx.py: Python script for building and analyzing the artist similarity network using NetworkX.
- SAS_D3.html: HTML file for visualizing the network using D3.js.
- network_data/: Folder containing saved edges, nodes, and graph data.
- plots/: Folder containing saved plots generated from the code.
- dataset-artist-similarity/: Folder containing data from the SAS dataset.
SAS Dataset
The SAS Dataset includes artist entities with their biography texts and, for each artist, a list of the top-10 most similar artists within the same dataset, which serves as ground truth.
Files in the Dataset
- mirex_gold_top10.txt and lastfmapi_gold_top10.txt: Top-10 lists of similar artists for every artist in both datasets.
- mb2uri_mirex.txt and mb2uri_lastfmapi.txt: MusicBrainz ID, Last.fm name, and DBpedia URI for each artist.
- Biography Texts: Biography texts of each artist stored as .txt files.
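As a rough sketch of how a top-10 file might be read: the exact layout of the gold-standard files is not described here, so the snippet below assumes one artist per line, with the artist's identifier first and its similar artists following, all separated by whitespace. The function name `parse_top10` and the sample identifiers are hypothetical.

```python
from typing import Dict, Iterable, List


def parse_top10(lines: Iterable[str]) -> Dict[str, List[str]]:
    """Map each artist ID to its list of similar-artist IDs.

    ASSUMPTION: each non-empty line is "<artist> <sim1> <sim2> ...",
    separated by tabs and/or spaces. Adjust if the real files differ.
    """
    top10: Dict[str, List[str]] = {}
    for line in lines:
        tokens = line.split()
        if not tokens:
            continue  # skip blank lines
        artist, similar = tokens[0], tokens[1:]
        top10[artist] = similar
    return top10


# Made-up identifiers, for illustration only.
sample = ["a1\ta2 a3", "a2\ta1", "", "a3\ta1 a2"]
top10 = parse_top10(sample)
```

Because the function accepts any iterable of strings, an open file object for one of the top-10 files can be passed in directly.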
SAS_networkx.py
This Python script builds and analyzes the artist similarity network using NetworkX.
- Usage: See the script itself for detailed usage and functionality.
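A minimal sketch of how such a network could be built, not the script's actual code: each artist becomes a node, and each top-10 entry becomes a directed edge from the artist to the similar artist. The helper names (`top10_to_edges`, `build_graph`) and the sample data are hypothetical; `nx.DiGraph`, `add_nodes_from`, and `add_edges_from` are standard NetworkX calls.

```python
from typing import Dict, List, Tuple


def top10_to_edges(top10: Dict[str, List[str]]) -> List[Tuple[str, str]]:
    """Turn {artist: [similar, ...]} into (artist, similar) edge pairs."""
    return [(artist, other) for artist, others in top10.items() for other in others]


def build_graph(top10: Dict[str, List[str]]):
    """Build a directed similarity graph (requires NetworkX)."""
    import networkx as nx  # imported lazily so top10_to_edges has no dependency

    graph = nx.DiGraph()
    graph.add_nodes_from(top10)  # keep artists that have no listed neighbours
    graph.add_edges_from(top10_to_edges(top10))
    return graph


# Made-up identifiers, for illustration only.
sample_top10 = {"a1": ["a2", "a3"], "a2": ["a1"], "a3": []}
edges = top10_to_edges(sample_top10)
```

A directed graph is used in this sketch because a top-10 list need not be symmetric: artist A listing B does not imply B lists A. With NetworkX installed, `build_graph(sample_top10)` returns a `DiGraph`, and iterating over `graph.in_degree` gives each artist together with the number of lists it appears in.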
SAS_D3.html
This HTML file provides a visualization of the artist similarity network using D3.js.
- Usage: Open SAS_D3.html in a web browser to interactively explore the network.
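D3 force layouts are commonly fed a JSON object with a `nodes` array and a `links` array whose entries carry `source` and `target` fields. How SAS_D3.html actually loads its data is not shown here, so the following is only a sketch of producing that shape from an edge list; the function name `to_node_link` and the sample edges are hypothetical.

```python
import json
from typing import Dict, Iterable, List, Tuple


def to_node_link(edges: Iterable[Tuple[str, str]]) -> Dict[str, List[dict]]:
    """Convert (source, target) pairs into a D3-style node-link dict."""
    nodes: List[str] = []
    seen = set()
    links: List[dict] = []
    for source, target in edges:
        for name in (source, target):
            if name not in seen:  # record each node once, in first-seen order
                seen.add(name)
                nodes.append(name)
        links.append({"source": source, "target": target})
    return {"nodes": [{"id": n} for n in nodes], "links": links}


# Made-up edges, for illustration only.
data = to_node_link([("a1", "a2"), ("a1", "a3"), ("a2", "a1")])
payload = json.dumps(data, indent=2)
```

The resulting `payload` string can be written to a file and loaded by the page. On the D3 side, `d3.forceLink(links).id(d => d.id)` is the usual way to resolve `source` and `target` values against node `id` fields.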