jolek78's blog

AaronSwartz

3:00 AM. Another one of those nights where my brain decided sleep was overrated. After my usual nocturnal walk through the streets of a remote Scottish town—where even a fox observed me with that “humans are weird” look—I sat back down at my server. Just a quick scan of my RSS feeds, I told myself, then I can start work. When...

We backed up Spotify (metadata and music files). It's distributed in bulk torrents (~300TB), grouped by popularity. This release includes the largest publicly available music metadata database with 256 million tracks and 186 million unique ISRCs. It's the world's first “preservation archive” for music which is fully open (meaning it can easily be mirrored by anyone with enough disk space), with 86 million music files, representing around 99.6% of listens.

The news came from Anna's Archive, the world's largest pirate library, which had just scraped Spotify's catalog. Not just the metadata, but the audio files too: 86 million tracks, roughly 300 terabytes. I stopped to reread those numbers, then thought: holy shit, how big is this thing?
