Spotify codes
I love music and I spend most of my day listening to songs. Spotify codes caught my eye from past some days, so I went on finding how do they work. I thought I should share this journey, how even the maker of Spotify codes Ludvig Strigeus,(also the maker of utorrent)was involved.
This is the sequence of the “Girl Of My Dreams” Spotify code:
Spotify Codes are like QR-codes that can be generated to easily share Spotify songs, artists, playlists, and users.
Spotify URIs
Spotify URIs (Uniform Resource Identifiers). Different pieces of media (artists, albums, songs, playlists, users) all have a URI.
A URI is a code you can use for embedding and searching within the Spotify app. Spotify URI codes look like this:
spotify:user:spotify:playlist:37i9dQZF1DXcBWIGoYBM5M.
The 22 characters are the numbers 0-9, characters a-z and A-Z. This means there are 10 + 26 + 26 = 62 possibilities for each character (almost Base64). So the potential number of Spotify URIs is 62^22 which is equal to 2.7e39 which is a veryyy longgg digit so Spotify is not running out of these any time soon.
Spotify Codes When the bars are sorted by height you can see that there are 8 discrete heights that they fall into.
In this function I use scikit-image to calculate the sequence of bar heights from a logo.
|
|
get_heights [0, 5, 1, 2, 0, 6, 4, 3, 7, 1, 6, 7, 7, 7, 7, 3, 1, 6, 3, 7, 0, 7, 0]
According to StackOverflow discussion, The barcode consists of 23 bars, of which only 20 contain information. This means that there are 8^20 pieces of information that can be encoded into the code.
A media reference is first generated by Spotify for a Spotify URI(link to URI para). The media reference is what gets encoded into the barcode and links the barcode to the URI. Spotify maintains a database of media references to URI mappings. Media references let Spotify track which codes are scanned, enable them to theoretically re-map codes, and just look better when encoded(which they successfully made it look, omg I love the aesthetics)
Simple no, no it isn’t. There is a lot more which goes on in the background. Problems :
- Big to small barcode URIs to barcode (The weird thing about this is the amount of information in the URI and the Spotify code do not match up.)
- Error correction
- String encryption
So to solve these problems,codes are then send through CRC Calculation, Convolutional encoding, and reversing (Decoding).
Gray-scale (gray codes)was one of techniques in which ordering of the binary numeral system such that two successive values differ in only one bit, omg stop imaging I know what you all are thinking greyscale disease from Game of thrones, well that is not what is used here.
For example, the representation of the decimal value “1” in binary would normally be “001” and “2” would be “010”. In Gray code, these values are represented as “001” and “011”. That way, incrementing a value from 1 to 2 requires only one bit to change, instead of two.
The media reference can be expressed as a 37 bit binary number:
|
|
CRC Calculation
This is very it got interesting for me, I had computer networks this semester and we learnt all these concepts in theory. I wanted to see how these are actually used and implemented.
Cyclic redundancy check is a method of detecting accidental changes/errors in the communication channel and bits are generated for the media reference. A CRC calculates check bits to verify the integrity of the data.
In simple words its a error checking algorithm, for example if a receiver receviecs a packet of data from sender, how will it know the data was correct and bits are not interchanged during transmission. So transmitter sends the original data, and attaches a fixed number of check bits, which are derived from the data bits by some deterministic algorithm(here CRC). If only error detection is required, a receiver can simply apply the same algorithm to the received data bits and compare its output with the received check bits if the values do not match, an error has occurred at some point during the transmission.
Spotify uses CRC-8 with the following generator:
|
|
Which gives us the following polynomial when fully written out:
|
|
The CRC is calculated by following these steps:
- Pad with 3 bits on the right
|
|
- Reverse bytes
|
|
- Calculate CRC as normal (highest order degree on the left)
|
|
- Reverse CRC
|
|
- Invert check
|
|
- Finally append to step 1 result
|
|
Here is a simple version of a crc calculator (step 3):
|
|
We can use the check bits to verify that the contents of the data match what was intended by the “sender”:
|
|
Convolutional encoding
These 45 bits are then convolutionally encoded. Convolutional codes add redundancy to data to make it possible to decode the data if errors are introduced during transmission.
Resources to learn more these MIT lecture notes
Spotify’s patent
High-Rate Punctured Convolutional Codes for Soft Decision Viterbi Decoding
Convolutional Codes and State Machine View/Trellis (MIT Slides)
The Error Correcting Codes Page
Reed-Solomon Codes and the Exploration of the Solar System
The code can then be transformed into an error correcting code, such as a forward error correcting code. The forward error correction used can, for example, be a convolutional code.
Tail biting
Before we start calculating the parity bits, we must prepend stream with last 6 bits of data:
|
|
Another approach would be to prepend 0
s to the input, but Spotify chose to go with this tail-biting method.
Encoding
We can focus on one generator at a time. Below is a step by step walkthrough of how the parity bits are generated for g0
. I reversed the polynomial because it is easier to display right to left in the example.
|
|
Continue this to the end of the stream of input bits:
|
|
The result of encoding with the first generator, g0
:
|
|
The result of encoding with g1
:
|
|
Here is a (naive) algorithm for calculating the parity bits:
|
|
These two parity bit sequences are then interleaved (abab…):
|
|
Puncturing
At this point we have gone from 45 bits to 90 bits. The rate of this convolutional code is 1/2 (one input bit results in 2 parity bits).
A rate of 1/2 is actually somewhat low for this application (see this article about QR code rates). The rate can be increased by “puncturing” the code. We can remove every third bit to increase the rate to 3/4:
|
|
This is equivalent to puncturing the two parity bit sequences with the patterns 101
and 110
before interleaving. Puncturing is a nice, simple way to increase the rate of your code after you estimate the frequency of expected errors.
Permutation
Spotify then shuffles the data to spread out errors when a whole bar is lost:
A shuffling process may be used to spread out the potential errors (e.g., if a whole bar/distance is missing). Instead of having the encoded bar lengths be consecutive, the lost bits are non-consecutive in the code word. This improves the chances for the forward error correction to work when a whole bar or distance is lost. Thus to scan the optical code, after the lengths of the found bars are converted into bits with certainties, the bits are shuffled back into the right order.
The bits are permuted by choosing every 7th bit (modulo 60) from the input.
|
|
Input and result:
|
|
Map to bar heights
Finally, map the bits to bar heights between 0 and 7 using gray code:
Decimal | Binary | Gray |
---|---|---|
0 | 000 | 000 |
1 | 001 | 001 |
2 | 010 | 011 |
3 | 011 | 010 |
4 | 100 | 110 |
5 | 101 | 111 |
6 | 110 | 101 |
7 | 111 | 100 |
|
|
Three additional bars are added as reference points at the start, end, and 12th position:
|
|
Decoding
Let’s reiterate why we went through the process of calculating the CRC and convolutionally encoding the data:
- Convolutional codes allow us to correct for errors in a message by sending “redundant” bits with the message
- The CRC lets the client check the validity of the decoded Spotify code. If the check bits are not correct, the client can scan the code again. When they are correct, then the client can ping Spotify’s servers. As Ludvig says, “I chose to include a CRC to let the client have a way to more easily discard invalid codes without a backend roundtrip.”
The process of transforming an image of a Spotify code to a media reference follows approximately the same steps as encoding in reverse. However, there is a much more happening while decoding a convolutional code.
Cracking Spotify Codes and making a quest out of it
Spotify’s bug bounty. Great idea, but it was possible to pass in the bug bounty