Cameron MacLeod wrote a nice piece on how Shazam actually works. I’ve been curious about this because I use Shazam to gather timestamps when I listen to music at HiFi shows. Sometimes Shazam can recognize a song within a second, which is impressive. Check out MacLeod’s analysis here.
Here’s a bird’s eye view of how it works:
At first glance, it may not be obvious why identifying a song is a hard problem. To see why, picture a song’s audio waveform. Each song is essentially a collection of sound waves, and when visualized, these waves look intricate and irregular.
For instance, take a brief section of a song’s waveform. To determine if this audio snippet matches a particular song, a brute-force approach would involve sliding this section along the entire song, checking for a match at every point. This method would be computationally intensive and time-consuming, particularly when dealing with vast music libraries.
Furthermore, the challenge intensifies when dealing with real-world audio recordings, which are often affected by background noise, changes in amplitude, frequency variations, and other distortions. The simplistic sliding approach becomes inadequate under these conditions.
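To make the cost and fragility of the sliding approach concrete, here is a minimal sketch in Python. It is only an illustration of the brute-force idea described above, not anything Shazam does; the function name `sliding_match`, the `tolerance` parameter, and the toy sample values are all assumptions made for this example.

```python
def sliding_match(song, snippet, tolerance=1e-9):
    """Return the first offset where snippet matches song sample-for-sample, or -1."""
    n, m = len(song), len(snippet)
    for offset in range(n - m + 1):
        # Compare every sample of the snippet against the song at this offset.
        if all(abs(song[offset + i] - snippet[i]) <= tolerance for i in range(m)):
            return offset
    return -1


# Toy "waveform": a handful of made-up amplitude values.
song = [0.0, 0.2, 0.5, 0.1, -0.3, -0.6, -0.2, 0.4]
snippet = song[3:6]

print(sliding_match(song, snippet))  # 3

# The same snippet with a small constant disturbance no longer matches anywhere.
noisy = [s + 0.05 for s in snippet]
print(sliding_match(song, noisy))  # -1
```

Every offset requires comparing the whole snippet, so the work grows with song length times snippet length, and that has to be repeated for every song in the library. The second call shows the other weakness: even a tiny change in amplitude breaks an exact comparison.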
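Shazam takes a more sophisticated approach to these challenges. Rather than comparing raw waveforms, it builds a fingerprint from the peaks of the audio’s spectrogram and matches that fingerprint against a database.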
The choice of using spectrogram peaks as the foundation of Shazam’s fingerprinting technique is deliberate. Spectrogram peaks are less susceptible to noise and can withstand various audio distortions. Moreover, they provide a more concise representation of the audio, reducing the computational load and storage requirements.
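As a rough illustration of why peaks are robust and compact, the sketch below splits a signal into frames, computes a magnitude spectrum for each, and keeps only the strongest frequency bin per frame. This is a simplified stand-in and not Shazam’s actual peak-picking: the helper names (`magnitude_spectrum`, `spectrogram_peaks`, `tone`), the frame size, and the "strongest bin per frame" rule are assumptions for this example, and the naive DFT is used only to avoid external dependencies.

```python
import cmath
import math


def magnitude_spectrum(frame):
    """Naive DFT magnitudes for bins 0..N/2 (slow, but dependency-free)."""
    n = len(frame)
    return [
        abs(sum(x * cmath.exp(-2j * math.pi * k * t / n) for t, x in enumerate(frame)))
        for k in range(n // 2 + 1)
    ]


def spectrogram_peaks(samples, frame_size=64, peaks_per_frame=1):
    """Return (frame_index, frequency_bin) pairs for the strongest bins in each frame."""
    peaks = []
    starts = range(0, len(samples) - frame_size + 1, frame_size)
    for frame_index, start in enumerate(starts):
        spectrum = magnitude_spectrum(samples[start:start + frame_size])
        strongest = sorted(range(len(spectrum)), key=lambda k: spectrum[k], reverse=True)
        peaks.extend((frame_index, k) for k in sorted(strongest[:peaks_per_frame]))
    return peaks


def tone(bin_index, frame_size=64, amplitude=1.0):
    """One frame of a pure sine wave centred on the given frequency bin."""
    return [amplitude * math.sin(2 * math.pi * bin_index * t / frame_size)
            for t in range(frame_size)]


clean = tone(5) + tone(12)                                # two frames, two notes
quiet = tone(5, amplitude=0.5) + tone(12, amplitude=0.5)  # same notes, half volume
hum = tone(20, amplitude=0.1) + tone(20, amplitude=0.1)   # weak interfering tone
noisy = [a + b for a, b in zip(clean, hum)]

print(spectrogram_peaks(clean))  # [(0, 5), (1, 12)]
print(spectrogram_peaks(quiet) == spectrogram_peaks(clean))  # True
print(spectrogram_peaks(noisy) == spectrogram_peaks(clean))  # True
```

The 128 input samples reduce to two (time, frequency) pairs, and those pairs stay the same when the volume changes or a weaker interfering tone is mixed in.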
The final step involves matching the audio sample’s fingerprint with those in the database. Shazam groups matching fingerprints by songs and calculates a score for each potential match. The song with the highest score is likely the correct identification. This scoring process considers the time alignment of peaks, ensuring an accurate match.
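The grouping and scoring step can be sketched as follows. This is one plausible way to take time alignment into account, assumed here for illustration: for each song, count how many matching fingerprints agree on the same offset between the song and the sample. The data layout, the function name `best_match`, and the toy fingerprints `"a"`, `"b"`, `"c"` are hypothetical.

```python
from collections import Counter, defaultdict


def best_match(database, sample):
    """Pick the song whose matching fingerprints line up best in time.

    database: {fingerprint: [(song_id, time_in_song), ...]}
    sample:   [(fingerprint, time_in_sample), ...]
    Returns (song_id, score), or (None, 0) if nothing matches.
    """
    offsets_by_song = defaultdict(Counter)
    for fingerprint, sample_time in sample:
        for song_id, song_time in database.get(fingerprint, []):
            # Matches from the true song should all share one offset.
            offsets_by_song[song_id][song_time - sample_time] += 1

    best_song, best_score = None, 0
    for song_id, offsets in offsets_by_song.items():
        score = max(offsets.values())  # size of the largest aligned group
        if score > best_score:
            best_song, best_score = song_id, score
    return best_song, best_score


database = {
    "a": [("song_1", 10), ("song_2", 3)],
    "b": [("song_1", 11), ("song_2", 40)],
    "c": [("song_1", 13), ("song_2", 7)],
}
sample = [("a", 0), ("b", 1), ("c", 3)]

print(best_match(database, sample))  # ('song_1', 3)
```

Both songs contain all three fingerprints, but only in `song_1` do they occur with the same spacing as in the sample (a constant offset of 10), so it scores 3 while `song_2` scores 1.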
In essence, Shazam’s technology is akin to a musical detective. It listens to a song, extracts unique audio features, and then hunts for the song’s identity within a vast music library. The result is a seamless user experience that transforms the magic of song recognition into a technological reality.
For more in-depth information, check out MacLeod’s awesome article.