The Wide World of Music Recognition
This article was also published on MusicAlly and Hypebot
Most people are by now familiar with how the Shazam music service seems to magically identify the name of the song playing in the background.
With 120 million active users, Shazam’s popularity means it’s the most familiar use for this technology, which itself is the most prominent part of a larger category commonly known as Automatic Content Recognition (ACR).
The song recognition algorithm relies on a technique called fingerprinting. A small segment of the song to be identified is captured (usually via an app), and its time, frequency, and amplitude are used to generate a three-dimensional spectrogram, in which peaks of high intensity are then identified. The fingerprint built from these peaks is sent to a server and compared at high speed against a hash-indexed database containing millions of song fingerprints. When there is a fingerprint match, voila, the artist name and song title are returned almost magically to the querying user.
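The lookup step can be illustrated with a minimal Python sketch. It is not Shazam's actual implementation: it assumes the spectrogram peaks have already been extracted as (time, frequency) pairs, and the names hashes_from_peaks, build_index, identify, and FAN_OUT are hypothetical. Each peak is paired with a few later peaks to form a hash, and a query is matched to the song whose hashes line up at one consistent time offset.

```python
from collections import Counter, defaultdict

FAN_OUT = 3  # assumed value: how many later peaks each anchor peak is paired with


def hashes_from_peaks(peaks):
    """Turn (time, frequency) peaks into ((f1, f2, dt), anchor_time) pairs."""
    peaks = sorted(peaks)
    for i, (t1, f1) in enumerate(peaks):
        for t2, f2 in peaks[i + 1 : i + 1 + FAN_OUT]:
            yield (f1, f2, t2 - t1), t1


def build_index(songs):
    """Map each hash to every (song_id, anchor_time) where it occurs."""
    index = defaultdict(list)
    for song_id, peaks in songs.items():
        for h, t in hashes_from_peaks(peaks):
            index[h].append((song_id, t))
    return index


def identify(index, query_peaks):
    """Return the song whose hashes agree on a single time offset, or None."""
    votes = Counter()
    for h, t_query in hashes_from_peaks(query_peaks):
        for song_id, t_song in index.get(h, []):
            # A true match shares the same offset between song time and clip time.
            votes[(song_id, t_song - t_query)] += 1
    if not votes:
        return None
    (song_id, _offset), _count = votes.most_common(1)[0]
    return song_id


# Toy data: peaks are (time step, frequency bin); real systems use far more peaks.
songs = {
    "song_a": [(0, 100), (1, 200), (2, 150), (3, 300), (4, 250), (5, 120)],
    "song_b": [(0, 110), (1, 210), (2, 400), (3, 90), (4, 500), (5, 330)],
}
index = build_index(songs)
clip = [(0, 150), (1, 300), (2, 250), (3, 120)]  # song_a from time step 2 onward
result = identify(index, clip)
```

Voting on the time offset, rather than only counting shared hashes, is what lets a short clip taken from the middle of a song still point to the right track.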
Even though the technology for audio recognition has been around for almost 20 years, it is only in the past few years that more mainstream applications have put it to use. Alongside Shazam, other major audio-recognition service providers include ACRCloud, Audible Magic, BMAT, Civolution, Gracenote, Mufin, SoundHound, and Rovi.
Shazam-style apps are not the only way music/audio recognition technology can be used, however. Here are some more applications that may not be as obvious to the uninitiated.
Advertising & Broadcasting
According to Nielsen’s Digital Consumer report, “Second-screen activities have transformed TV viewing experience. 84% of smartphone and tablet owners say they use their devices as second-screens while watching TV at the same time. Consumers use second screens to deepen their engagement with what they’re watching, including activities such as looking up information about the characters and plot lines, or researching and purchasing products and services advertised.”
With audio recognition, it is possible during broadcasts (including live broadcasts) to trigger, sync, and present on the second-screen device additional program information, interaction (e.g. voting), interactive competitions, social media conversations (e.g. relevant Twitter, Facebook, WeChat, and Weibo interactions), and advertising/commerce opportunities.
With some target audiences relegating TV to a background medium, we have seen TV ads synchronized with online ads on second-screen devices, using audio recognition technology and programmatic ad optimization. In a recent advertising campaign devised by media agency OMD and TVadSync for Vodafone to promote free Spotify Premium for 3 months, corresponding online ads were run at the same time as the Vodafone-Spotify ads on TV. The TV ads were identified and synced with the online ads using audio recognition technology, resulting in a dramatic boost in click-through rates of at least 350% vs a non-synced control campaign.
While some, particularly broadcasters and marketers, have lamented users’ decreased direct engagement with TV, they can capitalize on the shift by creating compelling second-screen campaigns enabled by audio recognition.
In Kazakhstan, producers of the Channel 7 quiz show BOOM enhanced home viewers’ participation by enabling them to answer the same questions as the studio contestants, in real time. This simultaneous participation was achieved via their proprietary app, BOOMKZ, which uses third-party audio recognition technology to sync timestamps built into the app with the TV show during the broadcast. This transformed home viewers from mere armchair participants into bona fide contestants who stood a chance to win prizes.
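How such timestamp syncing might work can be sketched in a few lines of Python. This is an illustration only, not the BOOMKZ implementation: it assumes the recognition service returns the position in the episode at which the recorded clip started, and the cue sheet CUES and the functions show_position and current_cue are hypothetical.

```python
from bisect import bisect_right

# Hypothetical cue sheet: (seconds into the episode, question shown on screen).
CUES = [(30, "Q1"), (75, "Q2"), (120, "Q3")]


def show_position(matched_offset, clip_started_at, now):
    """Estimate the current position in the broadcast, in seconds.

    matched_offset: where in the episode the recognized clip began (assumed
    to come from the recognition service).
    clip_started_at, now: device clock readings in seconds.
    """
    return matched_offset + (now - clip_started_at)


def current_cue(position, cues=CUES):
    """Return the latest cue at or before position, or None if none has aired."""
    times = [t for t, _ in cues]
    i = bisect_right(times, position)
    return cues[i - 1][1] if i else None


# The clip matched 70 s into the episode and was recorded 8 s ago.
position = show_position(70, clip_started_at=1000, now=1008)
question = current_cue(position)
```

Because the position is derived from the broadcast audio itself, the app stays in step with the show even when viewers tune in late or the feed is delayed.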
Copyrighted Content Identification
YouTube partners who upload videos are probably familiar with its Content ID system, wherein users have to provide relevant video metadata and register their copyright usage rights.
By using advanced third-party audio fingerprinting technology, user-generated content (UGC) operators such as YouTube and SoundCloud are able to identify and screen content with copyright issues and easily remove duplicated content at the back-end.
This solution is also applicable to video or audio providers who are required to censor or screen for unlicensed content and also to track and manage content for monetization tracking and reporting purposes. This effectively automates the monitoring process, thereby saving manpower costs whilst responding in near real-time.
Music Collection Societies/ Performance Rights Organizations
One of the biggest challenges facing musicians today is that music reporting systems remain anachronistic despite the advanced digital technology now available in the market. Hundreds of millions of dollars belonging to artists are collected and kept in so-called black boxes by music Collection Societies and Performance Rights Organizations (PROs) globally, as they cannot be matched and distributed to artists and composers for a variety of reasons, including missing or wrong metadata, outdated or incompatible database systems, or, according to some critics, alleged theft.
Many radio stations and music venues around the world are still being made to report music playlists in old-school formats instead of real-time digital systems. The revenues collected are then conveniently divided via an archaic estimation system amongst the top labels and publishers, instead of being distributed fairly to other deserving independent artists and composers. An audio recognition system would definitely enable line-by-line reporting, so that artists can be paid their due revenues in a timely manner.
DJ Monitor is a company that uses audio recognition to identify songs played in clubs, and these playlists are then submitted to the PROs for payment to artists. Beatmap is another company that monitors music played in clubs by using music recognition technology. The playlists of each club, including song and artist information, are then displayed on the Beatmap app. Beatmap also sends the playlist to collection societies so as to enable artists to be paid performance royalties more accurately.
Radio Stations/ Online Radio Aggregators
Many radio stations globally are still utilizing analog systems and their online streaming feeds usually lack basic artist and song names. The proliferation of online radio stations has given rise to aggregators such as TuneIn or Live365, which offer streaming of hundreds of radio stations from around the world.
However, stations that do not offer basic playlist information do a disservice to both artists and users, and spoil the listening experience. With the deployment of music recognition technology, it is now possible to display the artist and song names in real time instead of having to rely on Shazam each time a user wants to find out the title of the song playing.
As Adriaan van Rossum, founder of watiseropderadio.nl, explains, “watiseropderadio.nl is the biggest Dutch online radio streaming aggregator and it displays song playlist data for all the popular radio stations in the Netherlands with the help of music recognition technology, which recognizes music played on our radio stations, especially when the playlist of these stations are incomplete.”
Another innovative way for users to find out the names of songs playing on the radio is via the use of music recognition technology integrating with newer services like Twitter.
Twitter users can simply tweet @RadioID and the name of the song playing on the monitored radio station will be displayed in real-time on Twitter. Radio stations can easily integrate this feature as part of their service to listeners. Urban Radio and Costazul FM are two radio stations that offer this song recognition service to their users.
Car manufacturers such as Ford and connected car service providers like Airbiquity have been providing drivers with connected entertainment devices on the move. Integrating music recognition SDKs into in-car devices and connected mobile devices makes it possible to identify artist names and song titles for non-digital radio feeds that do not provide such information, enriching drivers’ experience when listening to music in their vehicles.
Lyrics/ Chords/ Guitar Tab sites
Song lyrics, chords, and guitar tabs have traditionally been found via text search. Integrated music recognition technology offers a more convenient and accurate route: it identifies the exact song and displays its lyrics, chords, or tabs instantly for musicians and listeners. Music recognition technology now incorporates instant lyric displays provided via SDKs from leading lyric provider LyricFind.
EUMLab’s Guitar Master app, an all-in-one toolkit for aspiring guitar players, incorporates a music recognition feature that allows the use of the mic on users’ devices to recognize the song playing in the background or on their device. Upon recognition of the song, the matching chord score of that song will be instantly displayed to the user.
Audience Measurement
Instead of deploying additional hardware for audience measurement of TV broadcasts, it is now possible to integrate SDKs into an existing set-top box or mobile device via an app to monitor and measure audience viewership for live TV.
Live TV channels are monitored at the control end and fingerprinted using audio recognition to populate a master database. The channels that users are watching are fingerprinted in the same way, and those fingerprints are then matched against the database.
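The matching and counting described here can be sketched as follows. This is a simplified illustration, not any vendor's system: fingerprints are stood in for by short strings, and build_master, channel_for, and audience are hypothetical names.

```python
from collections import Counter


def build_master(channel_fingerprints):
    """Master database: fingerprint -> set of channels that aired it."""
    master = {}
    for channel, fingerprints in channel_fingerprints.items():
        for fp in fingerprints:
            master.setdefault(fp, set()).add(channel)
    return master


def channel_for(master, device_fingerprints):
    """Pick the channel matching the most fingerprints from one device."""
    votes = Counter()
    for fp in device_fingerprints:
        for channel in master.get(fp, ()):
            votes[channel] += 1
    return votes.most_common(1)[0][0] if votes else None


def audience(master, devices):
    """Count devices per channel; devices with no match are left out."""
    tally = Counter()
    for device_fingerprints in devices.values():
        channel = channel_for(master, device_fingerprints)
        if channel is not None:
            tally[channel] += 1
    return tally


# Toy data: "a3" aired on both channels (for example, a shared ad).
master = build_master({
    "news": {"a1", "a2", "a3"},
    "sport": {"b1", "b2", "a3"},
})
devices = {
    "box1": ["a1", "a2"],
    "box2": ["b1", "a3", "b2"],
    "box3": ["zz"],  # nothing recognized, e.g. TV muted or off
    "box4": ["a2", "a3"],
}
viewers = audience(master, devices)
```

Voting across several fingerprints per device keeps content shared between channels, such as an ad airing on both, from being attributed to the wrong one.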
Conclusion
With the prevalence of smart mobile devices that combine more powerful processors with built-in mics and speakers, together with better bandwidth and faster back-end systems, audio recognition technology is seeing more widespread adoption in the areas mentioned above. At the same time, we are also discovering more innovative and surprising applications that use the technology to automate formerly manual functions, almost magically, in a faster and more efficient manner.