Guidelines for Search - Siri Music (End to End)

Please ensure that you are using the latest version of the guidelines from BaseLine, which
are found in the upper right corner of every task.
Confidential
Table of Contents
Siri Music (End to End) Guidelines
    Introduction
    Rating Guidelines for HomePod Search
    Video Results
    Mandatory Comment
    Query Intent
    Query Type and Other Input Metadata
        1. Query Type
        2. Spell Correction
1. Song Queries
    Special scenarios
2. Artist Queries
    Artist Page, Artist Siri Station, and Artist Station
    Album Results for Artist Queries
    Other exception:
3. Personalized Queries
    Personalized Radio Station
4. Album Queries
5. Soundtrack Queries
6. Playlist/Chart Queries
    Exception
7. Lyrics Queries
8. Genre Queries
    Exception
9. Podcast Queries
    Examples:
10. Broadcast Radio Queries
    Exception
11. Apple Music Hosted Radio Queries
12. Editorial Radio Queries
13. Ambiguous - Multiple Classifications
14. Ambiguous - Unclear Intent
15. Local
16. Action Command Queries
17. Knowledge Based Queries
Seasonal Results
Determining Popularity
    Average Monthly Views Since Upload (AMVSU)
    Using the Recency of a Song
    Using Google Trends to assist
Lexicon
    Output Type
Updated:
01/19/2023: Artist Essentials/Similar Artists Station demoted to Acceptable and changes in format.
06/09/2022
Note: We are in the midst of updating the Training Videos. Please refer to these guidelines for the
latest update.
Siri Music (End to End) Guidelines
Introduction
In this document, we explain the relevance rating guidelines for HomePod content. You will use the
BaseLine tool to make these ratings.
What is HomePod?
HomePod is a smart speaker developed by Apple, Inc., first released on February 9, 2018. It
integrates Siri, which can be used to control the speaker and other HomeKit devices. The HomePod
supports proprietary Apple platforms and technologies (Apple Music, iTunes purchases, Apple
Podcasts, Apple Music 1 radio, and AirPlay) as well as other 3rd party apps like Pandora and
Deezer.
If you are not familiar with HomePod, please refer to https://www.apple.com/homepod/ for an
overview.
If you are not familiar with Siri, please see the overview "Use Siri to play music or podcasts".
The importance of your work as a Rater
The data we receive from you in the form of high-quality relevance judgments will be used to build
and improve artificial intelligence systems such as search algorithms and machine learned rankers
that power the user experience for Apple HomePod users.
Our ultimate goal is to enrich the customer experience by improving search quality and
enhancing customer satisfaction, and you play an important role in this.
Ask yourself: “Why is this particular result returned for this query? Is it relevant? Is it popular
enough to meet the primary intent?” Stay curious and do thorough research to understand the
relationship between a query and the result. It is also highly recommended that you keep yourself
well-informed, as market trends are dynamic and evolving.
Your attention to detail, research and language skills, as well as your cultural knowledge of the
market, are all critical to the success of our projects. Please keep in mind that your tasks will be
spot-checked for quality and measured against those of your peers. If your accuracy rate is consistently
high enough, you may qualify for broader access and work as an auditor who gives
feedback to your peers.
Rating Guidelines for HomePod Search
• Perfect: The primary intent of the query or the most likely piece of content implied by the
query.
For example, [Play Drake] with Drake’s artist Siri Station returned would be
Perfect. In the case of a multi-match, a piece of content implied by the query
is rated Perfect ONLY if it is incredibly popular or equally popular to the
intended piece of content. For example, [Play home] should return popular songs
with the title ‘home’.
• Acceptable: A piece of content that is closely related, but secondary, to the implied query.
The content would be a less popular intent compared to the primary ones that are rated
Perfect.
o For example, for [Play blood sweat and tears], the primary intent is the song Blood
Sweat & Tears by BTS, but the Blood Sweat & Tears band is popular enough to
be a potential intent (and the artist Siri Station should be rated 'Acceptable').
• Unacceptable: Off-Topic - This piece of content is off-topic to the query.
• Problem: Other - Use this rating for any problem not listed above and for Local/Knowledge
Based/Action Command queries. A problem or technical issue with the task in
BaseLine that makes it impossible to judge relevance, or a video result of a TV
show/Movie (for Siri Music (End to End) Evaluation), should be rated with this rating as
well (example below).
Video Results
Video results are occasionally returned in search results and should be rated Problem: Other as
they are not music intents.
Problem: Other Examples
Note 1: As Siri occasionally incorrectly transcribes spoken utterances, be cognizant of query intent,
e.g. if the query is [play ain't shit by THE CARTERS], the user is actually looking for the song
Apeshit by THE CARTERS. For highly misrecognized queries, reading the query aloud helps. For
example, the intent for [Play funder] is Imagine Dragons' song Thunder.
Mandatory Comment
Each rating must be explained in the comment box. Even if “optional” is indicated, rating comments
are always mandatory.
The comment should be concise and must only explain why the rating chosen is the correct one in
application of the guidelines.
Query Intent
Researching Query Intent
We expect you to use search resources such as Bing, Google, Yahoo, YouTube, Genius as well as
Apple Music to help you understand the intent of each query. Searching "[query] apple music" or
"[query] iTunes" will help narrow down the intent.
Primary and Secondary Intent
The primary intent of a query is the most likely intent, i.e., the intent of most users who enter that
query in Apple Music.
A secondary intent, on the other hand, is less likely, or would be a less popular intent compared to
a primary one. A secondary intent could be:
• content relevant to a smaller group of users than for the primary intent
• lower quality/lesser known content such as unpopular covers or remixes from unknown artists.
Query Type and Other Input Metadata
1. Query Type
Each query is assigned a ‘Query type’. On the left hand side in BaseLine, inside the input metadata
section you will find the classification of the query. The classification is already set (you do not have
to classify the query yourself).
You should rate the content relevancy based on (1) the classification of the query and (2) how
relevant the content is in relation to the query and the query type.
The ‘Query type’ aims to help you determine the primary intent of the query. Please also
consider potential secondary intents when doing your ratings.
Should you disagree with the ‘Query type’ classification, please explain in the comment
section why it should be different. You can then apply the appropriate rating scale to
complete your judgement.
Example Comment: “I believe that [red] should be classified as Album rather than Soundtrack. I
used this query type for the rating.”
Task Examples:
1. [play love] - Classified as Song but could also be classified as Genre/Category. Love by Keyshia
Cole or Love Apple Music playlist could both be rated 'Perfect'.
2. [play mia] - Classified as Artist but could also be classified as Song. M.I.A. Siri Station or song Mía
by Bad Bunny & Drake could both be rated 'Perfect'.
2. Spell Correction
Some queries are misspelled by our users. When this is the case, assume the query intent is for the
correctly spelled query.
If the query is [play the weekend] or [play lasy gaga], the user likely means [play the
weeknd] and [play lady gaga]. Use your best judgement.
Missing words like feat., x, ft. in Artist Queries (like [nicki minaj lil baby] instead of [nicki minaj ft. lil
baby]) would not be considered spelling errors, as there is nothing that's misspelled.
1. Song Queries
Perfect: The particular song implied by the query, the result that satisfies primary intent.
• The intended song of the query
• Highly popular songs with equal primary intent
1. [Hey Siri can you play Let It Go] - Let It Go by Idina Menzel
2. [déjà vu] - deja vu by Olivia Rodrigo
3. [Happy Birthday] - High quality versions should be rated 'Perfect'. Similarly, for other children's
rhymes and songs, we do not need to compare the popularity of the content. Rate it
based on the quality of the song.
4. [play cardi b and megan thee stallion] - WAP song which is the intended collaboration by
Cardi B and Megan Thee Stallion.
5. [play locked up] - Locked Up (feat. Styles P) by Akon. The remix is more popular than the
original and thus rated ‘Perfect’. Refer to section “Special Scenarios” for more examples.
Acceptable: A popular piece of content which satisfies the secondary intent of the song
query or is related to the primary intent.
• The secondary intent song
• Less popular versions of the intended song
• Album or Playlist containing the intended song
1. [Hey Siri play Rockin' Robin by Jackson 5] - Rockin' Robin (Live at the Forum, 1972) by
Jackson 5
2. [Without You Kid Laroi] - WITHOUT YOU (Miley Cyrus Remix) by The Kid LAROI & Miley
Cyrus
3. [Black Widow] - Black Widow (Original Motion Picture Soundtrack). The song Black
Widow by Iggy Azalea ft. Rita Ora is considerably more popular and the Black Widow
Soundtrack can be considered as secondary intent.
4. [play 7 rings by Ariana Grande] - thank you, next album (containing the intended song)
5. [play 7 rings by Ariana Grande] - Ariana Grande Essentials playlist (containing the
intended song)
Unacceptable (Off-Topic): A piece of content that is not related to the song, and/or has little to no
perceived relevance.
• Content with no relevance to the query.
• Unpopular title matches.
• Artist station of the intended song.
1. [I'll leave the door open] - I’ll Always Leave the Door a Little Open by Freddy Cole
2. [back in the day by aminé] - DR. WHOEVER by Aminé
3. [Jump by the Pointer Sisters] - The Pointer Sisters Siri Station
4. [Play the new Taylor Swift song] - I Knew You Were Trouble (Fish Fugue Remix) song by
Taylor Swift
5. [play more than a memory by Garth Brooks] - Any results returned for this query are not
relevant as the Garth Brooks catalog is not available on Apple Music.
6. [play hooked] - Hooked song by valentina cy
7. [play cardi b and megan thee stallion] - Cardi B Siri Station. The intent is for the
collaborated song WAP, so the Siri Station of either artist should be rated Unacceptable:
Off-Topic.
Special scenarios
1. Multiple Song Versions → Some songs are re-mastered/re-recorded over time. Popular
versions of the song will receive the rating of Perfect. This may not always be the case, so
do your research.
1. Examples:
1. [play I'm Still Standing by Elton John] - I'm Still Standing
(Remastered) by Elton John. This result is a remastered version and
should be rated as 'Perfect'.
2. [play let me love you (feat. justin bieber)] - Let Me Love You (feat.
Justin Bieber)[Sean Paul Remix] by DJ Snake and Sean Paul. This
result is a less popular remix by DJ Snake & Sean Paul and should
be rated as ‘Acceptable’. The primary intent = Let Me Love You (feat.
Justin Bieber) by DJ Snake.
3. [play Never Enough From "The Greatest Showman"] - Never Enough
by Ciara Brooke. This is an unpopular cover by Ciara Brooke that
should be rated ‘Unacceptable: Off-Topic’. Intent = Never Enough by
Loren Allred from The Greatest Showman (Original Motion Picture
Soundtrack).
2. Remix vs Original → Sometimes a remix will be more popular than the original. In that
case, the remixed song will get 'Perfect' (if it's clearly the most popular), while the original
should receive 'Acceptable'.
1. It is less common for a remix to be more popular than the original. If both
versions are equally popular, then both versions can receive Perfect.
1. [waves mr probz] → The Mr. Probz - Waves (Robin Schulz Remix
Radio Edit) would be ‘Perfect’, while the original Mr. Probz - Waves would be ‘Acceptable’.
2. [I took a pill in ibiza] → The Mike Posner - I took a pill in Ibiza (Seeb
Remix) would be ‘Perfect’, while the original Mike Posner - I took a pill
in Ibiza would be ‘Acceptable’.
3. Popularity can change over time. For example, for the longest time, the remix
of the song Despacito featuring Justin Bieber was more popular than the original
with only Luis Fonsi & Daddy Yankee. At the moment, the original version of
Despacito by Luis Fonsi & Daddy Yankee is the most popular version.
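As an illustration only, the Remix vs Original rule above can be sketched in Python. The function name, the numeric popularity scores, and the `tolerance` used to decide "equally popular" are hypothetical and not part of BaseLine. In practice you determine popularity through research. Rating a less popular remix as Acceptable is an assumption based on the "Less popular versions of the intended song" bullet for Song Queries.

```python
def rate_remix_and_original(remix_popularity, original_popularity, tolerance=0.1):
    """Return (remix_rating, original_rating) for a song query.

    Popularity values are hypothetical numeric scores (for example,
    stream counts found during research). `tolerance` is an assumed
    relative margin within which two versions count as equally popular.
    """
    larger = max(remix_popularity, original_popularity)
    difference = abs(remix_popularity - original_popularity)

    # Equally popular versions can both receive Perfect.
    if larger == 0 or difference <= tolerance * larger:
        return ("Perfect", "Perfect")

    # The clearly more popular version is Perfect, the other Acceptable.
    if remix_popularity > original_popularity:
        return ("Perfect", "Acceptable")
    return ("Acceptable", "Perfect")


# A remix that is clearly more popular than the original.
print(rate_remix_and_original(100, 10))
```

Because popularity shifts over time (as with Despacito), the scores passed in should reflect current research rather than a stored value.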
3. Single vs Album → Although Singles show up as albums in BaseLine, Singles are
considered Songs. Single results will include the word ‘Single’ in the title. If the Single
result is the intended song, the Single result is rated Perfect.
1. Example: [metti 1000 di orietta berti] (English: Play Mille by Orietta
Berti) → MILLE - Single. In this example, we are getting a Single result for the
intended song → Perfect
4. National Anthem → We should return web search results in all storefronts for utterances of
the form [play country national anthem]. For example: [play Italian
National Anthem] or [play Russian national Anthem]. Thus, any results from Apple Music for these
queries should be rated Unacceptable: Off-Topic.
However, if the user is asking specifically for the title of the national anthem (e.g. God Save the
Queen, Star Spangled Banner), then we should rate the results according to popularity. Apple does
not want to make judgements about what the “correct” national anthem is or whether a country
“exists”.
2. Artist Queries
The Artist Station, Artist Siri Station, and Artist Page for the intended artist are all rated Perfect.
The Artist Essentials playlist and the Artist & Similar Artists Station are rated Acceptable.
Perfect: The intended artist of the query.
• The Artist Station, Artist Siri Station, and Artist Page
• Vague artist song requests: “play a song by x” or “play x songs”
1. [shuffle drake] - Drake Siri Station
2. [avril lavigne] - Avril Lavigne Siri Station
3. [play a song by Taylor Swift] - Love Story (Taylor's Version). Intent is a song by the artist.
Any song by the artist should be rated Perfect.
4. [play michael jackson song] - Michael Jackson's Artist Page / Siri Station. Since the Siri
station would play at least ONE song by the intended artist, it should be rated Perfect.
5. [play michael jackson songs] - Michael Jackson's Artist Page / Siri Station. Since the Siri
station would play songs by the intended artist, it should be rated Perfect.
Acceptable: A popular piece of content which satisfies the secondary intent of the artist
query or is related to the primary intent.
• The Artist Essentials playlist and the Artist & Similar Artists Station
• Albums by the intended artist
• Popular songs by the intended artist
• Artist playlists such as Love songs/Next Steps/Deep Cuts
1. [miley cyrus] - Miley Cyrus Essentials playlist
2. [play doja cat radio] - Doja Cat & Similar Artists Station
3. [taylor swift] - Evermore album. When an Artist query returns one album by the intended
artist, the album result should be rated Acceptable. But if the artist only has one album,
then it would be Perfect.
4. [miley cyrus] - Miley Cyrus: Love Songs playlist. If the result is a playlist for the intended
artist, but it is slightly specific, it should be rated Acceptable.
5. [justin bieber] - Peaches by Justin Bieber. When an Artist query returns one popular song
by the intended artist, it should be rated Acceptable.
Unacceptable (Off-Topic): A piece of content that is not related to the artist, and/or has little to no
perceived relevance.
• Content with no relevance to the query
• Unpopular title matches
• Unpopular content by the artist
1. [justin bieber] - Out of Town Girl by Justin Bieber. The result is an unpopular song by the
intended artist from one of his older albums, so it should be rated Unacceptable: Off-Topic.
2. [justin bieber] - A-List Pop playlist. Although this playlist may contain a song by Justin
Bieber, we cannot ensure that the first song being played is a Justin Bieber song, so it should
be rated Unacceptable: Off-Topic.
3. [play the best by garth brooks] - Garth Brooks Greatest Hits: Cover Tribute Album, Vol.
1 cover album. This should be rated Unacceptable: Off-Topic as we do not have any
content by Garth Brooks in Apple Music.
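The Artist Query ratings above can be summarized as a lookup, sketched here in Python as an illustration only. The result type labels, the function name, and its parameters are hypothetical and do not correspond to any BaseLine field. The sketch assumes popularity has already been judged through research, and it does not model special cases such as classical artists or "greatest hits" utterances.

```python
# Hypothetical labels for the kinds of results an Artist query can return.
PERFECT_TYPES = {"artist_station", "artist_siri_station", "artist_page"}
ACCEPTABLE_TYPES = {
    "essentials_playlist",
    "similar_artists_station",
    "album",
    "popular_song",
    "artist_playlist",  # e.g. Love Songs / Next Steps / Deep Cuts
}


def rate_artist_result(result_type, by_intended_artist=True, only_album=False):
    """Map a result type to a rating for an Artist query."""
    # Content that is not by the intended artist is off-topic.
    if not by_intended_artist:
        return "Unacceptable: Off-Topic"
    if result_type in PERFECT_TYPES:
        return "Perfect"
    # If the artist only has one album, that album is rated Perfect.
    if result_type == "album" and only_album:
        return "Perfect"
    if result_type in ACCEPTABLE_TYPES:
        return "Acceptable"
    # Anything else (unpopular songs, unrelated playlists) is off-topic.
    return "Unacceptable: Off-Topic"


print(rate_artist_result("artist_siri_station"))
print(rate_artist_result("album", only_album=True))
```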
Artist Page, Artist Siri Station, and Artist Station
[Play 6ix9ine]: In this example, the query returns a Siri Station for the artist 6ix9ine which is
rated Perfect.
[Hey Siri play Bruce Springsteen]: In this example, the query returns the Artist Page for Bruce
Springsteen. Note how it mentions "Artist" below the result. This result is also Perfect.
Album Results for Artist Queries
For Artist queries, if the result is one album by the intended artist, the album result is
rated Acceptable.
Exception → If an artist only has 1 album, we won't be able to create a Siri Station, because we do
not have enough content. The album by the intended artist is rated Perfect.
Examples:
[play THE CARTERS] → The album EVERYTHING IS LOVE by The Carters is Perfect.
Note: 1 song from the only album by the artist would get Acceptable.
Other exception:
Classical Artist Queries → The intended behavior is to return either an Artist
Station or Essentials playlist. Either result should receive the rating of Perfect.
Examples:
[play some Mozart] - The playlist Wolfgang Amadeus Mozart Essentials is rated Perfect.
[play bach] - Bach Station radio station. The Apple radio station can also be rated Perfect.
[Play x greatest hits] or [Play the best by x] → Any utterance for “greatest hits” or “the best”
should return either (a) a well regarded greatest hits album or (b) the Essentials playlist created
by Apple's editorial team.
Examples: [play Elton John's greatest hits] → The album should be rated 'Perfect'. The artist's
Essentials playlist would also get 'Perfect'.
Song vs Songs → If the utterance is singular (e.g. ‘a song’, ‘song’) then the ‘Perfect’ result can be
a single song or the artist’s Siri Station/Artist page. If the utterance is plural (e.g. ‘songs’) then the
result should be a Siri station, Artist page, playlist, etc.
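A minimal Python sketch of the Song vs Songs distinction follows, as an illustration only. The function name and the result type labels are hypothetical. The sets list only the result types named in the rule above (the "etc." for plural utterances is not modeled).

```python
import re


def perfect_result_types(utterance):
    """Return the result types that can be rated Perfect for a vague
    artist song request, based on singular vs plural wording."""
    words = re.findall(r"[a-z']+", utterance.lower())
    if "songs" in words:
        # Plural: the result should play several songs by the artist.
        return {"siri_station", "artist_page", "playlist"}
    if "song" in words:
        # Singular: one song is enough, and a station or page also works.
        return {"song", "siri_station", "artist_page"}
    # Neither word present: this rule does not apply.
    return set()


print(sorted(perfect_result_types("play a song by Taylor Swift")))
print(sorted(perfect_result_types("play michael jackson songs")))
```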
Lullaby/Children's Rhymes → If the user's intent is for a lullaby or children's rhyme, as long as the
result is a single high quality song that is relevant to the query, we can rate it as Acceptable. We do
not need to consider the popularity of the song.
Audiobooks/Audio Dramas Guidelines → Please refer to this document for more rating
examples for different scenarios.
3. Personalized Queries
A personalized query is an utterance that triggers content generated from “For You” or creates
a personalized mix station based on the utterance. “Made For You” personalized mix stations
include “Get Up! Mix”, “Chill Mix”, “New Music Mix”, etc. This is different from Local in that Local
utterances refer to playlists that exist on your phone. Personalized music represents the
user's tastes and preferences.
• Examples: [play something], [play music], [play a song], [play my station], [play music for
me], [play my favorites], [my favorites mix], [Chill mix], [Replay 2021], [Hey Siri, play me
something I’d like]
Perfect: The intended personalized station.
• Personalized station
• Personalized mixes
1. [DJ for me] - iTS's Station/ Apple Music 1. The user intent is to listen to customized
content according to personal user history, preferences, library, and actions. Therefore,
any of these outputs should be rated Perfect.
2. [Hey Siri can you play some music] - iTS's Station or Apple Music 1
3. [play my favorite music] - Favorites Mix playlist
4. [play my favorites] - Favorites Mix playlist
5. [play my chill mix] - Chill Mix playlist
6. [play my new music mix] - New Music Mix playlist
Acceptable: A popular piece of content which satisfies the secondary intent of the query or is
related to the primary intent.
• Popular songs
• Keyword Matching
1. [播好歌] [Translation: Play (a) good song] - Good Song by Alex Fong. The token matches
the utterance, but the primary intent is more towards playing a good quality song, hence
the output is Acceptable. A perfect result should be iTS's Station/ Apple Music 1.
2. [play my favorite song] - My Favourite (Ansen Remix) by Danny Darko. There are partial
token matches, but the primary intent is more towards playing the user's favorite mix,
hence the output is Acceptable as it serves a secondary intent. A perfect result should be
the user’s Favorites Mix.
3. [play good music] or [put on nice songs] - Easy on Me by Adele. For popular songs that
are being returned as results, we can rate them as Acceptable for broad queries like
these.
Unacceptable (Off-Topic): A piece of content that is not related to the query and/or has little
to no perceived relevance.
• Incorrect station
1. [Hi Siri play music] - Pure Pop station. In this case the result is wrong as we should be
returning a personalized station or Apple Music 1.
2. [Play something about you] - iTS's Station. The intent is the song Something About You by
Hayden James (not the user’s Personalized Siri Station). In this particular situation, the
algorithm seems to have recognized "play something" and overlooked the rest of the
query. For this case, the personalized Siri station should be rated Off-Topic.
3. [Play something there] - iTS's Station. The user’s personalized Siri Station should be rated
‘Off-Topic’. Intent = the song Something There by Paige O'Hara, Robby Benson, Jerry
Orbach, Angela Lansbury & David Ogden Stiers from Beauty and the Beast (Soundtrack
from the Motion Picture).
Note: Personalized Siri Stations look like this in BaseLine:
Personalized Radio Station
Please note that although there is no picture of the content here, it is NOT the same as an
[Unavailable] content as shown above, so please DO NOT use the “Problem: Other” rating.
[play some classical piano] → Classical (Personalized Radio Station). We would rate this Perfect, as
we can see that the user wants some classical piano music and the result is a personalized radio
station (for classical and piano music) that would satisfy the user’s intent.
4. Album Queries
Perfect: The particular album implied by the query, the result that satisfies primary intent.
• The intended album of the query. Please note album “singles” should be considered as
Songs.
1. [play sour] - Sour album by Olivia Rodrigo
2. [Hey Siri play Carole King Tapestry] - Tapestry album by Carole King
3. [play Taylor Swift's album] - Fearless (Taylor's Version) album by Taylor Swift. When the
intent is not a specific album, any album by the artist, or in the same series should be
rated Perfect.
Acceptable: A popular piece of content which satisfies the secondary intent of the album
query or is related to the primary intent.
• Songs from the intended album
1. [Play cry baby (deluxe edition)] - Play Date song by Melanie Martinez. If the utterance
contains the intent for an album and only one song from the album is returned, it should
be rated Acceptable. The album will receive Perfect.
2. [Play Plastic Hearts miley] - Midnight Sky song by Miley Cyrus. The song Midnight Sky
from the intended album Plastic Hearts by Miley Cyrus should be rated Acceptable.
3. [play justin bieber's album] - Peaches single song. Since the utterance did not specify
which Justin Bieber album, and the result is a popular song by the intended artist, this
should be rated Acceptable.
Unacceptable (Off-Topic): A piece of content that is not related to the album, and/or has little
to no perceived relevance.
• Content with no relevance to the query.
• Albums by other artists
• Songs from other albums
1. [play Kanye's new album] - 808s & Heartbreak album by Kanye West. Intent is the
latest/newest album by Kanye, but the result is an album released in 2008 and Kanye has
newer releases, so it should be rated Unacceptable: Off-Topic.
2. [play SZA album] - Planet Her album by Doja Cat. Although SZA has a collaboration with
Doja Cat (Kiss Me More) in this album, the user is looking for a SZA album and not just 1
song, so it should be rated Unacceptable: Off-Topic.
3. [play Billie Eilish Happier than ever] - bad guy song by Billie Eilish. Intent is Billie Eilish's
album Happier Than Ever, but the result is a song from another album (WHEN WE ALL
FALL ASLEEP, WHERE DO WE GO?), so it should be rated Unacceptable: Off-Topic.
4. [play when we fall asleep where do we go by Billie Eilish] - Billie Eilish: WHEN WE ALL
FALL ASLEEP, WHERE DO WE GO? radio episode. The intent is an album. The result is
an episode of a radio show that discusses the intended album. The radio show episode is
not a primary or secondary intent and should be rated Unacceptable: Off-Topic.
5. Soundtrack Queries
Perfect: The particular soundtrack implied by the query, the result that satisfies primary
intent.
• The intended soundtrack of the query
• The main theme song from the visual content
• Curated playlists
1. [play coco] - Coco (Original Motion Picture Soundtrack)
2. [play descendants 3] - Descendants 3 (Original TV Movie Soundtrack)
3. [subway surfers] - Subway Surfers (Main Theme). The intent is music from the game
Subway Surfers. The main theme song Subway Surfers (Main Theme) is rated Perfect.
The album Greatest Hits 2020 by Subway Surfers containing the main theme song is also
rated Perfect.
4. [paw patrol] - Paw Patrol Opening Theme. The intent is music from the kids show Paw
Patrol. The main theme song Paw Patrol Opening Theme is rated Perfect. The
album PAW Patrol Official Theme Song & More - EP containing the main theme song is
also rated Perfect. The artist page Paw Patrol containing this album is also rated Perfect.
5. [play The Magic School Bus] - The Magic School Bus Theme. This TV show only has
one theme song, The Magic School Bus Theme, which is rated Perfect.
6. [play Star Wars] - Best Of Star Wars. The content returned is a curated playlist by Disney
which contains popular songs from the franchise, which is rated Perfect.
Acceptable: A popular piece of content which satisfies the secondary intent of the
soundtrack query or is related to the primary intent.
• Songs from the intended soundtrack
1. [paw patrol] - Paw Patrol on a Roll. This song is related to the kids tv show Paw Patrol, but
it's not the main theme song, so it is rated Acceptable. The main theme song, the album
containing the theme song, and the artist page would all be rated Perfect.
2. [play guardians of the galaxy soundtrack] - Flashback - One Hit Wonders. The intent is for
music from the movie Guardians of the Galaxy. The returned album Flashback - One Hit
Wonders contains only one song from the intended soundtrack, which poorly satisfies the
primary intent and is rated Acceptable. The playlist Guardians of the Galaxy by Disney
would be rated Perfect, as well as the official soundtrack album Guardians of the Galaxy
(Original Score).
3. [play euphoria] - Forever. The intent is music from the HBO show Euphoria. A result is
only a single song, Forever, from the intended tv show and is rated Acceptable. The
official soundtrack album Euphoria (Original Score from the HBO Series) would be rated
Perfect.
Unacceptable (Off-Topic): A piece of content that is not related to the soundtrack, and/or has
little to no perceived relevance.
• Content with no relevance to the query.
1. [play narcos mexico] - Narco Corridos Vol 1. The intent is music from the Netflix show
Narcos: Mexico. The result for this query is Narco Corridos Vol 1, which has a partial text
match with the query but doesn't contain music from the intended tv show.
6. Playlist/Chart Queries
Perfect: The particular playlist implied by the query, the result that satisfies primary intent.
• The intended playlist of the query
1. [Hey Siri play Pure Spa] - Pure Spa playlist
2. [Hey Siri play Lewis Capaldi Essentials] - Lewis Capaldi Essentials playlist
3. [play a summer playlist] - Songs of the Summer playlist | Summer BBQ playlist
| Summer playlist
4. [play Drake playlist] - Drake Essentials or Drake Next Steps playlist. Any one of the
playlists that we have curated for the artist should be rated 'Perfect'. Other examples
include Drake: Love Songs and Drake: Deep Cut, where every song in the playlist is by
Drake.
5. [play a fourth of July playlist] - Fourth of July playlist
6. [Hey Siri play charts] / [Hey Siri play Chart music] - Top 100: USA playlist or US Top 40 |
Chart Hits 2022. Since the query is from the USA Storefront, the charts for USA would be
the Perfect result. For other storefronts, Top 100 or Top Charts for that specific country
would be considered Perfect results.
7. [play best] / [play some popular music] / [play hits] - Top 100: USA playlist (for US
Storefront). The Charts playlist for corresponding storefronts can be rated as Perfect.
Acceptable: A popular piece of content which satisfies the secondary intent of the playlist
query or is related to the primary intent.
• Popular songs
• Secondary intent playlists
1. [play best] / [play some popular music] / [play hits] - Easy on Me by Adele OR any popular
songs. For ambiguous queries like “play best”, we can rate both classic popular single-song
results (Bohemian Rhapsody, Hello, Viva La Vida - songs that are no longer
charting) and current popular songs on the charts as Acceptable. In addition, if the
result is popular content such as “Best Song Ever” by One Direction, it can also satisfy a
secondary intent and is thus Acceptable.
2. [play Drake playlist] - Inspired by Drake or Drake: Influences playlist. Although these
playlists do not have songs by the intended artist, they would be able to satisfy the
user's secondary intent (content discovery).
3. [Amy Winehouse playlist] - Amy Winehouse Siri Station. Any high quality playlist of the
intended artist is eligible for a Perfect rating. The Siri station of the intended artist could
satisfy the user's intent as it contains a collection of the intended artist's songs, and is thus
Acceptable.
4. [play Rap Life] - Hurricane song by Kanye West. User is looking for this navigational
playlist, but since this song is included in this playlist (at the time of rating), the single song
result can be considered Acceptable.
5. [play top hits] - Stay by The KID LAROI and Justin Bieber. Although user is likely to be
looking for the charts playlist, this song is indeed a top hit so the single song can be
considered Acceptable.
Unacceptable (Off-Topic): A piece of content that is not related to the playlist, and/or has little
to no perceived relevance.
• Content with no relevance to the query.
1. [play Today's Hits] - Sour album by Olivia Rodrigo. Although the album is popular and has
songs in the playlist, the user is likely to be looking for that particular playlist, so the album is
unlikely to satisfy the user intent (too specific).
2. [A-List Pop] - Apple Music 1 radio. Since the result is a hosted radio station, it is unlikely to
satisfy user's intent. In addition, Apple Music 1 is a mix between pop, hip-hop and indie
music, thus slightly different from the intended content.
3. [play best] / [play some popular music] / [play hits] - Study Session playlist. We return a
specific playlist that does not contain current top charting songs.
Exception
[in the charts recently] /[popular recently] → For Top songs "recently", kindly use the Apple Music
chart (select the Daily Top 100 Chart for the storefront) to VERIFY and COMMENT with the actual
rank (of the song in the chart) and date of retrieval. The song would have to be amongst the Top 100
of their market charts on the day of rating.
• If the result is in the Top 100 → Perfect
• Not in the Top 100 and not popular “recently” → Unacceptable: Off-Topic
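The Top 100 check above can be sketched as a small helper. This is only an illustration, not a BaseLine tool: the function name `rate_recent_chart_result`, its parameters, and the comment wording are assumptions made for this example, and the rank still has to be looked up manually in the Daily Top 100 chart for the storefront. The sketch also assumes that a song missing from the chart is not otherwise popular "recently".

```python
from datetime import date
from typing import Optional, Tuple


def rate_recent_chart_result(rank: Optional[int], retrieved: date) -> Tuple[str, str]:
    """Return (rating, comment) for a 'popular recently' query.

    rank is the song's position in the storefront's Daily Top 100 chart on
    the day of rating, or None if the song is not on the chart.
    Hypothetical helper: the name and comment format are assumptions.
    """
    if rank is not None and 1 <= rank <= 100:
        rating = "Perfect"
        comment = f"Rank {rank} in Daily Top 100 on {retrieved.isoformat()}"
    else:
        rating = "Unacceptable: Off-Topic"
        comment = f"Not in Daily Top 100 on {retrieved.isoformat()}"
    return rating, comment


# Example with made-up values: a song found at rank 37 on the day of rating.
rating, comment = rate_recent_chart_result(37, date(2022, 6, 1))
```

The returned comment carries both the rank and the retrieval date, which are the two items the mandatory comment needs.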
[Hey Siri please play Mozart bedtime] → This query is classified as a Playlist/Chart
query because it is more of an artist functional request (not a genre/mood/activity, as it is
specific to Mozart). The user is likely to be looking for Mozart bedtime related content (Mozart +
Bedtime/lullaby/sleep/relaxing) and any collection (playlist/radio/album) of songs that satisfies this
intent can get Perfect (Mozart for Bedtime, Bedtime Mozart, Mozart Deep Sleep etc.). However, if
the result is a single song from any of these albums, we would rate it Acceptable, as the song might be
able to satisfy a secondary intent.
7. Lyrics Queries
Perfect: The particular song that matches the lyrics query, the result that satisfies primary
intent.
• The intended song from the lyrics
1. [Play the song that goes I think that you can read my mind] - Speechless by Dan+Shay
2. [play move it move it] - I Like to Move it song by will.i.am
Acceptable: A less popular piece of content which satisfies the secondary intent of the lyrics
query.
• Remix/different version of the primary intent song
1. [play the song that goes I think that you can read my mind] - Speechless (Acoustic) by
Dan+Shay. An acoustic version of the intended song would be a secondary intent.
2. [play you're beautiful just the way you are] - Scars to Your Beautiful (Live off the Floor
(Bonus Track)) by Alessia Cara. The intended result is “Scars to Your Beautiful” and we
return the live version that also matches the lyrics.
Unacceptable (Off-Topic): A piece of content that is not related to the lyrics.
• Content with no relevance to the query.
1. [Play the song that goes I think that you can read my mind] - Dan + Shay album by Dan +
Shay. The result is the album containing the intended song. However, the user intent is for
that particular song and returning the album would not satisfy the user's intent.
8. Genre Queries
Perfect: Playlists, Radio Stations and Albums that clearly satisfy the intent of the query.
• Collection of songs that is relevant to the query/user’s intent
1. [workout songs] - Hip-Hop Workout playlist | Pop Workout playlist | Hip-Hop Workout radio
station | Pure workout playlist. All of these results are high quality curated containers that
satisfy the user's intent for workout music and should be rated as Perfect.
2. [country] - Apple Music Country radio | Today's Country playlist | NOW That's What I Call
Country, Vol. 14 album. For the compilation album "NOW That's What I Call Country, Vol.
14", only the latest version should be rated as Perfect, while the older iterations can be
rated as Acceptable.
3. [Play Lullabies] - Lullabies Station
4. [play sleep sounds] - Sleep Sounds playlist
5. [play Christmas music] or [play Easy Listening] - Essential Christmas playlist | Easy
Listening Essentials playlist. The essential(s) playlist contains a good collection of music
that satisfies the user's intent.
6. [play fart noises] - Funny Fart Noises album
Acceptable: A popular piece of content which satisfies the secondary intent of the genre
query or is related to the primary intent.
• Single popular songs
• Slightly specific albums/playlists
1. [play pop songs] - Future Nostalgia album by Dua Lipa. Although the result is an album
filled with Pop songs, it is slightly specific as it only has songs by Dua Lipa, and thus
should be rated as Acceptable.
2. [country] - 2010s Country Essentials playlist or Carrie Underwood Essentials playlist.
Essentials playlists for a decade or for a popular country artist are slightly specific, but the
content is indeed relevant to the user's intent for country music, and thus can be rated as
Acceptable.
3. [workout songs] - Dynamite by BTS. Dynamite is popular and relevant content that
matches the intended genre. However, one single song is not ideal for the user when the
utterance is looking for a broad genre or mood.
Unacceptable (Off-Topic): A piece of content that is not related to the genre, and/or has little
to no perceived relevance.
• Content with no relevance to the query
• Outdated content
• Unpopular single songs
1. [play some new R&B] - Songs in a Minor album by Alicia Keys. In this case, the album
was released in 2001. Any album or song result that is older than 1 year is not considered
new, so this result should be rated Unacceptable: Off-Topic.
2. [play dance music] - Today's Country playlist
3. [workout songs] - Dear my Friend, by Keung To (姜濤). Dear my Friend, is a popular song
in Hong Kong, but it is not workout music, so we would rate it as
'Unacceptable: Off-Topic'. Songs that are relevant to the genre
but unpopular are also rated Off-Topic.
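The "older than 1 year is not considered new" rule from the [play some new R&B] example can be expressed as a date comparison. The sketch below is illustrative only: the function name `is_new_release` and the rating date are assumptions, and because the guideline gives only the release year (2001) for Songs in a Minor, the latest possible date in that year is used as a stand-in.

```python
from datetime import date


def is_new_release(release_date: date, rating_date: date) -> bool:
    """True if the release is at most one year old on the rating date.

    Hypothetical helper; the one-year threshold comes from the guideline,
    everything else is an assumption for illustration.
    """
    try:
        cutoff = rating_date.replace(year=rating_date.year - 1)
    except ValueError:
        # Rating date is Feb 29 and the previous year has no Feb 29.
        cutoff = rating_date.replace(year=rating_date.year - 1, day=28)
    return release_date >= cutoff


# Assumed rating date; the album's exact release day is not needed because
# even the last day of 2001 is far more than one year earlier.
rating_day = date(2022, 6, 1)
album_is_new = is_new_release(date(2001, 12, 31), rating_day)  # False
```

A result that fails this check for a "new" query would be rated Unacceptable: Off-Topic, as in the example.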
Exception
For kids' rhymes and songs, the default Perfect version is the version with lyrics. For lullabies and sleep
music, the expected result is always music with a peaceful beat. Therefore, make sure you click ‘Preview’ and take a
few seconds to listen, because some songs do not state in their titles, in explicit and precise wording,
that they are (for example) a piano version or whether they are composed in your storefront’s mainstream locale, and so
may turn out to be less ideal than they first appear. For single song results for kids' rhymes and songs, we do not need to
determine the popularity of the content. As long as it is high quality and contains the tune of the kids'
rhyme, it can be rated Acceptable.
9. Podcast Queries
Perfect: The particular podcast that matches the query, the result that satisfies primary
intent.
• The intended podcast show
• The latest episode of the intended podcast
1. [play the latest episode of the daily] - The daily podcast
2. [play the latest BBC news] - Global News Podcast
3. [Play The Ben Shapiro Show] - The Ben Shapiro Show podcast
4. [play Wait Wait Don't Tell Me] - Wait Wait... Don't tell me podcast
5. [play the latest episode of Up First] - BONUS: American Shadows. The result returned is
the latest episode of “Up First”. Although the podcast title is not in the title of the result, it is
important to check the returned content’s podcast series.
Unacceptable (Off-Topic): A piece of content that is not related to the podcast.
• Content with no relevance to the query
• Keyword Matching
1. [play the planet Money] - Tengo Money song by Wildley
2. [play Alex Jones podcast] - Alex Jone podcast. We don't have the Alex Jones podcast in
our catalogue, so the returned result should be rated 'Unacceptable: Off-Topic'.
Examples:
play the latest episode of Up First → Although the podcast title is not in the title of the result, it is
important to check the returned content’s podcast series. “BONUS: American Shadows” is the latest
episode of “Up First” → Perfect
Shows, episodes, podcasts for ambiguous/multi-match queries → If the utterance contains
"show", "episode", "podcast" (or similar words), we should be returning either Beats 1/Apple Music 1
shows, episodes or podcasts. Research if the query is a podcast or Beats 1/Apple Music 1 show
before making a judgement. Otherwise, the result should return a song, album, or artist.
10. Broadcast Radio Queries
Perfect: The particular radio station implied by the query, the result that satisfies primary
intent.
• The intended local, satellite or internet radio station
1. [play 106.5] - 106.5 The Beat
2. [play bbc radio world service] - BBC World Service
3. [play fox sports radio] - Fox Sports Radio
4. [Play 98.5] - 98.5 Virgin Radio station. This is a query from the Canadian Market. The
specific radio station intended is unclear. The result is the 98.5 Virgin Radio station from
Calgary, Alberta, CA. We don’t know if the user is located in Montréal, or in Calgary (or
elsewhere). Therefore, this result is rated 'Perfect'.
Acceptable: A piece of content which satisfies the secondary intent of the radio query or is
related to the primary intent.
• Less popular radio stations
1. [play Radio Dunedin] - OAR FM Dunedin. The result for this query is a less popular radio
station with similar matching title. The intended station is Radio Dunedin.
2. [play Los 40] - Los 40 Rioja. The result for this query is a similar radio station from a
different country.
3. [play radio Nostalgie] - Nostalgie 80 radio. This is a themed digital stream of the intended
radio station, focussing on 80s music. The main radio station Nostalgie would be perfect.
Unacceptable (Off-Topic): A piece of content that is not related to the radio query, and/or has
little to no perceived relevance.
• Content with no relevance to the query.
1. [Hey Siri turn on GB radio] - Lullabies. The result for this query is an editorial radio station.
The intended station is GB News Radio station.
2. [play radio la z] - '00s Pop. The result for this query is an editorial radio station. The
intended station is La Z radio station.
Exception
Note: If we return a Podcast, or the latest Podcast Episode for a radio we don't have in our
catalogue; that Podcast, or Podcast Episode should be rated Perfect.
Example: play BBC Radio 4 → We return the Podcast Friday Night Comedy from BBC Radio 4 by
BBC Radio 4. This result should be rated 'Perfect' as this is the best result we can return knowing
that we don't have the BBC Radio 4 Station.
11. Apple Music Hosted Radio Queries
Perfect: The particular radio station implied by the query, the result that satisfies primary
intent. Note there are only three Apple hosted radio stations: Apple Music 1, Apple Music
Hits, and Apple Music Country.
• The intended Apple Music Hosted radio station
• For queries for Apple Music radio shows, the latest episode of the intended show is Perfect
1. [play Beats 1 radio] - Apple Music 1. The Apple hosted Apple Music 1 radio station,
previously called Beats 1, is returned.
2. [play Apple Music hits] - Apple Music Hits
3. [play boiler room] - Juliana Huxtable. The latest episode, titled Juliana Huxtable, of the
intended radio show (as of June 2022) is returned.
4. [play radio] - Apple Music 1. The Apple Music 1 station would be the expected result.
Acceptable: A piece of content which satisfies the secondary intent of the radio query or is
related to the primary intent.
• Older episodes from the intended station
1. [play after school radio] - Mark Finds a Gemstone. The result is an episode from the
intended radio station, but not the latest episode.
Unacceptable (Off-Topic): A piece of content that is not related to the radio query, and/or has
little to no perceived relevance.
• Content with no relevance to the query.
1. [play back porch country radio] - Back Porch Country playlist. A playlist is returned,
however the intended content is the Apple hosted radio show Back Porch Country Radio
with Nick Hoffman.
2. [play 80s radio with Huey Lewis] - Huey Lewis & The News. The result is the Huey Lewis
artist station, however the intended content is the Apple hosted radio show '80s Radio with
Huey Lewis.
3. [play radio] - The Radio. The result is the artist page for The Radio. The Apple Music 1
radio station should be returned.
When radio episodes are returned, it is not always clear from the episode title and thumbnail which
radio show the episode is from. In this case, navigate to the intended radio show page on Apple
Music and check if this episode is listed there. For example: The radio episode below is titled
“EPISODE 27” and the thumbnail doesn’t contain the radio show. On the page for the radio show
“Live from the Moon” on Apple Music, this episode is listed as the most recent episode and is thus
rated Perfect.
[Image: Radio Episode Example.png]
12. Editorial Radio Queries
Perfect: The particular radio station implied by the query, the result that satisfies primary
intent.
• The intended editorial radio station
1. [hey siri pure pop radio] - Pure Pop radio station
2. [play K-pop radio] - K-Pop radio station
3. [play Disney radio] - Disney radio station
4. [play Kids and Family radio] - Kids & Family radio station
Acceptable: A piece of content which satisfies the secondary intent of the radio query or is
related to the primary intent.
• Slightly specific relevant station
1. [Play pop radio] - '70s Pop radio. The '70s Pop radio station is too specific and could
satisfy a secondary intent. The perfect station would be Pure Pop.
Unacceptable (Off-Topic): A piece of content that is not related to the radio query, and/or has
little to no perceived relevance.
• Content with no relevance to the query.
1. [play country music radio] - Radio. The result for this query is the song Radio by Lana Del
Rey, however the intended content is Country Station.
2. [hey siri jazz radio] - Jauz. The result for this query is an artist page for Jauz, however the
intended content is Jazz Station.
13. Ambiguous - Multiple Classifications
A query where there are multiple primary intents under different classifications. Please refer to
the Popularity section below to determine the popularity of the content.
Perfect: A popular result with matching title/name that completes the query.
1. [play Bruno Mars 24K Magic] - The album AND the song are both equally popular. Both
the album and the song are rated Perfect.
2. [play evermore by Taylor Swift] - The album AND the song are both equally popular. Both
the album and the song are rated Perfect.
Acceptable: Less popular results that could satisfy a secondary intent.
1. [Play 4:44] - This is an example where the album 4:44 by Jay Z is more popular than the
eponymous song. The album is rated Perfect, while the song 4:44 by Jay Z is rated
Acceptable.
Unacceptable (Off-Topic): An unpopular piece of content that is not related to the query,
and/or has little to no perceived relevance.
1. [play tell me why] - Tell Me Why by Taylor Swift. The intent could be the lyrics of the song “I
Want It That Way” by Backstreet Boys or the song “Tell Me Why” by The Kid LAROI. The
result returned is an unpopular song by Taylor Swift that matches the title.
14. Ambiguous - Unclear Intent
A query where it is impossible to determine a single primary intent. Please refer to the
Popularity section below to determine the popularity of the content.
Perfect: A popular result which is very likely to satisfy a primary intent of the ambiguous
query. Only the most popular content will be eligible for this rating.
1. [play three] - 3 song by Britney Spears. The result returned is a popular song with
matching title.
2. [Play UK] - UK Rap playlist. Result returned is a high quality UK rap playlist.
Acceptable: Less popular results that could satisfy a secondary intent.
1. [Hey Siri play the] - THE song by Gaho & Jian. The result for this query is a less popular
song with matching title at the time of rating. Best rating for this result is Acceptable.
Unacceptable (Off-Topic): A piece of content that is not related to the query, and/or has little
to no perceived relevance.
1. [Hey Siri play for] - For song by C. Duncan. Result is an unpopular song with matching title
and therefore is Off-Topic. For this query, the intent could have also been for ‘4’ or ‘four’.
The album FOUR by One Direction or the album 4 by Beyoncé are both popular and
would have been Perfect results.
15. Local
Queries that are asking for music that is local to a user's device.
We are not able to evaluate the results individual users have in their personal device libraries so we
cannot return Local results. Because of this, any result that we do get for Local queries should be
rated as Problem: Other.
• Examples: [Hey Siri play my car playlist], [Hey Siri shuffle New playlist], [Hey Siri shuffle my
music playlist], [Hey Siri shuffle 🦧all the feels🦧]
16. Action Command Queries
This includes requests to add songs to the user’s library and other commands
Any result for Action Command queries should be rated as Problem: Other
• Examples: [Hey Siri add this song to library], [Hey Siri stop playing this song], [Tell me the
news], [Hey Siri show me Heat Waves lyrics]
17. Knowledge Based Queries
This includes queries where the user is asking knowledge based questions
Any result for Knowledge Based queries should be rated as Problem: Other
• Examples: [Who sings this song?], [When did this album come out?], [What artist is featured
in this song?]
Seasonal Results
In some locales there are seasonal results which have additional relevance and popularity at specific
times of year, for example Christmas songs are very popular in the US at the end of the year. Please
consider the seasonal popularity when rating results and consider increasing the rating due to
seasonal popularity. E.g. Jazz Christmas playlist should be rated higher for a query [jazz] around
Christmas time.
Determining Popularity
When researching popularity of returned content, please consider both popularity metrics (Average
Monthly Views Since Upload, views on YouTube, listeners on Spotify, local sources metrics)
and recency of the result to identify the intent.
Average Monthly Views Since Upload (AMVSU)
If there are 2 popular songs with matching title and relatively similar number of average views, both
songs can be eligible for Perfect. The metric we adopt is Average Monthly Views Since Upload
(AMVSU).
Example: [freedom] –> The query has a slightly ambiguous intent that can refer to Pharrell Williams'
or Jon Batiste's songs.
• Pharrell Williams' song has 127M views on YouTube over the last 7 years. AMVSU –>
127M ÷ 84 months ≈ 1.5M views per month since upload
• Jon Batiste's song has 14M views over the last year. AMVSU –> 14M ÷ 12 months ≈ 1.2M views per
month since upload.
Therefore, even though Pharrell Williams' song has more views overall, Jon Batiste's song can
also qualify for the rating Perfect, given its recency and monthly popularity.
Note that uploading date may differ from the actual launch date, kindly double check on Google for
accuracy.
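The AMVSU arithmetic in the [freedom] example can be reproduced with a short sketch. The function name `amvsu` and its parameters are assumptions for illustration only; the view counts and time spans are the ones quoted in the example. The guidelines do not give a numeric cut-off for "relatively similar", so the sketch only computes the two values and leaves the comparison to the rater.

```python
def amvsu(total_views: float, months_since_upload: float) -> float:
    """Average Monthly Views Since Upload: total views / months online.

    Hypothetical helper name; the formula follows the worked example.
    """
    if months_since_upload <= 0:
        raise ValueError("months_since_upload must be positive")
    return total_views / months_since_upload


# Figures from the [freedom] example.
pharrell = amvsu(127_000_000, 7 * 12)  # 7 years = 84 months, roughly 1.5M per month
batiste = amvsu(14_000_000, 12)        # 1 year = 12 months, roughly 1.2M per month

print(f"Pharrell Williams: {pharrell / 1e6:.1f}M views/month")
print(f"Jon Batiste: {batiste / 1e6:.1f}M views/month")
```

Remember that the upload date used for the month count may differ from the release date, so the months value should be checked before relying on the result.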
Using the Recency of a Song
There are cases where a song was released in the last couple of months and might not have as many
views as another older song with a similar title. There is a possibility that the more recent song is the
intent as it is currently charting and/or trending. For cases like this, the recent song and the
older song with more views can both get a rating of Perfect.
Example:
[Hey Siri play Let It Go] - For this case, the song Let It Go from the movie Frozen is the most popular
song with matching title with 2.5 billion views. This song would get a rating of Perfect.
The song Let It Go by DJ Khaled ft. Justin Bieber and 21 Savage is less popular with 20 million
views. However, the song is from a little over 1 month ago and was in the top charts. This is a
possible primary intent. Even though this song has significantly fewer views than Let It Go from the
movie Frozen, it is still eligible for Perfect (as of 2021).
The song Let It Go by James Bay is another popular song (368 million views) with matching title.
However, this song is from 6 years ago, and is less popular than the song from Frozen. We cannot
use recency in this case. The best rating for this song is Acceptable.
Using Google Trends to assist
To find out more about the recency of content, the suggested time ranges for a recency check are 90
days and 30 days. Adjust the range accordingly to help you identify when the hype starts. Usually it
coincides with the hype around a new album, tv drama, concert, or tidbits. Most interestingly, Google Trends
also allows you to explore not only Google’s search trends, but also YouTube search trends.
If a song partially completes a query, and the returned content is popular, should it be Acceptable or
Off-Topic?
For the example “Hey Siri play Who You Love”, Perfect could be Who You Love by John Mayer.
“Who Do You Love?” by YG (a Drake collaboration) would be considered Off-Topic.
It should be rated Off-Topic since this song is quite dated and there’s also another song (titled “Who
do You Love”) by the Chainsmokers and 5SOS which is more recent than the Drake collaboration.
Lexicon
Output Type
The Output Type is what's showing in brackets in BaseLine under the result. Examples: (Song),
(Album), (Siri Station), (Station), (Playlist), etc.