SHOUTcast XML Metadata Specification

From Winamp Developer Wiki


Introduction

This document describes the metadata which can be obtained as part of the SHOUTcast 2.0 system. It aims to be a complete list of what the server provides in its xml file, which is based on the metadata obtained from the media being played (read directly or guessed). Not all fields will be filled in; which ones are depends on the setup being used and on what the SHOUTcast source implements.


Specification Details

The following example xml output shows the metadata which could be returned, along with notes about certain fields, e.g. those which can appear multiple times or those which come from older mappings. Note that the layout is broadly similar to the ID3v2.3 tag, with some aspects of the ID3v2.4 tag.

Actual SHOUTcast 2 sources are not guaranteed to return most of the fields
shown, but provision is made for them to be sent if they are detected in the source media.


<?xml version="1.0" encoding="UTF-8" ?>
<metadata>
 
  <!-- ALBUM -->
  <TALB>The Blue Album</TALB>
 
  <!-- GENRE (Content type). The v1 value is the ID3V1 genre code.
                             The text is a genre refinement -->
  <TCON v1="24">Subgenre</TCON>
 
  <!-- Note that code or subgenre may be missing.
       There can be more than one of these present -->
  <TCON>Death Metal</TCON>
 
  <!-- The genre code can also be RX (remix) or CR (cover) -->
  <TCON v1="RX"></TCON>
 
  <!-- Song TITLE -->
  <TIT2>Back In the U.S.S.R.</TIT2>
 
  <!-- ARTIST. There can be more than one of these present -->
  <TPE1>The Beatles</TPE1>
 
  <!-- Recording time -->
  <TDRC>
    <!-- YEAR -->
    <year>2008</year>
    <month>12</month>
    <day>25</day>
    <hour>13</hour>
    <minute>24</minute>
    <!-- If this is missing then assume it is UTC -->
    <zone>Z</zone>
  </TDRC>
 
  <!-- COMMENT. There can be more than one of these present -->
  <COMM language="eng" id="whatever">
    This is a comment
  </COMM>
  .
  <COMM language="eng" id="more">
    This is another comment
  </COMM>
 
  <!-- Unique file identifier. The content is base64 encoded binary data -->
  <UFID id="http://www.neil.com">00000003f32f17</UFID>
 
  <!-- Beats per minute. Numeric -->
  <TBPM>60</TBPM>
 
  <!-- Composer. There can be more than one of these present -->
  <TCOM>Paul McCartney</TCOM>
  .
  <TCOM>John Lennon</TCOM>
 
  <!-- Copyright message -->
  <TCOP>2008 Neil Radisch</TCOP>
 
  <!-- Playlist delay (milliseconds between songs in playlist) -->
  <TDLY>45</TDLY>
 
  <!-- Identifier of the program providing the source audio -->
  <TENC>My SHOUTcast Source v3.L337</TENC>
 
  <!-- Provides name of the DJ connection if changed whilst streaming is active -->
  <!-- This is primarily for the Transcoder to pass on DJ information to the server -->
  <DJ>Super</DJ>
 
  <!-- A related page which a client can use to get or show additional information -->
  <!-- about the currently playing song as v1 clients were able to access previously -->
  <URL>http://www.aol.com/backinussr.html</URL>
 
  <!-- Lyricist and / or text writer. There can be more than one of these present -->
  <TEXT>Oscar Hammerstein</TEXT>
  .
  <TEXT>Lorenz Hart</TEXT>
 
  <!-- Content group description -->
  <TIT1>Concept Album</TIT1>
 
  <!-- Subtitle -->
  <TIT3>Those Ukraine Girls</TIT3>
 
  <!-- Musical key of song -->
  <TKEY>C#</TKEY>
 
  <!-- Language in audio described as 3 letter code based on the ISO-639-2 format.
       There can be more than one of these present -->
  <TLAN>eng</TLAN>
  .
  <TLAN>swe</TLAN>
 
  <!-- Length of the audio in milliseconds -->
  <TLEN>60000</TLEN>
 
  <!-- Information about the source media type -->
  <TMED>(VID/PAL) four channel</TMED>
 
  <!-- Original album, movie or show -->
  <TOAL>The White Album</TOAL>
 
  <!-- Original file name -->
  <TOFN>bit_ussr.mp3</TOFN>
 
  <!-- Original lyricists / text writers. There can be more than one of these present -->
  <TOLY>Jean-Paul Sartre</TOLY>
  .
  <TOLY>Ayn Rand</TOLY>
 
  <!-- Original artist. There can be more than one of these present -->
  <TOPE>Bruce Springsteen</TOPE>
  .
  <TOPE>E Street Band</TOPE>
 
  <!-- Original release time -->
  <TDOR>
    <year>2008</year>
    <month>12</month>
    <day>25</day>
    <hour>13</hour>
    <minute>24</minute>
    <!-- If this is missing then assume it is UTC -->
    <zone>Z</zone>
  </TDOR>
 
  <!-- Owner or licensee of the file -->
  <TOWN>Ascap</TOWN>
 
  <!-- Additional performer information -->
  <TPE2>Featuring Yoko Ono</TPE2>
 
  <!-- Name of conductor -->
  <TPE3>Zubin Mehta</TPE3>
 
  <!-- Who did the remix or other interpretation of the original -->
  <TPE4>Sonic cleanup corp</TPE4>
 
  <!-- Part number (2 of 4 discs) Note: total is optional -->
  <TPOS total="4">2</TPOS>
 
  <!-- Name of label or publisher -->
  <TPUB>Capitol Records</TPUB>
 
  <!-- Track number (e.g. 6 of 24 tracks). Note: total is optional -->
  <TRCK total="24">6</TRCK>
 
  <!-- Name of the internet radio station from where this is being broadcast -->
  <TRSN>Neils Radio Station</TRSN>
 
  <!-- Owner of the source internet radio station -->
  <TRSO>Neil Radisch</TRSO>
 
  <!-- Size of the audio in bytes (excluding ID3v2 tag) -->
  <TSIZ>3456627</TSIZ>
 
  <!-- International Standard Recording Code (ISRC) (12 characters) -->
  <TSRC>FRZ039800212</TSRC>
 
  <!-- Settings used for encoding -->
  <TSSE>Profile 16</TSSE>
 
  <!-- Custom text field. There can be more than one of these
       present but each must have a unique id attribute -->
  <TXXX id="contents of neils garden">
    Roses, grapes, raspberries, blackberries
  </TXXX>
  .
  <TXXX id="contents of peters garden">
    beans, kale, raspberries, blackberries
  </TXXX>
 
  <!-- URL with commercial information, e.g. where the item can be bought.
       There can be more than one of these present -->
  <WCOM>http://www.aol.com/getstuff.cgi</WCOM>
  .
  <WCOM>http://capitol.com/store.cgi</WCOM>
 
  <!-- Website with copyright and terms of use information -->
  <WCOP>http://www.aol.com/dontstealmusic.html</WCOP>
 
  <!-- Website specific to the audio track -->
  <WOAF>http://www.aol.com/backinussr.html</WOAF>
 
  <!-- Artist / performer webpages. There can be more than one of these present -->
  <WOAR>http://www.aol.com/paulmccartney.html</WOAR>
  .
  <WOAR>http://www.aol.com/johnlennon.html</WOAR>
 
  <!-- Audio source web page -->
  <WOAS>http://www.aol.com/beatles.html</WOAS>
 
  <!-- Radio station web page -->
  <WORS>http://shoutcast.com/search.cgi?Neil</WORS>
 
  <!-- Website that allows you to pay for the file -->
  <WPAY>http://get.com/pay.cgi?backinussr</WPAY>
 
  <!-- Publisher's web page -->
  <WPUB>http://ascap.com</WPUB>
 
  <!-- Custom url field. There can be more than one of these
       present but each must have a unique id attribute -->
  <WXXX id="backup singers">http://yoko_ono.com</WXXX>
  .
  <WXXX id="engineers">http://george_martin.com</WXXX>
 
  <!-- List of people involved in audio track.
       There can be more than one of these present -->
  <IPLS role="drummer">Ringo Starr</IPLS>
  .
  <IPLS role="janitor">John Smith</IPLS>
 
  <!-- Table of contents from the CD. The content is base64 encoded -->
  <MCDI>000000000......</MCDI>
 
  <!-- Lyrics. There can be more than one of these present -->
  <USLT language="eng" id="verse 1">
    yeah yeah yeah oh baby yeah yeah yeah
  </USLT>
  .
  <USLT language="eng" id="verse 2">
    oh oh oh baby baby baby
  </USLT>
 
  <!-- General binary encapsulated object.
       There can be more than one of these present -->
  <GEOB mime="application/octet" filename="foo.bar" id="whatever">02305310</GEOB>
  .
  <GEOB mime="virus/binary" filename="kill.computer" id="die kitty">02305310</GEOB>
 
  <!-- Play counter -->
  <PCNT>45678</PCNT>
 
  <!-- Popularimeter -->
  <POPM>
    <email>nradisch@panix.com</email>
    <rating>35</rating>
    <counter>5463767</counter>
  </POPM>
 
  <!-- Binary private data. The content is base64 encoded.
       There can be more than one of these present -->
  <PRIV id="huh">2342512370</PRIV>
  .
  <PRIV id="huh2">34343434</PRIV>
 
</metadata>
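
A client reading this file has to allow for fields which appear more than once (TPE1, TCOM, COMM and so on) and for fields which carry attributes (TCON, TRCK). The following is a minimal sketch in Python using the standard xml.etree.ElementTree module. The helper name collect_fields and the cut-down sample are illustrative assumptions and are not part of the SHOUTcast tools.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Cut-down sample based on the example above (illustrative only).
SAMPLE = """<?xml version="1.0" encoding="UTF-8" ?>
<metadata>
  <TIT2>Back In the U.S.S.R.</TIT2>
  <TPE1>The Beatles</TPE1>
  <TCON v1="24">Subgenre</TCON>
  <TCON v1="RX"></TCON>
  <TCOM>Paul McCartney</TCOM>
  <TCOM>John Lennon</TCOM>
  <TRCK total="24">6</TRCK>
</metadata>"""


def collect_fields(xml_text):
    """Group the children of <metadata> by tag name.

    Each field is stored as a list of (text, attributes) pairs so that
    fields which appear more than once are all kept.
    """
    root = ET.fromstring(xml_text.encode("utf-8"))
    fields = defaultdict(list)
    for child in root:
        text = (child.text or "").strip()
        fields[child.tag].append((text, dict(child.attrib)))
    return dict(fields)


fields = collect_fields(SAMPLE)
print(fields["TCOM"])  # both composers
print(fields["TCON"])  # both genre entries with their v1 attribute
```

Keeping every field as a list, even when only one value is expected, avoids special cases when a source sends a field more than once.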


Extended Specification Details


The main section of the xml file provides metadata from the source media, as shown in the previous section. In addition, there is an optional extension which provides title information for the media that follows the currently playing source media.

The extended section goes inside the <metadata/> block and is formatted as follows:

<extension>
  <!-- Title of the currently playing media -->
  <title seq="1">Artist - Title</title>
  <title seq="2">The Next Artist - The Next Title</title>
  .
  <!-- Titles go up to the maximum number of titles the source knows of -->
  <title seq="XX">The Last Artist - The Last Title</title>
 
  <!-- Title of the next item to be played -->
  <soon>The Next Artist - The Next Title</soon>
</extension>


There is no limit on how many titles can be sent, though there is little benefit in sending many as the SHOUTcast server is unlikely to use more than the current and the next titles. The means to provide more titles is there if it is needed.

Titles after seq=1 need to be in the form 'artist - title' where it is possible to format and provide them in this manner; otherwise just 'title' will suffice.

Where possible, the title for seq=1 should also be provided as 'artist - title', though the DNAS will typically ignore it and generate the title shown to clients from the actual metadata passed. However, it may fall back to this title if there is an issue with the metadata provided, so ideally it should be the fully formed version, as for all the other titles (as mentioned previously).

The <soon/> block must correctly report the title of the next item to be played. If that title is not known or cannot be reliably obtained, the block must be left out of the xml, e.g.

With older versions of Winamp (prior to v5.61) in shuffle mode, there is no way to query the next item to be played, so <soon/> would not be set.
With any Winamp version not in shuffle mode, the next item can be calculated, so <soon/> can be set.
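
The rules above can be sketched on the reading side as follows. This is a minimal Python sketch which assumes the <extension> block is a direct child of <metadata>; the function name read_extension and the sample are hypothetical. Titles are ordered by their seq value and a missing <soon/> is reported as None rather than guessed.

```python
import xml.etree.ElementTree as ET

# Illustrative sample; the placement of <extension> is an assumption.
SAMPLE = """<metadata>
  <TIT2>Title</TIT2>
  <extension>
    <title seq="2">The Next Artist - The Next Title</title>
    <title seq="1">Artist - Title</title>
    <soon>The Next Artist - The Next Title</soon>
  </extension>
</metadata>"""


def read_extension(xml_text):
    """Return (titles ordered by seq, soon title or None)."""
    root = ET.fromstring(xml_text)
    ext = root.find("extension")
    if ext is None:
        return [], None  # the extension is optional
    numbered = []
    for node in ext.findall("title"):
        seq = node.get("seq", "")
        if seq.isdigit():  # ignore entries without a numeric seq
            numbered.append((int(seq), (node.text or "").strip()))
    numbered.sort()
    soon = ext.findtext("soon")
    return [title for _, title in numbered], (soon.strip() if soon else None)


titles, soon = read_extension(SAMPLE)
print(titles)
print(soon)
```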


Additional Notes


The TDRC (recording time) field replaces the following date fields from the ID3v2.3 tag if they are found: TDAT, TYER and TIME. They are combined to form 'yyyy-MM-ddTHH:mm:ss'. Additionally, if no TYER is found but TDRC is, the TYER field will be generated from the TDRC field for backwards compatibility.

The TDOR (original release time) field will be created from the TORY (original release year) field if TORY is read from the file; no other mapping of these fields happens.
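
As an illustration of the TDRC mapping, the sketch below combines the three ID3v2.3 values into the 'yyyy-MM-ddTHH:mm:ss' form. It relies on the ID3v2.3 layouts of TYER (yyyy), TDAT (DDMM) and TIME (HHMM); the function names are hypothetical, and how the SHOUTcast tools handle missing or malformed parts is not covered here.

```python
from datetime import datetime


def tdrc_from_id3v23(tyer, tdat, time):
    """Combine TYER ("yyyy"), TDAT ("DDMM") and TIME ("HHMM") strings."""
    stamp = datetime(
        int(tyer),
        int(tdat[2:4]),  # month
        int(tdat[0:2]),  # day
        int(time[0:2]),  # hour
        int(time[2:4]),  # minute
    )
    # ID3v2.3 carries no seconds, so they are emitted as 00.
    return stamp.isoformat()


def tyer_from_tdrc(tdrc):
    """Derive a backwards-compatible TYER value from a TDRC timestamp."""
    return tdrc[:4]


print(tdrc_from_id3v23("2008", "2512", "1324"))
print(tyer_from_tdrc("2008-12-25T13:24:00"))
```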


Suggested Fields To Support


If an xml file can be created, the minimum which should be provided is the TIT2 entry, as this is the most important information used throughout the SHOUTcast 2.0 system, especially when working with legacy client connections. The following example xml shows this minimum, which is equivalent to the typical v1 style of metadata formatted as 'artist - title':

<?xml version="1.0" encoding="UTF-8" ?>
<metadata>
  <TIT2>Song Artist - Song Title</TIT2>
</metadata>


The metadata fields typically expected from sources are as follows, though they are not guaranteed as the information may not be available from the source media:

Tag Name   Description
TIT2       Title
TALB       Album
TPE1       Artist
TYER       Year
COMM       Comment
TCON       Genre
TRSN       Stream Title
WORS       Station Website
TENC       Identifier for the Source e.g. SHOUTcast Source DSP Plug-in v2.1.3 042


Note that this is similar to the information available from an ID3v1 tag, with some stream-related additions. It is only a recommendation, one which gives the client greater flexibility in handling the stream metadata. If the full list cannot be supported, the fields supported should ideally be at least TIT2, TCON, TRSN, WORS and TENC.


The following example shows a complete xml metadata response with the suggested fields (excluding the Extended Specification Details):

<?xml version="1.0" encoding="UTF-8" ?>
<metadata>
  <TIT2>I Was Made For Lovin&apos; You</TIT2>
  <TALB>The Very Best Of KISS</TALB>
  <TPE1>Kiss</TPE1>
  <TYER>2002</TYER>
  <COMM></COMM>
  <TCON>Hard Rock</TCON>
  <TRSN>My Radio Station</TRSN>
  <WORS>http://www.shoutcast.com</WORS>
  <TENC>SHOUTcast Source DSP v2.3.1.182</TENC>
</metadata>
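
A source could generate such a response along the following lines. This is a minimal Python sketch using xml.etree.ElementTree; the function name build_metadata and the TENC value are made up for illustration. Letting the library serialise the text means characters such as & and < are escaped automatically.

```python
import xml.etree.ElementTree as ET


def build_metadata(fields):
    """Serialise a mapping of tag name -> text into a <metadata> document."""
    root = ET.Element("metadata")
    for tag, value in fields.items():
        ET.SubElement(root, tag).text = value
    body = ET.tostring(root, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8" ?>\n' + body


xml_text = build_metadata({
    "TIT2": "I Was Made For Lovin' You",
    "TPE1": "Kiss",
    "TCON": "Hard Rock",
    "TRSN": "My Radio Station",
    "WORS": "http://www.shoutcast.com",
    "TENC": "Example Source v0.0",  # hypothetical source identifier
})
print(xml_text)
```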


General Comments

Anything reading the xml file should not fail if it contains tags which are not listed in this document; such extra tags should simply be ignored. Some of the tags not considered in this version are:

Tag Name   Description
MLLT       MPEG audio lookup tables for seeking
SYTC       Synchronized tempo codes (table of tempo changes in music and how)
SYLT       Synchronized lyrics
RVAD       Relative volume adjustment
EQUA       Equalization
RVRB       Reverb
RBUF       Recommended buffer size
AENC       Audio encryption
LINK       Linked ID3v2 data
OWNE       Date of purchase
COMR       Commercial purchase offers
ENCR       Encryption method registration
GRID       Group identification registration
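
The "ignore what you do not recognise" rule can be sketched as below. The set of known tags is a hypothetical subset chosen for the example, and read_known is an illustrative name only.

```python
import xml.etree.ElementTree as ET

# Hypothetical subset of tags this client understands.
KNOWN_TAGS = {"TIT2", "TALB", "TPE1", "TCON", "TRSN", "WORS", "TENC"}


def read_known(xml_text):
    """Collect the text of known tags and silently skip everything else."""
    root = ET.fromstring(xml_text)
    known = {}
    for child in root:
        if child.tag not in KNOWN_TAGS:
            continue  # unknown or unsupported tag: skip it, do not fail
        known.setdefault(child.tag, []).append((child.text or "").strip())
    return known


print(read_known("<metadata><TIT2>Song</TIT2><RVRB>x</RVRB><FOO/></metadata>"))
```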


GEOB - General Binary Glob


Properties:

   mime - mime type
   filename - associated filename
   id - text identifier

Data is base64 encoded.
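
Reading a GEOB entry then comes down to picking up the three properties and decoding the content. The following Python sketch assumes the properties are xml attributes, as in the example xml; the function name read_geob and the payload are made up for illustration.

```python
import base64
import xml.etree.ElementTree as ET


def read_geob(element):
    """Return (mime, filename, id, raw bytes) for one GEOB element."""
    payload = "".join((element.text or "").split())  # drop any whitespace
    data = base64.b64decode(payload)
    return element.get("mime"), element.get("filename"), element.get("id"), data


# "aGVsbG8=" is the base64 form of b"hello" (made-up payload).
node = ET.fromstring(
    '<GEOB mime="application/octet-stream" filename="foo.bar" id="whatever">aGVsbG8=</GEOB>'
)
print(read_geob(node))
```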


APIC - Picture Data


IMPORTANT NOTE: Support for APIC in the xml file was deprecated as of March 2011. Picture data is instead provided as an in-stream packet of its own.


ETCO - Event Code Field


The ETCO tag has a format property where the supported values are:

   0 - absolute time in MPEG frames
   1 - absolute time in milliseconds

The ETCO tag can have one or more event sub-tags. The type property of an event tag is the type of event (see the ID3v2 documentation for the list of codes). The time property is the time at which the event occurs, in the units indicated by the format property of the enclosing ETCO tag.
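
This document gives no sample of an ETCO entry, so the layout used below is a guess: format as an attribute of ETCO, and type and time as attributes of each event sub-tag. Under that assumption, a reader could be sketched in Python as follows (read_etco is a hypothetical name, and the event type codes are passed through without interpretation).

```python
import xml.etree.ElementTree as ET

# Units named in this document for the format property.
TIME_UNITS = {"0": "MPEG frames", "1": "milliseconds"}

# Assumed layout: element and attribute spelling is not confirmed.
SAMPLE = '<ETCO format="1"><event type="2" time="0"/><event type="3" time="15000"/></ETCO>'


def read_etco(element):
    """Return a list of (type code, time, unit) for each event sub-tag."""
    unit = TIME_UNITS.get(element.get("format"))
    if unit is None:
        raise ValueError("unsupported ETCO format value")
    return [
        (event.get("type"), int(event.get("time")), unit)
        for event in element.findall("event")
    ]


print(read_etco(ET.fromstring(SAMPLE)))
```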


TCON - Genre Field


TCON has a complicated internal format consisting of a series of optional genre codes in parentheses, each optionally followed by a subgenre clarification string (everything is optional) e.g.

   (24)Death Metal
   (12)(24)Cuban
   (15)
   Die Kitty Die
   (24)Death Metal(12)Cuban

There are also two special 'codes' where 'RX' means remix and 'CR' means cover.

   (24)(RX)DeathMetal
   (CR)Whatever

Because genre matters to the SHOUTcast system, the field is parsed into separate TCON entries as indicated in the example xml.
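
One way to split a raw TCON string into (code, subgenre) pairs is sketched below in Python. This is not the parser used by the SHOUTcast tools; the name parse_tcon is hypothetical, and the ID3v2.3 '((' escape for a literal parenthesis is not handled.

```python
import re

# A genre code is a number, or one of the special codes RX / CR.
_CODE = re.compile(r"\((\d+|RX|CR)\)")


def parse_tcon(raw):
    """Split a raw TCON string into a list of (code or None, subgenre text)."""
    entries = []
    pos = 0
    while pos < len(raw):
        match = _CODE.match(raw, pos)
        code = match.group(1) if match else None
        if match:
            pos = match.end()
        # The subgenre text runs up to the next code (or the end).
        following = _CODE.search(raw, pos)
        end = following.start() if following else len(raw)
        text = raw[pos:end].strip()
        if code is not None or text:
            entries.append((code, text))
        pos = end
    return entries


for sample in ("(24)Death Metal", "(12)(24)Cuban", "(15)", "Die Kitty Die", "(CR)Whatever"):
    print(sample, "->", parse_tcon(sample))
```

Each pair corresponds to one TCON entry in the xml, with the code going into the v1 attribute and the text into the content.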


TMED - Media Type Field


The TMED field has a somewhat complicated internal format in that it can be just a string or it can be a media reference from a predefined list with a refinement e.g.

 From my album collection
 (VID)
 (VID)Stereo
 (TT/45)From my old 45 collection

Parsing this field is currently less important than parsing the genre (TCON) field, but it is provided for a more complete set of information.
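
Should a client want to separate the media reference from the refinement, a small Python sketch such as the following would cover the forms listed above. The name parse_tmed is hypothetical and the media reference is not checked against the predefined ID3v2 list.

```python
import re

# An optional leading "(reference)" followed by free text.
_TMED = re.compile(r"^\(([^()]+)\)(.*)$", re.DOTALL)


def parse_tmed(raw):
    """Return (media reference or None, refinement text)."""
    value = raw.strip()
    match = _TMED.match(value)
    if match:
        return match.group(1), match.group(2).strip()
    return None, value


for sample in ("From my album collection", "(VID)", "(VID)Stereo",
               "(TT/45)From my old 45 collection"):
    print(sample, "->", parse_tmed(sample))
```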


Non-MP3 Field Mapping

This section covers the mapping of metadata for files other than MP3, as supported by the SHOUTcast 2.0 tools.


AAC


If an ID3v2 tag is found then handling will follow the standard MP3 handling. If there is no tag then metadata guessing will be used as appropriate.


FLAC


If there are any Vorbis comments found in the FLAC file then the following mappings will be used to get an equivalent complement of metadata to match what is read from an ID3v2 tag:

Vorbis Comment      ID3v2 Entry   Vorbis Comment      ID3v2 Entry
TITLE               TIT2          VERSION             TPE4
ALBUM               TALB          TRACKNUMBER         TRCK
TRACK               TRCK          TOTALTRACKS         TRCK
ARTIST              TPE1          PERFORMER           TPE2
COPYRIGHT           TCOP          LICENSE             TOWN
ORGANIZATION        TPUB          ORGANISATION        TPUB
GENRE               TCON          DATE                TDRC
ISRC                TSRC          ALBUMARTIST         TPE2
ALBUM ARTIST        TPE2          COMMENT             COMM
COMPOSER            TCOM          PUBLISHER           TPUB
DISCNUMBER          TPOS          DISKNUMBER          TPOS
DISC                TPOS          DISK                TPOS
TOTALDISKS          TPOS          TOTALDISCS          TPOS
BAND                TPE2          LYRICS              USLT
CONDUCTOR           TPE3          ENCODING SETTINGS   TSSE
ENCODER SETTINGS    TSSE          ENCODERSETTINGS     TSSE
BPM                 TBPM          RATING              POPM
COVERARTMIME        APIC          COVERART            APIC


Fields which are not in this list are mapped to the custom text (TXXX) field, with the Vorbis comment key used as the "description". Any picture metadata in the file will be mapped to the APIC field, which is then transmitted in its own in-stream packet instead of in the xml.
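
The lookup with its TXXX fallback can be sketched as below. Only an excerpt of the mapping is included, the name map_vorbis_comment is hypothetical, and placing the comment key in the id attribute of TXXX is an assumption based on the example xml. Vorbis comment names are case-insensitive, so the key is upper-cased before the lookup.

```python
# Excerpt of the Vorbis comment -> ID3v2 mapping listed in this section.
VORBIS_TO_ID3 = {
    "TITLE": "TIT2", "ALBUM": "TALB", "ARTIST": "TPE1", "GENRE": "TCON",
    "DATE": "TDRC", "TRACKNUMBER": "TRCK", "DISCNUMBER": "TPOS",
    "COMPOSER": "TCOM", "COMMENT": "COMM", "ISRC": "TSRC",
    "ALBUMARTIST": "TPE2", "ALBUM ARTIST": "TPE2",
}


def map_vorbis_comment(name, value):
    """Return (xml tag, attributes, value) for one Vorbis comment."""
    tag = VORBIS_TO_ID3.get(name.strip().upper())
    if tag is None:
        # Unlisted keys fall back to the custom text field.
        return "TXXX", {"id": name}, value
    return tag, {}, value


print(map_vorbis_comment("title", "Back In the U.S.S.R."))
print(map_vorbis_comment("MOOD", "calm"))
```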

OGG


OGG files are handled in a similar manner to FLAC files, though with some differences. As there is no picture metadata in OGG files, the COVERARTMIME and COVERART fields are mapped to the APIC field, since a number of programs add artwork to OGG files in this way.