## Primer
The average Subvert listener probably isn't concerned with metadata from a technical standpoint, but likely _does_ want to reliably filter their music library by artist or album release date. Metadata in an abstract sense isn't itself valuable, but having clean, structured data will let you build features that are useful for fans and enable music clients to give them the best possible listening experience. Specifically, because Subvert isn't a streaming platform, metadata is the _only_ part of the listening process Subvert actually has the power to influence! The rest is up to the artist.
### MusicBrainz, Bandcamp, and Subvert: A Metadata Comparison
The MusicBrainz database is a CC0 licensed open-data project that aims to be a full collection of humanity's published musical works. You can think of it like IMDb for music, except that instead of being owned by Amazon, MusicBrainz is stewarded by a non-profit organization, and all the data can be downloaded by anybody instead of being locked in IMDb's website and served with a bunch of ads. MusicBrainz first launched in July of 2000, which, alongside its non-profit status and open data licenses, makes it both a stable architectural choice and a rich data source with over 25 years of user-generated history to draw from. [A full list of applications that interface with MusicBrainz can be found on their website.](https://musicbrainz.org/doc/MusicBrainz_Enabled_Applications) If you've ever ripped a CD without using iTunes or played music with VLC, there's a decent chance that some of your files have metadata or cover art because of MusicBrainz!
Enthusiasts use MusicBrainz or Discogs to add ID3 metadata tags to their downloaded library of songs, as this often enables better search and filtering than the metadata that comes with downloads from digital storefronts. Shown here is a metadata comparison of Vulfpeck's _Half of the Way_ as tagged with data from MusicBrainz, compared against what most fans will ever have: the files downloaded directly from Bandcamp and Subvert.
| **Field** | **MusicBrainz Metadata** | **Bandcamp File Metadata** | **Subvert File Metadata** |
| ------------------------------ | ------------------------------------ | -------------------------------------------------------------------- | ------------------------- |
| **Title** | Half of the Way | Half of the Way (feat. Theo Katzman) | “Half of the Way” |
| **Date** | 2018-12-07 | 2018 | |
| **Artist** | Vulfpeck feat. Theo Katzman | Vulfpeck | “Vulfpeck” |
| **Track Number** | 1 | 1 | |
| **Albumartist** | Vulfpeck | Vulfpeck | |
| **Album** | Hill Climber | Hill Climber | |
| **Comment** | N/A | Visit [https://vulfpeck.bandcamp.com](https://vulfpeck.bandcamp.com) | |
| **Encoder** | N/A | | Lavf59.27.100 |
| **Description** | N/A | | subvert.fm |
| **Originaldate** | 2018-12-07 | | |
| **Originalyear** | 2018 | | |
| **Releasetype** | album | | |
| **Albumartistsort** | Vulfpeck | | |
| **Script** | Latn | | |
| **Label** | Vulf Records | | |
| **Releasecountry** | XW | | |
| **Engineer** | Ryan Lerman | | |
| **Mixer** | Jack Stratton | | |
| **Producer** | Jack Stratton | | |
| **Releasestatus** | official | | |
| **Totaldiscs** | 1 | | |
| **Discnumber** | 1 | | |
| **Media** | Digital Media | | |
| **Totaltracks** | 10 | | |
| **ISRC Number** | TCADW1815705 | | |
| **Artistsort** | Vulfpeck feat. Katzman, Theo | | |
| **Artists** | Theo Katzman | | |
| **Tracktotal** | 10 | | |
| **Disctotal** | 1 | | |
These tags notably only include human-readable names as defined by the ID3 spec, even though MusicBrainz' database is entirely UUID-based. The chance that an individual user's library will run into a name collision is reasonably low, but it can happen. Unfortunately, there is no widely accepted tagging method that uses both human- and machine-readable fields for all names.
These tags were extracted with [ExifTool, a metadata viewer with solid ID3 tag support](https://exiftool.org/TagNames/ID3.html). Tags unrelated to music metadata have been removed for clarity. The Subvert export was created by uploading an MP3 file with no attached metadata, entering all available fields in the upload form, and downloading the resulting file.
## Subvert's Current Approach
As part of a release's upload flow, Subvert currently allows artists to enter the following:
- Album credits (arbitrary key-value pairs)
- Track credits (arbitrary key-value pairs)
- Label association (arbitrary string)
- Label catalog number (string)
- Album title
- Track title
- Track number
- Unsynced lyrics
- Key (dropdown)
- BPM (currently a string, should be an integer)
- License (dropdown, missing CC0)
- ISRCs
- Genres (global folksonomy of genres! cool!)
- External links
Regarding data collection, this is already an improvement over other digital storefronts, but has some big implementation pitfalls:
1. Metadata tagging is fractured across two steps of the upload flow.
   - Instrument / studio musician credits are often per-track _or_ just listed for the full album. These are currently inherited by each track in a separate track editing flow. They could instead be combined into a single credit with the option to specify track numbers, rather than being entered on two separate screens.
   - This appears to have been implemented due to the separate upload flows for singles and albums. These should be reconsidered, and all options should be incorporated into a single upload flow for creating a "release" rather than a "single" or an "album". Let users choose the release type (as they do currently) or determine it for them if they only upload one track.
2. "Label" values are stored only with human-readable names.
   - Given that labels are an important part of Subvert's future platform features, it is important to discourage duplication. We want to be able to authoritatively link to labels!
3. Credit metadata is not structured and therefore cannot be written to files according to the tagging specs expected by music players.
   - The ID3 tagging spec commonly used for music is _quite rigid_. Subvert needs to know what a credit represents in order to assign it to the correct tag that music players will expect to see.
   - Artists will spend a good deal of time entering this, so it should make its way into the end product!
## Recommendations
I believe Subvert's primary metadata-related goals should be as follows:
1. Provide customers with industry leading metadata for downloaded tracks
2. Empower artists to accurately credit collaborators
3. Enable customers to search for and find the music they love
In my view, the purchase flow for listeners and the upload flow for artists are the key launch user experiences that will be integral to Subvert's success. Metadata entry is of course part of the latter; making the form easy to use for artists will translate into an improved end product for fans: a properly tagged, high-quality music file.
To enable Subvert, both at launch and in the future, to follow through on these desires, I recommend the following changes:
### Audio File Tagging
Different file formats have different tagging formats. To deal with this unfortunate reality, MusicBrainz uses internal names for their tagging system and [maps them to each different tagging spec](https://picard-docs.musicbrainz.org/en/appendices/tag_mapping.html). The lists below also use MusicBrainz' internal names, and I would recommend you adopt the same spec. 20+ years of thinking have been put into this already.
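As a rough sketch of what "internal names mapped to each tagging spec" could look like in code, the snippet below translates a handful of internal names into ID3v2.4 frame IDs and Vorbis comment names. Python is used purely for illustration and is not a statement about Subvert's stack. The `TAG_MAP` table and the `to_format_tags` helper are hypothetical names, and the table is only a small subset of the linked Picard mapping, which should be treated as the authority.

```python
# Illustrative subset of the Picard tag mapping: internal name -> per-format tag.
# Verify every entry against the linked mapping table before relying on it.
TAG_MAP = {
    "album":       {"id3v24": "TALB", "vorbis": "ALBUM"},
    "albumartist": {"id3v24": "TPE2", "vorbis": "ALBUMARTIST"},
    "title":       {"id3v24": "TIT2", "vorbis": "TITLE"},
    "artist":      {"id3v24": "TPE1", "vorbis": "ARTIST"},
    "date":        {"id3v24": "TDRC", "vorbis": "DATE"},
    "isrc":        {"id3v24": "TSRC", "vorbis": "ISRC"},
    "genre":       {"id3v24": "TCON", "vorbis": "GENRE"},
}


def to_format_tags(internal_tags: dict[str, str], fmt: str) -> dict[str, str]:
    """Translate internally named tags into one format's tag names.

    Tags with no known mapping are skipped rather than guessed at.
    """
    out = {}
    for name, value in internal_tags.items():
        mapping = TAG_MAP.get(name)
        if mapping and fmt in mapping:
            out[mapping[fmt]] = value
    return out


release = {"album": "Hill Climber", "albumartist": "Vulfpeck", "date": "2018-12-07"}
mp3_tags = to_format_tags(release, "id3v24")   # keys: TALB, TPE2, TDRC
flac_tags = to_format_tags(release, "vorbis")  # keys: ALBUM, ALBUMARTIST, DATE
```

Keeping the database in internal names and translating only at write time means a new download format needs a new column in the table rather than a new schema.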
"Mandatory" tags are ones that everyone would rightfully expect to be added or are otherwise "free" requiring no additional input from the user. Some of them may be editable in the form.
"Optional" tags should be available in the upload form as non-mandatory fields.
#### Album Level: Mandatory
These tags should be written to every file.
- `album` Album title (e.g.: "The White Album")
- `albumartist` Album artist name (e.g.: “The Beatles”)
- `compilation` Boolean value automatically assigned TRUE if the album artist name is "Various Artists" otherwise set by the user
- `copyright` Set to "YYYY ARTISTNAME", year should be set to `originaldate` year value. This tag should be omitted if CC0 is the selected license.
- `license` License information
  - Based on the license selected, the ID3 tag may include an optional `WCOP` metadata entry for a license URL. This can be used to add CC licenses, for example.
- `date` YYYY-MM-DD date that the release was uploaded or otherwise released _on Subvert_
- `originaldate` YYYY-MM-DD original date of the release
  - Make the user enter a release date in the upload form. If it's in the past, set `date` to the date of publishing on Subvert and `originaldate` to the value the user entered
  - If there is no difference, set both `date` and `originaldate` to the same value
- `encodedby` Set as "Subvert Co-Operative"
- `encodersettings` String of the settings used to encode the file
- `website` Set as the artist's official website as it appears on their profile, otherwise set as the artist's Subvert URL
- `totaltracks` Set to the total number of tracks on the release
- `media` Set as "Digital Media"
- `releasetype` Dropdown select: "single, ep, album, other"
  - Set as "single" and hide from the upload form if only one track is uploaded
  - Some artists release remixes that include the original track as part of the download. I'd still count this as a "single", but because the download has more than one track we _can't_ simply hide the option when more than one track is uploaded
  - Set this to "ep" by default in the upload form if the total album length is less than or equal to 15 minutes
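The derived album-level values above (`date` / `originaldate`, the `releasetype` default, and `copyright`) can be sketched as three small functions. This is a minimal illustration under stated assumptions, not Subvert's implementation: the function names are hypothetical, the `"CC0"` license identifier is a placeholder, future-dated releases are assumed to use the entered date for both fields, and `"album"` is assumed as the fallback default since only the "single" and "ep" defaults are specified above.

```python
from datetime import date
from typing import Optional


def resolve_dates(entered: date, published: date) -> dict[str, str]:
    """Return the `date` / `originaldate` pair for a release.

    entered:   release date typed into the upload form
    published: day the release goes live on Subvert
    """
    if entered < published:
        # Back-catalogue release: keep the historical date as originaldate.
        return {"date": published.isoformat(), "originaldate": entered.isoformat()}
    # Assumption: same-day and future-dated releases use the entered date for both.
    return {"date": entered.isoformat(), "originaldate": entered.isoformat()}


def default_release_type(track_seconds: list[int]) -> str:
    """Suggest a releasetype; artists can still override it for multi-track uploads."""
    if len(track_seconds) == 1:
        return "single"
    if sum(track_seconds) <= 15 * 60:
        return "ep"
    return "album"  # assumption: fallback default is not specified in the list above


def copyright_tag(original_date: str, artist: str, license_id: str) -> Optional[str]:
    """Build "YYYY ARTISTNAME", or None when the tag should be omitted (CC0)."""
    if license_id == "CC0":  # placeholder identifier for the CC0 dropdown option
        return None
    return f"{original_date[:4]} {artist}"
```

For a 2018 album uploaded in 2025, `resolve_dates(date(2018, 12, 7), date(2025, 1, 15))` keeps 2018-12-07 as `originaldate` and uses the upload day as `date`.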
#### Album Level: Optional (artist entered)
These tags should be written to every file if available.
- `barcode` Release barcode
- `language` Language of album title and track titles
- `label` Label name
- `catalognumber` Number assigned to the release by a label
- `genre` Folksonomy! Keep a database of all values, and allow the user to select one as they type into a field or make up something new!
  - This database should be seeded with the [ID3v1 genre list](https://id3.org/id3v2.3.0#Appendix_A_-_Genre_List_from_ID3v1) with Winamp extensions.
  - While ID3v1 stored the genre as a one-byte numerical value, ID3v2.4 allows for _multiple_ arbitrary genre values to be written to the `TCON` tag! This is what we'll be following.
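A genre folksonomy mostly comes down to matching what the artist typed against the values already in the database, and adding anything new. Below is a minimal in-memory sketch; `resolve_genres` and the `known` dictionary are hypothetical stand-ins for a real database table, and matching on case and whitespace only is an assumption.

```python
def resolve_genres(entered: list[str], known: dict[str, str]) -> list[str]:
    """Map typed genres onto the shared list, adding unseen ones.

    `known` maps a case-folded lookup key to the canonical spelling.
    Returns canonical names in entry order, without duplicates.
    """
    result = []
    for raw in entered:
        name = " ".join(raw.split())  # collapse stray whitespace
        if not name:
            continue
        key = name.casefold()
        canonical = known.setdefault(key, name)  # first spelling seen wins
        if canonical not in result:
            result.append(canonical)
    return result


known_genres = {"funk": "Funk", "soul": "Soul"}  # would be seeded from the ID3v1 list
genres = resolve_genres(["funk", "  Minimalist  funk", "FUNK"], known_genres)
```

Each entry in the returned list would then be written as its own genre value by whichever tagging library is used.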
#### Per-Track Metadata: Mandatory
These tags should be written on a per-track basis.
- `title` Track title
- `titlesort` Track title's sort name
- `tracknumber` Set to the track's number as ordered by the user
- `artist` Track artist name(s) separated with commas, ampersands, or "feat." depending on user selection
- `artists` Duplicate of the `artist` data
- `movementtotal` Added automatically by Subvert if `movementnumber` is entered, based on the `work` title(s)
- `showmovement` Set to TRUE by Subvert if `movement` has data entered into it.
  - Some players will show this instead of the title if checked on. Presumably if the user has entered both, they know what they're doing here. This is only used for classical music.
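Two of the per-track values above are derived rather than typed in directly: the display `artist` string and the movement fields. The sketch below shows one way to compute them. Both function names are hypothetical, and counting the uploaded tracks that share a `work` title is only my reading of how `movementtotal` would be determined.

```python
from collections import Counter


def join_artist_credit(parts: list[tuple[str, str]]) -> str:
    """Build the display `artist` string from (name, join phrase) pairs.

    The join phrase is whatever follows the name: ", ", " & ", " feat. " or "".
    """
    return "".join(name + join for name, join in parts)


def add_movement_totals(tracks: list[dict]) -> None:
    """Fill in `movementtotal` and `showmovement` in place for classical tracks."""
    # Assumption: the total is the number of uploaded tracks sharing a work title.
    per_work = Counter(
        t["work"] for t in tracks if t.get("movementnumber") and t.get("work")
    )
    for t in tracks:
        if t.get("movementnumber") and t.get("work"):
            t["movementtotal"] = per_work[t["work"]]
        if t.get("movement"):
            t["showmovement"] = True


artist = join_artist_credit([("Vulfpeck", " feat. "), ("Theo Katzman", "")])
```

Storing the join phrase per name, rather than one separator per track, lets a single credit mix separators such as "A & B feat. C".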
#### Per-Track Metadata: Optional (artist entered)
These tags should be written on a per-track basis if available.
- `bpm` Beats per minute value.
- `key` Key of the music
- `isrc` International Standard Recording Code
- `language` Work lyric language as per ISO 639-3
- `comment` Allows the artist to enter per-track comment details!
- `movement` Name of the movement (e.g.: “Andante con moto”)
- `movementnumber` Movement number in Arabic numerals (e.g.: “2”)
- `work` Used for classical music, name of the overall work (e.g.: “Symphony no. 5 in C minor, op. 67”).
#### Credits
These tags should be written on a per-track basis if entered by the artist. While MusicBrainz supports _many_ other artist relationships, these are the only ones that make it into file metadata and therefore the most important ones for launch.
- `arranger`
- `composer`
- `conductor`
- `djmixer`
- `engineer`
- `lyricist`
- `mixer`
- `performer` Key value pair, "Artist : Instrument / VocalType"
- `producer`
- `writer`
---
### Album & Track Credits
Subvert should implement MusicBrainz' album credit relationships spec for keys, but leave artist name values open and not tied to unique identifiers unless a Subvert user ID is optionally linked. This scope ensures that artists on the platform can be linked to properly from credit fields and that their contributions can be organized for their profile pages, but it does not prevent name collisions when searching. If off-platform credited artists were to be unique (as in MusicBrainz), you would need to implement extensive merging, voting, and history tools to allow the community to keep the database accurate should a credit be assigned to the wrong artist or multiple artist entries be created in error. Limiting unique linked data to on-platform linked credits keeps management of credits with the album owner and (ideally) the linked artist.
This data structure for credits could be implemented as follows:
```jsonc
{
  "name": "Jack Stratton", // string
  "subvertUserId": "cmgjysovp0004cyw097599le6", // CUID (optional)
  "relationship-type": {
    "performer": {
      "instrument": {
        "instrumentName": "Piano", // enum (required field of instrument credit)
        "tracks": [1, 2, 5], // integer array (optional; if unspecified, this is an album-level credit applied to all tracks)
        "attributes": ["additional"] // enum array: additional, guest, solo (optional)
      },
      "vocals": {
        "vocalType": "Lead Vocals", // enum (optional field of vocal credit)
        "tracks": [1, 2, 5],
        "attributes": ["additional"]
      },
      "orchestra": { // Credits an entire orchestra as one entry; written to metadata as a performer with no instrument
        "tracks": [1, 2, 5]
      }
    },
    "arranger": {
      "tracks": [1, 3, 4]
    },
    "conductor": {
      "tracks": [1, 3, 4]
    },
    "composer": {
      "tracks": [1, 3, 4]
    },
    "director": {
      "tracks": [1, 3, 4]
    },
    "engineer": {
      "tracks": [1, 3, 4]
    },
    "djmixer": {
      "tracks": [1, 3, 4]
    },
    "lyricist": {
      "tracks": [1, 3, 4]
    },
    "mixer": {
      "tracks": [1, 3, 4]
    },
    "producer": {
      "tracks": [1, 3, 4]
    },
    "writer": {
      "tracks": [1, 3, 4]
    },
    "artwork": {
      "artworkType": "Graphic Design" // enum, applies to all tracks. Options: graphic design, illustration, photography. Multiple "artworkType" tags can exist per name credited.
    }
  }
}
```
Instrument Type should pull from the MusicBrainz instrument tree available in [their data dumps directory](https://data.metabrainz.org/pub/musicbrainz/data/json-dumps/). This should allow artists a _lot_ of flexibility! Relying on it as a dependency will allow you to adopt its extensive and localized data structure without the serious maintenance burden. Expect it to be updated over time.
Also of note is the less comprehensive [DDEX instrument list](https://service.ddex.net/dd/DD-AVS-CURRENT/dd/avs_InstrumentType.html) for future integration with DDEX standards. I'm unaware of efforts to map values from MusicBrainz' list to DDEX's, though the DDEX standard allows for arbitrary entries.
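To show how the credit structure above could end up in file metadata, here is a hedged sketch that flattens one credit object into the tags for a single track. `tags_for_track` and `FILE_TAG_ROLES` are hypothetical names, the `"Name : Role"` performer format follows the Credits list earlier in this document, and falling back to "Vocals" when no `vocalType` is given is my assumption. Relationship types with no file tag in that list (`director`, `artwork`) are skipped.

```python
# Relationship types that have a matching file tag in the Credits list.
FILE_TAG_ROLES = {"arranger", "composer", "conductor", "djmixer", "engineer",
                  "lyricist", "mixer", "producer", "writer"}


def tags_for_track(credit: dict, track: int) -> dict[str, list[str]]:
    """Return {internal tag name: [values]} that one credit adds to one track."""
    name = credit["name"]
    tags: dict[str, list[str]] = {}

    def applies(entry: dict) -> bool:
        tracks = entry.get("tracks")
        return not tracks or track in tracks  # no track list means album-level

    for rel, entry in credit["relationship-type"].items():
        if rel == "performer":
            for kind, perf in entry.items():
                if not applies(perf):
                    continue
                if kind == "instrument":
                    role = perf["instrumentName"]
                elif kind == "vocals":
                    role = perf.get("vocalType", "Vocals")  # assumed fallback
                else:
                    role = ""  # e.g. orchestra: performer without an instrument
                value = f"{name} : {role}" if role else name
                tags.setdefault("performer", []).append(value)
        elif rel in FILE_TAG_ROLES and applies(entry):
            tags.setdefault(rel, []).append(name)
    return tags


credit = {
    "name": "Jack Stratton",
    "relationship-type": {
        "performer": {"instrument": {"instrumentName": "Piano", "tracks": [1, 2, 5]}},
        "producer": {"tracks": [1, 3, 4]},
        "mixer": {},  # album-level credit
        "artwork": {"artworkType": "Graphic Design"},
    },
}
track_one = tags_for_track(credit, 1)
```

Running this for every credit and every track number, then merging the resulting lists, would give the full set of credit tags for each file.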
<details>
<summary>Vocal Types List</summary>
Unfortunately MusicBrainz doesn't make its other categories as easy to ingest.
Here are their sub-categories for "vocals":
- Background Vocals
- Spoken Vocals
- Lead Vocals
- Alto Vocals
- Baritone Vocals
- Bass Vocals
- Countertenor Vocals
- Mezzo-soprano Vocals
- Soprano Vocals
- Tenor Vocals
- Contralto Vocals
- Treble Vocals
  - A young male singer with an unchanged voice in the soprano range
- Meane Vocals
  - A young male singer with a voice lower than a treble
- Choir Vocals
- Other Vocals
- Whistling
</details>
### Label Credits
Unlike artist relationship credits, I recommend that labels DO get assigned a unique identifier. There are significantly fewer labels than artists, so I think name collision issues are unlikely, and a unique identifier will let a label later claim its entry and link its name to its label account. Encourage artists to assign their tracks to existing labels to make the process of adopting them later easier!
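One way to nudge artists toward existing labels is a find-or-create lookup on a normalised name. The sketch below is an in-memory illustration only: `find_or_create_label` is a hypothetical helper, the dictionary stands in for a database table, and `uuid4` stands in for whatever identifier scheme Subvert uses (the credit example above suggests CUIDs).

```python
import uuid


def find_or_create_label(name: str, labels: dict[str, dict]) -> dict:
    """Return the existing label for `name`, or register a new one.

    `labels` is keyed by a normalised name so that "Vulf Records" and
    " vulf  records " resolve to the same entry.
    """
    display = " ".join(name.split())
    key = display.casefold()
    if key not in labels:
        labels[key] = {"id": str(uuid.uuid4()), "name": display}
    return labels[key]


labels: dict[str, dict] = {}
first = find_or_create_label("Vulf Records", labels)
second = find_or_create_label(" vulf  records ", labels)  # same label as `first`
```

In the upload form, the same lookup could drive an autocomplete so that creating a new label is a deliberate choice rather than the result of a typo.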
### UI
- Encourage artists to upload lossless files, and display encoding quality stats on the purchase page according to the uploaded file's quality
  - Uploading MP3s or other lossy compressed media should be discouraged, as the audio will be re-compressed if the file is converted to another lossy format
- If artists add other featured artists to track titles, suggest that they move them to a credit instead
- **The following form validation is required on the front-end to ensure the back-end writes tags properly from the database:**
  - No `,` or `;` characters in performer credit fields
    <!-- TBD, might be fine with ; separators -->
  - Only one of each of the following credits can be attributed per track: `Composer`, `Conductor`, `Director`, `Lyricist`, `Writer`
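The two validation rules above are simple enough to sketch. The version below is written in Python for illustration only; the same checks would need to live in whatever language the front-end form uses. `validate_track_credits` is a hypothetical name, and it takes the credits for a single track as `(role, name)` pairs.

```python
SINGLE_CREDIT_ROLES = {"composer", "conductor", "director", "lyricist", "writer"}
FORBIDDEN_IN_PERFORMER = {",", ";"}


def validate_track_credits(credits: list[tuple[str, str]]) -> list[str]:
    """Check (role, name) pairs for one track and return error messages."""
    errors = []
    seen = set()
    for role, name in credits:
        role = role.lower()
        if role == "performer" and any(ch in name for ch in FORBIDDEN_IN_PERFORMER):
            errors.append(f"performer '{name}' must not contain ',' or ';'")
        if role in SINGLE_CREDIT_ROLES:
            if role in seen:
                errors.append(f"only one {role} credit is allowed per track")
            seen.add(role)
    return errors
```

An empty list means the track's credits pass both rules; otherwise each message can be shown next to the offending field.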
More comprehensive UI suggestions for streamlining the upload flow are available in the accompanying Excalidraw file.
## Future Opportunities
### Search and Discovery
Perhaps the most obvious on-platform opportunity! Giving artists the ability to link to other credited artists that are on the platform is great for letting fans find more work from artists they connect with. The first step toward this should be linking on-platform artist credits in the sidebar.
Beyond linking, cleanly organized structured data is the foundational step toward building a recommendation algorithm. The more accurately tagged dimensions one has to work with, the more specific the recommendation algorithm can be, and the more data there is, the better the recommendations will become.
### Upload Flow Improvements
Using [Node ID3](https://www.npmjs.com/package/node-id3) you can read and auto-populate as many areas of the upload form as possible if artists upload pre-tagged files! What a time saver!
### Royalty Splits
Linking credits to both names and optionally Subvert user CUIDs could be the start of the royalty split distribution flow. I imagine after entering metadata, on a different page users could be presented with all of the contributors on the album and could choose how to allocate revenue as percentages for each linked artist.
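If splits are entered as percentages, the arithmetic needs to guarantee that the payouts add up to exactly what was earned. The sketch below is one possible approach, not a proposal for the actual payout system: `allocate` is a hypothetical function, amounts are assumed to be integer cents, and percentages are assumed to be whole numbers. Cents lost to rounding down are handed to the contributors with the largest remainders.

```python
def allocate(amount_cents: int, percentages: dict[str, int]) -> dict[str, int]:
    """Split an amount between contributors by whole-number percentages.

    Leftover cents from rounding down go to the largest fractional remainders,
    so the shares always add up to the original amount.
    """
    if sum(percentages.values()) != 100:
        raise ValueError("percentages must add up to 100")
    shares = {k: amount_cents * p // 100 for k, p in percentages.items()}
    leftover = amount_cents - sum(shares.values())
    by_remainder = sorted(
        percentages,
        key=lambda k: (amount_cents * percentages[k]) % 100,
        reverse=True,
    )
    for k in by_remainder[:leftover]:
        shares[k] += 1
    return shares


split = allocate(1000, {"artist": 50, "producer": 30, "mixer": 20})
```

The keys here are plain names for readability; in practice they would be the linked Subvert user IDs from the credit data.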
### Future Metadata Additions
We can't always implement _everything_ in one go! Here are a few fields that I think you can reasonably leave for the future:
- `discnumber` User-specified numbered gaps in the album, which can be used to delineate A- and B-sides
  - [Featurebase issue](https://subvert.featurebase.app/p/disc-numbers)
  - This has been preemptively added to the metadata writer
- `discsubtitle` Let users specify a name for their discs
  - This has been preemptively added to the metadata writer
- `attachedpicture` [ID3 has different tags for images](https://id3.org/id3v2.3.0#Attached_picture) beyond cover art that can be written into track metadata for back covers, shots of the band, etc.
  - [Featurebase issue for liner notes](https://subvert.featurebase.app/p/release-liner-notes)
  - [Featurebase issue for multiple images](https://subvert.featurebase.app/p/multiple-album-art-images)
---
The [DDEX metadata standards](https://kb.ddex.net/implementing-each-standard/) are a more modern (and more comprehensive) group of definitions for music metadata spanning artists, releases, labels, and more. When implementing and expanding upon these features in the future, I would recommend consulting their data structures to help enable future interoperability. The [allowed value set dictionary](https://service.ddex.net/dd/DD-AVS-CURRENT/dd/intro.html) is handy for making sure your related data structures will be spec compliant.