### TODO:
- Sam: we should do one final join on all our dataframes and publish the result on Humanities Commons
- Helen: Continue work on age plots, separated by gender
- Luke: Merging of dataframes; can also look into how different professions choose different genres when this is done, if you like!

# Visualising Music and People on BBC Radio 4's Desert Island Discs programme

Exploring notable people and the music that tells the stories of their lives, from 1942 to 2021.

*Tags: Visualisation, Networks, Music, Celebrity, Wikipedia, Spotify*

**Authors**
* Helen Duncan
* Bill Finnegan
* Luke Hare
* Camila Rangel Smith
* Sam Van Stroud

**Reviewers**
* Ed?
* Mishka?

## Introduction

["Desert Island Discs"](https://www.bbc.co.uk/programmes/b006qnmr) is a long-running BBC radio programme in which guests are hypothetically cast away to a desert island where they can only bring eight songs (as well as a book and a luxury). Interviewed by the host, currently Lauren Laverne, each castaway shares the story of their life through music. The programme has become an invaluable archive of notable people and musical tastes over the past eight decades.

This data story combines a new public dataset of all the guests and their musical selections, extracted from the BBC archive, with open datasets on people and music. We begin by exploring aspects of the people and music over time. We also investigate the relationships between people and music through network analysis.

## Background

### Desert Island Discs

In 1942, a new programme created and presented by Roy Plomley appeared on the airwaves of the BBC. The format was simple: a guest shared eight songs that they would want to be stuck with for the rest of their life if exiled to a desert island. After a hiatus from 1946 to 1951, the programme returned with some innovations: at the end of each episode the castaway chooses a book, a luxury, and one song to save from the waves.
The show was hosted by Plomley until his death in 1985, followed by Michael Parkinson, Sue Lawley, Kirsty Young, and Laverne. Now considered a [cultural touchstone](https://www.newyorker.com/culture/cultural-comment/join-me-in-my-obsession-with-desert-island-discs), there have been *[#]* episodes of "Desert Island Discs" as of summer 2021.

An invitation to be a castaway is a sign of success in your field, whether that is business, sport, art, academia or government. Most episodes feature a single castaway, but sometimes multiple guests are cast away together, especially double acts like Morecambe and Wise or Ant and Dec. *[#]%* of castaways have appeared on the show more than once, with national treasure David Attenborough sharing a record four appearances with comedian/actor Arthur Askey.

### Data Sources

[Andrew Gustar](https://twitter.com/andrewgustar), who conducts statistical research into music history, compiled a dataset of castaways and discs from the BBC Radio 4 archive. This data, which is now [available online](https://hcommons.org/deposits/item/hc:37503/), was the basis for Andrew's [analysis](https://www.musichistorystats.com/desert-island-discs/) exploring the most popular songs and artists, as well as the gender of the castaways.

This data story includes information about an additional 60 episodes from February 2020 to August 2021 (gathered from [Wikipedia](https://en.wikipedia.org/wiki/List_of_Desert_Island_Discs_episodes) and the [BBC website](https://www.bbc.co.uk/programmes/b006qnmr/episodes/player)). It also combines the episode data with additional information about people and music. For people, we are using [Wikidata](https://www.wikidata.org/wiki/Wikidata:Main_Page), a repository of structured data from Wikipedia. For music, we are using [Spotify](https://developer.spotify.com/) to get genre and other information about selected tunes.

Rather than get into the details of how we pulled this dataset together, we want to keep this data story focused on the data analysis. The source code for the `wikipeople` package is available [on GitHub](https://github.com/samvanstroud/wikipeople), and the package is also on [PyPI](https://pypi.org/project/wikipeople/). It takes a few hours to run all the requests for all the castaways and discs in our dataset, so we've done this once and published the resulting dataset.

>*digital humanities link*

## Setup

```
import pandas as pd  # aliased as pd, matching the read_csv call below
# TODO: remaining imports (placeholder)
```

To start, we load the dataset described above.

```
# read the dataset that has been augmented with Wikipedia information
pd.read_csv(<humanities_commons_url>)
```

### Episodes

For our first look at the data, we'll plot the episodes per year.

- [ ] Number of episodes per year - include host changes on the x-axis

You can immediately see the long run of the programme, minus a small hiatus after WWII. This plot also illustrates Plomley's long run as host until the mid-1980s, which accounts for the vast majority of episodes compared with the other four hosts.

## Castaways

So who exactly gets invited to be on the show, and how have the guests on the show (castaways) changed over time? From Wikidata, we have pulled castaway profession, gender, country of citizenship, country of birth, and date of birth. The nature of Wikipedia and Wikidata means that any or all of these fields may be missing for a given castaway, but we'll have to make do.

### Gender

We can start by writing some functions to aggregate episodes in the same year, and plot summary metrics per year.

```
<documented code for the time plots>
```

Now that we have these functions to make the plots we need, we can first take a look at the distribution of gender through time.

```
<show gender plot>
```

There are a few things that are immediately interesting. The first is that, throughout the bulk of the show's long history, there have been significantly more male castaways than female.
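
The per-year aggregation described above could be sketched roughly as follows. This is only an illustration, not the code used for the plots in this story: the column names `air_date` and `gender` are assumptions about the published dataset, and the toy rows are made up rather than taken from real episodes. The function returns the share of each gender per year alongside the number of castaways in that year, so that years with very few episodes are easy to spot.

```python
import pandas as pd


def gender_share_by_year(episodes: pd.DataFrame) -> pd.DataFrame:
    """Per year: fraction of castaways of each gender, plus the castaway count.

    Assumes hypothetical columns `air_date` (parseable as a date) and `gender`.
    """
    year = pd.to_datetime(episodes["air_date"]).dt.year.rename("year")
    # Count castaways per (year, gender) pair.
    counts = pd.crosstab(year, episodes["gender"])
    totals = counts.sum(axis=1)
    # Normalise each row so the gender columns sum to 1 within a year.
    shares = counts.div(totals, axis=0)
    shares["n_castaways"] = totals
    return shares


# Toy data: invented rows for illustration only, not real episodes.
toy = pd.DataFrame(
    {
        "air_date": ["1946-01-05", "1951-01-03", "1951-02-07", "1951-03-14", "1951-04-11"],
        "gender": ["female", "male", "male", "female", "male"],
    }
)
shares = gender_share_by_year(toy)
print(shares)
```

Keeping `n_castaways` next to the shares is a deliberate choice: a year with a single castaway shows a share of 1.0 for one gender, which is easy to over-interpret when every year gets the same visual weight in a plot.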

Around 1950, it looks as though there was a much more equal split between men and women. However, this conclusion is drawn a bit hurriedly. If we look at the number of episodes per year around this time, we see that the show was actually off the air for several years around this period. There was also a year (1946) in which only one episode aired, and the castaway was female. As we are giving each year equal visual weight in our plot, the show looks more balanced in terms of gender at its inception than it probably was.

From the beginning of the show up until the early 2010s, about 3/4 of castaways were male and 1/4 female. From about 2013 (? check year), we see a qualitative shift, and the split between males and females quickly becomes much more even. The suddenness of this change suggests it was an active decision taken by the producers of the show. The fact that this happened relatively recently is perhaps surprising.

### Age

With the change in gender balance, one might expect the age of castaways to have changed over time. For example, might recent guests on the show be younger to appeal to a younger demographic of listeners? Using the date of birth of the castaway from Wikidata and the air date of the episode, we can calculate how old the castaways were when they were on the show.

- [ ] [name=Helen] Oldest and youngest castaways - Put this somewhere that fits in the narrative

```
<code to print oldest and youngest info>
```

As we can see, there are X people under the age of Y and Z people over the age of ?. Do you recognise any of the names?

- [ ] [name=Helen] Age distribution of castaways. (If the time plot is not interesting, just integrate over time)

Going a bit deeper, we can plot the full age distribution of castaways, integrated over time.

- discuss the plot a bit
- discuss differences/similarities between male/female age distribution

A great example of how the audience and guests are aging together is castaway Sir Cliff Richard.
The pop singer [first appeared](https://www.bbc.co.uk/programmes/p009y6z4) on the show at the age of 20 in 1960 (favourite song: Rock Around the Clock by Bill Haley and his Comets, book: The Swiss Family Robinson by Johann Wyss, luxury: guitar) and returned for the 2020 [Christmas episode](https://www.bbc.co.uk/programmes/m000qhg8) at the age of 80 (favourite song: It Is Well by Sheila Walsh Featuring Cliff Richard, book: Wuthering Heights by Emily Brontë, luxury: a Gibson acoustic guitar).

### Nationality and country of birth

With the information from Wikipedia, we can also track where the guests are from.

- show (one of) the plots
- talk about how dominant the UK is
- any trends

For a BBC radio programme, perhaps it isn't a surprise that most castaways are from the UK, with other English-speaking countries with cultural connections (Australia, USA) also being very prominent. In fact, the large number of castaways from the USA from the start of the show might make sense in the context of castaway profession.

### Profession

From Wikipedia, we have information about professions, but the data isn't all comparable without some initial work. (Should we include any random examples?)

- talk about profession "hierarchy" obtained from Wikidata
- clustering approach taken
- show the plots
- briefly discuss trends

*maybe don't do these:*
- [ ] Which profession is the most/least equal in terms of gender.
- [ ] Which professions have the oldest / youngest people.

The professions most frequently represented on Desert Island Discs are in the creative industries: actors, musicians, television personalities. While this makes sense, as it is an entertainment programme with a foundation of music, we were surprised not to see more of a balance between the many professions of "the great and the good".

## Discs

In the dataset we introduced at the beginning, we also grabbed information about the songs shared by each castaway.
Based on Andrew's initial analysis, we know that classical music dominates the musical selections, and we want to dig a little deeper into the types of music chosen over time, as well as what types of people choose different genres, building on the analysis above.

From Spotify, we can get a list of genres associated with the musical artist, plus additional info and analysis for each song (for example, danceability). It turns out Spotify has an overwhelming number of microgenres (see Glenn McDonald's everynoise.com for a mind-blowing microgenre experience), and we need to do some work to get a small number of the most common and most easily recognised genres.

Plots:
- [x] Popularity of different genres over time (in particular declining trend of classical music)
- [ ] Plot other Spotify info over time, if interesting?

## Castaways + Discs

As one final dimension of this story, we want to look at both the people and the music that they chose together.

Popularity of different genres by different person type (gender, profession, etc).

- [ ] Does genre preference depend on profession? Could try looking at:
    - [ ] Do scientists tend to prefer different styles of music to creatives?
    - [ ] Do older/younger people tend to choose different genres?

Instead of a deep dive into politicians, we could group everyone into [creative, non-creative, politicians] (maybe just [creative, non-creative] if politicians are not different).

- [ ] Compare genre popularity by these groups

Non-technical - intro to why we are looking closely at politicians. How do their music tastes map to general popularity? How different are politicians of different political parties?

There have been reports that politicians use focus groups to choose their tracks for them, so as to represent themselves in a certain way. Is there any evidence of this in the data?
- Look into this, is it true?

### Network Analysis

> Technical: Description of network analysis for people. Code.
> * Network of castaways as nodes and edges as common artists (coloured by profession or time period)
> * Investigate communities in the network

Todo:
- [ ] Deal with people who have been on the show more than once.

Plots:
- [ ] Are network clusters associated with certain genres?
- [ ] Who chose the weirdest combination of tracks (e.g. who has the largest number of tracks in the "other" genre category)?
- [ ] Who has the least overlap with others in terms of track choice? Conversely, who has the blandest choice?

Non-technical: Translate what the network analysis tells us about connections between people and music.

## Discussion

In this data story, we've been looking at some of the trends of people and music on Desert Island Discs. As a mainstay of the BBC Radio schedule, it will likely continue for many decades to come, and this analysis could be continued at future milestones.

Any caveats? For example: % of people we didn't find in Wikipedia, % of songs we didn't find in Spotify?

### Invited back to the island

As listeners, each Friday and Sunday morning we are invited back to the island to hear someone's life story through music. But there are some very special guests who are invited back more than once.

We close this story by looking at environmentalist and broadcaster David Attenborough's four episodes. For his first appearance with Roy Plomley ([06/05/1957](https://www.bbc.co.uk/programmes/p009y8xh)), as a fresh 30-year-old TV presenter, his first disc was Trouble in Mind by Northern Irish blues singer Ottilie Patterson and the Chris Barber Jazz Band. When he returned at the age of 52 ([10/03/1979](https://www.bbc.co.uk/programmes/p009mxny)), he asked to take the book Shifts and Expedients of Camp Life by William Barry Lord to the island, a request he repeated in his following two appearances.
Sue Lawley welcomed back a 72-year-old Attenborough ([25/12/1998](https://www.bbc.co.uk/programmes/p00942qy)), who asked for the luxury of a guitar, having previously taken a piano and binoculars to the island. And finally, for the [70th Anniversary Episode (29/01/2012)](https://www.bbc.co.uk/programmes/b01b8yy0), Kirsty Young invited him back, now 85, and the track he saved from the waves was the 3rd of Johann Sebastian Bach's Goldberg Variations.

With any luck, Lauren Laverne will invite Sir David back again soon, at which point he will have the record for the most episodes all to himself. As he reflects once more on his life and favourite songs, perhaps he will talk about his evolution into a trusted voice on climate action. Hopefully, the rising seas of climate change won't threaten this island haven of music and life stories.