# Block 3 Assignment

By Mattias Johansen

### Choice of data

I chose to create a dataset about the books in my bookcase, a piece of furniture that is both a practical and an aesthetic part of my living room. The aesthetic side is emphasized by the fact that the books are sorted by the color of their spines rather than by title or author:

![](https://i.imgur.com/KGlwWgR.jpg)

So I wanted to record the colors of the books along with these seven categories: read (yes/no/partly), fiction/non-fiction, library category, language, translated or not, nationality of the author, and year of first publication. There were many more categories I could have recorded, but I chose these because they might help show how broad the content of the collection is geographically and historically, while including some of the physicality of the books, namely the color of the cover/spine. I recorded whether I had read each book in order to reflect on the books I choose to read, or rather on those I have not read even though they are right in front of me.

### Data presentation

I wanted to use Tableau for my presentation because I wanted to learn how to use a data visualization program, but it was difficult to figure out, since it has so many functions. My goal for the presentations became that they should be easy to understand yet offer multiple dimensions of data, while still being aesthetically pleasing. I wanted to use the spine color dimension to achieve a kind of "rainbow effect" while still exploring the data. So I sketched with some of the different visualization options, which produced results like this one:

![](https://i.imgur.com/omqjmf3.png)

It was kind of pretty, but the data is not conveyed very well.
So I tinkered some more and created a bar chart of the books that I have (or haven't) read, by color, divided into fiction and non-fiction:

![](https://i.imgur.com/8zWukAV.png)

I think this visualization came out better, since it gives an overview of all the books and shows several core insights, while the graphics still look a bit like the items the data came from (even though Tableau wouldn't give me much freedom in the choice of color, hence the grey "white"). Some insights from the presentation are:

1. The collection contains a bit more fiction than non-fiction.
2. I have finished far more fiction than non-fiction books.
3. I haven't read most of the books in my bookcase.
4. The spine colors seem to be distributed quite evenly across both fiction and non-fiction, though the few purple books are all fiction and white seems to be more common in non-fiction.

The second insight is probably partly indicative of how I read non-fiction, but probably also of the nature of much non-fiction literature: some of it might be reference works that don't make sense to read from A to Z. Indeed, it probably says more about the category of fiction: the fiction books are mainly novels, a form that culturally encourages (or expects) reading the whole book.

---

Another sketch is a map that represents the nationality of the authors, the number of books per nation, and how many of those books I have read.

![](https://i.imgur.com/hv0iPtt.png)

The most interesting thing about this world map presentation is that it has been manipulated to make the collection look more "worldly" than it actually is, by editing the size of the dots:

![](https://i.imgur.com/gdYXEph.png)

The nations that have only one data record, which are the faraway nations such as Japan and South Africa, have been enlarged so that they look like a relatively more significant amount.
I did it because otherwise the dots would be so small that they would be difficult to see at all, but also to explore how easily the visualization could be manipulated. For comparison, I created a map displaying the same data but zoomed in on Europe, to show the accurate relative sizes:

![](https://i.imgur.com/hhQguQ8.png)

Other than the expected dominance of Danish authors, I think this visualization shows the sphere of cultural influence on a Danish readership well: there are a number of Norwegian and Swedish books, since these nations are close culturally and linguistically. At the same time, the British (and American) books are also quite influential. That I have read a large share of the books I own from these English-speaking nations reflects the cultural significance of learning (English as) a second language and thus being able to enjoy a larger amount of content without having to rely on translations. I looked deeper into this by producing this graph:

![](https://i.imgur.com/YCDPwUd.png)

It becomes clear that I have somewhat of a preference for the foreign literature in my bookcase, and even more so for books in English, judging by how many I have read relative to the total number of non-translated English books.

### Analysis of the work

In the categorization process I became quite aware of the significance and power of choosing which categories to include in my Excel file. I pondered a lot of different categories, but by ultimately choosing those eight I was at the same time excluding other aspects of the books that could have revealed interesting things about the collection. I was definitely choosing things that could be easily determined, or so I thought. I had chosen nationality as a parameter, but it is not as fixed an attribute as, for example, publication year. While I could assume the nationality of most (especially Danish) authors, either because I knew them or from their names, I had to google most foreign authors to be sure.
In the case of the author Jung Chang, whose memoir *Wild Swans* is about her life in China, this research revealed information that conflicted with my initial data entry of "China". According to Wikipedia she is a British citizen, but "Chinese-born", and her books are banned in China. In what respect should nationality be recorded: natively, legally, geographically or culturally? Is it possible to be two things at the same time? It also became clear why Gray et al. (2016) pose the following question when studying the data used in data visualizations:

> "What are the rationales, methods and standards in the data infrastructures through which the data is generated?" (p. 298)

Some datafied characteristics can be subject to differing standards, and the standard used should perhaps be stated in some presentations. At the same time, encountering this issue is an opportunity for the person creating the dataset to reflect on which metric is relevant to use.

Working with the data representation as a way of exploring the captured data, gaining insights from the visuals, made me understand how design choices can draw attention to and away from certain aspects of the data. The notion of "sandcastling" (Hinrichs et al., 2018) made sense to me as I became aware of how the process of creating visualizations is very much like an iterative design process that works with a quite malleable material in different mediations.

#### Bibliography

Gray, Jonathan, Liliana Bounegru, Stefania Milan, and Paolo Ciuccarelli. ‘Ways of Seeing Data: Toward a Critical Literacy for Data Visualizations as Research Objects and Research Devices’. In *Innovative Methods in Media and Communication Research*, edited by Sebastian Kubitschko and Anne Kaun, 227–51. Cham: Springer International Publishing, 2016.

Hinrichs, Uta, Stefania Forlini, and Bridget Moynihan. ‘In Defense of Sandcastles: Research Thinking through Visualization in Digital Humanities’. *Digital Scholarship in the Humanities* (DSH), 2018.