---
tags: Session3
---

# Session 3: Becoming a Data Detective

What does it mean for data to be "good?" Is there such a thing as "bad data?" If so, how do we recognize the difference? In this session, we'll explore these questions by introducing the concept of "metadata." We will then provide more resources for those looking to take their mapping skills to the next level!

## Metadata

#### To become data detectives, we have to be readers, and writers, of metadata.

Metadata provide information about data. Recording information about a dataset is as important as the dataset itself. Recall the questions we first asked of data in Session 1: Who collected the data? For what purpose? What methods were used? When did they collect the data, and when was the data last updated? How was the project funded?

These are all questions that can be answered in metadata. This information is vital for understanding how you can and cannot use a dataset. If it exists, metadata will often be on the page where you downloaded the dataset or in a separate text file that comes with the download.

If you are making your own dataset, it is your responsibility to create and share the metadata with others! Be sure to give context to the data you create so that others can use it as intended.

### Metadata on the LMEC Public Data Portal

The LMEC Public Data Portal offers a variety of unique metadata. We will walk through its sections below, using the Internet Access dataset as an example.

#### General Overview

When you open a dataset's page on the portal, this section appears at the top. It includes an overview of where the data comes from, what information it contains, and who can access the data. This is the information you might find on any standard data portal. However, the LMEC portal also offers more comprehensive and critical metadata to consider.
![Reference Link](https://i.imgur.com/Lg1YGZ3.png)

###### Data Set General Overview

---

##### Can You Trust This Data

One unique feature of the LMEC Public Data Portal is the "Can You Trust This Data" section, evaluated by LMEC Data Archivist Belle Lipton. These evaluations aim to capture the value and limitations of the datasets that live on the portal. Will using this dataset cause harm to those it is mapping? What is the intent behind the dataset? Data documentation is increasingly important for understanding where the data is coming from.

![Reference link](https://i.imgur.com/FzhkeDe.png)

##### Data Lifecycle Section

Another unique feature of the data portal is a record of the individuals involved in the data lifecycle: creating, maintaining, and cleaning the data. Here you can even find a record of the decisions made by the data creator, along with how often the data is updated. This crucial information helps you decide what you can apply the data to and how valuable it is for your own purposes.

![Reference Link](https://i.imgur.com/J23QqJM.png)

---

### Bad Metadata?

So you've just seen examples of good, comprehensive metadata that help put the data in context. What would a bad example look like? Take a look at another dataset, on the points where WiFi is available in Boston. Look at the difference and the sections that are missing. Clearly, there is less information. While the Data Portal strives to provide users with as much information and metadata as possible, it does not just make up data! The metadata is missing here because the original source did not provide insight into how the data was made, collected, and documented. Remember, portals are just an instrument for finding data and any information associated with it. It is up to you to tell the good data from the bad!

![Reference Link](https://i.imgur.com/LS1SLT3.png)

##### Wicked Free Wi-Fi, an example that lacks metadata in comparison to the first dataset we looked at.
Evaluating data and its metadata brings up the idea of **data transparency**: how easy it is to understand where the data is coming from and how it was made. While the Internet Access dataset we looked at first had comprehensive metadata, this Wicked Free Wi-Fi dataset does not. When you notice a dataset with bad metadata, exercise caution when using it, and evaluate both its merits and especially its limitations. You wouldn't cite an unknown book in your journal article! The same principle applies here!

## From Metadata to Missing Data

Beyond missing metadata and a lack of data transparency, there is yet another category of things that are missing: missing data. These are topics and issues for which few or no known datasets exist, and which people have rarely mapped so far. Artist Mimi Onuoha has compiled her thoughts on missing data, which you can read [here](https://github.com/MimiOnuoha/missing-datasets). She writes the following:

#### Data do not show us a complete picture of the world.

Increasingly, the concept of "missing data" has been embraced by academics and practitioners to acknowledge that there will always be missing information in even the best datasets. Data gives us the power to see things and bring new issues to light, but just because a certain set of phenomena has never been structured into recorded observations does not mean those phenomena are unimportant. Centuries of societal norms and habits of thought have shaped what is mapped and what isn't. We hope this course empowers you to identify what isn't being mapped and what you can do about it! However, sometimes there are strategic reasons for missing data, such as the privacy and safety of those being counted.
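Missing information can also hide inside a dataset you already have. As a minimal sketch, the Python below counts how many features in a GeoJSON file lack a value for each attribute. The sample data, the field names (`name`, `households`, `updated`), and the function names are all invented for illustration; they do not come from the LMEC portal or any real dataset.

```python
import json
from collections import Counter

# Hypothetical sample data: a tiny GeoJSON FeatureCollection with invented
# field names and values, standing in for a file downloaded from a portal.
SAMPLE_GEOJSON = """
{
  "type": "FeatureCollection",
  "features": [
    {"type": "Feature",
     "geometry": {"type": "Point", "coordinates": [-71.06, 42.36]},
     "properties": {"name": "Site A", "households": 120, "updated": "2020-01-15"}},
    {"type": "Feature",
     "geometry": {"type": "Point", "coordinates": [-71.08, 42.35]},
     "properties": {"name": "Site B", "households": null, "updated": ""}},
    {"type": "Feature",
     "geometry": null,
     "properties": {"name": "Site C", "households": 85}}
  ]
}
"""


def count_missing(collection):
    """Count, per attribute, how many features lack a usable value.

    A value is treated as missing when the key is absent, the value is
    null (None in Python), or the value is an empty/whitespace string.
    """
    features = collection.get("features", [])
    # Collect every attribute name that appears on at least one feature.
    fields = set()
    for feature in features:
        fields.update((feature.get("properties") or {}).keys())

    missing = Counter({field: 0 for field in fields})
    for feature in features:
        props = feature.get("properties") or {}
        for field in fields:
            value = props.get(field)
            if value is None or (isinstance(value, str) and not value.strip()):
                missing[field] += 1
    return dict(missing)


def count_missing_geometry(collection):
    """Count features that have no geometry at all."""
    return sum(1 for f in collection.get("features", []) if f.get("geometry") is None)


data = json.loads(SAMPLE_GEOJSON)
missing_by_field = count_missing(data)
features_without_geometry = count_missing_geometry(data)

for field, n in sorted(missing_by_field.items()):
    print(f"{field}: {n} of {len(data['features'])} features missing a value")
print(f"features without geometry: {features_without_geometry}")
```

To try this on a downloaded file, you could replace `json.loads(SAMPLE_GEOJSON)` with `json.load(open("your_file.geojson"))`, where the file name is a placeholder. A high count for a field is not proof of a problem, but it is a prompt to go back to the metadata and ask why the values are absent.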
## In-class exercise and Discussion: Looking at Spatial Data in geojson.io

Navigate to the Internet Access dataset on the open data portal by using this [link](https://lmec-data-portal-dev.netlify.app/#/catalog/dkhm2yhrb). Download the GeoJSON file from the page and save it to your computer. Then bring it into geojson.io by navigating to Open -> File and locating the file. Notice what appears. What is visually different between the Internet Access data and the Boston city data? Can you see patterns or notice key features of the data in this visual representation?

* Breakout rooms: looking at the Internet Access datasets in geojson.io
* Discussion about information in the Internet Access data vs. Boston city data using the Public Data Portal

## Where can we go from here?

In this course, we have worked from the ground up, introducing what maps and data are and how to critically evaluate and use them in your daily life. This applies to everything from content you see in the news to your own mapping projects! Here are some tips for planning your next data project. Ask yourself the following questions to direct your thinking process!

1. **Start with a question** - what question do you hope to answer? What topic are you interested in learning more about?
2. **Identify Data** - plan out what spatial and attribute data you will need to bring your idea to life. What resources might have the data you are looking for?
3. **Collect Data** - find the data! Go download, collect, or even create the datasets you need for the project. Remember to record where you found the data and keep the context with the data.
4. **Map the Data** - create the map using digital software. While we did not cover this in our sessions, there are many great tutorials and resources on how to create maps with mapping software. These include ArcGIS Online, Carto, and QGIS, just to name a few.
5. **Communicate Findings** - take action!
Once you have created your map, use it to tell a story. Put together a narrative to share your story and/or findings with others.

## In Class Discussion

Briefly go around the room and share 1-2 takeaways from the course that you might work into your everyday life. We acknowledge that everyone has different backgrounds and opportunities to apply these ideas. If you already have a project in mind, feel free to share that as well!

## Further Resources

We hope you enjoyed the course and seeing data in new, critical ways. If you would like to continue learning about critical cartography, mapping, and data, here are some more resources you can explore!

* [Leventhal Map Center Tutorials](https://geoservices.leventhalmap.org/cartinal/guides/) - More amazing tutorials compiled by the team at LMEC, covering GIS basics and more
* [Data Feminism](https://datafeminism.io/) - A fascinating read about data and the structures it has been governed and shaped by