2022-06-15 Data Carpentry for the Humanities

Welcome to the hack pad for Data Carpentry for the Humanities

You can edit this document using Markdown syntax.

Looking after you today are:

Martin Callaghan
Alex Coleman
Graham Blyth

Getting in touch post workshop

To get in touch with Alex and the Research Computing team, send us a ticket via https://bit.ly/arc-help

To get in touch with Martin, email m.callaghan@leeds.ac.uk

Research Computing Leeds Conference 2022

Sign up ->> https://rescompleedscon.github.io/

Accessing the Notebooks from Day 2

Please let us know your Google account email by filling in this Google form:

Agenda

Day         Content
Day 1 am    Better data handling in spreadsheets
Day 1 pm    Low-code Websites with GitHub Pages
Day 2 am    Introducing Python
Day 2 pm    More Python

Better Data Handling

Get the data spreadsheet: https://github.com/iaine/humanities-lesson-data/tree/master/data/spreadsheet

Lesson Notes: https://carpentries-incubator.github.io/spreadsheet-humanities-lesson/

GitHub Pages

Lesson notes: https://carpentries-incubator.github.io/jekyll-pages-novice/index.html

Markdown cheatsheet: https://www.markdownguide.org/cheat-sheet

University of Leeds Logo : https://www.paxman-landscapes.com/wp-content/uploads/2015/02/unileedslogo.png

Python 🐍

https://carpentries-incubator.github.io/python-humanities-lesson/

Colab: https://colab.research.google.com/

Extra code snippets

Replacing spaces in column names with underscores

You can access the column names of a pandas DataFrame object with the .columns attribute. This returns a list-like object (a pandas Index) of the column names, and we can map a function across it to replace spaces with underscores.

data.columns = data.columns.map(lambda x: x.replace(' ','_'))

There's quite a bit happening here so let's break it down:

  • data.columns = - here we're reassigning data.columns to whatever is returned by the rest of the statement.
  • data.columns.map - this is a method call on the .columns attribute. Specifically, we call the .map method, which applies another function to each item in the collection.
  • Within map we have lambda x:, which starts a lambda function: a small anonymous function that we define on the fly rather than as a named function with def. We specify x here as the variable that the lambda function uses, i.e. the item from data.columns that map will pass to it.
  • After lambda x: we have the expression we want to evaluate, in this case x.replace(' ','_'). Because the items in data.columns are strings, we can call string methods on them, such as replace, which takes two arguments: the text to find and the text to replace it with. In our case we specify a blank space ' ' as the text to find and an underscore '_' as its replacement.

Putting this all together will quickly replace blank spaces in our column names with underscores and reassign these new column names to the DataFrame object.
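
If map and lambda still feel a bit mysterious, it can help to try the same idea in plain Python without pandas. The sketch below is only an illustration: the column names are made up, and a regular list stands in for data.columns. It shows that a lambda does the same job as a small named function, and that map simply applies it to each item in turn.

```python
# Hypothetical column names, standing in for data.columns
column_names = ["first name", "date of birth", "place_of_birth"]


# A named function that swaps spaces for underscores in one string
def replace_spaces(name):
    return name.replace(" ", "_")


# The same logic written as a lambda (an anonymous function)
replace_spaces_lambda = lambda name: name.replace(" ", "_")

# Python's built-in map applies the function to every item;
# list() collects the results so we can look at them
cleaned = list(map(replace_spaces_lambda, column_names))

print(cleaned)
# ['first_name', 'date_of_birth', 'place_of_birth']
```

Names that have no spaces, like place_of_birth here, pass through unchanged, so it should be safe to run the replacement over every column.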

And finally!

Who are you and where are you from?

Jo Kershaw, School of Computing, Fluid Dynamics CDT

Arran Rees, Fine Art, History of Art and Cultural Studies! (Congruence Engine project)
Sarah Dawson, School of English
Yan Chen, School of English
Jouna Ukkonen, Faculty of Social Sciences
Zhe Liu, School of Languages, Cultures, and Societies
Adaeze Ohuoba, School of Languages, Cultures and Societies
Cameron Tailford, Science Museum (Congruence Engine Project)
Ohoud ALshehri , School of L
Souad Boumechaal, School of Languages, Cultures and Societies
Eve Smith, Language Centre

If you want the Google Colab files, fill in this form to share your Google email

https://forms.gle/JZYR4pgJM51JCH949

Two questions

What did you enjoy / what would you like more of?

I really enjoyed the exercises and how the explanations were simplified.
Coding together was really helpful.

Very friendly and always ready to slow down and break it down into simpler steps for all levels of experience. Lunch was appreciated! Free coffee next time please :-)

Much more aware of the possibilities now. Eager to carry on learning more / attend follow-up sessions
More confident to start learning more about Python, thank you for sharing all of this fantastic stuff

What could we do better / differently?

At risk of being unpopular - exercises to take home? Not for you to mark, just suggestions for things to look up and experiment with.

Would have been really helpful if both trainers used the notes section in the same way. I found it very helpful when the notes were written as it gave me time to think about what was happening rather than trying to write my own notes and keep up at the same time.

I thought we were going to do some visualisations too. It would have been nice to see some of the visual things Python can do.

Use notes across all sections/talks
More practice on real data and how it can be visualised, plots.
