Teaching Machine Learning and Data Literacy to Students of Logistics using Jupyter Notebooks

Teaching machine learning in fields outside of computer science can be challenging when students lack solid programming knowledge. This poster description explores the requirements for teaching data literacy and code literacy to students of logistics. Specifically, it evaluates the use of Jupyter Notebooks in a machine learning course for logistics students against “Teaching and Learning with Jupyter” by Barba et al., which lists several teaching patterns for Jupyter Notebooks.

In the course “Machine Learning in Logistics”, students from logistics Master’s programs learn about data science and machine learning with a focus on their future working environment. A common definition of logistics is the set of seven Rs (7 Rs): how to transport the right good in the right quantity and in the right quality at the right time to the right place in the right condition, while offering this service at the right price [Co09]. The lectures teach the relevant concepts, while the exercises focus on more practical skills, such as how to read in, explore, analyze, and visualize temporal and spatiotemporal data sets as well as large image data sets. Here, Jupyter Notebooks serve as the medium to convey the exercises and to have the students write and adjust Python code. Students enter the course with very heterogeneous programming knowledge, partly because coding abilities are not a requirement for the course. Creating exercises that appeal to both beginners and programming experts was therefore a challenge.

A data scientist in logistics is expected to have skills in both code literacy and data literacy. Code literacy describes, in general, the ability to understand the underlying concepts of the technology that surrounds us.
In the presented course, no prior coding knowledge is required, because code is mainly seen as a tool to teach data literacy. The students are therefore mainly taught to read and modify existing code. Data literacy, on the other hand, is the ability to understand data in its context and to make decisions from it. The course explores three main axes of this topic through coding exercises: collecting data from data sources, performing basic statistical analyses including problem solving with machine learning, and creating visuals for dataset characteristics and experiment results.

Over the last two years, students were continuously invited to provide feedback on the course, both orally and in anonymous written form. Because it is a highly specialized course in the Master’s programs, the number of participants at the end of the semester was not high enough to obtain statistically significant results from questionnaires. Nevertheless, the answers to the open questions suggest that the general concept met the students’ expectations. The students did not, however, comment on the style of the questions.

In the exercise sheets, whenever students were asked to write code, interpret statistics, or describe graphs, this was done with Jupyter Notebooks. The exercise sheets were created without a framework by different authors and showed large individual differences in the types and frequencies of specific patterns. This could be a matter of personal preference, perhaps because the creators had encountered these patterns at some point in their own education. This observation led to the idea of a detailed evaluation. For this purpose, an overview listing 22 different types of pedagogical patterns that can be used for teaching with Jupyter Notebooks [Ba19] was consulted. The patterns were analyzed regarding their ability to teach data literacy, and 14 patterns were identified as useful in that regard.
The evaluation showed that seven of them were already used, at least in part, in the exercises. Furthermore, six patterns were identified that can enrich the exercises in the future and are thus planned to be newly applied or applied on a larger scale. More course iterations are required to gain better insights.

Jupyter Notebook proved successful for teaching code and data literacy to logistics students in that it offers a comfortable environment in which students can try out coding without having to write a piece of code from beginning to end. In addition, visualisations and error messages are displayed close to the code that produced them. This better supports the use of code as a tool for focusing on data literacy. Used in this way, Jupyter Notebook lets beginners stay in a safe environment when coding without restricting experts from writing complex code.
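
To make the three axes of data literacy named in this description more concrete (collecting data, a basic statistical analysis with a simple learning step, and a visual), a notebook cell that students read and modify might look like the following minimal sketch. It is not taken from the course material: the shipment data, the column names, and the helper functions `load_rows`, `fit_line`, and `text_chart` are all hypothetical, and only the Python standard library is used so that the cell runs without extra packages. Libraries such as pandas or matplotlib could take the place of the hand-written parts.

```python
import csv
import io
from statistics import mean, stdev

# Hypothetical daily shipment counts (invented for illustration).
# In an exercise, this text would typically come from a data file.
RAW_CSV = """day,shipments
1,120
2,135
3,128
4,150
5,162
6,158
7,171
"""


def load_rows(text):
    """Collect data: parse CSV text into (day, shipments) tuples."""
    reader = csv.DictReader(io.StringIO(text))
    return [(int(r["day"]), int(r["shipments"])) for r in reader]


def fit_line(xs, ys):
    """Analyze: fit y = slope * x + intercept by ordinary least squares."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar


def text_chart(rows, scale=10):
    """Visualize: one row per day, one '#' per `scale` shipments."""
    return "\n".join(f"day {d}: {'#' * (n // scale)} {n}" for d, n in rows)


rows = load_rows(RAW_CSV)
days = [d for d, _ in rows]
counts = [n for _, n in rows]

slope, intercept = fit_line(days, counts)
forecast_day_8 = slope * 8 + intercept  # extrapolate the trend by one day

print(f"mean={mean(counts):.1f} stdev={stdev(counts):.1f}")
print(f"trend: {slope:.2f} shipments/day, forecast for day 8: {forecast_day_8:.0f}")
print(text_chart(rows))
```

A beginner could change only the values in `RAW_CSV` or the `scale` argument and observe the effect directly below the cell, while a more experienced student could replace `fit_line` with a model of their own choice.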