# RRI Script 204-1: What is explainability?

###### tags: `RRI Skills track`, `explainability`, `section 1`

**Slides and Notes**

[TOC]

## Overall notes

:::info
* ~~Intro / module overview is not in the root files.~~
* ~~Slides 4 + 5 are duplicates.~~
* This section relies heavily on imagery (not just as additional content but as crucial to the learning materials). Will we be providing alternative text for the slides w/ images where the image itself is key to the script/notes?
* Slide 8? May be unnecessary.
* Maybe swap slide 29 ahead of 28? Namely, the turkey continuation first and then the explanation.
:::

## Slides

### 1

Welcome to the Turing Commons module on the Explainability of AI systems! This module is part of our Responsible Research and Innovation skills track. In particular, it is one of the track's optional modules.

### 2

Here is an overview of the Skills Track. It starts with two core modules: What is Responsible Research and Innovation? and The Project Lifecycle. Then we have five optional modules, which we call the SAFE-D modules because they correspond to the five SAFE-D principles: Sustainability, Accountability, Fairness, Explainability (this one), and Data Stewardship. So, without further ado, let's get into it!

### 3

Like all of our modules, this one is split into four sections. The first section is an introduction to what explainability is. The next three correspond to different aspects of explainability: project transparency, model interpretability, and situated explanations.

### 4

Now, let's focus our attention on what explainability is and how we can define it.

### 5

This first section, on what explainability is, will be divided as follows. We start with a short introduction, then move on to the scope of explainability and a working definition of explainability, and finish off with the factors that support explanations.

### 6

Let's make a start on the introduction!

### 7

The image on this slide shows a palatial scene that looks both futuristic and medieval, with a divine sort of feel. It is called 'Théâtre D’opéra Spatial', and it was created by the digital artist Jason M. Allen of Pueblo West, Colorado. When Mr. Allen entered it into the Colorado State Fair annual art competition in 2022, it won first prize, possibly because of how intricate and captivating it is. However, this caused quite a bit of controversy. The reason is that Mr. Allen did not 'create' the artwork using the traditional software tools of digital artists, such as Photoshop or Illustrator. Instead, all it took for Mr. Allen to create this image was a single text prompt that he entered into a generative AI model known as Midjourney.

Like more traditional forms of 'fine' art, such as painting and sculpture, which can be messy and organic, digital art can require hours of hard work. But AI images generated by software such as Midjourney can be created in a matter of seconds, using carefully chosen prompts. A short prompt was all it took to generate the stunning 'Théâtre D’opéra Spatial'. However, in other cases, it is clear from minor details that the AI lacks any real understanding of certain concepts or objects.

### 8

Take the following prompt: forward slash, imagine, prompt, colon, unexplainable AI, surrealism style. That is, '/imagine prompt: unexplainable AI, surrealism style'.

### 9

The image generated here, although aesthetically interesting, does not really depict what it was prompted to depict: unexplainable AI. This shows that, however impressive image-generating AI might sometimes be, it can often get things quite wrong, because the system has no real understanding of what it is producing.
### 10

Returning now to the example of 'Théâtre D’opéra Spatial'. When Mr. Allen entered his image into the competition, which as we know he ended up winning, the public response was mixed. With good reason, other artists whose livelihoods depend on the public valuing and appreciating their hard work were anxious and angry about the widespread adoption of AI-generated art. Many criticised the fact that no real skill or effort was needed to create such images, and argued that the widespread use of AI image generators would devalue their physical efforts and the role of the arts in society. Interestingly, Mr. Allen initially refused to reveal the prompt he had used to create 'Théâtre D’opéra Spatial', saying, "If there's one thing you can take ownership of, it's your prompt." You may have an opinion on this controversy, but this module is not about digital art, the value of AI art, or whether AI images are "art".

### 11

So, you're probably now asking: what does an AI-generated image have to do with explainability? To put it simply, it's the fact that no one can really explain how these systems produce the images they do. Like other use cases of generative AI, such as large language models, they are a great example of black box systems: systems that produce outputs or arrive at conclusions without providing any explanation as to why they did so. You may not think the ability to explain how a system produces images matters much from a societal perspective, as the harms such systems can cause do not arise from the fact that the generative models are unexplainable. Or, you may think that the value generative AI brings to such sectors is important regardless of the model's transparency or interpretability. Whichever view you take, a plausible case could be made both for and against it.

### 12

Using generative AI such as Midjourney for entertainment may not give rise to many significant concerns for society, but the context of its use matters a lot. We will justify the need for the following properties in the use of AI systems. Without them, the risk of harm from the use of AI systems rises.

### 13

First, we have transparency. For algorithmic tools to be transparent, there should be no barriers, whether practical or epistemic, to accessing the relevant information about them.

### 14

The second property is interpretability. This refers to the existence of the relevant information and tools required to understand how the AI model or system works.

### 15

Finally, we have accessible explanations. This refers to the capacity that algorithms have to support the communication of accessible explanations.

> [name=bneaturing 12-14 are quite difficult concepts / need further explanation if the learner has no prior knowledge]

Don't worry if these properties seem a little complicated or abstract. There will be plenty of time to go into them. This module is about understanding what these properties mean and what they need from us, as well as when and why they are important.

### 16

We've covered an introduction to what explainability is about, and we're now going to move on to the scope of explainability. Namely, what will and will not be covered as we unpack explainability.

### 17

Our use of the term explainability is holistic. This means we are using explainability as an umbrella term that covers different related concepts, such as interpretability, transparency, and situated explanations.
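:::info
A note for learners who do have a data science background: the minimal sketch below is one possible way to make these properties more concrete. It assumes Python and scikit-learn, which are not otherwise part of this module, and uses an example dataset bundled with that library. The point is simply that some models, such as a shallow decision tree, expose their reasoning in a form a person can read and communicate, whereas a generative model like Midjourney offers no comparably human-readable account of how an output was produced, which is part of what makes it a 'black box' in the sense used on slide 11.

```python
# Minimal, illustrative sketch (assumes scikit-learn is installed).
# A shallow decision tree is a standard example of an interpretable model:
# its learned decision rules can be printed and read directly by a human.
from sklearn.datasets import load_breast_cancer
from sklearn.tree import DecisionTreeClassifier, export_text

# A small example dataset bundled with scikit-learn, used here purely for illustration.
data = load_breast_cancer()

# Keeping the tree shallow means the whole model remains human-readable.
model = DecisionTreeClassifier(max_depth=2, random_state=0)
model.fit(data.data, data.target)

# Print the learned decision rules: this is the kind of 'relevant information'
# that supports interpretation and, in turn, accessible explanations.
print(export_text(model, feature_names=list(data.feature_names)))
```
:::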
### 18

The focus of this module is on understanding why explainability matters when discussing and learning about responsible research and innovation. Before we continue, however, there are two relevant caveats that must be mentioned.

### 19

Firstly, this is not a module that teaches data scientists or machine learning engineers how to use or implement existing explainability methods or techniques. That is not the aim of the module. Instead, the aim is to explore the concept of explainability in AI systems, what its different properties or aspects are, and why they are important for responsible research and innovation.

### 20

Secondly, this module aims to be consistent with widely agreed uses of concepts and terminology, but it also has its own unique perspective on the topic. It is important to note that there are multiple understandings of and perspectives on this topic besides the one presented here.

### 21

Having covered our first two subsections, let's move on to what explainability really is and build up a working definition of the word together.

### 22

Before we discuss explainability, let's look at a closely related concept: 'interpretability'. On screen, there's a quote by Tim Miller from Explanation in Artificial Intelligence: Insights from the Social Sciences, 2019. The quote reads: "Interpretability is the degree to which a human can understand the cause of a decision." In the context of data-driven technologies, interpretability concerns how well a human can understand the cause of an output from models such as Midjourney. Thinking back to the beginning of the module, this would mean understanding why the generative AI model took the prompt given by Mr. Allen and produced the particular image of 'Théâtre D’opéra Spatial'. Interpretability can also be measured in degrees. It is often the case that a model is more or less interpretable depending on factors such as 'who is doing the interpreting' and 'what is being interpreted'. We will take a closer look at these factors later in the module.

### 23

Let's apply what we said in the previous slide to interpreting images. On the left, we have an image of a chess board, and on the right, a chest X-ray image. Now ask yourself whether you can interpret the meaning behind these images. The image on the left highlights a winning move for one of the players, something that can only be interpreted if you know how to play chess. On the right, the chest X-ray shows lungs with a pulmonary disease, something that may not be obvious to someone who has not spent a lot of time looking at chest X-rays. Unless you are a chess player or radiologist, you will not be able to interpret the significance of the patterns in either of these images. And, depending on how complex the chess position or physiological issue is, it may be that only highly experienced chess players or radiologists could interpret such images.

### 24

Understanding the cause of a decision is no guarantee that the decision can be explained, and even when it can be, there is no guarantee that the explanation will make sense to the person you are explaining it to. For example, you're asked what time your partner is expected home from work. You reply, "three fifty-nine". The person now asks how you were able to make such a highly accurate prediction. If you responded, "They messaged me as they left the office", this would probably suffice as what we can call a 'folk psychological explanation'. That is, an explanation of the kind we use and hear in everyday life.
Most people would be able to work out from your explanation that your partner follows the same route home each day and that, traffic permitting, their journey takes around 25 minutes.

### 25

Now, let's look at another example. You have been asked to predict the behaviour of a complex, changing system, such as the local weather over the next 12 hours. Maybe, in this scenario, you are a data scientist working for a national meteorological centre, using data-driven technologies to improve forecasting. In other words, you're a specialist on the subject, and an explanation such as "our system provided me with a notification" is not going to cut it. As a specialist, your answer will need more detail.

### 26

These two examples, the arrival time and the weather prediction, highlight an important point about explanations. Both are situated within specific contexts, each with its own norms for what counts as a valid or acceptable explanation. In other words, the demands of explainability are contextual.

### 27

Our definition of explainability aims to capture how contextual it is: the degree to which a system or set of tools supports a person's ability to explain and communicate the behaviour of the system, or the processes of designing, developing, and deploying the system, within a particular context. Defining explainability in this way helps us highlight that it varies depending on the sociocultural context in which it is being assessed. This context sensitivity can also be true of interpretability. However, the definition of explainability places greater emphasis on communicating the reasons behind a decision or prediction.

> [name=bneaturing: not sure what is being said here] communicability of reasons

To see why, let's look at why it's important to be able to explain the behaviour of a system or its outputs.

### 28

The philosopher Bertrand Russell had a good, although morbid, example of what is known as the 'problem of induction'. A turkey that is fed by a farmer each morning for a year comes to believe that it will always be fed by the farmer. Each morning the turkey is fed adds a new observation that confirms the turkey's hypothesis that it will be fed by the farmer daily. Until, on the morning of Christmas Eve, the turkey eagerly approaches the farmer expecting to be fed, but this time it has its neck broken instead! The problem of induction that is highlighted by this cautionary tale can be summarised as follows:

### 29

What reasons do we have to believe, and to justify the belief, that the future will be like the past? Or, what grounds do we have for justifying the reliability of our predictions?

### 30

Being able to deal satisfactorily with the problem of induction matters because we do not want to be in the position of the turkey. We want reliable and valid reasons for why we can trust the predictions made by our systems, especially those that are embedded within safety-critical parts of our society and infrastructure. This is also why we do not just care about measuring the accuracy of our predictions. The turkey's predictions were highly accurate (roughly 364 correct mornings out of 365, or 99.7%, over the course of its life), but the one time it was wrong really mattered! Obviously, not all predictions carry the same risk to our lives. When we were making a prediction about when our partner would arrive home, we could all understand the uncertainties associated with our answer. Traffic can be unreliable, our partner might bump into a colleague, and journeys are subject to a lot of change.
In cases like this, we are OK with the level of variance in our predictions, and our peers understand that too. But in cases where the consequences of unreliable or false predictions are more severe or impactful, it is important that we can provide justifiable assurance to others. This means giving evidence for why it makes sense to trust the processes or behaviour of an AI system.

> [name=bneaturing: not sure how to put this in simpler language] it is important that we can provide justifiable assurances to others about the grounds for trusting the behaviour and processes of a system.

### 31

In summary, explainability is about ensuring we have justifiable reasons and evidence for why the predictions and behaviour of a model or system are trustworthy and valid. Understanding this communicative and social perspective also helps us appreciate why explainability is so important in safety-critical contexts, as well as in other areas, such as criminal justice, where false predictions carry high costs. If we don't have trustworthy reasons for the reliability and validity of a model's predictions, we are not going to want to deploy it in a context like healthcare, where people's well-being depends on high levels of accuracy or low levels of uncertainty. So, now we have a grasp of what explainability is and why it matters. What do we need to do to ensure it? We'll address this in the next section.

### 32

We've now addressed the first three subsections on what explainability is, so let's move on to the factors that support explanations in more detail.

### 33

The last part of this section will look at the three factors that support explanations in the context of data-driven technologies.

### 34

The first is the transparent and accountable processes of project governance that help explain and justify the actions and decisions undertaken throughout a project's lifecycle.

### 35

The second is the need for interpretable models, which are used as components within encompassing sociotechnical systems (such as AI systems). This means that, in order for a system's decisions to be explainable, the models used to reach those decisions must be interpretable, at least to a certain degree.

> [name=bneaturing: elaborate on this w/ examples maybe?] (such as AI systems)

### 36

The third factor is an awareness of the sociocultural context in which the explanation is required. Here, we should give special attention to potential communication barriers, with an emphasis on building explanations that can bridge them.

### 37

Are there any other factors you can think of that we may have missed or under-emphasised?

> [name=bneaturing: I'm not sure if this slide lends itself to an e-learning video?]

### 38

Let's reflect on all that we've covered in this section so far. We've looked at explainability as an umbrella term, capturing several important factors such as interpretability, transparency, and context. Requests for explanations are shaped by sociocultural expectations. For example, what constitutes a good folk psychological explanation of your partner's arrival time is very different from what constitutes a good explanation from the meteorological data scientist accounting for the following day's weather forecast. The problem of induction (let's remember our poor turkey) asks us to consider whether we have valid and reliable reasons for our explanations.
And finally, we introduced the important factors that support explainability, which include transparency, interpretability, and an awareness of the sociocultural context. In the next sections of the module, we will go deeper into these factors, starting with project transparency.