# TEA-DT Research Away Day

:::info
ℹ️ **Information**
- Time: 10:00–16:30, Monday 20th May
- Location: CSE/102/103, [Department of Computer Science](https://www.cs.york.ac.uk/contactus/), University of York
- Teams Details:
    - [Meeting Link](https://eur01.safelinks.protection.outlook.com/ap/t-59584e83/?url=https%3A%2F%2Fteams.microsoft.com%2Fl%2Fmeetup-join%2F19%253ameeting_MWQyODhiYjItNmZmZC00YmRlLThmOGUtMTc5ODdkZDgyNDA0%2540thread.v2%2F0%3Fcontext%3D%257b%2522Tid%2522%253a%25224395f4a7-e455-4f95-8a9f-1fbaef6384f9%2522%252c%2522Oid%2522%253a%252270bc8ec4-6733-497e-9c2a-a2fd7c6a7a06%2522%257d&data=05%7C02%7Ccburr%40turing.ac.uk%7C324a52284b234fd39a2808dc765bb503%7C4395f4a7e4554f958a9f1fbaef6384f9%7C0%7C0%7C638515383474566576%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C0%7C%7C%7C&sdata=eHdy4DplZJ6wG%2F%2F7yMcipRz0tkkDbNjJ8PZ%2F5HlNZTQ%3D&reserved=0)
    - Meeting ID: 368 558 885 09
    - Passcode: XbfSF7

🗒️ **Resources**
- Please see this note for the originally proposed ideas for discussion: https://hackmd.io/@tea-platform/ByVlvW2JR
:::

## Agenda

| Time | Description | Chair | Link(s) |
|-------|---------------------|---|---|
| 10:00 - 10:30 | Developing a vision statement for the project | Chris | [Miro](https://miro.com/app/board/uXjVKFy2kJU=/) |
| 10:30 - 11:00 | ~~Assurance of DTs survey~~ (Vision cont'd) | Sophie | |
| 11:00 - 11:45 | Topic 1: Empirical evidence for fit-for-purpose. | Ibrahim | [Paper](https://core.ac.uk/download/pdf/363148691.pdf) |
| 11:45 - 12:00 | Break | -- | |
| 12:00 - 12:45 | Legal scrutiny of topic 1 | York Team | |
| 12:45 - 13:30 | Lunch | -- | |
| 13:30 - 14:30 | Topic 2: Incorporating LLMs and multi-agent workflows. | ==Chair== | |
| 14:30 - 14:45 | Break | -- | |
| 14:45 - 15:45 | Legal scrutiny of topic 2 | York Team | |
| 15:45 - 16:30 | Project Roadmap (Review and Refine) | Chris | |
| 16:30 | Wrap up | -- | |

## Notes

### Topic 1: Determining the kinds of empirical evidence needed to evaluate the extent to which the approach/method/platform "works", including meaningful evaluation criteria.

- Safety Cases: An Impending Crisis ([Link](https://core.ac.uk/download/pdf/363148691.pdf))
    - Or, a flourishing paradigm?
    - Increased interest in safety cases beyond the safety community:
        - Increased precision through mathematics
        - Increased automation through model-based engineering
        - Improved confidence through uncertainty quantification
    - But the greater adoption has not resulted in greater evidence of effectiveness
- Are safety cases working?
    - Wrong question to ask, as for engineers the case is everything that has been done (i.e. the accumulated evidence). Less interest in the argument side of the case.
    - Instead, 'Which safety case methods, tools and notations are working?'
    - Is a particular method working for a particular domain, organisation, or technology?
- "The purpose of Safety Cases is to identify, assess and address serious risks to equipment and installations before it is too late."
- DS 00-56 definition: "A Safety Case is a structured argument, supported by a body of evidence, that provides a compelling, comprehensible and valid case that a system is safe for a given application in a given environment."
- Claims made in Health Foundation about, for example, promoting structured arguments are empirical claims and should be testable. Evidence of benefit needs to be gathered.
- Similar concerns raised in [evidence-based policy](https://academic.oup.com/book/11611) (e.g. Cartwright) and [evidence-based medicine](https://www.bmj.com/content/348/bmj.g3725) (e.g. Trisha Greenhalgh)

![Screenshot 2024-05-20 at 11.20.33](https://hackmd.io/_uploads/HJ2f4sumA.png)

#### Questions

1. What does it mean for a safety case to work?
    - Need to differentiate between process and outcomes.
    - Legal perspective:
        - Industry versus public sector. Heavily regulated industry (e.g. fines for breaching regulations, independent of harm) versus low-regulation sectors (e.g. redress for material harms).
        - Case of healthcare DT. Did the doctor act in a reasonable manner (i.e. would a responsible body of doctors have acted that way)? If the doctor did act reasonably in using the digital twin, then the doctor would be in the clear.
3. What would a better definition be for TEA cases, given the plurality of goals?
4. How could we go about evaluating the efficacy of TEA? How would we operationalise efficacy?
5. How do you assess the organisational readiness for TEA? How can you improve organisational readiness?

### Topic 2: Exploring the use of large language models and multi-agent workflows to support the development of assurance cases.

**Can** this be done? **How** could it be implemented into the TEA platform? **Should** it be done?

- Suggested topic for next RAD: An end-to-end scenario demonstrating how the TEA platform fits within the ML system engineering process, from requirements elicitation to deployment (would be great to have a look at the project cycle and develop a pretend case study or something?)
    - Working on 'posters' in breakout groups, then reconvening to discuss group results