# UK TRE Community day 2023 discussion
Note: attendees will be dropped from the public report; paper notes are still to be processed.
Full day notes: [UK TRE meeting - September 4th 2023](https://hackmd.io/Hdyu2pc6Seq1bPof0xNTYg)
## Topic: Multi-TRE analysis: challenges, governance requirements, federation
[OG notes](https://hackmd.io/s7qqXKcdTJev5urgjZ2WLA?both)
### Attendees
- Chair: Carole Goble
- Note-takers: Simon Thompson and Peter MacCallum
- Participants: Fatemeh Torabi, Guneet (Uni of Dundee), Calum Macdonald, Peter MacCallum
### Summary
Take-home message: it's not about the technology. In fact, the more TREs are technically enabled, the greater the risk that they are not fit for purpose for true operation and not trusted for federation. Process and Responsibility => Trust
### Roadmap
**Short term**: understanding what we have
-- define what a TRE is, with respect to **multiple** TREs, within a PEST framework that highlights issues that are not just technical: for example the diversity of TRE models, the business models of TREs, and where risk, responsibility and accountability lie. The definition should include certifiable PROCESS as a core pillar (shared SOPs). Multi-TREs require new processes.
-- define a TRE Maturity Model that builds on the above to develop a more objective model of TRUST, RISK and RESPONSIBILITY for inter-TRE data exchange. Could be used to assess, compare, and facilitate trust between TREs.
-- a common language scale for the ‘tiers’ of TREs suitable for different levels of inter-TRE sensitivity.
-- identify and clarify PEST bottlenecks with examples
**Medium Term**: shifting to newer ways
-- review different architectures and processes for working between TREs
-- what would be just enough with what we already have (e.g. 5SROCrate as m-TRE middleware using current processes)
-- what m-TRE processes would we need to introduce
-- the role of trusted intermediaries (brokers, federated analytics services) to take on risk and responsibility and reposition the Data Sharing Agreements.
-- e.g. global identity services linking identities and records: who takes responsibility?
**Long Term**: radical shift
-- PPIE education outside the self-selecting PPIE bubble to counter mistrust of government and conspiracy theories
-- Expectation that data is owned by the NHS?
-- rethink of data holdings and services from Data Warehouses to Data Fabrics.
## Topic: Current state of the art re data linkage/federation/AI&ML&LLM across infrastructures: federation, governance, safe output methods
OG notes https://hackmd.io/Ds9PB7_2QSCtvjpcEvDBnA?both
### Attendees
- Chair: Mark Mumme, University of Bristol
- Note-taker: Imre Draskovits, Newcastle University
- James Rafferty, Swansea University
- Chuang Gao, Health Informatics Centre
- Claire MacDonald, NHS
- Polly Eccleston, University of Bristol
- Richard Williams, University of Manchester
### Summary
- Federation: it is important to be able to identify individuals with confidence across TREs, which can be a challenge with common names, initials or pseudonymized data; yet linking health data (inclusive of personal details) with other datasets is itself a potential problem. Common criteria are required for TRE communication.
- AI & ML: There is confusion between the terms AI & ML and 'Statistical Modelling'. Introducing AI into TREs is a challenge: internet connection (offline machine learning models), a multiplicity of AIs leading to different learnings from the same dataset, checking models is problematic and difficult, and uncertainty about a model's results and contents raises the question of the model's authenticity. Yet statistical models lack the precise pattern finding of popular AI models.
- Governance: the process is repeated a lot due to a lack of communication between committees. In a project between TREs, each TRE will have its own approval process; ideally a multi-TRE project requires a single approval process, with the decision accepted across the other TREs.
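
As an illustration of the federation point above, one commonly used approach (not something the session itself proposed) is to derive a link key from a keyed hash of a normalised identifier, so that two TREs holding the same secret produce the same key without exchanging the raw identifier. The sketch below is minimal and rests on assumptions: the function name `link_key`, the normalisation rule and the idea of a secret shared via a trusted intermediary are all hypothetical.

```python
import hashlib
import hmac


def link_key(identifier: str, secret: bytes) -> str:
    """Return a hex link key for an identifier (hypothetical helper).

    Assumption: both TREs hold the same `secret`, for example issued by a
    trusted intermediary, and agree on the same normalisation rule.
    """
    # Normalise so that spacing and case differences do not break linkage.
    normalised = "".join(identifier.split()).upper()
    # Keyed hash: without the secret the key cannot be recomputed from a guess.
    return hmac.new(secret, normalised.encode("utf-8"), hashlib.sha256).hexdigest()
```

Two TREs calling `link_key` with the same secret obtain matching keys for the same person, while a different secret yields unrelated keys. This does not address the governance question of whether the linkage should happen at all.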
### Roadmap
- Current state of the art is the overarching question -- needs a TRE panel to decide what is state of the art
- Single 'panel' per specialty (e.g. health, crime) that deals with specific projects, whose members are additionally members of the national TRE supervision
## Topic: TRE sustainability and operations
OG notes https://hackmd.io/KxhUOGDXRU2y0kJQGbo6BQ?both
### Attendees
**James Grant happy to contribute to workshop report**
- Chair: James Grant (awsgrant@amazon.co.uk), Amazon Web Services, supports UK AWS customers running TREs on AWS, and SATRE in this context
- Note-taker: James Grant
Session 1:
- Aaron - HIC, Dundee
- Richard, PM Swansea, TRE host 3 deployments elsewhere
- Steven Newhouse, Barts, Precision Medicine requires a TRE, needs to fund itself after initial funding
- Ben Thomas, UCL Principal RIE, replatform for on-premise
- Emma Squires, PM Dementia Platform UK
Session 2:
- Paul Colville-Nash, UKRI - MRC Data Science Program Manager
- Susan - ONS - Access to data officer
- Thomas Tamblyn - Statistical Support officer (ingress and egress)
### Summary
Sustainability needs to be long term, but how do we plan for it when the scenario may change in 5 years?
There is also an issue with research funding: a TRE is a service, yet funding requires it to appear to be doing something new each time, and funders can prefer not to pay for infrastructure (there are also challenges with cost estimates and under/over-expenditure).
There are several funding variables and open questions: whether TREs should be free at the point of use (distributing costs against overheads), use a membership model, charge a project fee, or offer standard features for free while charging for high-demand ones. In all cases at least some core funding is required to ensure continuity, specialisation and quality.
What we want to ensure is that a public service exists.
### Roadmap
A roadmap should address
- Technical knowledge, skills, TRE staff skillsets
- Why do this? It has to be part of retaining people
- Localising staff makes this easier; central models push more towards thinking about pay
- To address retention
- Pipeline of talent
- Can TRE model work in R
- Not just technical, IG, where can I get more information
- Consultancy
- Embedded technical/operational/IG knowledge relevant to the problem.
- Research - teaching balance.
- Funding
- Lots of politics, in HPC communities, good for those who get it. Not good for those who have to resort to begging
- Not necessarily good for SDE
- Analysis will follow data
- People with data will need to bolt on compute
- HPC allocation modelled SDE account for compute/storage costs
- Why should SDE and HPC be considered differently
10 year plan - scope for accreditation
Chartered research infrastructure?
CSP platform neutral certifications for Data/Cloud
Infrastructure sustainability
People:
- Infrastructure/Developers
- Operations
- Data Scientists
## Topic: Safety and security of Python and R package import into TREs
OG notes https://hackmd.io/gCcnmJ0dR_CY-2c3U2Ke_Q?both
### Attendees
- Note-taker: Richard Williams
-
### Summary
Currently TREs allow access to PyPI and CRAN for less-sensitive data but only specific packages for more sensitive data. Yet there is a variety of current approaches (some TREs have CRAN access while others do not). Despite these controls, a malicious Python/R package could simply be written inside the environment.
It is challenging to establish the line between R & Python files and AI/ML models.
Regarding **egress**, there are challenges around how labour-intensive it is, for which there are some automated tools.
### Roadmap
- Is it possible to lock down a TRE sufficiently so that unlimited ingress can be allowed? If so, this is the best solution as there is no friction for researchers. It also allows future ingress items such as LLMs, neural nets, etc.
- If not, then can TREs collaborate to whitelist (and blacklist) packages to prevent each one needing to repeat work?
- Central register / co-ordination
- But what to do about versioning?
- Could have a dual model:
1. Docker-based containerised TREs that are completely locked down, meaning that any ingress is allowed
2. TREs with a list of packages that are allowed, and you need to just use those. Process to request new packages
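
The shared register and the versioning question above can be sketched as follows. This is a minimal illustration under stated assumptions, not an existing tool: the class `PackageRegister`, its method names and the three outcome strings are hypothetical, and approval is recorded per exact version so that a new release always goes back through review.

```python
from dataclasses import dataclass, field


@dataclass
class PackageRegister:
    """Hypothetical shared register of package decisions across TREs."""

    # (ecosystem, name) -> set of exact versions that have been reviewed
    allowed: dict = field(default_factory=dict)
    # (ecosystem, name) pairs blocked at every version
    blocked: set = field(default_factory=set)

    def approve(self, ecosystem: str, name: str, version: str) -> None:
        self.allowed.setdefault((ecosystem, name), set()).add(version)

    def block(self, ecosystem: str, name: str) -> None:
        self.blocked.add((ecosystem, name))

    def check(self, ecosystem: str, name: str, version: str) -> str:
        key = (ecosystem, name)
        if key in self.blocked:
            return "blocked"
        if version in self.allowed.get(key, set()):
            return "allowed"
        # Unknown package or unreviewed version: route to the request process.
        return "needs-review"


register = PackageRegister()
register.approve("pypi", "numpy", "1.26.4")
register.block("pypi", "example-malicious-package")  # made-up name
```

Here `register.check("pypi", "numpy", "1.26.4")` returns `"allowed"`, while any other numpy version returns `"needs-review"`. A real register would also need name normalisation rules per ecosystem and a way to record who reviewed what.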
## Topic: Community-based efforts and collaborations in public involvement and engagement
OG notes https://hackmd.io/VU5u1MqRSIauDtPZrY84MA?both
### Attendees
- Chair: Westley
- Note-taker: Katie Oldfield
- Peter Giles
- Michelle
- Hari
- Ballint
### Summary
This was an informative discussion that mostly gave attendees the opportunity to ask PPIE experts about their experience so far.
PEDRI, in which DARE UK is involved, was mentioned as an ongoing reference effort, and the group discussed whether or not the community is currently doing enough and the best way to do more. It is necessary to include PPIE in funding, make efforts to simplify language, and ensure an impact loop for public panels (making sure participants learn about what they contributed to).
### Roadmap
- What would a solution to this problem look like?
Range of public engagement and involvement routes
Impact loop back to public panels
Support on engaging with members of the public
- What resources would be needed (people, time, funds, infrastructure etc.)?
Funding to recruit members of the public who might not normally get involved. Examples of using Sortition and Coal Rabie and IPSOS
Utilising the expertise of external recruitment agencies
Training and support on how to communicate with members of the public for academics and 'technicians'
- How can this community support you in getting them?
TRE specific PIE groups
Embedding PIE skills in people's careers
- What working groups/orgs are already working on this, if any? How can we collaborate with them effectively?
PEDRI
## Topic: Cloud vs on-prem TREs: costs, constraints, pros & cons
OG notes https://hackmd.io/op0_MMVSRwi-Hkdo8nOUow?both
### Attendees
- Note-taker: Ed Chalstrey
- Jianpeng Meng
- Raymond Huonond
- Loki
### Summary
The main decision drivers are security and cost. Cloud is more flexible for projects with different funding sources and does not require research institutions to have an expensive data centre, but it does not offer the highest levels of security.
A potential solution is a hybrid model where you get a cloud-like infrastructure on on-prem compute.
- Cloud provision via Jisc (as opposed to direct with the cloud provider) can be cheaper and it also handles SSO: https://www.jisc.ac.uk/forms/uk-access-management-federation-sign-up#
- Resources: Google RADLab: https://cloud.google.com/blog/topics/public-sector/googles-new-rad-lab-solution-helps-spin-cloud-projects-quickly-and-
### Roadmap
- hybrid model
- Solution that is cloud-agnostic and could also run on on-prem hardware
## Topic: The role of independent TRE providers in relation to the NHS national and regional SDEs
OG notes https://hackmd.io/rbA3vtAjT7m7NwEhiWoQ1w?both
### Attendees
- Chair: James Robinson
- Note-taker: Danny Silk
- Andy Boyd
- Emma Turner
- Chris Andrew
- Tim Driscoll
- David Seymour
- Chris Appleton
- Seb Bacon
### Summary
This discussion made evident the multiplicity of current efforts and the difficulty of knowing what is happening in this space or the direction it is taking.
The discussion identified many potential areas of work and collaboration (see [OG notes](https://hackmd.io/rbA3vtAjT7m7NwEhiWoQ1w?both), from line 86). The use of NHS data held in SDEs via specific TREs can expand the utility of this data, but it requires a lot of coordination at different levels: not only between independent TREs but also across regions and institutions.
Challenges arise in how to make this coordination and alignment effective, reconcile different interests (commercial and public) and ensure public and clinician trust.
The role of HDR UK and the TRE Community is seen as a positive influence.
### Roadmap
There is a lot of uncertainty, making a roadmap difficult; therefore the next steps are:
- Better interaction with NHS
The TRE community can support with awareness of initiatives and collating information on who is doing what.
Current groups to be aware of and start with:
- HDR UK
- Each individual SDE
- Southern consortium of SDEs
- NHSE in flux: National SDE: Michael Chapman
Roadmap
1. Vision + strategy
2. Common principles, protocols, assets
3. Expanded communities of practice + knowledge share opportunities -> Consider national, regional, local links
4. PPIE approach + building trust/confidence with the clinical communities
Workpackages/What would be helpful:
1. Clear vision/value story on why TRE+SNSDE add/evolves
2. One pager on key protocols, ways of working + frameworks to strengthen consistency of messaging
3. Alignment of related data programmes (eg R+D vs FDP)
4. Community of practice + shared assets/lessons/insights so SDEs build on TRE success to date
5. User (eg. researcher) assets: needs, goals, decisions, pain points, requirements
Resources
Consider where expertise sits across:
A. Independent TREs
B. National influences
C. National SDE [RN]
D. SNSDEs
E. Local researchers
F. Common entities/stakeholders in health data space
## Topic: Future governance of the SATRE Specification
OG notes https://hackmd.io/ijXlgdyOT5y2NSg9pfhKow?both
### Attendees
- Chair: Chris Cole
- Note-taker: Jim Madge
- Tim Machin
- Geoff Gray
- Cat Morris
### Summary
SATRE funding ends in October, but the plan is to continue work on the specification. The aim is for it to be community owned, but what the governance actually looks like is uncertain. SATRE aims to sit between high-level accreditations (CE+, ISO27001) and the low-level detail of a particular implementation, and to include demonstrations of how TREs are meeting it.
The next steps appear to be socialising the specification and building a peer network.
### Roadmap
- Identify the community and what they need.
- These become the targets of the next phase of SATRE.
- Could be
- Peer network
- Auditing/evaluation support
- Organise networks around the pillars
- May help coordinate/focus effort
- Identify contribution mechanism, consensus mechanism
- What would SATRE require to have confidence?
- Part of the HDR UK innovation portal
- Endorsement from highly regarded, trusted bodies, for example, HDR UK, UK SeRP, ADR UK, ...
- Clear mapping, roadmap to ISO27001
- Clear guidance on roles _including_ expected time and skills for that role holder. Avoid TRE staff being overloaded or given unreasonable tasks
- Too much of an imposition? Too specific?
- Guidance on the economics of TREs
- Build your own
- Buy an off-the-shelf solution
- Cloud vs. On-prem
- People costs
- Identify how to fund staff
- First 'round' was DARE UK
- More resources from funders, _e.g._ HDR UK
- What should a dedicated SATRE person do?
- Promotion
- Stewardship of the standard
- Community manager
- User support/outreach
- Engagement with other communities, _e.g._ SDAP
- Stability of funding
- Research funding is not guaranteed
- Ask for money/people donations from SATRE users
- Fee for formal accreditation against SATRE
## Topic: Access and rights metadata to support process choreography and interoperability (e.g. generic researcher applications and authorization)
OG notes https://hackmd.io/hAthdFydSvqeiyCcSXdhLQ?both
### Attendees
### Summary
### Roadmap
## Topic: Addressing data harmonisation between different datasets: do TREs have a role?
OG notes https://hackmd.io/_jknJLGmRNeGD9obs2PIAw?both
### Attendees
- Chair: X
- Note-taker: X
- Sam Cox
- Callum Macdonald
- Tim Driscoll
- Fatemeh Torabi
- Susie
- Emma Turner
- David Seymour
- James Robinson
- Sarah Stewart
### Summary
Data+Analysis=Timely Processing
- Harmonized/OMOPed
- TRE governance barriers
- Reliability-validated?
- TRE role:cross project share
- OMOPing data sources & adding TRE-specific terms into main repositories
- Mapping tools
- TREs can delegate (Coconnect, Discovery, Feasibility)
- Clinical input
### Roadmap
## Topic: Sight unseen: how far can we go with keeping data hidden from users?
OG notes https://hackmd.io/vNbwgr5HSHubY5kJ1B_HPQ?both
### Attendees
### Summary
This is the model of [OpenSAFELY](https://www.opensafely.org). Questions explored were how to ensure that the provided metadata is sufficient, how to extend the approach to more complex data (highly relational/linked databases) and the implied need of code review before running on actual data.
In summary this can be done but there are limitations.
### Roadmap
## Topic: Governance of the UK TRE Community
OG notes https://hackmd.io/1O4yIor-TVi3Ln8wIApOLw?both
### Attendees
- Chair: Hari
- Note-taker: Simon
- Tim Hubbard
- Chris Cole
- Will Cocrombe
- Boyd
### Summary
The discussion centred on the purpose and governance of the community, trying to reach a balance between acting as convenors and still providing enough content and direction not to be an "empty" place
- Unique selling point of UK-TRE: Diversity of audience, and pragmatism: people that are doing something
- Danger of just listening is you don't share your existing knowledge of what will/won't work
- Should put out position statements? Say things if you don't like something? The community should reach a point where what we say is respected. More powerful than individual submissions.
- What should UK-TRE do?
- Be careful not to become just a bureaucratic institution that has some funding, people, writes reports.
- Balance
- Maybe a network that feeds up to DARE/HDR/ADR?
- USP would be it's practical, diverse, not duplicative, ideal audience for people at top to bounce ideas off
- Proper focus groups would be much more expensive
- Some funding for community to organise meetings like this
### Roadmap
- What would a solution to this problem look like?
- Ensure meetings remain attractive, not too officious
- Lightning talks good, reduces duplication
- Networking opportunities
- Long lunch
- People willing to invest time to travel
- "Stir people up and let them go"
- Beach! 🏖️
- No different from what we've got now
- More recognisable branding
- A home? What does "home" mean?
- A formal recognisable figurehead
- What resources would be needed (people, time, funds, infrastructure etc.)?
- Funding for someone to be a formal chair of UK-TRE
- Neutral funding for someone to run community, not funded directly by a single institution
- Maybe multiple people? E.g. coordinator, chair, community manager (junior/senior?), technical?
- Elected chairs to propose direction/funding? Probably too much.
- Instead have a steering committee
## Topic: What are the detractors to standardisation and working together? How can the reasons for the tension work alongside standardisation?
OG notes https://hackmd.io/F7M6sps0TmS-NQGGCfgGKQ?both
### Attendees
- Chair: Rob
- Note-taker: X
### Summary
- Everyone loves a standard as long as it's theirs: because implementing standards = real work on operational systems, everyone has an interest in keeping the standard as close as possible to what they already have.
- Rationalising these different approaches needs space and time to coordinate joint R&D activity that is separate from, but connected to, actual operations.
- We also need to recognise that coordinating across multiple TRE providers will (a) be slower than operational timescales, and (b) need dedicated people who are not trying to run ops at the same time.
- Currently, R&D is typically funded by competitive grants to "innovate", and operational expenses are often top-sliced from these grants.
- Separating operational funding from innovative R&D grants is one thing that would help.
- So: separating ops teams from R&D teams in both people and funding terms is the biggest single help.
### Roadmap
- Funding model evolution
- Speed dating: understand who you can collaborate with for mutual benefit
- Value of this activity seen by organisations / funders