Camila Rangel Smith

@VTC7yAqQRpezaQv_KTSPJQ

Joined on Jan 16, 2019

  • Example questions to review paper: what do they do with the input data? What model do they use (what's the underlying self-supervised learning task)? What do they obtain (it seems they often obtain a BERT model, which is encoder-only, so you get a bunch of embeddings)? What do they test this stuff on? Shaky foundations The review examines 84 foundation models trained on non-imaging EHR data, creating a taxonomy of their architectures, training data, and potential use cases. Most models are trained on small clinical datasets like MIMIC-III or broad biomedical corpora like PubMed and are evaluated on tasks that do not necessarily reflect their utility in health systems.
  • --- tags: DSWB --- # DSWB - General meeting notes
  • --- tags: DSWB --- # DSWB - Capacity building and open science WG
  • Notes like this are copied from the documentation for reference to the bullet point. A and E Reference to MINISPARRA: 'we are utilising MINISPARRA’s example data which is...' Remaining TODO: 'a patient id id, a date TODO update header at which the admission occured, an attendance_type, and three' Data description is not reflective of the data: From the first row in this example we can see that patient 2 was admitted on the 16th of August 2016, they had 3 unique diagnoses, 104, 103, and 102. Data when running the script (and in the GitHub pages): Screenshot 2024-03-26 at 11.09.30
  • We value the participation of every member of our community and want to ensure that every contributor has an enjoyable and fulfilling experience. Accordingly, everyone who participates in the {{ project_name }} project is expected to show respect and courtesy to other community members at all times. In the interest of fostering an open and welcoming environment, we as contributors and maintainers are dedicated to making participation in our project a harassment-free experience for everyone, regardless of age, body size, disability, ethnicity, sex characteristics, gender identity and expression, level of experience, education, socio-economic status, nationality, personal appearance, race, religion, or sexual identity and orientation.
  • Questions marked * MUST be completed. Section 1: Background 1. Project appraisal meeting* https://github.com/alan-turing-institute/Hut23/milestone/15 2. Project proposal name/working title* Building a “Google Maps” for molecular and cellular biology 3. Brief description*
  • Reporting period: 01/04/2023 -- 30/09/2023 Notable/substantive activity/outcomes in research progress Affinity-VAE We have developed a machine learning method named Affinity-VAE (Variational Autoencoder) which enhances the performance of $\beta$-VAE by increasing the interpretability of the learnt representations. In the case of Cryo-ET, this is achieved by incorporating our prior knowledge of protein structure into the learning problem. Our method has been implemented in an open-source library in collaboration with the RFI and STFC. The package is publicly available on The Alan Turing Institute GitHub organisation and is under continuous improvement and development. https://github.com/alan-turing-institute/affinity-vae Leveraging research activities to build partnerships Francis Crick Institute: Following the Living Systems Symposium that we organised, we talked with James Briscoe, Associate Research Director at the Francis Crick Institute. He expressed an interest in developing a deeper partnership with the Turing.
  • Questions marked * MUST be completed. Section 1: Background 1. Project appraisal meeting* Link to the relevant milestone. 2. Project proposal name/working title* D&S Careers Year 2: Internships and Summer Experience 3. Brief description*
  • I'm Camila Rangel Smith, a Senior Research Data Scientist at the Research Engineering Group. We are a group of around 45 research data scientists, software and computer engineers who work across all programs of the Turing. Although we are a technical team, from early on we've been committed to embedding Equality, Diversity and Inclusion in what we do, and this is realised by the EDI service area, which is a formal activity within the group with protected time allocations that I lead. [name=May] Thinking about Fede's point: You could say something like "Within the research engineering group, I take the lead for the EDI strategies." [name=May] I think you could emphasise the quick growth in team size to show why you focused EDI on recruitment. E.g. In the last x years my team (note: my is important here) has grown from 3 data scientists to 45 members. This has given us the opportunity to think hard about how we recruit. [name=myyong] realised by the EDI service area, which is a formal activity within the group with protected time allocations that I lead. This is super strong. You could emphasise it: E.g. My team's business model is that we cover our own costs through project work. We show our commitment to EDI by ringfencing a portion of our budget to buy team members' time allocations so that EDI work falls within our work hours and responsibilities.
  • Monday Slides https://thealanturininstitute-my.sharepoint.com/:p:/g/personal/jroberts_turing_ac_uk/EcTYTWo1MF5NqvdsLEXzneIB9OmdLqfxg8ltLMhjV7O7yg?e=54CqKW Zoom link https://turing-uk.zoom.us/j/91940918296?pwd=a3huV1lRUDRaMVJCNmluVVhOamlUQT09 Installation instructions Docker is not expected to work. Using conda, follow the instructions and set up the database in the following way
  • May 11: Testing the effect of the affinity matrix Using identity matrix Affinity Training CM: Validation CM: Embeddings:
  • Black Tests pre-commit Affinity-VAE for disentanglement, clustering and classification of objects in multidimensional image data. Mirecka J, Famili M, Kotanska A, Jurashcko N, Costa-Gomes B, Palmer CM, Thiyagalingam J, Burnley T, Basham M & Lowe AR. doi:10.48550/arXiv.2209.04517 Installation Installing with pip + virtual environments Note: This has been tested in the refactor branch.
  • 04/04/2023 Brainstorming questions for the survey INTRO: Introduce Global North and Global South communities of practice define: LA-CoNGA, Metadocencia, The Turing Way Context, next steps General tool: https://www.bikeprinciples.org
  • 2023-04-05 Existing collaboration with Rothamsted Looking to continue this work, animal agriculture focus partners incl Goldman Sachs Pieter to stay until August Out-of-budget now on hold Not much news on AI hubs/AI for net zero MSF-AI institute, responsible, centre for data driven discovery, workshop, open to all https://sites.google.com/view/rainscmu. Breakout activity on generative methods (generative/discriminator themed competition). Coronation bank holiday 2023-03-08
  • Background Regional climate models (RCMs) contain systematic errors, or biases, in their output [1]. Biases arise in RCMs for a number of reasons, such as the assumptions in the general circulation models (GCMs), and in the downscaling process from GCM to RCM [1,2]. Researchers, policy-makers and other stakeholders wishing to use publicly available RCMs need to consider a range of bias “correction” methods (sometimes referred to as bias adjustment or recalibration). Bias correction (BC) methods offer a means of adjusting the outputs of RCMs in a manner that might better reflect future climate change signals whilst preserving the natural/internal variability of climate [2]. The aims of clim-recal are therefore: to provide non-climate scientists with an extensive guide to the application, advantages/disadvantages and use of BC methods; to provide researchers with a collated set of resources for how to technically apply the BC methods, with a framework for open additions; and to create accessible information on bias adjustment methods for non-quantitative researchers and lay-audience stakeholders
  • To-do: [ ] Ruth to look at debiased dataset, see if it makes sense. [x] Camila to implement 360 day calendar on HADs at resampling stage. [x] Camila to modify debiasing script to loop over large periods of time. [x] Camila to run resampling in all variables. [x] Ruth to understand if we need to group by time.dayofyear and report back. [x] Aoife/Camila to debug why code in quantile delta mapping is breaking for groups. [x] Make sure environment works both in Linux VM and Macs. [x] Our VM is not good enough to run it due to memory needed. [x] Once we have a fully debugged CMethods.py, fork library, copy the modified script and create it as submodule (Camila).
  • Need to run an R script which can take days to run. We can do this on a VM (created VM for this SPC-Step2). Need to install R and find a way to run the script from the command line. Adapt script to run in a loop over all years and LADs. Mount fileshare to save output in a given directory. Investigate how to create a directory that doesn't exist in R (or have a script that creates it before running the R code?) Need to reformat SPENSER data structure from: LAD_name/package_name/data_for_years.csv to Country/year/LADname_data.csv
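The restructuring and the "create the directory before running" idea from this note could be sketched in Python as follows. This is only an illustration under assumptions: the helper names `target_path` and `ensure_parent` are hypothetical, and the country and year are assumed to be supplied by the caller, because the note does not say how they are encoded in the source files.

```python
from pathlib import Path


def target_path(source: Path, country: str, year: int, out_root: Path) -> Path:
    """Map LAD_name/package_name/<file>.csv to Country/year/LADname_data.csv.

    Hypothetical helper: the LAD name is taken from the grandparent directory
    of the source file; country and year are passed in by the caller.
    """
    lad_name = source.parent.parent.name
    return out_root / country / str(year) / f"{lad_name}_data.csv"


def ensure_parent(path: Path) -> Path:
    """Create the parent directory of `path` if it does not exist yet."""
    path.parent.mkdir(parents=True, exist_ok=True)
    return path
```

A wrapper script could call `ensure_parent(target_path(...))` for each year and LAD before launching the R code, so the R script never has to create directories itself.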
  • New starters take around 2 months to get assigned to a project. Meanwhile they could contribute to Force Multiplier projects and/or join an ongoing project for a few weeks (1-2 months or until they are assigned to a project). This would allow them to get to know people in the team and learn our ways of working/practices. When joining a project (at 0.5 FTE, covered by REG core funding), they will still have time to do other activities that would help them integrate into the team (e.g. force multiplier projects, teaching, DSG, etc). Furthermore, this could help obtain evidence for their probation. What we propose: All internal force-multiplier projects define one-month tasks that can be done by the new joiner, and the person in charge offers mentoring time. All active and awaiting-start projects should mark with a label ("looking for new starter"? anything better?) whether they are interested in having a new starter for free for one or two months. This implies that someone on the project will dedicate some time to guiding the new joiner and defining a self-contained task. If a new joiner wants to join one of these projects, they should let their Line Manager, the project lead and James G know about this. The contributions made by the new joiner will become part of their probation objectives.
  • import datetime
    import h3  # uber geo package
    import pandas as pd
    import seaborn as sns
    import numpy as np
    import matplotlib
    import matplotlib.pyplot as plt
    import json
    import pickle
  • 12th January 2023 Summary SCP pipeline is ready and documented. Remaining issues are due to the components of the pipeline and their interactions; running in so many different LADs works as a kind of stress test, finding problems that might not have been noticed before. Solutions have been found for individual repos by cloning them into the Turing organisation and implementing fixes that work for the pipeline. We could do an upstream PR to the original repo, but there is the danger that our fixes could break something else in the original repo. Components that haven't been cloned should at least be pinned to fixed versions, because updates on the remote repo could break the pipeline. If someone wants to run the pipeline again in the future they must check if any relevant updates have to be included. Components: UKCensusAPI: Original repo: https://github.com/ld-archer/UKCensusAPI.git Issue: Problems found when running Scotland due to and how it unzips the original file that needed manual input. Fixes repo: https://github.com/alan-turing-institute/UKCensusAPI.git Calls a bash subprocess for unzip, so any system using this now needs unzip to be installed!
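Because the fix shells out to `unzip`, a pipeline could fail early with a clear message when the tool is missing. The sketch below is a hedged illustration and not the code in the fixes repo: `require_tool` and `unzip_archive` are hypothetical names, and the `-o` flag (overwrite without prompting) is an assumption about how the manual input might be avoided.

```python
import shutil
import subprocess
from pathlib import Path


def require_tool(name: str) -> str:
    """Return the full path of a command-line tool, or raise if it is not on PATH."""
    path = shutil.which(name)
    if path is None:
        raise RuntimeError(f"Required command-line tool '{name}' was not found on PATH")
    return path


def unzip_archive(archive: Path, dest: Path) -> None:
    """Extract `archive` into `dest` using the system unzip (hypothetical helper)."""
    unzip = require_tool("unzip")
    dest.mkdir(parents=True, exist_ok=True)
    # -o overwrites existing files without prompting; -d sets the output directory.
    subprocess.run([unzip, "-o", str(archive), "-d", str(dest)], check=True)
```

Calling `require_tool("unzip")` once at pipeline start-up would surface the missing dependency before any long-running step begins.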