# UK Biobank Lab Tests

1. Create a folder called `ukbb_datagen` to run the commands in.
2. Place `ukbb_data_dir_structure.tar.gz` in the `ukbb_datagen` folder.
3. In the `ukbb_datagen` folder, run the following commands to set up the directories and raw files required for the datagen:
    ```shell=
    sudo mkdir -p /data/lab_tests
    tar -xf ukbb_data_dir_structure.tar.gz
    sudo mv interim processed raw /data/lab_tests/
    sudo chmod -R 777 /data/lab_tests/
    ```
4. Replace the sample files in `/data/lab_tests/raw/` with the actual files:
    * `labtests_ukb.icd.sample.csv`
    * `labtests_ukb.labtests.sample.csv`
    * `labtests_ukbiobank.labtests.dict.csv`
5. Clone the mayo repository into the `ukbb_datagen` folder.
6. Go to the datagen folder: `cd [...]/ukbb_datagen/mayo/datagen/ukbb`
7. Create a `config.ini` file with the following contents, replacing the sample file names with the names of your actual files:
    ```
    [matrix]
    loc=/data/lab_tests/processed/ukbb/matrix

    [labtests]
    mapping=/data/lab_tests/processed/test_mapping

    [diagnosis]
    patient_diagnosis_loc=/data/lab_tests/processed/ukbb/icd
    patient_diagnosis_filename=patient_diagnosis_map.npy
    icd_mapping_loc=/data/lab_tests/raw
    icd_code_mapping=icd10cm_codes_2019.txt

    [ukbb]
    dir_loc=/data/lab_tests/interim
    mayo_link=/data/lab_tests/raw/UKBiobank_to_Mayo.csv
    raw_matrix=/data/lab_tests/raw/labtests_ukb.labtests.sample.csv
    matrix_cols=bioassays.columns.csv
    pat_icd=/data/lab_tests/raw/labtests_ukb.icd.sample.csv
    uk_cols_mapping=/data/lab_tests/raw/labtests_ukbiobank.labtests.dict.csv
    ```
8. Create a virtual environment outside of the project, preferably in a separate folder:
    * Create the folder `~/venvs`.
    * In that folder, run these commands in order:
        * `virtualenv -p python3 (environment name)`
        * `source (environment name)/bin/activate`
    * Navigate to `mayo/core-api`.
    * Run `pip install -r requirements.pip`.
    * Navigate to `ukbb_datagen/mayo/datagen/ukbb`.
9. Run `uk_2_sparse.py`: `python uk_2_sparse.py`
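
Before running step 9, it can help to confirm that the raw files named in `config.ini` are actually in place. The following is a minimal sketch, not part of the mayo repository: the helper name `missing_inputs` is hypothetical, and the choice of which `[ukbb]` keys count as required input files is an assumption based on the config shown in step 7.

```python
import configparser
import os

# Assumption: these [ukbb] keys point at raw input files that must exist
# before uk_2_sparse.py runs. The script itself may read other keys too.
INPUT_KEYS = ("mayo_link", "raw_matrix", "pat_icd", "uk_cols_mapping")


def missing_inputs(config_path):
    """Return (key, path) pairs from [ukbb] whose files do not exist.

    A key that is absent from the section is reported with path None.
    """
    # Interpolation is disabled so a literal '%' in a path is not misread.
    parser = configparser.ConfigParser(interpolation=None)
    if not parser.read(config_path):
        raise FileNotFoundError(config_path)
    section = parser["ukbb"]
    missing = []
    for key in INPUT_KEYS:
        path = section.get(key)
        if path is None or not os.path.isfile(path):
            missing.append((key, path))
    return missing


if __name__ == "__main__":
    if os.path.isfile("config.ini"):
        for key, path in missing_inputs("config.ini"):
            print(f"missing: {key} -> {path}")
    else:
        print("config.ini not found in the current directory")
```

Saved under any name in `ukbb_datagen/mayo/datagen/ukbb` and run with `python`, it prints one line per missing file and nothing if all four are present.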