# Dataset Overview

Data link: https://drive.google.com/drive/folders/1O2S2Ej15L-szub_ZHdm86VTlnnok1n79 (provided by TA)

The folder contains 8 datasets, one per subject: subj01-08.
Each dataset is organized as follows:
- training split
    - fMRI
    - images
- test split
    - fMRI
- ROI mask
    - space
    - mapping

The sections below describe each file and show how to load it.

Start with the following code:
```python
from google.colab import drive
# Mount Google Drive so that Colab can access it
drive.mount('/content/drive')

import numpy as np
import csv
```

The file paths below are abbreviated, so copying them directly will fail.
I use '..' to stand in for the repeated prefix. For example:
```python
np.load('../subj01/training_split/training_fmri/lh_training_fmri.npy')
```
may need to be changed to
```python
np.load('/content/drive/MyDrive/2023-Machine-Learning-Dataset/subj01/training_split/training_fmri/lh_training_fmri.npy')
```

## fMRI
npy files, split into lh and rh (left and right hemispheres).
Each file is a 2D array.
Rows correspond to images: 5000 in the training set, 150 in the test set.
Columns correspond to voxels: 19004 for lh, 20544 for rh.
e.g. training.lh (5000, 19004), test.rh (150, 20544)

```python
# Load the left-hemisphere training responses: one row per image, one column per voxel
lh_fmri = np.load('../subj01/training_split/training_fmri/lh_training_fmri.npy')
print(lh_fmri.shape)
print(lh_fmri)
```

## ROI mask
Each file is a 1D array whose values indicate the region each entry belongs to (see mapping).
The following files are included, each prefixed with the hemisphere (`lh.` or `rh.`):
> all-vertices_fsaverage_space
> floc-bodies_fsaverage_space
> floc-bodies_space
> floc-faces_fsaverage_space
> floc-faces_space
> floc-places_fsaverage_space
> floc-places_space
> floc-words_fsaverage_space
> floc-words_space
> prf-visualrois_fsaverage_space
> prf-visualrois_space
> streams_fsaverage_space
> streams_space

```python
# Load one ROI mask and check how many entries it has
roi_mask = np.load('../subj01/roi_masks/lh.all-vertices_fsaverage_space.npy')
print(roi_mask.size)
print(roi_mask)
```

size:
- with fsaverage: 163842
- lh without fsaverage: 19004
- rh without fsaverage: 20544

### mapping
There are 6 mapping files, one for each group of ROI masks.
Each describes which region a value in the ROI mask stands for.

```python
# allow_pickle=True is needed because the file appears to store a Python dict, not a numeric array
mapping = np.load('../subj01/roi_masks/mapping_streams.npy', allow_pickle=True)
print(mapping)
```

> mapping_floc-bodies
> {0: 'Unknown', 1: 'EBA', 2: 'FBA-1', 3: 'FBA-2', 4: 'mTL-bodies'}
>
> mapping_floc-faces
> {0: 'Unknown', 1: 'OFA', 2: 'FFA-1', 3: 'FFA-2', 4: 'mTL-faces', 5: 'aTL-faces'}
>
> mapping_floc-places
> {0: 'Unknown', 1: 'OPA', 2: 'PPA', 3: 'RSC'}
>
> mapping_floc-words
> {0: 'Unknown', 1: 'OWFA', 2: 'VWFA-1', 3: 'VWFA-2', 4: 'mfs-words', 5: 'mTL-words'}
>
> mapping_prf-visualrois
> {0: 'Unknown', 1: 'V1v', 2: 'V1d', 3: 'V2v', 4: 'V2d', 5: 'V3v', 6: 'V3d', 7: 'hV4'}
>
> mapping_streams
> {0: 'Unknown', 1: 'early', 2: 'midventral', 3: 'midlateral', 4: 'midparietal', 5: 'ventral', 6: 'lateral', 7: 'parietal'}

## image infos
Provided separately by the TA, outside the 8 datasets.
csv file.
Read as a 2D array, its shape is (5001, 135).
Rows correspond to the 5000 training images; the one extra row is presumably the header holding the feature names.
Columns are features, e.g. 'person', 'bicycle' ...
Each value is 0 or 1, indicating whether the image has that feature.

```python
dataroot = '../image_infos/subj01_infos_train.csv'
with open(dataroot, newline='') as csvfile:
    # csv.reader returns every row as a list of strings, including any header row
    infos = np.array(list(csv.reader(csvfile)))
print(infos.shape)
print(infos)
```
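
Because `csv.reader` returns every cell as a string, the array loaded above is not yet usable as numeric labels. The following is a minimal sketch of one way to separate the feature names from the 0/1 values. It runs on a small synthetic CSV rather than the real file, and it assumes that the first row is a header and that every other cell is 0 or 1; the helper name `load_image_infos` and the three example columns are hypothetical.

```python
import csv
import io

import numpy as np

# Synthetic stand-in for '../image_infos/subj01_infos_train.csv'.
# The three column names and the all-0/1 layout are assumptions for illustration.
fake_csv = io.StringIO(
    "person,bicycle,car\n"
    "1,0,0\n"
    "0,1,1\n"
    "1,1,0\n"
)


def load_image_infos(csvfile):
    """Split a label CSV into (feature names, integer label matrix).

    Assumes the first row is a header and all remaining cells are 0 or 1.
    """
    rows = list(csv.reader(csvfile))
    header = rows[0]
    # Convert the string cells to integers row by row
    labels = np.array([[int(value) for value in row] for row in rows[1:]])
    return header, labels


header, labels = load_image_infos(fake_csv)
print(labels.shape)  # (3, 3)

# Number of images that have each feature
counts = dict(zip(header, labels.sum(axis=0).tolist()))
print(counts)  # {'person': 2, 'bicycle': 2, 'car': 1}
```

To try it on the real file, pass the handle from `open(dataroot, newline='')` instead of `fake_csv`. If the file contains columns that are not 0/1 (for example an image identifier), the `int` conversion will raise an error, and those columns would need to be dropped first.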