# Pupil-size-and-microsaccade-analysis Matlab scripts for preprocessing pupil size timeseries, detecting microsaccades, extracting features, simple plotting and coordinating with R to perform LMM. Table of contents [toc] # I. Import data and data structure ## A. Folder structure The folder should be conduct with this structure. Project └─ rawdata └─ derivatives  └─ raw   ├─ sub-01    ├─sub-01_task-Stroop_info    ├─sub-01_task-Stroop_beh    └─sub-01_task-Stroop_pup  └─ prec   ├─ olFalg.mat   ├─ VarName.mat   ├─ sub-01    ├─sub-01_task-Stroop_info    ├─sub-01_task-Stroop_beh    └─sub-01_task-Stroop_pup # II. Create project and preprocess data ## A. Create project ```m createProject(); ``` This function simply create raw folders under project path and convert files in rawdata into a standard format used in this toolbox. ## B. Preprocess data ```matlab doPreprocess(projectDir) ``` Input project path as variable *projectDir*. This function would performe multiple steps to preprocess pupillometry data, detect microsaccades, and save processed data in prec folder. Steps included are followed: ### 1. Upsample Data ```matlab nData = upSample(Data); ``` This function upsample raw pupillometry data to 1000 Hz. ### 2. Convert pixel data into diameter ```matlab nDataa = cvtDia(Data); ``` This function linear calibrates the square root of pixel data into diameter data by linear function ```p = polyfit([sqrt(6607) sqrt(3796) sqrt(1624) sqrt(414)],[8 6 4 2],1);``` ### 3. Remove blink and invalid interval ```matlab [nData, rmFlag] = rmBlink(Data,'interp','left'); ``` This step adopts a function ```rawDataFilter()``` from a pupil size preprocessing toolbox (Kret & Sjak-Shie, 2018) to detect blink and invalid interval left eye data, and interpolates these invalid interval. Trials that valid point rate below 30% would be target as outlier trial. See: >1. Kret, M. E., & Sjak-Shie, E. E. (2019). Preprocessing pupil size data: Guidelines and code. Behavior research methods, 51, 1336-1342. >2. [Github repository link](https://github.com/ElioS-S/pupil-size) ### 4. Epoch data ```matlab evtList = targeton; tw = [-1000 4000]; [l_epData, r_epData, rmFlag] = doEpoch(nData.rbkData,evtList,tw); ``` This function epoch pupil data according to a eventlist and a epoch window. Here, this example takes target onset as event, and sets the epoch window from 1000 before to 4000 ms after target onset. ### 5. Baseline correction ```matlab tw = [-200 0]; tw = tw - (-1000); % Make baseline window matching data index nData = Data - mean(Data(tw(1):tw(2),:)); ``` This step subtracts averaging data from 200 ms before to target onset for each trial. ### 6. Extract microsaccade features ```m % To detect sac, remove blink and invalid interval without interpolating [nData.nanData, ~] = rmBlink(nData.cvtData,'nan','left'); [nData.nanData, ~] = rmBlink(nData.nanData,'nan','right'); % Detect binocular candidates for saccades (Engbert & Mergenthaler, 2006). [sacData, rmFlag.sac] = detectSac(nData.nanData,evtList); % Remove saccades those exceed specific criteria tw = [-1000 3999]; % Use whole window, we can epoch small window later thr = [0.1 2]; % Angle amplitude criteria vthr = [0 200]; % Velocity criteria dur = false; % Duration of microsaccades intv = false; % The interval between two microsaccades Sac = rmSac(sacData, tw, thr, vthr, dur, intv); % Extract feature of microsaccads and epoch tw = [-1000 4000]; [RateMatx, AmpMatx, VelMatx, AngMatx] = getSacMTX(Sac,tw); ``` First, This step generates data serie that is blink removed and invalid interval keeped. Then, it detects saccade by applying a toolbox from Engbert and Mergenthaler in 2006 with setting relative velocity threshold as 5, and output rmFlag indicating outlier trial that falied to detect saccad. Also, it only keeps saccades that are coincident in overlaping time at both eyes. Following, it removes saccades that dismatch features of microsaccade. Finnaly, it generates four matrixs, including: 1. RateMatx: 1 represents microsaccads occuring at that time point. 2. AmpMatx: Store amplitude value at that time point. 3. VelMatx: Store velocity value at that time point. 4. AngMatx: Store theta value from polar coordinates system at that time point. See: >1. Engbert, R., & Mergenthaler, K. (2006). Microsaccades are triggered by low retinal image slip. Proceedings of the National Academy of Sciences, 103(18), 7192-7197. ## C. Exclude outlier trials and participants ```matlab excludeOutliers(projectDir) ``` This function aims to indicates bad trials produced in previous steps and are excluded in following analysis, including: 1. Incorrect response 2. Reaction times exceed 3 standard deviations from the mean 3. No enough time point for epoching 4. Valid point rate below 30% when remove blink 5. Fail to detect saccade Also, this step exclude participants that: 1. has low accuracy in any congruency condition (< 48%) 2. detects none microsaccades in each trial within epoch window. # III. Main functions This section would introduce some main functions that this toolbox could implement, including getting interested values, t-test, two-way mixed anova, dymamic, main sequence, polar and box plot. To perform these steps with following instruction, see `Exp_session_02.m` ## A. Extracte interested values To extract interested values, we use this function: ```m Data = getValues(projectDir, ... dv, ... % str; dependent variable tw, ... % [num,num], two element array; if dv is time series variable, input array % to average target window; otherwise, input empty array []. iv, ... % cell; independent variable; {'none'} for no IV; {'cond1'} for one IV; % {'cond1', 'cond2'} for 2 IVs. ivVals);% cell; IV values; indicate certain subcondition for extracting. ``` ### 1. Non-time series data Let's say we were interesting in reaction time for two group, we could apply: ```m Data = getValues(projectDir, ... 'RT', ... [], ... {'group'}, ... {}); ``` Here, the `dv` was setted as `'RT'` which stand for reaction times; RTs is not a time series variable, hence, we leave `tw` empty, `[]`. We also interest in RTs in different group, so we assign `iv` as `{'group'}`. And for now we leave `ivVal` empty, cause we don't want to extract subcondition values. Below is head result of `Data`: ![image](https://hackmd.io/_uploads/H1MDqjJrlx.png) First column stand for the index of subject; second for number of trials used to average; third is the iv condition and fourth is the dv value extracted. :::info Note: To check what variable could be used to assign as DV and IV, you could check *VarName.mat* in *prec* folder and it depents on how you setting behavioural table. ::: However, as you can see, the 16th subject has only 19 trials qualified to generate mean RT, and it may lead to higher measurement error. To address with it, we could consider remove subjects with few trials. ```m Data = rmInsff(Data, n) ``` `n` stand for the criteria number that used to exclud subject who has lower trials than it. ### 2. Time series data If we want to extract time series data, let's say it's left pupil diameter size, we could apply: ``` Data = getValues(projectDir, ... 'Pupil_l', ... [], ... {'group'}, ... {}); ``` We assigned `DV` as `'Pupil_l'`, standing for left pupil size, `tw` as `[]`, `IV` as `{'group'}`. Here is head result: ![image](https://hackmd.io/_uploads/S1uy-n1Hlg.png) The dynamic data is stored in `Pupil_l` column as array format. If we want to extract average value based on certain window, from 1000 to 2000 ms, from dynamic data, we could use follow input: ``` Data = getValues(projectDir, ... 'Pupil_l', ... [1000 2000], ... {'group'}, ... {}); ``` And the output is: ![image](https://hackmd.io/_uploads/Hkl8BhySle.png) ### 3. Extract value according to multiple conditions If we want to extract mean left pupil size value based on both group and condition, we could apply: ``` Data = getValues(projectDir, ... 'Pupil_l', ... [1000 2000], ... {'group','cond'}, ... {}); ``` ![image](https://hackmd.io/_uploads/r1DzUh1Bex.png) By including additional IV, `'cond'`, we have values that extracted from each sub condition. If we only interest in certain sub condition, we could achive it by assign values in `ivVal`. For example, we want extract value that under condition 2 under both group: ``` Data = getValues(projectDir, ... 'Pupil_l', ... [1000 2000], ... {'group','cond'}, ... {1, 2; ... 2, 2}); ``` Here, we assign `ivVal` as `{1, 2; 2, 2}`, it means we want to extract values from group 1 under condition 2 and from group 2 under condition 2. ![image](https://hackmd.io/_uploads/HJQku3Jrle.png) As you can see, we extracted values based on the subcondition we indicated. However, there are `nan` values, that is because dismatch between subject and group. We could simply apply `rmInsff()` to remove these nan rows: ``` Data = rmInsff(Data, 0); ``` ![image](https://hackmd.io/_uploads/Bysgq2yrge.png) ## B. Plot data ### 1. Plot dynamic data To plot dynamic data, we use this function: ```m plotDynamic(Data,varargin) % Input: % % Data: table; dynamic data that used to plot % % Optional input: % % tw: [st,ed,epochST]; default [1, 5000, 1], time window to plot. st: start point; % ed: end point; epochST: epoch start point. % col: cell; default jet(numCond). % xlab: str; default '', label of x axis. % ylab: str; default '', label of y axis. % lnstyle: cell; default '-'. ``` We use left pupil size as example. First, we extract data following steps mentioned previously. ```m Data = getValues(projectDir, ... 'Pupil_l', ... [], ... {'group'}, ... {}); ``` Then, we plot it within window of -200 to 4000 ms ```m plotDynamic(Data, ... 'tw', [-200 3999 -1000], ... 'xlab', 'Time from target onset (ms)', ... 'ylab', 'Pupil size (mm)', ... 'col', [[0 .5 .8];[.8 .1 0]]) ``` ![image](https://hackmd.io/_uploads/H1jL1T1Hle.png) Here is examplt for two condition, group and condition: ```m Data = getValues(projectDir, ... 'Pupil_l', ... [], ... {'group','cond'}, ... {}); plotDynamic(Data, ... 'tw', [-200 3999 -1000], ... 'xlab', 'Time from target onset (ms)', ... 'ylab', 'Pupil size (mm)', ... 'col', [[0 .5 .8];[.8 .1 0];[0 .5 .8];[.8 .1 0]], ... 'lnstyle', {'-','-','--','--'}) ``` ![image](https://hackmd.io/_uploads/r1QQe6kBel.png) ### 2. Box plot In order to generate these plot, following function is used: ```m plotBox(Data,varargin) % Input: % % Data: table; dynamic data that used to plot % % Optional input: % % dot: boolean; default false, plot subject values or not. % col: cell; default jet(numCond). % ylab: str; default '', label of y axis. ``` We take left pupil size mean value in window 1000 to 2000 ms as example. ```m Data = getValues(projectDir, ... 'Pupil_l', ... [1000 2000], ... {'group','cond'}, ... {}); plotBox(Data, ... 'dot', true, ... 'col', [[0 .5 .8];[.8 .1 0];[0 .5 .8];[.8 .1 0]], ... 'ylab', 'Pupil size (mm)'); ``` ![image](https://hackmd.io/_uploads/SyrZm6JHge.png) ### 3. Plot main sequence To do it so, we use: ```m op = plotSQ(projectDir, ... tw, % [num,num]; window used to extract microsaccades iv, % cell; independent variables ivVal, % cell; IV values; indicate certain subcondition for extracting col % cell; color variable) ``` For example, we want to plot the main sequence within 0 to 2000 ms. ``` op = plotSQ(projectDir, ... [0 2000], ... {'group','cond'}, ... {}, ... [[0 .5 .8];[.8 .1 0];[0 .5 .8];[.8 .1 0]]); ``` ![image](https://hackmd.io/_uploads/ry--KTkBle.png) ### 4. Plot polar rate distribution To do it so, we use: ```m op = plotPolar(projectDir, tw, % [num,num]; window used to extract microsaccades iv, % % cell; independent variables ivVal,% cell; IV values; indicate certain subcondition for extracting nbins,% num; how many bins used to plot col, % cell; color variable lty, % cell; linestyle variable rmFlat); % boolean; remove subject with flat eye ``` For example, we want to plot polar distrubution within 0 to 2000 ms. ``` op = plotPolar(projectDir, ... [0 2000], ... {'group','cond'}, ... {}, ... 20, ... [[0 .5 .8];[.8 .1 0];[0 .5 .8];[.8 .1 0]], ... {'-','-','--','--'}, ... true); ``` ![image](https://hackmd.io/_uploads/r18_qpkSlg.png) ## C. Analysis ### 1. t-test This function is apply to construct t-test: ```m [df,tval,pval,cd,f] = doTTest(op,dv,iv,ivVal,paired,tail) % Input: % % op: table; data used to perform t-test % dv: str; dependent variables % iv: str; independent variables % ivVal: cell; values used to compare % paired: boolean; paired sample or not % tail: 'both','left','right'; used to indicate test tail % % Output % % df: degread of freedom of t-test % tval: t value % pval: p value % cd: Cohen's d % f :result in sentence ``` For example, we would like to compare RT difference between group. First, we extract values and take it as input to `doTTest()`: ```m Data = getValues(projectDir, ... 'RT', ... [], ... {'group'}, ... {}); doTTest(Data, ... 'RT', ... 'group', ... {1 2}, ... false, ... 'both'); ``` It will generate the result: ```m t(45) = -0.74, p = 0.462, d = -0.22 ``` ### 2. Two-way mixed ANOVA ```m result = mixed_anova(op,dv,bt,wt,plotflag) % Input: % % op: table; data used to perform t-test % dv: str; dependent variables % bt: str; between factor % wt: str; within factor % plotflag: boolean; plot interaction or not % % Output % % result: degread of freedom of t-test ``` Here is an example attempt to perform 2way mixed anova on RT to investigate group and condition effect: ```m Data = getValues(projectDir, ... 'RT', ... [], ... {'group','cond'}, ... {}); mixed_anova(Data, ... 'RT', ... 'group', ... 'cond', ... false); ``` It's the result output generated: ``` Btw: F(1,45) = 0.56, p = 0.459, pes = 0.12 Wth: F(1,45) = 36.22, p = 0.000, pes = 0.45 Int: F(1,45) = 3.22, p = 0.079, pes = 0.07 Btw1: p(bonferroni) = 0.001 Btw2: p(bonferroni) = 0.000 ``` First line indicates main between effect, second for main within effect, third for interaction effet, fourth for simple within effect under between factor 1, fifth for simple within effect under between factor 2. :::warning `mixed_anova()` is generate by chatGPT, it should be examied in furtur by comparing to result from R. ::: ### 3. Principle component analysis (PCA) This analysis adopt ep_toolkit by Joseph Dien. For more detail, see: >1. [ep_toolkit](https://sourceforge.net/projects/erppcatoolkit/) >2. Chang Y, Chen H, Barquero C, et al. Linking tonic and phasic pupil responses to P300 amplitude in an emotional face-word Stroop task. Psychophysiology. 2023;61:e14479. To perform PCA, use: ```m doPCA(Data,% table; data used to perform pca dv, % str; dependent variable tw) % [st,ed,epochST]; array used to crop data ``` For example, we want to do PCA on left pupil size within window of 1 to 4000 ms and attempt to compare group and condition. ```m Data = getValues(projectDir, ... 'Pupil_l', ... [], ... {'group','cond'}, ... {}); doPCA(Data,'Pupil_l',[1 4000 -1000]) ``` The result output is: ``` Componemnt 1 score: Btw: F(1,45) = 0.00, p = 0.960, pes = 0.00 Wth: F(1,45) = 0.09, p = 0.760, pes = 0.00 Int: F(1,45) = 0.15, p = 0.703, pes = 0.00 Btw1: p(bonferroni) = 0.920 Btw2: p(bonferroni) = 0.707 Componemnt 2 score: Btw: F(1,45) = 0.02, p = 0.876, pes = 0.01 Wth: F(1,45) = 2.95, p = 0.093, pes = 0.06 Int: F(1,45) = 0.35, p = 0.557, pes = 0.01 Btw1: p(bonferroni) = 0.172 Btw2: p(bonferroni) = 0.213 ``` If the we extract data with two factors, it will generate mixed anova result for each component score. :::info For now, this function is setted to generate only two components. And if two factors are inputted, first one shall be an between factor, second one shall be a within factor. ::: It also generate several figures in order to check how this analysis performed: ![image](https://hackmd.io/_uploads/SyxV8RJrll.png) ![image](https://hackmd.io/_uploads/BJT4LAkree.png) ![image](https://hackmd.io/_uploads/By5BIAyHxg.png) ![image](https://hackmd.io/_uploads/B1r8LAkHee.png) ![image](https://hackmd.io/_uploads/B16ALCkBex.png) ![image](https://hackmd.io/_uploads/Sk_yv01Hee.png) # IV. Procedure