# Healthcare Project Data Analysis Notes

###### tags: `Healthcare Poject`

@ElocinZhan

[Toc]

### Participants

#### Response Quality Check

> **Filtered by Attention Checkers** :small_red_triangle_down:

* 2 responses were invalid/rejected because they failed both attention checkers
* 295 valid responses
    - 24 failed the 1st attention checker ==only==
    - 12 failed the 2nd attention checker ==only==
* Data kept for Straightlining Check: **259** (295-24-12)

> **Deal with Straightlining Issues** :small_red_triangle_down:

[Kim et al. 2019](https://journals.sagepub.com/doi/pdf/10.1177/0894439317752406) recommend the following three measurements to check for straightlining problems: `Simple Nondifferentiation Method`, `Mean Root of Pairs Method`, `Scale Point Variation Measure`. In our case, for each participant, we use an ++extended++ `Simple Nondifferentiation Method` to judge whether they were straightlining across the questions.

:::spoiler For each participant, we checked the following things and finally eliminated the responses of **7+2 = 9** participants because they were detected as most likely to be straightlining.
1. Check whether any straight lines appear (the same answer selected for more than 3 questions, e.g. `[4,4,4,4,4]` means "somewhat agree" was selected four times **consecutively**)
2. Use a list `record` to store the details of every straightlining response that appears
3. Count the number of times straightlining appears
4. Find the **heaviest straightlining** run and calculate the weight of the straightlined questions in relation to all responses.
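
The four checks above can be sketched in Python as follows. This is only a minimal illustration, not the script actually used for the study: the function names, the `MIN_RUN` threshold (reading "more than 3 questions" as a run of at least 4 identical answers), and the definition of the weight as longest run divided by the total number of answers are all assumptions.

```python
from itertools import groupby

# Assumption: "more than 3 questions" is read as a run of at least 4 identical answers.
MIN_RUN = 4


def find_straightlines(responses, min_run=MIN_RUN):
    """Return (start_index, value, run_length) for every run of identical answers."""
    record = []
    position = 0
    for value, group in groupby(responses):
        length = len(list(group))
        if length >= min_run:
            record.append((position, value, length))
        position += length
    return record


def straightlining_summary(responses, min_run=MIN_RUN):
    """Summarise straightlining for one participant (hypothetical helper)."""
    record = find_straightlines(responses, min_run)
    longest = max((length for _, _, length in record), default=0)
    return {
        "record": record,            # step 2: details of each straightlined run
        "count": len(record),        # step 3: number of straightlined runs
        "longest_run": longest,      # step 4: the heaviest run
        # Assumed definition of "weight": longest run / number of answers.
        "weight": longest / len(responses) if responses else 0.0,
    }


# Made-up answers for one participant, for illustration only.
example = [4, 4, 4, 4, 4, 2, 5, 3, 3, 1]
summary = straightlining_summary(example)
print(summary)
```

Participants could then be ranked by `weight` and `count` to decide which responses to inspect by hand; the cut-off for removal is a judgment call and is not implied by this sketch.
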
:::

- [ ] The reason for removing them can be found in [Removed Straightlining Details](https://emckclac-my.sharepoint.com/:b:/g/personal/k1896830_kcl_ac_uk/EWkdt5T_7IxHga60lH78284BPSLF-bDXAx5AzTXrSfnvvw?e=Yf8Dtc)
- [ ] ==Data kept for Analysis: **250** (259-9)==

#### Demographics Summary

:::spoiler Table
<div align="center">

| Variable | Type | Number |
| -------- | -------- | -------- |
| Gender | Male | 73 |
| | Female | 83 |
| | Prefer not to say | 1 |
| Age | 18 -- 24 | 46 |
| | 25 -- 34 | 57 |
| | 35 -- 44 | 34 |
| | 45 -- 54 | 11 |
| | 55 -- 64 | 8 |
| | 65+ | 1 |
| Device Type | Amazon Alexa | 1 |
| | Google Assistant | 1 |

</div>

:::

### Mapping to SPSS for Descriptive Analysis

1. **Reverse-coded Questions:**
    * Q9_1: I don't know a lot about healthcare AI assistants (Familiarity)
    * Q11_3: I don't think using AI technology says a lot about who I am (Community Interest)
    * Q12_1: I don't usually keep an eye on emerging products using AI technology, especially those that will be beneficial to my health (Tech Attachment)
    * Q13_1: I don't think it would be risky to interact with a Healthcare AI assistant (Privacy Risk)
    * Q19_1: I don't think that healthcare AI assistants are likely to replace doctors in the future (Replacement)

#### Descriptive Analysis

:::info
The reverse-coded questions are highlighted in **red**, but the values have been **reversed back**
:::

:::spoiler Details (with Pictures)
1. ++Propensity to Trust++
    * Looks fine to me. The value of the reverse-coded question is consistent with the other questions in the same group

    ![](https://i.imgur.com/d5mtPzl.png)
2. ++Trustworthy Function++

    ![](https://i.imgur.com/IFd5Gv5.png)
3. ++Security and Privacy Concerns++
    * Not sure whether the first scale was influenced by the reverse coding

    ![](https://i.imgur.com/1iTaNcP.png =500x200)
4. ++Contextual Influence++
    * The mean value for the first scale is a little lower than the others

    ![](https://i.imgur.com/t0ttVlT.png =600x250)
5. ++Trust Scales++

    ![](https://i.imgur.com/X3vKphu.png =500x200)
6. ++IUIPC++

    ![](https://i.imgur.com/Vh1QuIh.png =500x180)
7. ++SA-6++

    ![](https://i.imgur.com/JwGpXYT.png =380x180)

:::

### PLS Analysis

#### Data Preprocessing

1. Use SPSS to calculate the ++mean++ value for each second-order construct, e.g. use `Mean(Familarity, TechAttachment, CommunityInterest, TrustStance)` as the value of `Propensity to Trust`

    > :exclamation: Calculating floating-point numbers in Python generates errors.

2. Use SmartPLS 3.3 to analyze the data

    > Commonly, for higher-order constructs (HOCs), there are ++two++ approaches to validate the scales.
    >> 1) Repeated Indicator Approach
    >> 2) Two-stage Approach

3. Preparation

:::spoiler Steps and Figures
1. Draw the framework in the SmartPLS workspace
2. Load the data and check whether there are any missing values
3. Drag indicators to each construct and make the connections between constructs

![PLSframework](https://i.imgur.com/3Wf67cY.png =400x600)

:::

#### Using Repeated Indicator Approach:

1. Run PLS algorithm

![](https://i.imgur.com/3Thal2J.png =300x100)

![](https://i.imgur.com/lyr8dfT.png)

The KMO result (.812) and Bartlett's test result (p < .001) indicate good sampling adequacy and that the data are suitable for factor analysis.

![Exploratory Analysis Results](https://i.imgur.com/g81w5jW.png)

---

### Reference/Source

1. [Raw Data](https://emckclac-my.sharepoint.com/:x:/g/personal/k1896830_kcl_ac_uk/EXP5jkGgUtFMoNiHHi_2xbIBhDpJvDIt_cQCTnpPDaibBQ?e=gEQlLY) : Participants who failed an attention checker are marked in yellow.
2. [Clean Data](https://emckclac-my.sharepoint.com/:x:/g/personal/k1896830_kcl_ac_uk/EafpS6PEZ6pKpeGbuGVHb5UBNuyi7lT6FxEQdig3zjKiOw?e=8srW1r) : All invalid records removed
3. [Raw Demographics](https://emckclac-my.sharepoint.com/:x:/g/personal/k1896830_kcl_ac_uk/EcF1HcnM6WhAj_J__TozzJ0B_EsAujvph5kGoYeANYSkXw?e=PyrPIo) : Participants who failed an attention checker are marked in yellow.
4. [Clean Demographics](https://emckclac-my.sharepoint.com/:x:/g/personal/k1896830_kcl_ac_uk/Ef-XqN7d619BvJ_1iFAI45YBpHsBQ0OcRGOfhRQdEyPIFQ?e=I3ZPul) : All invalid records removed
5. [Paper](https://www.overleaf.com/project/5f7f6fa924ea5a000136348a) : Contains the detailed demographics summary
6. [Output](https://emckclac-my.sharepoint.com/:x:/g/personal/k1896830_kcl_ac_uk/Ec0Zy1HJjK5Gho1i-CgpuFABFn-EltGNR_n-ESHshWuPQg?e=YWkNLG) : Data with question body and ++numerical values++ only