---
tags: math615
robots: noindex, nofollow
---
# Collaborative SPSS Notes
This set of notes is for students at Chico State to help build a reference for themselves and future users on how to do tasks in SPSS.
* You do not need a HackMD account to edit these notes.
* Click the split screen icon in the top left corner to see the code and HTML output side by side.
* In view-only mode (not split screen) you will see a navigation bar.
# General resources
* https://libguides.library.kent.edu/SPSS/home
* http://wlm.userweb.mwn.de/SPSS/
* https://www.spss-tutorials.com/
# Syntax Files
A *syntax* file is also called a code file or a script file and has a `.sps` file type.
It is a list of commands that SPSS executes to perform data management or analysis tasks.
#### <span style="color:red">**READ THIS!**</span> Using SPSS syntax https://libguides.library.kent.edu/SPSS/Syntax
When using the menu options to complete a task (for example, importing data), click **Paste** instead of **OK** to get the syntax for that task. You can then copy this code into your own `.sps` script file for that particular assignment.
# Import and preprocessing
* External data files are saved as text files (`.txt`), comma separated values (`.csv`) or as Excel files (`.xlsx`)
* SPSS data files are saved as `.sav` file type (`.spv` is the file type for saved output, not data)
## Import Data into SPSS
https://libguides.library.kent.edu/SPSS/ImportData
* **Excel** Click `File > Open > Data`
* **Text (includes csv)** `File > Read Text Data`
Do not copy code from the HTML side. Open the Markdown side instead.
```
GET DATA /TYPE=XLSX
/FILE='C:\path\to\file.xlsx'
/SHEET=name 'Name-of-Sheet'
/CELLRANGE=full
/READNAMES=on
/ASSUMEDSTRWIDTH=32767.
EXECUTE.
DATASET NAME DataSetExcel WINDOW=FRONT.
```
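For a text or csv file, the syntax uses `GET DATA /TYPE=TXT` rather than `/TYPE=XLSX`. The following is a minimal sketch, assuming a comma-separated file whose first row holds the column names. The file path, the variable names (`id`, `age`, `grp`) and their formats are made-up placeholders, and the syntax pasted by your version of SPSS may include additional subcommands.

```
* Sketch only: path, variable names and formats are placeholders.
GET DATA /TYPE=TXT
/FILE='C:\path\to\file.csv'
/DELIMITERS=","
/QUALIFIER='"'
/ARRANGEMENT=DELIMITED
/FIRSTCASE=2
/VARIABLES=
  id F8.0
  age F3.0
  grp A10.
EXECUTE.
DATASET NAME DataSetCsv WINDOW=FRONT.
```

`/FIRSTCASE=2` tells SPSS that the data start on the second line, so the header row is skipped.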
Already have a `.sav` file?
```
GET FILE = 'd:\mydirectory\mysubdirectory\mydata.sav'.
```
## Only read in selected variables
```
GET FILE = 'd:\mydirectory\mysubdirectory\mydata.sav'
/ KEEP id var1 var15 var17 var88
/ RENAME (var17 var88 = var16 var17).
```
The `KEEP` statement while importing data only works if you are using `GET FILE` and are reading in a pre-existing SPSS data set.
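If it is easier to list the variables you do not want, `GET FILE` also accepts a `DROP` subcommand. A minimal sketch, reusing the placeholder path from above with made-up variable names:

```
* Placeholder path and variable names.
GET FILE = 'd:\mydirectory\mysubdirectory\mydata.sav'
/ DROP var2 var3.
```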
## Specify variable types
SPSS seems to think everything is nominal.
Be sure to change all quantitative variables to `SCALE`.
```
VARIABLE LEVEL var1 var2 var3 (SCALE)
/ var4 var5 (ORDINAL)
/ var6 (NOMINAL).
```
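To check that the measurement levels were set as intended, one option is to print the data dictionary for those variables. A small sketch, assuming the same placeholder variable names as above:

```
* Lists each variable's label, measurement level and format.
DISPLAY DICTIONARY
/VARIABLES=var1 var2 var3 var4 var5 var6.
```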
## Export SPSS formatted data
After your data file has successfully been imported, you'll want to save the result as an SPSS data file (`*.sav` format) by following these steps:
* In the active data window, click `File > Save As`. The Save Data As window will appear.
* Choose the directory where you want the file to be saved.
* Type a name for your file in the File name field.
* If you wish to save only certain variables in your data set, click `Variables` in the Save Data As window and select the variables you wish to keep in your saved data file. However, the point-and-click method generates extremely long code. Writing your own code (syntax below) is much cleaner. Point and click may still be useful when you are keeping many variables rather than a few.
* When you are finished, click `Save`.
If you choose to only save certain variables in your data set, a `KEEP` statement will be generated as part of the `SAVE` code. This list of variables can be manually modified later.
```
SAVE outfile='FilePath\name_of_file.sav'
/KEEP = Variable_Name1 Variable_Name2.
EXECUTE.
```
# Data processing
A data management file (e.g. `dm_addhealth.sps`) is a syntax file that, when executed:
1. Imports an external data file
2. Executes some data processing steps such as renaming or creating new variables
3. Saves an analysis-ready SPSS data file to your computer for later use.
https://libguides.library.kent.edu/SPSS/data-management
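The three steps above can be sketched as one short syntax file. This is only an illustration: the file paths are placeholders, and the `BIO_SEX` to `FEMALE` recode is borrowed from the recoding examples in these notes rather than from any particular assignment.

```
* Step 1: import the raw data (placeholder path).
GET FILE='C:\path\to\raw_data.sav'.
* Step 2: process the data, here by recoding a raw variable into a new one.
RECODE BIO_SEX (1=0) (2=1) (ELSE=SYSMIS) INTO FEMALE.
VALUE LABELS FEMALE 0 'male' 1 'female'.
* Step 3: save an analysis-ready SPSS data file (placeholder path).
SAVE OUTFILE='C:\path\to\analysis_data.sav'.
```

Each comment line starts with `*` and ends with a period, just like a command.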
### Changing data type
SPSS seems to save values as Nominal as default.
You can change the data type by doing:
```
VARIABLE LEVEL varname (SCALE).
EXECUTE.
```
### Recoding variables
There are two ways to recode.
`RECODE varname (list of recodes)` overwrites the existing variable.
I recommend you use `RECODE varname (list of recodes) INTO newvarname` to recode the raw variable into a new one. This way if you mess up on the recode (which will happen eventually) you can fix your code and try again without having to read the entire raw data set in again.
Important note: If you are using `RECODE ... INTO` and want to change only one value (like a `99` to `SYSMIS`), you have to copy the remaining values into the new variable with `(ELSE=COPY)`.
EXAMPLES:
Change values and save copy into new variable
- specify every value the variable can take.
```
RECODE BIO_SEX (1=0) (2=1) (6=SYSMIS) INTO FEMALE.
```
- Only change one out of many values.
```
RECODE SEAQ4D (99=SYSMIS) (ELSE=Copy) INTO NUMCOOLERS.
```
- Change a range of values
```
RECODE H4TR6 (11 thru Highest = SYSMIS) (else=copy) into casual_part.
```
Recoding multiple variables
```
RECODE V1 TO V3 (0=1) (1=0) (2,3=-1) (9=9) (ELSE=SYSMIS)
/QVAR(1 THRU 5=1)(6 THRU 10=2)(11 THRU HI=3)(ELSE=0).
```
Confirm changes were done correctly to recode `BIO_SEX` into `FEMALE`
```
crosstabs BIO_SEX by FEMALE /cells count /missing include.
```
### Applying labels to the variable
Give the variable name first, then the label text in quotes: `VARIABLE LABELS FEMALE 'your label text here'.`
### Applying labels to levels of categorical variable
```
VALUE LABELS FEMALE
0 'male'
1 'female'.
EXECUTE.
```
### Applying labels to categorical variables
http://www.statsmakemecry.com/smmctheblog/using-syntax-to-assign-variable-labels-and-value-labels-in-s.html
----
# Data Exploration and Visualization
Really good walk through for data exploration
https://libguides.library.kent.edu/SPSS/Explore
### Univariate Categorical
**Frequency Tables**
https://libguides.library.kent.edu/SPSS/FrequenciesCategorical
`Analyze --> Descriptive Statistics -> Frequencies`
```
FREQUENCIES VARIABLES = var
/ORDER = ANALYSIS.
```
**Bar Charts**
`Analyze --> Descriptive Statistics -> Frequencies --> Charts (bar chart)`
```
FREQUENCIES VARIABLES = var
/BARCHART FREQ
/ORDER = ANALYSIS.
```
If you want a barchart with proportions, change `FREQ` to `PERCENT`.
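For example, a sketch of the same command with percentages on the vertical axis (`var` is a placeholder variable name):

```
FREQUENCIES VARIABLES = var
/BARCHART PERCENT
/ORDER = ANALYSIS.
```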
### Univariate Quantitative
`Analyze -> Descriptive Statistics -> Descriptives`
```
DESCRIPTIVES VARIABLES = var1 var2
/STATISTICS = MEAN STDDEV MIN MAX.
```
To get median and quantiles you have to use FREQUENCIES, even though a freq table is not appropriate for continuous data. :angry:
The `NTILES=4` gives you the quartiles.
```
FREQUENCIES VARIABLES=casual_part
/NTILES=4
/STATISTICS=STDDEV MINIMUM MAXIMUM MEAN MEDIAN.
```
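If you do not want the long frequency table in the output, one option is to suppress it with `/FORMAT=NOTABLE`. A sketch using the same `casual_part` variable:

```
* Summary statistics and quartiles only, no frequency table.
FREQUENCIES VARIABLES=casual_part
/FORMAT=NOTABLE
/NTILES=4
/STATISTICS=STDDEV MINIMUM MAXIMUM MEAN MEDIAN.
```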
_Plots_
* Histograms
* Density plots
* Boxplots
* Violin plots
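
As a starting point for the first and third items in the list, here is a hedged sketch of a histogram and a boxplot for `casual_part` (the variable created in the recoding section). It does not cover density or violin plots, and the menus or Chart Builder may paste different syntax.

```
* Histogram with a normal curve overlaid.
GRAPH
/HISTOGRAM(NORMAL)=casual_part.
* Boxplot only, without the descriptive statistics table.
EXAMINE VARIABLES=casual_part
/PLOT=BOXPLOT
/STATISTICS=NONE.
```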
---
### Using the Chart Builder
You may get a message saying "measurement levels should be set properly for each variable". You should pay attention to this, but you can also hit "don't show again" and "OK" to continue.
---
### Bivariate C~C
_Summary Statistics_
* Cross-tab (two way table) `Analyze --> `
_Plots_
* Side by side barcharts
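
A possible syntax sketch for both items, assuming two categorical variables with the placeholder names `rowvar` and `colvar`:

```
* Two-way table with counts and column percentages, plus a clustered bar chart.
CROSSTABS
/TABLES=rowvar BY colvar
/CELLS=COUNT COLUMN
/BARCHART.
```

Swap `COLUMN` for `ROW` if you want percentages within each row instead.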
### Bivariate C~Q
_Summary Statistics_
How to calculate summary measures listed above separately for each group.
```
MEANS TABLES=Math BY MathGrade
/CELLS=COUNT MIN MAX.
```
_Plots_
* Side by side boxplots
* Overlaid density plots
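
For the side by side boxplots, one approach is `EXAMINE` with a grouping variable. This sketch reuses `Math` and `MathGrade` from the `MEANS` example above and does not cover the overlaid density plots.

```
* One boxplot of Math for each level of MathGrade.
EXAMINE VARIABLES=Math BY MathGrade
/PLOT=BOXPLOT
/STATISTICS=NONE
/NOTOTAL.
```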
### Bivariate Q~Q
_Summary Statistics_
* Correlation
_Plots_
* Scatterplot: `graph --> Scatter/Dot`
* Add trend lines
http://www.unige.ch/ses/sococ/cl/spss/eda/smoothing1.html
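
A minimal sketch of the correlation and a basic scatterplot, borrowing the variable names `whours` and `salary` from the Multivariate example in these notes. Adding a trend line is not shown here.

```
* Pearson correlation between the two variables.
CORRELATIONS
/VARIABLES=whours salary.
* Basic scatterplot with whours on the x-axis and salary on the y-axis.
GRAPH
/SCATTERPLOT(BIVAR)=whours WITH salary.
```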
### Multivariate
* Scatterplot colored by third grouping variable
https://www.spss-tutorials.com/spss-scatterplot-tutorial/
```
GRAPH
/SCATTERPLOT(BIVAR)=whours WITH salary BY jtype
/MISSING=LISTWISE
/TITLE "Monthly Salary by Weekly Hours".
```
-----
# Inference
## Comparing two means (T-Test)
**To run the two-sample t-test**
*Go to:*
`Analyze -> Compare Means -> Independent-Samples T Test`
* Then you move your dependent (response) variable into Test variable.
* Then move your independent (explanatory) variable into grouping variable.
* Define your groups (this means you tell SPSS that one group (i.e. males) is mu 1 and the second group (i.e. females) is mu 2).
* Then click paste to get the syntax code, and then run the test.
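
The pasted syntax should look roughly like the sketch below. It assumes `casual_part` is the response and `FEMALE` (coded 0 and 1) is the grouping variable, both taken from the recoding examples in these notes. Your own variable names and group codes will differ.

```
* Two-sample t-test of casual_part between FEMALE = 0 and FEMALE = 1.
T-TEST GROUPS=FEMALE(0 1)
/MISSING=ANALYSIS
/VARIABLES=casual_part
/CRITERIA=CI(.95).
```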
**Reading the SPSS two sample means T - Test**
## Comparing two proportions (T-Test)
## Comparing multiple means (ANOVA)
## Comparing multiple proportions (Chi-squared test of Association)
## Correlation
## Regression
## Categorical Predictors
GLM with categorical variable
1. Point and click `Analyze -> General Linear Model -> Univariate`
2. Choose Type of model, Response (put in dependent variable), Predictors (put variables into factors and covariates), Model (move variables over; *make sure that the Build Term(s) type is "Main effects"*)
3. Options -> Check "Parameter Estimates"
4. OK/Run
```
UNIANOVA Self_Scale BY Ethnicity WITH Age
/METHOD=SSTYPE(3)
/INTERCEPT=INCLUDE
/PRINT PARAMETER
/CRITERIA=ALPHA(.05)
/DESIGN=Ethnicity Age.
```
----
# Stratified Analysis
1. Sort cases by stratification variable (`jtype`)
`sort cases by jtype.`
2. Split file by the stratification variable (`jtype`)
`split file by jtype.`
3. Calculate summary statistics within each stratum.
Example here is correlation of `salary` with `whours`.
`correlations salary with whours.`
4. Turn off file splitting
`split file off.`
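
Put together, the four steps can be run as one block. This is a sketch using the same `jtype`, `salary` and `whours` variables:

```
* Steps 1 and 2: sort, then split the file by the stratification variable.
SORT CASES BY jtype.
SPLIT FILE BY jtype.
* Step 3: this correlation is reported separately for each level of jtype.
CORRELATIONS
/VARIABLES=salary WITH whours.
* Step 4: turn splitting off so later commands use the whole data set.
SPLIT FILE OFF.
```

Forgetting the last line is an easy mistake: every later analysis would still be split by `jtype`.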
* **Reference for regression**
https://www-01.ibm.com/support/docview.wss?uid=swg21479670
# Model results
## Plotting marginal effects models
http://www.statsmakemecry.com/smmctheblog/how-to-plot-interaction-effects-in-spss-using-predicted-values