Collaborative SPSS Notes

This set of notes is for students at Chico State to help build a reference for themselves and future users on how to do tasks in SPSS.

You do not need a HackMD account to edit these notes.
Click the split screen icon in the top left corner to see the code and HTML output side by side.
In view-only mode (not split screen) you will see a navigation bar.

General resources

Syntax Files

A syntax file is also called a code file or a script file and has a .sps file type.

It contains a list of commands that SPSS executes to perform data management or analysis tasks.

READ THIS! Using SPSS syntax https://libguides.library.kent.edu/SPSS/Syntax

When using the menu options to complete a task (for example, importing data), click Paste instead of OK to get the syntax for that task. You can then copy this code into your own .sps script file for that particular assignment.

Import and preprocessing

External data files are saved as text files (.txt), comma separated values (.csv) or as Excel files (.xlsx).
SPSS data files are saved as .sav file type (.spv is the file type for saved output, not data).

Import Data into SPSS

https://libguides.library.kent.edu/SPSS/ImportData

Excel: click File > Open > Data
Text (includes csv): click File > Read Text Data

Do not copy code from the HTML side. Open the Markdown side instead.

GET DATA /TYPE=XLSX
  /FILE='C:\path\to\file.xlsx' 
  /SHEET=name 'Name-of-Sheet'
  /CELLRANGE=full
  /READNAMES=on
  /ASSUMEDSTRWIDTH=32767.
EXECUTE.
DATASET NAME DataSetExcel WINDOW=FRONT.
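
Text and csv files use the same GET DATA command with /TYPE=TXT. The block below is only a sketch: the file path and the variables id, age and name (and their formats) are made up for illustration, so replace them with the columns in your own file. The easiest way to get the exact code for your file is still to use the menu and click Paste.

```spss
* Sketch only: the file path and the variables id, age and name are hypothetical.
GET DATA /TYPE=TXT
  /FILE='C:\path\to\file.csv'
  /DELCASE=LINE
  /DELIMITERS=","
  /QUALIFIER='"'
  /ARRANGEMENT=DELIMITED
  /FIRSTCASE=2
  /VARIABLES=
    id F8.0
    age F3.0
    name A20.
EXECUTE.
DATASET NAME DataSetCsv WINDOW=FRONT.
```

/FIRSTCASE=2 skips the header row, so the variable names come from the /VARIABLES list rather than from the file.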

Already have a .sav file?

GET FILE = 'd:\mydirectory\mysubdirectory\mydata.sav'.

Only read in selected variables

GET FILE = 'd:\mydirectory\mysubdirectory\mydata.sav'
  	/ KEEP id var1 var15 var17 var88
  	/ RENAME (var17 var88 = var16 var17).

The KEEP statement while importing data only works if you are using GET FILE and are reading in a pre-existing SPSS data set.
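
If it is easier to list the variables you do not want, GET FILE also accepts a DROP subcommand. A small sketch, assuming the same hypothetical file and variable names as above:

```spss
* Hypothetical example: read every variable except var2 and var3.
GET FILE = 'd:\mydirectory\mysubdirectory\mydata.sav'
  /DROP = var2 var3.
```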

Specify variable types

SPSS seems to think everything is nominal.
Be sure to change all quantitative variables to SCALE.

VARIABLE LEVEL var1 var2 var3 (SCALE)
  / var4 var5 (ORDINAL)
  / var6 (NOMINAL).
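
One way to check that the measurement levels were changed is to print the dictionary information for those variables. This is just a quick sketch using three of the placeholder variable names from the example above:

```spss
* Lists the type, label and measurement level of each variable named.
DISPLAY DICTIONARY
  /VARIABLES=var1 var4 var6.
```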

Export SPSS formatted data

After your data file has successfully been imported, you'll want to save the result as an SPSS data file (*.sav format) by following these steps:

In the active data window, click File > Save As. The Save Data As window will appear.
Choose the directory where you want the file to be saved.
Type a name for your file in the File name field.
If you wish to save only certain variables in your data set, click Variables in the Save Data As window and select the variables you wish to keep in your saved data file. However, the point-and-click method generates extremely long code; writing your own code (syntax below) is much cleaner. Point and click may still be useful when you need to keep many variables rather than a few.
When you are finished, click Save.

If you choose to only save certain variables in your data set, a KEEP statement will be generated as part of the SAVE DATA code. This list of variables can be manually modified later.

SAVE outfile='FilePath\name_of_file.sav'
/KEEP = Variable_Name1 Variable_Name2.
EXECUTE.

Data processing

A data management file (e.g. dm_addhealth.sps) is a syntax file that, when executed:

Imports an external data file
Executes some data processing steps such as renaming or creating new variables
Saves an analysis-ready SPSS data file to your computer for later use.
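
The three steps above could be laid out in one file roughly like this. It is a sketch, not a template you must follow: the file paths are placeholders, and the recode assumes a BIO_SEX variable coded 1 and 2 as in the recoding examples in these notes.

```spss
* Hypothetical data management file, e.g. dm_example.sps.

* Step 1: import the external data file.
GET DATA /TYPE=XLSX
  /FILE='C:\path\to\file.xlsx'
  /SHEET=name 'Name-of-Sheet'
  /CELLRANGE=full
  /READNAMES=on.

* Step 2: process the data by recoding into a new variable and labeling it.
RECODE BIO_SEX (1=0) (2=1) (ELSE=SYSMIS) INTO FEMALE.
VALUE LABELS FEMALE 0 'male' 1 'female'.
VARIABLE LEVEL FEMALE (NOMINAL).

* Step 3: save an analysis-ready SPSS data file.
SAVE OUTFILE='C:\path\to\file_clean.sav'.
```

Keeping everything in one file means you can rerun it from the top whenever a recode needs fixing.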

https://libguides.library.kent.edu/SPSS/data-management

Changing data type

SPSS seems to save values as Nominal by default.
You can change the data type by doing:

VARIABLE LEVEL varname (SCALE).
EXECUTE.

Recoding variables

There are two ways to recode.
RECODE varname (list of recodes) overwrites the existing variable.

I recommend you use RECODE varname (list of recodes) INTO newvarname to recode the raw variable into a new one. This way if you mess up on the recode (which will happen eventually) you can fix your code and try again without having to read the entire raw data set in again.

Important note: If you are using RECODE INTO and want to change only one value (like a 99 to SYSMIS), you have to copy the remaining values into the new variable with (ELSE=COPY).

EXAMPLES:

Change values and save copy into new variable

Specify every value the variable can take.

RECODE BIO_SEX (1=0) (2=1) (6=SYSMIS) INTO FEMALE.

Only change one out of many values.

RECODE SEAQ4D (99=SYSMIS) (ELSE=Copy) INTO NUMCOOLERS.

Change a range of values

RECODE H4TR6 (11 thru Highest = SYSMIS) (else=copy) into casual_part.

Recoding multiple variables

RECODE V1 TO V3 (0=1) (1=0) (2,3=-1) (9=9) (ELSE=SYSMIS)
  /QVAR(1 THRU 5=1)(6 THRU 10=2)(11 THRU HI=3)(ELSE=0).

Confirm changes were done correctly to recode BIO_SEX into FEMALE

crosstabs BIO_SEX by FEMALE /cells count /missing include.

Applying labels to the variable

VARIABLE LABELS FEMALE 'Female (recoded from BIO_SEX)'.

Applying labels to levels of categorical variable

VALUE LABELS FEMALE
0 'male'
1 'female'.
EXECUTE.

Applying labels to categorical variables

http://www.statsmakemecry.com/smmctheblog/using-syntax-to-assign-variable-labels-and-value-labels-in-s.html

Data Exploration and Visualization

Really good walk through for data exploration
https://libguides.library.kent.edu/SPSS/Explore

Univariate Categorical

Frequency Tables
https://libguides.library.kent.edu/SPSS/FrequenciesCategorical
Analyze -> Descriptive Statistics -> Frequencies

FREQUENCIES VARIABLES = var
  /ORDER = ANALYSIS.

Bar Charts
Analyze -> Descriptive Statistics -> Frequencies -> Charts (bar chart)

FREQUENCIES VARIABLES = var
  /BARCHART FREQ
  /ORDER = ANALYSIS.

If you want a barchart with proportions, change FREQ to PERCENT.

Univariate Quantitative

Analyze -> Descriptive Statistics -> Descriptives

DESCRIPTIVES VARIABLES = variable list
  /STATISTICS MEAN STDDEV MIN MAX.

To get median and quantiles you have to use FREQUENCIES, even though a freq table is not appropriate for continuous data.

The NTILES=4 gives you the quartiles.

FREQUENCIES VARIABLES=casual_part
 /NTILES=4
 /STATISTICS=STDDEV MINIMUM MAXIMUM MEAN MEDIAN.
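
Since the frequency table itself is not useful for continuous data, you can ask FREQUENCIES to leave it out. A sketch of the same command with the table suppressed:

```spss
* Same statistics and quartiles as above, without the frequency table.
FREQUENCIES VARIABLES=casual_part
  /FORMAT=NOTABLE
  /NTILES=4
  /STATISTICS=STDDEV MINIMUM MAXIMUM MEAN MEDIAN.
```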

Plots

Histograms
Density plots
Boxplots
Violin plots
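
Two of these plots can be made with short syntax. The sketch below reuses the casual_part variable from the recoding examples; swap in your own quantitative variable. Density and violin plots are not covered here.

```spss
* Histogram with a normal curve overlaid.
GRAPH
  /HISTOGRAM(NORMAL)=casual_part.

* Boxplot only, with the table of descriptive statistics suppressed.
EXAMINE VARIABLES=casual_part
  /PLOT=BOXPLOT
  /STATISTICS=NONE.
```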

Using the Chart Builder

You may get a message saying "measurement levels should be set properly for each variable". You should pay attention to this, but you can also hit "don't show again" and "OK" to continue.

Bivariate C~C

Summary Statistics

Cross-tab (two-way table): Analyze -> Descriptive Statistics -> Crosstabs
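
A two-way table can also be requested with the CROSSTABS command. In this sketch rowvar and colvar are placeholder names for your two categorical variables:

```spss
* rowvar and colvar are hypothetical variable names.
* Shows counts and row percentages in each cell.
CROSSTABS
  /TABLES=rowvar BY colvar
  /CELLS=COUNT ROW.
```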

Plots

Side by side barcharts
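
One possible way to get a side by side (clustered) bar chart is the legacy GRAPH command. Again, rowvar and colvar are placeholder names, not variables from a real data set:

```spss
* rowvar and colvar are hypothetical variable names.
* One cluster of bars per level of rowvar, one bar per level of colvar.
GRAPH
  /BAR(GROUPED)=COUNT BY rowvar BY colvar.
```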

Bivariate C~Q

Summary Statistics
Use MEANS to calculate the summary measures listed above separately for each group. Add keywords such as MEAN, STDDEV or MEDIAN to /CELLS as needed.

MEANS TABLES=Math BY MathGrade
  /CELLS=COUNT MIN MAX.

Plots

Side by side boxplots
Overlaid density plots
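
Side by side boxplots can be sketched with EXAMINE, here using the Math and MathGrade variables from the MEANS example above. Overlaid density plots are not covered by this sketch.

```spss
* One boxplot of Math for each level of MathGrade.
EXAMINE VARIABLES=Math BY MathGrade
  /PLOT=BOXPLOT
  /STATISTICS=NONE
  /NOTOTAL.
```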

Bivariate Q~Q

Summary Statistics

Correlation
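
A minimal sketch of a Pearson correlation in syntax, assuming two quantitative variables named salary and whours as in the scatterplot example in these notes:

```spss
* Pearson correlation with a two-tailed significance test.
CORRELATIONS
  /VARIABLES=salary whours
  /PRINT=TWOTAIL NOSIG
  /MISSING=PAIRWISE.
```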

Plots

Scatterplot: Graphs -> Legacy Dialogs -> Scatter/Dot
Add trend lines
http://www.unige.ch/ses/sococ/cl/spss/eda/smoothing1.html

Multivariate

Scatterplot colored by third grouping variable
https://www.spss-tutorials.com/spss-scatterplot-tutorial/

GRAPH
    /SCATTERPLOT(BIVAR)=whours WITH salary BY jtype
    /MISSING=LISTWISE
    /TITLE "Monthly Salary by Weekly Hours".

Inference

Comparing two means (T-Test)

To run the two-sample t-test, go to:
Analyze -> Compare Means -> Independent-Samples T Test

Then you move your dependent (response) variable into Test variable.
Then move your independent (explanatory) variable into grouping variable.
Define your groups: this tells SPSS which group (e.g. males) is mu 1 and which group (e.g. females) is mu 2.
Then click paste to get the syntax code, and then run the test.
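
The pasted syntax should look roughly like the sketch below. The variable names are assumptions borrowed from the recoding examples: FEMALE (coded 0 and 1) is the grouping variable and casual_part is the response.

```spss
* FEMALE(0 1) defines group 1 as FEMALE = 0 and group 2 as FEMALE = 1.
T-TEST GROUPS=FEMALE(0 1)
  /MISSING=ANALYSIS
  /VARIABLES=casual_part
  /CRITERIA=CI(.95).
```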

Reading the SPSS two-sample t-test output

Comparing two proportions (T-Test)

Comparing multiple means (ANOVA)
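
A possible starting point is the ONEWAY command. This sketch assumes the salary and jtype variables from the scatterplot example, with jtype stored as a numeric code:

```spss
* One-way ANOVA of salary across the levels of jtype, with group descriptives.
ONEWAY salary BY jtype
  /STATISTICS DESCRIPTIVES
  /MISSING ANALYSIS.
```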

Comparing multiple proportions (Chi-squared test of Association)
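
One way to get the chi-squared test is to add it to a CROSSTABS table. In this sketch rowvar and colvar are placeholder names for your two categorical variables:

```spss
* rowvar and colvar are hypothetical variable names.
* Requests the chi-squared test plus observed and expected counts.
CROSSTABS
  /TABLES=rowvar BY colvar
  /STATISTICS=CHISQ
  /CELLS=COUNT EXPECTED ROW.
```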

Correlation

Regression

Categorical Predictors

GLM with categorical variable
1. Point and click: Analyze -> General Linear Model -> Univariate
2. Choose the type of model, Response (put in the dependent variable), Predictors (put variables into factors and covariates), and Model (move the variables over and make sure the Build Term(s) type is "Main effects")
3. Options -> Check "Parameter estimates"
4. OK/Run

UNIANOVA Self_Scale BY Ethnicity WITH Age
  /METHOD=SSTYPE(3)
  /INTERCEPT=INCLUDE
  /PRINT PARAMETER
  /CRITERIA=ALPHA(.05)
  /DESIGN=Ethnicity Age.

Stratified Analysis

* Sort cases by the stratification variable (jtype).
sort cases by jtype.
* Split the file by the stratification variable (jtype).
split file by jtype.
* Calculate summary statistics within each stratum.
* Example here is the correlation of salary with whours.
correlations salary with whours.
* Turn off file splitting.
split file off.

Reference for regression
https://www-01.ibm.com/support/docview.wss?uid=swg21479670

Model results

Plotting marginal effects models

http://www.statsmakemecry.com/smmctheblog/how-to-plot-interaction-effects-in-spss-using-predicted-values