# Create boxplots using SPSS

A large university reports the percentage of the entering freshman class graduating on time in each of 8 years for each of the 6 colleges that make up the university. The years cover a period of war protest and other upheavals that may have disrupted some students' education plans.[1]

The resulting dataset includes 48 observations and 2 variables:

• College: college ID number
• Percent Graduating on Time: percentage of the entering freshman class that graduated on time

### Dataset

• An SPSS version of the dataset, graduation.sav, is available on your class website.

Open the dataset in the SPSS data editor.

The following instructions are based on the student version of PASW (SPSS) version 18.

### Calculate descriptive statistics by college

Use the Explore analysis to calculate the mean and standard deviation, and the five-number summary, separately for each of the colleges:

• Click Analyze > Descriptive Statistics > Explore....

The Explore dialog displays.

• Move the "Percent Graduating on Time" variable from the left-hand box to the Dependent List box.
• Move the "College" variable from the left-hand box to the Factor List box, as we want the statistics calculated separately for each value (level) of the "College" variable.
• Click the Statistics button.
• Check Descriptives and Percentiles.
• Click Continue.

Although Explore can create the boxplots along with the summary statistics, the instructions further down create them with the Chart Builder instead.

• In the Display area of the Explore dialog box, select Statistics (moving the selection from Both).
• Click OK.

SPSS creates several tables in the output window, containing many more statistics for each college than you need. In the tables, find (and highlight on a printout) the mean, the standard deviation, and the values that make up the five-number summary (minimum, first quartile, median, third quartile, and maximum).
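To check your reading of the Explore output, the same per-college summaries can be sketched outside SPSS. The following is a minimal Python sketch, not SPSS syntax. The dictionary `rates_by_college`, the helper `summarize`, and every data value are hypothetical and are not taken from graduation.sav. Quartile conventions differ between packages, so the first and third quartiles computed this way may not match the SPSS output exactly.

```python
from statistics import mean, stdev, quantiles

# Hypothetical percentages for two made-up colleges; NOT values from graduation.sav.
rates_by_college = {
    1: [50.0, 52.0, 54.0, 56.0, 58.0, 60.0, 62.0, 64.0],
    2: [40.0, 45.0, 50.0, 55.0, 60.0, 65.0, 70.0, 75.0],
}


def summarize(values):
    """Return the mean, sample standard deviation, and five-number summary."""
    ordered = sorted(values)
    # quantiles(n=4) returns the three quartile cut points (Q1, median, Q3).
    q1, median, q3 = quantiles(ordered, n=4)
    return {
        "mean": mean(ordered),
        "sd": stdev(ordered),  # sample standard deviation (n - 1 denominator)
        "min": ordered[0],
        "q1": q1,
        "median": median,
        "q3": q3,
        "max": ordered[-1],
    }


# One summary per college, mirroring the role of the Factor List in Explore.
summaries = {college: summarize(values) for college, values in rates_by_college.items()}

for college, stats in summaries.items():
    print(college, {name: round(value, 2) for name, value in stats.items()})
```

Grouping the values by college before summarizing plays the same role as placing "College" in the Factor List: each college gets its own set of statistics.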

### Create side-by-side boxplots

Use the Chart Builder to create the boxplots:

• Click Graphs > Chart Builder....
• Note that the variables in graduation.sav are appropriately defined; click OK in the Chart Builder warning dialog, if it displays.

The Chart Builder dialog box displays:

• In the Gallery area at the bottom of the box, select Boxplot from the listing.
• Select the left-most picture of boxplots (Simple Boxplot) and drag it to the large chart preview window.

A crude preview displays and the Element Properties window opens.

• To create a boxplot for each value of the College variable, click and drag the "College" variable (from the list on the left) to the X-Axis? box at the bottom of the preview window.

Don't worry that the preview graph fails to represent your data. The preview is based on example data.

• To add the variable of interest, click and drag the "% Graduating on Time" variable (from the list on the left) to the Y-Axis? box on the left side of the preview window.

The side-by-side boxplots are now ready to be created.

• At the bottom of the Chart Builder dialog box, click OK.

The Chart Builder dialog box closes and SPSS activates the Output window to display the boxplots.

To make adjustments to the resulting boxplots, double-click the graph displayed in the output window.

The Chart Editor displays, which includes many options for customizing a graph.

• Select Options > Title.

A title is added to the graph with the word "Title".

• Click the word "Title" to highlight the title box, then click it again to begin editing.
• Enter the title text in the newly added titling area.
• Press Enter on your keyboard to add the title to the graph.

As the boxes are rather short, and there is a lot of white space on the graph, let's adjust the scale of the y-axis.

• Select Edit > Select Y Axis.

A Properties box displays.

• Select the Scale tab from the tab bar at top.

Let's adjust the minimum. Note that the Scale tab lists the minimum and maximum data values; here the data minimum is 43.2.

• For the Minimum value, uncheck Auto and enter a value for the scale minimum in the Custom field (e.g., 30).
• Click Apply.

Close the Chart Editor; the changes are applied to the graph in the output window.

The boxplots can be used to visually compare the distribution of graduation rates among the colleges.
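To see what a single box in the plot encodes, here is a minimal Python sketch of the common 1.5 × IQR convention: the box spans the first to third quartile, the whiskers reach the most extreme values inside the fences, and anything beyond a fence is flagged as an outlier. The helper `boxplot_elements` and the sample values are hypothetical and are not taken from graduation.sav. SPSS may use a slightly different quartile convention for the box edges, so treat this as an illustration of the idea rather than a reproduction of the SPSS chart.

```python
from statistics import quantiles


def boxplot_elements(values):
    """Compute the box, whiskers, and outliers using the 1.5 * IQR rule."""
    ordered = sorted(values)
    q1, median, q3 = quantiles(ordered, n=4)
    iqr = q3 - q1  # interquartile range: the height of the box
    lower_fence = q1 - 1.5 * iqr
    upper_fence = q3 + 1.5 * iqr
    # Whiskers end at the most extreme observations that lie inside the fences.
    inside = [v for v in ordered if lower_fence <= v <= upper_fence]
    outliers = [v for v in ordered if v < lower_fence or v > upper_fence]
    return {
        "q1": q1,
        "median": median,
        "q3": q3,
        "iqr": iqr,
        "whisker_low": inside[0],
        "whisker_high": inside[-1],
        "outliers": outliers,
    }


# Hypothetical percentages for one made-up college, with one unusually low year.
sample = [30.0, 52.0, 54.0, 56.0, 58.0, 60.0, 62.0, 64.0]
box = boxplot_elements(sample)
print(box)
```

A college whose box is short has consistent graduation rates from year to year, while a flagged outlier points to a single unusual year worth a closer look.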

### Interpreting the boxplots

Consider the following questions: How do the distributions for the different colleges compare with one another in terms of center, spread, and outliers? If you were to choose one of the university's 6 colleges to attend, which would you choose based on the available data? Why?

### Notes

1. Data story available at the Data and Story Library (DASL), retrieved 31 August 2012.