StatTutor: Relationship of seat location and academic performance (Question 3)
Description of activity and Excel instructions: Relationship of seat location and academic performance (Question 3)
Dataset: Direct link for body_image.xls not available
Before you begin, you may want to create a new worksheet (; ) for the analysis of Question 3 to avoid messing up the analysis of Q1.
- Right-click on the tab and select 'Move/Copy Sheet....
- Check the box to make a copy.
Calculate descriptive statistics for each value of the categorical variable
Before starting on the table, sort the data by the categorical variable. (Be sure to indicate that your data has a header row.)
Now that the data are sorted, pick an empty area in your worksheet and create a table where the row headers are "Min", "Q1", "Median", "Q3", "Max" and the column headers are the different values of the categorical variable.
For each cell in the table, enter the formula to calculate the statistic listed in the row header. The data range to include in each formula is the range of data in the quantitative variable that corresponds to the categorical variable value listed in the column header.
The formulas you will need are:
- Min: "=min(X2:X99)"
- Q1: "=quartile(X2:X99,1)" [Note that in some operating systems the formula uses a semi-colon instead of a colon between the range and the numeral "1".]
- Median: "=median(X2:X99)"
- Q3: "=quartile(X2:X99,3)" [Note that in some operating systems the formula uses a semi-colon instead of a colon, between the range and the number "3".]
- Max: "=Max(X2:X99)"
Create side-by-side boxplots
Unfortunately, Calc can't create boxplots directly, so we have to create a column graph and alter it to display as side-by-side boxplots. We start by using the calculated Descriptive Statistics to generate new values that will be the basis for our column graph.
Create the necessary statistics
- In each successive cell under Max, type "25th Percentile", "50th Percentile", "75th Percentile", "Lower limit", and finally "Upper limit".
- The values in these new rows, for each seat location, are calculated based on the 5-number summary values. Enter the necessary formulas to obtain:
- 25th Percentile = Q1 (Remember to copy the numbers from Q1 without copying the formulas.)
- 50th Percentile = Median - Q1
- 75th Percentile = Q3 - Median
- Lower limit = Q1 - min
- Upper limit values = Max - Q3
Create the boxplots
Now you are ready to create the boxplots.
- Highlight the values in the 25th Percentile, 50th Percentile, and 75th Percentile rows
- Choose Insert > Chart...
The Chart Wizard displays.
- In 1. Chart Type', choose the stacked column option.
- In 2. Data Range, ensure that the data is correctly specified.
- In 3. Data Series, use the Categories window to specify the category labels.
- In 4. Chart Elements, label and select appropriate options for your chart.
A stacked column graph displays. To create the boxplots, add error bars (to the min and max values) and adjust the color and outlines as needed.
- Use the Insert Y Error Bars... option to add the Upper Limit error bar.
- Use the Insert Y Error Bars... option to add the Lower Limit error bar. (Remember to add this error bar select the "lowest portion of the bar.")
- For each section of the bars, adjust the color and border as necessary.
- If needed, adjust the y-axis scale.
Use the boxplots to visually compare the GPA distributions for students who tend to sit in the front, middle or rear of a classroom.
Return to the StatTutor page to complete the interpretive exercises. (Be sure to include the data in your description of the results, but save your thoughts on what the results mean for the conclusions section.)