StatTutor: Driving experience and monthly auto premium (Question 2)
Description of activity and Excel instructions: StatTutor Driving experience and monthly auto premium (Question 2)
Dataset: Direct link for auto_premium.xls not available.
Exploratory data analysis
- Rename "Sheet 1" to "Q1".
- Rename "Sheet 2" to "Q2".
- Copy and paste the raw data from sheet "Q1" to sheet "Q2".
Create a scatterplot
In preparation for creating the scatterplot:
- Rearrange or delete columns of data so that the data for the two variables to be plotted are adjacent to one another.
- Sort the data so that observations with missing values can be easily avoided when specifying the data for inclusion in the scatterplot.
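For readers who want to check the same idea outside the spreadsheet, here is a minimal Python sketch of setting aside observations with missing values before plotting. The rows and the function name complete_pairs are made up for illustration and are not taken from auto_premium.xls.

```python
# Hypothetical rows: (years of driving experience, monthly premium).
# None marks a missing value; these numbers are illustrative only.
rows = [(5, 64.0), (2, None), (12, 50.0), (None, 71.0), (15, 44.0)]


def complete_pairs(rows):
    """Keep only observations where both variables are present."""
    return [(x, y) for x, y in rows if x is not None and y is not None]


pairs = complete_pairs(rows)
print(pairs)
```

Only the complete pairs would then be used for the scatterplot and for the correlation.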
Use the Chart Wizard to create a scatterplot.
- Select the columns containing the two variables to be plotted.
- Select Insert > Chart....
The Chart Wizard displays.
In 1. Chart Type:
- Select XY (Scatter).
- Select the Points Only version.
In 2. Data Range:
- Click Data series in columns.
- Check First row as label.
No changes are needed for 3. Data Series.
In 4. Chart Elements:
- Enter a title for the chart.
- Enter a title for the X axis (explanatory variable).
- Enter a title for the Y axis (response variable).
- Decide if you want to display a grid in the chart background.
- Uncheck the Display Legend box.
Calc creates a scatterplot.
- Check that the scales for the x and y axes are reasonable given the data. Adjust as needed.
If you want to change the shape or size of the data points:
- Double-click the graph to enter edit mode.
- Right-click the data points.
- Select Object Properties...
- Explore the options in the Icon section, in the Line tab.
Compute the correlation coefficient
Use a formula to compute the correlation:
- Click in a cell outside of the first two columns of data.
- Type "=CORREL([range1],[range2])" where [range1] is the first column of data and [range2] is the second column of data.
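As a cross-check on the spreadsheet result, the Pearson correlation coefficient that CORREL returns can be sketched in plain Python. The function name pearson_r and the sample values are assumptions for illustration, not the StatTutor data.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of cross-products and squared deviations from the means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical values: years of experience and monthly premium.
experience = [5, 2, 12, 9, 15, 6, 25, 16]
premium = [64, 87, 50, 71, 44, 56, 42, 60]
print(round(pearson_r(experience, premium), 4))
```

A negative result here would indicate that premiums tend to fall as experience rises; the value from the real dataset should match what CORREL reports up to rounding.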
Return to the StatTutor exercise for Question 2 to provide a comparison of the data in the two groups: describe the key features of the data display and support your description with numerical measures. (Be sure to include the numerical results in your description.)
Calc does not include a formula or data analysis routine to calculate the "t-test for the slope" of a regression line. You can use Excel for this activity, or the web-based p-value calculator for correlation coefficients at danielsoper.com. Note that the online calculator returns a slightly higher, though still very small, p-value than the one reported by Excel.
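If neither tool is at hand, the test can also be sketched in Python. In simple linear regression the t statistic for the slope can be obtained from the correlation r and the sample size n as t = r * sqrt((n - 2) / (1 - r^2)), with n - 2 degrees of freedom. The function names below are assumptions, and the p-value is approximated by numerically integrating the t density, so it loses precision for extremely small p-values and should be treated as a rough check rather than a replacement for Excel's output.

```python
import math


def slope_t_statistic(r, n):
    """t statistic for the regression slope, computed from r and n."""
    if n < 3 or not -1 < r < 1:
        raise ValueError("need n >= 3 and -1 < r < 1")
    return r * math.sqrt((n - 2) / (1 - r * r))


def t_pdf(t, df):
    """Density of Student's t distribution (fine for small samples)."""
    coef = math.gamma((df + 1) / 2) / (
        math.sqrt(df * math.pi) * math.gamma(df / 2)
    )
    return coef * (1 + t * t / df) ** (-(df + 1) / 2)


def two_sided_p(t, df, steps=2000):
    """Approximate two-sided p-value using Simpson's rule on [0, |t|]."""
    b = abs(t)
    if b == 0:
        return 1.0
    h = b / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(b, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    central = total * h / 3  # P(0 < T < |t|)
    return max(0.0, 1 - 2 * central)


# Hypothetical inputs: r = -0.77 from n = 8 observations.
t = slope_t_statistic(-0.77, 8)
print(t, two_sided_p(t, 8 - 2))
```

Substitute the correlation and the number of complete observations from your own sheet for the hypothetical inputs.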
Return to the StatTutor exercise for Question 2 to report results and draw conclusions.