StatTutor: Driving experience and monthly auto premium (Question 2)

From WikiEducator


Description of activity and Excel instructions: StatTutor Driving experience and monthly auto premium (Question 2)

Dataset: Direct link for auto_premium.xls not available.

Exploratory data analysis

  • Rename "Sheet 1" to "Q1".
  • Rename "Sheet 2" to "Q2".
  • Copy and paste the raw data from sheet "Q1" to sheet "Q2".

Create a scatterplot

In preparation for creating the scatterplot,

  • Rearrange or delete columns of data so that the data for the two variables to be plotted are adjacent to one another.
  • Sort the data so that observations with missing values can be easily avoided when specifying the data for inclusion in the scatterplot.
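For readers who would rather prepare the data outside the spreadsheet, the same two steps (keep only the two variables, leave out observations with missing values) can be sketched in Python. The column names "experience" and "premium" and every value below are made up for illustration; they are not taken from auto_premium.xls.

```python
import csv
import io

# Made-up sample in CSV form; the column names and values are illustrative only
# and are NOT taken from the auto_premium.xls dataset.
RAW = """driver_id,experience,premium
1,5,64
2,2,87
3,12,
4,,71
5,9,50
"""


def complete_pairs(text, x_col, y_col):
    """Return (x, y) pairs for the rows in which both columns hold a value."""
    pairs = []
    for row in csv.DictReader(io.StringIO(text)):
        x, y = row[x_col].strip(), row[y_col].strip()
        if x and y:  # skip observations with a missing value
            pairs.append((float(x), float(y)))
    return pairs


pairs = complete_pairs(RAW, "experience", "premium")
print(pairs)
```

Filtering the rows plays the same role as sorting in the spreadsheet: both make it easy to hand only complete observations to the plot.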

Use the Chart Wizard to create a scatterplot.

  • Select the columns containing the two variables to be plotted.
  • Select Insert > Chart....

The Chart Wizard displays.

In 1. Chart Type:

  • Select XY (Scatter).
  • Select the Points Only version.

In 2. Data Range:

  • Click Data series in columns.
  • Check First row as label.

No changes are needed for 3. Data Series.

In 4. Chart Elements:

  • Enter a title for the chart.
  • Enter a title for the X axis (explanatory variable).
  • Enter a title for the Y axis (response variable).
  • Decide if you want to display a grid in the chart background.
  • Uncheck the Display Legend box.

Calc creates a scatterplot.

  • Check that the scales for the x and y axes are reasonable given the data. Adjust as needed.

If you want to change the shape or size of the data points:

  • Double-click the graph to enter edit mode.
  • Right-click the data points.
  • Select Object Properties....
  • Explore the options in the Icon section, in the Line tab.

Compute the correlation coefficient

Use a formula to compute the correlation:

  • Click in a cell outside of the first two columns of data.
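  • Type "=CORREL([range1],[range2])" and press Enter, where [range1] is the first column of data and [range2] is the second column of data. (Some versions of Calc expect a semicolon rather than a comma between the two ranges.)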
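As a cross-check on the spreadsheet result, the Pearson correlation coefficient that CORREL returns can be computed by hand. The sketch below is a minimal version using only the Python standard library; the function name pearson_r and the sample values are hypothetical and are not the StatTutor data.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length, at least 2 values each")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of products and squares of the deviations from the means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Made-up values for illustration only (years of experience, monthly premium).
experience = [5, 2, 12, 9, 15, 6]
premium = [64, 87, 50, 71, 44, 56]
print(round(pearson_r(experience, premium), 3))
```

A value near -1 or +1 indicates a strong linear relationship, and the sign gives its direction.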

Return to the StatTutor exercise for Question 2 to provide a comparison of the data in the two groups: describe the key features of the data display and support your description with numerical measures. (Be sure to include the numerical results in your description.)

Calculate p-value

Calc does not include a formula or data analysis routine for the t-test for the slope of a regression line. You can use Excel for this activity, or the web-based p-value calculator for correlation coefficients at danielsoper.com. Note that the online calculator returns a slightly higher, though still very small, p-value than the one Excel reports.
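In simple linear regression, the t-test for the slope is equivalent to a test of the correlation coefficient: t = r * sqrt(n - 2) / sqrt(1 - r^2), with n - 2 degrees of freedom. The sketch below, offered as a rough alternative rather than the method StatTutor or Excel uses, computes that statistic and approximates the two-sided p-value by numerically integrating the t density with Simpson's rule. The function names and the example inputs (r = -0.8, n = 12) are hypothetical. Because the p-value is obtained by subtraction, this approach loses precision when the p-value is extremely small; in that case a statistics package is the better tool.

```python
import math


def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))


def two_sided_p(t, df, steps=2000):
    """Approximate two-sided p-value using Simpson's rule over [0, |t|]."""
    b = abs(t)
    if b == 0:
        return 1.0
    h = b / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(b, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3  # probability between 0 and |t|
    return max(0.0, 1.0 - 2.0 * area)


def slope_t_test(r, n):
    """Return (t, df, p) for the test that the regression slope is zero."""
    df = n - 2
    t = r * math.sqrt(df / (1.0 - r * r))
    return t, df, two_sided_p(t, df)


# Hypothetical inputs for illustration only, not the StatTutor results.
t_stat, df, p_value = slope_t_test(-0.8, 12)
print(t_stat, df, p_value)
```

Use the correlation from the previous step as r and the number of complete observations as n.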

Return to the StatTutor exercise for Question 2 to report results and draw conclusions.