User:ASnieckus/StatisticsContent/Learn by doing/Driving experience and monthly auto premium (Question 2)

Description of activity and Excel instructions: StatTutor Driving experience and monthly auto premium (Question 2)

Dataset: Direct link for auto_premium.xls not available.

Exploratory data analysis

 * Rename "Sheet 1" to "Q1".
 * Rename "Sheet 2" to "Q2".
 * Copy and paste the raw data from sheet "Q1" to sheet "Q2".

Create a scatterplot
To prepare the data for the scatterplot:
 * Rearrange or delete columns of data so that the data for the two variables to be plotted are adjacent to one another.
 * Sort the data so that observations with missing values can be easily avoided when specifying the data for inclusion in the scatterplot.
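
For readers who want to cross-check this step outside the spreadsheet, the idea of keeping only complete observations can be sketched in Python. The values below and the use of `None` to stand for an empty cell are assumptions made for illustration; they are not taken from auto_premium.xls.

```python
# Hypothetical (experience, premium) rows; None marks an empty cell.
# These values are invented for illustration only.
rows = [
    (5, 64.0),
    (None, 87.0),
    (12, 50.0),
    (9, None),
    (15, 44.0),
]


def complete_pairs(rows):
    """Keep only the rows in which both values are present."""
    return [(x, y) for x, y in rows if x is not None and y is not None]


pairs = complete_pairs(rows)
print(pairs)
```

Sorting in the spreadsheet serves the same purpose: it groups the incomplete rows together so that they can be left out of the selected data range.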

Use the Chart Wizard to create a scatterplot:
 * Select the columns containing the two variables to be plotted.
 * Select Insert > Chart....

The Chart Wizard displays.

In ''1. Chart Type'':
 * Select XY (Scatter).
 * Select the Points Only version.

In ''2. Data Range'':
 * Click Data series in columns.
 * Check First row as label.

No changes are needed for ''3. Data Series''.

In ''4. Chart Elements'':
 * Enter a title for the chart.
 * Enter a title for the X axis (explanatory variable).
 * Enter a title for the Y axis (response variable).
 * Decide if you want to display a grid in the chart background.
 * Uncheck the Display Legend box.

Calc creates a scatterplot.
 * Check that the scales for the x and y axes are reasonable given the data. Adjust as needed.

If you want to change the shape or size of the data points:
 * Double-click the graph to enter edit mode.
 * Right-click the data points.
 * Select Object Properties....
 * Explore the options in the Icon section of the Line tab.

Calculate the correlation coefficient
Use a formula to compute the correlation:
 * Click in a cell outside of the first two columns of data.
 * Type "=CORREL([range1],[range2])", where [range1] is the range of cells holding the first column of data and [range2] is the range holding the second column of data.
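
As a rough cross-check of what CORREL computes, the Pearson correlation coefficient can be sketched in plain Python. The function name `pearson_r` and the sample values are assumptions made for illustration; the numbers are invented and are not the auto_premium.xls data.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length, at least 2 values")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of cross-products and squared deviations from the means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Invented illustration values (years of experience, monthly premium).
experience = [1, 2, 3, 4, 5]
premium = [80, 70, 75, 60, 55]

r = pearson_r(experience, premium)
print(round(r, 3))
```

A value near -1 or 1 indicates a strong linear relationship, and the sign gives its direction; the result should agree with the scatterplot.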

Return to the StatTutor exercise for Question 2 to provide a comparison of the data in the two groups: describe the key features of the data display and support your description with numerical measures. (Be sure to include the numerical results in your description.)

Calculate the p-value
Calc does not include a formula or data analysis routine to calculate the "t-test for the slope" of a regression line. You can use Excel for this activity, or the web-based p-value calculator for correlation coefficients at danielsoper.com. Note that the online calculator returns a slightly higher, though still very small, p-value than the one Excel reports.
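
If neither tool is at hand, the test can also be sketched in Python. In simple linear regression, the t statistic for the slope can be written in terms of the correlation r and the number of observations n as t = r * sqrt((n - 2) / (1 - r^2)), with n - 2 degrees of freedom. The function names below are hypothetical, the input values are invented, and the p-value is approximated by numerically integrating the t density with Simpson's rule, so very small p-values are only as accurate as that integration.

```python
import math


def t_statistic(r, n):
    """t statistic for H0: slope = 0, from correlation r and sample size n.

    Assumes n > 2 and -1 < r < 1 (r = +/-1 would divide by zero).
    """
    return r * math.sqrt((n - 2) / (1 - r * r))


def t_pdf(t, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log1p(t * t / df))


def two_sided_p(t, df, steps=2000):
    """Approximate two-sided p-value, P(|T| > |t|).

    Integrates the density over [0, |t|] with Simpson's rule
    (steps must be even) and uses the symmetry of the distribution.
    """
    upper = abs(t)
    if upper == 0:
        return 1.0
    h = upper / steps
    total = t_pdf(0.0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3
    return max(0.0, 1.0 - 2.0 * area)


# Invented illustration values, not results from auto_premium.xls.
r, n = -0.915, 5
t = t_statistic(r, n)
print(t, two_sided_p(t, n - 2))
```

Because this is an approximation, treat the printed p-value as a rough check on the Excel or online result rather than as a replacement for it.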

Return to the StatTutor exercise for Question 2 to report results and draw conclusions.