User:ASnieckus/StatisticsContent/Learn by doing/Mean SAT verbal score (Question 1)

Description of activity and Excel instructions: StatTutor Mean SAT score (Question 1)

Dataset: Direct link for cell_phones.xls not available.

Exploratory Analyses
This exercise begins by taking a look at the data.

Calculate descriptive statistics
Before beginning on the table of descriptive statistics, let's sort the data to make it easier to check the tabulation.


 * Type Ctrl+A to select all of the data.
 * Select Data > Sort....

The Sort dialog displays.


 * Click the Options tab.
 * Check Range contains column labels.
 * Cleck the Sort Criteria tab
 * Under Sort by, in the drop down menu, select the relevant variable
 * Click OK.

Note that the missing values are listed at the bottom

Now let's make the row and column labels for the table.
 * Select a cell to be the upper left corner of the table; choose a cell in an empty section of your worksheet (to the right of the data columns)
 * Type "Statistic".
 * In the cell directly to the right, type the name of the variable.
 * In the each successive cell under "Statistic", type "Mean", "Stand Dev", "Min", "Q1", "Median", "Q3", "Max".

Now let's add the formulas. We use [range] to indicate where the cell range should go. When you enter the range of data, be sure to exclude the cells with an asterisk (*) that indicate missing values. Enter the formulas in the cell to the right of each statistic's label.
 * Mean: Type "=average([range])".
 * Stand Dev: Type "=stdev([range])".
 * Q1: Type "=quartile([range], 1)".
 * Min: Type "=min([range])".
 * Median: Type "=median([range])".
 * Max: Type "=max([range])".
 * Q3: Type "=quartile([range], 3)".

Create the frequency table
First we need to define the bins (also called intervals or classes) in which to group the data. In an effort to create better x-axis labels for the histogram, let's create data labels in the column to the right of the bins.
 * Decide on an appropriate bin width, and then in one of the columns to the right of the data, make a list of the upper limit in each bin, from lowest down to highest. For example, if the data range from 3 to 86, you might choose a bin width of 10. Your bins would start at 10, then 20, 30,..., 90). In general, choose a bin width so that there are between 5 and 15 bins. And, remember the bin number is the upper limit of the bin, so there will be no bin labeled "0".
 * Highlight the block of cells.
 * Select Format > Cells....

The Format Cells dialog displays.
 * Select the Numbers tab.
 * Select Text in the dropdown box under Category.
 * Click OK.
 * Enter the range in each cell (for example, '0-5' instead of just '5').

Now add the frequencies in the next column to the right.
 * Use the FREQUENCY function to calculate the number of observations in each bin. Display the frequency data in the cells to the right of the bin labels. Remember that the FREQUENCY function requires two arguments: data and classes. Indicate the correct range for each of these.

Create the histogram

 * Choose Insert > Chart....

The Chart Wizard dialog box opens. No revisions are necessary on ''3. Data Series''.
 * On 1. Chart Type: select Column.
 * On 2. Data Range: insert the data range (labels and frequencies, but not the upper limit in each bin), and indicate that the data series is in a column, and the first column is a label.
 * On 4. Chart Elements: enter a title, axis labels, and adjust other elements on the chart.
 * Click Finish.

The chart will be inserted into the open sheet. Widen the chart as needed to correctly display the x-axis labels.
 * Move the chart to be situated next to the frequency data.
 * Remove the space between the bars, using the Object Properties... option.
 * Click OK.

Return to the StatTutor exercise for Question 1 to describe the key features of data display and support your description with numerical measures. (Be sure to include the numerical results in your description.)

Test hypothesis

 * Calculate the test statistic using the formula:
 * $$z = \frac{\bar{x} - \mu}{\frac{\sigma}{\sqrt{n}}}$$


 * Calculate the p-value using the Calc function NORMSDIST
 * Adjust the formula so that the resulting value is the area in the extreme of one or both tails, depending on the Ha statement specified.

Return to the StatTutor exercise for Question 1 to report results and draw conclusions.

Create confidence interval
Use the Calc function CONFIDENCE(alpha, standard deviation, sample size) to calculate the value (margin of error) to be added/subtracted to/from the mean. Note that if you want a 95% confidence interval, the alpha value (also termed the significance level) is .05 (1-.95).

Return to the StatTutor exercise for Question 1 to complete the draw conclusions section.