Hypothesis testing of a single mean--verbal and math ability of CS students

This activity provides independent practice in using the one-sample t test (comparing a sample mean to a population mean) within the context of the four steps of hypothesis testing:
 * 1) State the appropriate null and alternative hypotheses, Ho and Ha.
 * 2) Obtain a random sample, collect relevant data, and check whether the data meet the conditions under which the test can be used. If the conditions are met, summarize the data by a test statistic.
 * 3) Find the p-value of the test.
 * 4) Based on the p-value, decide whether or not the results are significant and draw your conclusions in context.
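
To make step 2 concrete, the following is a minimal Python sketch of how a one-sample t statistic can be computed by hand. It is not part of the SPSS workflow this activity uses. The function name `one_sample_t` and the scores are made up for illustration and are not taken from the dataset.

```python
import math
import statistics


def one_sample_t(sample, mu0):
    """Return (t, df) for a one-sample t test of H0: mu = mu0."""
    n = len(sample)
    if n < 2:
        raise ValueError("need at least two observations")
    x_bar = statistics.mean(sample)
    s = statistics.stdev(sample)  # sample standard deviation (divisor n - 1)
    if s == 0:
        raise ValueError("sample has zero variance")
    standard_error = s / math.sqrt(n)
    t = (x_bar - mu0) / standard_error
    return t, n - 1


# Made-up scores for illustration only; NOT taken from csdata.
illustrative_scores = [520, 610, 580, 640, 550, 600, 590, 630]
t_stat, df = one_sample_t(illustrative_scores, mu0=539)
print(f"t = {t_stat:.3f} on {df} degrees of freedom")
```

The statistic measures how many estimated standard errors the sample mean lies from the hypothesized population mean.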

Research question
A study of freshman computer science majors, designed to investigate why students intending to major in computer science failed to do so, collected data on 224 beginning computer science majors in a particular year. The resulting dataset includes 8 variables.
 * OBS: ID number
 * GPA: The 3-semester grade-point average (0-4 scale)
 * HSM: average high school grade in math (1-10 scale, with 10=A, 9=A-, etc.)
 * HSS: average high school grade in science (1-10 scale, with 10=A, 9=A-, etc.)
 * HSE: average high school grade in English (1-10 scale, with 10=A, 9=A-, etc.)
 * SATM: SAT Mathematics score (circa 1980-82)
 * SATV: SAT Verbal score (circa 1980-82)
 * SEX: 1=male; 2=female

The researchers might have been interested in how the SAT Math and Verbal scores for the computer science students compared with scores from other university students. Let's assume the following population values so you can practice how to compare the sample mean to the population mean:
 * Mean SATM for all university students = 539
 * Mean SATV for all university students = 498

For the analysis, the significance level, α, is set at .05.
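
With these assumed population values, the setup can be written compactly. The sketch below assumes a two-sided alternative, which the text does not state explicitly, so treat it as an illustration rather than the required answer. The symbols μ_M and μ_V are introduced here to denote the true mean SATM and SATV scores of the population of computer science majors from which the sample was drawn.

```latex
\begin{aligned}
\text{SATM:}\quad & H_0\colon \mu_M = 539 \quad\text{vs.}\quad H_a\colon \mu_M \neq 539 \\
\text{SATV:}\quad & H_0\colon \mu_V = 498 \quad\text{vs.}\quad H_a\colon \mu_V \neq 498 \\
\text{Decision:}\quad & \text{reject } H_0 \text{ if } p < \alpha = 0.05
\end{aligned}
```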

Dataset
Obtain the dataset from one of the following:
 * class website: csdata.por (portable file format)
 * website accompanying Introduction to the Practice of Statistics by Moore, McCabe, and Craig (zip files in various formats)
 * dataset list: csdata.por (portable file format)

Analyses
The following instructions and guiding questions will step you through the analysis process. Copy and paste the following sections ("SATM", "SATV", and "Summarize") into a word processor. Provide responses as indicated.

SATM

 * 1) Let μ be the mean SATM score for the population of computer science students at the university. State the hypotheses that are being tested in this problem.
 * 2) Data collection and examination
   * a) Look at the data. Using SPSS, calculate descriptive statistics and create a histogram (see instructions). Describe the data and shape of the distribution.
   * b) Explain why the conditions which allow us to safely use the one-sample t test are met.
   * c) Would it be valid to use the t test if the data were highly skewed with a few large outliers? Explain.
   * d) Using SPSS, run the one-sample t test procedure.
   * e) Report the value of the test statistic.
   * f) How is the t statistic calculated (write the formula)?
   * g) Describe what this t statistic value means.
 * 3) Report the p-value for the statistical test.
 * 4) Interpret the analysis results in the context of the research question.
   * a) Indicate whether or not Ho is rejected. Provide evidence.
   * b) Draw conclusions based on the results, given the context of the research question.
   * c) If Ho is rejected, report a confidence interval appropriate to the given significance level.
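
SPSS reports the p-value directly. As a rough illustration of what that number represents, the sketch below computes a two-sided p-value from a t statistic and its degrees of freedom by numerically integrating the t density. It uses only the Python standard library. The function names `t_pdf` and `two_sided_p_value` are made up for this example, and the two-sided alternative is an assumption.

```python
import math


def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_norm = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                - 0.5 * math.log(df * math.pi))
    return math.exp(log_norm - (df + 1) / 2 * math.log1p(x * x / df))


def two_sided_p_value(t, df, intervals=2000):
    """Two-sided p-value P(|T| >= |t|), using Simpson's rule on [0, |t|].

    `intervals` must be even for Simpson's rule.
    """
    upper = abs(t)
    if upper == 0:
        return 1.0
    h = upper / intervals
    total = t_pdf(0.0, df) + t_pdf(upper, df)
    for i in range(1, intervals):
        weight = 4 if i % 2 else 2
        total += weight * t_pdf(i * h, df)
    central_area = 2 * (h / 3) * total  # P(-|t| <= T <= |t|)
    return max(0.0, 1.0 - central_area)


# 2.228 is the tabled 5% two-sided critical value for df = 10,
# so the result should be close to 0.05.
print(round(two_sided_p_value(2.228, 10), 3))
```

Because the p-value is obtained as one minus the central area, very small p-values lose relative precision in this sketch. Rely on the SPSS output for reported results.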

SATV

 * 1) Let μ be the mean SATV score for the population of computer science students at the university. State the hypotheses that are being tested in this problem.
 * 2) Data collection and examination
   * a) Look at the data. Using SPSS, calculate descriptive statistics and create a histogram (see instructions). Describe the data and shape of the distribution.
   * b) Explain why the conditions which allow us to safely use the one-sample t test are met.
   * c) Using SPSS, run the one-sample t test procedure.
   * d) Report the value of the test statistic.
 * 3) Report the p-value for the statistical test.
 * 4) Interpret the analysis results in the context of the research question.
   * a) Indicate whether or not Ho is rejected. Provide evidence.
   * b) Draw conclusions based on the results, given the context of the research question.
   * c) If Ho is rejected, report a confidence interval appropriate to the given significance level.
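
For reference, the confidence interval that pairs with a two-sided one-sample t test at significance level α has the standard form below. This assumes a two-sided alternative. Here x̄ is the sample mean, s the sample standard deviation, n the sample size, and t* the upper α/2 critical value of the t distribution with n − 1 degrees of freedom.

```latex
\bar{x} \;\pm\; t^{*}_{\,n-1,\;\alpha/2}\,\frac{s}{\sqrt{n}}
```

With α = .05 this is a 95% confidence interval, and with n = 224 the degrees of freedom are 223. The two-sided test rejects Ho at level α exactly when the hypothesized population mean falls outside this interval.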

Summarize

 * 1) Integrate your findings from the two analyses for SATM and SATV.
 * 2) What limitations (related to sample, research design, choice of analyses...) affect the validity of this research?