Behavior of sample proportion

From WikiEducator
Jump to: navigation, search

Description of activity and Excel instructions:Behavior of sample proportion

Dataset: none

Create 250 samples from binomial distribution with p=.6 and n=100

Excel offers a standard formula for generating random samples. Using what we know about the transformation of random variables, we can generate the random samples in Calc, using the basic formulas. The RAND() function generates numbers between 0 and 1. The expected value of this distribution is .5. For this simulation, we will transform the random variable by adding .1 to each generated value, RAND()+0.1. Then if we round the resulting value to an integer value, we will obtain values such that 1 represents success (in this case female; 0 represents male) and the probability of success is .6.

  • Open Calc with a new worksheet.
  • Enter the formula "=ROUND(RAND()+0.1)" into A1.
  • Copy this formula (grab the cell handle and drag) to all of the cells from A1 to IP100 (creating 250 columns and 100 rows of 1s and 0s).

Each column represents a random sample of size 100 chosen from a population where the proportion of "success" (in our case, females) is 0.6.

We now want to ask the software to find the sample proportion for each of the 250 samples (columns). Since we used the codes 1 and 0, calculating the sample proportions of 1's in each column is the same as finding the average of each column.

  • Under the first column of data (say, in row 102), type the following formula: "=average(A1:A100)".
  • Copy the formula (grab the cell handle and drag) to calculate the average for each of the subsequent columns of data; Calc will automatically adjust the formula to average each column of data.

Now you should have a row with 250 sample proportions corresponding to the 250 samples.

Create a histogram

Create a histogram of the 250 sample proportions ([math]\hat{p}[/math]), to provide a visual display of the sample distribution.

The steps are:

  1. Create the bins, specifying the upper value for each bin. (Use about 13 or so bins.)
  2. Create the frequency distribution, using the FREQUENCY formula.
  3. Create a bar chart of the frequencies, with the bin values serving as categories and removing the space between the bars.

If needed, use the instructions for creating a histogram from the EDA section, to refresh your memory.

Return to the OLI Behavior of sample proportion page to complete the interpretive exercises.