User:ASnieckus/StatisticsContent/Learn by doing/DelayedDepression

Description of activity and Excel instructions: StatTutor -- [https://oli.web.cmu.edu/jcourse/webui/resolver/link/resource.do?src=2ae55ede80020ca6000a8a6d9295d0e5&dst=depression_excel Which of the drugs (if either) delayed the recurrence of depression longer relative to the placebo? (Question 2)]

Dataset: Direct link for depression.xls not available.

Comparing distributions of Time for the different treatment groups
Time is a quantitative variable (measured in number of days). We will compare 5-number summaries and boxplots for each of the three treatment groups.

Sort the data by treatment group
The data is already sorted according to the categorical variable Treat.

Create descriptive statistics
The first step is to create a table of descriptive statistics, the 5-number summary.

To create this table, first make the row and column labels:
 * In an empty section of your worksheet, pick a cell and type "Statistic".
 * Move one cell to the right and type "Lithium (ttt 0)".
 * Move one cell to the right, again and type "Imipramine (ttt 1)"
 * And another cell to the right and type "Placebo (ttt 2)".
 * In the cells underneath Statistic, type "Min", "Q1", "Median", "Q3", and finally "Max".

Now for each statistic, enter the formula to calculate that statistic using the range of data that corresponds to each treatment group:

Starting with the Lithium cells, enter the formulas for the following statistics, where [range] is the range of data for Time for Treat=0.
 * Min: type =min([range]).
 * Q1: type =quartile([range],1). (Note that for some operating systems, the comma between the range and parameter needs to be a semi-colon.)
 * Median: type =median([range]).
 * Q3: type =quartile([range],3). (Note that for some operating systems, the comma between the range and parameter needs to be a semi-colon.)
 * Max: type =max([range]).

Repeat for Treatments 1 and 2, adjusting the data range appropriately.

Create side-by-side boxplots
First you need to create the differences from which the stacked bars are created.
 * In each successive cell under Max, type "25th Percentile", "50th Percentile", "75th Percentile", "Lower limit", and finally "Upper limit".
 * The values in these new rows, for each treatment group, are calculated based on the 5-number summary values. Enter the necessary formulas to obtain:
 * 25th Percentile = Q1 (Remember to copy the numbers from Q1 without copying the formulas.)
 * 50th Percentile = Median - Q1
 * 75th Percentile = Q3 - Median
 * Lower limit = Q1 - min
 * Upper limit values = Max - Q3

Now you are ready to create the boxplots.
 * Highlight the values in the 25th Percentile, 50th Percentile, and 75th Percentile rows
 * Choose Insert > Chart...

The Chart Wizard displays.
 * In 1. Chart Type', choose the stacked column option.
 * In 2. Data Range, ensure that the data is correctly specified.
 * In 3. Data Series, use the Categories window to specify the category labels.
 * In 4. Chart Elements, label and select appropriate options for your chart.

A stacked column graph displays. To create the boxplots, add error bars (to the min and max values) and adjust the color and outlines as needed.
 * Use the Insert Y Error Bars... option to add the Upper Limit error bar.
 * Use the Insert Y Error Bars... option to add the Lower Limit error bar. (Remember, to add this error bar select the "lowest portion of the bar.")
 * For each section of the bars, adjust the color and border as necessary.
 * If needed, adjust the y-axis scale.

Return to the Report Results StatTutor page to complete the interpretive exercises. (Be sure to include the data in your description of the results.)