LAB SESSION 13

INFERENCES INVOLVING TWO POPULATIONS

 

 

INTRODUCTION: When comparing two populations we need two samples, one from each population.  Two kinds of samples can be used: dependent or independent, determined by the source of the data.  The methods of comparison are quite different.

 

CASE 1. DEPENDENT SAMPLES (PAIRED DATA): Two data values, one from each set, that come from the same source are called paired data.  They are compared by using the difference in their values, called the paired difference, d = x1 - x2.  When paired observations are randomly selected from normal populations, d is approximately normally distributed, so we use the t-test.  We wish to make inferences about μ_d, the mean of the paired differences, where the standard deviation of d is unknown and is estimated by the sample standard deviation s_d.

 

 

Confidence Interval

Consider the data presented in Exercise 10.20 of your text.  Use Excel to generate the 95% confidence interval for the mean improvement in memory resulting from taking the memory course (d = after - before).

Retrieve the data file for ex10-020 from the Student Suite CD or enter it yourself in columns A and B.

Form the paired difference and put it in Column C.  Note that the worksheet below forms the difference as Before - After (for example, =A2-B2 in cell C2), so an improvement appears as a negative value.

 

    

    Before    After    Difference
      93        98         -5
      86        92         -6
      72        80         -8
      54        62         -8
      92        91          1
      65        78        -13
      80        89         -9
      81        78          3
      62        71         -9
      73        80         -7

 

To generate the interval:

    Click into any empty cell.
    Choose:   Tools > Data Analysis Plus > t-Estimate: Mean
    Enter:    Input range: C2:C11
    Select:   Labels (if necessary)
    Enter:    Alpha: α (for example, 0.05)

 

 

The output follows in a separate sheet:

t-Estimate: Mean

                          Difference
    Mean                    -6.1
    Standard Deviation       4.7947
    LCL                     -9.52989
    UCL                     -2.67011
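The interval can be checked outside Excel. The following is a minimal Python sketch, not part of the lab itself: it uses only the standard library, the variable names are illustrative, and the critical value t* is an assumption copied from the "t Critical two-tail" line that Excel prints for 9 degrees of freedom in the paired t-test output later in this lab, rather than being computed.

```python
import math
import statistics

# Data from the worksheet above (Exercise 10.20).
before = [93, 86, 72, 54, 92, 65, 80, 81, 62, 73]
after = [98, 92, 80, 62, 91, 78, 89, 78, 71, 80]

# Column C of the worksheet: Before - After for each pair.
diffs = [b - a for b, a in zip(before, after)]

n = len(diffs)
mean_d = statistics.mean(diffs)
sd_d = statistics.stdev(diffs)  # sample standard deviation (n - 1 divisor)

# Assumption: t* for df = 9 and two-tail alpha = 0.05, hard-coded from Excel output.
T_STAR = 2.262158887

margin = T_STAR * sd_d / math.sqrt(n)
lcl = mean_d - margin
ucl = mean_d + margin

print(f"mean = {mean_d:.1f}, sd = {sd_d:.4f}")
print(f"95% CI for Before - After: ({lcl:.2f}, {ucl:.2f})")
# Same interval expressed as After - Before (the improvement).
print(f"95% CI for After - Before: ({-ucl:.2f}, {-lcl:.2f})")
```

Because the worksheet difference is Before - After, reversing the sign of both limits (and swapping them) gives the interval for the improvement, After - Before.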

 

 

 

 

 

 

 

Hypothesis Testing

To demonstrate the procedure for a hypothesis test on mean difference we will do Exercise 10.28.

Enter the data for Before in column A and for After in column B, or retrieve it from the Student Suite CD (ex10-028), and calculate the paired differences.

 

Exercise 10.28

    Before    After    Paired Diff
      29        30          1
      22        26          4
      25        25          0
      29        35          6
      26        33          7
      24        36         12
      31        32          1
      46        54          8
      34        50         16
      28        43         15

Then perform a t-test on the paired differences (After - Before).

 

    Choose:   Tools > Data Analysis > t-Test: Paired Two Sample for Means
    Enter:    Variable 1 Range: B4:B14
    Enter:    Variable 2 Range: A4:A14
    Select:   Labels
    Enter:    Alpha: α (for example, 0.05)
    Select:   Output Range
    Enter:    A15 (or any empty cell)
    Click:    OK

 

The results you get look like this:

 

 

t-Test: Paired Two Sample for Means

                                    After          Before
    Mean                            36.4           29.4
    Variance                        94.48888889    46.26666667
    Observations                    10             10
    Pearson Correlation             0.810662928
    Hypothesized Mean Difference    0
    df                              9
    t Stat                          3.821341258
    P(T<=t) one-tail                0.002040758
    t Critical one-tail             1.833113856
    P(T<=t) two-tail                0.004081516
    t Critical two-tail             2.262158887

 

 

 

Note: the t statistic is 3.82 and the two-tailed p-value is 0.0041 (0.0020 one-tailed).  How would you interpret these results?
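The t statistic in this output can be reproduced from the paired differences. This is a minimal Python sketch using only the standard library; the variable names are illustrative, and the critical value is an assumption copied from the "t Critical two-tail" line of the Excel output rather than computed.

```python
import math
import statistics

# Data from the Exercise 10.28 worksheet.
before = [29, 22, 25, 29, 26, 24, 31, 46, 34, 28]
after = [30, 26, 25, 35, 33, 36, 32, 54, 50, 43]

# Paired differences, After - Before, matching the worksheet column.
diffs = [a - b for a, b in zip(after, before)]

n = len(diffs)
df = n - 1
mean_d = statistics.mean(diffs)
sd_d = statistics.stdev(diffs)  # sample standard deviation of the differences

# Test statistic for H0: mean difference = 0.
t_stat = (mean_d - 0) / (sd_d / math.sqrt(n))

# Assumption: two-tail critical value for df = 9, alpha = 0.05, from the Excel output.
T_CRIT_TWO_TAIL = 2.262158887
reject = abs(t_stat) > T_CRIT_TWO_TAIL

print(f"mean difference = {mean_d:.1f}, df = {df}, t = {t_stat:.4f}")
print("|t| exceeds the two-tail critical value:", reject)
```

A paired test is just a one-sample t-test on the column of differences, which is why only the differences enter the calculation.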

 

ASSIGNMENT: Do Exercises 10.21, 10.23, 10.24, and 10.30 in your text.


 

CASE 2. INDEPENDENT SAMPLES:  Two samples, one selected from each population, are independent if the selection of objects from one population is unrelated to the selection of objects from the other.  Because the population standard deviations are unknown and the standard error must be estimated from the samples, the test statistic follows a t distribution; Excel calculates the degrees of freedom.

 

a) Complete the hypothesis test presented in Exercise 10.61 of your text. 

Retrieve the data from the Student Suite CD and note that the data for Diet A is in Column A and Diet B is in Column B.

 

    DietA    DietB
      5        5
     14       21
      7       16
      9       23
     11        4
      7       16
     13       13
     14       19
     12        9
      8       21

 

Perform a t-test as follows (Diet A is entered as Variable 1 and the example Alpha is 0.05, to match the output shown below):

    Choose:   Tools > Data Analysis > t-Test: Two-Sample Assuming Unequal Variances
    Enter:    Variable 1 Range: A1:A11
    Enter:    Variable 2 Range: B1:B11
    Enter:    Hypothesized Mean Difference: 0.0
    Select:   Labels
    Enter:    Alpha: α (for example, 0.05)
    Select:   Output Range
    Enter:    A15 (or any empty cell)
    Click:    OK

 

 

 

 

We then get the following output:

 

t-Test: Two-Sample Assuming Unequal Variances

                                    DietA         DietB
    Mean                            10            14.7
    Variance                        10.4444444    46.0111111
    Observations                    10            10
    Hypothesized Mean Difference    0
    df                              13
    t Stat                          -1.978083
    P(T<=t) one-tail                0.03475571
    t Critical one-tail             1.7709317
    P(T<=t) two-tail                0.06951142
    t Critical two-tail             2.16036824

 

 

 

 

Do the data justify the conclusion that the mean weight gained on diet B was greater than the mean weight gained on diet A, at the α = .05 level of significance?
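To see where the t Stat and df lines of the output come from, the calculation can be sketched in Python with the standard library only. The function name welch is hypothetical, and the degrees of freedom use the Welch-Satterthwaite approximation, which is the usual formula for the unequal-variances t-test; the assumption here is that Excel reports that value rounded to a whole number, which is consistent with the df = 13 shown above.

```python
import math
import statistics

# Data from the worksheet above (Exercise 10.61).
diet_a = [5, 14, 7, 9, 11, 7, 13, 14, 12, 8]
diet_b = [5, 21, 16, 23, 4, 16, 13, 19, 9, 21]


def welch(x, y):
    """Return (t statistic, approximate df) for mean(x) - mean(y), unequal variances."""
    vx = statistics.variance(x) / len(x)  # s_x^2 / n_x
    vy = statistics.variance(y) / len(y)  # s_y^2 / n_y
    se = math.sqrt(vx + vy)               # standard error of the difference
    t = (statistics.mean(x) - statistics.mean(y)) / se
    # Welch-Satterthwaite approximation for the degrees of freedom.
    df = (vx + vy) ** 2 / (vx ** 2 / (len(x) - 1) + vy ** 2 / (len(y) - 1))
    return t, df


t_stat, df = welch(diet_a, diet_b)
print(f"t = {t_stat:.6f}, df = {df:.3f} (rounds to {round(df)})")
```

The sign of t depends only on which sample is listed first: with Diet A first, a larger mean for Diet B produces a negative t.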

 

Now that we have concluded that there is a difference, let us construct a 90% confidence interval estimate for this difference.  The ToolPak does not print a confidence interval directly, but the output from the t-test provides the information needed to construct one: the interval is (difference of means) ± t* × SE, where t* is the one-tail critical value at α = 0.05 (equivalently, the two-tail value at α = 0.10).  You can evaluate this directly in the worksheet as follows; the SE formula assumes that cells E9 and F9 hold the two variances and cells E10 and F10 hold the two sample sizes from the t-test output.

 

    Difference of the Means (Diet A - Diet B)    -4.7
    SE = SQRT(E9/E10 + F9/F10)                    2.376037785
    t*                                            1.770931704
    ME = (t*)(SE)                                 4.207800642
    lower = Mean Diff - ME                       -8.907800642
    upper = Mean Diff + ME                       -0.492199358

 

 

 

 

So the 90% interval for the difference of means is: (-8.91, -0.49)
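The worksheet arithmetic above works from the summary lines of the t-test output rather than from the raw data. A minimal Python sketch of the same steps follows; the function name diff_ci is hypothetical, and the inputs (means, variances, sample sizes, and t*) are simply copied from the Excel output.

```python
import math


def diff_ci(mean1, var1, n1, mean2, var2, n2, t_star):
    """Confidence interval for mean1 - mean2 from summary statistics."""
    diff = mean1 - mean2
    se = math.sqrt(var1 / n1 + var2 / n2)  # same as SQRT(E9/E10 + F9/F10)
    me = t_star * se                       # margin of error
    return diff - me, diff + me


# Values copied from the t-test output: Diet A first, then Diet B.
# t* is the "t Critical one-tail" value, which gives a two-sided 90% interval.
lower, upper = diff_ci(10, 10.4444444, 10, 14.7, 46.0111111, 10, 1.770931704)
print(f"90% CI for (Diet A - Diet B): ({lower:.2f}, {upper:.2f})")
```

Both limits are negative, which is consistent with a larger mean gain on Diet B.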

 

 

 

 

 

b) Consider Exercise 10.45 in your text.  Retrieve the data from the Student Suite CD: the data for the males are in Column A and the data for the females are in Column B.

 

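Example 10.45

    males    females    diffs
      76        76         0
      76        70         6
      74        82        -8
      70        90       -20
      80        68        12
      68        60         8
      90        62        28
      70        68         2
      90        80        10
      72        74        -2
      76        60        16
      80        62        18
      68        72        -4
      72                  72
      96                  96
      80                  80

The diffs column (males - females, row by row) is not used in this analysis: the samples are independent and of different sizes (16 males, 13 females), so the rows are not pairs.

Doing a t-test as above gives the following output: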

 

t-Test: Two-Sample Assuming Unequal Variances

                                    males          females
    Mean                            77.375         71.07692308
    Variance                        69.71666667    85.07692308
    Observations                    16             13
    Hypothesized Mean Difference    0
    df                              25
    t Stat                          1.907486345
    P(T<=t) one-tail                0.034004546
    t Critical one-tail             2.485103323
    P(T<=t) two-tail                0.068009092
    t Critical two-tail             2.787437552

 

 

 

 

 

 

    Difference of the Means (males - females)     6.298076923
    SE = SQRT(E9/E10 + F9/F10)                    3.301767764
    t*                                            2.485103323
    ME = (t*)(SE)                                 8.205234042
    lower = Mean Diff - ME                       -1.90715712
    upper = Mean Diff + ME                       14.50331096

 

 

 

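Because t* = 2.485 is the one-tail critical value for α = 0.01 with 25 degrees of freedom, this is a 98% confidence interval for the difference of means: (-1.91, 14.50).  What does this imply?  (Note that the interval includes 0.)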
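One way to think about the question: a confidence interval built with a given t* contains 0 exactly when the two-tailed test at the matching level (α = 0.02 for this t*) fails to reject the hypothesis of no difference. The short Python sketch below illustrates this with the summary values copied from the Excel output; the variable names are illustrative.

```python
import math

# Summary values copied from the t-test output above.
mean_m, var_m, n_m = 77.375, 69.71666667, 16  # males
mean_f, var_f, n_f = 71.07692308, 85.07692308, 13  # females
T_STAR = 2.485103323  # "t Critical one-tail" from the output

diff = mean_m - mean_f
se = math.sqrt(var_m / n_m + var_f / n_f)
t_stat = diff / se

lower = diff - T_STAR * se
upper = diff + T_STAR * se

# The interval covers 0 if and only if |t| does not exceed t*.
contains_zero = lower <= 0 <= upper
fails_to_reject = abs(t_stat) <= T_STAR

print(f"t = {t_stat:.4f}, interval = ({lower:.2f}, {upper:.2f})")
print("interval contains 0:", contains_zero)
print("two-tailed test fails to reject:", fails_to_reject)
```

The two checks always agree, because both compare the size of the difference with t* times the standard error.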

 

 

 

ASSIGNMENT: Do Exercises 10.60 and 10.62 in your text.  Both sets of data are found on the Student Suite CD.

 

Enrichment Assignment: Do Exercise 10.64 or 10.65.  Turn in a typed paper detailing your procedures and results.  Include the session commands you used and a printed copy of your output to substantiate your conclusions.