LAB SESSION 14
ANALYZING ENUMERATIVE DATA
INTRODUCTION: The data used in this lab is enumerative; that is, the data is placed in categories and counted. The observed frequencies list exactly what happened in the sample. The expected frequencies represent the theoretical expected outcomes (what is expected to happen “on the average”). These expected values must always add up to n, the sample size.
When we perform a hypothesis test on these two sets of values, we are really asking, “How different are they?” If the difference is small, we may attribute it to chance variation in the sample. However, if the difference is large, the proportions in the population may differ from those stated in the null hypothesis, and we may reject the null hypothesis. We can use the χ² distribution in our test.
We will first make inferences concerning multinomial experiments and then extend that to contingency tables.
MULTINOMIAL EXPERIMENTS
A multinomial experiment consists of n independent trials, each of whose outcomes fits into exactly one of k possible cells. The probability of each cell remains constant from trial to trial, and the sum of all the probabilities is 1. For multinomial experiments, we will always use a right-tail critical region of the χ² distribution. The expected frequency for each cell is obtained by multiplying the probability for that cell by the total number of trials, n.
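The expected-frequency rule above can be sketched in a few lines of Python. This is only an illustration outside of Excel; the variable names are hypothetical, and the numbers (119 trials spread evenly over 7 cells) are the ones used in the illustration that follows.

```python
# Expected frequency for each cell: E_i = n * p_i
n = 119                       # total number of trials (students)
k = 7                         # number of cells (sections)
probabilities = [1 / k] * k   # assumption: every cell is equally likely

# Multiply each cell probability by the total number of trials.
expected = [n * p for p in probabilities]

print(expected)        # each entry should be 17 (up to rounding)
print(sum(expected))   # the expected values add up to n (up to rounding)
```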
We can use Excel to calculate the chi-square statistic by entering the data and the probability for each cell, calculating the expected value for each cell, and then calculating the chi-square contribution of each cell. We then need to sum each of these columns. Let us do Illustration 11-1 from the text, using Excel to do the calculations.
Enter the column headings in Row 1.
Enter the seven observed values into column B.
Since there are seven sections, we can assume that each section is equally likely to be chosen, so each one is expected to receive 1/7 of the 119 students. Therefore, we will enter 17 in seven rows of column C. These are the Expected values.
Next, calculate the sum of each column by using the Σ (AutoSum) button on the toolbar.
In Cell D2 enter the formula =B2-C2, and copy it down the column.
In Cell E2 enter the formula =D2*D2, and copy it down the column.
In Cell F2 enter the formula =E2/C2, and copy it down the column.
Calculate the sums of columns D and F. The sum in column D should be 0, which gives you a check on your data. The sum in column F is the value of χ².
Compare your results to the text.
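As a cross-check on the spreadsheet, the three formula columns can be mirrored in Python. This is a minimal sketch, not part of the Excel procedure; the list names are hypothetical and simply stand in for columns B through F.

```python
# Observed and expected counts for the seven sections (columns B and C).
observed = [18, 12, 25, 23, 8, 19, 14]
expected = [17] * 7

# Column D: O - E
diff = [o - e for o, e in zip(observed, expected)]
# Column E: (O - E) squared
diff_sq = [d * d for d in diff]
# Column F: (O - E) squared, divided by E
contrib = [d2 / e for d2, e in zip(diff_sq, expected)]

check = sum(diff)          # should be 0 if the data were entered correctly
chi_square = sum(contrib)  # the chi-square test statistic

print(check)
print(round(chi_square, 2))
```

The sum of the O-E column acts as the same data check described above: if it is not 0, the observed and expected totals do not match.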
Let’s enter some data to make our chart complete.
In Cell A11 enter α = 0.05
In Cell A12 enter df = 6
In Cell A13 enter χ² = 12.94
In Cell A14 enter p-value =
To calculate your p-value, click on Cell B14.
Click: Insert > fx > Statistical > CHITEST
Enter: Actual Range: B2:B8
       Expected Range: C2:C8
OK
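For a single column of observed and expected counts, CHITEST returns the right-tail probability of the χ² distribution with (number of cells - 1) degrees of freedom. The sketch below reproduces that probability in plain Python so the Excel result can be checked. It relies on a closed-form tail formula that holds only when df is even (as it is here, df = 6); the function name chi_square_sf_even_df is hypothetical, not an Excel or library function.

```python
import math

def chi_square_sf_even_df(x, df):
    """Right-tail probability P(X >= x) for a chi-square variable.

    Assumption: df is a positive EVEN integer, so that the tail equals
    exp(-x/2) * sum_{i=0}^{df/2 - 1} (x/2)^i / i!
    """
    if df <= 0 or df % 2 != 0:
        raise ValueError("this closed form only holds for positive even df")
    half = x / 2.0
    term = 1.0    # (x/2)^0 / 0!
    total = 1.0
    for i in range(1, df // 2):
        term *= half / i   # builds (x/2)^i / i! incrementally
        total += term
    return math.exp(-half) * total

observed = [18, 12, 25, 23, 8, 19, 14]
expected = [17] * 7

chi_square = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
df = len(observed) - 1          # k - 1 cells
p_value = chi_square_sf_even_df(chi_square, df)

print(round(chi_square, 4), df, round(p_value, 6))
```

For an odd number of degrees of freedom this shortcut does not apply, and a statistics package or Excel itself would be the safer choice.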
This is what you should see:

Number   Observed Values (O)   Expected Values (E)   O-E   (O-E)²   (O-E)²/E
1        18                    17                      1        1   0.058823529
2        12                    17                     -5       25   1.470588235
3        25                    17                      8       64   3.764705882
4        23                    17                      6       36   2.117647059
5         8                    17                     -9       81   4.764705882
6        19                    17                      2        4   0.235294118
7        14                    17                     -3        9   0.529411765
Sums     119                   119                     0            12.94117647

alpha =      0.05
df = 6
χ² = 12.94
p-value =    0.04397964

You now have to finish the test and state
your conclusion.
ASSIGNMENT: Do Exercises 11.11, 11.14, and 11.16 in your text.
INFERENCES ABOUT CONTINGENCY TABLES
Contingency tables arrange data into a two-way classification. A contingency table involves two variables, and the first question we need to ask is whether they are independent or dependent. The two tests that use contingency tables are the Test of Independence and the Test for Homogeneity.
In a new sheet, enter the data from Illustration 11-4, including appropriate titles.

Illustration 11-4

Type of Residence   Favor   Oppose   Total
Urban               143      57      200
Suburban             98     102      200
Rural                13      87      100
Total               254     246      500

Click on an empty cell.
Choose: Tools > Data Analysis Plus > Contingency Table > OK
Enter: Input range: B5:C7 > OK
Select: Labels (if necessary)
Enter: Alpha α (.05)
A new sheet will be created (note the tab name) that will contain the following:

Contingency Table

             Favor   Oppose   Total   TOTAL
             143      57      200      400
Urban         98     102      200      400
Suburban      13      87      100      200
Rural        254     246      500     1000
TOTAL        508     492     1000     2000

chi-squared Stat       91.7155
df                     6
p-value                0
chi-squared Critical   12.5916

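You now have to complete the test and state your conclusion. Note that the correct degrees of freedom for this table of 3 rows and 2 columns are df = (3 - 1)(2 - 1) = 2, not the 6 reported in the output, which appears to have counted the totals as data. The critical value must therefore be found for df = 2.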
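The calculation behind the contingency table output can be sketched in Python as a check. This sketch uses the standard rule that each expected count is (row total × column total) / grand total, and df = (rows - 1)(columns - 1); it assumes only the six data cells are used, with the totals left out. The variable names are hypothetical.

```python
# Observed counts from Illustration 11-4 (rows: Urban, Suburban, Rural;
# columns: Favor, Oppose). Totals are deliberately NOT included as data.
observed = [
    [143, 57],
    [98, 102],
    [13, 87],
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
grand_total = sum(row_totals)

# Expected count for each cell: row total * column total / grand total.
expected = [[r * c / grand_total for c in col_totals] for r in row_totals]

# Chi-square statistic: sum of (O - E)^2 / E over all cells.
chi_square = sum(
    (o - e) ** 2 / e
    for obs_row, exp_row in zip(observed, expected)
    for o, e in zip(obs_row, exp_row)
)

# Degrees of freedom: (rows - 1) * (columns - 1).
df = (len(observed) - 1) * (len(observed[0]) - 1)

print(round(chi_square, 4), df)
```

The statistic agrees with the value in the output, while df comes out as 2 because only the 3 × 2 block of counts is treated as data.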
Let us perform the procedure using the data from Exercise 11.31. First, label your columns and rows. Then enter your data.

Exercise 11.31

Day of the Week   Mon   Tues   Wed   Thurs   Fri
Nondefective       85     90    95      95    90
Defective          15     10     5       5    10

Click on an empty cell.
Choose: Tools > Data Analysis Plus > Contingency Table > OK
Enter: Input range: the block of cells that holds your 2 × 5 table of counts
Select: Labels (if necessary)
Enter: Alpha α (.05)
A new sheet will be created (note the tab name) that will contain the following:

Contingency Table

               Mon   Tues   Wed   Thurs   Fri   TOTAL
Nondefective    85     90    95      95    90     455
Defective       15     10     5       5    10      45
TOTAL          100    100   100     100   100     500

chi-squared Stat       8.547
df                     4
p-value                0.0735
chi-squared Critical   9.4877

You will still need to state the null and alternative hypotheses, set the test criteria, and then, using the above results, draw your conclusion.
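The four numbers in the output (statistic, df, p-value, and critical value) can be reproduced in Python as a check before you write up the test. This is a sketch under stated assumptions: the tail formula below is valid only for even df (here df = 4), the critical value is found by simple bisection rather than a table, and both helper function names are hypothetical.

```python
import math

def chi_square_sf_even_df(x, df):
    """Right-tail probability for chi-square; valid only for even df."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("this closed form only holds for positive even df")
    half = x / 2.0
    term, total = 1.0, 1.0
    for i in range(1, df // 2):
        term *= half / i
        total += term
    return math.exp(-half) * total

def critical_value_even_df(alpha, df, lo=0.0, hi=200.0):
    """Find x with right-tail area alpha by bisection (tail area decreases in x)."""
    for _ in range(100):
        mid = (lo + hi) / 2.0
        if chi_square_sf_even_df(mid, df) > alpha:
            lo = mid   # tail still too large: move right
        else:
            hi = mid
    return (lo + hi) / 2.0

# Exercise 11.31 counts (rows: Nondefective, Defective; columns: Mon..Fri).
observed = [
    [85, 90, 95, 95, 90],
    [15, 10, 5, 5, 10],
]
row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
grand_total = sum(row_totals)

chi_square = sum(
    (o - r * c / grand_total) ** 2 / (r * c / grand_total)
    for row, r in zip(observed, row_totals)
    for o, c in zip(row, col_totals)
)
df = (len(observed) - 1) * (len(observed[0]) - 1)
p_value = chi_square_sf_even_df(chi_square, df)
critical = critical_value_even_df(0.05, df)

print(round(chi_square, 3), df, round(p_value, 4), round(critical, 4))
```

Comparing chi_square with critical, or p_value with α, gives the two equivalent ways of reaching the decision.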
ASSIGNMENT: Do Exercises 11.33, 11.34, 11.49, and 11.58 in your text.