LAB SESSION 6
RANDOM NUMBERS AND PROBABILITY
INTRODUCTION: This lab session is designed
to introduce you to random numbers and their use in simulating
experiments. The outcomes of events in
normal life cannot be predicted, but it is possible to have an idea of what
outcomes are possible. The theory of
probability was developed to help analyze experiments whose outcomes are
uncertain. We can use Excel to simulate
certain experiments such as flipping a coin or rolling a die.
RANDOM NUMBERS
You were introduced to the RANDOM command in Lab 5. There we used the
RAND worksheet function, to
return an evenly distributed random number greater than or equal to 0 and less
than 1 every time the worksheet is calculated. Now
we’ll look at the Random Number Generation analysis tool. This tool is part of the Analysis
ToolPak. This tool fills a range with
independent random numbers drawn from one of several distributions. You can characterize subjects in a
population with a probability distribution.
For example, you might use a normal distribution to characterize the
population of individuals’ heights, or you might use a Bernoulli distribution
of two possible outcomes to characterize the population of coin toss results.
Suppose we want to simulate the outcomes for tossing a
coin 100 times.
Choose: Tools > Data Analysis > Random
Number Generation > OK
Enter: Number of Random Numbers: 100
Distribution: Bernoulli
p-value: 0.5
Select: Output Range: select cell > OK
Give the relative frequency
for a head (1) and a tail (2) based on the Excel output. Certainly, let the computer do the work:
Enter a 0 and a 1 into the
first two cells of column B. (This is
done to indicate the classes of values to be tallied.) Select cells C1:C2 (to store the tallies)
and then continue by typing
=Frequency(A1:A100, B1:B2)
Since this is an array
formula you must press Ctrl + Shift + Enter
Bins |
Frequency |
0 |
46 |
1 |
54 |
Questions:
1. What commands would be used for simulating the rolling of a die
50 times?
2. Create a new Excel workbook and place 50 simulated rolls into
columns A
and B. Give the
relative frequency for the outcomes 1, 2, 3, 4, 5, and 6 based
on the Excel output.
THE LAW OF LARGE NUMBERS
To see how the law of large
numbers works, we need to create a third column with the sums of two dice rolls
simulated by columns A and B. First,
since Excel generates real decimal values, place the INT (the integer function)
of column A values in column C, and the INT of column B values into column D.
in cell E1 enter = C1 + D1
click and drag the fill handle to cell E50
To determine the relative frequency of each outcome:
Enter: the possible outcomes 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12 into column F
Select cells G1 through G11, and type =FREQUENCY(E1:E50,F1:F11)
and
press Ctrl + Shift + Enter. To get the relative frequencies, in cell H1 enter
=G1/50 and
click and drag the fill handle to cell H11.
Note: You may
also use Data > Pivot Table Report . . .
to generate a table
containing the outcomes and
their frequencies. See your text for
more information.
first roll |
second roll |
sum |
Bins |
frequency |
rel. freq. |
2 |
3 |
5 |
2 |
0 |
0 |
3 |
4 |
7 |
3 |
2 |
0.04 |
4 |
2 |
6 |
4 |
7 |
0.14 |
5 |
4 |
9 |
5 |
5 |
0.1 |
1 |
5 |
6 |
6 |
10 |
0.2 |
2 |
4 |
6 |
7 |
10 |
0.2 |
1 |
5 |
6 |
8 |
5 |
0.1 |
4 |
5 |
9 |
9 |
7 |
0.14 |
1 |
2 |
3 |
10 |
3 |
0.06 |
2 |
5 |
7 |
11 |
0 |
0 |
1 |
4 |
5 |
12 |
0 |
0 |
3 |
1 |
4 |
|
|
|
5 |
3 |
8 |
|
|
|
1 |
3 |
4 |
|
|
|
... |
... |
... |
|
|
|
Interpreting the results:
1) What is the observed
probability of obtaining a sum of 2 on the dice?
2) What is the observed
probability of obtaining a sum of 7 on the dice?
3) What is the observed
probability of obtaining a sum of 11 on the dice?
Using similar commands,
create two additional columns containing 500 simulated rolls of a single die
and a third column containing the sums of these 500 simulated rolls of 2 dice.
4) Answer the above three questions about this simulation. How do the answers compare to the
theoretical probability? (Use both
numerical and graphic evidence.)
THE BINOMIAL PROBABILITY DISTRIBUTION
Consider the following situation:
Suppose you bought four light bulbs.
The manufacturers claim that 85% of their bulbs will last at least 700
hours. If the manufacturer is right,
what are the chances that all four of your bulbs will last at least 700
hours? That three will last 700 hours,
but one will fail before that?
Consider another
situation. You've somehow gotten
enrolled in a class in advanced Greek
Mythology. You don’t know
anything about mythology but you’re to
take a pop quiz. You'll have to guess
on every question. It's a
multiple-choice test; each of the 20 questions has 3 possible answers. To pass you must get at least 12
correct. What are the chances that
you'll pass?
How would you answer the
above questions? Excel can help us with
this by using the BINOMDIST (The Binomial Probability Distribution Function) to
generate binomial probabilities.
(Remember what a binomial distribution requires.)
Calculating Binomial Probabilities with BINOMDIST
To obtain the probability of each possible outcome for a binomial
distribution with n = 10 and p = 0.1, you will use the following commands. You must first create a column with the
values for which you wish to find the corresponding probabilities. Input the values 0 to 10 into column A. Activate B1, then continue with:
Choose: Insert function, fx >
Statistical > BINOMDIST > OK
Enter: Number_s: A1:A11,
or select cells
Trials: 10
Probability_s: 0.1
Cumulative: false > OK
Drag: fill
handle down to give other probabilities
|
A |
B |
1 |
0 |
0.348678 |
2 |
1 |
0.38742 |
3 |
2 |
0.19371 |
4 |
3 |
0.057396 |
5 |
4 |
0.01116 |
6 |
5 |
0.001488 |
7 |
6 |
0.000138 |
8 |
7 |
8.75E-06 |
9 |
8 |
3.65E-07 |
10 |
9 |
9E-09 |
11 |
10 |
1E-10 |
This results in the following values:
Looking back to our original
questions, to find the probability that three of your four light bulbs will be
successes (last more than 700 hours) and one will fail we use:
“x-values” into column C (0, 1, 2, 3, 4) then activate
cell D1 and
Choose: Insert
function, fx > Statistical > BINOMDIST > OK
Enter: Number_s: C1:C5,
or select cells
Trials: 4
Probability_s: 0.85
Cumulative: false > OK
Drag: fill
handle down to give other probabilities
|
C |
D |
1 |
0 |
0.000506 |
2 |
1 |
0.011475 |
3 |
2 |
0.097538 |
4 |
3 |
0.368475 |
5 |
4 |
0.522006 |
So we see that the
probability of exactly 3 of the 4 bulbs being successes (lasting more than 700
hours) is .368475.
Cumulative Probabilities
The same statistical
function BINODIST can be used to generate cumulative probabilities. A cumulative probability is the probability
that your result will be less than or equal to a particular value. As an example, suppose we calculate the
probability you will fail the test in advanced Greek Mythology. Here n = 20 and p = .3333. You will fail the test if you get less than
or equal to 11 questions correct. (You
will pass if you get 12 or more right.)
The following commands can be used to calculate this probability:
Choose: Insert
function, fx > Statistical > BINOMDIST > OK
Enter: Number_s: A1:A21,
or select cells
Trials: 20
Probability_s: 1/3
Cumulative: true > OK
Drag: fill
handle down to give other probabilities
The cell directly next to
the x-value 11 is the probability you will fail (answer 11 or fewer
correct). So, what is the probability
that you will pass? ( 1 - .9870 = .013)
Mean and Standard Deviation of the Binomial Distribution
Excel can be used like a
calculator to determine the mean and standard deviation for the binomial
distribution. Let’s continue using the
Greek Mythology test example:
Activate
cell C1, then
Enter: =A1*B1
Click and drag: fill handle down to complete the column
calculation
Activate cell D1, then
Enter: =A1*C1
Click and drag: fill
handle down to complete the column calculation
Activate cell C22 and click AutoSum and press enter (this is the mean)
Activate cell D22 and click AutoSum and press enter
(this is the sum of x2P(x))
Finish the calculation for standard deviation as follows:
Activate any
cell and enter =SQRT(D22-C22^2)
mean |
std dev'n |
6.666667 |
2.108185 |
6.67
is the mean number of questions expected to be answered correctly and 2.108 is
the standard deviation expected among the number of questions answered
correctly per test consisting of 20 questions.
ASSIGNMENT: Do Exercises 4.30, 5.26, 5.66 - 5.68 in your text.