1.-10. Here are data on the planets. Year is the number
of earth days it takes for the planet to orbit the sun, and Moons is the number of
moons the planet has.
- Draw the stem-and-leaf plot for the Moons variable, using the tens
digit as the stem and the ones digit as the leaf.
(Be sure to include a prototype for each stem-and-leaf
plot.)
- Which stem is the modal stem, or stems if there are more than one?
- Does the plot exhibit any outliers?
- Redo the plot, but with the tens and ones digits being the stem.
(What are the leaves?)
- What is the modal stem, or stems if there are more than one?
- What kind of skewness, if any, do you see?
- Are there any outliers evident? Are there clusters?
- Draw the stem-and-leaf plot for the Year variable, with the
ten-thousands digit the stem and the thousands digit the leaf.
- What kind of skewness, if any, do you see?
- Redo the plot with the thousands digit the stem and the hundreds
digit the leaf.
What drawback do you see for this plot?
11.-15. Here is a stem-and-leaf plot for the percentage enrolled
in school for the data on the United States plus D. C.
Prototype: "84 | 8" represents 84.8.
84 | 8
85 | 9
86 | 039
87 | 14699
88 | 1225
89 | 257
90 | 345
91 | 0024577
92 | 00123569
93 | 04
94 | 3566
95 | 15
96 | 015
97 | 2
98 | 3
99 | 66
100 |
101 |
102 |
103 |
104 |
105 |
106 | 0
- Are there any outliers evident?
Which state (or DC) is it? (Look through the data.)
What is wrong with the value, if anything?
- Complete the table below for the data, leaving out the 106
(so that there are only 50 observations), in preparation for drawing a histogram of the
data with the given bins. (Use the data from the stem-and-leaf plot.)
Bin | Frequency | Relative Frequency
84 to 86 | |
86 to 88 | |
88 to 90 | |
90 to 92 | |
92 to 94 | |
94 to 96 | |
96 to 98 | |
98 to 100 | |
- Draw the histogram using the table in (b).
- Based on the histogram, approximately what percentage of the
states have school enrollment over 95%?
- Based on the histogram, approximately what percentage of the
states have school enrollment under 85%?
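The tallying for the frequency table can be checked with a short script. The following is a minimal sketch in Python, not part of the original exercise: it expands the stems and leaves from the plot into values (kept as integer tenths so the arithmetic stays exact), leaves out the 106.0, and counts how many values fall in each bin. It assumes each bin includes its left endpoint and excludes its right one (so 86.0 counts in "86 to 88"); the source does not state a convention, and all names here are illustrative.

```python
# Stem -> leaves, copied from the stem-and-leaf plot (prototype: "84 | 8" is 84.8).
PLOT = {
    84: "8", 85: "9", 86: "039", 87: "14699", 88: "1225", 89: "257",
    90: "345", 91: "0024577", 92: "00123569", 93: "04", 94: "3566",
    95: "15", 96: "015", 97: "2", 98: "3", 99: "66", 106: "0",
}

# Work in tenths of a percent (84.8 -> 848) to avoid floating-point rounding.
tenths = [stem * 10 + int(leaf) for stem, leaves in PLOT.items() for leaf in leaves]
kept = [v for v in tenths if v < 1000]  # leave out the 106.0

# Bins 84-86, 86-88, ..., 98-100.
# ASSUMPTION: a bin contains its left endpoint but not its right endpoint.
edges = list(range(840, 1001, 20))
freq = [sum(lo <= v < hi for v in kept) for lo, hi in zip(edges, edges[1:])]
rel_freq = [f / len(kept) for f in freq]

for lo, hi, f, r in zip(edges, edges[1:], freq, rel_freq):
    print(f"{lo // 10} to {hi // 10} | {f:2d} | {r:.2f}")
```

Choosing the opposite endpoint convention would move values such as 86.0 and 92.0 into the neighboring bin, so the counts depend on that assumption.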
Interactivity:
- Note down the median and upper and lower quartiles.
- Find the interquartile range.
- Which is closer to the median, the upper quartile or the lower
quartile?
Does this suggest skewness to the higher values or towards the lower values, or neither?
- Find the lower and upper fences. Which numbers are outside of
these fences? (That is, which are the outliers?)
Which animals are these? (You might wish to press the "Order data"
button.)
- Now delete the three outliers from the area to the left of the
plot, and press "OK."
How do the median and quartiles here compare to the same quantities when the outliers were
included?
- What is the interquartile range, and how does it compare to that
when the outliers were included?
- Is there skewness? Are there further outliers?
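For reference, fences are commonly computed with the 1.5 × IQR rule: the lower fence is the lower quartile minus 1.5 times the interquartile range, and the upper fence is the upper quartile plus 1.5 times the interquartile range. The sketch below, in Python, illustrates that rule on made-up numbers (not the data from the plot). It assumes quartiles are taken as the medians of the lower and upper halves of the sorted data; the interactive plot may use a slightly different quartile method, so its values can differ a little.

```python
from statistics import median

def quartiles(values):
    """Return (lower quartile, median, upper quartile).

    ASSUMPTION: quartiles are the medians of the lower and upper halves,
    with the middle value excluded from both halves when n is odd.
    """
    xs = sorted(values)
    half = len(xs) // 2
    lower = xs[:half]
    upper = xs[half:] if len(xs) % 2 == 0 else xs[half + 1:]
    return median(lower), median(xs), median(upper)

def fences(values, k=1.5):
    """Return (lower fence, upper fence) using the k * IQR rule."""
    q1, _, q3 = quartiles(values)
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr

# Hypothetical data, for illustration only.
sample = [2, 3, 4, 5, 6, 7, 8, 9, 40]
low, high = fences(sample)
outliers = [x for x in sample if x < low or x > high]
print(low, high, outliers)
```

Values outside the two fences are the ones flagged as outliers.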
23.-50. Here are heights in inches and weights in pounds of 132
professional male athletes, in two sports. Also included are their body mass
index numbers, which are defined by
BMI = Body Mass Index = (Weight in pounds) * 703 / (Height in inches)^2
BMI is supposed to measure how overweight or underweight one is. A value in the
range 20-25 is fine; more is deemed overweight; and under 20 is deemed underweight.
It is fairly easy to be overweight under this measure.
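As a quick illustration of the formula, here is a minimal Python sketch. The function names and the 200-pound, 72-inch example athlete are hypothetical, not taken from the data set; the category cutoffs follow the 20-25 range described above, with the endpoints 20 and 25 treated as "fine" (an assumption, since the text does not say which side they fall on).

```python
def bmi(weight_lb: float, height_in: float) -> float:
    """Body mass index from weight in pounds and height in inches."""
    return weight_lb * 703 / height_in ** 2

def bmi_category(value: float) -> str:
    """Classify a BMI value using the ranges given in the text.

    ASSUMPTION: the endpoints 20 and 25 count as "fine".
    """
    if value < 20:
        return "underweight"
    if value <= 25:
        return "fine"
    return "overweight"

# Hypothetical athlete: 200 pounds, 72 inches (six feet) tall.
example = bmi(200, 72)
print(round(example, 1), bmi_category(example))
```

A six-foot athlete weighing 200 pounds already lands above 25, which shows how easy it is to be "overweight" under this measure.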
Here are boxplots for the three variables:
- What is the median height, approximately?
How tall is the tallest person?
About what percentage of these athletes are taller than six feet? Are these people
generally taller than the average male?
- What are the two quartiles for the Weight variable?
How heavy is the heaviest person?
Are these athletes heavier than the average male, in general?
- What is the median BMI number?
Approximately what percentage is "overweight", i.e., has BMI over 25?
What percentage is "underweight"?
Interactivity:
- How many bins reveal the data best? What feature
do you see in the data?
Interactivity:
- How many bins reveal the data best?
What feature do you see in the data?
- About what percentage of these athletes weigh more than 350
pounds?
Interactivity:
- What are the medians, approximately, of the two clusters?
- Which of the two clusters is more spread out?
- Of the three histograms, which showed most clearly two clusters?
- Which showed the clusters better, the histograms or the boxplots?
Interactivity:
The following problems use the same data as the previous problems, but look at the
athletes from the two sports separately.
- First are the two boxplots for the heights.
What are the medians for the two groups of athletes?
- What are the shortest heights for the two groups?
- Which group has the larger interquartile range?
- Which group is generally taller?
- Is there any overlap in the heights for the two groups?
- Now choose the "Weight" variable in the list to the left
of the boxplots. What are the medians for the two
groups?
- Is there more or less overlap in the weights than in the heights?
- What percentage in each group has weight over 300 pounds?
- Which group has the larger interquartile range?
- What outliers are there?
- Which group is generally heavier?
- Finally, choose the "BMI" variable. What are the
medians for the two groups?
- Is there any overlap in the BMIs for the two groups?
- What percentage of each group is overweight, that is, has BMI over
25?
- What percentage of each group has BMI over 30?
- What are the interquartile ranges for the two groups?
Are they similar?
- Taking the three variables into account, how would you
characterize the two groups of athletes?
- (Optional) Guess which sports the two groups are associated with.
Interactivity:
- Do you see skewness? If so, in which directions?
- Are there any obvious outliers? About how many?
- Estimate the median. (Click
the mouse on the plot to find the estimated
percentage to the left of where you click.) It is likely your answer will not be an
integer, even though the actual data consist of just integers. (Why?)
- Estimate the quartiles, and the interquartile range.
- About what percentage of people had fewer than 5 dogs plus cats?
- About what percentage of people had more than 10 dogs plus cats?
- About what percentage of people had more than 75 dogs plus cats?
Interactivity:
The next plots are based on the same class as above. The data are split into
groups based on people's heights in inches. The variable of interest is
"MPH," the fastest each person has ever driven a car, in miles per hour.
- With the splitting value for height being the median height, 66
inches, what differences are there on the MPH variable between the taller and shorter
people?
- Change the splitting value for height to be approximately the
upper quartile for height. What is the upper quartile for height?
- With the splitting value in (b), compare the shorter and taller
people on MPH.
- Now use the weight variable as the splitting variable, and
splitting value of 135, the median weight. (That is, choose "Weight" in
the upper "Split on" list.)
What differences do you see in the MPH variable for the heavier and lighter people?
- Overall, who tends to drive faster, bigger people or smaller
people?
- Does size really affect how fast people drive, or could there be
another factor influencing both size and speed? (Hint: See the Practice Material for 'Uses in Practice 4'.)
Interactivity:
Here are the average January temperatures for 59 metropolitan areas in the United
States:
27 23 29 45 35 45 30 30 24 27 42 26
28 31 46 30 30 27 24 24 40 27 55 29
32 53 35 42 67 20 12 40 30 54 33 32
38 29 33 39 25 32 55 48 49 40 28 24
23 37 32 33 24 33 28 34 31 29 26
- Create the stem-and-leaf plot with these data. Type the leaves into the plot below. The
prototype is given in the plot. When you are done, press "Order leaves" to order the leaves within each stem, then press "OK" to see the corresponding histogram and boxplot.
- What is the modal stem and bin?
- Estimate the median.
- What are the highest and lowest January Temperatures?
- Is there skewness? If so, which way?
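A stem-and-leaf plot typed by hand can be checked against one built by a short script. The Python sketch below is one way to do this, assuming the tens digit is the stem and the ones digit is the leaf; the function name is illustrative, and the temperatures are copied from the list above.

```python
from collections import defaultdict

# Average January temperatures for the 59 metropolitan areas listed above.
TEMPS = [
    27, 23, 29, 45, 35, 45, 30, 30, 24, 27, 42, 26,
    28, 31, 46, 30, 30, 27, 24, 24, 40, 27, 55, 29,
    32, 53, 35, 42, 67, 20, 12, 40, 30, 54, 33, 32,
    38, 29, 33, 39, 25, 32, 55, 48, 49, 40, 28, 24,
    23, 37, 32, 33, 24, 33, 28, 34, 31, 29, 26,
]

def stem_and_leaf(values):
    """Map each stem (tens digit) to its ordered leaves (ones digits).

    Stems with no leaves are kept so that gaps remain visible in the plot.
    """
    groups = defaultdict(list)
    for v in sorted(values):
        groups[v // 10].append(v % 10)
    return {
        stem: "".join(str(leaf) for leaf in groups.get(stem, []))
        for stem in range(min(groups), max(groups) + 1)
    }

plot = stem_and_leaf(TEMPS)
print('Prototype: "1 | 2" represents 12.')
for stem, leaves in plot.items():
    print(f"{stem} | {leaves}")
```

Sorting the values first means the leaves come out already ordered within each stem, which corresponds to pressing "Order leaves".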
Load the "usstates.dat" data set into DataTools. Create the boxplot for the "Poverty"
variable: Choose "Boxplot" from the "Graphics" menu, then choose "Poverty" from
the list that appears. Then press "Create Graph!"
69. What is the median, approximately?
70. Are there any notable outliers?
71. Is there any skewness?
Now find the boxplot for the "Employment" variable.
72. What is the median, approximately?
73. Is there any skewness?
74.-77.
74. The next problem is based on data for the cities in the United States with
populations over 200,000. The boxplots below are for the populations in 100,000's,
divided into groups depending on the cities' land areas in square miles.
Which group has the most extreme outliers?
The next plot is the same as above, but without those three largest outliers so that it is easier to see the boxplots.
- Compare the median populations for the four groups.
- Compare the interquartile ranges.
- Compare the ranges.
Copyright © 2000 CyberGnostics, Inc. All rights reserved.