 Excel Lab 1 This lab session is designed to introduce you to the statistical aspects of Microsoft Excel.  During this session you will learn how to enter and exit Excel, how to enter data and commands, how to print information, and how to save your work for use in subsequent sessions.  As with any new skill, using this software will require practice and patience.  Excel is a spreadsheet used for organizing data in columns and rows.  It is an integrated part of Microsoft Office, and so data can be easily imported and exported into word processing documents, databases, graphics programs, etc.  It offers a wide range of statistical functions and graphs and so is a an alternative to specific statistical software. Excel Lab 2 Graphically representing data is one of the most helpful ways to become acquainted with the sample data.  In this lab you will use Excel to present data graphically.  You will be analyzing data using four types of graphs: Circle graphs, Bar graphs, histograms, and cumulative (relative) frequency plots (ogives). Excel Lab 3 The basic idea of descriptive statistics is to describe a set of data in a variety of abbreviated ways.  In this lab you will investigate measures of central tendency and dispersion.  The box-and-whiskers display, a graphical display of the 5-number summary of a set of data, will also be introduced. Excel Lab 4 It is frequently interesting to view the relationship of two variables.  In this lab we will see how Excel can help us plot bivariate data and discover some trends in the relationship.  We can set up the data as ordered pairs, with the independent variable as the x and the dependent variable as the y. Excel Lab 5 Not only is it important to analyze single variables, but frequently one needs to determine if and how two variables are related.  The correlation coefficient is a measure of the strength of the linear relationship between two variables.  In these exercises you will use Excel to analyze this statistic, and these exercises will also give you a very brief introduction to linear regression. Excel Lab 6 This lab session is designed to introduce you to random numbers and their use in simulating experiments.   The outcomes of events in normal life cannot be predicted, but it is possible to have an idea of what outcomes are possible.  The theory of probability was developed to help analyze experiments whose outcomes are uncertain.  We can use Excel to simulate certain experiments such as flipping a coin or rolling a die. Excel Lab 7 The normal distribution is one of the most important distribution functions in statistics. We will now see how the binomial probabilities can be reasonably estimated by using the normal probability distribution.  Later we will need to determine whether normality is a reasonable assumption.  We will start our investigation with a few specific binomial distributions. Excel Lab 8 In an effort to predict population parameters, we need to investigate the variability in the sample means obtained from repeated sampling.  The Central Limit Theorem tells us that the sampling distribution of sample means, , is approximately normally distributed.  In the following lab you will test the results of the Central Limit Theorem. Excel Lab 9 Two indispensable statistical decision-making tools for a single parameter are (i)confidence intervals, and (ii) hypothesis tests to investigate theories about parameters.  In this lab you will learn how to calculate confidence intervals and perform hypothesis tests (assuming we know sigma) using Excel. Excel Lab 10 The t-statistic is used when making inferences concerning the population mean when sigma is an unknown quantity.  We will introduce the t-test and compare the z and t distributions. Excel Lab 11 In this lab we will investigate the inferences that can be made about the binomial parameter p.  Inferences concerning the population binomial parameter p are made using procedures that closely parallel the inference procedures for the population mean m (see lab 10). Excel Lab 12 In this lab we will present the hypothesis test for the standard deviation for a normal population.  When sample data are skewed, just one outlier can greatly affect the standard deviation.  It is very important, especially when using small samples, that the sampled population be normal; otherwise the procedures are not reliable.  However, unlike the analysis for the mean you will not have convenient computer commands to help you. Excel Lab 13 When comparing two populations we need two samples, one from each population.  Two kinds of samples can be used: dependent or independent, determined by the source of the data.  The methods of comparison are quite different. Excel Lab 14 The data used in this lab is enumerative -- that is, the data is placed in categories and counted.  The observed frequencies list exactly what happened in the sample.  The expected frequencies represent the theoretical expected outcomes (what is expected to happen “on the average”).  These expected values must always add up to n.    When we perform a hypothesis test on these two sets of values, we are really asking, “how different are they”?  If the difference is small, we may attribute it to the chance variation in the samples.  However, if the difference is large there may be a difference in the proportions in the population and we may reject the null hypothesis.  We can use the c2 distribution in our test.  We will first make inferences concerning multinomial experiments and then extend that to contingency tables. Excel Lab 15 In earlier sessions you have examined and compared means from two samples.  We will now practice a technique that tests hypothesis about several means.  While we could compare the means in pairs as we have done before, the process could become too unwieldy to be of any use.  Analysis of variance (ANOVA) allows us to test all the means at the same time to see if there is any significant difference between them. Excel Lab 16 In an earlier lab, we looked at bivariate data, and used the linear correlation coefficient to see if there was a relationship between the two variables.  You also looked at a method of developing a line of best fit.  In this lab we will look at a method of deciding whether the equation of that line is of any use to us in making point predictions and developing confidence intervals. Excel Lab 17 All the previous methods we have studied are parametric statistics - based on a population that has a certain distribution and can be applied only when special criteria are met.  Non-parametric statistical methods can be applied when these criteria are not able to be met and assumptions about the parent population (such as normality) cannot be made, since these techniques do not rely on the distribution of the parent population.  Non-parametric methods tend, unfortunately , to waste information and are less sensitive than their parametric counterparts.  This, however, can be compensated for very nicely by increasing the sample size.  Non-parametric techniques are generally easier to apply and are only slightly less efficient than parametric techniques.