Overview:
Many practitioners simply run normality tests and react to the results without understanding how issues such as sample size and outliers affect those results.
This webinar introduces probability distributions, with a focus on the Normal (Gaussian) Distribution. The introduction covers the key characteristics and distribution parameters that define the normal model. The concept of distribution model fitting is presented and reasons for normality testing are reviewed.
Next, several methods for testing data for normality are presented. Although some older techniques are referenced, we emphasize the use of probability plotting and goodness-of-fit tests to provide objective assessments of normality. The methodology of hypothesis testing as applied to goodness-of-fit tests is described in detail. We emphasize the correct interpretation of normality test results (e.g. using p-values). We also discuss the risks of making errors in hypothesis tests and how to control those risks.
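As a rough illustration of probability plotting, the sketch below pairs each sorted observation with its expected standard normal score and reports the correlation of the resulting points. A value close to 1 suggests the points fall near a straight line. The function names, the sample values, and the choice of Blom plotting positions are assumptions made for this example, not the webinar's method.

```python
import math
from statistics import NormalDist, mean


def normal_plot_points(data):
    """Pair each sorted observation with its expected standard normal score."""
    x = sorted(data)
    n = len(x)
    std_normal = NormalDist()
    # Blom plotting positions: an assumed convention, other choices exist.
    scores = [std_normal.inv_cdf((i - 0.375) / (n + 0.25)) for i in range(1, n + 1)]
    return list(zip(scores, x))


def plot_correlation(data):
    """Pearson correlation between the normal scores and the sorted data."""
    points = normal_plot_points(data)
    z = [p[0] for p in points]
    x = [p[1] for p in points]
    z_bar, x_bar = mean(z), mean(x)
    s_zx = sum((a - z_bar) * (b - x_bar) for a, b in zip(z, x))
    s_zz = sum((a - z_bar) ** 2 for a in z)
    s_xx = sum((b - x_bar) ** 2 for b in x)
    return s_zx / math.sqrt(s_zz * s_xx)


# Hypothetical measurements, made up for illustration only.
measurements = [9.8, 10.1, 10.0, 9.9, 10.2, 10.0, 9.7, 10.3, 10.1, 10.4]
for z, x in normal_plot_points(measurements):
    print(f"{z:7.3f}  {x:6.2f}")
print("plot correlation:", round(plot_correlation(measurements), 4))
```

Plotting the printed pairs (score on one axis, observation on the other) gives the probability plot itself. The correlation is only a summary of straightness and does not replace a formal goodness-of-fit test.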
We provide several common scenarios that lead to rejection of normality. An understanding of these situations is important for determining appropriate actions when a normality test fails. We discuss outliers, unstable processes, and issues caused by discreteness in the data.
Next, we discuss some of the common goodness-of-fit tests that may be used (e.g., Anderson-Darling, Kolmogorov-Smirnov). They differ in several respects, and understanding their properties helps in selecting an appropriate test. The sample size chosen for normality testing can significantly affect the results, so we discuss the relationship between sample size and the power of normality tests. More data is not necessarily better in this application: with very large samples, even practically unimportant departures from normality can lead to rejection. We provide some suggestions for sample sizes.
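To make the link between sample size and power concrete, here is a minimal simulation sketch. It computes the Anderson-Darling statistic for a normal model with estimated mean and standard deviation, converts it to an approximate p-value, and estimates how often normality is rejected at different sample sizes. The p-value formula is a widely quoted approximation usually attributed to D'Agostino and Stephens and is used here as an assumption; the function names, the mildly skewed gamma sampler, and the trial counts are also illustrative choices only.

```python
import math
import random
from statistics import NormalDist, mean, stdev


def anderson_darling_normal(data):
    """Return (adjusted A-squared, approximate p-value) for a normality test."""
    x = sorted(data)
    n = len(x)
    fitted = NormalDist(mean(x), stdev(x))  # parameters estimated from the data
    eps = 1e-12  # keep CDF values away from 0 and 1 so the logs stay finite
    cdf = [min(max(fitted.cdf(v), eps), 1 - eps) for v in x]
    total = sum(
        (2 * i + 1) * (math.log(cdf[i]) + math.log(1 - cdf[n - 1 - i]))
        for i in range(n)
    )
    a2 = -n - total / n
    a2_adj = a2 * (1 + 0.75 / n + 2.25 / n ** 2)  # small-sample adjustment
    return a2_adj, _approx_p_value(a2_adj)


def _approx_p_value(a):
    """Piecewise approximation (assumed, see lead-in) of the p-value."""
    if a >= 10.0:
        return 0.0  # far beyond the range the approximation is meant for
    if a >= 0.6:
        p = math.exp(1.2937 - 5.709 * a + 0.0186 * a * a)
    elif a > 0.34:
        p = math.exp(0.9177 - 4.279 * a - 1.38 * a * a)
    elif a > 0.2:
        p = 1 - math.exp(-8.318 + 42.796 * a - 59.938 * a * a)
    else:
        p = 1 - math.exp(-13.436 + 101.14 * a - 223.73 * a * a)
    return min(1.0, max(0.0, p))


def rejection_rate(sampler, n, alpha=0.05, trials=200, seed=1):
    """Share of simulated samples of size n for which normality is rejected."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(trials):
        sample = [sampler(rng) for _ in range(n)]
        _, p = anderson_darling_normal(sample)
        if p < alpha:
            rejections += 1
    return rejections / trials


def mildly_skewed(rng):
    """Hypothetical process output: a gamma distribution with slight skew."""
    return rng.gammavariate(20.0, 1.0)


for size in (20, 100, 500):
    print(size, rejection_rate(mildly_skewed, size))
```

Because the same mild skew is rejected far more often as the sample grows, a rejection at a large sample size says little about whether the departure matters in practice.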
Since a common reason for rejecting normality is the presence of one or more potential “outliers”, we present some outlier tests that may be used (Grubbs, Dixon). We also discuss when it may be appropriate to exclude data from the analysis.
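As a hedged sketch of an outlier check in the spirit of Grubbs' test, the code below computes the Grubbs statistic (the largest absolute deviation from the mean, in sample standard deviations) and estimates a p-value by simulating normal samples of the same size. The simulation stands in for the usual table or t-based critical value and is an assumption of this example, as are the function names and the data.

```python
import random
from statistics import mean, stdev


def grubbs_statistic(data):
    """Largest absolute deviation from the mean, in sample standard deviations."""
    m, s = mean(data), stdev(data)
    return max(abs(v - m) for v in data) / s


def grubbs_p_value(data, trials=2000, seed=1):
    """Monte Carlo p-value under the hypothesis that all data are normal.

    The statistic does not depend on location or scale, so standard normal
    samples of the same size give its null distribution.
    """
    rng = random.Random(seed)
    observed = grubbs_statistic(data)
    n = len(data)
    hits = 0
    for _ in range(trials):
        sample = [rng.gauss(0.0, 1.0) for _ in range(n)]
        if grubbs_statistic(sample) >= observed:
            hits += 1
    # Add-one correction so the estimate is never exactly zero.
    return (hits + 1) / (trials + 1)


# Hypothetical measurements with one suspicious value, for illustration only.
readings = [9.8, 10.1, 10.0, 9.9, 10.2, 10.0, 9.7, 10.3, 10.1, 14.5]
print("G =", round(grubbs_statistic(readings), 3))
print("approximate p-value =", grubbs_p_value(readings))
```

A small p-value flags the most extreme point as a potential outlier. Whether to exclude it remains a judgment about the data's origin, not something the test decides.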
Why You Should Attend:
Many types of statistical analyses assume that the underlying raw data follow a Normal Distribution. Common examples include Analysis of Variance (ANOVA), t tests, F tests, and Process Capability analyses using Normal methods. It is important to test the assumption of normality before using methods that require it.
Learning Objectives Include:
Areas Covered in the Session:
Who Will Benefit:
Steve Wachs has 30 years of wide-ranging industry experience in both technical and management positions. Steve worked as a statistician at Ford Motor Company, where he gained extensive experience in the development of statistical models, reliability analysis, designed experimentation, and statistical process control.
Steve is currently a Principal Statistician at Integral Concepts, Inc. where he assists manufacturers in the application of statistical methods to reduce variation and improve quality and productivity. He also possesses expertise in the application of reliability methods to achieve robust and reliable products as well as estimate and reduce warranty. Steve consults and provides workshops in industrial statistical methods worldwide. He also supports Integral Concepts’ Litigation / Expert Witness practice with data analysis.
Steve possesses an M.A. in Applied Statistics from the University of Michigan (Ann Arbor), an M.B.A. from the Katz Graduate School of Business, University of Pittsburgh, and a B.S. in Mechanical Engineering from the University of Michigan (Ann Arbor).