#### Transcript 2-1 Data Summary and Display

2-1 Data Summary and Display

Population Mean

For a finite population with N measurements, the mean is

$$\mu = \frac{1}{N}\sum_{i=1}^{N} x_i$$

The sample mean is a reasonable estimate of the population mean.

Sample Variance and Sample Standard Deviation

The sample variance is

$$s^2 = \frac{\sum_{i=1}^{n}(x_i - \bar{x})^2}{n - 1}$$

where $\bar{x}$ is the sample mean of the n observations. The sample standard deviation is

$$s = \sqrt{s^2}$$

Computational formula for $s^2$

$$s^2 = \frac{\sum_{i=1}^{n} x_i^2 - \frac{\left(\sum_{i=1}^{n} x_i\right)^2}{n}}{n - 1}$$

Why? Expanding the square and using $\sum_{i=1}^{n} x_i = n\bar{x}$ gives

$$\sum_{i=1}^{n}(x_i - \bar{x})^2 = \sum_{i=1}^{n} x_i^2 - 2\bar{x}\sum_{i=1}^{n} x_i + n\bar{x}^2 = \sum_{i=1}^{n} x_i^2 - \frac{\left(\sum_{i=1}^{n} x_i\right)^2}{n}$$

Population Variance

When the population is finite and consists of N values, we may define the population variance as

$$\sigma^2 = \frac{\sum_{i=1}^{N}(x_i - \mu)^2}{N}$$

The sample variance is a reasonable estimate of the population variance.

2-2 Stem-and-Leaf Diagram

Steps for Constructing a Stem-and-Leaf Diagram

[The steps and the example diagrams are not included in the transcript; one diagram is annotated "quartiles".]

2-3 Histograms

A histogram is a more compact summary of data than a stem-and-leaf diagram. To construct a histogram for continuous data, we must divide the range of the data into intervals, which are usually called class intervals, cells, or bins. If possible, the bins should be of equal width to enhance the visual information in the histogram.

[Figure annotation: 10 bins.]

An important variation of the histogram is the Pareto chart. This chart is widely used in quality and process improvement studies where the data usually represent different types of defects, failure modes, or other categories of interest to the analyst. The categories are ordered so that the category with the highest frequency is on the left, followed by the category with the second highest frequency, and so forth.
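As a rough illustration of the summaries above, the sketch below computes the sample variance from the definition and from the computational formula, then orders category counts the way a Pareto chart does. The function names, the measurements, and the defect labels are all made up for illustration; none of them come from the slides.

```python
import math
from collections import Counter


def sample_variance(xs):
    """Sample variance from the definition: sum((x - xbar)^2) / (n - 1)."""
    n = len(xs)
    xbar = sum(xs) / n
    return sum((x - xbar) ** 2 for x in xs) / (n - 1)


def sample_variance_shortcut(xs):
    """Computational formula: (sum(x^2) - (sum(x))^2 / n) / (n - 1)."""
    n = len(xs)
    return (sum(x * x for x in xs) - sum(xs) ** 2 / n) / (n - 1)


def pareto_order(labels):
    """Return (category, count) pairs sorted by count, largest first."""
    return sorted(Counter(labels).items(), key=lambda kv: kv[1], reverse=True)


# Hypothetical measurements, chosen only so the arithmetic is easy to check.
data = [2, 4, 4, 4, 5, 5, 7, 9]
s2 = sample_variance(data)                 # 32 / 7
s2_short = sample_variance_shortcut(data)  # same value by the shortcut
s = math.sqrt(s2)                          # sample standard deviation

# Hypothetical defect categories for a Pareto ordering.
defects = ["scratch", "dent", "scratch", "crack", "scratch", "dent"]
ordered = pareto_order(defects)  # [("scratch", 3), ("dent", 2), ("crack", 1)]

print(s2, s2_short, s)
print(ordered)
```

The two variance functions agree up to floating-point rounding. The shortcut needs only the running sums of x and x squared, which is why it is convenient for hand calculation, although the definition is usually the safer choice numerically when the values are large relative to their spread.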
[Pareto chart annotation: ordered by frequency.]

2-4 Box Plots

• The box plot is a graphical display that simultaneously describes several important features of a data set, such as center, spread, departure from symmetry, and identification of observations that lie unusually far from the bulk of the data.
• Whisker
• Outlier
• Extreme outlier

Table 2-2

2-5 Time Series Plots

• A time series or time sequence is a data set in which the observations are recorded in the order in which they occur.
• A time series plot is a graph in which the vertical axis denotes the observed value of the variable (say x) and the horizontal axis denotes the time (which could be minutes, days, years, etc.).
• When measurements are plotted as a time series, we often see
  • trends,
  • cycles, or
  • other broad features of the data.

[Figure annotations: a cyclic variability; upward trend.]

2-6 Multivariate Data

• The dot diagram, stem-and-leaf diagram, histogram, and box plot are descriptive displays for univariate data; that is, they convey descriptive information about a single variable.
• Many engineering problems involve collecting and analyzing multivariate data, or data on several different variables.
• In engineering studies involving multivariate data, often the objective is to determine the relationships among the variables or to build an empirical model.

The Corrected Sum of Cross-Products

$$S_{xy} = \sum_{i=1}^{n}(x_i - \bar{x})(y_i - \bar{y}) = \sum_{i=1}^{n} x_i y_i - \frac{\left(\sum_{i=1}^{n} x_i\right)\left(\sum_{i=1}^{n} y_i\right)}{n}$$

Inner product! $S_{xy}$ is the inner product of the centered data vectors with entries $x_i - \bar{x}$ and $y_i - \bar{y}$.

Sample Correlation Coefficient

• Two variables are strongly correlated if $0.8 \le |r| \le 1$,
• moderately correlated if $0.5 < |r| < 0.8$, and
• weakly correlated if $0 \le |r| \le 0.5$.

[Figure annotations: pairwise correlations; interaction between foam and region.]
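The cross-product sum and the correlation thresholds can be sketched as follows. This is a minimal illustration, not the deck's own calculation: the formula used for r, S_xy divided by the square root of S_xx times S_yy, is the standard definition and is assumed here because the transcript does not show it. The classification uses the magnitude of r, on the assumption that the thresholds apply to negative correlations as well. The function names and data are hypothetical.

```python
import math


def s_xy(xs, ys):
    """Corrected sum of cross-products from the definition."""
    n = len(xs)
    xbar = sum(xs) / n
    ybar = sum(ys) / n
    return sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys))


def s_xy_shortcut(xs, ys):
    """Equivalent form: sum(x*y) - sum(x) * sum(y) / n."""
    n = len(xs)
    return sum(x * y for x, y in zip(xs, ys)) - sum(xs) * sum(ys) / n


def sample_correlation(xs, ys):
    """r = S_xy / sqrt(S_xx * S_yy); standard definition, assumed here."""
    return s_xy(xs, ys) / math.sqrt(s_xy(xs, xs) * s_xy(ys, ys))


def strength(r):
    """Classify by |r| using the 0.5 and 0.8 thresholds from the slide."""
    a = abs(r)
    if a >= 0.8:
        return "strong"
    if a > 0.5:
        return "moderate"
    return "weak"


# Hypothetical paired observations, chosen so the sums are easy to verify.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

sxy = s_xy(x, y)                 # 6.0
sxy_short = s_xy_shortcut(x, y)  # 66 - 15 * 20 / 5 = 6.0
r = sample_correlation(x, y)     # 6 / sqrt(10 * 6), about 0.775

print(sxy, sxy_short, r, strength(r))
```

Passing the same list twice to `s_xy` gives S_xx, which is also the numerator of the sample variance, so the correlation needs no helper beyond the cross-product sum.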