Transcript Chapter 2
STAT 5101
Foundations of Data Science
Instructor: Xinyuan Song
Office: LSB 114, 39437929, email: [email protected]
Teaching Assistant: Xiangnan Feng
Office: LSB G32, 39438527, email: [email protected]
Assessment Scheme
Exercise: 20%
Mid-term examination: 30% (October 23, 2013, 7:00-9:00pm; no make-up examination)
Final examination: 50% (December 4, 2013, 7:00-9:00pm)
Course Description
This course provides comprehensive coverage of basic
concepts of statistics.
Topics include
exploratory data analysis,
statistical graphics, sampling variability,
point and confidence interval estimation,
hypothesis testing,
and other selected topics.
Two software packages, R and Microsoft Excel, will be
introduced to describe and analyze data.
Learning Outcomes
After completing the course, students should
be able to
understand basic concepts in statistics;
use various statistical methods and techniques to
summarize, present, and analyze data;
read statistical reports and recognize whether the quantitative
information presented is accurate or misleading;
use computer software (R and Excel) to analyze data and
draw conclusions.
Textbook and Reference Books
Textbook
Levine, D. M., Stephan, D., Krehbiel, T. C. and Berenson, M. L.
Statistics for Managers Using Microsoft Excel 5th Edition. Pearson Prentice
Hall, 2008.
Reference books
1. Siegel, A. F. Practical Business Statistics 5th Edition. McGraw-Hill,
2003.
2. Agresti, A. and Franklin, C. Statistics: The Art and Science of Learning
from Data. 2nd Edition, Pearson Prentice Hall, 2009.
3. Fraenkel, J., Wallen, N. and Sawin, E. I. Visual Statistics.
4. Any other textbook for introducing basic statistics.
Organization of Textbook
Presenting and Describing Information
Introduction and Data Collection (Chapter 1)
Presenting Data in Tables and Charts (Chapter 2)
Numerical Descriptive Measures (Chapter 3)
Drawing Conclusions About Populations Using Sample
Information
Basic Probability (Chapter 4)
Some Important Discrete Probability Distributions (Chapter 5)
The Normal Distribution and Other Continuous Distributions (Chapter 6)
Sampling and Sampling Distributions (Chapter 7)
Confidence Interval Estimation (Chapter 8)
Hypothesis Testing (Chapters 9-12)
Decision Making (Chapter 17)
Organization of Textbook (continued)
Making Reliable Forecasts
Simple Linear Regression (Chapter 13)
Introduction to Multiple Regression (Chapter 14)
Multiple Regression Model Building (Chapter 15)
Time-Series Forecasting (Chapter 16)
Improving Business Process
Statistical Applications in Quality Management (Chapter 18)
Course Outline
Chapter 1 Data Collection and Data Presentation
Chapter 2 Numerical Descriptive Measures
Chapter 3 Important Discrete Probability Distributions
Chapter 4 Important Continuous Distributions
Chapter 5 Sampling and Sampling Distributions
Chapter 6 Confidence Interval Estimation
Chapter 7 Hypothesis Testing: One-Sample Tests
Chapter 8 Two-Sample Tests
Chapter 9 Chi-Square Tests and Nonparametric Tests
Chapter 10* Selected topic
Chapter 1
Data Collection and Data Presentation
Explain key definitions:
Population vs. Sample
Primary vs. Secondary Data
Parameter vs. Statistic
Descriptive vs. Inferential Statistics
Describe key data collection methods
Describe different sampling methods
Probability Samples vs. Nonprobability Samples
Identify types of data and levels of measurement
Use graphical techniques to organize and present data
ordered array
stem-and-leaf display
frequency distribution, polygon, and ogive
scatter diagrams
histogram
bar charts, pie charts
Chapter 2
Numerical Descriptive Measures
Mean, median, mode
Range, variance, standard deviation, coefficient of
variation
Five-number summary
Box-and-whisker plot
Correlation coefficient
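The measures listed for this chapter can be sketched in a few lines of code. The course itself uses R and Excel; the example below is written in Python purely for illustration, and the data values and the helper name pearson_r are invented for this sketch. Quartile conventions differ between textbooks and software, so the five-number summary may not match Excel or R exactly.

```python
import math
import statistics

# Hypothetical sample, invented purely for illustration.
data = [2, 4, 4, 5, 7, 9, 11]

# Measures of central tendency.
mean = statistics.mean(data)
median = statistics.median(data)
mode = statistics.mode(data)

# Measures of variation.
data_range = max(data) - min(data)
variance = statistics.variance(data)   # sample variance (divides by n - 1)
std_dev = statistics.stdev(data)       # sample standard deviation
cv = std_dev / mean * 100              # coefficient of variation, in percent

# Five-number summary: minimum, Q1, median, Q3, maximum.
# statistics.quantiles uses its default "exclusive" method here.
q1, q2, q3 = statistics.quantiles(data, n=4)
five_number = (min(data), q1, q2, q3, max(data))


def pearson_r(x, y):
    """Sample (Pearson) correlation coefficient of two equal-length lists."""
    mx, my = statistics.mean(x), statistics.mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)
```

The box-and-whisker plot is a drawing of the five-number summary, so five_number holds the values such a plot would display.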
Chapter 3
Important Discrete Probability Distributions
Define mean and standard deviation
Explain covariance and its application in finance
Binomial probability distribution
Poisson probability distribution
Hypergeometric probability distribution
Negative binomial distribution, geometric distribution,
multinomial distribution
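As a rough illustration of two of the distributions named above, the sketch below evaluates the binomial and Poisson probability functions from their standard formulas. It is written in Python rather than the course's R or Excel, and the function names and parameter values are assumptions made for this example only.

```python
import math


def binomial_pmf(k, n, p):
    """P(X = k) for X ~ Binomial(n, p): C(n, k) * p^k * (1 - p)^(n - k)."""
    return math.comb(n, k) * p ** k * (1 - p) ** (n - k)


def poisson_pmf(k, lam):
    """P(X = k) for X ~ Poisson(lam): e^(-lam) * lam^k / k!."""
    return math.exp(-lam) * lam ** k / math.factorial(k)


# Mean and standard deviation of a discrete distribution, computed
# directly from the definition for a Binomial(10, 0.3) example.
n, p = 10, 0.3
probs = [binomial_pmf(k, n, p) for k in range(n + 1)]
dist_mean = sum(k * pk for k, pk in enumerate(probs))
dist_var = sum((k - dist_mean) ** 2 * pk for k, pk in enumerate(probs))
dist_sd = math.sqrt(dist_var)
```

For the binomial case these definition-based values should agree with the shortcut formulas n*p for the mean and sqrt(n*p*(1 - p)) for the standard deviation.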
Chapter 4
Important Continuous Distributions
Continuous probability distribution
Characteristics of the normal distribution
Using a normal distribution table
Evaluate the normality assumption
Uniform and exponential distributions
Gamma and Weibull distributions
Chapter 5
Sampling and Sampling Distributions
Types of sampling methods
Sampling distributions
Sampling distribution of the mean
Sampling distribution of the proportion
Central Limit Theorem
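The sampling distribution of the mean can be explored by simulation. The following is a minimal sketch in Python (not the course's R or Excel); the Uniform(0, 1) population, the sample size of 30, the 5000 repetitions, and the seed are all arbitrary choices made for illustration.

```python
import math
import random
import statistics

random.seed(5101)  # arbitrary seed so the run is repeatable


def sample_means(n, reps):
    """Draw `reps` samples of size `n` from Uniform(0, 1); return their means."""
    return [sum(random.random() for _ in range(n)) / n for _ in range(reps)]


n, reps = 30, 5000
means = sample_means(n, reps)

# Uniform(0, 1) has mean 0.5 and standard deviation sqrt(1/12).
population_mean = 0.5
population_sd = math.sqrt(1 / 12)

# The sample means should centre on the population mean, with spread
# close to the standard error population_sd / sqrt(n).
observed_centre = statistics.mean(means)
observed_spread = statistics.stdev(means)
standard_error = population_sd / math.sqrt(n)
```

A histogram of means would look approximately bell-shaped even though the underlying population is flat, which is the behaviour the Central Limit Theorem describes.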
Chapter 6
Confidence Interval Estimation
Point estimate
Confidence interval estimate
Confidence interval for a population mean
Confidence interval for a population proportion
Determine the required sample size
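The interval and sample-size calculations listed above can be sketched as follows. This is an illustrative Python version (the course uses R and Excel) that assumes the normal (z) approximation with a known population standard deviation; the function names are hypothetical, and the t-based interval for an unknown standard deviation is not shown.

```python
import math
from statistics import NormalDist


def z_critical(confidence):
    """Two-sided critical value, e.g. about 1.96 for 95% confidence."""
    return NormalDist().inv_cdf(1 - (1 - confidence) / 2)


def mean_ci(xbar, sigma, n, confidence=0.95):
    """z interval for a population mean with known sigma."""
    margin = z_critical(confidence) * sigma / math.sqrt(n)
    return xbar - margin, xbar + margin


def proportion_ci(successes, n, confidence=0.95):
    """Large-sample z interval for a population proportion."""
    p_hat = successes / n
    margin = z_critical(confidence) * math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin


def required_n_for_mean(sigma, error, confidence=0.95):
    """Smallest n whose margin of error for the mean is at most `error`."""
    return math.ceil((z_critical(confidence) * sigma / error) ** 2)
```

For example, mean_ci(100, 15, 36) gives the 95% interval for a sample mean of 100 from 36 observations when sigma is taken to be 15; all three numbers are made up for the illustration.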
Chapter 7
Hypothesis Testing: One-Sample Tests
Null and alternative hypotheses
A decision rule for testing a hypothesis
Hypothesis testing
Type I and Type II errors
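A decision rule of the kind listed above can be illustrated with a two-sided one-sample z test. The sketch below is in Python rather than the course's R or Excel, assumes a known population standard deviation, and uses a hypothetical function name and made-up numbers.

```python
import math
from statistics import NormalDist


def one_sample_z_test(xbar, mu0, sigma, n, alpha=0.05):
    """Two-sided z test of H0: mu = mu0 against H1: mu != mu0.

    Returns the test statistic, the p-value, and whether H0 is rejected
    at significance level alpha (the allowed Type I error probability).
    """
    z = (xbar - mu0) / (sigma / math.sqrt(n))
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value, p_value < alpha


# Made-up example: sample mean 52 from n = 100, testing mu0 = 50, sigma = 10.
z_stat, p_value, reject_h0 = one_sample_z_test(52, 50, 10, 100)
```

Rejecting a true null hypothesis is a Type I error; failing to reject a false one is a Type II error. Lowering alpha reduces the first risk at the cost of raising the second for a fixed sample size.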
Chapter 8
Two-Sample Tests
Test the difference between two independent
population means
Test two means from related samples
Test the difference between two proportions
F test for the difference between two variances
Chapter 9
Chi-Square Tests and Nonparametric Tests
Chi-square test for the difference between two
proportions
Chi-square test for differences in more than two
proportions
Chi-square test for independence
The Wilcoxon rank sum test for two population
medians
The Kruskal-Wallis H-test for multiple population
medians