Introduction to Statistics - The Catholic University of America
ENGR 104: Lecture 2
Statistical Analysis Using Matlab
Lecturers:
Dr. Binh Tran
© 2003-09 The Catholic University of America
Dept of Biomedical Engineering
Definitions
Statistics: Science that deals with collection,
tabulation, analysis, and interpretation of data
(qualitative or quantitative) in order to make
objective decisions and solve problems.
Statistical Measures of Data
Average/(Arithmetic) Mean: The average value
of all observations
Median: Middle observation
Mode: Value where highest number of observations
occurs
Range: Difference between max and min values (rough
measure of data dispersion)
Standard Deviation: Special form of average
deviation from the Mean
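The five measures above can be computed in a few lines. This is a minimal Python sketch using the standard `statistics` module (Matlab offers `mean`, `median`, `mode`, `std`, `max` and `min` for the same purpose). The observations are illustrative values, not lecture or lab data, and `pstdev` is chosen on the assumption that the standard deviation divides by n.

```python
import statistics

# Illustrative observations (an assumption, not lecture or lab data).
data = [4, 8, 6, 5, 3, 8, 9]

mean = statistics.mean(data)         # sum of observations divided by their count
median = statistics.median(data)     # middle value after sorting
mode = statistics.mode(data)         # most frequent value
data_range = max(data) - min(data)   # difference between max and min
std_dev = statistics.pstdev(data)    # divides by n; statistics.stdev divides by n - 1

print(mean, median, mode, data_range, std_dev)
```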
Average/(Arithmetic) Mean
Mean: $\bar{X} = \frac{1}{n} \sum_{i=1}^{n} X_i$
Advantage: Easy to
compute
Disadvantage: Distorted
by extreme values
(outliers)
Median: Middle Observation
Definition: Median value is
middle item when items are
arranged according to size
Advantage: Not distorted by
outliers
Disadvantage: Data must first be
arranged according to size
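A small sketch of how one outlier affects the two measures, using made-up readings (an assumption for illustration only): the mean jumps, while the median barely moves.

```python
import statistics

readings = [98, 101, 99, 102, 100]   # illustrative values, not lab data
with_outlier = readings + [400]      # one extreme observation added

mean_before = statistics.mean(readings)          # 100
mean_after = statistics.mean(with_outlier)       # 150
median_before = statistics.median(readings)      # 100
median_after = statistics.median(with_outlier)   # 100.5

print(mean_before, mean_after, median_before, median_after)
```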
Mode & Range
Mode: Most common value occurring in set of data
Advantage: Most typical value and independent of the
extreme items
Disadvantage: If values are not repeated and amount of
data is small, then the significance of the mode is limited
Range: Difference between min/max values in series
Advantage: Easy to compute & simplest measure of
dispersion
Disadvantage: No info regarding distribution of data
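Both disadvantages can be seen on small made-up data sets (assumed values, for illustration). With no repeated values every item ties as a mode, and two series with very different shapes can share the same range.

```python
import statistics

# No value repeats, so every value ties for "most common".
no_repeats = [3, 7, 9, 12, 15]
modes = statistics.multimode(no_repeats)   # returns all five values

# Same min and max, very different distributions.
clustered = [0, 50, 50, 50, 100]
spread = [0, 25, 50, 75, 100]
range_clustered = max(clustered) - min(clustered)   # 100
range_spread = max(spread) - min(spread)            # 100

print(modes, range_clustered, range_spread)
```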
Standard Deviation
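Definition: $\sigma = \sqrt{\frac{1}{n} \sum_{i=1}^{n} (X_i - \bar{X})^2}$
Figure labels: 1σ = 68.3%, 2σ = 95.5% (share of a normal distribution within 1 and 2 standard deviations of the mean)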
Advantage: Shows the
degree of dispersion and
variability
Disadvantage: Not trivial
to compute
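The computation is mechanical once written down. The sketch below assumes the standard deviation is the root of the mean squared deviation from the mean, dividing by n; the sample form divides by n - 1 instead, which is also what Matlab's `std` does by default. The function name `std_dev` and the data are illustrative.

```python
import math
import statistics

def std_dev(values, sample=False):
    """Root of the averaged squared deviations from the mean."""
    n = len(values)
    mean = sum(values) / n
    squared = sum((x - mean) ** 2 for x in values)
    divisor = n - 1 if sample else n   # n - 1 gives the sample form
    return math.sqrt(squared / divisor)

data = [2, 4, 4, 4, 5, 5, 7, 9]   # illustrative values, mean is 5
population_sd = std_dev(data)               # 2.0
sample_sd = std_dev(data, sample=True)

# Cross-check against the standard library.
print(population_sd, statistics.pstdev(data))
print(sample_sd, statistics.stdev(data))
```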
Presentation of Data
Frequency Plot: Histogram of # of occurrences.
Curve Fitting: Polynomial fitting of experimental
data
Time Series Analysis or Trend Plots: Analysis of
trends in data
Data Presentation:
Frequency Plot or Histogram
Definition: Graphic
representation of
frequency distribution
Advantage: Quick
visualization of data
Disadvantage: Difficult to
analyze data, unless data is
grouped systematically
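Grouping the data systematically means choosing fixed bins and counting the observations in each. A minimal sketch with a text histogram follows; the helper `frequency_table`, the bin width and the scores are all assumptions for illustration, and empty bins are simply not listed.

```python
from collections import Counter

def frequency_table(values, bin_width, start):
    """Count observations per bin [start + k*w, start + (k+1)*w)."""
    counts = Counter((v - start) // bin_width for v in values)
    return {start + k * bin_width: counts[k] for k in sorted(counts)}

scores = [52, 67, 71, 74, 78, 81, 83, 85, 88, 94]   # illustrative values
bin_width = 10
table = frequency_table(scores, bin_width, start=50)

# One row per bin, one '#' per observation.
for lower, count in table.items():
    print(f"{lower:3d}-{lower + bin_width - 1:3d} | {'#' * count}")
```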
Data Presentation:
Polynomial Curve Fitting
Best fit curve for data
Polynomial Equation: $y = a_0 x^m + a_1 x^{m-1} + \cdots + a_{m-1} x + a_m$
Advantage: Large set of data
can be represented by a known
equation
Disadvantage: For m > 2, the process
becomes very laborious by hand
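The labor is in solving the least-squares normal equations, which a short program handles for any m. The sketch below is one way to do it in plain Python, assuming a least-squares fit is what is meant by "best fit"; `fit_polynomial` is a hypothetical helper, and library routines such as Matlab's `polyfit` or NumPy's `polyfit` do the same job. Coefficients are returned highest power first, matching the a_0 ... a_m ordering of the polynomial equation.

```python
def fit_polynomial(xs, ys, degree):
    """Least-squares fit; returns [a_0, ..., a_m], highest power first."""
    m = degree + 1
    # Normal equations A c = b for c_0 + c_1 x + ... (ascending powers).
    A = [[sum(x ** (r + c) for x in xs) for c in range(m)] for r in range(m)]
    b = [sum(y * x ** r for x, y in zip(xs, ys)) for r in range(m)]
    # Gaussian elimination with partial pivoting.
    for col in range(m):
        pivot = max(range(col, m), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        b[col], b[pivot] = b[pivot], b[col]
        for r in range(col + 1, m):
            factor = A[r][col] / A[col][col]
            for c in range(col, m):
                A[r][c] -= factor * A[col][c]
            b[r] -= factor * b[col]
    # Back substitution.
    coeffs = [0.0] * m
    for r in range(m - 1, -1, -1):
        known = sum(A[r][c] * coeffs[c] for c in range(r + 1, m))
        coeffs[r] = (b[r] - known) / A[r][r]
    return coeffs[::-1]   # reverse so the highest power comes first

# Illustrative points lying exactly on y = 2x^2 - 3x + 1.
xs = [0, 1, 2, 3, 4]
ys = [1, 0, 3, 10, 21]
fitted = fit_polynomial(xs, ys, degree=2)
print(fitted)
```

The normal equations become poorly conditioned for high degrees or large x values, so a library routine is the safer choice for real data.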
Data Presentation:
Ex: Polynomial Curve Fitting
Example: $y = a_0 x^2 + a_1 x^1 + a_2$
Where,
$a_0 = 0.0155$
$a_1 = 2.1411$
$a_2 = 58.4165$
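Once the coefficients are known, the fitted curve can be evaluated at any x. The sketch below uses Horner's method with the three coefficients exactly as they appear in this transcript; the helper name `evaluate` and the sample x values are assumptions.

```python
def evaluate(coeffs, x):
    """Evaluate a polynomial given coefficients, highest power first (Horner)."""
    result = 0.0
    for a in coeffs:
        result = result * x + a
    return result

coeffs = [0.0155, 2.1411, 58.4165]   # a_0, a_1, a_2 as printed in the example

y_at_0 = evaluate(coeffs, 0)     # equals a_2, the constant term
y_at_10 = evaluate(coeffs, 10)   # 0.0155*100 + 2.1411*10 + 58.4165
print(y_at_0, y_at_10)
```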
Data Presentation:
Time Series (Trend) Analysis
Definition: Graphic
representation consisting of
description & measurement of
various changes or movements of
data during a period of time.
Types of trend measurement
• Semi-average
• Moving average
Data Presentation:
Semi-Average
Definition: Split data set
into two equal parts; take
average; draw straight line
through two average points
Advantage: Very simple to
calculate
Disadvantage: Only gross
representation of data trends
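The definition translates directly into code. In this sketch the observations are assumed to be equally spaced and indexed 0, 1, 2, ..., each half-average is placed at the middle position of its half, and the middle item is left out when the count is odd (one common convention, assumed here). `semi_average_line` is a hypothetical helper; the values are the ones used in the lecture's three-item moving average example.

```python
def semi_average_line(values):
    """Return (slope, intercept) of the line through the two half-averages."""
    half = len(values) // 2
    first, second = values[:half], values[-half:]   # middle item dropped if odd
    x1 = (half - 1) / 2             # middle position of the first half
    x2 = len(values) - 1 - x1       # middle position of the second half
    y1 = sum(first) / half
    y2 = sum(second) / half
    slope = (y2 - y1) / (x2 - x1)
    intercept = y1 - slope * x1
    return slope, intercept

values = [3, 5, 7, 10, 12, 14, 15, 17]
slope, intercept = semi_average_line(values)   # half-averages 6.25 and 14.5
print(slope, intercept)
```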
Data Presentation:
Moving Average
Definition: A series of
successive group averages
Advantage: Simple to calculate;
more accurate representation of
local changes
Disadvantage: Cannot be
brought up to date (the first and last
items have no average)
Data Presentation:
Ex: Three-Item Moving Average
Values   Total   Moving Average
3
5        15      5.00
7        22      7.33
10       29      9.67
12       36      12.00
14       41      13.67
15       46      15.33
17
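The totals and averages in this example can be reproduced with a short sliding-window sketch. The function name `moving_average` is an assumption; each result belongs to the middle item of its three-value window, which is why the first and last values have no entry.

```python
def moving_average(values, window=3):
    """Return (totals, averages) for each run of `window` successive values."""
    totals = [sum(values[i:i + window])
              for i in range(len(values) - window + 1)]
    averages = [round(t / window, 2) for t in totals]
    return totals, averages

values = [3, 5, 7, 10, 12, 14, 15, 17]
totals, averages = moving_average(values)
print(totals)     # [15, 22, 29, 36, 41, 46]
print(averages)   # [5.0, 7.33, 9.67, 12.0, 13.67, 15.33]
```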
Questions ?
Lab #2: Telemedicine Analysis
Lab Report Due: 9/29
Download Telemedicine data for 6
study subjects (txt files)
– http://faculty.cua.edu/tran/engr104/Datafiles.htm
Using Matlab, statistically analyze the
data and report your observations
See handout
LAB QUESTIONS:
Is there a noticeable trend/pattern in the data? Across the
datasets?
Is there a correlation between the blood glucose and blood
pressure measurements over time?
Examine this using a time-series analysis (30-day epochs).
Explain your findings.
Use curve fitting techniques to estimate the regression line best
fitting the data for each subject.
Is there a difference between the effects of tele-monitoring on
diabetics vs. hypertensives (i.e. those with high blood pressure)?
Explain.
– Is there any useful information in the histogram?
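As a starting point for the epoch and correlation questions, the sketch below averages a daily series over 30-day epochs and computes a Pearson correlation coefficient. It is written in Python for illustration only; the lab itself asks for Matlab. The layout of the lab's txt files is not described here, so no file loading is shown, and the two series are synthetic placeholders that carry no findings. The helper names are assumptions.

```python
import math

def epoch_means(values, epoch_len=30):
    """Average of each successive block of `epoch_len` readings."""
    blocks = [values[i:i + epoch_len] for i in range(0, len(values), epoch_len)]
    return [sum(block) / len(block) for block in blocks]

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length series."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)

# Synthetic placeholder series: one reading per day for 90 days.
days = range(90)
glucose = [150 - 0.3 * d + 5 * math.sin(d) for d in days]
systolic = [140 - 0.1 * d + 3 * math.cos(d) for d in days]

glucose_epochs = epoch_means(glucose)     # three 30-day averages
systolic_epochs = epoch_means(systolic)
r = pearson_r(glucose, systolic)
print(glucose_epochs, systolic_epochs, r)
```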