Lecture Notes 2d
Lecture 2d: Performance Comparison
Quality of Measurement
Characteristics of a measurement tool (timer)
Accuracy: Absolute difference of a measured value and
the corresponding standard reference value (such as the
duration of a second).
Precision: Reliability of the measurements made with the
tool. Highly precise measurements are tightly clustered
around a single value.
Resolution: Smallest incremental change that can be
detected. Ex: interval between clock ticks
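As a rough illustration of resolution, the sketch below estimates the smallest increment that can be observed on Python's `time.perf_counter_ns` clock by spinning until the reading changes. The function name `smallest_tick_ns` and the sample count are our own choices. The result is only an upper bound on the true resolution, because it also includes the time required to read the clock.

```python
import time

def smallest_tick_ns(samples=200):
    """Return the smallest nonzero difference seen between consecutive clock reads."""
    smallest = None
    for _ in range(samples):
        start = time.perf_counter_ns()
        now = time.perf_counter_ns()
        # Spin until the clock reports a different value.
        while now == start:
            now = time.perf_counter_ns()
        delta = now - start
        if smallest is None or delta < smallest:
            smallest = delta
    return smallest

print("smallest observed tick:", smallest_tick_ns(), "ns")
# Resolution advertised by the platform for this clock, in seconds.
print("reported resolution:", time.get_clock_info("perf_counter").resolution, "s")
```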
Quality of Measurement
[Figure labels: accuracy, precision, mean value, true value]
Quality of Measurement
The uncertainties in the measurements are called
errors or noise
Sources of errors:
Accuracy, precision, resolution of the measurement tool
Time required to read and store the current time value
Time-sharing among multiple programs
Processing of interrupts
Cache misses, page faults
Quality of Measurement
Types of errors:
Systematic errors
• Are the result of some experimental mistake
• Usually constant across all measurements
Ex: temperature may affect the clock period
Random errors
• Unpredictable, nondeterministic
• Affect the precision of measurement
Ex: timer resolution ±T affects measurements with equal probability in either direction
Quality of Measurement
Experimental measurements follow a Gaussian (normal) distribution
Ex (pg 48):
x: actual value of the quantity being measured
±E: random error
Two sources of error, each shifting the measurement by -E or +E with 50% probability

Error 1   Error 2   Measured value   Probability
-E        -E        x-2E             1/4
-E        +E        x                1/4
+E        -E        x                1/4
+E        +E        x+2E             1/4

Actual value of x is measured half of the time.
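The two-source example can be extended to more error sources by enumerating every combination of -E and +E. The sketch below does this with exact fractions; the function name `error_distribution` is ours, and offsets are expressed in units of E. With more sources the probabilities take on a binomial shape, which is why the Gaussian distribution is a reasonable model for the sum of many small random errors.

```python
from fractions import Fraction
from itertools import product

def error_distribution(num_sources):
    """Map each net offset (in units of E) to its probability."""
    dist = {}
    weight = Fraction(1, 2 ** num_sources)  # every sign combination is equally likely
    for signs in product((-1, +1), repeat=num_sources):
        offset = sum(signs)
        dist[offset] = dist.get(offset, Fraction(0)) + weight
    return dict(sorted(dist.items()))

for sources in (2, 4):
    print(sources, "sources:", error_distribution(sources))
```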
Confidence Intervals
Used to find a range of values that has a given probability of
including the actual value.
Case 1: number of measurements is large (n≥30)
{x1, x2, … xn} - Samples
Gaussian distribution
m – mean
s – standard deviation
Confidence interval: [ c1, c2 ]
Confidence level: (1-α)×100%
Pr[ c1 ≤ x ≤ c2 ] = 1-α
Pr[ x < c1 ] = Pr[ x > c2 ] = α/2
Confidence Intervals
Case 1: number of measurements is large (n≥30)
Confidence interval: [ c1, c2 ]
$c_1 = \bar{x} - z_{1-\alpha/2} \frac{s}{\sqrt{n}}$
$c_2 = \bar{x} + z_{1-\alpha/2} \frac{s}{\sqrt{n}}$
$\bar{x}$ - Sample mean
$s$ - Standard deviation
$z_{1-\alpha/2}$ is obtained from the precomputed table
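A minimal sketch of the large-sample interval, assuming Python's standard `statistics` module in place of the precomputed table (`NormalDist().inv_cdf` returns the z quantile). The function name and the sample data are hypothetical and serve only to exercise the formula.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev

def z_confidence_interval(samples, confidence=0.90):
    """Return (c1, c2) for the mean using the normal approximation (n >= 30)."""
    n = len(samples)
    alpha = 1.0 - confidence
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)  # z_{1-alpha/2}
    half_width = z * stdev(samples) / sqrt(n)
    x_bar = mean(samples)
    return x_bar - half_width, x_bar + half_width

# Hypothetical data: 40 measurements cycling through five values.
samples = [10 + (i % 5) * 0.5 for i in range(40)]
for level in (0.90, 0.95, 0.99):
    c1, c2 = z_confidence_interval(samples, level)
    print(f"{level:.0%}: [{c1:.3f}, {c2:.3f}]")
```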
Confidence Intervals
Case 2: number of measurements is small (n<30)
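Sample variances $s^2$ can vary significantly.
t distribution:
$c_1 = \bar{x} - t_{1-\alpha/2;\,n-1} \frac{s}{\sqrt{n}}$
$c_2 = \bar{x} + t_{1-\alpha/2;\,n-1} \frac{s}{\sqrt{n}}$
$\bar{x}$ - Sample mean
$s$ - Standard deviation
$t_{1-\alpha/2;\,n-1}$ is obtained from the precomputed table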
Confidence Intervals
Ex: number of measurements is small (n<30)
Pg 51
90% confidence interval means that there is a 90% chance
that the actual mean is within that interval.
Confidence Intervals
Confidence level   c1     c2
90%                6.5    9.4
95%                6.1    9.7
99%                5.3    10.6
Wider interval → less precise knowledge about the mean
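The widening of the interval with the confidence level can be reproduced with a small-sample sketch. The t values are copied from a standard t table for 7 degrees of freedom, and the eight measurements are an assumption (the notes do not list the data); with this sample the intervals come out close to the ones listed above. The function name is ours.

```python
from math import sqrt
from statistics import mean, stdev

# t_{1-alpha/2; 7} from a standard t table (n - 1 = 7 degrees of freedom).
T_TABLE_7_DOF = {0.90: 1.895, 0.95: 2.365, 0.99: 3.499}

def t_confidence_interval(samples, t_value):
    """Return (c1, c2) for the mean, given the tabulated t value."""
    n = len(samples)
    half_width = t_value * stdev(samples) / sqrt(n)
    x_bar = mean(samples)
    return x_bar - half_width, x_bar + half_width

# Assumed sample of n = 8 measurements, for illustration only.
samples = [8.0, 7.0, 5.0, 9.0, 9.5, 11.3, 5.2, 8.5]
for level, t_value in T_TABLE_7_DOF.items():
    c1, c2 = t_confidence_interval(samples, t_value)
    print(f"{level:.0%}: c1 = {c1:.1f}, c2 = {c2:.1f}")
```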
Confidence Intervals
Determining the Number of measurements Needed
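$(c_1, c_2) = ((1-e)\bar{x},\ (1+e)\bar{x})$
$c_1 = (1-e)\bar{x} = \bar{x} - z_{1-\alpha/2} \frac{s}{\sqrt{n}}$
$n = \left( \frac{z_{1-\alpha/2}\, s}{e\, \bar{x}} \right)^2$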
Confidence Intervals
Determining the Number of measurements Needed
Estimating s:
1. Make small number of measurements.
2. Estimate standard deviation s.
3. Calculate n.
4. Make n measurements.
$n = \left( \frac{z_{1-\alpha/2}\, s}{e\, \bar{x}} \right)^2$
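The four steps can be sketched as follows, where e is the allowed half-width of the interval expressed as a fraction of the mean. The function name `measurements_needed` and the pilot values are hypothetical; the z quantile comes from Python's `statistics.NormalDist` rather than a table.

```python
from math import ceil
from statistics import NormalDist, mean, stdev

def measurements_needed(pilot, e, confidence=0.90):
    """Estimate n = (z_{1-alpha/2} * s / (e * x_bar))^2 from a small pilot run."""
    alpha = 1.0 - confidence
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    x_bar = mean(pilot)   # step 1: small number of measurements
    s = stdev(pilot)      # step 2: estimate the standard deviation
    return ceil((z * s / (e * x_bar)) ** 2)  # step 3: calculate n, rounded up

# Hypothetical pilot run; step 4 would then collect this many measurements.
pilot = [9.0, 10.0, 11.0]
print(measurements_needed(pilot, e=0.10, confidence=0.95))
print(measurements_needed(pilot, e=0.02, confidence=0.95))
```

Halving e roughly quadruples n, since e appears squared in the denominator.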
Confidence Intervals
Ex:
Pg 53
Confidence Intervals
Confidence Intervals for Proportions
When we are interested in the number of times events occur.
Binomial distribution:
If np≥10 it approximates a Gaussian distribution with mean p and
variance p(1-p)/n
$c_1 = p - z_{1-\alpha/2} \sqrt{\frac{p(1-p)}{n}}$
$c_2 = p + z_{1-\alpha/2} \sqrt{\frac{p(1-p)}{n}}$
n - Total events recorded
m - Number of times the desired outcome occurs
p = m/n is the sample proportion
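A short sketch of the proportion interval, assuming the Gaussian approximation holds. The function name and the counts in the usage line are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def proportion_interval(m, n, confidence=0.95):
    """Return (c1, c2) for the proportion p = m/n."""
    p = m / n
    alpha = 1.0 - confidence
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)  # z_{1-alpha/2}
    half_width = z * sqrt(p * (1.0 - p) / n)
    return p - half_width, p + half_width

# Hypothetical counts: the event of interest occurred 50 times out of 100.
print(proportion_interval(50, 100))
```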
Confidence Intervals
Confidence Intervals for Proportions
Determining the number of measurements needed:
$(1-e)p = p - z_{1-\alpha/2} \sqrt{\frac{p(1-p)}{n}}$
$n = \frac{(z_{1-\alpha/2})^2\, p(1-p)}{(pe)^2}$
Confidence Intervals
Ex:
Pg 55
Comparing Alternatives
Three different cases:
Before-and-after comparison
Comparison of non-corresponding (unpaired)
measurements
Comparisons involving proportions
Comparing Alternatives
Before-and-after comparison
Used to determine whether some change made to a system
has statistically significant impact on its performance.
1. Find a confidence interval for the mean of the differences
of the paired observations
2. If this interval includes 0, then measured differences are
not statistically significant.
Comparing Alternatives
Before-and-after comparison
Before measurements: b1, … bn
After measurements: a1, … an
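Differences:
d1 = a1 - b1
d2 = a2 - b2 …
$c_1 = \bar{d} - z_{1-\alpha/2} \frac{s_d}{\sqrt{n}}$
$c_2 = \bar{d} + z_{1-\alpha/2} \frac{s_d}{\sqrt{n}}$
$\bar{d}$ - Arithmetic mean of the differences
$s_d$ - Standard deviation of the differences
Valid for n ≥ 30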
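The before-and-after procedure can be sketched as below. The function names and the synthetic measurements are hypothetical; the data are constructed so that one change has no effect on average and the other shifts every measurement by about 5 units.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev

def before_after_interval(before, after, confidence=0.95):
    """Confidence interval for the mean of the paired differences (n >= 30)."""
    if len(before) != len(after):
        raise ValueError("measurements must be paired")
    diffs = [a - b for a, b in zip(after, before)]
    z = NormalDist().inv_cdf(1.0 - (1.0 - confidence) / 2.0)
    half_width = z * stdev(diffs) / sqrt(len(diffs))
    d_bar = mean(diffs)
    return d_bar - half_width, d_bar + half_width

def is_significant(interval):
    """The difference is significant only if the interval excludes 0."""
    c1, c2 = interval
    return not (c1 <= 0.0 <= c2)

# Synthetic data: 30 paired measurements with alternating +/-1 noise.
before = [100.0 + (i % 7) for i in range(30)]
noise = [1.0 if i % 2 else -1.0 for i in range(30)]
unchanged = [b + d for b, d in zip(before, noise)]
improved = [b - 5.0 + d for b, d in zip(before, noise)]

print(is_significant(before_after_interval(before, unchanged)))
print(is_significant(before_after_interval(before, improved)))
```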
Comparing Alternatives
Before-and-after comparison
Ex:
pg 65
Comparing Alternatives
Non-corresponding Measurements
There is no direct correspondence between pairs of measurements.
1. First system: n1 measurements, find $\bar{x}_1$ and $s_1$
2. Second system: n2 measurements, find $\bar{x}_2$ and $s_2$
3. Calculate the difference of means and the standard deviation of
the difference of means
$\bar{x} = \bar{x}_1 - \bar{x}_2$
$s_x = \sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}$
4. If the confidence interval includes 0, then there is no significant difference
Comparing Alternatives
Non-corresponding Measurements
$c_1 = \bar{x} - z_{1-\alpha/2}\, s_x$
$c_2 = \bar{x} + z_{1-\alpha/2}\, s_x$
when n1 ≥ 30 and n2 ≥ 30
Comparing Alternatives
Non-corresponding Measurements
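$c_1 = \bar{x} - t_{1-\alpha/2;\,n_{df}}\, s_x$
$c_2 = \bar{x} + t_{1-\alpha/2;\,n_{df}}\, s_x$
when n1 < 30 or n2 < 30
$n_{df} = \frac{\left( \frac{s_1^2}{n_1} + \frac{s_2^2}{n_2} \right)^2}{\frac{(s_1^2/n_1)^2}{n_1 - 1} + \frac{(s_2^2/n_2)^2}{n_2 - 1}}$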
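A minimal sketch of the unpaired comparison for small samples. It computes the difference of means, its standard deviation, and the degrees of freedom from the formula above; the t value is supplied by the caller from a t table, as in the notes. The function names and sample values are hypothetical, and 2.306 is the tabulated $t_{0.975;8}$.

```python
from math import sqrt
from statistics import mean, variance

def unpaired_summary(sample1, sample2):
    """Return (difference of means, s_x, n_df) for two independent samples."""
    n1, n2 = len(sample1), len(sample2)
    v1 = variance(sample1) / n1  # s1^2 / n1
    v2 = variance(sample2) / n2  # s2^2 / n2
    diff = mean(sample1) - mean(sample2)
    s_x = sqrt(v1 + v2)
    n_df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return diff, s_x, n_df

def unpaired_interval(sample1, sample2, t_value):
    """Return (c1, c2) given the t value looked up for n_df."""
    diff, s_x, _ = unpaired_summary(sample1, sample2)
    return diff - t_value * s_x, diff + t_value * s_x

# Hypothetical measurements from two systems.
system1 = [1.0, 2.0, 3.0, 4.0, 5.0]
system2 = [3.0, 4.0, 5.0, 6.0, 7.0]
diff, s_x, n_df = unpaired_summary(system1, system2)
print(diff, s_x, n_df)
# n_df is 8 here; t_{0.975;8} = 2.306 from a standard t table.
print(unpaired_interval(system1, system2, t_value=2.306))
```

In practice n_df is usually not a whole number and is rounded before the table lookup.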
Comparing Alternatives
Non-corresponding Measurements
Ex: pg 67
Comparing Alternatives
Comparing Proportions
m1 is the number of times the event occurs in system 1 out of a
total of n1 events measured; m2 and n2 are defined likewise for system 2.
$p_1 = m_1 / n_1$
$p_2 = m_2 / n_2$
If m1>10 and m2>10 then the proportions approximately follow normal distributions with
means $p_1$ and $p_2$
variances $p_1(1-p_1)/n_1$ and $p_2(1-p_2)/n_2$
Comparing Alternatives
Comparing Proportions
Confidence intervals
$c_1 = p - z_{1-\alpha/2}\, s_p$
$c_2 = p + z_{1-\alpha/2}\, s_p$
where
$p = p_1 - p_2$
Standard deviation
$s_p = \sqrt{\frac{p_1(1-p_1)}{n_1} + \frac{p_2(1-p_2)}{n_2}}$
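The comparison of two proportions can be sketched as follows. The function name and the event counts are hypothetical; as with the other comparisons, an interval that includes 0 indicates no statistically significant difference.

```python
from math import sqrt
from statistics import NormalDist

def proportion_difference_interval(m1, n1, m2, n2, confidence=0.90):
    """Return (c1, c2) for p = p1 - p2 using the normal approximation."""
    p1, p2 = m1 / n1, m2 / n2
    s_p = sqrt(p1 * (1.0 - p1) / n1 + p2 * (1.0 - p2) / n2)
    z = NormalDist().inv_cdf(1.0 - (1.0 - confidence) / 2.0)
    p = p1 - p2
    return p - z * s_p, p + z * s_p

# Hypothetical counts: event seen 90/100 times on system 1, 10/100 on system 2.
print(proportion_difference_interval(90, 100, 10, 100))
# Identical proportions give an interval centred on 0.
print(proportion_difference_interval(50, 100, 50, 100))
```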
Comparing Alternatives
Comparing More than Two Alternatives
Analysis of Variance (ANOVA)
n - measurements
k - alternatives
Comparing Alternatives
Comparing More than Two Alternatives
Mean of alternative j: $\bar{y}_{.j} = \frac{\sum_{i=1}^{n} y_{ij}}{n}$
Overall mean: $\bar{y}_{..} = \frac{\sum_{j=1}^{k} \sum_{i=1}^{n} y_{ij}}{kn}$
Comparing Alternatives
Comparing More than Two Alternatives
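Deviation of $y_{ij}$ from the mean of alternative j: $y_{ij} = \bar{y}_{.j} + e_{ij}$
Deviation of $\bar{y}_{.j}$ from $\bar{y}_{..}$: $\bar{y}_{.j} = \bar{y}_{..} + \alpha_j$
Therefore: $y_{ij} = \bar{y}_{..} + \alpha_j + e_{ij}$
$\sum_{j=1}^{k} \alpha_j = 0$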
Comparing Alternatives
Comparing More than Two Alternatives
Total variance observed:
1. Variance due to the actual differences among alternatives
(SSA)
2. Variance due to measurement errors (SSE)
Comparing Alternatives
Comparing More than Two Alternatives
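$SSA = n \sum_{j=1}^{k} (\bar{y}_{.j} - \bar{y}_{..})^2$
$SSE = \sum_{j=1}^{k} \sum_{i=1}^{n} (y_{ij} - \bar{y}_{.j})^2$
Sum of squares total:
$SST = \sum_{j=1}^{k} \sum_{i=1}^{n} (y_{ij} - \bar{y}_{..})^2$
$SST = SSA + SSE$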
Comparing Alternatives
Comparing More than Two Alternatives
F-test
F distribution
Used to test whether two variances are significantly different.
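SSA/SST and SSE/SST are the fractions of the total variation due to the alternatives and to measurement errors.
If the ratio of the two variances is close to 1, then there is no significant difference
If the ratio is greater than a critical value, then it cannot be said that
there is no significant difference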
Comparing Alternatives
Comparing More than Two Alternatives
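Mean square of the alternatives: $s_a^2 = \frac{SSA}{k-1}$
Mean square error: $s_e^2 = \frac{SSE}{k(n-1)}$
$F = \frac{s_a^2}{s_e^2}$
If $F > F_{[1-\alpha;\,(k-1),\,k(n-1)]}$ then SSA is significantly greater than SSE
with confidence level of 1-α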
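The sums of squares and the F statistic can be computed with a short sketch. The function name `one_way_anova` and the data (k = 3 alternatives, n = 4 measurements each) are hypothetical; the critical value 4.26 is $F_{[0.95;\,2,\,9]}$ copied from a standard F table.

```python
from statistics import mean

def one_way_anova(columns):
    """columns[j] holds the n measurements of alternative j."""
    k = len(columns)
    n = len(columns[0])
    if any(len(col) != n for col in columns):
        raise ValueError("every alternative needs the same number of measurements")
    col_means = [mean(col) for col in columns]
    # With equal column sizes, the mean of the column means is the overall mean.
    overall = mean(col_means)
    ssa = n * sum((m - overall) ** 2 for m in col_means)
    sse = sum((y - m) ** 2 for col, m in zip(columns, col_means) for y in col)
    sst = sum((y - overall) ** 2 for col in columns for y in col)
    s_a2 = ssa / (k - 1)        # mean square of the alternatives
    s_e2 = sse / (k * (n - 1))  # mean square error
    return {"SSA": ssa, "SSE": sse, "SST": sst, "F": s_a2 / s_e2}

# Hypothetical measurements for three alternatives.
data = [[1.0, 2.0, 3.0, 2.0], [4.0, 5.0, 6.0, 5.0], [7.0, 8.0, 9.0, 8.0]]
result = one_way_anova(data)
F_CRITICAL = 4.26  # F[0.95; 2, 9] from a standard F table
print(result)
print("significant at 95%:", result["F"] > F_CRITICAL)
```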
Comparing Alternatives
Comparing More than Two Alternatives
Contrasts
Used to compare individual alternatives.
c - contrast
w - weight
$c = \sum_{j=1}^{k} w_j \alpha_j$
$\sum_{j=1}^{k} w_j = 0$
Comparing Alternatives
Comparing More than Two Alternatives
Variance of c: $s_c^2 = \frac{\sum_{j=1}^{k} w_j^2\, s_e^2}{kn}$
Confidence interval:
$c_1 = c - t_{1-\alpha/2;\,k(n-1)}\, s_c$
$c_2 = c + t_{1-\alpha/2;\,k(n-1)}\, s_c$
Use the t distribution when k(n-1) < 30
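A minimal sketch of a contrast between two alternatives, under the same equal-sample-size setting as the ANOVA formulas. The function name, the data, and the weights (1, -1, 0), which compare alternative 1 with alternative 2, are hypothetical; 1.833 is the tabulated $t_{0.95;9}$ for a 90% interval with k(n-1) = 9.

```python
from math import sqrt
from statistics import mean

def contrast_interval(columns, weights, t_value):
    """Return (c, c1, c2) for the contrast c = sum_j w_j * alpha_j."""
    k = len(columns)
    n = len(columns[0])
    if abs(sum(weights)) > 1e-12:
        raise ValueError("contrast weights must sum to zero")
    col_means = [mean(col) for col in columns]
    overall = mean(col_means)
    effects = [m - overall for m in col_means]  # alpha_j
    sse = sum((y - m) ** 2 for col, m in zip(columns, col_means) for y in col)
    s_e2 = sse / (k * (n - 1))
    c = sum(w * a for w, a in zip(weights, effects))
    s_c = sqrt(sum(w * w for w in weights) * s_e2 / (k * n))
    return c, c - t_value * s_c, c + t_value * s_c

# Hypothetical measurements: k = 3 alternatives, n = 4 measurements each.
data = [[1.0, 2.0, 3.0, 2.0], [4.0, 5.0, 6.0, 5.0], [7.0, 8.0, 9.0, 8.0]]
c, c1, c2 = contrast_interval(data, weights=(1, -1, 0), t_value=1.833)
print(c, (c1, c2))
print("alternatives 1 and 2 differ:", not (c1 <= 0.0 <= c2))
```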
Comparing Alternatives
Comparing more than Two Alternatives
Ex: