2 - Student Blog


Kuswanto 2011
Measures of variability (ukuran keragaman)




The three measures of central tendency alone cannot give a complete description of a data set.
We also need to know how far the observations spread out from their mean.
Two data sets can have the same mean and median yet differ in variability.
Commonly used measures of variability are the range (rentang = kisaran = wilayah), the deviation (simpangan), the variance (ragam), the standard deviation (simpangan baku), and the coefficient of variation (koefisien keragaman).
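As a small illustration of why a measure of spread is needed, the sketch below compares two data sets that share the same mean and median. Both data sets are hypothetical values chosen for this example; they do not come from the slides.

```python
from statistics import mean, median

# Two hypothetical data sets (illustrative only, not from the slides).
tight = [4, 5, 6]
wide = [0, 5, 10]

for name, data in (("tight", tight), ("wide", wide)):
    spread = max(data) - min(data)  # range, used here as a quick measure of spread
    print(name, "mean:", mean(data), "median:", median(data), "range:", spread)
```

The center is identical in both cases, so only a measure of variability can tell the two data sets apart.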
Measures of Dispersion and Variability
These measure how spread out the data are around the center of the distribution.
[Figure: two frequency (f) curves plotted against X]
2. Deviation (Deviasi = Simpangan)
Dispersion could be expressed in terms of deviations from the mean; however, the sum of the deviations from the mean is always 0, i.e.
$\sum (X_i - \bar{X}) = 0$
So, take the absolute value of each deviation to avoid this.
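The claim that deviations from the mean always sum to zero can be checked numerically. The sketch below uses the five-value data set that appears in the slide examples (2, 2, 3, 4, 5); the variable names are chosen for illustration.

```python
import math

data = [2, 2, 3, 4, 5]
x_bar = sum(data) / len(data)

# Signed deviations cancel out; absolute deviations do not.
deviations = [x - x_bar for x in data]
sum_dev = sum(deviations)
sum_abs_dev = sum(abs(d) for d in deviations)

# Floating-point arithmetic may leave a tiny residue instead of an exact 0.
print("sum of deviations is ~0:", math.isclose(sum_dev, 0.0, abs_tol=1e-9))
print("sum of absolute deviations:", round(sum_abs_dev, 2))
```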
1. Range (Kisaran = Rentang)
The difference between the lowest and highest numbers.
Place the numbers in order of magnitude; then range $= X_n - X_1$.
Example: $X_1 = 2$, $X_2 = 2$, $X_3 = 3$, $X_4 = 4$, $X_5 = 5$
Range $= 5 - 2 = 3$
Problem: the range gives no information about how clustered the data are.
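A minimal sketch of the range calculation follows. The first data set is the slide example in shuffled order; the second is a hypothetical data set added here only to show that very different clustering can give the same range.

```python
data = [5, 2, 4, 2, 3]          # slide example, deliberately unordered
ordered = sorted(data)          # place numbers in order of magnitude
data_range = ordered[-1] - ordered[0]
print("range:", data_range)

# Hypothetical data set: values pile up at the extremes, yet the range is the same.
other = [2, 2, 2, 5, 5]
other_range = max(other) - min(other)
print("range of other:", other_range)
```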
3. Mean Deviation (Simpangan Rerata)
Sample mean deviation $= \frac{\sum |X_i - \bar{X}|}{n}$
Essentially the average absolute deviation from the mean.
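The mean deviation can be sketched in a few lines. The function name `mean_deviation` is an illustrative choice, and the data are the five values used in the slide examples.

```python
def mean_deviation(values):
    """Average absolute deviation from the mean: sum(|Xi - mean|) / n."""
    n = len(values)
    x_bar = sum(values) / n
    return sum(abs(x - x_bar) for x in values) / n

data = [2, 2, 3, 4, 5]
md = mean_deviation(data)
print("mean deviation:", round(md, 2))
```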
4. Variance (Ragam)
Another way to get around the problem of zero sums is to square the deviations. The result is known as the sum of squares, or SS.
Sample SS $= \sum (X_i - \bar{X})^2 = \sum X_i^2 - \frac{(\sum X_i)^2}{n}$
SS is much more commonly used than the mean deviation.
Example
Data: $X_1 = 2$, $X_2 = 2$, $X_3 = 3$, $X_4 = 4$, $X_5 = 5$
$\bar{X} = 3.2$
Sample SS $= \sum (X_i - \bar{X})^2$
$= (2 - 3.2)^2 + (2 - 3.2)^2 + (3 - 3.2)^2 + (4 - 3.2)^2 + (5 - 3.2)^2$
$= 1.44 + 1.44 + 0.04 + 0.64 + 3.24$
$= 6.8$
Problem: the more numbers in the data set, the higher the SS.
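The two forms of the sum of squares (the definitional form and the computational shortcut) should agree. The sketch below checks this on the example data; the function names are illustrative.

```python
def ss_definitional(values):
    """SS = sum((Xi - mean)^2)."""
    x_bar = sum(values) / len(values)
    return sum((x - x_bar) ** 2 for x in values)

def ss_computational(values):
    """SS = sum(Xi^2) - (sum(Xi))^2 / n."""
    n = len(values)
    return sum(x * x for x in values) - sum(values) ** 2 / n

data = [2, 2, 3, 4, 5]
ss1 = ss_definitional(data)
ss2 = ss_computational(data)
print("SS (definitional): ", round(ss1, 4))
print("SS (computational):", round(ss2, 4))
```

The computational form needs only one pass over the data, but it can lose precision when the values are large relative to their spread, so the definitional form is often the safer choice in software.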
The mean SS is known as the variance.
Population variance ($\sigma^2$):
$\sigma^2 = \frac{\sum (X_i - \mu)^2}{N}$
This is just SS divided by $N$.
Our best estimate of $\sigma^2$ is the sample variance ($s^2$):
$s^2 = \frac{\sum (X_i - \bar{X})^2}{n - 1} = \frac{\sum X_i^2 - (\sum X_i)^2 / n}{n - 1}$
Note: divide by $n - 1$, known as the degrees of freedom.
Problem: the units end up squared.
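To make the difference between the two divisors concrete, the sketch below computes both variances for the example data with Python's standard `statistics` module: `pvariance` divides by N, and `variance` divides by n - 1. Treating the same five values as both a population and a sample is done here only for illustration.

```python
from statistics import pvariance, variance

data = [2, 2, 3, 4, 5]

pop_var = pvariance(data)    # SS / N      (population variance)
samp_var = variance(data)    # SS / (n-1)  (sample variance)

print("population variance:", round(pop_var, 4))
print("sample variance:    ", round(samp_var, 4))
```

The sample variance is always the larger of the two for the same data, since the same SS is divided by a smaller number.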
5. Standard Deviation (Standar Deviasi)
The square root of the variance.
For a population:
$\sigma = \sqrt{\sigma^2} = \sqrt{\frac{\sum (X_i - \mu)^2}{N}}$
For a sample:
$s = \sqrt{s^2} = \sqrt{\frac{\sum (X_i - \bar{X})^2}{n - 1}}$
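Example
Data: $X_1 = 2$, $X_2 = 2$, $X_3 = 3$, $X_4 = 4$, $X_5 = 5$
$\bar{X} = 3.2$
$s = \sqrt{\frac{\sum (X_i - \bar{X})^2}{n - 1}}$
$= \sqrt{\frac{(2 - 3.2)^2 + (2 - 3.2)^2 + (3 - 3.2)^2 + (4 - 3.2)^2 + (5 - 3.2)^2}{5 - 1}}$
$= \sqrt{\frac{1.44 + 1.44 + 0.04 + 0.64 + 3.24}{4}}$
$= 1.304$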
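The worked example can be reproduced as a quick check. The sketch computes the sample standard deviation by hand and compares it with `statistics.stdev`, which also uses the n - 1 divisor.

```python
import math
from statistics import stdev

data = [2, 2, 3, 4, 5]
n = len(data)
x_bar = sum(data) / n

# Manual calculation: square root of SS / (n - 1).
ss = sum((x - x_bar) ** 2 for x in data)
s_manual = math.sqrt(ss / (n - 1))

# Library calculation for comparison.
s_library = stdev(data)

print("s (manual): ", round(s_manual, 3))
print("s (library):", round(s_library, 3))
```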
6. Coefficient of Variation (Koefisien Keragaman = KK; written V or sometimes CV)
The variance ($s^2$) and standard deviation ($s$) have magnitudes that depend on the magnitude of the data.
The coefficient of variation is a relative measure (the standard deviation relative to the mean), so the variability of different data sets can be compared.
$CV = \frac{s}{\bar{X}}$
Sometimes expressed as a percentage: $CV = \frac{s}{\bar{X}} \times 100\%$
Note that there are no units, which emphasizes that it is a relative measure.
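Example:
Data (in g): $X_1 = 2$, $X_2 = 2$, $X_3 = 3$, $X_4 = 4$, $X_5 = 5$
$\bar{X} = 3.2$ g, $s = 1.304$ g
$CV = \frac{s}{\bar{X}} \ (\times 100\%) = \frac{1.304 \text{ g}}{3.2 \text{ g}}$
$CV = 0.4075$, or $CV = 40.75\%$
Note: the CV carries no unit; it is a pure number or a percentage.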
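A short sketch of the coefficient of variation follows. It also rescales the data from grams to milligrams (a conversion introduced here purely for illustration) to show that the CV does not depend on the unit of measurement. The function name `coefficient_of_variation` is an illustrative choice.

```python
import math
from statistics import mean, stdev

def coefficient_of_variation(values):
    """Sample standard deviation divided by the mean (unitless ratio)."""
    return stdev(values) / mean(values)

data_g = [2, 2, 3, 4, 5]                  # weights in grams (slide example)
data_mg = [x * 1000 for x in data_g]      # same weights in milligrams (illustrative)

cv_g = coefficient_of_variation(data_g)
cv_mg = coefficient_of_variation(data_mg)

print("CV in grams:     ", round(cv_g, 3))
print("CV in milligrams:", round(cv_mg, 3))
print("as a percentage: ", round(cv_g * 100, 1), "%")
```

The standard deviation itself changes by a factor of 1000 under this conversion, while the CV stays the same, which is what makes it suitable for comparing data sets measured on different scales.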
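8. The Normal Distribution (Distribusi Normal)
[Figure: normal curve (f against X) with 68.27%, 95.45%, and 99.73% of the area lying within 1, 2, and 3 standard deviations of the mean]
There is an equation that describes the height of the normal curve in relation to its standard deviation (σ).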
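The slides do not spell out the equation, so the sketch below assumes the standard normal density, f(x) = exp(-(x - μ)² / (2σ²)) / (σ√(2π)). It also computes the area within k standard deviations of the mean using the error function, erf(k / √2). The function names are illustrative.

```python
import math

def normal_pdf(x, mu=0.0, sigma=1.0):
    """Height of the normal curve at x (standard textbook density, assumed here)."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

def area_within(k):
    """Proportion of a normal distribution lying within k standard deviations of the mean."""
    return math.erf(k / math.sqrt(2.0))

# A larger sigma gives a lower, wider curve: compare the peak heights.
for sigma in (1.0, 1.5, 2.0):
    print("sigma =", sigma, "peak height =", round(normal_pdf(0.0, 0.0, sigma), 4))

# Areas within 1, 2, and 3 standard deviations of the mean.
for k in (1, 2, 3):
    print("within", k, "sd:", round(area_within(k) * 100, 2), "%")
```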
[Figure: normal distributions with σ = 1 and varying means: μ = 0, μ = 1, μ = 2]
If you find this concept hard to follow, consult a statistics textbook.
[Figure: normal distributions with μ = 0 and varying standard deviations: σ = 1, σ = 1.5, σ = 2]
9. Symmetry and Kurtosis
Symmetry means that the population is equally distributed around the mean, i.e. the curve to the right of the mean is a mirror image of the curve to the left.
[Figure: symmetric curve; the mean, median, and mode coincide]
Data may be positively skewed (skewed to the right) or negatively skewed (skewed to the left). The direction of skew refers to the direction of the longer tail.
[Figure: skewed curve with the mode, median, and mean marked]
Kurtosis refers to how flat or peaked a curve is (sometimes referred to as peakedness or tailedness).
The normal curve is known as mesokurtic.
A more peaked curve is known as leptokurtic.
A flatter curve is known as platykurtic.
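The slides describe skewness and kurtosis only qualitatively. One common way to quantify them, assumed here and not taken from the slides, is through central moments: skewness as m3 / m2^1.5 and excess kurtosis as m4 / m2² - 3, where m_k is the k-th central moment. A minimal sketch on the example data:

```python
def central_moment(values, k):
    """k-th central moment: average of (Xi - mean)^k, dividing by n."""
    n = len(values)
    x_bar = sum(values) / n
    return sum((x - x_bar) ** k for x in values) / n

def skewness(values):
    """Moment-based skewness: > 0 means a longer right tail, < 0 a longer left tail."""
    return central_moment(values, 3) / central_moment(values, 2) ** 1.5

def excess_kurtosis(values):
    """Moment-based excess kurtosis: about 0 is mesokurtic, > 0 leptokurtic, < 0 platykurtic."""
    return central_moment(values, 4) / central_moment(values, 2) ** 2 - 3.0

data = [2, 2, 3, 4, 5]
g1 = skewness(data)
g2 = excess_kurtosis(data)
print("skewness is positive (right-skewed):", g1 > 0)
print("excess kurtosis is negative (flatter than normal):", g2 < 0)
```

With only five observations these statistics are very unstable, so the output illustrates the sign conventions rather than anything reliable about the shape of the underlying distribution.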
Exercises and discussion (Latihan dan diskusi)
1. The numbers of banana fruits attacked by pests on 16 plants are 4, 9, 0, 1, 3, 24, 12, 3, 30, 12, 7, 13, 18, 4, 5, and 15. Treating these data as a sample, calculate the variance, standard deviation, and coefficient of variation. Which statistic is most appropriate for describing the variability of these data?
2. To study how first-grade students utilize their time when assigned to a math task, a researcher observes 24 students and records their time off task out of 20 minutes. Times off task (minutes): 4, 0, 2, 2, 4, 1, 4, 6, 9, 7, 2, 7, 5, 4, 13, 7, 7, 10, 10, 0, 5, 3, 9, and 8. For this data set:
a) Find the mean and standard deviation, and the median and range.
b) Display the data in a histogram, a dot diagram, and a stem-and-leaf diagram.
c) Determine the intervals $\bar{x} \pm s$, $\bar{x} \pm 2s$, $\bar{x} \pm 3s$.
d) Find the proportion of the measurements that lie in each of these intervals.
e) Compare your findings with the empirical guideline for bell-shaped distributions.
3. The data below were obtained from detailed records of purchases over several months. The weekly vegetable usage (in grams) of a household taken from a consumer panel was:

84 58 62 65 75 76 56 87 68 77 87 55 65 66 76 78 74 81 83 78 75 74 60 50 86
80 81 78 74 87
a. Plot a histogram of the data.
b. Find the relative frequency of the usage values that did not exceed 80.
c. Calculate the mean, variance, and standard deviation.
d. Calculate the median and quartiles.
4. The mean weight of corn is 278 g per ear with a standard deviation of 9.64 g, based on 10 ears. If the ears were taken from ten different fields, where the mean plant height is Rp 1200 with a standard deviation of Rp 90, which is more homogeneous: the weight of the corn ears or the plant height? Explain your answer. Verify your results by direct calculation with other data.
5. The salaries of employees at a seed company, in abbreviated form, are as follows: 18, 15, 21, 19, 13, 15, 14, 23, 18, and 16 rupiah. If each abbreviated value is the real salary divided by Rp 100,000, find the mean, variance, and standard deviation of the salaries.
6. Computer-aided statistical calculations. Calculating descriptive statistics such as $\bar{x}$ and $s$ becomes increasingly tedious with large data sets. Modern computers have come a long way in alleviating the drudgery of hand calculation. Microsoft Excel, Minitab, and SPSS are three computing packages that are easily accessible to students because their commands are in simple English. Find these programs and install them on your computer. Below are the main menus and submenus of Microsoft Excel, Minitab, and SPSS. Use this software to find $\bar{x}$, $s$, $s^2$, and the coefficient of variation (CV) for the data set in exercise b. Histograms and other illustrations can also be created.
7. Some properties of the standard deviation
a) If a fixed number c is added to all measurements in a data set, will the deviations $(x_i - \bar{x})$ remain unchanged? Consequently, will $s^2$ and $s$ remain unchanged, too? Try it on a sample data set.
b) If all measurements in a data set are multiplied by a fixed number d, the deviations $(x_i - \bar{x})$ are multiplied by d. Is that right? What happens to $s^2$ and $s$? Try it on a sample data set.
c) Use your computer software to examine your sample data set. Verify your results with other data.