STATISTICAL DATA ANALYSIS TECHNIQUES
ANALYSING AND INTERPRETING
QUANTITATIVE DATA
HJ. SHAWAL KASLAM
INTRODUCTION
DATA ARE CENTRAL TO RESEARCH, SO
UNDERSTANDING THE DATA IS
A CRUCIAL PART OF ANY STUDY. Some of the
fundamental questions to consider are:
1. What is the nature of data?
2. How to gather the data?
3. What is the instrument used to gather the data?
4. How to measure the data?
5. How to analyze the data?
6. How to interpret the output?
The objective of this presentation is to describe
the fundamental process of analyzing quantitative
data. By the end of this session, participants
should be able to:
• describe, combine, and make inferences from numbers.
• understand the procedures used to obtain several
statistical values.
• use, present and interpret statistical outputs to
produce a report.
• draw conclusions based on the statistical findings.
WHAT IS QUANTITATIVE DATA
ANALYSIS?
QUANTITATIVE DATA ANALYSIS IS THE PROCESS OF
TRANSFORMING RAW DATA OBTAINED FROM
QUESTIONNAIRES INTO MEANINGFUL
INFORMATION SUCH AS STATISTICAL VALUES [e.g.
percentage, mean] AND OF TESTING THE
STATISTICAL SIGNIFICANCE OF THE DATA.
THE STEPS IN DATA ANALYSIS
1. SELECT THE SOFTWARE [SPSS]
2. CREATE THE DATA FILE
3. KEY IN THE DATA
4. EXPLORE THE DATA
5. EDIT THE FILE
6. ANALYSE THE DATA
7. INTERPRET THE RESULT/OUTPUT
DATA ANALYSIS TECHNIQUES AND THE
STATISTICAL VALUES
DESCRIPTIVE: exploration of the variables
. Measure of frequency distribution: percentage value (%)
. Measure of central tendency: mean, mode, median
. Skewness and kurtosis
INFERENTIAL: to test the statistical significance of the variables
. Chi-square Test
. T-Tests: One-Sample Test, Paired T-Test, Independent T-Test
. ANOVA
. Z-Test
PREDICTIVE: to test relationships and statistical significance, and to predict the impact or changes of the variables
. Correlation Analysis
. Regression Analysis
BASIC DESCRIPTIVE STATISTICS are used to explore the data collected and to
summarise and describe those data.
MEASURES OF FREQUENCY DISTRIBUTION - a frequency distribution is a display of the frequency of
occurrence of each score value. It can be represented in
tabular form or, with more visual clarity, in graphical form.
Gender
                 Frequency   Percent   Valid Percent   Cumulative Percent
Valid  Male              4      40.0            40.0                 40.0
       Female            6      60.0            60.0                100.0
       Total            10     100.0           100.0
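As a rough illustration of how such a frequency table is built, the sketch below recreates the Gender table from ten responses (4 Male, 6 Female, as in the table). The variable names are illustrative and this is not how SPSS computes it internally.

```python
from collections import Counter

# Ten gender responses matching the Gender table (4 Male, 6 Female).
responses = ["Male"] * 4 + ["Female"] * 6

counts = Counter(responses)
total = sum(counts.values())

# Each row: (category, frequency, percent, cumulative percent).
frequency_table = []
cumulative = 0.0
for category in ["Male", "Female"]:  # fixed display order
    percent = 100.0 * counts[category] / total
    cumulative += percent
    frequency_table.append((category, counts[category], percent, cumulative))

for category, freq, percent, cum in frequency_table:
    print(f"{category:<8}{freq:>5}{percent:>8.1f}{cum:>8.1f}")
```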
Support from peer
                         Frequency   Percent   Valid Percent   Cumulative Percent
Valid  Undecided                11       7.1             7.1                  7.1
       Agree                   107      69.0            69.0                 76.1
       Strongly Agree           37      23.9            23.9                100.0
       Total                   155     100.0           100.0
Marital Status
                   Frequency   Percent   Valid Percent   Cumulative Percent
Valid  Bachelor           19      12.3            12.3                 12.3
       Married           134      86.5            86.5                 98.7
       widower             2       1.3             1.3                100.0
       Total             155     100.0           100.0
[Chart: Marital Status, with categories Bachelor, Married, widower]
[Histogram: What is your cgpa? Mean = 2.7, Std. Dev = .58, N = 100]
BASIC DESCRIPTIVE STATISTICS
MEASURES OF CENTRAL TENDENCY – INTERVAL or RATIO
DATA
STATISTICAL VALUES
- MEAN
- MODE
- MEDIAN
- STANDARD DEVIATION
- MAXIMUM
- MINIMUM
Statistics
Variable            N Valid   N Missing   Mean     Median   Mode   Std. Deviation   Minimum   Maximum
Support from peer   155       0           4.1677   4.0000   4.00   .53232           3.00      5.00
Support from peer   155       0           3.2710   4.0000   4.00   .92097           1.00      5.00
Support from peer   155       0           4.3419   4.0000   4.00   .57478           3.00      5.00
Support from peer   155       0           3.0581   3.0000   4.00   .95509           1.00      5.00
Support from peer   155       0           4.2323   4.0000   4.00   .62231           2.00      5.00
Support from peer   155       0           3.8452   4.0000   4.00   .65606           2.00      5.00
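The statistical values listed above can be obtained with a few lines of code. The sketch below is only an illustration: it uses the ten test scores shown in the DATA listing of this deck, and it relies on Python's standard library rather than SPSS.

```python
import statistics

# Ten test scores taken from the DATA listing in this deck.
scores = [65, 56, 58, 79, 80, 65, 65, 67, 68, 69]

summary = {
    "n": len(scores),
    "mean": statistics.mean(scores),
    "median": statistics.median(scores),
    "mode": statistics.mode(scores),
    "std_dev": statistics.stdev(scores),  # sample SD (n - 1 in the denominator)
    "minimum": min(scores),
    "maximum": max(scores),
}

for name, value in summary.items():
    print(f"{name:<8} {value}")
```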
MEASURE OF CENTRAL TENDENCY
COMPARE MEAN
Descriptive Statistics
                                 N     Minimum   Maximum   Mean     Std. Deviation
Level of quality management      121   2.00      5.00      3.8182    .67082
Level of quality programme       120   2.00      5.00      3.8250    .68185
Level of staff support           120   2.00      5.00      4.1250    .76216
Level of customer satisfaction   119   2.00      5.00      3.9664    .70028
Quality talk                     128   1.00      9.00      2.8906   1.50712
Valid N (listwise)               119
BASIC DESCRIPTIVE STATISTICS
MEASURES OF SKEWNESS & KURTOSIS – refer to the shape of
distribution and are used with interval and ratio level data.
[Histogram: Support from peer. Mean = 4.17, Std. Dev = .53, N = 155]
[Histogram: Work Environment. Mean = 3.6, Std. Dev = .87, N = 155]
[Histogram: VAR00001. Mean = 72.8, Std. Dev = 22.74, N = 20]
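A minimal sketch of how skewness and kurtosis describe the shape of a distribution is given below. It uses the simple moment-based formulas, which is an assumption: SPSS reports sample-adjusted versions, so its figures will differ slightly. Both data sets are hypothetical.

```python
def moment_shape(values):
    """Return (skewness, excess kurtosis) from population moments."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    skewness = m3 / m2 ** 1.5
    excess_kurtosis = m4 / m2 ** 2 - 3.0  # 0 for a normal distribution
    return skewness, excess_kurtosis

# Hypothetical data: one symmetric set, one with a long right tail.
symmetric = [1, 2, 3, 4, 5]
right_skewed = [1, 1, 1, 2, 5]

skew_sym, kurt_sym = moment_shape(symmetric)
skew_right, kurt_right = moment_shape(right_skewed)
print(skew_sym, kurt_sym)
print(skew_right, kurt_right)
```

A skewness near zero suggests a symmetric distribution; a positive value indicates a longer right tail.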
BASIC DESCRIPTIVE STATISTICS
CROSS TABULATION - to explore the
intersection of two variables
Marital Status * INCOME GROUP Crosstabulation
Count
                            INCOME GROUP
                            B     C     D     E     Total
Marital Status  Bachelor    11     7     1     0       19
                Married     25    71    33     5      134
                widower      0     2     0     0        2
Total                       36    80    34     5      155
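The row and column totals of a crosstabulation can be checked directly from the cell counts. The sketch below uses the counts from the Marital Status * INCOME GROUP table; treating the blank cells as 0 is an assumption, although it is consistent with the printed totals.

```python
# Cell counts from the crosstab; blank cells are assumed to be 0.
cell_counts = {
    "Bachelor": {"B": 11, "C": 7, "D": 1, "E": 0},
    "Married": {"B": 25, "C": 71, "D": 33, "E": 5},
    "widower": {"B": 0, "C": 2, "D": 0, "E": 0},
}
income_groups = ["B", "C", "D", "E"]

# Marginal totals: sum across each row, then down each column.
row_totals = {status: sum(row.values()) for status, row in cell_counts.items()}
column_totals = {
    group: sum(row[group] for row in cell_counts.values())
    for group in income_groups
}
grand_total = sum(row_totals.values())

print(row_totals)
print(column_totals)
print(grand_total)
```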
INTERPRETING BASIC DESCRIPTIVE
STATISTICS
BASIC DESCRIPTIVE STATISTICAL
VALUES CAN BE USED TO EXPLORE
AND EXPLAIN THE RESEARCH
QUESTIONS.
Example: Research question “Is there any
difference in mean score between the groups
[male and female] of the study sample?”
Report
What is your gender            Post-score   Pre-score
MALE     Mean                     76.4000     54.2667
         N                             15          15
         Std. Deviation           9.53040    10.79330
FEMALE   Mean                     72.8000     51.8667
         N                             15          15
         Std. Deviation          14.78030    13.22264
Total    Mean                     74.6000     53.0667
         N                             30          30
         Std. Deviation          12.35565    11.92197
Descriptive Statistics
Item                           N    Minimum   Maximum   Mean     Std. Deviation
beneficial to new students     80   1.00      5.00      3.2750   .87113
run smoothly                   80   1.00      4.00      2.8500   .91541
lecture hall was comfortable   80   1.00      5.00      3.5625   .82437
satisfied with the room        80   1.00      5.00      3.4625   .91325
guidance by the librarian      80   1.00      5.00      3.6875   .77286
Valid N (listwise)             80
Item                        Mean     SD       Rank
Guidance by the librarian   3.6875   .77286   1
Lecture hall                3.5625   .82437   2
Satisfaction with room      3.4625   .91325   3
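The ranking above is simply the items sorted by mean score, highest first. A short sketch, using the item means from the Descriptive Statistics table (the variable names are illustrative):

```python
# Item means taken from the Descriptive Statistics table.
item_means = {
    "beneficial to new students": 3.2750,
    "run smoothly": 2.8500,
    "lecture hall was comfortable": 3.5625,
    "satisfied with the room": 3.4625,
    "guidance by the librarian": 3.6875,
}

# Sort by mean in descending order and keep the three highest items.
ranked = sorted(item_means.items(), key=lambda pair: pair[1], reverse=True)
top_three = [name for name, _ in ranked[:3]]

for rank, (name, mean) in enumerate(ranked, start=1):
    print(rank, name, mean)
```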
INFERENTIAL STATISTICS - tests for
difference of means and tests for statistical
significance.
The purpose of difference-of-means tests is to
test hypotheses. The most common techniques
are:
- T-Test
- ANOVA
- CHI-SQUARE
Hypothesis
Ho : μ1 = μ2 No significant difference
H1 : μ1 ≠ μ2 There is a significant difference
Rule of significance testing in SPSS
Calculated significance value: p
Critical value: α [alpha]
By convention, in social science α = 0.05 or 0.01
CRITERIA OF REJECTION or ACCEPTANCE
If the significance value p < α [0.05 / 0.01],
reject Ho [There is a significant difference]
If the significance value p > α [0.05 / 0.01],
fail to reject Ho [No significant difference]
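The rejection rule can be written as a small helper. This is only a sketch: the function name is hypothetical, and treating p exactly equal to α as "fail to reject" is an assumption, since the rule above covers only p < α and p > α.

```python
def decide(p_value, alpha=0.05):
    """Apply the rejection rule: reject Ho only when p is below alpha."""
    if p_value < alpha:
        return "reject Ho"
    return "fail to reject Ho"

# Significance values reported in this deck's examples.
one_sample_decision = decide(0.322)           # one-sample t-test
paired_decision = decide(0.001)               # paired t-test
strict_decision = decide(0.001, alpha=0.01)   # same result at alpha = .01

print(one_sample_decision, paired_decision, strict_decision)
```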
T-TEST - used to determine whether
there is a significant difference between
two sets of scores
One-sample T-test - used when you have data
from a single sample of participants and you wish to
know whether the mean of the population from which the
sample is drawn is the same as a standard mean.
e.g. test whether the mean of students' test scores
is the same as the standard mean = 70
One-Sample Statistics
               N    Mean      Std. Deviation   Std. Error Mean
Test score 1   30   67.7000   12.50145         2.28244
DATA
Test score
65
56
58
79
80
65
65
67
68
69
The result from the study is
One-Sample Test (Test Value = 70)
               t        df   Sig. (2-tailed)   Mean Difference   95% CI Lower   95% CI Upper
Test score 1   -1.008   29   .322              -2.3000           -6.9681        2.3681
t(29) = -1.008, p = 0.322, p > 0.05
∴ Fail to reject Ho
Conclusion: there is no significant difference
between the mean of the population from which the sample was drawn and the
standard mean [Test value].
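The t value in the One-Sample Test table can be reproduced from the summary statistics alone. The sketch below is a hand calculation, not SPSS output; the function name is hypothetical.

```python
import math

def one_sample_t(sample_mean, sample_sd, n, test_value):
    """Return (t, df, standard error) for a one-sample t-test."""
    standard_error = sample_sd / math.sqrt(n)
    t = (sample_mean - test_value) / standard_error
    return t, n - 1, standard_error

# Summary values from the One-Sample Statistics table; test value = 70.
t_stat, df, se = one_sample_t(67.7, 12.50145, 30, 70)
print(round(t_stat, 3), df, round(se, 5))
```

The result agrees with the reported t(29) = -1.008 and standard error of 2.28244.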
Independent T-Test - used to test whether the
difference between the means of two independent sets of
scores is significant.
A study was done to compare job stress between two
employee groups (administrative and support). Data were
solicited from a randomly selected sample.
Test the hypothesis on the difference at .05 level of
significance.
Group Statistics
Job stress
Employee groups   N    Mean      Std. Deviation   Std. Error Mean
Administrative    10   22.8000   2.39444          .75719
Support           10   22.1000   2.68535          .84918
The result of the independent-samples T-Test
Independent Samples Test
Job stress
Levene's Test for Equality of Variances: F = .640, Sig. = .434
t-test for Equality of Means
                              t      df       Sig. (2-tailed)   Mean Difference   Std. Error Difference   95% CI Lower   95% CI Upper
Equal variances assumed       .615   18       .546              .7000             1.13774                 -1.69030       3.09030
Equal variances not assumed   .615   17.768   .546              .7000             1.13774                 -1.69253       3.09253
t(18) = .615, p = .546
p > 0.05
∴ Fail to reject Ho
Conclude that there is no significant difference in job
stress between administrative and support groups at the
.05 level of significance.
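The "equal variances assumed" row can be checked by hand from the Group Statistics table. The sketch below uses the pooled-variance formula; the function name is hypothetical.

```python
import math

def pooled_t(mean1, sd1, n1, mean2, sd2, n2):
    """Return (t, df, standard error of the difference), equal variances assumed."""
    df = n1 + n2 - 2
    pooled_variance = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df
    se_difference = math.sqrt(pooled_variance * (1 / n1 + 1 / n2))
    t = (mean1 - mean2) / se_difference
    return t, df, se_difference

# Administrative vs. support groups, values from the Group Statistics table.
t_stat, df, se_diff = pooled_t(22.8, 2.39444, 10, 22.1, 2.68535, 10)
print(round(t_stat, 3), df, round(se_diff, 5))
```

This reproduces t(18) = .615 with a standard error of the difference of 1.13774.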
Paired T-Test - used to determine whether the
means of two related sets of scores
(e.g. before and after) differ significantly.
A training program was conducted to improve
participants’ knowledge of ICT. Data
were collected from a selected sample both
before and after the ICT training program. Test the
hypothesis that the training is effective in improving
participants’ knowledge of ICT at the 0.05 level of
significance.
Paired Samples Statistics
                           Mean      N    Std. Deviation   Std. Error Mean
Pair 1  Post-Test scores   15.1000   10   2.99815          .94810
        Pre-Test scores    12.3000   10   1.88856          .59722
The result of the Paired T-Test
Paired Samples Test
Paired Differences (Pair 1: Post-Test scores - Pre-Test scores)
Mean     Std. Deviation   Std. Error Mean   95% CI Lower   95% CI Upper   t       df   Sig. (2-tailed)
2.8000   1.81353          .57349            1.5027         4.0973         4.882   9    .001
t(9) = 4.882, p = .001
p < 0.05
∴ Reject Ho [Null hypothesis]
Conclude that the training program was
effective in improving participants’ knowledge
of ICT at the .05 level of significance.
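A paired t-test is a one-sample test on the differences, so its t value and confidence interval follow from the Paired Differences row. In the sketch below the critical value 2.262 (two-tailed, α = .05, 9 degrees of freedom) is taken from a standard t table and is not part of the slides.

```python
import math

# Paired differences (post minus pre) from the Paired Samples Test table.
mean_diff = 2.8
sd_diff = 1.81353
n = 10

se = sd_diff / math.sqrt(n)   # standard error of the mean difference
t_stat = mean_diff / se
df = n - 1

# Two-tailed 5% critical value of t with 9 df, from a t table (assumption).
T_CRIT_DF9 = 2.262
ci_lower = mean_diff - T_CRIT_DF9 * se
ci_upper = mean_diff + T_CRIT_DF9 * se

print(round(t_stat, 3), df, round(ci_lower, 4), round(ci_upper, 4))
```

Because the interval does not contain zero, it leads to the same decision as p < 0.05.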
ANOVA (one-way analysis of variance) - used
to compare the means of more than two groups.
ANOVA also provides post hoc analysis
to determine which pairs of groups are
significantly different.
Data on perception towards management were
gathered from a randomly selected sample
comprising three employee groups
(supervisory, line and support). Test the
difference in perception among the three
groups at the .05 level of significance.
The result of ANOVA
Descriptives
Perception towards management
              N    Mean      Std. Deviation   Std. Error   95% CI Lower Bound   95% CI Upper Bound   Minimum   Maximum
Supervisory    9   30.3333   3.50000          1.16667      27.6430              33.0237              25.00     35.00
Line          10   21.1000   3.07137           .97125      18.9029              23.2971              17.00     25.00
Support       10   19.6000   3.56526          1.12744      17.0496              22.1504              14.00     24.00
Total         29   23.4483   5.75442          1.06857      21.2594              25.6371              14.00     35.00
Test of Homogeneity of Variances
Perception towards management
Levene Statistic   df1   df2   Sig.
.194               2     26    .825
ANOVA
Perception towards management
                 Sum of Squares   df   Mean Square   F        Sig.
Between Groups   629.872           2   314.936       27.542   .000
Within Groups    297.300          26    11.435
Total            927.172          28
F (2, 26) = 27.542, p < .001 [SPSS shows Sig. = .000]
Since sig-F (.000) < α (.05)
∴ Reject the null hypothesis
Conclude that there is a significant
difference in perception towards
management between the three
employee groups at the .05 level of
significance.
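The F ratio in the ANOVA table can be rebuilt from the group sizes, means and standard deviations in the Descriptives table. The sketch below is a hand calculation under that assumption; small rounding differences from the SPSS figures are expected.

```python
# (n, mean, standard deviation) per group, from the Descriptives table.
groups = {
    "Supervisory": (9, 30.3333, 3.50000),
    "Line": (10, 21.1000, 3.07137),
    "Support": (10, 19.6000, 3.56526),
}

total_n = sum(n for n, _, _ in groups.values())
grand_mean = sum(n * mean for n, mean, _ in groups.values()) / total_n

# Between-groups: spread of group means around the grand mean.
ss_between = sum(n * (mean - grand_mean) ** 2 for n, mean, _ in groups.values())
# Within-groups: pooled spread of scores inside each group.
ss_within = sum((n - 1) * sd ** 2 for n, _, sd in groups.values())

df_between = len(groups) - 1
df_within = total_n - len(groups)
f_ratio = (ss_between / df_between) / (ss_within / df_within)

print(round(ss_between, 3), round(ss_within, 3), round(f_ratio, 3))
```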
Post-hoc Analysis - which pairs are
significantly different?
Multiple Comparisons
Dependent Variable: Perception towards management
Tukey HSD
(I) Employee groups   (J) Employee groups   Mean Difference (I-J)   Std. Error   Sig.   95% CI Lower Bound   95% CI Upper Bound
Supervisory           Line                     9.2333*              1.55370      .000     5.3726              13.0941
Supervisory           Support                 10.7333*              1.55370      .000     6.8726              14.5941
Line                  Supervisory             -9.2333*              1.55370      .000   -13.0941              -5.3726
Line                  Support                  1.5000               1.51226      .588    -2.2578               5.2578
Support               Supervisory            -10.7333*              1.55370      .000   -14.5941              -6.8726
Support               Line                    -1.5000               1.51226      .588    -5.2578               2.2578
*. The mean difference is significant at the .05 level.
Chi-Square Test for independence or
relatedness - a nonparametric technique
A study was conducted to determine whether job stress is
significantly related to employee group.
Chi-Square Tests
                               Value      df   Asymp. Sig. (2-sided)
Pearson Chi-Square             4.667(a)    9   .862
Likelihood Ratio               6.225       9   .717
Linear-by-Linear Association    .391       1   .532
N of Valid Cases               20
a. 20 cells (100.0%) have expected count less than 5. The
minimum expected count is .50.
The result: Pearson Chi-square = 4.667, p = .862
χ2 (9, N = 20) = 4.667, p > .05
∴ Fail to reject Ho
Conclude that there is no significant
relationship between job stress and employee group.
Note: because every expected count is below 5, the chi-square
approximation is unreliable here and the result should be read with caution.
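The Pearson chi-square statistic compares observed counts with the counts expected under independence. The sketch below shows the calculation on a hypothetical 2 x 2 table (the slides do not give the underlying counts), and also counts cells with an expected count below 5, as in the SPSS footnote.

```python
def chi_square_independence(table):
    """Return (chi-square, df, number of cells with expected count < 5)."""
    row_totals = [sum(row) for row in table]
    column_totals = [sum(column) for column in zip(*table)]
    n = sum(row_totals)
    chi_square = 0.0
    small_cells = 0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * column_totals[j] / n
            chi_square += (observed - expected) ** 2 / expected
            if expected < 5:
                small_cells += 1
    df = (len(table) - 1) * (len(table[0]) - 1)
    return chi_square, df, small_cells

# Hypothetical observed counts: rows = groups, columns = stress level.
observed_counts = [[10, 20], [30, 40]]
chi_square, df, small_cells = chi_square_independence(observed_counts)
print(round(chi_square, 4), df, small_cells)
```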
PREDICTIVE STATISTICAL ANALYSIS
TECHNIQUES - CORRELATION ANALYSIS
CORRELATION ANALYSIS – used to look
at the relationship between two variables
in a linear fashion.
The correlation coefficient has a range of
possible values from -1 to +1. The value
indicates the strength of the relationship,
while the sign ( + or - ) indicates the
direction.
Guilford's Rule of Thumb
[Table not recovered in this transcript]
The result of correlation analysis
Correlations
                                          Anxiety   Team cohesiveness
Anxiety             Pearson Correlation   1         -.783**
                    Sig. (2-tailed)       .         .000
                    N                     21        21
Team cohesiveness   Pearson Correlation   -.783**   1
                    Sig. (2-tailed)       .000      .
                    N                     21        21
**. Correlation is significant at the 0.01 level (2-tailed).
r = -.783
Sig. = .000, p < 0.01
There is a high negative relationship
between anxiety (X) and team cohesiveness (Y)
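The Pearson correlation coefficient can be computed directly from paired scores. The five pairs below are hypothetical and serve only to show the calculation; they are not the 21 cases behind the table.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical scores: cohesiveness falls as anxiety rises.
anxiety = [1, 2, 3, 4, 5]
cohesiveness = [10, 8, 7, 4, 1]

r = pearson_r(anxiety, cohesiveness)
print(round(r, 4))
```

The negative sign shows the direction of the relationship; the size of r shows its strength.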
PREDICTIVE STATISTICAL ANALYSIS
TECHNIQUES - REGRESSION ANALYSIS
Regression analysis - the result of
regression is an equation that represents
the best prediction of a dependent variable
from several independent variables.
Regression analysis is used when
independent variables are correlated with
one another and with the dependent
variable.
The purpose of regression analysis
. Determine the relationship between one or more IVs and
one DV
. Predict the value of the dependent variable from the values of
the independent variables (X’s)
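As a minimal sketch of the idea, the code below fits a regression equation with a single independent variable by least squares and uses it to predict a new value. The data are hypothetical, and the multiple-IV case described above would need a statistics package.

```python
def fit_line(xs, ys):
    """Least-squares fit of y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope

# Hypothetical data: x is the independent variable, y the dependent variable.
x_values = [1, 2, 3, 4, 5]
y_values = [2, 4, 5, 4, 5]

intercept, slope = fit_line(x_values, y_values)
predicted = intercept + slope * 6  # predicted y when x = 6

print(round(intercept, 2), round(slope, 2), round(predicted, 2))
```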
CONCLUSION
Analyzing quantitative data is the most
interesting part of research. It is
important that the presentation of the data
is effective in bringing the objectives of
the study to the forefront and in stating
the research outcome clearly.
That is all. Thank you very much.