STATISTICS
Meena Ganapathy
MEANING
Statistics
Latin: status
Italian: statistica
German: Statistik
French: statistique
Statistic (singular): one value associated with an individual, e.g., the wt of
one person.
Statistics (plural): a set of such values, e.g., the wts of several persons.
Statistics (as a singular noun): a branch of science. It is the
combination of logic & mathematics.
DIFFERENT BRANCHES OF
STATISTICS
1) Medical Statistics
2) Health statistics
3) Vital statistics
4) Biostatistics
STATISTICS
It is the branch of science which deals with the techniques of
collection, compilation, presentation and analysis of data & the logical
interpretation of the results.
USE OF STATISTICS
1. To collect data in the best possible way.
2. To describe the characteristics of a group or a situation.
3. To analyze data & to draw conclusions from such analysis.
DEFINITION
Variable :- A characteristic that takes different values in different persons,
places or things.
E.g. Ht, Wt, B.P., Age
It is denoted by a capital letter, e.g. X
E.g., X: Ht
Its observed values are written x1, x2, x3, x4, ..., xn
n = total number of observations
ATTRIBUTE
A qualitative characteristic such as sex or nationality is called an
attribute
CONSTANT
The characteristic which does not change its value or nature is
considered as constant
E.g. blood group, sex
OBSERVATION
An event or its measurement, e.g., B.P. is the event & 120/80
mm of Hg is its measurement
OBSERVATION UNIT
The source that gives the observation, such as an object, a person, etc.
DATA
A set of values recorded on one or more observational units is
called data. It gives numerical observations about the observational
units.
e.g., HT, WT, Age.
= equal to
< Less than
> greater than
≤ less than or equal to
≥ greater than or equal to
≠ not equal to
∑ Summation
Short forms
A.M.- arithmetic mean
H.M.- harmonic mean
G.M.- Geometric mean
C.V.- Coefficient of variation
S.E.- Standard error
S.D.- Standard deviation
D.F.- Degree of freedom
C.I.- Confidence interval
E :- Expected value of a cell of a contingency table
O :- Observed value of a cell of a contingency table
N :- Population size
n :- Sample size
α :- Level of significance (L.O.S.)
H0 :- Null hypothesis
H1 :- Alternative hypothesis
TYPES OF DATA
Qualitative and quantitative
Discrete and continuous
Primary and Secondary
Grouped and ungrouped
QUALITATIVE &
QUANTITATIVE DATA
Qualitative data :- It is also called enumeration data. It represents a
particular quality or attribute; there is no notion of measurement. It
can be classified by counting the individuals having the same
characteristic.
E.g. Sex, religion, blood group
QUANTITATIVE DATA
It is also called measurement data. The characteristic can be measured
and expressed as a number.
E.g. Ht, Wt, BP, HB
DISCRETE & CONTINUOUS
Discrete :- Here we always get a whole number.
E.g. no. of people dying in road accidents, no. of vials of polio
vaccine.
Continuous :- Here fractional values such as 1.2, 2.1 or 3.81 are
possible, i.e. the variable can take any value within a certain range.
E.g., Ht, Wt, temp
PRIMARY AND SECONDARY
Primary :- Data collected originally by the investigator,
for the first time and directly from the individuals, is called
primary data. It gives precise information.
E.g. the investigator surveys the Karvenagar area to find
the no. of alcoholic persons.
Secondary :- When data collected by some other person
or agency is used, it is called secondary data.
E.g. Census, hospital records
UNGROUPED AND
GROUPED
Ungrouped :- When the data is presented in raw form, it is
called ungrouped data.
E.g. Marks of 5 students
20, 30, 25, 20, 30
Grouped :- When the ungrouped data is arranged into
groups, it is called grouped data.
E.g.
Marks    Students
20       2
30       2
25       1
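As a minimal sketch of the grouping step, assuming Python and the five marks listed above, the standard-library Counter can turn the ungrouped marks into a marks/students table. The variable names are illustrative only.

```python
from collections import Counter

# Ungrouped data: marks of 5 students, as listed in the example above
marks = [20, 30, 25, 20, 30]

# Grouped data: each distinct mark with the number of students who scored it
grouped = Counter(marks)

for mark, students in grouped.items():
    print(mark, students)
```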
METHODS OF DATA
COLLECTION
Observation
Visual
Instrument
Instrument Properties
Reliability
Validity
Interviews & self administered questionnaires
Use of documentary sources (secondary data)
CLASSIFICATION OF DATA
Definition :- The process of arranging data into groups or classes
according to similar characteristics is called classification & the
groups so formed are called classes (class intervals).
OBJECTIVES OF
CLASSIFICATION OF DATA
1. It condenses the data.
2. It omits unnecessary information.
3. It reveals the important features of the data.
4. It facilitates comparison with other data.
5. It enables further analysis, like computation of averages & dispersion
(variability) of the data.
FREQUENCY
A) Frequency
Definition :- The no. of times a value of the variable is repeated is called its
frequency.
B) Cumulative class frequency
Definition :- Cumulative frequency is formed by adding the frequency of
each class to the total frequency of the previous classes. It indicates the no.
of observations < upper limit of the class.
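The running total described here can be sketched in Python as follows. The class intervals and frequencies are hypothetical, made up only to show the arithmetic.

```python
from itertools import accumulate

# Hypothetical class intervals and frequencies, chosen only for illustration
classes = ["0-9", "10-19", "20-29", "30-39"]
frequencies = [4, 7, 6, 3]

# Running total: each class frequency added to the total of the previous classes
cumulative = list(accumulate(frequencies))

for label, f, cf in zip(classes, frequencies, cumulative):
    print(label, f, cf)
```

The last cumulative frequency always equals the total no. of observations, which is a quick check on the table.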
Representatives                Symbol
                               Sample       Population
1. Mean                        x̄ (X bar)    μ
2. SD                          s            σ
3. Variance                    s²           σ²
4. Proportion                  p            P
5. Complement of proportion    q            Q
DATA PRESENTATION
Meena Ganapathy
METHODS OF PRESENTATION
OF DATA
Tabulation.
Charts and diagrams.
METHODS OF PRESENTATION OF
DATA
[Figure: layout of a table, showing the caption heading and subheadings (columns), the stub heading and stub (rows), the body of the table, and the row and column totals]
IMPORTANT POINTS IN
MAKING A TABLE
Table No. :- Needed if many tables are present
Title :- Should be short
Head note :- Whatever is not covered in the title can be written in the head
note.
E.g. expressing units
Caption :- Column heading, according to characteristics
Stub :- Row heading
Subheading
Body :- Content
Foot note :- Short forms or explanatory notes
Source note :- The source of the data. It is important because it shows the
reliability of the table.
RULES AND GUIDELINES FOR
TABULAR PRESENTATION
1. Tables must be numbered.
2. A brief & self-explanatory title must be given to each table.
3. The headings of columns & rows must be clear, sufficient,
concise & fully defined.
4. The data must be presented according to size or importance,
chronologically, alphabetically or geographically.
5. A table should not be too large.
6. Foot notes should be given whenever necessary, providing
additional information, sources or explanatory notes.
TYPES OF TABLE
1.One way table/simple table
2.Two way table
3.Complex table
1.ONE WAY TABLE/ SIMPLE
TABLE
When only one characteristic is described in a table, it
is called a simple table.
EXAMPLE OF ONE WAY TABLE
Class interval    Tally mark    Frequency
3-4               IIII          5
5-6               II            2
7-8               IIII          5
9-10              III           3
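A one way table like this can be built from raw observations by counting how many fall in each class interval. The sketch below assumes Python; the raw observations are hypothetical, made up so that the counts come out as 5, 2, 5 and 3, and frequency_table is an illustrative helper name.

```python
# Hypothetical raw observations, made up so that the counts match the table above
observations = [3, 4, 3, 4, 4, 5, 6, 7, 8, 7, 8, 8, 9, 10, 9]

# Class intervals with inclusive lower and upper limits
class_intervals = [(3, 4), (5, 6), (7, 8), (9, 10)]


def frequency_table(values, intervals):
    """Count how many values fall inside each inclusive class interval."""
    table = []
    for lower, upper in intervals:
        count = sum(1 for v in values if lower <= v <= upper)
        table.append((lower, upper, count))
    return table


table = frequency_table(observations, class_intervals)
for lower, upper, count in table:
    # One stroke per observation stands in for the tally marks
    print(f"{lower}-{upper}", "|" * count, count)
```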
TWO WAY TABLE
In this table data is classified according to two characteristics; it
gives information about two interrelated characteristics.
Frequency distribution table, qualitative data: distribution of types of
anemia according to sex

Sex      Types of anemia          Total
Boys     160      85      15      260
Girls    190      120     45      355
Total    350      205     60      615
COMPLEX TABLE
When information is collected regarding 3 or 4 characteristics & tabulated
according to these characteristics, the table is called a
complex table.
EXAMPLE OF COMPLEX TABLE
Fasting blood    Male                      Female                    Total
glucose          51-60 yrs    61-70 yrs    51-60 yrs    61-70 yrs
120-129          4            4            2            2            12
130-139          1            3            3            1            8
140-149          2            4            1            3            10
150-159          2            3            3            2            10
160-169          4            5            3            3            15
170-179          5            4            5            4            18
180-189          1            2            1            1            5
Total            19           25           18           16           78
ADVANTAGES OF GRAPHS &
DIAGRAMS
1. Information is presented in condensed form.
2. Facts are presented in a more effective & impressive manner as
compared to tables.
3. Easy to understand for a layman.
4. Create an effect which lasts for a longer time.
5. Facilitate comparison.
6. Help in revealing patterns.
DISADVANTAGES
Approximate results instead of accuracy
Gives only a general idea
Not sufficient for statistical analysis
TYPES OF DIAGRAMS FOR
QUALITATIVE DATA
Bar: Simple, Multiple or complex, Component & Proportional
Pie or Sector
Pictograms
Shaded Map / Contour / Spot Maps
BAR DIAGRAMS
It is used to compare variables possessed by one or more groups.
SIMPLE BAR DIAGRAM
Here only one variable is presented
Bars are at uniform distance from one another
It can be drawn vertically or horizontally
Each should have title & source note
[Figure: simple bar diagram, "No. of dependents at home"; x-axis: No. of dependents (None, 1, 2, 3, 4, 5 and above); y-axis: No. of subjects (0 to 120); bar labels 103, 97, 47, 34, 21 and 17]
PIE OR SECTOR
DIAGRAMS
When the data for one qualitative characteristic is presented as the sum of
different components, we use a pie diagram.
[Figure: pie diagram, "Patients' age distribution in percentage"; age groups 19-29, 30-39, 40-49 and 50-59; sector labels 21%, 34%, 19% and 26%]
PICTOGRAMS
These diagrams are useful for lay people. E.g., a village map
indicating the temple, trees, etc.
SPOT MAPS
In this diagram the location of each case of an illness, death, etc.
is marked on a map of the area with a spot, dot or any other
symbol.
TYPES OF DIAGRAMS FOR
QUANTITATIVE DATA
Histograms
Frequency polygon
Frequency curve
Cumulative frequency curve
Line graph
Scatter diagram
Population Pyramid
HISTOGRAMS
It is the graphical representation
of a frequency distribution. It is a
series of adjacent rectangles (bars)
erected on the class intervals.
The areas of these bars denote the
frequencies of the respective class
intervals.
X axis: the base of each bar shows the
width of the class interval
Y axis: frequency / no. of
observations
[Figure: sample bar chart with series East, West and North over 1st Qtr to 4th Qtr; y-axis from 0 to 90]
FREQUENCY POLYGON
It is a representation of categories of continuous & ordered data,
similar to a histogram. It can be drawn in two ways: using a histogram, or
without using a histogram.
Uses: it is used when two or more sets of data are illustrated on the same
diagram, such as temperature & pulse, birth & death rates, etc.
[Figure: frequency polygon with two series (Series1, Series2) plotted over x = 1 to 7; y-axis from 0 to 350]
SCATTER DIAGRAMS
It is prepared after a tabulation in which the frequencies of two
variables have been cross-classified.
It is a graphic representation of the correlation between two variables.
[Figure: scatter plot of one series (Series1); x-axis from 0 to 15, y-axis from 0 to 700]
LINE DIAGRAMS
It is used to show the trends of events with the passage of time.
E.g., rising & falling
[Figure: line graph with two series (Series1, Series2) plotted over x = 1 to 7; y-axis from 0 to 700]
LINE & BAR
[Figure: combined line & bar chart with two series (Series1, Series2) plotted over x = 1 to 7; one y-axis from 0 to 14, the other from 0 to 700]
MEASURES OF CENTRAL
TENDENCY
Mode - value that occurs most frequently
Median - point below and above which 50% of cases fall
Mean - mathematical average (sum of scores divided by the total no.
of scores)
Level of measurement plays a role in which central tendency
measure you use
Mean - interval & ratio
Mode - nominal
Median - ordinal
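The three measures can be computed with Python's standard statistics module. This is only a sketch; the scores are hypothetical.

```python
import statistics

# Hypothetical scores, chosen only for illustration
scores = [2, 3, 3, 5, 7]

mode = statistics.mode(scores)      # value that occurs most frequently
median = statistics.median(scores)  # middle value of the sorted scores
mean = statistics.mean(scores)      # sum of scores / number of scores
print(mode, median, mean)
```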
VARIABILITY / CENTRAL
DISPERSION
Extent to which scores deviate from each other
Homogenous
Heterogeneous
Range - highest score minus lowest score
Distance between high & low scores
Standard Deviation (SD)
Deviation - difference between an individual score and the mean
Weight of person A = 150 lbs
Mean = 140
Deviation = +10
SD (roughly the average deviation from the mean)
Formula: SD = √( ∑(x - mean)² / (n - 1) ) for a sample of n scores
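A short Python sketch of range, deviation and SD follows. The five weights are hypothetical, picked so that person A's 150 lbs and the mean of 140 match the example above. The SD uses the usual sample form: the square root of the sum of squared deviations divided by (n - 1).

```python
import math

# Hypothetical weights in lbs; 150 lbs (person A) and the mean of 140 match the example above
weights = [130, 135, 140, 145, 150]

n = len(weights)
mean = sum(weights) / n
value_range = max(weights) - min(weights)  # highest score minus lowest score

# Deviation of each score from the mean; person A's deviation is +10
deviations = [w - mean for w in weights]

# Sample SD: square root of the sum of squared deviations divided by (n - 1)
sd = math.sqrt(sum(d ** 2 for d in deviations) / (n - 1))
print(value_range, deviations, round(sd, 2))
```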
BIVARIATE STATISTICS
Associations between 2 variables
Correlations
INFERENTIAL STATISTICS
Hypothesis testing
Null Ho
No actual relationship between variables
E.g., there will be no difference in grant writing ability between nurses
who attend and do not attend the research short course
Accept the null Ho
Reject the null Ho
Type I Error
Reject the null when it is actually true
Type II Error
Accepting the null when it is actually false
Level of significance
Probability of committing Type I Error
Set by the researcher
Usually set at p =.05
Lowering risk to Type I increases risk of Type II
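The link between the level of significance and Type I error can be checked by simulation. The sketch below is an assumption-laden illustration, not a method from these notes: it uses a two-sided one-sample z-test with known SD as a stand-in for any test, draws data for which the null Ho is really true, and counts how often Ho is rejected anyway. The function names and settings (2000 trials, samples of 25) are hypothetical.

```python
import random
from statistics import NormalDist


def z_test_rejects(sample, mu0, sigma, alpha):
    """Two-sided one-sample z-test with known sigma; True if Ho is rejected."""
    n = len(sample)
    z = (sum(sample) / n - mu0) / (sigma / n ** 0.5)
    critical = NormalDist().inv_cdf(1 - alpha / 2)
    return abs(z) > critical


def type_i_error_rate(alpha, trials=2000, n=25, seed=1):
    """Share of trials in which a true Ho is rejected."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(trials):
        # Ho is true here: the data really come from mean 0, SD 1
        sample = [rng.gauss(0, 1) for _ in range(n)]
        if z_test_rejects(sample, mu0=0, sigma=1, alpha=alpha):
            rejections += 1
    return rejections / trials


rate = type_i_error_rate(alpha=0.05)
print(rate)
```

With the level of significance at .05, the share of wrongly rejected nulls should come out close to 5%.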
PARAMETRIC TESTS
Involve estimation of at least one parameter
Interval level data / Ratio scale
Assume variables are normally distributed
NONPARAMETRIC TESTS
Nominal or ordinal level data
Fewer restrictions about distributions
Between subjects testing
Men versus women
Within subjects testing
Same group compared pre and post-intervention
DIFFERENCES BETWEEN 2
GROUP MEANS
Parametric
T-tests for independent groups
Paired t-Tests
Nonparametric
Mann Whitney U
Wilcoxon signed rank test
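As a sketch of the t-test for independent groups, the Python below computes the pooled-variance t statistic and its degrees of freedom. The scores are hypothetical and independent_t is an illustrative name. The p value still has to be looked up in a t table or taken from a statistics package, which is not shown here.

```python
import math
import statistics


def independent_t(group_a, group_b):
    """Pooled-variance t statistic and degrees of freedom for two independent groups."""
    n_a, n_b = len(group_a), len(group_b)
    mean_a, mean_b = statistics.mean(group_a), statistics.mean(group_b)
    var_a, var_b = statistics.variance(group_a), statistics.variance(group_b)
    df = n_a + n_b - 2
    # Pooled variance: the two sample variances weighted by their degrees of freedom
    pooled_var = ((n_a - 1) * var_a + (n_b - 1) * var_b) / df
    se = math.sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return (mean_a - mean_b) / se, df


# Hypothetical scores for two independent groups (between subjects design)
men = [12, 14, 15, 11, 13]
women = [16, 18, 17, 15, 19]
t, df = independent_t(men, women)
print(t, df)
```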
DIFFERENCES BETWEEN 3
OR MORE GROUP MEANS
Parametric
One-Way Analysis of Variance (ANOVA)
F ratio test
Post-hoc tests to see which groups differ from each other
LSD; Bonferroni
Multifactor (factorial) ANOVA
More than 2 IVs
Usually for more complex analyses
E.g., human behavior, feelings
Repeated Measures ANOVA
3 or more measures of same DV for each subject
E.g., subjects exposed to 3 or more different treatment conditions
3 or more data collection points of DV over time (longitudinal)
Nonparametric 'analysis of variance'
Kruskal-Wallis
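The F ratio of a one-way ANOVA is the between-group mean square divided by the within-group mean square. The Python sketch below works this out by hand for three hypothetical groups; one_way_anova_f is an illustrative name, and the significance of F would still be read from an F table.

```python
def one_way_anova_f(groups):
    """F ratio = between-group mean square / within-group mean square."""
    all_values = [v for g in groups for v in g]
    grand_mean = sum(all_values) / len(all_values)
    k, n = len(groups), len(all_values)

    # Variation of the group means around the grand mean
    ss_between = sum(len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups)
    # Variation of the scores around their own group mean
    ss_within = sum((v - sum(g) / len(g)) ** 2 for g in groups for v in g)

    df_between, df_within = k - 1, n - k
    f_ratio = (ss_between / df_between) / (ss_within / df_within)
    return f_ratio, df_between, df_within


# Hypothetical scores under three treatment conditions
groups = [[1, 2, 3], [2, 3, 4], [5, 6, 7]]
f_ratio, df_between, df_within = one_way_anova_f(groups)
print(f_ratio, df_between, df_within)
```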
TESTING DIFFERENCES
IN PROPORTIONS
DV is nominal level
Chi square test
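A sketch of the chi square calculation in Python, using O and E as in the list of short forms: E for each cell is (row total x column total) / N, and the statistic is the sum of (O - E)² / E. The counts are taken from the sex by type-of-anemia table earlier in these notes, assuming the boys row reads 160, 85, 15, which is the arrangement consistent with the printed totals. The function name chi_square is illustrative.

```python
def chi_square(observed):
    """Chi-square statistic and degrees of freedom for a table of observed counts."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    total = sum(row_totals)

    chi2 = 0.0
    for i, row in enumerate(observed):
        for j, o in enumerate(row):
            e = row_totals[i] * col_totals[j] / total  # expected value E of the cell
            chi2 += (o - e) ** 2 / e
    df = (len(observed) - 1) * (len(col_totals) - 1)
    return chi2, df


# Observed counts O (rows: boys, girls; columns: the three types of anemia)
observed = [[160, 85, 15],
            [190, 120, 45]]
chi2, df = chi_square(observed)
print(round(chi2, 2), df)
```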
RELATIONSHIPS
BETWEEN 2 VARIABLES
Pearson's r (interval level)
Spearman's rho or Kendall's tau (ordinal)
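Pearson's r can be computed directly from the deviations of each variable from its mean. The Python below is a minimal sketch with hypothetical paired measurements; pearson_r is an illustrative name.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical paired measurements on the same subjects
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
r = pearson_r(x, y)
print(round(r, 3))
```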
POWER ANALYSIS
The probability of obtaining a significant result when an effect really
exists is called the power of a statistical test
Insufficient power - greater risk of Type II error
4 components
Significance level - the more stringent, the lower the power
Sample size - as it increases, power increases
Population effect size (gamma, γ) - how strong the effect of the IV is on
the DV
Power (1 - β) - probability of rejecting a false null Ho
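How the components move together can be seen in a small Python sketch. It assumes a two-sided one-sample z-test with known SD, and it expresses effect size as a standardized mean difference, which is an assumption of this example rather than the gamma used above. The function name z_test_power is illustrative.

```python
from statistics import NormalDist


def z_test_power(effect_size, n, alpha=0.05):
    """Power of a two-sided one-sample z-test with known SD.

    effect_size is the standardized difference (true mean - Ho mean) / SD.
    """
    z = NormalDist()
    critical = z.inv_cdf(1 - alpha / 2)
    shift = effect_size * n ** 0.5
    # Probability that the test statistic lands in either rejection region
    return z.cdf(-critical + shift) + z.cdf(-critical - shift)


print(z_test_power(0.5, 10), z_test_power(0.5, 30))               # larger sample
print(z_test_power(0.5, 30, alpha=0.01), z_test_power(0.5, 30))   # stricter alpha
```

A larger sample raises power, a more stringent significance level lowers it, and with no effect at all the rejection probability falls back to the level of significance itself.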
MULTIVARIATE
STATISTICS
Simple linear regression
Make predictions about phenomena
R - correlation
R² - proportion of variance in Y accounted for by the combined Xs
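For the simplest case of one X, the regression line and R² can be worked out by least squares as in the Python sketch below. The data are hypothetical and simple_linear_regression is an illustrative name.

```python
def simple_linear_regression(x, y):
    """Least-squares slope, intercept and R squared for one predictor."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)

    slope = sxy / sxx
    intercept = mean_y - slope * mean_x

    # R squared: share of the variance in Y accounted for by the fitted line
    predicted = [intercept + slope * a for a in x]
    ss_res = sum((b - p) ** 2 for b, p in zip(y, predicted))
    ss_tot = sum((b - mean_y) ** 2 for b in y)
    r_squared = 1 - ss_res / ss_tot
    return slope, intercept, r_squared


# Hypothetical data: X is the predictor, Y is the outcome
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
slope, intercept, r_squared = simple_linear_regression(x, y)
print(slope, intercept, r_squared)
```

A prediction for a new X is then intercept + slope * X.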
Analysis of Covariance (ANCOVA)
Tests significance of differences between group
means after adjusting scores on DV to eliminate
effects of covariate (s)
Anxiety pre and post biofeedback therapy
One hospital = treatment
One hospital = control
Post anxiety DV; hospital condition IV
Pre anxiety scores- covariate
Discriminant Analysis
Predicts group membership
Nurses who graduate versus drop outs
Cancer patients who adhere to treatment versus those who don’t
Logistic Regression
Binomial Logistic Regression
DV is categorical (2 groups)
Odds of belonging to one group
Multinomial Logistic Regression
DV is categorical (> 2 groups)
Odds of belonging to one group