S4-Variable identification


Parameters, Variables, & Evidences
Identifying Research Question
Seeking the right variables & evidences (questions)
Quality of Data
Instrument to capture information systematically
Techniques for Analysis
Measurement of Perception
Attitude: A tendency to evaluate a stimulus with some
degree of favor or disfavor, usually expressed in
cognitive, affective, or behavioral responses
A learned tendency of an individual
Expressed as opinion
OR
Primary Observations / Secondary Data / Archives
DATA
Primary Data
Secondary Data
SCALES
(order, distance, & origin)
Nominal Scale, Gender
Ordinal Scale, Grades – A, B, C
Interval Scale, Years - 2001-2007
Ratio, GDP
Some Basics
Descriptive statistics
(parametric)
Sample: its characteristics are called statistics
Population: its characteristics are called parameters
Raw data, source, & authenticity
Sampling & Estimation
Sampling
1. Simple random sampling: every unit has an equal probability of being picked
2. Systematic sampling: units are selected at a uniform interval
3. Stratified sampling: units are selected from each homogeneous group (stratum)
4. Cluster sampling: the population is divided into clusters and one or more clusters are chosen at random
Statistical inference is based on
simple random sampling
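The four sampling schemes can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the population of 100 numbered units, the two strata, the ten clusters, and all function names are hypothetical and do not come from the slides.

```python
import random

def simple_random_sample(population, n, rng):
    """Every unit has the same probability of being picked."""
    return rng.sample(population, n)

def systematic_sample(population, n, rng):
    """Pick a random start, then take every k-th unit (uniform interval)."""
    k = len(population) // n              # sampling interval
    start = rng.randrange(k)              # random start inside the first interval
    return [population[start + i * k] for i in range(n)]

def stratified_sample(strata, n_per_stratum, rng):
    """Draw a simple random sample from every stratum."""
    return {name: rng.sample(units, n_per_stratum) for name, units in strata.items()}

def cluster_sample(clusters, n_clusters, rng):
    """Pick whole clusters at random and keep all of their units."""
    chosen = rng.sample(sorted(clusters), n_clusters)
    return {name: clusters[name] for name in chosen}

rng = random.Random(0)                    # fixed seed so the run is repeatable
population = list(range(1, 101))          # hypothetical units numbered 1..100
strata = {"low": population[:50], "high": population[50:]}
clusters = {f"c{i}": population[i * 10:(i + 1) * 10] for i in range(10)}

srs = simple_random_sample(population, 10, rng)
sys_s = systematic_sample(population, 10, rng)
strat = stratified_sample(strata, 5, rng)
clus = cluster_sample(clusters, 2, rng)
```

The systematic sample always has a constant gap between consecutive units, while the cluster sample keeps every unit of the chosen clusters and nothing from the others.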
Sampling Distribution:
Sampling distribution of the mean:
The probability distribution of all the possible sample means is the
sampling distribution of the mean.
Sampling distribution of the median:
Sampling distribution of the proportion:
Standard error:
The standard deviation of the distribution of a sample
statistic is known as the standard error of that
statistic.
Example: the standard deviation of the distribution of sample
means is called the standard error of the mean.
The population distribution:
μ = mean of this distribution
σ = standard deviation of this distribution
The sample frequency distributions:
x1, x2, x3, x4 (one mean per sample)
The sampling distribution of the mean:
μx = mean of the sampling distribution of the means
σx = standard error of the mean
= standard deviation of the sampling distribution of the mean
Sampling from Normal Populations
Properties of the sampling distribution of the mean when
the population is normally distributed
μx = μ
σx = σ/√ n
Standard error of the mean for infinite population:
σx = σ/√ n
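The two properties μx = μ and σx = σ/√n can be checked with a small simulation. This is a sketch, not part of the original slides: the population mean of 50, standard deviation of 10, sample size of 25, and the number of repeated samples are all hypothetical values chosen for illustration.

```python
import math
import random
import statistics

rng = random.Random(42)       # fixed seed so the run is repeatable
mu, sigma = 50.0, 10.0        # hypothetical population mean and standard deviation
n = 25                        # size of each sample
num_samples = 4000            # number of repeated samples

# Draw many samples from a normal population and keep each sample mean.
sample_means = [
    statistics.fmean(rng.gauss(mu, sigma) for _ in range(n))
    for _ in range(num_samples)
]

mean_of_means = statistics.fmean(sample_means)   # should be close to mu
empirical_se = statistics.pstdev(sample_means)   # should be close to sigma / sqrt(n)
theoretical_se = sigma / math.sqrt(n)            # 10 / 5 = 2.0
```

The spread of the sample means is much narrower than the spread of the population itself, which is why a sample mean is a useful estimate of μ.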
Standard error of the mean for finite populations:
σx = (σ/√n) × √((N − n)/(N − 1))
where N is the population size, n is the sample size, and
√((N − n)/(N − 1)) is the finite population multiplier.
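As a rough sketch, both cases can be handled by one helper that applies the usual finite population multiplier √((N − n)/(N − 1)) only when a population size N is given. The function name and the example numbers are hypothetical.

```python
import math

def standard_error_of_mean(sigma, n, N=None):
    """Standard error of the mean.

    sigma: population standard deviation
    n:     sample size
    N:     population size; None means the population is treated as infinite
    """
    se = sigma / math.sqrt(n)
    if N is not None:
        # Finite population multiplier: shrinks the error as n approaches N.
        se *= math.sqrt((N - n) / (N - 1))
    return se

se_infinite = standard_error_of_mean(10, 25)         # 10 / sqrt(25)
se_finite = standard_error_of_mean(10, 25, N=100)    # smaller, because 25% of the population is sampled
se_census = standard_error_of_mean(10, 100, N=100)   # sampling everyone leaves no sampling error
```

When n is a small fraction of N the multiplier is close to 1, so the infinite-population formula is usually a reasonable approximation.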
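Standardizing the sample mean:
The standard score Z is the number of standard errors by which
the sample mean lies from the population mean; it follows the
standard normal probability distribution.
Z = (x − μ) / σx
where x is the sample mean, μ is the population mean, and σx is
the standard error of the mean.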
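A short worked example of standardizing a sample mean, using Python's `statistics.NormalDist` for the standard normal distribution. The numbers (population mean 50, standard deviation 10, sample size 25, observed sample mean 53) are hypothetical.

```python
import math
from statistics import NormalDist

mu, sigma = 50.0, 10.0     # hypothetical population mean and standard deviation
n = 25                     # sample size
x_bar = 53.0               # hypothetical observed sample mean

sigma_x = sigma / math.sqrt(n)      # standard error of the mean
z = (x_bar - mu) / sigma_x          # standard score of the sample mean

# Probability that a sample mean is at most x_bar, from the standard normal curve.
p_at_most = NormalDist().cdf(z)
```

Here the sample mean lies 1.5 standard errors above the population mean, and roughly 93% of sample means of this size would fall at or below it.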
The Central Limit Theorem:
The mean of the sampling distribution of the mean will equal
the population mean regardless of the sample size, even
if the population is not normal.
As the sample size increases, the sampling distribution of
the mean will approach normality, regardless of the shape
of the population distribution.
The significance of the central limit theorem is that it
permits us to use sample statistics to make inferences
about population parameters, without knowing anything
about the shape of the frequency distribution of that
population other than what we can get from the sample.
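The theorem can be illustrated with a clearly non-normal population. The sketch below assumes an exponential population with mean 1 (a hypothetical choice, strongly right-skewed) and compares the skewness of the sample means for a small and a larger sample size; skewness near 0 indicates a more symmetric, more normal-looking shape.

```python
import random
import statistics

def skewness(values):
    """Moment-based skewness; about 0 for a symmetric distribution."""
    m = statistics.fmean(values)
    m2 = statistics.fmean((v - m) ** 2 for v in values)
    m3 = statistics.fmean((v - m) ** 3 for v in values)
    return m3 / m2 ** 1.5

def sample_means(n, num_samples, rng):
    """Means of repeated samples of size n from an exponential population."""
    return [
        statistics.fmean(rng.expovariate(1.0) for _ in range(n))
        for _ in range(num_samples)
    ]

rng = random.Random(7)                    # fixed seed so the run is repeatable
means_small = sample_means(2, 5000, rng)  # n = 2
means_large = sample_means(50, 5000, rng) # n = 50

skew_small = skewness(means_small)        # still strongly skewed
skew_large = skewness(means_large)        # much closer to 0
center_large = statistics.fmean(means_large)  # close to the population mean of 1
```

The centre of the sampling distribution stays at the population mean for both sample sizes; only its shape changes as n grows.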
Estimation
Reason for estimates: To make statistical inferences about the
population from a sample.
Types of estimate:
Point Estimate: It is a single number that is used to estimate
an unknown population parameter.
Limitations:
Often insufficient, because a point estimate is simply right or
wrong and says nothing about how far off it may be.
Example: Total weight of students, CGPA of students in a
high school
A point estimate is more useful if it is accompanied by an
estimate of the error that might be involved.
Interval Estimate: It is a range of values used to estimate
a population parameter.
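One common way to build an interval estimate is to take the point estimate plus or minus a multiple of its standard error. The sketch below assumes a 95% level and a known population standard deviation; neither assumption comes from the slides, and the numbers are hypothetical.

```python
import math
from statistics import NormalDist

x_bar = 53.0        # hypothetical sample mean (the point estimate)
sigma = 10.0        # population standard deviation, assumed known
n = 25              # sample size
level = 0.95        # assumed confidence level

sigma_x = sigma / math.sqrt(n)                     # standard error of the mean
z = NormalDist().inv_cdf(1 - (1 - level) / 2)      # about 1.96 for a 95% level

lower = x_bar - z * sigma_x
upper = x_bar + z * sigma_x
```

The interval runs from about 49.08 to 56.92, which conveys both the estimate and the error that might be involved.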
Criteria of a good Estimator
Unbiased
A statistic is unbiased if it tends to assume values that are
above the population parameter as frequently as it assumes
values that are below the population parameter.
Efficiency
Efficiency refers to the size of the standard error of the statistic.
When comparing two statistics from samples of the same size, the
more efficient estimator is the one with the smaller standard error.
Consistency
A statistic is consistent if, as the sample size increases, it
comes closer to the value of the population parameter.
Sufficiency
An estimator is sufficient if it makes so much use of the
information in the sample that no other estimator could
extract additional information about the population
parameter from the sample.
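Efficiency and consistency can be illustrated by simulation. The sketch below assumes a standard normal population (a hypothetical choice) and compares the empirical standard errors of the sample mean and the sample median at the same sample size, then repeats the mean at a larger sample size. The helper name is hypothetical.

```python
import random
import statistics

def empirical_standard_error(statistic, n, num_samples, rng):
    """Standard deviation of a statistic over repeated samples of size n."""
    values = [
        statistic([rng.gauss(0.0, 1.0) for _ in range(n)])
        for _ in range(num_samples)
    ]
    return statistics.pstdev(values)

rng = random.Random(1)    # fixed seed so the run is repeatable

# Efficiency: same sample size, two competing estimators of the centre.
se_mean_25 = empirical_standard_error(statistics.fmean, 25, 3000, rng)
se_median_25 = empirical_standard_error(statistics.median, 25, 3000, rng)

# Consistency: same estimator, larger sample size.
se_mean_400 = empirical_standard_error(statistics.fmean, 400, 3000, rng)
```

For a normal population the mean shows the smaller standard error, so it is the more efficient of the two, and its standard error shrinks further as the sample size grows.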