Lecture 4: Population & Sampling Designs


Population & Sampling
BY
MOAZZAM ALI MALIK
What is a Population?
 A population is the large group of people that you specify in order
to conduct the research and answer the research question.
 It is the whole group about which the research study seeks
information; its properties are analyzed to answer the research
question and to draw conclusions.
What is a Sample?
 A sample is a finite part of a statistical population
whose properties are studied to gain information
about the whole (Webster, 1985).
 Sampling is the process of obtaining information
from a subset (sample) of a larger group (population).
 The size of the sample depends upon the
requirements or the scope of the research.
Why Choose a Sample?
 The very large populations
 The economic advantage
 The time factor
 The partly accessible populations
 The destructive nature of some observations (measuring a unit may
destroy it)
When might you sample the entire population?
 When your population is very small
 When you have extensive resources
 When you don’t expect a very high response
Characteristics of a Good Sample
Good sampling design should:
• Relate to the objectives of the investigation;
• Be practical and achievable;
• Be cost-effective in terms of equipment and labour;
• Provide estimates of population parameters that are truly
representative and unbiased.
Ideally, representative samples should be:
• Taken at random so that every member of the population of data
has an equal chance of selection;
• Large enough to give sufficient precision;
• Unbiased by the sampling procedure or equipment.
Factors that influence sample representativeness
 Sampling procedure
 Sample size
 Participation (response)
Sampling Terminology
The Population/Universe
• The class, the families living in the city or the electorate
from which you select your sample is called the
population or study population; its size is
usually denoted by the letter N
Sampling Design
• The way you select students, families or electors is called
the sampling design or sampling strategy.
Sampling elements/Units/Cases/Respondents
• The unit about which information is collected
• Typically the elements are people
• However, schools, universities, corporations, etc. could also be
elements.
Sampling Frame
• The actual list of sampling units (or elements), e.g. if you want to
study “Students at the University of Gujrat/Lahore”, the frame is the
list of those students (but there are a number of definitional
issues to be resolved here)
Sample/Study Population
• Almost impossible to guarantee that every element meeting your
definition of “the population” has a chance to be selected into the
sample.
• Thus the “study population” will be somewhat smaller than “the
population”
• A subset of a population selected to estimate the behaviour or
characteristics of the population.
Sample Statistics
• Your findings based on the information obtained from your
respondents (sample) are called sample statistics.
Population Parameters
• The population values that you estimate from sample statistics are
called population parameters (for example, the population mean).
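To make the distinction between a sample statistic and a population parameter concrete, here is a minimal Python sketch. The population of 10,000 scores is simulated (hypothetical data, not from the lecture), and the sample size of 400 is only an illustration. In real research the parameter is unknown; it is computed here only because the whole population is simulated.

```python
import random
import statistics

rng = random.Random(0)  # fixed seed so the sketch is repeatable

# Hypothetical population: 10,000 simulated test scores
population = [rng.gauss(50, 10) for _ in range(10_000)]
parameter = statistics.mean(population)   # population mean (normally unknown)

# Draw a sample and compute the corresponding sample statistic
sample = rng.sample(population, 400)
statistic = statistics.mean(sample)       # sample mean (what we observe)

# The gap between the two is the sampling error for this particular sample
sampling_error = statistic - parameter
print(f"parameter={parameter:.2f} statistic={statistic:.2f} "
      f"error={sampling_error:+.2f}")
```

Drawing a different sample (a different seed) gives a different statistic, while the parameter stays fixed.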
Sampling Errors
 Errors that occur when we use a statistic based on a sample to predict
the value of a population parameter
 Example: different polling agencies report an approval rating of
63-68%, while the true population value (unknown to them) is 66%
 Random sampling: ±3% (margin of error)
Response Bias
 Due to the way a question is asked or worded
 The order of questions
 Incorrect response (characteristics of interviewees
(race); lying)
Nonresponse Bias: missing data
 Unreachable; refuse to participate; fail to answer questions
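The ±3% margin of error mentioned under Sampling Errors can be related to sample size with the usual normal-approximation formula for a proportion. The sketch below assumes a simple random sample and a 95% confidence level (z ≈ 1.96); the function name and the sample size of 1,000 are illustrative assumptions, not figures from the lecture.

```python
import math

def margin_of_error(p, n, z=1.96):
    """Approximate margin of error for a proportion p estimated
    from a simple random sample of size n (normal approximation)."""
    return z * math.sqrt(p * (1 - p) / n)

# Hypothetical poll: 66% approval among 1,000 respondents
moe = margin_of_error(0.66, 1000)
print(f"66% +/- {moe * 100:.1f} percentage points")
```

With roughly 1,000 respondents the margin comes out close to three percentage points, which is why polls of about that size are so common.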
Sampling Designs
Two basic sampling strategies are available:
 Probability sampling
 Non-probability Sampling
Probability Sampling
In probability sampling, each member of the population has a
known, non-zero probability of being selected into the sample
Types of Probability Sampling
 Random
 Stratified Random
 Systematic
 Cluster
 Multistage Sampling
Random Sampling
 Population members are selected directly from the
sampling frame
 Equal probability of selection for every member (sample
size/population size)
 400/10,000 = .04
 Use random number table or random number generator
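As an illustration of the random number generator option, here is a minimal Python sketch of simple random sampling. The sampling frame of unit IDs 1 to 10,000 is hypothetical and chosen to match the 400/10,000 example; the function name is likewise an assumption.

```python
import random

def simple_random_sample(frame, n, seed=None):
    """Draw n units without replacement; every unit in the frame
    has the same chance (n / len(frame)) of being selected."""
    rng = random.Random(seed)
    return rng.sample(frame, n)

frame = list(range(1, 10001))          # hypothetical frame: IDs 1..10,000
sample = simple_random_sample(frame, 400, seed=42)

print(len(sample))                     # number of units drawn
print(len(sample) / len(frame))        # selection probability per unit
```

Fixing the seed is only for repeatability of the example; in a real study the draw should not be re-run until a "nice" sample appears.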
Systematic Sampling
 Order all units in the sampling frame based on some
variable and number them from 1 to N
 Compute the sampling interval k = N/n (population size divided by
sample size), choose a random starting place from 1 to k, and then
sample every kth unit after that
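A minimal Python sketch of systematic sampling follows. It assumes the common convention that the interval is k = N/n (rounded down) and that the random start falls within the first k units of the ordered frame; the frame of 10,000 IDs and the function name are hypothetical.

```python
import random

def systematic_sample(frame, n, seed=None):
    """Take every k-th unit from an ordered frame after a random start."""
    N = len(frame)
    if not 0 < n <= N:
        raise ValueError("sample size must be between 1 and len(frame)")
    k = N // n                      # sampling interval
    rng = random.Random(seed)
    start = rng.randrange(k)        # random start within the first interval (0-based)
    return [frame[start + i * k] for i in range(n)]

frame = list(range(1, 10001))       # hypothetical ordered frame: IDs 1..10,000
sample = systematic_sample(frame, 400, seed=7)
print(sample[:5])                   # first five selected IDs, spaced k = 25 apart
```

Only one random number is needed, which is what makes the method convenient in the field; the price is that any periodic pattern in the frame ordering can bias the sample.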
Stratified Sampling
 The population contains a number of distinct
categories which are organized into segments, or
strata
– equalizing "important" variables
• year in school, geographic area, product use, etc.
 Steps:
– Population is divided into mutually exclusive and
exhaustive strata based on an appropriate population
characteristic. (e.g. race, age, gender etc.)
– Simple random samples are then drawn from each
stratum.
 The sample size is usually proportional to the relative size
of the strata.
 Ensures that particular groups (e.g. males and females)
within a population are adequately represented in the
sample
 Usually has a smaller sampling error than a simple random sample
of the same size, since variation between strata is removed as a source of error
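The stratified procedure with proportional allocation can be sketched in Python as follows. The population of 6,000 female and 4,000 male units is hypothetical, as are the function and parameter names. Allocation uses simple rounding, so with awkward stratum sizes the total may differ from the requested n by a unit or two.

```python
import random
from collections import Counter, defaultdict

def stratified_sample(units, stratum_of, n, seed=None):
    """Proportional stratified sampling: split units into strata,
    then draw a simple random sample from each stratum."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    for unit in units:
        strata[stratum_of(unit)].append(unit)
    sample = []
    for members in strata.values():
        # Stratum sample size proportional to the stratum's share of the population
        n_h = round(n * len(members) / len(units))
        sample.extend(rng.sample(members, n_h))
    return sample

# Hypothetical population: (id, gender) pairs, 60% female and 40% male
units = [(i, "female" if i < 6000 else "male") for i in range(10000)]
sample = stratified_sample(units, stratum_of=lambda u: u[1], n=400, seed=1)
counts = Counter(gender for _, gender in sample)
print(counts)
```

The sample reproduces the 60/40 split exactly, which a simple random sample of 400 would only do approximately.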
Cluster Sampling
 The population is divided into mutually exclusive and
exhaustive subgroups, or clusters, usually based on
geography or time period
 Each cluster should be representative of the population, i.e.
be heterogeneous within itself.
 Means should be similar from cluster to cluster (homogeneous
between clusters).
 Then a sample of the clusters is selected.
 Then some randomly chosen units in the selected clusters
are studied.
Procedure
• Divide population into clusters (usually along geographic
boundaries)
• Sample clusters randomly
• Measure units within sampled clusters
Two types of cluster sampling methods:
One-stage sampling. All of the elements within selected
clusters are included in the sample.
Two-stage sampling. A subset of elements within selected
clusters is randomly selected for inclusion in the sample.
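The difference between one-stage and two-stage cluster sampling can be sketched in Python. The 20 villages of 50 households each are hypothetical, and the function and parameter names are assumptions made for illustration.

```python
import random

def cluster_sample(clusters, n_clusters, n_per_cluster=None, seed=None):
    """clusters maps a cluster name to its list of units.
    n_per_cluster=None -> one-stage: keep every unit in the chosen clusters.
    n_per_cluster=m    -> two-stage: randomly keep m units per chosen cluster."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), n_clusters)   # stage 1: pick clusters
    sample = []
    for name in chosen:
        members = clusters[name]
        if n_per_cluster is None:
            sample.extend(members)
        else:                                            # stage 2: pick units
            m = min(n_per_cluster, len(members))
            sample.extend(rng.sample(members, m))
    return sample

# Hypothetical frame: 20 villages with 50 households each
clusters = {f"village{v:02d}": [f"v{v:02d}-hh{h:02d}" for h in range(50)]
            for v in range(20)}

one_stage = cluster_sample(clusters, n_clusters=4, seed=3)
two_stage = cluster_sample(clusters, n_clusters=4, n_per_cluster=10, seed=3)
print(len(one_stage), len(two_stage))
```

Both designs need a full list of units only for the clusters actually chosen, not for the whole population.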
Multistage Sampling
 A complex form of cluster sampling in which two or more levels of
units are embedded one in the other. This technique is essentially the
process of taking random samples of preceding random samples.
 Not as precise as simple random sampling, but it solves many of
the practical problems inherent to random sampling. It is an effective
strategy because it relies on randomization at every stage, which makes
it extremely useful.
Example
 First stage: a random sample of districts is chosen in every
state.
 Second stage: a random sample of villages is chosen within the
selected districts.
 Third stage: the units are houses within the selected villages.
 All ultimate units (houses, for instance) selected at the last step are
surveyed.
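The states, districts, villages, houses example can be sketched as nested random draws in Python. Every name and count below is hypothetical (2 states, 5 districts per state, 6 villages per district, 10 houses per village), and the sketch assumes houses are themselves sampled at the third stage.

```python
import random

def multistage_sample(states, n_districts, n_villages, n_houses, seed=None):
    """Three-stage sample: districts within every state, villages within
    chosen districts, houses within chosen villages."""
    rng = random.Random(seed)
    surveyed = []
    for districts in states.values():
        for d in rng.sample(sorted(districts), n_districts):       # stage 1
            villages = districts[d]
            for v in rng.sample(sorted(villages), n_villages):     # stage 2
                surveyed.extend(rng.sample(villages[v], n_houses)) # stage 3
    return surveyed

# Hypothetical nested frame: state -> district -> village -> list of houses
states = {
    f"state{s}": {
        f"s{s}-district{d}": {
            f"s{s}-d{d}-village{v}": [f"s{s}-d{d}-v{v}-house{h}" for h in range(10)]
            for v in range(6)
        }
        for d in range(5)
    }
    for s in range(2)
}

houses = multistage_sample(states, n_districts=2, n_villages=3, n_houses=4, seed=11)
print(len(houses))   # 2 states x 2 districts x 3 villages x 4 houses
```

A list of houses is needed only for the 12 villages that survive the first two stages, not for all 60.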
Non-probability Sampling
Members are selected not according to the logic of probability (or
mathematical rules), but by other means (e.g. convenience
or access)
Types of Non-Probability Sampling
 Convenience sampling
 Purposive/Judgment sampling
 Snowball sampling
 Quota sampling
Convenience Sampling
• A researcher's convenience forms the basis for selecting a
sample.
• Sometimes known as grab or opportunity sampling or
accidental or haphazard sampling.
• A type of nonprobability sampling in which the sample is
drawn from the part of the population that is close to
hand, that is, readily available and convenient.
Examples
 People in my classes
 People with some specific characteristic (e.g. tall)
 People living in cities
Purposive/Judgment Sampling
 Select the sample on the basis of knowledge of the
population: your own knowledge, or use expert judges to
identify candidates to select
 Typically used for very rare populations, such as deviant
cases.
Snowball Sampling
 Typically used in qualitative research
 Used when members of a population are difficult to
locate, e.g. covert sub-populations or non-cooperative
groups
 Recruit one respondent, who identifies others, who
identify others, and so on
 Primarily used for exploratory purposes
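The recruit-and-refer chain can be sketched in Python as a walk through a referral network. The `referrals` dictionary and the respondent labels are hypothetical; in practice the network is discovered only as each respondent names further contacts.

```python
from collections import deque

def snowball_sample(first_respondent, referrals, max_size):
    """Start from one respondent and follow referrals, wave by wave,
    until max_size people are recruited or the referrals run out."""
    sample = [first_respondent]
    seen = {first_respondent}
    queue = deque([first_respondent])
    while queue and len(sample) < max_size:
        person = queue.popleft()
        for contact in referrals.get(person, []):
            if contact not in seen and len(sample) < max_size:
                seen.add(contact)
                sample.append(contact)
                queue.append(contact)   # this contact will be asked for referrals too
    return sample

# Hypothetical referral network: who names whom
referrals = {"A": ["B", "C"], "B": ["D"], "C": ["D", "E"], "D": [], "E": ["F"]}
recruited = snowball_sample("A", referrals, max_size=5)
print(recruited)   # ['A', 'B', 'C', 'D', 'E']
```

Everyone recruited is connected to the first respondent, which is one reason the method suits exploratory work rather than population estimates.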
Quota Sampling
• A stratified convenience sampling strategy. The population is first
segmented into mutually exclusive sub-groups, just as in stratified
sampling. It begins with a table that describes the characteristics of the
target population
– e.g. the composition of postgraduate students at UOG/UOL in terms
of faculty, race, and gender
 Then select, on a convenience basis, postgraduate students in the same
proportions regarding faculty, race, and gender as in the population
 In quota sampling the selection of the sample is non-random.
For example, interviewers might be tempted to interview those who look
most helpful. The problem is that these samples may be biased because
not everyone gets a chance of selection. This non-random element is its
greatest weakness, and quota versus probability sampling has been a
matter of controversy for many years.
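A minimal Python sketch of quota sampling follows. The quotas (6 female, 4 male) and the stream of students are hypothetical; the point is that people are taken in whatever order the interviewer happens to meet them, with no random draw, until each quota is full.

```python
from collections import Counter

def quota_sample(encountered, group_of, quotas):
    """Accept people in the order they are encountered (convenience)
    until the quota for each group has been filled."""
    remaining = dict(quotas)
    sample = []
    for person in encountered:
        group = group_of(person)
        if remaining.get(group, 0) > 0:
            sample.append(person)
            remaining[group] -= 1
        if not any(remaining.values()):   # all quotas filled
            break
    return sample

# Hypothetical stream of students met on campus: (id, gender)
encountered = [(i, "male" if i % 3 == 0 else "female") for i in range(100)]
quotas = {"female": 6, "male": 4}         # mirrors an assumed 60/40 population

sample = quota_sample(encountered, group_of=lambda p: p[1], quotas=quotas)
counts = Counter(gender for _, gender in sample)
print(counts)
```

The proportions match the population table, but the individuals are simply the first ones met, so anyone the interviewer never encounters has no chance of selection.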
Thank You