Statistics and Design of Experiments: Role in Research
George A. Milliken, PhD
Department of Statistics
Kansas State University
Manhattan, Kansas
September 2000
Department of Statistics
Kansas State University
Statistics: A collection of procedures
and processes to enable researchers in
the unbiased pursuit of Knowledge
Statistics is an important part of the
Scientific Method
1. State a Hypothesis
2. Design a Study and Collect Data
3. Analyze the Data
4. Interpret the Results and Draw Conclusions
State a Hypothesis: The OBJECTIVE or
OBJECTIVES of the Study
A HYPOTHESIS OR SET OF HYPOTHESES
should state exactly what you want to DO
or LEARN or STUDY
SHOULD ANSWER
What are the factors to be studied and what
relationships are to be investigated? What
is the experimental material? Etc.?
The area of STATISTICS would not be needed if
each time you measured an experimental unit
you would obtain the same response or value
BUT, THE RESPONSES ARE NOT THE SAME
SINCE THERE IS VARIABILITY or NOISE IN THE
SYSTEM
STATISTICAL METHODS EXTRACT THE
SIGNAL FROM THE NOISE TO PROVIDE
INFORMATION
One of the Statistician’s JOBS is to make sense
from DATA in the presence of VARIABILITY or
noise by using DATA ANALYSIS TOOLS
DESIGN VS. ANALYSIS
The PURPOSE OF DATA COLLECTION is to
GAIN INFORMATION OR KNOWLEDGE!!
Collecting Data does not guarantee that
information is obtained.
INFORMATION ≠ DATA
At best:
INFORMATION = DATA + ANALYSIS
If data are collected such that they contain NO
information in the first place, then the analysis
phase cannot find it!!!
The best way to ensure that appropriate
information is contained in the collected data
is to DESIGN (plan) and Carefully Control the
DATA COLLECTION PROCESS
The measured variables must relate to the
stated OBJECTIVES of the study
If you have a good design and process for
data collection, it is quite often
straightforward to construct an analysis that
extracts all of the available information
from the data
The ROLE of a STATISTICIAN is to work with the
RESEARCH TEAM (or researcher) from the
START of the study
The MOST IMPORTANT TIME for the
statistician to become involved with a
research study is in the very BEGINNING
A STATISTICIAN CAN HELP OBTAIN THE
MAXIMUM AMOUNT OF INFORMATION FROM
AVAILABLE RESOURCES
HOW???
HELP WITH THE DESIGN OF THE EXPERIMENT
DETERMINE SAMPLE SIZE NEEDED
DEVELOP PROCESS OF COLLECTING DATA
DISCUSS VARIABLES TO BE MEASURED AND
HOW THEY RELATE TO THE OBJECTIVES OF
THE STUDY
PROVIDE METHODS OF ANALYZING THE DATA
HELP TRANSLATE STATISTICAL CONCLUSIONS
INTO SUBJECT MATTER CONCLUSIONS
THE CORE HELP FROM THE STATISTICIAN IS IN
THE DESIGN OF THE EXPERIMENT
Help with selecting conditions that relate to the
objectives of the study
Selecting the Experimental Units
Deciding when REPLICATIONS exist
Determining the ORDER in which the
experiment is to be carried out
THE DESIGN OF THE EXPERIMENT IS CRITICAL
COMPONENTS OF DESIGNED EXPERIMENTS
TREATMENT STRUCTURE:
Factors or Populations or Treatments
related to the objectives of the experiment:
Brands of Product, Types of Uses of
Product
DESIGN STRUCTURE OR EXPERIMENTAL UNITS:
Factors used in blocking the experimental
units, as well as characteristics of the
experimental units:
Washing Machine, Person Using Machine,
Products Evaluated in a Session by a Taste Panelist
Complete Designed Experiment =
Treatment Structure + Design Structure + Randomization
RANDOMIZE: a randomization plan assigns the
Treatments of the Treatment Structure (TS) to the
Experimental Units of the Design Structure (DS)
RANDOMIZATION IS THE INSURANCE POLICY
AGAINST INTRODUCING BIAS INTO THE STUDY
Selecting an appropriate Treatment Structure,
necessary Design Structure, and required
Randomization Process provides the
Statistician the information needed to
construct an appropriate model
APPROPRIATE MODEL = BEST ANALYSIS
Key to the Design of the Experiment is the
Concept of REPLICATION
REPLICATION: The independent observation
of a treatment
An Experimental Unit Provides a Replication
of the level of a Factor if the level is
randomly assigned to the Experimental
Unit and observed independently of the
other Experimental Units
Must make sure that Sub-samples are not
considered to be Replications
The Variability among Experimental Units
treated independently alike provides the
estimate of the variance (or Standard Error) to
be used as the measuring stick for comparing
the levels of treatments randomly assigned to
those Experimental Units
Between Sub-sample variance is generally
much less than between Replication variance
It is critical that the Replications are
appropriately Identified
Treatment Structure, Design Structure (with
experimental units and replication) and
Randomization describe the total Design
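The difference between Replications and Sub-samples can be sketched numerically. The numbers below are hypothetical toy values (four experimental units receiving the same treatment, each measured three times), chosen only to show how the two variances and the resulting standard errors differ; they are not data from any study.

```python
from math import sqrt
from statistics import mean, variance

# Hypothetical toy numbers: one treatment applied to 4 experimental units,
# each unit measured 3 times (sub-samples).
eu_levels = [10.0, 14.0, 12.0, 16.0]    # assumed level of each experimental unit
subsample_offsets = [-0.5, 0.0, 0.5]    # assumed scatter of sub-samples within a unit
data = [[level + off for off in subsample_offsets] for level in eu_levels]

# Variance among experimental units treated alike (the replications).
eu_means = [mean(unit) for unit in data]
between_eu_var = variance(eu_means)

# Pooled variance among sub-samples within a unit.
within_eu_var = mean([variance(unit) for unit in data])

# Standard error of the treatment mean based on the 4 replications.
se_replications = sqrt(between_eu_var / len(eu_means))

# Standard error obtained by wrongly treating all 12 sub-samples as replications.
all_obs = [y for unit in data for y in unit]
se_pseudo = sqrt(variance(all_obs) / len(all_obs))

print(f"between-unit variance {between_eu_var:.3f}, sub-sample variance {within_eu_var:.3f}")
print(f"SE from replications {se_replications:.3f}, SE from sub-samples {se_pseudo:.3f}")
```

With these toy values the sub-sample variance is much smaller than the between-unit variance, and counting sub-samples as replications produces a standard error that is too small.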
ANALYZE THE DATA:
Use the COMPLETED DESIGNED EXPERIMENT
and the data type to construct an appropriate
analysis
Use Statistical Software such as SAS, RS/1, or JMP
Use a software package that you know will provide
valid results
The Statistician will provide the STATISTICAL
interpretation of the results from the analyses –
STATISTICAL ANALYSES CONCLUSIONS
The Statistician will help the Researcher
TRANSLATE the statistical analyses
conclusions into subject matter conclusions
Discuss how the statistical analyses provide
results that relate to the STATED OBJECTIVES
of the study. The expected results should be
written along with the objectives. Results that
are not expected should be looked at carefully
Washing Machine Example:
4 brands or models -- one machine each
3 types of laundry – Whites, Wash/wear, Denim
3 persons to operate the Machines
For each person:
Randomly assign the order of Brands
For each Brand, randomly assign the order of Types
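The two-stage randomization described above can be sketched in a few lines. This is a minimal illustration, not the plan used in the original study: the brand labels A to D, the function name randomize_washing_study, and the seed value are assumptions made for the example.

```python
import random

BRANDS = ["A", "B", "C", "D"]               # assumed labels for the 4 brands
TYPES = ["White", "Wash/wear", "Denim"]     # the 3 types of laundry
PERSONS = [1, 2, 3]                         # the 3 operators

def randomize_washing_study(seed=None):
    """Return {person: [(brand, [type order]), ...]} with a fresh shuffle at each level."""
    rng = random.Random(seed)
    plan = {}
    for person in PERSONS:
        # Stage 1: random order of the brands (machines) for this person.
        brand_order = rng.sample(BRANDS, k=len(BRANDS))
        # Stage 2: within each brand, a separate random order of the types (loads).
        plan[person] = [(brand, rng.sample(TYPES, k=len(TYPES))) for brand in brand_order]
    return plan

plan = randomize_washing_study(seed=2000)   # the seed is arbitrary, for reproducibility only
for person, runs in plan.items():
    for brand, type_order in runs:
        print(f"Person {person}  Brand {brand}: {' -> '.join(type_order)}")
```

Each person gets an independent ordering of the brands, and each brand within a person gets its own ordering of the types, which is what makes Machine and Load two different sizes of experimental unit.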
[Figure: Random Order of Brands for Person 1. Each of the four Machines (Brands A, B, C, D) is shown with a Random Order of the three Types (White, W/W, Denim) within it.]
Re-Randomize for each Person
Persons are Blocks of Machines
Machines are the Experimental Units for Brands,
and the error Variance for Brands is computed
from the Person*Brand interaction
Compare BRANDS by using the variability
among Machines Treated Alike
The Machines within a Person are Blocks for
Types – Three Loads per Machine
The Loads within a machine are the
Experimental Units for Type and Brand*Type
Variability among Loads treated alike provides
the measuring stick for comparing the levels of
Type and Brand*Type
This Design Involves Persons as Blocks and
Two Sizes of Experimental Unit
Machine and Load
If you ignore that this design involves TWO
sizes of Experimental Units and there are Two
Error Terms, the resulting error term is a
combination of these two error terms
The combined error term is Too Large for
making comparisons involving Type and
Brand*Type – won’t find things that are there
The Combined error term is Too Small for
Making comparisons involving Brand – will
declare things to be different when they are
not
Statistical Conclusions can be very misleading
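The two error terms, and the combined term obtained by ignoring the two sizes of experimental unit, can be computed directly from the sums of squares. The sketch below uses simulated responses under stated assumptions (no true treatment effects, machine-to-machine standard deviation 3, load-to-load standard deviation 1, an arbitrary seed); the helper avg and all numeric settings are hypothetical.

```python
import random
from itertools import product
from statistics import mean

P, B, T = 3, 4, 3          # persons (blocks), brands (machines), types (loads)
rng = random.Random(1)     # arbitrary seed

# Simulated responses (hypothetical): machine noise is assumed larger than load noise.
machine = {(p, b): rng.gauss(0, 3) for p, b in product(range(P), range(B))}
y = {(p, b, t): 50 + machine[p, b] + rng.gauss(0, 1)
     for p, b, t in product(range(P), range(B), range(T))}

def avg(**fixed):
    """Mean of y over all cells matching the fixed indices, e.g. avg(p=0, b=2)."""
    return mean(v for (p, b, t), v in y.items()
                if all({"p": p, "b": b, "t": t}[k] == val for k, val in fixed.items()))

g = avg()
ss_total = sum((v - g) ** 2 for v in y.values())
ss_person = B * T * sum((avg(p=p) - g) ** 2 for p in range(P))
ss_brand = P * T * sum((avg(b=b) - g) ** 2 for b in range(B))
ss_type = P * B * sum((avg(t=t) - g) ** 2 for t in range(T))
ss_brand_type = P * sum((avg(b=b, t=t) - avg(b=b) - avg(t=t) + g) ** 2
                        for b, t in product(range(B), range(T)))

# Error term 1: Person*Brand, the variability among machines treated alike.
ss_machine = T * sum((avg(p=p, b=b) - avg(p=p) - avg(b=b) + g) ** 2
                     for p, b in product(range(P), range(B)))
# Error term 2: loads within machines, everything that is left over.
ss_load = ss_total - ss_person - ss_brand - ss_type - ss_brand_type - ss_machine

df_machine = (P - 1) * (B - 1)        # 6
df_load = B * (P - 1) * (T - 1)       # 16
ms_machine = ss_machine / df_machine  # measuring stick for Brand
ms_load = ss_load / df_load           # measuring stick for Type and Brand*Type
ms_pooled = (ss_machine + ss_load) / (df_machine + df_load)  # single combined error term

print(f"machine error MS {ms_machine:.2f}, load error MS {ms_load:.2f}, "
      f"combined MS {ms_pooled:.2f}")
```

The combined mean square is a weighted average of the two error mean squares, so it always lies between them: smaller than the larger one and larger than the smaller one, which is the source of the misleading conclusions described above.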
STATISTICIAN’S JOB: to figure out
how the study is being run and help
identify the type of design that is
being used, which includes
determining if more than one size
of experimental unit is involved
This is accomplished BEST when the
Statistician is involved at the Beginning of the
Study
SALSA TASTING EXPERIMENT
NINE TYPES OR BRANDS OF SALSA
A PERSON CAN TASTE ONLY THREE
SALSAS DURING THE SESSION
TWELVE PERSONS WILL BE USED IN THE
STUDY
ASSIGNMENT OF PRODUCTS TO PERSONS –
with order
Person   Order 1   Order 2   Order 3
   1        C         A         B
   2        H         I         G
   3        E         B         H
   4        G         B         F
   5        I         E         A
   6        C         G         E
   7        F         D         E
   8        A         G         D
   9        C         I         F
  10        D         H         C
  11        F         A         H
  12        B         D         I
Each Product is Tasted 4 times – there are Four
Replications of each product
Since each person tastes only Three of the
products, how do we compare the products?
Want to compare the Products as if each
Person had tasted all of the Products
The Analysis obtains predicted values for each
Product for each Person
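The properties of the assignment can be checked by simple counting. The sketch below encodes the assignment of products to persons from the table (the dictionary name ASSIGNMENT is an assumption for the example) and counts how often each product, and each pair of products, is tasted by the same person.

```python
from collections import Counter
from itertools import combinations

# Tasting order (1st, 2nd, 3rd) for each of the 12 persons, from the assignment table.
ASSIGNMENT = {
    1: "CAB", 2: "HIG", 3: "EBH", 4: "GBF", 5: "IEA", 6: "CGE",
    7: "FDE", 8: "AGD", 9: "CIF", 10: "DHC", 11: "FAH", 12: "BDI",
}

# Number of times each product is tasted (its replications).
replications = Counter(s for order in ASSIGNMENT.values() for s in order)

# Number of persons who taste each pair of products together.
pair_counts = Counter(frozenset(pair)
                      for order in ASSIGNMENT.values()
                      for pair in combinations(order, 2))

print(dict(sorted(replications.items())))       # every product appears 4 times
print(len(pair_counts), set(pair_counts.values()))  # 36 pairs, each tasted together once
```

Every one of the 36 possible pairs of products is tasted together by exactly one person, so all pairwise comparisons of products are supported equally well even though no person tastes more than three products.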
The Product Means of these Predicted Values
are the “ADJUSTED MEANS” for each Product
Called LEAST SQUARES MEANS by SAS®
The LSMEANS are the Predicted Means as if
Each of the Persons has Tasted and
evaluated all of the products
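The idea of an adjusted mean can be illustrated with a small sketch. It assumes an additive model (score = person effect + product effect), uses hypothetical noise-free scores so that the right answer is known, and fits the model by simple alternating averaging (backfitting). That fitting method is a choice made for this example only; it is not a description of how SAS computes LSMEANS.

```python
from statistics import mean

# Assignment of products to persons, from the assignment table.
ASSIGNMENT = {
    1: "CAB", 2: "HIG", 3: "EBH", 4: "GBF", 5: "IEA", 6: "CGE",
    7: "FDE", 8: "AGD", 9: "CIF", 10: "DHC", 11: "FAH", 12: "BDI",
}
PRODUCTS = "ABCDEFGHI"

# Hypothetical, noise-free scores: how generous a person is plus how good a product is.
person_effect = {p: 0.5 * p for p in ASSIGNMENT}                 # assumed values
product_effect = {s: float(i) for i, s in enumerate(PRODUCTS)}   # assumed values
scores = {(p, s): person_effect[p] + product_effect[s]
          for p, order in ASSIGNMENT.items() for s in order}

# Raw means: average only over the 4 persons who actually tasted each product.
raw_means = {s: mean(v for (p, s2), v in scores.items() if s2 == s) for s in PRODUCTS}

# Least squares fit of score = person + product by alternating updates.
pers = {p: 0.0 for p in ASSIGNMENT}
prod = {s: 0.0 for s in PRODUCTS}
for _ in range(200):
    for p, order in ASSIGNMENT.items():
        pers[p] = mean(scores[p, s] - prod[s] for s in order)
    for s in PRODUCTS:
        prod[s] = mean(v - pers[p] for (p, s2), v in scores.items() if s2 == s)

# Predicted value for every person and product, then the product mean over all 12 persons.
adjusted_means = {s: mean(pers[p] + prod[s] for p in ASSIGNMENT) for s in PRODUCTS}

for s in PRODUCTS:
    print(f"{s}: raw mean {raw_means[s]:.3f}, adjusted mean {adjusted_means[s]:.3f}")
```

In this toy setting the adjusted means differ between adjacent products by exactly the assumed product differences, while the raw means are distorted by which persons happened to taste each product.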
Sometimes characteristics of experimental
units are measured, to be used as possible
covariates
Study the effect of three types of Drugs on a
person's heart rate
Randomly Assign 12 persons to each of the
Drugs -- person is experimental unit
Dose the person with the assigned drug
and measure the heart rate after 15
minutes
Persons do not have identical heart rates
before being given the respective drug
Measure the initial heart rate – heart rate
before giving the drug
We want to compare the Drugs as if all
experimental units (persons) had the same
initial heart rate
Analysis of Covariance uses a regression model
to obtain predicted after drug heart rate values
as if all persons had initial heart rates of, say, 74
beats per minute
The Drug Means of these predicted heart rates
are used to compare the Drugs – These means of
Predicted Values are called LSMEANS
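The adjustment can be sketched with a common-slope Analysis of Covariance model. The data below are hypothetical and noise-free (an assumed slope of 0.8 and assumed drug intercepts, with 12 persons per drug), so the adjusted means are known in advance; the names and values are illustrative only.

```python
from statistics import mean

TRUE_SLOPE = 0.8                                                  # assumed common slope
DRUG_SHIFT = {"Drug 1": 10.0, "Drug 2": 14.0, "Drug 3": 12.0}     # assumed intercepts
INITIAL = {                                                       # assumed initial heart rates
    "Drug 1": list(range(60, 84, 2)),   # 12 persons
    "Drug 2": list(range(66, 90, 2)),   # 12 persons
    "Drug 3": list(range(72, 96, 2)),   # 12 persons
}
# (initial heart rate, heart rate 15 minutes after dosing) for each person.
data = {d: [(x, DRUG_SHIFT[d] + TRUE_SLOPE * x) for x in xs] for d, xs in INITIAL.items()}

x_bar = {d: mean(x for x, _ in obs) for d, obs in data.items()}
y_bar = {d: mean(y for _, y in obs) for d, obs in data.items()}

# Common slope: pooled within-drug cross products over pooled within-drug sums of squares.
sxy = sum((x - x_bar[d]) * (y - y_bar[d]) for d, obs in data.items() for x, y in obs)
sxx = sum((x - x_bar[d]) ** 2 for d, obs in data.items() for x, _ in obs)
slope = sxy / sxx

def adjusted_mean(drug, x0=74.0):
    """Predicted after-drug heart rate as if the initial heart rate were x0."""
    return y_bar[drug] + slope * (x0 - x_bar[drug])

for d in data:
    print(f"{d}: raw mean {y_bar[d]:.1f}, adjusted mean at 74 bpm {adjusted_mean(d):.1f}")
```

In this toy example Drug 3 has the highest raw mean only because its persons started with higher initial heart rates; after adjusting to a common initial heart rate of 74 beats per minute, Drug 2 has the highest mean.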
LSMEANS are adjusted means and arise in
several settings
1. Obtain treatments’ means when not all
treatments are observed the same number of
times by each person
2. Obtain treatments’ means when the
experimental units do not have identical
values of the covariates
Another Role of the Statistician is to provide
appropriate models for the analysis of the data
from a given study in order to take into account
the Design Structure and covariates to provide
estimates of the treatment effects as if all
experimental units had observed all treatments
or all experimental units had the same value of
the covariate -- provide appropriate LSMEANS
Involving the Statistician in the Beginning of
the Study will
1. Improve the chance of conducting a
successful experiment
2. Speed up the turnaround of the analyses,
since the statistician was involved with the design
3. Reduce the costs associated with the
experiment by making sure the sample size
is adequate to detect the differences of
interest
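As an illustration of the sample size question, the sketch below uses the standard normal-approximation formula for comparing two treatment means with equal variances. The function name n_per_group, the default significance level and power, and the example values are assumptions for illustration; a real study would base the calculation on its own design and variance estimates.

```python
from math import ceil
from statistics import NormalDist

def n_per_group(delta, sigma, alpha=0.05, power=0.80):
    """Approximate replications per treatment needed to detect a difference delta
    between two means (two-sided test, common standard deviation sigma)."""
    z = NormalDist().inv_cdf
    return ceil(2 * ((z(1 - alpha / 2) + z(power)) * sigma / delta) ** 2)

# Hypothetical example: detect a difference of 5 units when the standard
# deviation among experimental units treated alike is 8 units.
print(n_per_group(delta=5, sigma=8))
```

The required number of replications grows with the square of sigma/delta, so halving the detectable difference roughly quadruples the number of experimental units needed.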
THE END
THANK YOU
FOR
LISTENING