Statistics for teachers


Session 7
Introduction to Research
and Evaluation
Topic 1: Research Questions and
Hypothesis Testing
And
Topic 2: Introduction to Statistics
For Tonight
Today
 Review the contents of the proposal
 Topics tonight
– Finish the research questions
– Types of data review
– Hypothesis Testing
– Intro to Stats
The phases of a research project
 Problem statement
 Purpose
 Hypothesis development / research
question(s)
 Population / Sample type
 Results reporting (data)
 Statistical testing
 Conclusions / Recommendations
Parts of the Research Report
 Chapter 1
 Chapter 2
 Chapter 3
 Chapter 4
 Chapter 5
 References
 Appendix
Components of Chapter 1
 Introduction
 Background of the study
 Problem statement
 Significance of study
 Overview of methodology
 Delimitations of study
 Definitions of key terms
 Conclusion (optional)
Characteristics of
Components in Chapter 1
 Introduction – 1 paragraph – 3 pages
– Gets attention - gradually
– Brief vs. reflective opening
 Background – 2-5 pages
– History of problem, etc.
– Professional vs. practical use
– Be careful of personal intrusions
Characteristics of Components in
Chapter 1
 Problem Statement – ½ page
– States problem as clearly as possible
 Significance of study – 1 pgh. to 1 page
– Answers: “Why did you bother to conduct the
study?”
– Be careful of promising too much
Ways to Convey
Significance
 Problem has intrinsic importance, affecting
organizations or people
 Previous studies have produced mixed
results
 Your study examines problem in different
setting
 Meaningful results can be used by
practitioners
 Unique population
 Different methods used
Characteristics of
Components
 Delimitations – as needed
– Not flaws
– Establishes the boundaries – can study be
generalized?
– Consider:
 sample
 Setting
 time period
 methods
Stating the Problem
Developing a hypothesis:
– Methods: estimation and hypothesis testing.
 Estimation: the sample is used to estimate a
parameter, and a confidence interval about the
estimate is constructed.
– Parameter: a numerical quantity measuring some
aspect of a population
– Confidence Interval: a range of values that contains the
parameter a high proportion of the time
 Hypothesis Testing: the most common use
– Hypothesis: an intelligent guess or assumption that guides
the design of the study
– Null hypothesis: there is no difference or there is no effect
– Alternative hypothesis: there is a difference or there is an
effect
– Hypotheses: more than one hypothesis, each of which is related to the
population
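As a rough illustration of estimation, the sketch below builds an approximate 95% confidence interval around a sample mean. The scores are made up, the function name is hypothetical, and the normal approximation (z = 1.96) is an assumption; with small samples a t critical value would give a wider interval.

```python
import math
import statistics

def mean_confidence_interval(sample, z=1.96):
    """Return (low, high) for an approximate 95% confidence interval of the mean.

    Assumes the normal approximation (z = 1.96); small samples would
    normally call for a t critical value instead.
    """
    n = len(sample)
    mean = statistics.mean(sample)
    # Standard error: sample standard deviation divided by sqrt(n)
    standard_error = statistics.stdev(sample) / math.sqrt(n)
    margin = z * standard_error
    return mean - margin, mean + margin

scores = [2, 4, 4, 4, 5, 5, 7, 9]  # made-up data for illustration
low, high = mean_confidence_interval(scores)
print(f"Estimate: {statistics.mean(scores)}, 95% CI: ({low:.2f}, {high:.2f})")
```

The sample mean is the estimate of the parameter; the interval expresses how far off that estimate might plausibly be.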
TYPES OF DATA
Variables
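 Two categories:
 Independent
– Variables that the researcher manipulates or uses to
define groups; they are presumed to influence the outcome.
 Manipulated: hours of sleep, amount of food, time given to
complete an activity, curriculum, instructional
method, etc.
 Attribute variables, which cannot easily be manipulated
without changing the participants: age, gender, year,
classroom teacher, any personal background data, etc.
 Dependent
– Variables that are measured as the outcome and are
expected to change in response to the independent variable
 EOC exam scores, time it takes a child to respond to a
question, etc.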
Variables
 A variable: any measured characteristic or attribute that differs for
different subjects.
 Two types:
– Quantitative:
 measured on one of three scales:
– Ordinal: first, second, or third choice (most of the children
preferred red popsicles, and grape was the second choice)
– Interval: equal distances between points on the scale, but no
true zero (example: temperature in degrees)
– Ratio: equal distances between points on the scale and a true
zero, so one value can be compared as a multiple of another
(examples: the time it takes a child to respond to a question;
the number of times a black card is pulled in comparison to the
number of times a red card is pulled)
– Qualitative: sometimes called "categorical variables."
 measured on a nominal scale.
Types of Data
 Nominal Data -- Data that describe the presence or absence of some
characteristic or attribute; data that name a characteristic without any
regard to the value of the characteristic; also referred to as categorical
data. Examples: Male = 1, Female = 2; blue, green, etc.
 Ordinal Data -- Measurement based on the rank order of concepts or
variables; differences among ranks need not be equal.
 Interval Data -- Measurement based on numerical scores or values in
which the distance between any two adjacent, or contiguous, data
points is equal; scale without a meaningful or true zero.
 Ratio Data -- Order and magnitude; measurement for which
intervals between data points are equal; a true zero exists; if the score
is zero, there is a complete absence of the variable.
Nominal level of
measurement
 Assigns a number to represent a
group (gender; geography)
 Numbers represent qualitative
differences (good-bad)
 No order to numbers
 Statistics -- mode, percentages,
chi-square
Ordinal level of
measurement
 Things are rank-ordered -- >, <
 Numbers are not assigned arbitrarily
 Assume a continuum
 Examples -- classification (fr, soph,
jr, sr), levels of education, Likert
scales
 Statistics -- median (preferred), mode,
percentage, percentile rank, chi-square, rank correlation.
Interval level of
measurement
 Equal units of measurement
 Arbitrary zero point--does not indicate
absence of the property
 Example -- degrees, Likert-type
scales (treatment), numerical grades
 Statistics -- frequencies, percentages,
mode, mean, SD, t test, F test,
product moment correlation
Ratio level of measurement
 Absolute zero
 Interval scale
 Examples -- distance, weight
 Statistics -- all statistical
determinations
Which are these?
 Never married
 Lower middle Class
 Divorced
 Age
 Separated
 Middle class
 Widowed
 Weight
 Religious Affiliations
 Height
 Political Affiliations
 Distance
 freshmen
Which are these?
 Never married N
 Lower middle Class O
 Divorced N
 Age I/R
 Separated N
 Middle class O
 Widowed N
 Weight I/R
 Religious Affiliations N
 Height I/R
 Political Affiliations N
 Distance I/R
 freshmen O
 Minutes I/R
Key Point
 Statistical significance must be
distinguished from practical significance
– Even a small difference might be statistically
significant if the sample is large
– Note: a p-value of .0001 means that, if there is no real
difference between groups, a difference at least as large as
the one observed will occur by chance 1 in 10,000 times
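To see why sample size matters here, the sketch below computes a two-sided p-value for the same 1-point difference with two different sample sizes. All numbers and function names are hypothetical, and the calculation assumes a z approximation with equal group sizes and a known, shared standard deviation.

```python
import math

def two_sided_p_from_z(z):
    """Two-sided p-value for a z statistic under the standard normal curve."""
    return 1 - math.erf(abs(z) / math.sqrt(2))

def p_value_for_mean_difference(difference, sd, n_per_group):
    """p-value for a difference between two group means (z approximation).

    Assumes both groups have the same known standard deviation and size.
    """
    standard_error = sd * math.sqrt(2 / n_per_group)
    return two_sided_p_from_z(difference / standard_error)

# Hypothetical numbers: a 1-point difference on a test with SD = 15
p_small_sample = p_value_for_mean_difference(1, 15, 50)
p_large_sample = p_value_for_mean_difference(1, 15, 5000)
print(f"n = 50 per group:   p = {p_small_sample:.4f}")
print(f"n = 5000 per group: p = {p_large_sample:.4f}")
```

The difference is the same 1 point in both cases; only the larger sample makes it statistically significant, which says nothing about whether 1 point matters in practice.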
Example Hypothesis
 There will be no significant difference in the
EOC scores for schools that use CAERT
and those that don’t.
 The EOC exam scores for schools using
Caert and those that don’t will not be
significantly different.
 The EOC exam scores for schools using
Caert and those that don’t will be
significantly different.
Statistics for Teachers
Statistics
“If you can assign a number to it,
you can measure it”
Dr. W. Edwards Deming
 Statistics
– refers to calculated quantities regardless of whether or
not they are from a sample
– A statistic is defined as a numerical quantity
– The term is often used more broadly to refer to a range of
techniques and procedures for analyzing data, interpreting data,
displaying data, and making decisions based on data,
because those are the basic learning outcomes of a
statistics course.
What are the mean, median,
and mode in this
example?
Descriptive statistics
 Descriptive statistics
– summarize a collection of data in a clear and understandable way.
 Example: Scores of 500 children on all parts of a standardized test.
 Methods: numerical and graphical.
– Numerical: more precise; uses numbers as accurate measures
 mean: the arithmetic average, which is calculated by adding
all the scores or totals and then dividing by the number of
scores.
 standard deviation: a measure of how much the scores differ
from the mean. In a study of shyness, for example, the mean and
standard deviation convey information about the average degree
of shyness and the degree to which people differ in shyness.
– Graphical: better for identifying patterns
 stem and leaf display: a graphical method that splits each
value into a stem and a leaf to show how the data are distributed
 box plot: a graphical method that shows the spread of the
data. The box stretches from the 25th percentile to the
75th percentile.
 histograms: bars show how often values fall in each range
 Since the numerical and graphical approaches complement each
other, it is wise to use both.
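The numerical summaries above can be computed with Python's standard `statistics` module. This is a minimal sketch; the five scores are made up for illustration.

```python
import statistics

scores = [60, 70, 70, 80, 95]  # made-up test scores

mean = statistics.mean(scores)      # arithmetic average
median = statistics.median(scores)  # middle value when sorted
mode = statistics.mode(scores)      # most frequent value
sd = statistics.stdev(scores)       # sample standard deviation

print(f"mean={mean}, median={median}, mode={mode}, sd={sd:.2f}")
```

Here the single high score of 95 pulls the mean above the median, which is one quick numerical hint that the data are skewed.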
Inferential statistics
For choosing a statistical test
variables fall into 2 groups
 Continuous variables are numeric values that can
be ordered sequentially, and that do not naturally
fall into discrete ranges.
– Examples include: weight, number of seconds it takes
to perform a task, number of words on a user interface
 Categorical variable values cannot be sequentially
ordered or differentiated from each other using a
mathematical method.
– Examples include: gender, ethnicity, software user
interfaces
Tools for Measuring
 Measurement is the assignment of numbers to objects or
events in a systematic fashion.
– Four levels:
 Nominal: assigning items to groups or categories
– Examples: Classroom, color, size
 Ordinal: ordered in the sense that higher numbers represent higher
values
– Examples: 1 = freshman, 2 = sophomore
 Interval: one unit on the scale represents the same magnitude on the
trait or characteristic being measured across the whole range of the
scale.
– Interval scales do not have a "true" zero point,
 so it is not possible to make statements about how many times higher
one score is than another.
 Ratio: one unit on the scale represents the same magnitude on the trait
or characteristic being measured across the whole range of the scale.
– Ratio scales DO have true zero points
Data Analysis
 Explaining and interpreting the data:
– Data are plural
 You are looking at more than one number or group of numbers;
subject-verb agreement is important when writing.
 Central Tendency: measures of the location of the middle or the center
of the whole data set for a variable or group of variables
– Frequency: the number of times a value appears
– Mean: the arithmetic average
– Mode: the value that appears most often
– Median: the value in the middle when the values are arranged in
order
– Skew: A distribution is skewed if one of its tails is longer than the
other. Data may be skewed positively or negatively.
 Standard deviation (sigma): a measure of how far the values spread
from the mean; the square root of the variance
Inferential statistics
 Inferential statistics
– Infers or implies something about a population from a
sample.
 Population: A total group
 Sample: A few from the whole group
 Representative sample: a sample that is
proportionate to the population
 Random Sample: a sample that is chosen strictly by
chance and is not “hand-picked”
– Probability: the chance, expressed as a proportion or
percentage, that an event will occur
Parameters vs Statistics
Parametric vs Non-Parametric
 Definitions again
– A parameter is the true value in the population of
interest (everyone)
– A statistic is a number you calculate from your
sample data in order to estimate the parameter
 Example:
– Population: all the Ag Teachers of the state
– Sample: only 25 teachers selected from the 285 that
exist
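A simple random sample like the one in this example (25 of 285 teachers) could be drawn as sketched below. The roster of ID numbers is hypothetical, and the fixed seed is only there so the draw can be repeated.

```python
import random

# Hypothetical roster: ID numbers standing in for the 285 teachers
population = list(range(1, 286))

rng = random.Random(7)                # fixed seed so the draw is repeatable
sample = rng.sample(population, 25)   # every teacher has the same chance

print(sorted(sample))
```

Because `sample` draws without replacement, no teacher can be selected twice.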
What can make the sample different
from the true value/result of the whole?
 Students taught by teachers using Caert will
score higher on end of course exams than
those who do not.
– True difference: one group actually has a
higher capacity to learn.
– Random variation: the two populations have
identical means, and the observed difference is
a coincidence of sampling.
– Sampling error (bias): poorly selected samples
that do not represent the population.
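Random variation can be seen in a small simulation: two samples drawn from the same population still show different means. This is only a sketch; the population mean of 70, standard deviation of 10, and sample size of 25 are assumptions chosen for illustration.

```python
import random
import statistics

def sample_mean_difference(rng, n=25, mean=70, sd=10):
    """Draw two samples from the SAME population and return the difference in means."""
    group_a = [rng.gauss(mean, sd) for _ in range(n)]
    group_b = [rng.gauss(mean, sd) for _ in range(n)]
    return statistics.mean(group_a) - statistics.mean(group_b)

rng = random.Random(1)  # fixed seed so the run is repeatable
differences = [sample_mean_difference(rng) for _ in range(2000)]

average_difference = statistics.mean(differences)
typical_gap = statistics.mean([abs(d) for d in differences])
print(f"average difference: {average_difference:.2f}")
print(f"typical size of the gap: {typical_gap:.2f}")
```

Across many repetitions the differences average out near zero, yet any single pair of samples typically differs by a couple of points purely by chance.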
Parameters or Parametric Data
 Parameter: a numerical
quantity measuring some
aspect of a population of
scores.
– Parameters are usually
estimated by statistics
computed in samples
 Greek letters are
commonly accepted for
writing formulas
 Statistical symbols are
most common in
reporting actual data
analysis in reports or
articles.
Greek letters are used to designate
parameters
Quantity              Parameter   Statistic
Mean                  μ           M
Standard deviation    σ           s
Proportion            π           p
Correlation           ρ           r
Stats tests & types of data each use
 1 Sample t-test
– 1 Continuous Dependent Variable with normal distribution
– 0 Independent Variables
 1 Sample Median
– 1 Continuous Dependent Variable with non-normal distribution
– 0 Independent Variables
 Binomial test
– 1 Bi-level Categorical Dependent Variable
– 0 Independent Variables
 Chi-Square Goodness of Fit
– 1 Categorical Dependent Variable
– 0 Independent Variables
 2 Independent Sample t-test
– 1 Continuous Dependent Variable with normal distribution
– 1 (2-level) Categorical Independent Variable
 Wilcoxon Signed Ranks Test
– 1 Continuous Dependent Variable with non-normal distribution
– 1 (2-level) Categorical Independent Variable
 Chi Square Test
– 1 Categorical Dependent Variable
– 1 (2-level) Categorical Independent Variable
 Fisher Exact Test
– 1 Categorical Dependent Variable
– 1 (2-level) Categorical Independent Variable
 Paired t-test
– 1 Continuous Dependent Variable with normal distribution
– 1 (2-level) Categorical Independent Variable
 One-way repeated measures ANOVA
– 1 Continuous Dependent Variable with normal distribution
– 1 (Multi-level) Categorical Independent Variable
 Friedman Analysis of Variance by Ranks
– 1 Continuous Dependent Variable with non-normal distribution
– 1 (Multi-level) Categorical Independent Variable
 One-way ANOVA
– 1 Continuous Dependent Variable with normal distribution
– 1 (Multi-level) Categorical Independent Variable
 Kruskal Wallis
– 1 Continuous Dependent Variable with non-normal distribution
– 1 (Multi-level) Categorical Independent Variable
 Factorial ANOVA
– 1 Continuous Dependent Variable with normal distribution
– 2 or more Categorical Independent Variables
 Linear Regression
– 1 Continuous Dependent Variable with normal distribution
– 1 Continuous Independent Variable with normal distribution
 Multiple Regression
– 1 Continuous Dependent Variable with normal distribution
– 1 or more Continuous Independent Variables with normal distribution
 Linear Discriminant Analysis
– 1 Categorical Dependent Variable
– Multiple Continuous Independent Variables with normal distribution
 ANCOVA
– 1 Continuous Dependent Variable with normal distribution
– 2 (or more) Categorical or Continuous Independent Variables with normal distribution
Results
 At the end of the trial experience, schools
using Caert had EOC exam scores that
were 18% higher than those of
schools that did not use Caert.
– Alpha set at .05
– Observed p-value of .03
Conclusion
 Interpretation: If there is no true
difference between
schools using Caert and those that don’t,
the probability of observing a difference of 18% or
more due to chance is .03. Because .03 is less than the
alpha level of .05, the difference is statistically significant.
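The decision rule behind this conclusion is a simple comparison of the observed p-value with the alpha level. A minimal sketch, using the alpha (.05) and p-value (.03) reported in the results; the function name is hypothetical:

```python
def is_statistically_significant(p_value, alpha=0.05):
    """Reject the null hypothesis when the observed p-value is below alpha."""
    return p_value < alpha

alpha = 0.05             # set before the data are analyzed
observed_p_value = 0.03  # value reported in the results

if is_statistically_significant(observed_p_value, alpha):
    decision = "reject the null hypothesis"
else:
    decision = "fail to reject the null hypothesis"
print(decision)
```

Note that the wording for a non-significant result is "fail to reject": a large p-value does not prove that the groups are the same.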
ANOVA
 A factorial ANOVA has two or more
categorical independent variables (either
with or without the interactions) and a single
normally distributed interval dependent
variable.
ANOVA
 In statistics, ANOVA is short for analysis of
variance. Analysis of variance is a collection of
statistical models, and their associated
procedures, in which the observed variance is
partitioned into components due to different
explanatory variables.
– The initial techniques of the analysis of variance were
developed by the statistician and geneticist R. A. Fisher
in the 1920s and 1930s. The method is sometimes known as
Fisher's ANOVA or Fisher's analysis of variance, due
to the use of Fisher's F-distribution as part of the test of
statistical significance.
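The idea of partitioning variance can be sketched for the simplest case, a one-way ANOVA. The code below splits the total variation into a between-groups part and a within-groups part and forms the F statistic from them. The scores and the function name are made up, and the sketch stops at F; looking up the p-value requires the F-distribution, which is not included here.

```python
import statistics

def one_way_anova_f(groups):
    """Return (ss_between, ss_within, F) for a list of groups of scores."""
    all_scores = [x for group in groups for x in group]
    grand_mean = statistics.mean(all_scores)

    # Variation explained by group membership
    ss_between = sum(
        len(group) * (statistics.mean(group) - grand_mean) ** 2 for group in groups
    )
    # Variation left over inside each group
    ss_within = sum(
        (x - statistics.mean(group)) ** 2 for group in groups for x in group
    )

    df_between = len(groups) - 1
    df_within = len(all_scores) - len(groups)
    f_statistic = (ss_between / df_between) / (ss_within / df_within)
    return ss_between, ss_within, f_statistic

groups = [[1, 2, 3], [2, 3, 4], [5, 6, 7]]  # made-up scores for three groups
ss_between, ss_within, f_statistic = one_way_anova_f(groups)
print(ss_between, ss_within, f_statistic)
```

A large F means the group means differ by more than the spread inside the groups would lead one to expect.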
Z test
 The Z-test is a statistical test used in inference
which determines if the difference between a
sample mean and the population mean is large
enough to be statistically significant, that is, if it is
unlikely to have occurred by chance.
 The Z-test is used primarily with standardized
testing to determine if the test scores of a
particular sample of test takers are within or
outside of the standard performance of test takers.
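A one-sample Z-test of this kind can be sketched in a few lines. The numbers (36 test takers averaging 105 on a test normed at mean 100 and standard deviation 15) and the function name are hypothetical, and the sketch assumes the population standard deviation is known.

```python
import math

def z_test(sample_mean, population_mean, population_sd, n):
    """Return (z, two-sided p-value) for a one-sample z-test."""
    standard_error = population_sd / math.sqrt(n)
    z = (sample_mean - population_mean) / standard_error
    # Two-sided p-value from the standard normal distribution
    p_value = 1 - math.erf(abs(z) / math.sqrt(2))
    return z, p_value

# Hypothetical: 36 test takers average 105 on a test normed at mean 100, SD 15
z, p_value = z_test(105, 100, 15, 36)
print(f"z = {z:.2f}, p = {p_value:.4f}")
```

With these numbers the sample mean sits two standard errors above the population mean, which falls just under the .05 level.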
Pearson Correlation
 The Pearson correlation tool calculates
the correlation coefficient between two
measurement variables when measurements on
each variable are observed for each of N subjects.
– (Any missing observation for any subject causes that
subject to be ignored in the analysis.) The Correlation
analysis tool is particularly useful when there are more
than two measurement variables for each subject. It
provides an output table, a correlation matrix, showing
the correlation value for each possible pair of measurement
variables.
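For a single pair of variables, the coefficient itself can be computed directly from the deviations about each mean. This is a minimal sketch with made-up data and hypothetical variable names; it does not handle missing observations.

```python
import math

def pearson_r(x, y):
    """Pearson product-moment correlation for two equal-length lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of cross-products and sums of squares of deviations from the means
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

hours_studied = [1, 2, 3, 4, 5]      # made-up data
exam_score = [52, 60, 61, 70, 77]
print(f"r = {pearson_r(hours_studied, exam_score):.3f}")
```

Values of r near +1 or -1 indicate a strong linear relationship; values near 0 indicate little or none.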
Two-Sample t-Test
 The Two-Sample t-Test analysis tools test
for equality of the population means
underlying each sample. The three tools
employ different assumptions: that the
population variances are equal, that the
population variances are not equal, and that
the two samples represent before treatment
and after treatment observations on the
same subjects.
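The three sets of assumptions lead to three slightly different t statistics, sketched below with made-up scores and hypothetical function names. The sketch computes only the t statistic; a p-value would additionally require the t-distribution with the appropriate degrees of freedom.

```python
import math
import statistics

def t_equal_variances(a, b):
    """t statistic assuming the two populations share one variance (pooled)."""
    na, nb = len(a), len(b)
    pooled_var = ((na - 1) * statistics.variance(a)
                  + (nb - 1) * statistics.variance(b)) / (na + nb - 2)
    se = math.sqrt(pooled_var * (1 / na + 1 / nb))
    return (statistics.mean(a) - statistics.mean(b)) / se

def t_unequal_variances(a, b):
    """t statistic that does not assume equal variances (Welch's form)."""
    se = math.sqrt(statistics.variance(a) / len(a) + statistics.variance(b) / len(b))
    return (statistics.mean(a) - statistics.mean(b)) / se

def t_paired(before, after):
    """t statistic for before/after scores on the same subjects."""
    diffs = [x - y for x, y in zip(before, after)]
    se = statistics.stdev(diffs) / math.sqrt(len(diffs))
    return statistics.mean(diffs) / se

before = [1, 2, 3, 4, 5]   # made-up scores
after = [2, 4, 6, 8, 10]
print(t_equal_variances(before, after))
print(t_unequal_variances(before, after))
print(t_paired(before, after))
```

With equal group sizes the pooled and unequal-variance statistics coincide, while the paired version is larger in magnitude because it removes subject-to-subject variation.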
Research Techniques
 Types of hypothesis testing:
– T-test: compares the means of two groups
– ANOVA: Analysis of Variance – used to compare the
means of several groups
– Correlation: measures the strength of the relationship
between two variables
– Chi Square of independence: tests whether there is a relationship
between the attributes of two categorical variables.
– Linear regression: the prediction of one variable based
on another variable, when the relationship between the
variables is assumed to be linear.
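As one worked example from this list, the chi-square statistic for independence compares the observed counts in a table with the counts expected if the two variables were unrelated. The table below is made up, and the function name is hypothetical; the sketch stops at the statistic and does not look up a p-value.

```python
def chi_square_independence(table):
    """Chi-square statistic for a contingency table given as a list of rows."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)

    chi_square = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Count expected if the two variables were unrelated
            expected = row_totals[i] * col_totals[j] / grand_total
            chi_square += (observed - expected) ** 2 / expected
    return chi_square

# Made-up counts: rows = two groups of students, columns = pass / fail
table = [[10, 20], [20, 10]]
chi_square = chi_square_independence(table)
print(f"chi-square = {chi_square:.2f}")
```

When observed and expected counts match exactly the statistic is 0; the further apart they are, the larger it gets.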
Normal Curve
In practice, one often assumes that data are from an approximately
normally distributed population. If that assumption is justified, then
about 68% of the values are within 1 standard deviation of
the mean, about 95% of the values are within 2 standard deviations,
and about 99.7% lie within 3 standard deviations. This is known as the
"68-95-99.7 rule" or the "Empirical Rule".
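The three percentages in the rule can be checked against the normal distribution itself. A minimal sketch using the error function from Python's `math` module; the function name is hypothetical.

```python
import math

def share_within(k):
    """Proportion of a normal distribution within k standard deviations of the mean."""
    return math.erf(k / math.sqrt(2))

for k in (1, 2, 3):
    print(f"within {k} SD: {share_within(k):.1%}")
```

The rule's round figures (68, 95, 99.7) are these proportions rounded for easy recall.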
Key points
Comparing groups for Sig Diff
Key Terms
 Use for new terms in profession (cognitive
processing skills)
 Give preciseness to ambiguous term (learner)
 General term used in special way (learning style)
 Writing definition
– State term
– Give broad class to which term belongs
– Specify how term is used that differs
 Conclusion – not always used
– Summarizes if necessary
– Tells reader what to expect
Survey Construction
 Parts:
– Title
– Directions / introduction to survey
– Items (a list of statements or questions)
 Usually with a scale of some type
 Scales
– Rating
– Ranking
– Semantic differential
– Likert type scale
 Demographic info
Likert type scale
 Ice cream is good for breakfast
– Strongly disagree
– Disagree
– Neither agree nor disagree
– Agree
– Strongly agree
Rating
 Scale of 1 to 5 or 1 to 7, etc.
– ? 1 = Best or highest
– ? 5 = Best or highest
 Even number of items or odd?
– Even: forced choice, no fence sitting
– Odd: allows a middle ground response
– Might allow for no opinion (NA or NO)
Semantic differential
 “In order to succeed you must know what
you are doing, like what you are doing,
and believe in what you are doing”
Will Rogers
Setting Alpha Level
 Set alpha at something like 0.05
 Conduct a statistical test
 Obtain a p-value
 Parametric tests
– Pearson Product Correlation Coefficient
– Student t-Test
– The z-Test
– ANOVA
 Nonparametric tests
– Chi-Squared
– Spearman Rank Coefficient
– Mann-Whitney U Test
– Kruskal-Wallis Test