Transcript st_intro_01

Applied statistics for testing and evaluation (STAT01) – MED4
Introduction
Lecturer:
Smilen Dimitrov
STAT01 - Introduction
About the course and communication
• Course: Applied statistics for testing and evaluation – MED4
• Course teacher – Smilen Dimitrov; teaching assistant (TA) – Kristina Daniliauskaite
– Contact by e-mail is welcome: [email protected] ([email protected]); TA:
[email protected]
– When writing, you are welcome to use your group e-mail
– If you write from your own individual addresses instead of the group e-mail,
please include the other group members' addresses (or the group e-mail) in
Carbon Copy (CC)
– Course website: http://www.smilen.net/stat/
• Course requirements – PE course
– Most of you will develop a software application as a product in your MED3 project
– You are expected to perform user testing and evaluation of your product, and
provide statistical analysis of the results in your project report. These results and
the analysis are expected to be discussed as part of your group project exam. In
addition, for the individual part of the group exam, you are expected to discuss
and answer PE questions.
Statistics - starting notes
• Statistical analysis refers to doing something useful with data (setting its
meaning free).
• Statistics is neither really a science nor a branch of mathematics. It is
perhaps best considered as a meta-science (or meta-language).
• One of the hardest things is choosing the right kind of statistical analysis.
The choice depends on the nature of your data and on the particular question
you're trying to answer.
• Statistics is intimately related to the scientific method.
• Statistics forms the 'back end' of the process (analyzing data and stating
conclusions); the 'front end' requires expertise in the specific subject matter,
such as economics, biology, ecology, or medialogy.
• There is no substitute for experience; the way to know what to do is to have
done it properly lots of times before.
Introduction
• Scope of application of statistics is enormous - unavoidable in research
about Human Senses - Digital Perception (semester theme)
• In media technology, used for testing and evaluation in two main areas:
– Technical - signal processing algorithms, systems
– Psychological/social – user response to products and interfaces, ratings
• Statistics – finding unknown parameters and relationships through
organization and study of collected data
• This course – based on Statistics: An Introduction Using R by M. Crawley
– Introduction to basic concepts in statistical analysis
– Usage of the free statistical programming language R
– Heavy use of Internet resources
– 5 modules: 1hr40m lectures, 1hr40m exercises
• In industry: Microsoft - Statistical Media Processing research project
Experiments and statistics problems
• Introduce terminology
– Experiment (general) – asking the Universe a question
– Answer – perform measurements or observations => collect data
– Possibility for misinterpretation of data – requires proper understanding of
statistical analysis and experimental design
• Statistics
– the study of data
– problem-solving process that seeks answers to questions through data
• Descriptive statistics
– methods and tools for collecting data
– models to describe and interpret data
• Inferential statistics
– systems and techniques for making good decisions and accurate predictions
based on data
Experiments and statistics problems
• For us
– 'asking the Universe a question' -> statistics problem
– Process of recording measurements -> data collection
– Experiment -> method of data collection
• Components of a statistics problem
– Ask a Question
– Collect Appropriate Data
– Analyze the Data
– Interpret the Results
Examples of statistics problems
• Video – Room-Measurement Activity (link)
• Video – bias and measurement error (link)
Examples of statistics problems
Suppose you were curious about the relative
heights and arm spans of men and women.
1. Ask a Question
Are men typically taller than women?
Do men typically have longer arm spans than
women?
2. Collect Appropriate data
Using a meter stick, measure the heights (without shoes) and arm
spans (fingertip to fingertip) of three men and three women. Record
your measurements to the nearest centimeter.
1. Ask a Question
How much does a penny weigh?
2. Collect Appropriate data
Use a metric scale to weigh 32 pennies to the nearest centigram
(1/100 of a gram).
Based on the data, how much would you expect
the 33rd penny to weigh?
1. Ask a Question
Should nuclear power be developed as
an energy source?
2. Collect Appropriate data
Twenty-five people completed the
following questionnaire:
Examples of statistics problems
• Media technology example - The Optimal Thumbnail experiment
(http://www.otal.umd.edu/SHORE/bs21/experiment.html)
1. Ask a Question
- Given several image thumbnail sizes, which size is optimal for accurate and
quick recognition?
2. Collect Appropriate data
• Devise an experiment (and test a hypothesis) in which the subject is asked to
recognize images, and measure two dependent variables: time to
recognition and accuracy of identification.
Variables and variability
• If you measure the same thing twice, you will generally get two different
answers. This is due to:
– the changing nature of things (heterogeneity),
– association with something else changing, or
– errors
• Variables - characteristics that may be different from one observation to the
next
– things that we measure, control, or manipulate in research.
– a symbol (A, B, x, y, etc.) that can take on any of a specified set of values
• When we measure these characteristics, we assign a value for each
variable. This set of values for a given variable is known as data.
• Measurement errors
– Random error - nonsystematic measurement error that is beyond our control;
its effects average out to zero over a series of measurements.
– Measurement bias (systematic error) - favors a particular result. A
measurement process is biased if it systematically overstates or understates
the true value of the measurement.
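The difference between random error and bias can be illustrated with a small simulation. The following is a minimal sketch in Python (the course itself works in R); the true value, the noise level and the bias offset are made-up numbers chosen only for illustration.

```python
import random
import statistics

random.seed(1)  # fixed seed so the run is repeatable

TRUE_VALUE = 170.0   # hypothetical true height in cm (assumption)
NOISE_SD = 0.5       # hypothetical spread of the random error in cm
BIAS = 1.5           # hypothetical systematic offset, e.g. a mis-zeroed ruler


def measure(n, bias=0.0):
    """Simulate n measurements with random error and an optional bias."""
    return [TRUE_VALUE + bias + random.gauss(0.0, NOISE_SD) for _ in range(n)]


unbiased = measure(10_000)
biased = measure(10_000, bias=BIAS)

# Random error averages out: the mean lands close to the true value.
unbiased_error = statistics.mean(unbiased) - TRUE_VALUE
# Bias does not average out: the mean stays off by roughly BIAS.
biased_error = statistics.mean(biased) - TRUE_VALUE

print(f"mean error without bias: {unbiased_error:+.3f} cm")
print(f"mean error with bias:    {biased_error:+.3f} cm")
```

Taking more measurements shrinks the first error further, but leaves the second one essentially unchanged; a biased process can only be fixed by correcting the measurement procedure itself.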
Qualitative and quantitative variables
• Some questions are answered with a number, some not
• Quantitative data/variables - measurement expressed in terms of numbers
– Interval scale: has equidistant measure, but the doubling principle breaks
down (50 °C is not half as hot as 100 °C)
– Ratio scale: has a meaningful zero value and equidistant measure, so the
doubling principle holds (10 years is twice as old as 5 years)
• Qualitative (categorical) data/variables - measurement expressed by means of
a natural language description (not in terms of numbers)
– Nominal: there is no natural ordering of the categories. Examples
might be gender, race, religion, or sport.
– Ordinal: the categories may be ordered. Categorical variables that
judge size (small, medium, large, etc.) are ordinal variables.
• Measurement scales: nominal, ordinal, interval, ratio
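Why the doubling principle fails on an interval scale can be checked with a few lines of arithmetic. This is a minimal sketch in Python; the conversion to kelvin (a temperature scale with a true zero) is brought in here as an illustration and is not part of the slides.

```python
def celsius_to_kelvin(celsius):
    """Convert Celsius (interval scale) to kelvin (ratio scale)."""
    return celsius + 273.15


# Ratio scale: age has a true zero, so ratios are meaningful.
age_ratio = 10 / 5   # 10 years is twice 5 years

# Interval scale: 0 °C is an arbitrary zero, so 100 / 50 = 2 is misleading.
naive_ratio = 100 / 50
# On the kelvin scale, which has a true zero, the ratio is much smaller.
kelvin_ratio = celsius_to_kelvin(100) / celsius_to_kelvin(50)

print(age_ratio)               # 2.0
print(naive_ratio)             # 2.0
print(round(kelvin_ratio, 3))  # 1.155
```

Differences remain meaningful on both scales: going from 50 °C to 100 °C is the same 50-degree step whether it is expressed in Celsius or in kelvin.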
Methods of data collection
• Data collection - integral part of statistics
• Methods of data collection - methods to gather information about
the world
– Experiments (experimental research) - the only way to determine causal
relationships between variables
• Independent variable (IV) - manipulated by an experimenter to
exist in at least two levels
• Dependent variable (DV) - the second variable the experimenter
measures
• 'if you read a Wiki, then you will have enhanced knowledge.'
– Sample surveys (correlational research) - the selection and study of a
sample of items from a population. A sample is just a set of members chosen
from a population, but not the whole population. A survey of a whole
population is called a census.
• phoning the fifth person on every page of the local phonebook and asking
them how long they have lived in the area.
– Observational studies (correlational research) - the most primitive method
of understanding the laws of nature. Basically, a researcher
goes out into the world and looks for variables that are
associated with one another.
• Observations have the equivalent of two Dependent Variables
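The distinction between a sample survey and a census can be sketched in code. The example below is an illustrative Python sketch only: the population is randomly generated (made-up data), and taking every fifth member is a simple systematic sample in the spirit of the phonebook example, not a reproduction of it.

```python
import random
import statistics

random.seed(7)  # fixed seed so the run is repeatable

# Hypothetical population: years each of 1000 residents has lived in the area.
population = [random.randint(0, 40) for _ in range(1000)]

# Census: study every member of the population.
census_mean = statistics.mean(population)

# Sample survey: systematic sampling, taking every fifth member.
sample = population[4::5]
sample_mean = statistics.mean(sample)

print(f"census: n={len(population)}, mean={census_mean:.2f} years")
print(f"sample: n={len(sample)}, mean={sample_mean:.2f} years")
```

The sample mean only approximates the census mean; quantifying how far off it is likely to be is the job of inferential statistics.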
Descriptive and inferential statistics
• Descriptive statistics - methods used to summarize or describe a collection
of data
– Analysis by bringing out the information the data contains
– Steps:
• Collect data
• Classify data
• Summarize data
• Present data
• Proceed to inferential statistics if there is enough data to draw a
conclusion
• Inferential statistics - modeling patterns of data, to draw inferences about
the thing being studied
– Analysis by testing or retesting a hypothesis.
Descriptive statistics
• Descriptive statistics - a branch of statistics that denotes any of the many
techniques used to summarize a set of data.
– Allows us to describe groups of many numbers. One way to do this is by
reducing them to a few numbers that are typical of the groups, or describe
their characteristics.
• Techniques
– Graphical description
– Tabular description
– Summary statistics
• Two objectives for summary statistics
– choose a statistic that shows how different units seem similar - a measure
of central tendency (typical value, location).
• arithmetic mean
• the median
• the mode
• specific values from the quantiles
– choose another statistic that shows how they differ - a measure of
statistical variability (spread).
• the range
• standard deviation
• inter-quartile range
• the variance
• the square root of the variance (the standard deviation)
• absolute deviation
• Normal distribution
• Skewed distribution
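The two families of summary statistics listed above can be computed with Python's standard `statistics` module. This is a minimal sketch on made-up data; the nine ratings are hypothetical, and the variance shown is the sample variance (dividing by n - 1), which is one of several conventions.

```python
import statistics

# Hypothetical ratings from nine test users (made-up data for illustration).
data = [2, 3, 3, 4, 4, 4, 5, 7, 9]

# Central tendency (location): how the units are similar.
mean = statistics.mean(data)
median = statistics.median(data)
mode = statistics.mode(data)

# Variability (spread): how the units differ.
value_range = max(data) - min(data)
variance = statistics.variance(data)   # sample variance (divides by n - 1)
std_dev = statistics.stdev(data)       # square root of the sample variance
q1, q2, q3 = statistics.quantiles(data, n=4)
iqr = q3 - q1                          # inter-quartile range
mean_abs_dev = sum(abs(x - mean) for x in data) / len(data)

print(f"mean={mean:.2f} median={median} mode={mode}")
print(f"range={value_range} variance={variance:.2f} sd={std_dev:.2f}")
print(f"IQR={iqr:.2f} mean absolute deviation={mean_abs_dev:.2f}")
```

Because the data are skewed towards high values, the mean is pulled above the median and the mode, which is one reason to report more than one measure of location.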
Inferential statistics
• Inferential statistics or statistical induction do not just describe numbers,
they infer causes
– comprises the use of statistics to make inferences (informed guesses)
concerning some unknown aspect (usually a parameter) of a population - to draw
inferences about situations where we have only gathered part of the
information that exists
• Dealing with probability – as we're dealing with guesses/predictions. Two
schools differ:
– frequency probability using maximum likelihood estimation - The frequentists
understand probability in the common sense, i.e. if an event has probability
1/6 then in many trials the event will happen 1/6 of the time – well defined
experiments.
– Bayesian inference - Bayesians, on the other hand, hold that probability is a
measure of our belief (or confidence) in some event happening. Bayesians update
their belief in the light of new data using Bayes' theorem - apply
probabilities to arbitrary statements.
• Generally when we have a research question, we can form from it a
research hypothesis or a set of hypotheses – however, usually not directly
testable using inferential statistics.
• Statistic algorithms – tools
• T-test
• ANOVA
• Correlation
• Factorial
• Regression
• Chi-Squared
• Probability
• Distributions
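The frequentist and Bayesian readings of "probability 1/6" can be compared on simulated die rolls. The sketch below is illustrative Python only: the die is simulated, and the Beta(1, 5) prior is an assumption chosen here so that the prior mean equals 1/6; it does not come from the slides.

```python
import random

random.seed(42)  # fixed seed so the run is repeatable

# Frequentist view: probability as long-run relative frequency.
rolls = [random.randint(1, 6) for _ in range(60_000)]
sixes = rolls.count(6)
freq_estimate = sixes / len(rolls)   # also the maximum likelihood estimate

# Bayesian view: probability as degree of belief, updated with Bayes' theorem.
# Assumption: a Beta(1, 5) prior, whose mean 1/6 encodes "probably a fair die".
# For a Beta prior and success/failure data the posterior is again a Beta.
prior_a, prior_b = 1, 5
post_a = prior_a + sixes
post_b = prior_b + (len(rolls) - sixes)
posterior_mean = post_a / (post_a + post_b)

print(f"relative frequency of a six: {freq_estimate:.4f}")
print(f"posterior mean belief:       {posterior_mean:.4f}")
print(f"theoretical value 1/6:       {1 / 6:.4f}")
```

With this much data the two numbers nearly coincide; the schools differ more visibly when data are scarce and the prior carries real weight.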
Average and the arithmetic mean
• In mathematics, an average or central tendency of a set (list) of data refers
to a measure of the 'middle' of the data set.
• There are many different descriptive statistics that can be chosen as a
measurement of the central tendency. The most common method, and the
one generally referred to simply as the average, is the arithmetic mean.
• In statistics, mean has two related meanings:
– the average in ordinary English, which is also called the arithmetic mean
(and is distinguished from the geometric mean or harmonic mean). The
average of a sample is also called the sample mean.
– the expected value of a random variable, which is also called the
population mean.
Average and the arithmetic mean
• A set of data $X = (x_1, x_2, \ldots, x_n)$
• Arithmetic mean $\bar{x} = \frac{1}{n} \sum_{i=1}^{n} x_i = \frac{1}{n} (x_1 + x_2 + \cdots + x_n)$
• (Sum notation) $\sum_{i=m}^{n} x_i = x_m + x_{m+1} + x_{m+2} + \cdots + x_{n-1} + x_n$
• Example
1. Ask a Question
How many raisins are in a half-ounce box of raisins?
2. Collect Appropriate data
We counted the number of raisins in 17 half-ounce boxes:
Box:                1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17
Number of raisins: 29 27 27 28 31 26 28 28 30 29 26 27 29 29 25 28 28
And the arithmetic mean is …
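The calculation can be checked directly from the definition of the arithmetic mean. This is a minimal sketch in Python rather than R (which the course uses in class); the counts are copied from the raisin table above.

```python
# Raisin counts for the 17 half-ounce boxes, copied from the table above.
raisins = [29, 27, 27, 28, 31, 26, 28, 28, 30, 29, 26, 27, 29, 29, 25, 28, 28]

n = len(raisins)       # number of observations
total = sum(raisins)   # x_1 + x_2 + ... + x_n
mean = total / n       # arithmetic mean: the sum divided by n

print(n, total, round(mean, 2))  # 17 475 27.94
```

No single box contains 27.94 raisins; the mean is a summary of the whole set and does not have to be one of the observed values.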
Average and the arithmetic mean
• Many kinds of averages
• Arithmetic mean - the sum of all measurements divided by the number of
observations in the data set
• Median - the middle value that separates the higher half from the lower half
of the data set
• Mode - the most frequent value in the data set
• Geometric mean - the nth root of the product of n data values
• Harmonic mean - the reciprocal of the arithmetic mean of the reciprocals of
the data values
• Quadratic mean or root mean square (RMS) - the square root of the arithmetic
mean of the squares of the data values
• Generalized mean - generalizing the above, the pth root of the arithmetic
mean of the pth powers of the data values
• Weighted mean - an arithmetic mean that incorporates weighting to certain
data elements
• Truncated mean - the arithmetic mean of data values after a certain number or
proportion of the highest and lowest data values have been discarded
• Interquartile mean - a special case of the truncated mean
• Midrange - the arithmetic mean of the highest and lowest values of the data
or distribution
• When the arithmetic mean is improper: for the average rate of return of
investments the numbers multiply, so the geometric mean must be used.
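The investment case can be made concrete with two years of returns. The figures below are hypothetical (a 50 % gain followed by a 50 % loss), and the sketch uses `statistics.geometric_mean` from the Python standard library (available from Python 3.8).

```python
import statistics

# Hypothetical yearly returns: +50 %, then -50 % (made-up figures).
returns = [0.50, -0.50]
growth_factors = [1 + r for r in returns]   # 1.5 and 0.5

# Actual outcome: the factors multiply, so 100 becomes 100 * 1.5 * 0.5 = 75.
final_value = 100.0
for factor in growth_factors:
    final_value *= factor

# Arithmetic mean of the returns suggests 0 % per year, which is misleading.
arithmetic_rate = statistics.mean(returns)

# Geometric mean of the growth factors gives the true average yearly rate.
geometric_rate = statistics.geometric_mean(growth_factors) - 1

print(f"final value:     {final_value:.2f}")
print(f"arithmetic rate: {arithmetic_rate:+.2%}")
print(f"geometric rate:  {geometric_rate:+.2%}")
```

Applying the geometric rate for two years in a row reproduces the final value of 75, whereas the arithmetic rate of 0 % would wrongly predict that the investment is still worth 100.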
Introduction to R
• R – programming language for statistical computing
– Powerful tool for statistical modelling
– Data exploration, tabulating and sorting data, drawing plots of data
– Sophisticated calculator to evaluate complex arithmetic expressions,
and a flexible object-oriented language
• Installation
• In class example - finding the average number of raisins in a box
– Data collection in Excel
– Using R, finding the average and plotting bar graphs.
Exercise for mini-module 1 – STAT01
Exercise A
1. Collect the following data about the members of your group in an Excel sheet:
a) Name
b) Age
c) Previous education
2. Import the data into R, and find the average age of the group members.
3. Using R, plot the individual ages of the group members as a bar graph.
Exercise B
Repeat exercise A for all students of the MED3 class. Compare the average age of
the entire class with the average age of your group.
Delivery:
Deliver the collected data (in tabular format), the computed age averages and the bar graphs
in an electronic document.