Presentation Template - UW Courses Web Server

QSCI 381 - Winter 2012
Introduction to Probability and Statistics
Basic Information

Instructor: Dr André Punt (FISH 206A; aepunt@u)
Office hours: Contact directly
Teaching Assistant: Mr Thomas Pool ([email protected])
Office hours: See web-site
Class web-site: http://courses.washington.edu/qc381aep/
Prerequisites for this course



MATH 120,
a score of 2 on the advanced placement test, or
a score of 67% on the MATHPC placement test
Class Structure
Lectures (BNS 117): M, Tu, W, Th (9:30-10:20)
Computer laboratory sessions (MGH 044): F (9:30-10:20)
Weekly homework assignments.
Class Evaluation
Submission of homework assignments.
Homework assignments (30%; based on the best 8 of 9).
Mid-term examination (30%).
Final examination (40%).
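The weighting above can be sketched as a short calculation. This is only an illustration: it assumes every homework counts equally and that all scores are percentages (0-100), neither of which is stated on the slide. The function name and the scores are hypothetical.

```python
def course_grade(homework, midterm, final):
    """Return an overall course percentage from component scores (0-100)."""
    # Keep the best 8 homework scores, i.e. drop the single lowest of 9.
    best_eight = sorted(homework, reverse=True)[:8]
    homework_avg = sum(best_eight) / len(best_eight)
    # Weights from the slide: homework 30%, mid-term 30%, final 40%.
    return 0.30 * homework_avg + 0.30 * midterm + 0.40 * final


# Hypothetical scores, invented for illustration only.
hw_scores = [80, 90, 100, 70, 85, 95, 60, 75, 0]
overall = course_grade(hw_scores, midterm=80, final=90)
print(round(overall, 2))
```

In this made-up case the missed homework (score 0) is the one dropped, so it does not lower the homework average.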
Course Overview





Introduction (2 lectures)
Summarizing data (4 lectures)
Probability (3 lectures)
Probability distributions (6 lectures)
Making inferences from data (17 lectures)
Course Textbooks
Required: Elementary Statistics by Larson and Farber
Optional: An EXCEL manual
The Course and the Web Page


The slides for each day’s lecture will be placed on the web-page at the start of the day.
The readings for the week are already on the web-page.
What is Statistics About?
Statistics is the science of collecting, organizing, analyzing and interpreting data in order to make decisions.
Statistics is the science of data-based decision making in the face of uncertainty.
The Statistical Cycle
1. Identify the questions that are to be addressed.
2. Select a set of hypotheses related to the question.
3. Collect data appropriate to the question.
4. Summarize and analyze the data.
5. Do the results make sense / are they consistent with other information?
6. Repeat steps 2-5.
Statistics and the Natural Sciences
Statistics is a key part of doing business in the natural sciences today:
“Eliminating harvesting will reduce the risk of extinction by 20%”;
“50% of fish caught in the fishery are immature”; and
“80% of fish mature by age 5”.
Statistics is not just summarizing data.
Some definitions-I
Data - information coming from observations, counts, measurements, or responses.
The data you will be analyzing will almost always be a sample from a population.
Some definitions-II
Population - the collection of all outcomes, responses, measurements or counts that are of interest.
Sample - a subset of a population.
We will almost always be dealing with samples and hoping to make inferences about the population.
Samples and Populations
It is important to be able to identify: a) the data set, b) the sample, and c) the population.
This isn’t always so easy:
Data = 10 counts of predator numbers in West Coast marine reserves.
Populations = a) West Coast marine reserves, b) U.S. marine reserves, c) World marine reserves, d) Marine reserves off the West Coast that can be sampled?
Parameters and Statistics-I
Parameter - a numerical description of a characteristic of the population.
Statistic - a numerical description of a characteristic of the sample.
We will often wish to make inferences about parameters based on statistics.
Parameters and Statistics-II
Whether you are dealing with a parameter or a statistic depends on whether the data relate to the whole population or only a subset of it.
Examples:
Average length of all fish passing a weir (a parameter).
Average length of a sample of the fish passing the same weir (a statistic).
Note: sometimes a quantity could be both a parameter and a statistic, depending on the definition of the population (and the question being addressed).
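The weir example can be sketched in a few lines of Python. The fish lengths are simulated and entirely hypothetical (a normal distribution with mean 95 cm and standard deviation 10 cm is an assumption made only so there is something to compute). The mean of all the lengths plays the role of the parameter, and the mean of a random subset plays the role of the statistic.

```python
import random
import statistics

random.seed(381)  # fixed seed so the sketch is repeatable

# Hypothetical population: lengths (cm) of ALL fish passing the weir.
population = [random.gauss(95, 10) for _ in range(1000)]

# Parameter: a numerical description of the whole population.
population_mean = statistics.mean(population)

# Statistic: the same description computed from a sample only.
sample = random.sample(population, 50)
sample_mean = statistics.mean(sample)

print(f"parameter (population mean): {population_mean:.1f} cm")
print(f"statistic (sample mean):     {sample_mean:.1f} cm")
```

The two values will generally be close but not identical. In practice only the sample mean is available, and it is used to make inferences about the population mean.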
Branches of Statistics
Descriptive statistics - relate to organizing, summarizing, and displaying data.
Inferential statistics - relate to using a sample to draw conclusions about a population.
Inferential statistics involves drawing a conclusion from some data.
Inferences vs Summaries
This can be quite subtle. Consider:
Average length of females and males: 90 cm and 100 cm respectively.
Descriptive statistics: the values.
Inference: males are (in general) larger than females.
Data Classification-I
Qualitative data - attributes, labels, nonnumerical values.
Quantitative data - numerical measurements or counts.
Note: Numbers can be “qualitative” (e.g. when analyzing data from surveys, the haul number is qualitative).
Data Classification-II
Species               Species #   Ocean Basin   Maximum Age
Merluccius capensis   1           Atlantic      7
Merluccius paradoxus  2           Atlantic      5
Merluccius productus  3           Pacific       20
Which fields are qualitative and which are quantitative?
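One plausible answer can be written down as a small Python sketch. The classification below is an assumption, not something stated on the slide: it treats the species number as a label (like the haul number mentioned earlier), which leaves maximum age as the only quantitative field. The dictionary keys are hypothetical names chosen for illustration.

```python
# Records copied from the table above.
hake = [
    {"species": "Merluccius capensis", "species_no": 1,
     "ocean_basin": "Atlantic", "max_age": 7},
    {"species": "Merluccius paradoxus", "species_no": 2,
     "ocean_basin": "Atlantic", "max_age": 5},
    {"species": "Merluccius productus", "species_no": 3,
     "ocean_basin": "Pacific", "max_age": 20},
]

# Assumed classification (not given on the slide).
field_type = {
    "species": "qualitative",
    "species_no": "qualitative",   # a number used purely as a label
    "ocean_basin": "qualitative",
    "max_age": "quantitative",
}

quantitative_fields = [
    name for name, kind in field_type.items() if kind == "quantitative"
]

# Arithmetic such as a mean is only meaningful for quantitative fields.
mean_max_age = sum(row["max_age"] for row in hake) / len(hake)

print(quantitative_fields)
print(round(mean_max_age, 2))
```

Averaging the species numbers would also run without error, but the result would have no meaning, which is the practical reason for making the distinction.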
Levels of Measurement-I
A data set can be classified according to the highest level of measurement that applies. The four levels of measurement, listed from lowest to highest are:
1. Nominal
2. Ordinal
3. Interval
4. Ratio
Levels of Measurement-II
Nominal - Categories, names, labels, or qualities.
Examples: Species name, maturity state, river sampled.
Ordinal - The data can be arranged in order, but there is no way to assign numerical values to the differences among levels.
Example: Condition of a released fish (live, dubious, dead).
Levels of Measurement-III
Interval - Data can be ordered and values subtracted, but ratios make little sense / zero is simply a “reference” level.
Examples: Year, Month, Temperature.
Ratio - As for interval data, but zero and ratios of values have meaning.
Examples: Height, length, weight, speed, number of recaptures.
Levels of Measurement-IV (Cheat sheet)
Level      Put in categories   Arrange in order   Subtract values   Divide values
Nominal    Yes                 No                 No                No
Ordinal    Yes                 Yes                No                No
Interval   Yes                 Yes                Yes               No
Ratio      Yes                 Yes                Yes               Yes
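The cheat sheet has a simple cumulative structure: each level allows everything the level below it allows, plus one more operation. A minimal Python sketch of that idea follows. The function and operation names are hypothetical, chosen only for illustration.

```python
# Levels from lowest to highest, and the operation each one adds.
LEVELS = ["nominal", "ordinal", "interval", "ratio"]
OPERATIONS = ["categorize", "order", "subtract", "divide"]


def allowed_operations(level):
    """Return the operations that are meaningful at a level of measurement."""
    rank = LEVELS.index(level.lower())
    # Each level permits everything the lower levels permit, plus one more.
    return OPERATIONS[: rank + 1]


def supports(level, operation):
    """True if the operation is meaningful for data at this level."""
    return operation in allowed_operations(level)


for level in LEVELS:
    print(level, allowed_operations(level))
```

For example, `supports("interval", "subtract")` is true but `supports("interval", "divide")` is false, matching the Interval row of the table.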