Health Statistics (卫生统计学)

Teaching Group
• Chuanhua Yu (宇传华), Professor
• Ying Hu (胡樱), Associate Professor
• Lu Ma (马露), Associate Professor
• Jing Sun (孙静), Lecturer
Content
Textbooks
• Bowers, David. Medical Statistics from Scratch: An Introduction for Health Professionals (Second Edition). John Wiley & Sons Ltd, 2008.
• Weiss, Neil A. Introductory Statistics (9th edition). Addison-Wesley, 2012.
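Reference Books
• Petrie, Aviva, and Sabin, Caroline. Medical Statistics at a Glance. Blackwell Science Ltd, 2000.
• Le, Chap T. Introductory Biostatistics. John Wiley & Sons Ltd, 2003.
• Fang Jiqian (方积乾), ed. Health Statistics (《卫生统计学》), 7th edition. People's Medical Publishing House (人民卫生出版社), August 2012. With the companion Health Statistics Study Guide (《卫生统计学学习指导》).
• Fang Jiqian (方积乾), ed. Statistical Methods for Biomedical Research (《生物医学研究的统计方法》). Beijing: Higher Education Press (高等教育出版社), June 2007. ISBN 9787040208412. 604 pages.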
Download: English book and Chinese book, from the Course Center of Wuhan University
Method of learning
• Preparation
• Attending
• Recording
• Reviewing
• Practice
Exercise requirements
• Write in English or Chinese
• Show the detailed process
• Submit on time
Chapter 1
Introduction (绪论)
Contents
• Statistics & Health statistics
• Types of data
• Basic concepts of statistics
New words
• Statistics (统计学)
• Biostatistics (生物统计学)
• Health statistics (卫生统计学)
• Quantitative data (定量数据)
• Qualitative data (定性数据)
• Numerical data (数值型数据)
• Categorical data (分类型数据)
• Continuous data (连续型数据)
• Discontinuous data (非连续型数据)
• Discrete data (离散型数据)
• Binary data (二分类数据)
• Ordinal data (有序数据)
• Nominal data (名义数据)
• Variable (变量)
• Random event (随机事件)
• Probability (概率)
• Population (总体)
• Sample (样本)
• Sample size (样本量)
• Random sampling (随机抽样)
• Sampling error (抽样误差)
• Parameter (参数)
• Statistic (统计量)
• Descriptive analysis (描述分析)
• Inferential analysis (推断分析)
1. What is statistics?
Origin of the word: status (Latin word) → state (English word) → statistics
• Classical statistics: data collection, data analysis, data interpretation
• Modern statistics: data collection, data analysis, data interpretation, conclusion evaluation
Basic idea of statistics
                   City A              City B
Data collection:   100 students        200 students
                   30 short-sighted    40 short-sighted
Interpretation:    30% abnormal        20% abnormal
Conclusion: The eyesight of students in City A is worse than that of students in City B.
Evaluation:
1) Is this conclusion correct?
2) What is the reliability (confidence level, 置信度) of the conclusion?
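The arithmetic on this slide can be sketched in Python. The two-proportion z statistic used below is one common way to approach the two evaluation questions; it is an illustrative assumption and not a method stated in the slides, and all variable names are hypothetical.

```python
import math

# Counts from the slide: (short-sighted students, students examined).
city_a = (30, 100)
city_b = (40, 200)

p_a = city_a[0] / city_a[1]   # observed proportion in City A
p_b = city_b[0] / city_b[1]   # observed proportion in City B

# Pooled proportion, assuming both cities share one true rate.
pooled = (city_a[0] + city_b[0]) / (city_a[1] + city_b[1])
std_err = math.sqrt(pooled * (1 - pooled) * (1 / city_a[1] + 1 / city_b[1]))

# Assumed evaluation method: two-proportion z statistic (not from the slides).
z = (p_a - p_b) / std_err
# Two-sided p-value from the standard normal distribution.
p_value = math.erfc(abs(z) / math.sqrt(2))

print(f"City A: {p_a:.0%}, City B: {p_b:.0%}")
print(f"z = {z:.2f}, two-sided p = {p_value:.3f}")
```

The calculation is only meaningful if the students in each city were chosen by random sampling, which the slide does not state.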
From Wikipedia, the free encyclopedia
Statistics (统计学) is the study of the collection, organization, analysis, and interpretation of data. It deals with all aspects of this, including the planning of data collection in terms of the design of surveys and experiments.
Biostatistics (生物统计学) is a branch of statistics in which the data are derived from the biological sciences and medicine.
Health statistics (卫生统计学) is a branch of statistics in which the data are derived from medical research on health care, health services, disease prevention, etc.
Application of data analysis in medical research
• What is the average birth weight of infants born in Hubei province?
• Is there any difference in curative effect between two types of medicine?
• Which risk factors affect the occurrence of stomach cancer?
• How can the likelihood of recovery from a disease (its prognosis) be predicted?
The position of statistical analysis in medical research
(Diagram: a feedback loop connecting doctor, patients, data, statistical analysis, evaluation of treatment, and improved knowledge.)
An important and necessary course for medical students
• It is an important and useful tool for strengthening one's ability in medical research.
• It is also a necessary subject that, like medical imaging, plays an important role in advancing medical science.
History of statistics
• Beginning in the 17th century, it developed as an applied branch of probability theory.
• Calculators (1960s) made basic statistical analysis more practical.
• Computers (1980s) made multivariate statistical analysis more practical.
2. Types of data
Data (数据): information obtained from various measuring devices.
Source of data
• Routine records
• Survey records
• Experimental records
• External information
Types of data
• Continuous (连续)
• Discrete (离散)
• Ordinal (顺序)
• Nominal (名义)
Quantitative or numerical data (定量 / 数值数据): data that have a unit (or scale) and an origin.
e.g., blood pressure --- continuous
      white blood cell count, pulse rate --- discrete
Qualitative or categorical data (定性 / 分类数据): data that have no unit (or scale) and no origin.
e.g., sex --- binary (二分类)
      curative effect --- ordinal (有序多分类, ordered categories)
      occupation --- nominal (无序多分类, unordered categories)
3. Basic concepts of statistics
Variable (变量): a set of data observed from different persons, places, things, etc., that describes a certain characteristic.
Data and variable
id   sex   age   weight   height   health
1    m     25    70       1.75     a
2    f     21    57       1.66     b
3    f     28    55       1.60     a
4    m     23    79       1.70     c
5    m     29    60       1.77     c
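The relation between data and variables in this table can be sketched in Python: each row is one observed person and each column is one variable. The data-type labels are an assumed classification that follows the earlier definitions (the slides do not label these columns), and the helper name `column` is hypothetical.

```python
from statistics import mean

# Rows copied from the table above; each dict is one observed individual.
records = [
    {"id": 1, "sex": "m", "age": 25, "weight": 70, "height": 1.75, "health": "a"},
    {"id": 2, "sex": "f", "age": 21, "weight": 57, "height": 1.66, "health": "b"},
    {"id": 3, "sex": "f", "age": 28, "weight": 55, "height": 1.60, "health": "a"},
    {"id": 4, "sex": "m", "age": 23, "weight": 79, "height": 1.70, "health": "c"},
    {"id": 5, "sex": "m", "age": 29, "weight": 60, "height": 1.77, "health": "c"},
]

def column(name):
    """Return every observed value of one variable (one table column)."""
    return [row[name] for row in records]

# Assumed classification, not stated in the slides.
variable_types = {
    "sex": "qualitative (binary)",
    "age": "quantitative",
    "weight": "quantitative",
    "height": "quantitative",
    "health": "qualitative",
}

# Quantitative variables can be averaged; qualitative ones are counted.
mean_age = mean(column("age"))
sex_counts = {value: column("sex").count(value) for value in set(column("sex"))}

print(mean_age, sex_counts)
```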
Event & random event
Event (事件): Suppose something has k different outcomes and only one outcome can occur at a time; then each possible outcome is called an event.
e.g., tossing a two-sided coin: head or tail.
Random event (随机事件): If it is not known before a trial whether an event will occur, the event is called a random event.
e.g., throwing a 6-sided die: 1, 2, 3, 4, 5, 6
Probability (概率) is a measure of the likelihood that a random event occurs. It is denoted by P, and 0 ≤ P ≤ 1.
When all possible outcomes are equally likely,
P = a/n = (number of outcomes of interest) / (number of possible outcomes)
• If P = 0, the event is impossible
• If P = 1, the event is certain
1) Tossing a two-sided coin: head or tail.
P(head facing up) = 1/2
2) Throwing a six-sided die: faces 1, 2, 3, 4, 5, 6.
P(number '6' facing up) = 1/6
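The two examples can be checked with a minimal Python sketch of P = a/n, assuming every outcome is equally likely. The function name `classical_probability` is hypothetical.

```python
from fractions import Fraction

def classical_probability(outcomes, is_of_interest):
    """P = a / n for a finite list of equally likely outcomes."""
    favourable = sum(1 for outcome in outcomes if is_of_interest(outcome))
    return Fraction(favourable, len(outcomes))

coin = ["head", "tail"]
die = [1, 2, 3, 4, 5, 6]

p_head = classical_probability(coin, lambda face: face == "head")
p_six = classical_probability(die, lambda face: face == 6)

# Boundary cases: an impossible event and a certain event.
p_impossible = classical_probability(die, lambda face: face == 7)
p_certain = classical_probability(die, lambda face: 1 <= face <= 6)

print(p_head, p_six, p_impossible, p_certain)  # prints: 1/2 1/6 0 1
```

Using `Fraction` keeps the results exact, so they match the fractions written on the slide.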
Population
The population is the research target about which conclusions are drawn. Examples:
• The blood glucose concentration of females aged 25-39 in Hubei province
• The weight of boys aged 5-10 years in Wuhan city
Full data set: X1, X2, …, XN
Population: the universal set of all objects under study.
Sample: any subset of the population.
A large population may be impractical and costly to study, because that would mean collecting data from every member of the population. A sample is more manageable and easier to study.
After the data are collected and organized, a summary is made, such as average values. The hope is that valid conclusions about the whole population can be drawn from the sample data.
It is therefore important that the sample data be representative of the population; otherwise the conclusions may be invalid. Conclusions are only as reliable as the sampling process, and information can change from sample to sample.
Sample
(Diagram: a sample drawn from a small population and a sample drawn from a large population.)
Subset of data: x1, x2, …, xn
Sample (样本): a subset of the population from which conclusions about the population can be drawn.
Sample size (样本量), n: the number of individual observations in a sample.
Random sample
A random sample is a subset of the population obtained by random sampling. Conclusions about the population can be drawn from the sample.
• Equal chance: each member of the population has an equal chance of being selected.
• Independence: the selection of any member of the population does not influence the selection of any other member.
Random sample (随机样本): each member of the population has an equal chance of being selected, and the selections are independent of each other.
How to draw a random sample?
There are many methods for drawing a random sample, all of which rely on random numbers, for example:
• Simple random sampling
• Systematic random sampling
• Cluster random sampling
• Stratified random sampling
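The four methods named above can be sketched with Python's standard `random` module. This is a simplified illustration under stated assumptions: the function names, the two strata, and the seven groups are hypothetical, and real survey designs need more care (for example, when the population size is not a multiple of the sample size).

```python
import random

def simple_random_sample(population, n, rng):
    """Every member has the same chance; drawn without replacement."""
    return rng.sample(population, n)

def systematic_sample(population, n, rng):
    """Random start, then every k-th member, with k = N // n."""
    step = len(population) // n
    start = rng.randrange(step)
    return [population[start + i * step] for i in range(n)]

def stratified_sample(strata, fraction, rng):
    """Draw the same fraction from each stratum (dict: name -> members)."""
    chosen = []
    for members in strata.values():
        k = max(1, round(len(members) * fraction))
        chosen.extend(rng.sample(members, k))
    return chosen

def cluster_sample(clusters, n_clusters, rng):
    """Pick whole clusters at random and keep all of their members."""
    picked = rng.sample(list(clusters), n_clusters)
    return [member for name in picked for member in clusters[name]]

rng = random.Random(0)                 # fixed seed so the sketch is repeatable
students = list(range(1, 71))          # hypothetical class of 70 students

# Hypothetical strata and clusters, chosen only for illustration.
strata = {"stratum_1": students[:40], "stratum_2": students[40:]}
clusters = {f"group_{g}": students[g * 10:(g + 1) * 10] for g in range(7)}

srs = simple_random_sample(students, 10, rng)
systematic = systematic_sample(students, 10, rng)
stratified = stratified_sample(strata, 0.1, rng)
clustered = cluster_sample(clusters, 2, rng)
```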
Table of random numbers
10 09 73 25 33 76 52 01 35 16 35 67 23 48 79 80 93 90 11 16
37 24 20 48 05 64 89 47 42 96 24 80 52 40 37 20 63 61 04 02
08 42 26 80 53 19 64 50 83 03 23 20 90 25 60 15 35 53 47 78
Example: randomly sampling 10 students from a class of 70 students.
1. Assign each student in the class a unique number from 1 to 70.
2. Choose the starting 2-digit number in the random number table, e.g., the number in the 1st row and 6th column, which is 76.
3. Read the random numbers in order, beginning with the starting number. If a number is greater than 70 or equal to 0, skip to the next number; if a number has already been recorded, skip it as well; otherwise, record it. Continue until 10 different random numbers between 1 and 70 have been recorded.
4. Select the students whose numbers were recorded, i.e., the 10 students with the numbers 52, 01, 35, 16, 67, 23, 48, 11, 37, 24.
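The four steps above can be sketched in Python using the three table rows exactly as printed. The function name `draw_from_table` and its parameters are hypothetical.

```python
# The three rows of the random number table shown above.
TABLE = [
    "10 09 73 25 33 76 52 01 35 16 35 67 23 48 79 80 93 90 11 16",
    "37 24 20 48 05 64 89 47 42 96 24 80 52 40 37 20 63 61 04 02",
    "08 42 26 80 53 19 64 50 83 03 23 20 90 25 60 15 35 53 47 78",
]

def draw_from_table(table, start_row, start_col, class_size, sample_size):
    """Read 2-digit numbers row by row from a 1-based starting position."""
    numbers = [int(token) for row in table for token in row.split()]
    per_row = len(table[0].split())
    start = (start_row - 1) * per_row + (start_col - 1)

    chosen = []
    for value in numbers[start:]:
        if value == 0 or value > class_size:
            continue              # out of range: skip to the next number
        if value in chosen:
            continue              # repeated number: skip it as well
        chosen.append(value)
        if len(chosen) == sample_size:
            return chosen
    raise ValueError("table exhausted before the sample was complete")

selected = draw_from_table(TABLE, start_row=1, start_col=6,
                           class_size=70, sample_size=10)
print(selected)  # [52, 1, 35, 16, 67, 23, 48, 11, 37, 24]
```

The result matches the student numbers listed in step 4.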
Parameter and statistic
Population: $X_1, X_2, \ldots, X_N$
• Parameter (参数): $F(X_1, X_2, \ldots, X_N)$, denoted by a Greek letter
• Population mean: $\mu = \frac{1}{N}\sum_{i=1}^{N} X_i$
Sample: $x_1, x_2, \ldots, x_n$
• Statistic (统计量): $f(x_1, x_2, \ldots, x_n)$, denoted by an English letter
• Sample mean: $\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i$
What is a good estimate of a parameter?
• Statistic (known): $\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i$
• Parameter (unknown): $\mu = \frac{1}{N}\sum_{i=1}^{N} X_i$
A good estimate is:
• unbiased
• precise
• consistent
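The "unbiased" property can be illustrated with a small Python sketch. It assumes, purely for illustration, that five weights form the entire population (N = 5) and enumerates every possible sample of size n = 2; the average of all the sample means then equals the population mean.

```python
from itertools import combinations
from statistics import mean

# Hypothetical population: five weights treated as the whole population.
population = [70, 57, 55, 79, 60]
mu = mean(population)                       # parameter: population mean

# Every possible sample of size n drawn without replacement.
n = 2
sample_means = [mean(sample) for sample in combinations(population, n)]

# Individual sample means differ from mu, but their average does not.
mean_of_sample_means = mean(sample_means)

print(f"mu = {mu}")
print(f"sample means range from {min(sample_means)} to {max(sample_means)}")
print(f"average of all sample means = {mean_of_sample_means}")
```

In practice only one sample is observed and the parameter is unknown, which is why the properties of the estimator matter.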
Types of error
Error: the difference between the true value and the estimate.
• Systematic error (系统误差): avoidable
• Random error (随机误差): unavoidable
Sampling error (抽样误差): a type of random error that arises from random sampling and from the variation between individuals.
Random variable (随机变量): a variable measured with random error.
Fixed variable (固定变量): a variable measured without random error.
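Sampling error can be made visible with a short simulation. The sketch below assumes a hypothetical, simulated population of 10,000 birth weights (the mean of 3300 g and standard deviation of 450 g are made-up values) and measures how far sample means fall from the population mean for two sample sizes.

```python
import random
from statistics import mean, pstdev

rng = random.Random(2024)   # fixed seed so the sketch is repeatable

# Hypothetical population: simulated birth weights in grams.
population = [rng.gauss(3300, 450) for _ in range(10_000)]
mu = mean(population)

def sampling_errors(n, repeats=200):
    """Sample mean minus population mean, for repeated random samples."""
    return [mean(rng.sample(population, n)) - mu for _ in range(repeats)]

errors_small = sampling_errors(10)
errors_large = sampling_errors(100)

# Spread of the sampling error for each sample size.
spread_small = pstdev(errors_small)
spread_large = pstdev(errors_large)

print(f"n = 10:  spread of sampling error = {spread_small:.1f} g")
print(f"n = 100: spread of sampling error = {spread_large:.1f} g")
```

The error changes from sample to sample and cannot be removed, but in this simulation its spread is smaller for the larger sample size.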
Types of statistical analysis
• Descriptive analysis (描述分析)
• Inferential analysis (推断分析)
Summary
• Health statistics
• Types of data
• Population and sample
• Random sampling
• Parameter and statistic
• Random event & probability
Reading
• Bowers, David. Medical Statistics from Scratch: An Introduction for Health Professionals (Second Edition). John Wiley & Sons Ltd, 2008.
• Weiss, Neil A. Introductory Statistics (9th edition). Addison-Wesley, 2012.
• Health Statistics (《卫生统计学》), 7th edition, Chapter 1: Introduction (绪论)
The end