Analysis of Medical Data - Florida State University College of Medicine
Analysis of Medical Data
Research Perspective
Nancy B. Clark, M.Ed.
Director of Medical Informatics Education
FSU College of Medicine
Spring 2004
http://www.med.fsu.edu/informatics
Objectives
Review statistical concepts to be on Step 1.
Determine what data exist relative to a clinical
question or formal hypothesis
use IT to locate existing data sources
identify and locate existing data sets
Within institution
Outside institution
Analyze, interpret and report findings
Select and use appropriate computer software: Excel,
SPSS
Use software to perform simple statistical analysis and
portray results graphically
Interpret reports
Prerequisite Skills
(Step 1 USMLE)
• Fundamental concepts of measurement
• Scales of measurement
• Distribution, central tendency, variability, probability
• Disease prevalence and incidence
• Disease outcomes (eg, fatality rates)
• Associations (correlation or covariance)
• Health impact (eg, risk differences and ratios)
• Sensitivity, specificity, predictive values
More Prerequisite Skills
(Step 1 USMLE)
Fundamental concepts of hypothesis testing
and statistical inference
Confidence intervals
Statistical significance and type I error
Statistical power and type II error
More Step 1 Topics
Fundamental concepts of study design
Types of experimental studies (eg, clinical trials,
community intervention trials)
Types of observational studies (eg, cohort, case-control,
cross-sectional, case series, community surveys)
Sampling and sample size
Subject selection and exposure allocation (eg,
randomization, stratification, self-selection,
systematic assignment)
Outcome assessment
Internal and external validity
Scales of Measure
Nominal – qualitative classification of equal
value: gender, race, color, city
Ordinal - qualitative classification which
can be rank ordered: socioeconomic status
of families
Interval - Numerical or quantitative data:
can be rank ordered and sizes compared:
temperature
Ratio - interval data with absolute zero
value: time or space
Distribution, Central Tendency…
Mean
…Variability, Probability…
Mean
Median
Mode
Standard deviation
Statistical Significance p < .01
Confidence Interval
Statistical Significance
Type I and Type II errors
Null Hypothesis = Ho
                    Ho True             Ho False
Reject Ho           Type I error        Correct decision
Do Not Reject Ho    Correct decision    Type II error
Statistics Online Textbook
The Statistics Homepage
http://www.statsoftinc.com/textbook/stathome.html
Disease Prevalence and Incidence
Prevalence
probability of disease in entire population at any
point in time
2% of the population has diabetes
Incidence
probability that patient without disease develops
disease during interval
0.2% or 2 per 1000 new cases per year
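The two definitions above can be sketched as simple proportions. This is a minimal illustration only; the function names and the population counts are hypothetical and chosen to reproduce the 2% and 2 per 1000 figures on the slide.

```python
def prevalence(existing_cases: int, population: int) -> float:
    """Proportion of the whole population that has the disease at one point in time."""
    return existing_cases / population


def incidence(new_cases: int, at_risk: int) -> float:
    """Proportion of initially disease-free people who develop the disease during an interval."""
    return new_cases / at_risk


# Hypothetical counts: 200 people with diabetes in a population of 10,000.
print(f"Prevalence: {prevalence(200, 10_000):.1%}")

# Hypothetical counts: 20 new cases in one year among 10,000 disease-free people.
rate = incidence(20, 10_000)
print(f"Incidence: {rate:.1%} per year, or {rate * 1000:.0f} per 1000 per year")
```

Note that the denominators differ: prevalence divides by everyone, incidence only by those who were free of disease at the start of the interval.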
Sensitivity, Specificity
Sensitivity = a / (a+c)
Specificity = d / (b+d)

                    Patients with disease    Patients without disease
Test is positive    a                        b
Test is negative    c                        d
Predictive Value
Positive predictive value = a / (a+b)
Negative predictive value = d / (c+d)
Post-test probability of disease given positive test = a / (a+b)
Post-test probability of disease given negative test = c / (c+d)

                    Patients with disease    Patients without disease
Test is positive    a                        b
Test is negative    c                        d
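As a worked check of the formulas for sensitivity, specificity and predictive values, here is a small sketch that takes the four cells a, b, c, d of the 2×2 table and returns each measure. The function name and the example counts are hypothetical, not taken from the lecture.

```python
def diagnostic_measures(a: int, b: int, c: int, d: int) -> dict:
    """Compute test characteristics from a 2x2 table.

    a = diseased, test positive      b = not diseased, test positive
    c = diseased, test negative      d = not diseased, test negative
    """
    return {
        "sensitivity": a / (a + c),
        "specificity": d / (b + d),
        "positive_predictive_value": a / (a + b),
        "negative_predictive_value": d / (c + d),
        # Chance that disease is present despite a negative test.
        "post_test_prob_given_negative": c / (c + d),
    }


# Hypothetical study: 100 patients with disease, 900 without.
results = diagnostic_measures(a=90, b=30, c=10, d=870)
for name, value in results.items():
    print(f"{name}: {value:.3f}")
```

The negative predictive value and the post-test probability of disease given a negative test always sum to 1, which is a quick way to check the arithmetic.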
Good Resource Sen, Spc, PV
An Introduction to Information Mastery
http://www.poems.msu.edu/InfoMastery/default.htm
Diagnosis
Sensitivity and specificity
Predictive values
Likelihood ratios
InfoRetriever
Calculators: Epidemiology, Diagnostic test
Fundamental Concepts of Study Design
Good Resource
Epidemiology for the Uninitiated
BMJ
Online Textbook
http://bmj.com/collections/epidem/epid.shtml
Finding Health Statistics
Types of Health Statistics Questions
Fact lookups
Research
Presentations
Social and Policy indicators
Strategies for Finding Health Stats
Use Portal
Start at Internet site
Start with book or article
Internet Portals of Health Stats
Lists of links that provide starting points for
browsing or searching
Keyword search in portal vs Google
General idea what you want
The Related Health Services Research Web
Sites
http://www.nlm.nih.gov/nichsr/hsrsites.html
The NCHS portal: http://www.cdc.gov/nchs/
Other Statistical Web Sites
CDC Data and Statistics
http://www.cdc.gov/scientific.htm
FedStats Home Page
http://www.fedstats.gov/
Compare these two
U Michigan’s Statistical Resources on the
WEB – HEALTH
What type of stats
Lexis-Nexis Statistical Universe
Subscription resource
Searches stat data
Subject List
Limit search
Reports or tables
http://web.lexisnexis.com/statuniv?B1=Connect+to+Statistical+Universe
MMWR
Morbidity – illness
Mortality – death
http://www.cdc.gov/mmwr/
Disease Trends
Tables - searchable
Health Care Data
Healthcare Cost and Utilization Project
HCUPnet
Hospital discharges
Ambulatory service
Costs
Amount of care
By diagnosis and procedure
Surveys of hospitals, physicians, nursing homes
Health Consequences
Costs to society, individuals
Cost from care
Costs of illness
Impact on infrastructure
HCFA=>CMS Health Accounts
http://www.cms.hhs.gov/statistics/nhe/default.asp
State and International Data
Floridahealthstat.com - Where Florida Health
Data Resides
DOH Epidemiology
KFF State Health Facts Online
United Nations Statistics Division
World Health Organization Research Tools
Individual Datasets
EMR
Billing
CDCS
Customized data collection tools
Data Analysis
Selecting the Appropriate Software
Spreadsheet
Numerical (interval or
ratio) data
Sums
Averages
Standard deviations
Simple charts and graphs
Statistical Software
Nominal or Ordinal data
Comparisons of two+
groups
Frequency tables
Complicated charts and
graphs
Normal curves
Class intervals
Statistical significance
Spreadsheets
Excel
Pocket Excel
Data Tables
Field names at top
Each row is a record (sample)
Sorting whole table
By one column
By more than one column
Sorting individual sections
Descriptive Statistics
Distribution
frequency distribution
Histogram
Central tendency
Mean
Median
mode
Dispersion
Range
Standard deviation
Variance
N
Not P (inferential
stats)
Central Tendency
Mean
=AVERAGE(B2:B1500)
Median
=MEDIAN(A2:A7)
Mode
=MODE(A2:A7)
N
=COUNT(A2:A1500)
=COUNTBLANK(A2:B5)
Dispersion
Range
=MAX(A2:A60)-MIN(A2:A60)
Standard deviation
=STDEV(A2:A110)
Variance
=VAR(A2:A110)
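For readers who want to check spreadsheet results outside Excel, the same descriptive statistics can be sketched with Python's standard `statistics` module. The data values are hypothetical. `statistics.stdev` and `statistics.variance` use the sample (n - 1) formulas, which is also what Excel's STDEV and VAR compute.

```python
import statistics

# Hypothetical column of measurements (the lecture's examples2.xls is not reproduced here).
values = [2, 4, 4, 5, 7, 8]

summary = {
    "n": len(values),                          # like =COUNT(...)
    "mean": statistics.mean(values),           # like =AVERAGE(...)
    "median": statistics.median(values),       # like =MEDIAN(...)
    "mode": statistics.mode(values),           # like =MODE(...)
    "range": max(values) - min(values),        # like =MAX(...)-MIN(...)
    "stdev": statistics.stdev(values),         # sample standard deviation, like =STDEV(...)
    "variance": statistics.variance(values),   # sample variance, like =VAR(...)
}

for name, value in summary.items():
    print(f"{name}: {value}")
```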
Distribution
Frequency distribution
Not easy – use SPSS
FREQUENCY(data_array,bins_array)
Use help
Histogram
Bar chart of frequency table
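To show what a frequency table by class interval involves, here is a rough sketch that mimics the counting rule of Excel's FREQUENCY function: each value is counted in the first bin whose upper limit is greater than or equal to it, and one extra count holds values above the last limit. The function name, the ages and the bin limits are hypothetical.

```python
from bisect import bisect_left


def frequency(data, bins):
    """Count values per class interval; bins are ascending upper limits.

    Returns len(bins) + 1 counts: one per bin, plus one for values above the last limit.
    """
    counts = [0] * (len(bins) + 1)
    for value in data:
        # Index of the first upper limit that is >= value.
        counts[bisect_left(bins, value)] += 1
    return counts


# Hypothetical ages and class intervals (up to 39, 40-49, 50-59, 60-69, 70 and over).
ages = [23, 35, 35, 41, 47, 52, 58, 63, 70, 81]
upper_limits = [39, 49, 59, 69]
counts = frequency(ages, upper_limits)

# A text histogram: one bar per row of the frequency table.
labels = ["<=39", "40-49", "50-59", "60-69", ">=70"]
for label, count in zip(labels, counts):
    print(f"{label:>6} | {'#' * count} ({count})")
```

A histogram is then just a bar chart of these counts, which is what the final loop prints in text form.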
Hands on experience
Analyze data in examples2.xls
Statistical Software
Intro to SPSS
Statistical Software
SPSS
Provided by request/justification
Lab Computers
Start => Programs => SPSS for Windows =>
SPSS 11.0 for Windows
Start Screen
Don’t show this dialog
in the future.
OK
Open Breast Cancer Survival
Data View
Views
Variables
View
File Information
Utilities Menu
File Info…
Output window
Descriptive Statistics
Analyze Menu
Descriptive Statistics
Frequencies
Select Age ►
Click Statistics button
In Central Tendency
Mean, Median, Mode
In Dispersion
Standard Deviation,
variance
In Percentile Values
Quartiles
Continue
OK
Graphing
Graphs Menu
Pie…
Summary for Groups of
cases
Lymph Nodes ►
OK
Histogram with Normal Curve
Graphs Menu
Histogram…
Select Age ►
Check Display Normal
Curve
OK
Simple Correlation Analysis
Age and Tumor Size
Analyze Menu
Correlate…
Bivariate
Select Age ►
Select Pathological Tumor Size ►
Check Pearson and Spearman – Two tailed
OK
Is there a correlation? Negative or Positive?
Is it statistically significant?
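To make the two coefficients offered in the Bivariate dialog concrete, the sketch below computes Pearson's r from its definition and Spearman's rho as Pearson's r on ranks (ties receive the average rank). The ages and tumor sizes are made-up values, not the Breast Cancer Survival data, and the sketch does not compute the p-value that SPSS reports for significance.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation: covariance scaled by both standard deviations."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def average_ranks(values):
    """Rank values from 1..n, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranked data."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical patients: age in years and tumor size in cm.
age = [34, 41, 47, 52, 58, 63, 70]
tumor_size = [3.1, 2.8, 2.9, 2.2, 2.4, 1.9, 1.5]

print(f"Pearson r:    {pearson(age, tumor_size):.3f}")
print(f"Spearman rho: {spearman(age, tumor_size):.3f}")
```

The sign of each coefficient answers the "negative or positive" question; whether the result is statistically significant still has to be read from the Sig. (2-tailed) row of the SPSS output.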
Save Output
Save on All Users drive
Under Nancy.clark
SPSS Output Files
Name it with your name, e.g., KerryBachista.spo
Importing Data
From Excel, SAS,
dBase, etc.
Variable names first
row
File Menu, Open
Files of Type
Data…
Excel
Tutorial, Samples
Demo.exe
Type in Labels
Pick Type of variable
Enter Value Labels
Etc.
SPSS Tutorials
In the Help Menu
On Informatics Web page
Books:
Statistics for Social & Health Research (Sage)
Argyrous, George
Statistics Applied to Clinical Trials (Kluwer
Academic Publishers)
Cleophas, Ton J., et al
Objectives
Determine what data exist relative to a clinical
question or formal hypothesis
use IT to locate existing data sources
identify and locate existing data sets
Within institution
Outside institution
Analyze, interpret and report findings
Select appropriate computer software: Excel, SPSS
Use software to perform simple statistical analysis and
portray results graphically
Interpret reports
Questions?