Mining biomedical terminology from literature
CDT Seminar Overview:
Health Informatics
Clinical informatics
Large amount of clinical data – BIG DATA
EHR, hospital discharge letters
guidelines, protocols, etc.
tests, measurements,
medical literature (case notes, ...)
Ultimate aim: MAKING SENSE OF THIS DATA
to support clinical research and facilitate
clinical decision support
Close collaboration with clinical teams and the pharmaceutical industry, both locally and more widely
Health e-research centre (HeRC)
New £18M centre to be opened soon
[Diagram labels: Datasets; Methods; Experts; Ingredients; Link; Insights; Data Quality; Value; Science and Industry (R&D); Improved Care for Patients and Communities (Service)]
Health e-research centre (HeRC)
CS areas in need
Data management
Machine learning, data mining
Text mining
Information management
privacy preservation
User interface design
High-performance computing
Knowledge management
ontologies, logics, Bayesian modelling
reasoning
Clinical text mining
Extract data from Electronic Health Records (EHRs)
Challenges
Highly condensed text
often without proper sentences
list of medications, symptoms, acronyms, etc.
Terminological variability and ambiguity
orthographic, acronyms, local conventions
Various sections
previous history, social/family background
Recording “practices” vary
aneurysm size: ‘large’ vs. ‘between 20-30mm’
Patient: X
Date: 12.02.2007.
Medication: Enalapril 20mg
Duration: 7 days
Frequency: 2 X 1
Mode: oral
Reason: hypertension
Dg. cardiac arrest, ….
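A record laid out like the one above can be turned into structured fields with very little code. The following is a minimal sketch, not the method used in the work presented here: the function names `parse_record` and `parse_medication`, the dose pattern, and the list of units are all illustrative assumptions, and real EHR text is far less regular than this sample.

```python
import re

# Sample text copied from the example record (a subset of its lines).
RECORD = """Patient: X
Date: 12.02.2007.
Medication: Enalapril 20mg
Duration: 7 days
Frequency: 2 X 1
Mode: oral"""


def parse_record(text):
    """Split 'Field: value' lines into a dict keyed by lower-cased field name."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep and value.strip():
            fields[key.strip().lower()] = value.strip()
    return fields


# Hypothetical pattern: a drug name followed by a number and a unit.
DOSE = re.compile(
    r"^(?P<drug>[A-Za-z][A-Za-z\- ]*?)\s+"
    r"(?P<amount>\d+(?:\.\d+)?)\s*"
    r"(?P<unit>mg|g|ml|mcg)$",
    re.IGNORECASE,
)


def parse_medication(value):
    """Separate drug name, dose amount and unit; fall back to the raw string."""
    match = DOSE.match(value)
    if match is None:
        return {"drug": value, "amount": None, "unit": None}
    return {
        "drug": match.group("drug"),
        "amount": float(match.group("amount")),
        "unit": match.group("unit").lower(),
    }


fields = parse_record(RECORD)
medication = parse_medication(fields["medication"])
```

The fallback branch matters in practice: when the dose pattern does not match, the raw string is kept rather than discarded, so unusual notations can still be reviewed later.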
Example: extract status of diseases
UoM performance (ranked 1st/28)
Micro-average: Accuracy (0.9723)
Macro-average: P (0.8482), R (0.7737), F-score (0.8052)
Class  #Eval  #Corr  #Gold  Precision  Recall  F-score
Y      2267   2132   2192   0.9404     0.9726  0.9562
N      56     40     65     0.7142     0.6153  0.6611
Q      12     9      17     0.7500     0.5294  0.6206
U      5709   5640   5770   0.9879     0.9774  0.9826
Yang, H., Spasic, I., Keane, J., Nenadic, G.: A Text Mining Approach to the Prediction of a
Disease Status from Clinical Discharge Summaries, JAMIA 16(4):596-600
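The relationship between the per-class counts and the averaged scores reported above can be checked with a short calculation. This sketch assumes the usual definitions (precision = #Corr / #Eval, recall = #Corr / #Gold, F-score as their harmonic mean, macro-average as the unweighted mean over the four classes, and micro-averaged accuracy as total correct over total gold); the variable and function names are illustrative.

```python
# (#Eval, #Corr, #Gold) per class, as reported above.
COUNTS = {
    "Y": (2267, 2132, 2192),
    "N": (56, 40, 65),
    "Q": (12, 9, 17),
    "U": (5709, 5640, 5770),
}


def prf(n_eval, n_corr, n_gold):
    """Return (precision, recall, F-score) for one class."""
    precision = n_corr / n_eval
    recall = n_corr / n_gold
    f_score = 2 * precision * recall / (precision + recall)
    return precision, recall, f_score


per_class = {label: prf(*counts) for label, counts in COUNTS.items()}

# Macro-average: every class counts equally, however rare it is.
macro_p, macro_r, macro_f = (
    sum(scores[i] for scores in per_class.values()) / len(per_class)
    for i in range(3)
)

# Micro-average: every decision counts equally, so frequent classes dominate.
total_corr = sum(counts[1] for counts in COUNTS.values())
total_gold = sum(counts[2] for counts in COUNTS.values())
micro_accuracy = total_corr / total_gold
```

Under these assumptions the computed averages agree with the reported figures to within rounding. The gap between the micro-average (about 0.97) and the macro-averaged F-score (about 0.81) comes from the rare N and Q classes, which score much lower but contribute few decisions.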
Clinical “narratives”
very anxious
dry cough
feeling low
no heroin use
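Short narrative phrases like these carry an assertion status as well as a concept: a finding may be reported as present or as absent. The sketch below shows a deliberately naive cue-word check for negation. The cue list and the function name `is_negated` are assumptions made for illustration; established approaches such as NegEx also model the scope of each cue, which this sketch does not.

```python
import re

# Illustrative, far from exhaustive, list of negation cue words.
NEGATION_CUES = frozenset({"no", "not", "denies", "without"})


def is_negated(phrase):
    """Return True if any token of the phrase is a known negation cue."""
    tokens = re.findall(r"[a-z]+", phrase.lower())
    return any(token in NEGATION_CUES for token in tokens)


phrases = ["very anxious", "dry cough", "feeling low", "no heroin use"]
status = {p: ("absent" if is_negated(p) else "present") for p in phrases}
```

Matching whole tokens rather than substrings avoids false hits on words that merely contain a cue, such as "normal" or "nose".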
Mining health-care Web 2.0
Sentiment mining of health-related social media
e-epidemiology
suicide prevention
quality of life assessment
...
HeRC research themes
CoOP
“Coproducing observation with patients”
MOD
“Missed opportunities detector”
SEA-3
“Scalable endotypes of asthma, allergies and andrology”
DOT
Diabesity outcomes translator
FIN
Trials feasibility improvement network
Linked2Safety
An advanced environment for clinical research based on clinical care information in EHRs and clinical trial systems
a) early detection of patient safety issues
b) identification of adverse events
c) identification of suitable cohorts for clinical trials
Use semantic technologies (Linked Data) and data/text analytics
Inter-disciplinary at Manchester, involving CS, Medicine and Mathematics
http://www.linked2safety-project.eu/
Clinical document management
Dynamic documentation knowledge services
find the right forms/questions depending on the patient and clinical observations
reasoning
present them to the users
Tasks/areas
Modelling (ontologies, description logics, SW)
Data analytics and integration
User interface design
Systems biology
Large-scale extraction and contextualization of biomolecular events
extraction of host-pathogen interactions
molecular modelling of thyroid carcinogenesis using text mining
Modelling dynamics of small blood vessels and roles of smooth muscle cells
combine literature mining and structured data
Contacts
Goran Nenadic
text mining, information management
e-health research
Bijan Parsia
Knowledge management, reasoning
GUI
John Keane
data management/analytics
decision support systems