HeqepWorkshopHI_Latifulx


Research in Health Informatics:
Bangladesh Perspective
Dr. Abu Sayed Md. Latiful Hoque
Professor, Dept. of CSE, BUET
Workshop on Health Data Analytics
Outline
• Motivation
• Introduction
• Health Informatics, Data, and Problems
• Research on the Development of Data Marts and Warehouses and Applications
• Research on Data Mining for Better Health Services
• Conclusions
Motivation for Developing National Health
Data Warehouse (NHDW) of Bangladesh
Bangladesh needs to develop an NHDW for:
• Better healthcare delivery
• Better health-related research
• Better health monitoring and administration

Developed countries such as the US, UK, and Australia have already developed such data warehouses.
Motivation (2)
Define a national reference level
WHO's hemoglobin thresholds used to define anemia [29]
In Bangladesh, the rule of thumb for non-pregnant women over 15 years of age is: Hb > 11 is good; Hb >= 10.0 is acceptable; Hb below 10 means medication is needed.
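The rule of thumb above can be written as a small function. This is only a minimal sketch: the function name `classify_hb`, the category labels, and the unit (g/dL) are assumptions made for illustration and are not stated on the slide.

```python
def classify_hb(hb: float) -> str:
    """Classify hemoglobin for a non-pregnant woman over 15 years of age.

    Thresholds follow the rule of thumb quoted on the slide. The unit
    (g/dL) is an assumption, since the slide does not state it.
    """
    if hb > 11.0:
        return "good"
    if hb >= 10.0:
        return "ok"
    return "medication needed"


# Hypothetical readings, for illustration only.
for value in (12.3, 11.0, 10.0, 9.4):
    print(value, classify_hb(value))
```

A nationally defined reference level would replace these hard-coded thresholds with values derived from the warehouse itself.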
Motivation (3)
National health trend analysis
Motivation (4)
Location dependency analysis
• Finding the impact of different regions of Bangladesh on test results
  - Example: the effect of arsenic in Chandpur
• Forecasting of disease, virus, or epidemic
  - Example: identifying the locations whose palm tree juice is most suspect for Nipah virus
Motivation (5)
Redundant/Fraudulent Testing Awareness
Suppose that, for a costly test T3, the following rule holds:
Age(X, <30) ^ Gender(X, 'M') => Negative(X, T3)
[Support = 70%, Confidence = 95%]
National awareness can then be developed to avoid performing
the test at the initial stage for young males.
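Support and confidence of such a rule can be computed directly from test records. The sketch below uses the standard definitions (support = share of all records matching both sides of the rule; confidence = share of antecedent-matching records that also match the consequent). The field names and the ten toy records are hypothetical and are not taken from the NHDW, so the resulting figures differ from those on the slide.

```python
def support_confidence(records, antecedent, consequent):
    """Return (support, confidence) of the rule antecedent => consequent."""
    records = list(records)
    n_total = len(records)
    n_antecedent = sum(1 for r in records if antecedent(r))
    n_both = sum(1 for r in records if antecedent(r) and consequent(r))
    support = n_both / n_total if n_total else 0.0
    confidence = n_both / n_antecedent if n_antecedent else 0.0
    return support, confidence


# Hypothetical toy data: 7 young males test negative, 1 tests positive,
# and 2 older females test positive.
records = (
    [{"age": 25, "gender": "M", "t3": "negative"}] * 7
    + [{"age": 27, "gender": "M", "t3": "positive"}] * 1
    + [{"age": 45, "gender": "F", "t3": "positive"}] * 2
)

support, confidence = support_confidence(
    records,
    antecedent=lambda r: r["age"] < 30 and r["gender"] == "M",
    consequent=lambda r: r["t3"] == "negative",
)
print(f"support={support:.0%}, confidence={confidence:.1%}")
```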
Introduction

What Is Data Mining?

Data mining (knowledge discovery from data)
• Extraction of interesting (non-trivial, implicit, previously unknown, and potentially useful) patterns or knowledge from huge amounts of data

Alternative names
• Knowledge discovery (mining) in databases (KDD), knowledge extraction, data/pattern analysis, data archeology, data dredging, information harvesting, business intelligence, etc.
Knowledge Discovery (KDD) Process


• This is a view from typical database systems and data warehousing communities
• Data mining plays an essential role in the knowledge discovery process
[Figure: KDD process. Databases -> Data Cleaning and Data Integration -> Data Warehouse -> Selection -> Task-relevant Data -> Data Mining -> Pattern Evaluation]
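The stages of the KDD process can be sketched as a chain of small functions. This is an illustrative toy pipeline, not the workflow used in the presented research: all function names, field names, and records are hypothetical, and the "mining" step is just a frequency count.

```python
from collections import Counter


def clean(rows):
    """Data cleaning: drop rows with a missing test result."""
    return [r for r in rows if r.get("result") is not None]


def integrate(*sources):
    """Data integration: merge several cleaned sources into one store."""
    return [row for source in sources for row in source]


def select(warehouse, test_name):
    """Selection: keep only the task-relevant rows."""
    return [r for r in warehouse if r["test"] == test_name]


def mine(rows):
    """Data mining: count results per district (a trivial 'pattern')."""
    return Counter((r["district"], r["result"]) for r in rows)


def evaluate(patterns, min_count):
    """Pattern evaluation: keep patterns seen at least min_count times."""
    return {k: v for k, v in patterns.items() if v >= min_count}


# Hypothetical source databases.
lab_a = [
    {"district": "Chandpur", "test": "arsenic", "result": "high"},
    {"district": "Chandpur", "test": "arsenic", "result": "high"},
    {"district": "Dhaka", "test": "arsenic", "result": None},
]
lab_b = [
    {"district": "Dhaka", "test": "arsenic", "result": "normal"},
    {"district": "Dhaka", "test": "hb", "result": "normal"},
]

warehouse = integrate(clean(lab_a), clean(lab_b))
patterns = evaluate(mine(select(warehouse, "arsenic")), min_count=2)
print(patterns)
```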
Research on Health Data

Using effective data mining tools and algorithms, it is possible to
produce useful information from health datasets
Applications of Health Data Mining
Patient Visit Cycle in Bangladesh
[Figures: patient visit cycle, continued over five slides]
Existing NHDW Overview
DGHS now has:
• 33 aggregated data sets
• Most health programs are included
• Data entered for up to 4,501 union-level facilities
• 3 individual data sets
• Data entered from 13,000 community clinics
Research on Health Data: Challenges
• Architecture of the National Health DW of Bangladesh
• Preprocessing
  - Missing value problem
  - Noisy data problem
  - Data transformation
• Record linkage
  - No standard patient identification in the data
  - Missing value problem
  - Noisy data problem
  - Privacy preservation
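The three preprocessing problems listed above can be illustrated with a few lines of Python. This is a minimal sketch using common generic techniques (range checks, median imputation, min-max scaling); the slides do not say which methods were actually applied, and the plausible range of 3 to 25 for hemoglobin is an assumption chosen for the example.

```python
from statistics import median


def mask_noisy(values, low, high):
    """Noisy data: turn values outside a plausible range into missing values."""
    return [v if v is not None and low <= v <= high else None for v in values]


def impute_missing(values):
    """Missing values: replace None with the median of the observed values."""
    observed = [v for v in values if v is not None]
    fill = median(observed)
    return [fill if v is None else v for v in values]


def min_max_scale(values):
    """Data transformation: rescale values linearly to the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


# Hypothetical hemoglobin readings: one missing, one obviously mistyped.
hb_raw = [12.0, None, 9.5, 120.0, 11.0]
hb_masked = mask_noisy(hb_raw, low=3.0, high=25.0)
hb_clean = impute_missing(hb_masked)
hb_scaled = min_max_scale(hb_clean)
print(hb_clean)
```

Masking noisy values before imputation keeps the mistyped reading from distorting the median used to fill the gaps.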

Proposed Architecture of NHDW, Bangladesh
Sample Fact and Dimension Tables of NHDW for Pathological Data
Designing NHDW Data Cube
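As a rough illustration of what a data cube over pathological facts involves, the sketch below counts test records for every combination of dimensions (every cuboid). The dimensions `district`, `test`, and `year`, and the three fact rows, are hypothetical; the actual NHDW schema appears only in the slide figures and is not reproduced here.

```python
from collections import defaultdict
from itertools import combinations

DIMENSIONS = ("district", "test", "year")  # hypothetical dimension names


def build_cube(facts, dimensions=DIMENSIONS):
    """Count facts for every subset of dimensions (every cuboid)."""
    cube = {}
    for size in range(len(dimensions) + 1):
        for dims in combinations(dimensions, size):
            cuboid = defaultdict(int)
            for fact in facts:
                key = tuple(fact[d] for d in dims)
                cuboid[key] += 1
            cube[dims] = dict(cuboid)
    return cube


# Hypothetical fact rows, one per pathological test performed.
facts = [
    {"district": "Chandpur", "test": "arsenic", "year": 2015},
    {"district": "Chandpur", "test": "arsenic", "year": 2016},
    {"district": "Dhaka", "test": "hb", "year": 2016},
]

cube = build_cube(facts)
print(cube[()])             # apex cuboid: total number of facts
print(cube[("district",)])  # roll-up to the district level
```

With three dimensions the cube holds 2^3 = 8 cuboids; a production warehouse would normally materialize only a subset of them.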
Snapshot of Pathological Data
Our Approach: Patient Identification Technique based on Secured Record Linkage (PITSRL)
Patient De-Identification with Linkage Preservation (PDLP) Technique
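The details of PITSRL and PDLP are given only in the slide figures, so the following is not the authors' method. It is a generic sketch of one common way to link records without exposing identities: a keyed hash (HMAC-SHA-256) over normalized identifiers serves as a pseudonymous linkage key, and the direct identifiers are then dropped. The choice of name, phone number, and birth date as identifiers, the secret key, and the sample records are all assumptions.

```python
import hashlib
import hmac

SECRET_KEY = b"demo-key-kept-by-the-data-custodian"  # hypothetical secret


def normalize(text):
    """Lowercase and strip everything except letters and digits."""
    return "".join(ch for ch in text.lower() if ch.isalnum())


def linkage_key(name, phone, birth_date, secret=SECRET_KEY):
    """Keyed hash of normalized identifiers: same patient, same key."""
    message = "|".join(normalize(part) for part in (name, phone, birth_date))
    return hmac.new(secret, message.encode("utf-8"), hashlib.sha256).hexdigest()


def deidentify(record, secret=SECRET_KEY):
    """Drop direct identifiers but keep a pseudonymous linkage key."""
    key = linkage_key(record["name"], record["phone"], record["birth_date"], secret)
    identifiers = ("name", "phone", "birth_date")
    kept = {k: v for k, v in record.items() if k not in identifiers}
    return {"patient_key": key, **kept}


# Hypothetical visits by the same person, entered with different formatting.
visit_1 = {"name": "Rahim Uddin", "phone": "01711-000000",
           "birth_date": "1980-01-15", "test": "hb", "result": 10.8}
visit_2 = {"name": "rahim  uddin", "phone": "01711000000",
           "birth_date": "1980-01-15", "test": "hb", "result": 11.4}

a, b = deidentify(visit_1), deidentify(visit_2)
print(a["patient_key"] == b["patient_key"])
```

Exact-match hashing of this kind tolerates formatting differences but not spelling errors, which is one reason noisy data makes record linkage hard.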
Dataset Quality
• Available health records: 633,609
• Health records with a valid mobile/phone number of the patient: 550,415
• Health records with no phone number: 77,021
• Health records with an invalid string in the phone number attribute: 6,173
• Health records without birth info: 41,251
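The counts above are internally consistent: the three phone-number categories add up to the total number of records. The short check below also expresses each category as a share of the total. The share with a valid phone number comes to roughly 87%, which matches the figure quoted in the conclusions; whether that figure was derived in this way is an assumption.

```python
total = 633_609
valid_phone = 550_415
no_phone = 77_021
invalid_phone = 6_173
no_birth_info = 41_251

# The three phone-number categories partition the dataset.
assert valid_phone + no_phone + invalid_phone == total


def share(count, whole=total):
    """Percentage of the whole, rounded to one decimal place."""
    return round(100 * count / whole, 1)


for label, count in [
    ("valid phone number", valid_phone),
    ("no phone number", no_phone),
    ("invalid phone string", invalid_phone),
    ("no birth info", no_birth_info),
]:
    print(f"{label}: {share(count)}%")
```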
Research on Specialized Group: Diabetic
[Figure: Conceptual View of BGL Prediction Model. Elements shown: Historical Dataset, Knowledge Acquirement, Food Intake, Insulin Type & Dose, Exercise, Stress, and Blood Glucose Level, on a time axis running from the current time t to t+PH]
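One way to read the conceptual view is as a supervised learning setup: inputs observed up to the current time t are paired with the blood glucose level at t+PH. The sketch below builds such (features, target) pairs from a time-ordered series. It assumes that PH denotes a prediction horizon measured in time steps; the field names and the toy series are hypothetical, and no particular prediction model is implied.

```python
FIELDS = ("bgl", "carbs", "insulin", "exercise", "stress")  # hypothetical names


def make_training_pairs(series, history, horizon):
    """Build (features, target) pairs from a time-ordered series.

    series  : list of dicts, one per time step, with the keys in FIELDS
    history : number of steps up to and including t used as input
    horizon : PH, the number of steps ahead at which BGL is predicted
    """
    pairs = []
    for t in range(history - 1, len(series) - horizon):
        window = series[t - history + 1 : t + 1]
        features = [step[name] for step in window for name in FIELDS]
        target = series[t + horizon]["bgl"]
        pairs.append((features, target))
    return pairs


# Hypothetical series of six time steps.
series = [
    {"bgl": 100 + 10 * i, "carbs": 30, "insulin": 4, "exercise": 0, "stress": 1}
    for i in range(6)
]

pairs = make_training_pairs(series, history=2, horizon=2)
print(len(pairs), pairs[0])
```

Any regression model, including the neural network approaches mentioned under future research, could then be trained on these pairs.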
Scopes of Future Research
Improvement of data quality:
• A dataset with a sufficient amount of variation is indispensable.
• The quality of the dataset depends on the accuracy with which lifestyle data (diet, insulin, exercise, stress, etc.) are estimated and accumulated.
• An intelligent entry module can be developed for estimating the effects of emotional events on blood glucose level.
Scopes of Future Research (2)
Development of a prediction technique:
• A model based on neural network classification techniques can be developed to predict glucose concentration with improved performance.
Development of a therapy optimizer:
• An inexpensive, hand-held ANN-based diabetes therapy system can be developed for patients, incorporating a BGL monitoring device.
Conclusions
• A health data warehouse is essential for a nation to deliver better healthcare.
• The main challenges are proper framework development, data preprocessing (e.g., cleaning, missing value imputation), record linkage establishment, and privacy preservation.
• We have proposed a national framework for integrating enormous, diverse health data to facilitate knowledge discovery.
• We have performed some preprocessing on the health data, such as cleaning and normalization.
Conclusions (2)
• Proper record linkage and privacy preservation are two major issues for integrated health systems.
• We have developed the Patient Identification Technique based on Secured Record Linkage (PITSRL) for privacy-preserved record linkage.
• For a noisy health dataset of 633,609 patients, we obtained a sufficient record linkage key for 87% of the records.
• For a training dataset of 100 patient records, PITSRL achieved 100% accuracy in identifying unique and duplicate patients.
Conclusions (3)
• Data mining techniques can be used to predict the severity of diseases, e.g., diabetes, blood pressure disorders, etc.
• Specialized patient management systems need to be developed to provide special care for critical patients.
Thank You
[email protected]