Vadu Data Sharing Initiative


Benefits of data access
Public money
Professional responsibility to share data
Data access and scientific progress
Fosters open scientific community and transparency in scientific inquiry
Allows for verification, refutation and refinement of findings
Promotes new research
Use for evidence-based policy making
Improves methods and measurements
Encourages multiple perspectives
Protects against faulty data
Increases chances of future funding

Risks of data access
Poor quality data
Ownership, authorship
  – Fear that secondary analyst may publish before
  – Concern of loss of control over data
Sponsor concerns
Financial costs for data collection / access
  – Secondary researcher gains more, loses less
Perceived risk to privacy / confidentiality
  – Breach of confidentiality
  – Statistical disclosure
Ethical use of data
  – Use for non-analytic, non-research purpose
  – Concern of integrity, competence of secondary analyst
Lack of incentives for data sharing
  – Lack of disincentives for not sharing data

Developing a Prototype for Data Sharing Among INDEPTH Network Member Sites: Building Capacity in Data Management
- Kanchanaburi (Thailand)
- Wosera (Papua New Guinea)
- Vadu (India)
- International Institute of Information Technology (I2IT), India
INDEPTH Network (SIDA SAREC small grants proposal on capacity building)

Stakeholders
Vadu, India
Kanchanaburi, Thailand
Wosera, Papua New Guinea
I2IT
INDEPTH Network

Aims
Strengthen the data sharing mechanism within INDEPTH sites and prepare the sites for sharing with other partners
Develop a prototype for data sharing among INDEPTH sites

Objectives
Define minimal and optimal data sets that allow data sharing and data analysis among sites.
Develop a standardized system with a unique data structure shared among INDEPTH sites, to allow data sharing and merging for cross-site comparisons.
Strengthen data collection systems suitable for data sharing, and promote data sharing at INDEPTH sites while safeguarding site and citizen interests.

Steps towards data access & data sharing
Commitment to data sharing
Minimum data sets
  – Technical issues
  – Technology related issues
Minimum quality assurance checks
Modality of data access
Ensuring confidentiality, anonymity
Legal, financial, ethical, scientific considerations

Technical Procedure
The site data manager sends data in the format agreed by all sites.
Each site is free to send data from whichever database manager it uses for data management.
I2IT converts the data into the centralized database, with MySQL as the back end and PHP as the front end.
I2IT uploads the converted data to its server and grants each site, as well as other users, permission to access and use the data.

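The procedure above could be sketched roughly as follows. This is an illustrative Python sketch only, not the I2IT tool (which the slides say was built with Base SAS). The common field names, the per-site column names and the sample records are all hypothetical, and an in-memory SQLite database stands in for the central MySQL database.

```python
import sqlite3

# Hypothetical common schema for the agreed minimum dataset (not from the slides).
COMMON_FIELDS = ("site", "pid", "sex", "birth_year")

# Hypothetical per-site field mappings: source column name -> common field name.
SITE_FIELD_MAPS = {
    "vadu": {"PID": "pid", "SEX": "sex", "YOB": "birth_year"},
    "kanchanaburi": {"person_id": "pid", "gender": "sex", "birth_yr": "birth_year"},
}


def transform(site, record):
    """Rename one site's source columns to the common field names."""
    mapping = SITE_FIELD_MAPS[site]
    row = {common: record[source] for source, common in mapping.items()}
    row["site"] = site
    return tuple(row[field] for field in COMMON_FIELDS)


def load(conn, rows):
    """Insert transformed rows into the central table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS person "
        "(site TEXT, pid TEXT, sex TEXT, birth_year INTEGER)"
    )
    conn.executemany("INSERT INTO person VALUES (?, ?, ?, ?)", rows)
    conn.commit()


def run_etl(conn, extracts):
    """extracts maps a site name to its list of source records (dicts)."""
    for site, records in extracts.items():
        load(conn, [transform(site, r) for r in records])


# In-memory SQLite stands in for the central MySQL database (assumption).
conn = sqlite3.connect(":memory:")
run_etl(conn, {
    "vadu": [{"PID": "V001", "SEX": "F", "YOB": 1980}],
    "kanchanaburi": [{"person_id": "K001", "gender": "M", "birth_yr": 1975}],
})
merged = conn.execute(
    "SELECT site, pid, sex, birth_year FROM person ORDER BY site"
).fetchall()
```

Keeping one mapping per site means each site can keep its own database manager, while only the mapping has to change when a source structure changes.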
Data Sharing, Upload & Download
[Diagram: each site prepares the minimum dataset from its own system: Vadu (MySQL & PHP), Kanchanaburi (SQL Server), Wosera (FoxPro & Visual Basic). The ETL tool created by I2IT converts the data into MySQL & PHP and uploads it to the central database (MySQL & PHP), which I2IT created and manages on its server. Vadu, Kanchanaburi, Wosera and other users (with prior online permission to view data) download from the central database.]

For users other than member sites
[Diagram: other users (with prior online permission to view data) send a request to I2IT by e-mail. I2IT requests the site concerned (Vadu, Kanchanaburi or Wosera) by e-mail. The site approves sharing of its data, subject to the site's terms and conditions for prior permission. I2IT then creates the user and grants permission for data downloading.]

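The request-and-approval flow for non-member users might be modeled as a small state machine. The sketch below is only an illustration in Python: the status names, the `AccessRequest` class and the example e-mail address are hypothetical, and the step order is inferred from the diagram labels rather than stated in the slides.

```python
from enum import Enum


class Status(Enum):
    REQUESTED = "requested"          # user has e-mailed I2IT
    FORWARDED = "forwarded to site"  # I2IT has e-mailed the site
    APPROVED = "approved by site"
    REJECTED = "rejected by site"
    GRANTED = "download permitted"   # user created, download allowed


# Allowed transitions; any other move is refused.
TRANSITIONS = {
    Status.REQUESTED: {Status.FORWARDED},
    Status.FORWARDED: {Status.APPROVED, Status.REJECTED},
    Status.APPROVED: {Status.GRANTED},
    Status.REJECTED: set(),
    Status.GRANTED: set(),
}


class AccessRequest:
    """One outside user's request for one site's data (hypothetical model)."""

    def __init__(self, user_email, site):
        self.user_email = user_email
        self.site = site
        self.status = Status.REQUESTED

    def advance(self, new_status):
        if new_status not in TRANSITIONS[self.status]:
            raise ValueError(
                f"cannot go from {self.status.name} to {new_status.name}"
            )
        self.status = new_status

    def can_download(self):
        return self.status is Status.GRANTED


req = AccessRequest("researcher@example.org", "vadu")
req.advance(Status.FORWARDED)
req.advance(Status.APPROVED)
req.advance(Status.GRANTED)
```

Making the site's approval a required step before `GRANTED` reflects the idea that each site keeps control over who may download its data.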
I2IT tasks
Extracting the data provided by the sites.
Converting the given fields into the fields defined by the prototype.
Uploading and storing the data on the website.
Providing web access to the sites and other interested parties.
Providing authorization and implementing security constraints for data access through the website.

ETL?
ETL is an acronym for Extract, Transform and Load.
Extraction involves pulling the data from various data sources such as Oracle, Visual FoxPro, Excel and SQL Server.
Transformation involves converting the data from the source structure to the destination structure.
This transformation is crucial for the entire project.
Example: conversion of the Vadu, India PID from 12 digit to 15 digit by the I2IT team, using ETL (SAS):
APTGNT0023004 (12 digit) → 106001068002304 (15 digit)

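The PID example above can be illustrated with a short sketch. The slides give only one example pair and do not state the conversion rule, so the Python below assumes a hypothetical lookup table (`CROSSWALK`) in place of the real rule; the function names are also illustrative. The actual conversion was done by I2IT in SAS.

```python
import re

# Hypothetical crosswalk from source PID to new PID. The slides show one
# example pair and no conversion rule, so a lookup table stands in for it.
CROSSWALK = {"APTGNT0023004": "106001068002304"}

# The destination format described in the slides: 15 numeric digits.
TARGET_PATTERN = re.compile(r"\d{15}")


def convert_pid(source_pid, crosswalk=CROSSWALK):
    """Return the 15-digit numeric PID for a source PID, or raise."""
    key = source_pid.strip().upper()
    if key not in crosswalk:
        raise KeyError(f"no mapping for PID {source_pid!r}")
    target = crosswalk[key]
    if not TARGET_PATTERN.fullmatch(target):
        raise ValueError(f"target PID {target!r} is not 15 digits")
    return target


def convert_all(source_pids, crosswalk=CROSSWALK):
    """Split PIDs into converted pairs and rejects kept for review."""
    converted, rejected = {}, []
    for pid in source_pids:
        try:
            converted[pid] = convert_pid(pid, crosswalk)
        except (KeyError, ValueError):
            rejected.append(pid)
    return converted, rejected


converted, rejected = convert_all(["APTGNT0023004", "UNKNOWN"])
```

Collecting rejects instead of stopping at the first failure lets a large batch run to completion while unmatched PIDs are reviewed separately.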
ETL (continued)
We have extracted nearly 70,000 records from the Vadu database, converted their 12-digit alphanumeric PIDs into the 15-digit numeric format, and stored them in the centralized database.
Conversion of the Papua New Guinea and Kanchanaburi records is currently in progress.
The entire ETL process is meticulously documented.

Software & Hardware Requirements
Operating system : Linux/Unix/Windows
Database         : MySQL
Front end        : PHP/Java
ETL tools        : Base SAS 9.1.3
Hardware         : Pentium Dual Pro
Hard disk        : 160 GB
RAM              : 1 GB

Architecture
[Diagram: data from the Vadu, Kanchanaburi and Wosera sites passes through ETL into the centralized database, which end users access through the web.]

Website tour
http://localhost/index.php
Thank You