Transcript on HID
FBIRN NeuroInformatics
Working Group Update
David B. Keator
University of California, Irvine
FBIRN AHM
March 13-14, 2006
October Milestones
Phase I data uploaded and downloadable for public
analysis
Phase II image upload
Tools available for Society for Neuroscience roll-out
Phase II clinical data uploads
Prepare for derived data uploads
FBIRN IT Vision
(Workflow diagram)
fMRI Scanner
• FMRI images (DICOM, NIFTI)
• Automated image upload to Data Grid/HID for sharing
Clinical Data
• Computer-aided scale input via clinical data entry interface
Local storage
• Data Grid (Local)
• HIDB(s) (Local)
FIPS: FSL Image Processing Scripts
• FIPS results
• Result images and XML wrapper in Data Grid
• Results with standard descriptions in HIDB (i.e., data provenance)
Multi-Site User Query
FBIRN Federated Data
(Site map: each site hosts an HID and is marked with the data it holds, p1 and/or p2)
Sites: UMN, Stanford, UCLA, UCI, UI, UCSD, UNM, MGH, BWH, Yale, Duke
Legend: Data Integration Environment; PostgreSQL test site; p1 / p2 = Phase 1 / Phase 2 data
392 Subject Visits, by site:
• Duke: 48
• BWH: 18
• MGH: 11
• UCLA: 37
• UCSD: 5
• UCI: 58
• UNM: 44
• UI: 63
• UMN: 52
• Yale: 56
Architecture Overview
(Layered diagram)
Web Application
• Subject Management
• CALM/GAME Assessments
• Multi-Site Query
• XCEDE Services
Core Hierarchy Schema
• Clinical / Demographics
• Study Protocols
• Study Data
Database
• Oracle
• PostgreSQL
Data
• File System
• Data Grid
Phase I Traveling Subject Dataset
https://portal.nbirn.net/BIRN/cgi-bin/Downloads/fBIRN_PhaseI.cgi
Phase II Study: Image Data Volume
21,038 raw image files per subject
2.4 GB of raw image data per subject
25 GB to 40 GB of processed image data per subject (depending on hypotheses tested)
10 million slices of functional imaging data in Phase II
7 Terabytes of image data for all of the Phase II analyses (conservative estimate of 25 GB/subject)
(Chart: FBIRN Shared Data Files; y-axis: Number of Files (Millions), 5.0 to 9.0; x-axis: July 05 to Jan 06)
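The 7-terabyte estimate can be cross-checked with simple arithmetic. The sketch below uses only the figures quoted on this slide. The decimal unit convention (1 TB = 1000 GB, 1 GB = 1,000,000 KB) and the function names are assumptions, and the resulting subject count is implied by the estimate; the slide does not state it.

```python
# Figures quoted on the slide.
RAW_FILES_PER_SUBJECT = 21_038
RAW_GB_PER_SUBJECT = 2.4
PROCESSED_GB_PER_SUBJECT = 25   # conservative end of the 25-40 GB range
TOTAL_TB = 7

GB_PER_TB = 1000        # assumption: decimal units
KB_PER_GB = 1_000_000   # assumption: decimal units


def implied_subjects(total_tb, gb_per_subject):
    """Number of subjects implied by a total volume and a per-subject volume."""
    return total_tb * GB_PER_TB / gb_per_subject


def mean_raw_file_kb(gb_per_subject, files_per_subject):
    """Average size of one raw image file, in kilobytes."""
    return gb_per_subject * KB_PER_GB / files_per_subject


print(implied_subjects(TOTAL_TB, PROCESSED_GB_PER_SUBJECT))                 # 280.0
print(round(mean_raw_file_kb(RAW_GB_PER_SUBJECT, RAW_FILES_PER_SUBJECT)))   # 114
```

Under these assumptions, 7 TB at 25 GB per subject corresponds to about 280 subjects, and a raw image file averages roughly 114 KB.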
BIRN Tools Download
http://www.nbirn.net/Resources/Downloads/
HID Improvements
Created the following scripts to streamline the HID
creation process:
• Create HID database schema
• Add initial data set to HID so that HID web application can
function
• Create database users for mediator to access HID
Created the following programs for HID data
management:
• Migrate assessment data when an assessment is modified
• Export subject assessment data to a file in CSV format
• Add experimental visits, studies and segments to HID
• Export HID clinical data in XCEDE-formatted XML
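As an illustration of the CSV export step, here is a minimal sketch using only the Python standard library. It is not the HID export program itself: the function name, the field names (`subject_id`, `assessment`, `item`, `value`) and the sample rows are all hypothetical.

```python
import csv
import io


def export_assessments_csv(rows, fieldnames):
    """Serialize assessment records (dicts) to CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


# Hypothetical records; a real export would read these from the database.
FIELDS = ["subject_id", "assessment", "item", "value"]
ROWS = [
    {"subject_id": "S001", "assessment": "demo_scale", "item": "q1", "value": "3"},
    {"subject_id": "S001", "assessment": "demo_scale", "item": "q2", "value": "1"},
]

csv_text = export_assessments_csv(ROWS, FIELDS)
print(csv_text)
```

Writing to an in-memory buffer keeps the sketch self-contained; passing an open file handle to `csv.DictWriter` instead would write the same text to disk.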
Pseudo-Mediated Query Interface
Can query all sites with an HID installation
Pseudo-mediated
• The SQL query is sent to each “registered” site
• “Registered” means your HID has been told about the other sites
Can query Oracle and PostgreSQL installations
• Currently, drilling down to more detailed results from
a returned query works only when logged into the same database platform
Export CSV-formatted clinical data returned from a
multi-site query
http://head.bic.uci.edu:8080/clinical/index.jsp
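The pseudo-mediated approach (one SQL query sent to every registered site, with the results gathered in one place) can be sketched as follows. This is an illustration under stated assumptions, not the HID implementation: in-memory SQLite databases stand in for the Oracle and PostgreSQL installations, and the `subject` table, its columns, the site names and the function names are hypothetical.

```python
import sqlite3


def make_site(name, subjects):
    """Create an in-memory stand-in for one site's database (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE subject (subject_id TEXT, age INTEGER)")
    conn.executemany("INSERT INTO subject VALUES (?, ?)", subjects)
    return name, conn


def query_registered_sites(registry, sql, params=()):
    """Send the same SQL to every registered site and tag each row with its site."""
    merged = []
    for site, conn in registry:
        for row in conn.execute(sql, params):
            merged.append((site,) + tuple(row))
    return merged


# The registry plays the role of "sites your HID has been told about".
registry = [
    make_site("site_a", [("A1", 34), ("A2", 51)]),
    make_site("site_b", [("B1", 47)]),
]

results = query_registered_sites(
    registry,
    "SELECT subject_id, age FROM subject WHERE age > ? ORDER BY subject_id",
    (40,),
)
print(results)  # [('site_a', 'A2', 51), ('site_b', 'B1', 47)]
```

Tagging each row with its originating site is one plausible way to keep results distinguishable after merging, which a later drill-down into a single site would need.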
Pseudo-Mediated Query Interface
(Bar chart: HID Subjects by Site; y-axis: Num. Subjects in HID; x-axis: Site)
(Bar chart: Number of Phase II Assessments in HID by Site; y-axis: Num. Phase II Assessments; x-axis: Site)
Sites shown: BWH, Duke, Iowa, MGH, UCI, UCLA, UCSD, UMN, UNM, Stanford, Yale
Mediator - “View” on HID:
Assessments and MR Data
View across fBIRN HID resources
Multi-Site Query Across fBIRN Sites
CALM Layout Improvements
E-Prime, Presentation, etc. to XCEDE events
Event data from various stimulus presentation programs is converted
from tabular text into a standard XML representation.
This process is driven by a “parsing” file, mapping rows and columns
to events and event characteristics.
(Diagram: during data acquisition at the MR scanner, stimulus presentation
(red square, blue square, low and high tones) and responses (Button 1,
Button 2) are recorded along a timeline from 0 secs to 10 secs as event
data; event extraction converts the event data into XML events
(stimulus/response).)
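The conversion from tabular text to XML can be sketched as below. This is a minimal illustration, not the released tools: the `COLUMN_MAP` dictionary is a hypothetical stand-in for the parsing file, the column names (`StimOnset`, `StimDur`, `StimFile`) are invented, and the element names are illustrative, not the actual XCEDE schema.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Hypothetical stand-in for the "parsing" file: event field -> source column.
COLUMN_MAP = {"onset": "StimOnset", "duration": "StimDur", "description": "StimFile"}

# Hypothetical tab-delimited log from a stimulus presentation program.
LOG = (
    "StimOnset\tStimDur\tStimFile\n"
    "0.0\t0.5\tlow_tone.wav\n"
    "2.0\t0.5\thigh_tone.wav\n"
)


def table_to_events_xml(tabular_text, column_map, event_type):
    """Turn each row of a tab-delimited table into one <event> element."""
    reader = csv.DictReader(io.StringIO(tabular_text), delimiter="\t")
    root = ET.Element("events")
    for row in reader:
        event = ET.SubElement(root, "event", {"type": event_type})
        for field, column in column_map.items():
            # Each mapped column becomes a child element of the event.
            ET.SubElement(event, field).text = row[column]
    return root


root = table_to_events_xml(LOG, COLUMN_MAP, "stimulus")
print(ET.tostring(root, encoding="unicode"))
```

Keeping the column mapping in data instead of code means a new presentation program would need only a new mapping, which is the role the slide assigns to the parsing file.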
XCEDE events to FIPS
“Queries” of the XCEDE events files extract
timing for selected events into the format required
by the analysis package
Queries can be simple:
• description='stimuli\1000.wav'
• type=='encode' & description=='listone'
Queries can be complex:
• description='stimuli\1200.wav' and
not(preceding-sibling::*[description='stimuli\1200.wav']/onset >= (onset - 9))
(Diagram labels: query syntax, XPath, XCEDE)
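To make the two kinds of query concrete, here is a sketch that runs against a small, made-up events document. It uses only the Python standard library, whose `ElementTree` supports a limited XPath subset with no `preceding-sibling` axis. The simple query therefore uses an ElementTree path predicate, while the complex query (select a stimulus only if the same stimulus did not occur in the preceding 9 seconds) is re-implemented as a plain loop. The element layout and function names are assumptions, not the XCEDE schema or the released tools.

```python
import xml.etree.ElementTree as ET

TARGET = r"stimuli\1200.wav"

# Hypothetical events file; raw string so the backslashes stay literal.
EVENTS_XML = r"""
<events>
  <event><onset>0</onset><description>stimuli\1200.wav</description></event>
  <event><onset>4</onset><description>stimuli\1200.wav</description></event>
  <event><onset>10</onset><description>stimuli\1000.wav</description></event>
  <event><onset>20</onset><description>stimuli\1200.wav</description></event>
</events>
"""


def onsets_simple(root, description):
    """Simple query: onsets of all events with the given description."""
    matches = root.findall(f"event[description='{description}']")
    return [float(e.findtext("onset")) for e in matches]


def onsets_not_recently_preceded(root, description, window):
    """Complex query: matching events with no earlier match within `window` seconds."""
    selected, earlier = [], []
    for event in root.findall("event"):  # document order
        if event.findtext("description") != description:
            continue
        onset = float(event.findtext("onset"))
        if not any(prev >= onset - window for prev in earlier):
            selected.append(onset)
        earlier.append(onset)
    return selected


root = ET.fromstring(EVENTS_XML)
print(onsets_simple(root, TARGET))                    # [0.0, 4.0, 20.0]
print(onsets_not_recently_preceded(root, TARGET, 9))  # [0.0, 20.0]
```

The event at 4 seconds is dropped by the complex query because the same stimulus occurred at 0 seconds, which falls inside the 9-second window.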
XML tools released
BXH/XCEDE Tools released to the public through
the BIRN website.
• Binaries for Linux
• Includes tools for:
  creating wrappers for image data
  QA programs
  event-based analysis tools
  XCEDE events files
Query Atlas Anatomy Browser
Query Atlas
Plans for Coming Year:
Develop readers and an intuitive interface for loading BIRN
data into Slicer, and tools for saving modified Slicer scenes to
BIRN database.
Improve information visualization in Slicer’s 3D viewer
(improved text rendering, label and marker arrangement and
dynamic behavior, and interactive selection of scene
elements).
Develop queryable comment-markers that can be anchored in
the 3D scenes to convey relevant information (visibility
toggles on/off, clicking opens wiki page where text/images
narrate an observation or comment associated with the
marker).
Improve ways to visualize fMRI activation maps along with
anatomical data and FreeSurfer parcellation labels.
Ibrowser and fMRIEngine tools in
Slicer
Multi-volume processing and fMRI analysis have been
included with the Slicer 2.6 release.
Ibrowser permits timeseries data reorienting, smoothing,
preview, and timecourse plotting.
FMRIEngine permits first level GLM-based fMRI analysis,
supports anatomy- and activation-based ROI analysis,
permits interactive visualization of the activation map and
voxel timecourse plotting.
Currently adding the ability to incorporate Ising Priors into
the computation of parametric maps.
Detailed use-case tutorial for the fMRIEngine has been
developed.
Next: test the tutorial with local user groups and refine it, then
make the tutorial and tutorial dataset available to the
wider community.
fMRIEngine – Slicer 2.6
Informatics Working Sessions
Monday: 1 – 4:30pm NI Group Working Session
(Maybe in UCI BIRN Conference Room if we want to use the SmartBoard or have VTC support)
Introduction/Review Previous Milestones from October BIRN
AHM (15 min.)
Database Maintenance
HID Development (see attached)
Data QA/QC (30 min.)
Tuesday: 8:30 – 10:30 am NI and Stats Group Data Mining
Session
What is Data Mining, Examples
What data mining activities might we want to investigate for
October BIRN AHM? (60 min.)
How do we need to organize the existing data to support data
mining activities? (30 min.)
Tuesday: 2:00 – 5:00 pm NI Working Group Session
(Maybe in UCI BIRN Conference Room if we want to use the SmartBoard or have VTC support)
Derived Data – SRB/HID (60 min.)
SRB (30 min.)
Mediator (45 min.)
XML (30 min.)
Milestones/Wrap-Up (15 min.)
Information Technology (IT) vs.
Neuroinformatics (NI)
Processing information by computer. IT is the latest moniker for the industry. There have been several
before it, namely "electronic data processing" (EDP), "management information systems" (MIS) and
"information systems" (IS). The term became popular in the 1990s and may embrace or exclude the
telecom industry, depending on whom you talk to. (Source: http://www.pcmag.com/)
…the branch of engineering that deals with the use of computers and telecommunications to retrieve
and store and transmit information. (Source: wordnet.princeton.edu/perl/webwn)
IT is a term that encompasses all forms of technology used to create, store, exchange, and use
information in its various forms (business data, voice conversations, still images, motion pictures,
multimedia presentations, and other forms, including those not yet conceived). It's a convenient term for
including both telephony and computer technology in the same word. It is the technology that is driving what
has often been called "the information revolution." (Source: www.planetech.co.uk/glossary.htm)
Information Technology is the general term used to describe general computing.
(Source: www.z2z.com/site01/itglos02.html)
Neuroinformatics is an emerging discipline which attempts to integrate neuroscientific information from
the level of the genome to the level of human behavior. A major goal of this new discipline is to produce digital
capabilities for a web-based information management system in the form of interoperable databases and
associated data management tools. Such tools include software for querying and data mining, data
manipulation and analysis, scientific visualization, biological modeling and simulation, and electronic
communication and collaboration between geographically distinct sites. The databases and software
tools are designed to be used by neuroscientists, behavioral scientists, clinicians, and educators in an effort to
better understand brain structure, function, and development. (Source: http://neurovia.umn.edu/IGERT/)
What Does NI Stand For?
NI = NeuroInformatics
NI = No, I will not fix your computer!