Many Domains


How Will Applications Drive Future Data-Intensive Systems?
Data-Intensive Computing Workshop
Applications Break-Out Session
Some Driving Applications
• Google-style Search
• Social Networking (Facebook/Twitter)
• Data warehouse mining
• Biomedical
• Sensor networks (e.g., video, radar)
• Cosmology
• Astro
• Climate
• Fusion
• Machine translation
• National security
• Disaster preparedness
• Financial analytics
• GIS
Many Domains benefit from Data-Intensive Computing
Common Application Structures
[Figure: two application structures built from the elements Big Data, Query, and Derived Data; one is labeled "background", the other "live".]
Anticipated vs. ad hoc analysis/queries
Application Trends: Scale
E.g., Climate Change Studies need:
• 5 orders of magnitude data scale
• 5 orders of magnitude speed scale (including algorithmic improvements)
But More than That…
Application Trends: Features
• SW as service, pervasive mobile clients
• P2P interaction
• Built-in verifiability/provenance of answers
• Too much raw data; must decide what (derived) data to retain
• Dealing with privacy controls, role-based authentication
• Multi-resolution, Multi-D visualization (multi-sensory presentation) at scale
• Queries expressed using multimedia
• Heterogeneity, Cross data sources
• Increased value of data => increased demand for data security/integrity
Big Data Challenges:
Around the Corner for All of Us
Reducing App Development Time
Key issues:
• Effective workflow tools: need for convergence to open, standard tools (Multi-user: Tasks are collaborative)
• Effective big data libraries & frameworks
• Avoid recoding when scale changes
• Use familiar APIs (C.S. stuff just works)
Some Lessons Learned
• Curriculum mismatch between domain scientists and computer science courses
• Hard to determine the resource needs of an app a priori
• Cross-disciplinary work is challenging
– More cross-disciplinary possibilities in sharing Big Data
• Typically not a big data cliff: can make do with less data, but improve with more data
– Although some apps need min data size to be useful
– Meet needs of those already feeling the pinch vs. trying to leap ahead
• Economics: data is free, networking is free
– Payment may not be money: what is demanded of users