- Tetherless World Constellation


Bringing Data Science, Xinformatics
and Semantic eScience into the
Graduate Curriculum (solicited)
EGU2012-11224 (EOS 6/ ESSI2.3)
April 25, 2012, Vienna
Peter Fox (RPI) [email protected]
Tetherless World Constellation
tw.rpi.edu
Themes
Future Web (Hendler)
•Web Science
•Policy
•Social
Xinformatics (Fox)
•Data Science
•Semantic eScience
•Data Frameworks
Semantic Foundations (McGuinness)
•Knowledge Provenance
•Ontology Engineering Environments
•Inference, Trust
Application Themes
Govt. Data (Hendler/ Erickson)
•Open
•Linked
•Apps
Env. Informatics (Fox)
•Ecosystems
•Sea Ice
•Ocean imagery
•Carbon
Health Care/ Life Sciences (McGuinness/Luciano)
•Population Science
•Translational Med
•Health Records
Platforms:
•Bio-nano tech center
•Exp. Media and Perf. Arts Ctr.
•Comp. Ctr. Nano. Innov.
•Data Intensive
Multiple depts/schools/programs ~ 35 (Post-doc, Staff, Grad, Ugrad)
Context
http://tw.rpi.edu/web/Courses
[Figure: continuum labeled Experience, running from Data (Creation, Gathering) through Information (Presentation, Organization) to Knowledge (Integration, Conversation); the courses Data Science, Xinformatics, Semantic eScience and Web Science are placed along it]
Also at RPI
• Data Science Research Center and Data Science Education Center
• http://www.rpi.edu/about/inside/issue/v4n17/datacenter.html
– Over 35 research faculty, 5 post-docs, ? grad students
• Data is one of the Rensselaer Plan’s five thrusts
• Other key faculty
– Fran Berman (VPR)
– Jim Myers (Director CCNI)
Curriculum
• Web Science and IT – undergrad, MSc. and PhD. (with science concentrations)
• Environmental Science with Geoinformatics concentration
• Bio, geo, chem, astro, materials - informatics
• GIS for Science
• Master of Science – Data Science (pending)
• Multi-disciplinary science program (2012) PhD in Data and Web Science
E.g. IT with Env. Sci.
• ERTH-1200 Geology II (4 credits) - spring
• CHEM-2250 Organic Chemistry I (4 credits) - spring
• ERTH-2210 Field Methods (2 credits) - fall
• IENV-1920 Environmental Seminar (2 credits) - spring
• BIOL-2120 Intro. to Cell and Molecular Biology (4 credits) - spring
• IENV-4500 Global Environmental Change (4 credits) - fall
• ERTH-4180 Environmental Geology (4 credits) - spring
• ERTH-4963 Xinformatics (4 credits) - spring
• IENV-4700 One Mile of the Hudson River (4 credits) - fall
Geoinformatics concentration
• CSCI1000 - Computer Science I
• CSCI1200 - Data Structures
• CSCI2300 - Introduction to Algorithms or ERTH 4750 - Geographic Information Systems in the Sciences
• CSCI4380 – Databases
• CSCI4961 - Data Science
• CSCI4960 – Xinformatics
• ERTH 4980 – Senior Thesis
Web Science Learning Objectives
• Students will demonstrate knowledge and be able to explain the three
different "named" generations of the web (a/k/a Web 1.0, Web 2.0, and Web
3.0) from mathematical, engineering, and social perspectives
• Students will demonstrate the ability to use the dynamic programming
language Python to develop programs relating to Web applications and the
analysis of Web data.
• Students will be able to understand and analyze key Web applications
including search engines and social networking sites.
• Students will be able to understand and explain the key aspects of Web
architecture and why these are important to the continued functioning of the
World Wide Web.
• Students will be able to analyze and explain how technical changes affect
the social aspects of Web-based computing.
• Students will be able to develop "linked data" applications using Semantic
Web technologies.
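As an illustration of the kind of Python exercise the second objective points to, here is a minimal sketch that extracts hyperlinks from a page and counts them per host. The sample HTML, the class name LinkCollector and the function links_per_host are hypothetical and are not taken from the course materials.

```python
from collections import Counter
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class LinkCollector(HTMLParser):
    """Collects absolute link targets from <a href="..."> tags (hypothetical helper)."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        for name, value in attrs:
            if name == "href" and value:
                # Resolve relative links against the page URL.
                self.links.append(urljoin(self.base_url, value))


def links_per_host(html, base_url):
    """Return a Counter mapping host name -> number of links pointing to it."""
    collector = LinkCollector(base_url)
    collector.feed(html)
    return Counter(urlparse(link).netloc for link in collector.links)


# Made-up sample page, used only for illustration.
SAMPLE_HTML = """
<html><body>
  <a href="/web/Courses">Courses</a>
  <a href="http://tw.rpi.edu/">TWC</a>
  <a href="http://www.rpi.edu/">RPI</a>
</body></html>
"""

if __name__ == "__main__":
    print(links_per_host(SAMPLE_HTML, "http://tw.rpi.edu/index.html"))
```

Counting links per host is about the simplest form of Web data analysis; the same parser could feed a link graph for the search-engine and social-network topics listed above.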
Data Science Objectives
• To instruct future scientists how to sustainably generate/collect and use data for their own research as well as for others: data science.
• To instruct future technologists how to understand and support the essential data and information needs of a wide variety of producers and consumers
• For both to know the tools and requirements needed to properly handle data and information
• Students will learn and be evaluated on the full life cycle of data and the relevant methods, technologies and best practices.
Learning Objectives
• Develop and demonstrate skill in data
collection and management
• Know how to develop and apply data models
and metadata models
• Demonstrate knowledge of data standards
• Develop and demonstrate the application of
skill in data science tool use and evaluation
• Demonstrate the application of data life-cycle
principles and data stewardship
• Demonstrate proficiency in data and information product generation
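To make the distinction between a data model and a metadata model concrete, a minimal sketch follows. All class and field names (Observation, DatasetMetadata, Dataset, REQUIRED) and the sample values are hypothetical and not from the course; the required-field check only stands in for conformance to a real metadata standard.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Observation:
    """Data model: one measured value (hypothetical structure)."""
    time_utc: str   # ISO 8601 timestamp, e.g. "2012-04-25T09:00:00Z"
    value: float


@dataclass
class DatasetMetadata:
    """Metadata model: describes the dataset rather than holding the data."""
    title: str
    creator: str
    variable: str
    units: str
    license: str = ""   # optional in this sketch


@dataclass
class Dataset:
    """A dataset pairs its metadata record with the observations themselves."""
    metadata: DatasetMetadata
    observations: List[Observation] = field(default_factory=list)


# Fields this sketch treats as mandatory (an assumption, not a real standard).
REQUIRED = ("title", "creator", "variable", "units")


def missing_metadata(meta):
    """Return the names of required metadata fields that are empty."""
    return [name for name in REQUIRED if not getattr(meta, name).strip()]


if __name__ == "__main__":
    ds = Dataset(
        DatasetMetadata(title="Sample sea ice series", creator="",
                        variable="ice_thickness", units="m"),
        [Observation("2012-04-25T09:00:00Z", 1.8)],
    )
    print(missing_metadata(ds.metadata))
```

Keeping the metadata record separate from the observations lets the stewardship checks run without touching the data itself.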
Xinformatics Objectives
• To instruct future information architects how to
sustainably generate information models, designs
and architectures
• To instruct future technologists how to
understand and support essential data and
information needs of a wide variety of producers
and consumers
• For both to know the tools and requirements needed to properly handle data and information
• Students will learn and be evaluated on the underpinnings of informatics, including theoretical methods, technologies and best practices.
Learning Objectives
• Through class lectures, practical sessions,
written and oral presentation assignments and
projects, students should:
– Develop and demonstrate skill in development and
management of multi-skilled teams in the application
of informatics
– Demonstrate ability to develop conceptual and logical
information models and explain them to non-experts
– Demonstrate knowledge and application of
informatics standards
– Demonstrate skill in informatics tool use and
evaluation
Modern informatics enables a new
scale-free framework approach
Semantic eScience Objectives
• Ontology Development, Merging and
Validation
• Semantic Language and Tool Use and
Evaluation
• Use Case Development and Elaboration
• Semantic eScience Implementation and
Evaluation via Use Cases
• Semantic Application Development and
Demonstration
• Group Project and Team Development, Use
Case Implementation and Evaluation
Discussion…
• Science and interdisciplinarity from the start!
– Not a question of: do we train scientists to be
technical/data people, or do we train technical
people to learn the science
– It’s a skill/ course level approach that is needed
• Education and research semi-coupled
• We must teach methodology and principles
over technology *
• Data science must be a skill, as natural as using instruments or writing/using code
• Team/ collaboration aspects are key **
• Foundations and theory must be taught ***
Progression after progression
[Figure: informatics layers: IT Cyberinfrastructure; Cyberinformatics; Core Informatics; Science Informatics; Requirements; Science, Societal Benefit Areas]
Example:
• CI = OPeNDAP server running over HTTP/HTTPS
• Cyberinformatics = Data (product) and service ontologies, triple store
• Core informatics = Reasoning engine (Pellet), OWL
• Science (X) informatics = Use cases, science domain terms, concepts in an ontology
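The layering in this example can be sketched in a few lines of Python. This is a toy under stated assumptions: the in-memory set of triples stands in for a real triple store, the subclass closure stands in for a reasoner such as Pellet, and every term (ex:SeaIceThickness and so on) and the data URL are made up for illustration.

```python
# Cyberinformatics layer: a tiny in-memory "triple store" (a set of tuples).
TRIPLES = {
    ("ex:SeaIceThickness", "rdfs:subClassOf", "ex:SeaIceProduct"),
    ("ex:SeaIceProduct", "rdfs:subClassOf", "ex:DataProduct"),
    ("ex:dataset42", "rdf:type", "ex:SeaIceThickness"),
    # CI layer: where a (hypothetical) data server exposes the dataset.
    ("ex:dataset42", "ex:accessURL", "http://example.org/opendap/dataset42.nc"),
}


def superclasses(cls, triples):
    """Core informatics layer: transitive closure of rdfs:subClassOf."""
    found, todo = set(), [cls]
    while todo:
        current = todo.pop()
        for s, p, o in triples:
            if s == current and p == "rdfs:subClassOf" and o not in found:
                found.add(o)
                todo.append(o)
    return found


def instances_of(cls, triples):
    """Return subjects typed as cls directly or via an inferred superclass."""
    result = set()
    for s, p, o in triples:
        if p == "rdf:type" and (o == cls or cls in superclasses(o, triples)):
            result.add(s)
    return result


def access_urls(cls, triples):
    """Science informatics layer: a use-case query phrased in domain terms."""
    wanted = instances_of(cls, triples)
    return sorted(o for s, p, o in triples if s in wanted and p == "ex:accessURL")


if __name__ == "__main__":
    # "Which data products can I retrieve?" resolved through inferred types.
    print(access_urls("ex:DataProduct", TRIPLES))
```

The query asks for the general class ex:DataProduct, yet the dataset is only typed with the specific class; the inference step bridges the two, which is the role the reasoning engine plays between the ontologies and the use cases.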