
Worked example: Global Change
Information System
Peter Fox, and … others
Xinformatics 4400/6400
Week 10, April 8, 2014
And yet, we are still not done...
http://4.bp.blogspot.com/-7mYclB2oypk/TWrlhBPvHxI/AAAAAAAAALc/mwjhBbuZ9kU/s1600/yawn4.jpg
Assignment 3
Reading – long ago
The Global Change
Research Act and USGCRP
• USGCRP was mandated by
Congress in the Global Change
Research Act (GCRA) of 1990 (P.L.
101-606)
“To provide for development and
coordination of a comprehensive and
integrated United States Research
Program which will assist the Nation
and the world to understand, assess,
predict, and respond to human-induced and natural processes of
global change.”
U.S. Global Change Research Program
The Program:
• Coordinates Federal research
to better understand and
prepare the nation for global
change
• Prioritizes and supports cutting
edge scientific work in global
change
• Assesses the state of scientific
knowledge and the Nation’s
readiness to respond to global
change
• Communicates research
findings to inform, educate,
and engage the global
community
Global Change Information System
(GCIS)
Vision:
A unified web-based source of
authoritative, accessible, usable, and
timely information about climate and
global change for use by scientists,
decision makers, and the public.
Global Change Research Act (1990),
Section 106
…not less frequently than every 4 years, the
Council… shall prepare… an assessment which–
• integrates, evaluates, and interprets the
findings of the Program and discusses the
scientific uncertainties associated with such
findings;
• analyzes the effects of global change on the
natural environment, agriculture, energy
production and use, land and water
resources, transportation, human health and
welfare, human social systems, and biological
diversity; and
• analyzes current trends in global change,
both human-induced and natural, and
projects major trends for the subsequent 25
to 100 years.
Previous National
Climate Assessments
Climate Change Impacts on
the United States (2000)
Global Climate Change Impacts
in the United States (2009)
http://nca2009.globalchange.gov
Target date for next
NCA: 2013
NCA 2009
http://nca2009.globalchange.gov
Prototype Use Case
Name
Discover and visit data center website of dataset used to generate report figure.
Goal
The NCA Report reader sees a figure and wants to know where the data came from.
Summary
A reader of the NCA is browsing the content via the website. He/she sees a figure and wants to know where the
data came from. A reference to the publication in which the figure originated appears in the figure caption. Selecting
the link to the source publication displays a page of information about the publication including, if available, the
publication DOI. The page also includes references to the datasets cited in the publication. Following each of the dataset
reference links presents a page of information about the dataset, including links back to the agency/data center
webpage describing the dataset in more detail and making the actual data available for order or download.
Actors
Primary Actor - reader of the NCA
Preconditions
Reader is viewing the NCA online report
Post Conditions
Reader visits the data center dataset website
Normal Flow
1) System is presenting the NCA report to the reader in a web site. Presentation includes report figure with caption
that includes reference to source publication.
2) Reader selects publication reference in figure caption
3) System displays information about publication, including DOI (if available).
4) Publication information includes publication dataset citations.
5) Reader selects a dataset cited by the publication.
6) System displays information about dataset including links to agency / data center webpages where more
information and (potentially) data download links are available.
7) Reader selects the data center link and is redirected to data center dataset webpage.
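The normal flow above amounts to following links through a small graph: figure, then publication, then dataset, then data center page. A minimal Python sketch of that traversal, assuming hypothetical placeholder records (none of the identifiers or URLs below are actual GCIS entries):

```python
# Hypothetical placeholder records; identifiers and URLs are illustrative only.
FIGURES = {
    "figure/example-1": {"caption": "Example figure",
                         "publication": "publication/example-paper"},
}
PUBLICATIONS = {
    # The DOI is optional in the use case ("if available"), so None is allowed.
    "publication/example-paper": {"doi": None,
                                  "datasets": ["dataset/example-obs"]},
}
DATASETS = {
    "dataset/example-obs": {
        "title": "Example observations",
        "data_center_url": "https://datacenter.example.org/example-obs",
    },
}

def data_center_links(figure_id):
    """Follow figure -> publication -> datasets -> data center pages (steps 2 to 7)."""
    publication = PUBLICATIONS[FIGURES[figure_id]["publication"]]
    return [DATASETS[d]["data_center_url"] for d in publication["datasets"]]

print(data_center_links("figure/example-1"))
```

Each dictionary lookup stands in for one page view in the use case; in a deployed system each would be a resolvable web page rather than an in-memory record.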
Assessment links to information
Traceable accounts…
Magic here !
Under the hood – a
graph
Key Message &
A Traceable Account
Key Message vs.
“General” Message
Prototype 1
• Initial Implementation of UC-1
• Exposes Linked Data API
– RESTful
– RDF/XML, TTL, HTML, JSON supported
• Hosted at TWC / RPI
– currently placeholder data
– http://globalchange.tw.rpi.edu/elda/gcis/report/nca2009.html
• Implemented using Epimorphics Linked Data
API (ELDA)
– http://code.google.com/p/linked-data-api/ (spec)
– https://code.google.com/p/elda/ (implementation)
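One way a client might ask a RESTful Linked Data API for the same resource in different formats is sketched below. The base URL is the prototype address from the slide with its ".html" suffix removed; the idea that the other formats are selected by swapping that suffix is an assumption, as is the suffix-to-format table. The media types themselves are the standard ones. No request is actually sent, since the prototype host may no longer be reachable.

```python
from urllib.request import Request

# Prototype address from the slide, without the ".html" suffix.
BASE = "http://globalchange.tw.rpi.edu/elda/gcis/report/nca2009"

# Assumption: formats are selected by file suffix. Media types are standard.
FORMATS = {
    "html": "text/html",
    "json": "application/json",
    "ttl": "text/turtle",
    "rdf": "application/rdf+xml",
}

def build_request(fmt):
    """Build (but do not send) a request for one representation of the report."""
    if fmt not in FORMATS:
        raise ValueError(f"unsupported format: {fmt}")
    return Request(f"{BASE}.{fmt}", headers={"Accept": FORMATS[fmt]})

req = build_request("ttl")
print(req.full_url, req.get_header("Accept"))
```

Passing the returned object to urllib.request.urlopen would perform the fetch, assuming the service is still hosted.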
Linked Data API
Architecture
Prototype Screenshot
GCIS
• Create an entity from
the structured metadata
about each thing – tag
with related concepts.
• Identify it with a
persistent, controlled
identifier.
• Present with a human
readable web page and
a machine interface.
• Represent all
relationships between
items.
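The four bullets above can be read as a small data model. The following Python sketch is one possible shape for it, not the GCIS implementation: the class name, field names, identifier paths, and relationship labels are all hypothetical.

```python
import json
from dataclasses import dataclass, field
from html import escape

@dataclass
class Entity:
    identifier: str                                     # persistent, controlled identifier
    title: str                                          # structured metadata (minimal here)
    concepts: set = field(default_factory=set)          # tags for related concepts
    relationships: list = field(default_factory=list)   # (predicate, target identifier)

    def relate(self, predicate, other):
        """Record a relationship from this entity to another one."""
        self.relationships.append((predicate, other.identifier))

    def to_json(self):
        """Machine interface: a JSON view of the entity."""
        return json.dumps({
            "identifier": self.identifier,
            "title": self.title,
            "concepts": sorted(self.concepts),
            "relationships": [list(r) for r in self.relationships],
        }, sort_keys=True)

    def to_html(self):
        """Human-readable page: title plus links to related entities."""
        links = "".join(
            f'<li>{escape(p)}: <a href="{escape(t)}">{escape(t)}</a></li>'
            for p, t in self.relationships
        )
        return f"<h1>{escape(self.title)}</h1><ul>{links}</ul>"

dataset = Entity("/dataset/example-obs", "Example observations", {"snow"})
figure = Entity("/figure/example-1", "Example figure", {"snow"})
figure.relate("wasDerivedFrom", dataset)
print(figure.to_json())
```

Keeping both views on the same object mirrors the bullet about presenting each thing with a web page and a machine interface from a single identifier.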
GCIS and W3C Prov
For GCIS, we have agents (people, projects, agencies,
data centers, publishers, etc.) who are associated with
activities (measuring, deriving, modeling, analyzing,
authoring, publishing, archiving, distributing,
visualizing, etc.) involving the entities (software, data, images,
figures, papers, reports, etc.) related to global change.
We assign local identifiers to each (so we can
persistently resolve them) and capture and represent
their relationships.
If possible, we link with external authorities:
agency data centers, journal publishers,
Researcher ID (researcherid.com) or ORCID
(orcid.org).
Computer science-y things
wasDerivedFrom
wasInformedBy
used
ENTITY
ACTIVITY
wasGeneratedBy
startedAtTime,
endedAtTime
wasAttributedTo
wasAssociatedWith
AGENT
actedOnBehalfOf
Diagram from W3C PROV group and Ivan Herman
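The PROV relations in the diagram can be written down as plain subject, predicate, object triples. The sketch below uses the real PROV namespace and property names but hypothetical resources under an "ex:" prefix, and serializes them to Turtle by hand rather than with an RDF library.

```python
PROV = "http://www.w3.org/ns/prov#"
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
EX = "http://example.org/"   # hypothetical namespace for illustration

# Hypothetical resources: a figure derived from a dataset by an analysis.
TRIPLES = [
    ("ex:figure-1", "rdf:type", "prov:Entity"),
    ("ex:dataset-1", "rdf:type", "prov:Entity"),
    ("ex:analysis-1", "rdf:type", "prov:Activity"),
    ("ex:agency-1", "rdf:type", "prov:Agent"),
    ("ex:figure-1", "prov:wasDerivedFrom", "ex:dataset-1"),
    ("ex:figure-1", "prov:wasGeneratedBy", "ex:analysis-1"),
    ("ex:analysis-1", "prov:used", "ex:dataset-1"),
    ("ex:analysis-1", "prov:wasAssociatedWith", "ex:agency-1"),
    ("ex:figure-1", "prov:wasAttributedTo", "ex:agency-1"),
]

def to_turtle(triples):
    """Serialize prefixed-name triples as a minimal Turtle document."""
    lines = [
        f"@prefix prov: <{PROV}> .",
        f"@prefix rdf: <{RDF}> .",
        f"@prefix ex: <{EX}> .",
        "",
    ]
    lines += [f"{s} {p} {o} ." for s, p, o in triples]
    return "\n".join(lines)

def derived_from(entity):
    """Entities that the given entity was directly derived from."""
    return [o for s, p, o in TRIPLES if s == entity and p == "prov:wasDerivedFrom"]

turtle = to_turtle(TRIPLES)
print(turtle)
```

The same triples could be loaded into any RDF store; the hand-written serializer is only there to keep the sketch free of dependencies.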
Non-specialist Use Case
Name
Find Latest Datasets by Keyword
Goal
Search for datasets associated with the keyword “snow”, list search results by recentness of publication.
Summary
User story:
I want to look for information concerning “snow.” I don’t know if it is a CLEAN word or a GCMD word or don’t
even know what GCMD or CLEAN is. How would I do it, and what would I see on my monitor during the
process?
Assumptions
The reader is not assumed to have knowledge regarding the GCMD Keywords (or other) vocabulary.
Actors
Primary Actor - reader of the NCA
Preconditions
TBD
Post Conditions
Reader is presented with a list of datasets associated with the keyword “snow” sorted by dataset publication date.
Normal Flow
TBD
Notes
We are looking into two user interface options for dataset selection by keyword
1) As a free-text search where the user inputs “snow”.
2) Present the user a faceted browse interface with a vocabulary facet that presents the user with terms from a
structured vocabulary. The user can manually select the term(s) which match or contain “snow”.
We intend to implement prototypes of both.
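The two interface options can be compared in a few lines of Python. This is a sketch under stated assumptions: the dataset records, titles, keywords, and dates are invented placeholders, and the keyword strings are not taken from the actual GCMD or CLEAN vocabularies.

```python
from datetime import date

# Hypothetical dataset records for illustration only.
DATASETS = [
    {"title": "Example snow cover extent", "keywords": {"Snow Cover"},
     "published": date(2012, 5, 1)},
    {"title": "Example sea ice index", "keywords": {"Sea Ice"},
     "published": date(2013, 1, 15)},
    {"title": "Example snowfall totals", "keywords": {"Snowfall"},
     "published": date(2013, 9, 30)},
]

def newest_first(hits):
    """Sort by publication date, most recent first (the post condition)."""
    return sorted(hits, key=lambda d: d["published"], reverse=True)

def free_text_search(query):
    """Option 1: case-insensitive substring match on titles and keywords."""
    q = query.lower()
    return newest_first(
        d for d in DATASETS
        if q in d["title"].lower() or any(q in k.lower() for k in d["keywords"])
    )

def facet_terms(query):
    """Option 2, step one: vocabulary terms that match or contain the query."""
    q = query.lower()
    return sorted({k for d in DATASETS for k in d["keywords"] if q in k.lower()})

def faceted_search(selected_terms):
    """Option 2, step two: datasets tagged with any term the user selected."""
    chosen = set(selected_terms)
    return newest_first(d for d in DATASETS if d["keywords"] & chosen)

print([d["title"] for d in free_text_search("snow")])
print(facet_terms("snow"))
```

In the free-text option the user never sees the vocabulary; in the faceted option the matching terms are shown first and the user picks among them, which is the main difference between the two prototypes.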
Free-text Search by
Keyword (ELDA)
Faceted Browser (S2S)
Data type
Facet
Vocabulary
Facet
Climate Literacy & Energy
Awareness Network (CLEAN)
• http://cleanet.org/index.html
• The CLEAN project, a part of the National Science
Digital Library, provides a reviewed collection of
resources coupled with the tools to enable an online
community to share and discuss teaching about
climate and energy science.
• Science Vocabularies for middle school to
undergraduate students
• Vocabularies hosted at
http://serc.carleton.edu/admin/manage/view_vocab.p
hp?vocab_id=161
CLEAN Vocabulary
CLEAN Vocabulary (cont.)
CLEAN Vocabulary (cont.)
Interagency Information
Integration
GCIS can use relationships between all relevant
information about global change across the agencies:
o From observations to datasets to research papers to models
to analyses to organizations to people to synthesized reports
to human impacts...
o Determine agency interdependencies -- An EPA analysis
uses a NOAA model dependent on observations from a
NASA satellite.
o Can present unique interagency metrics: "How many papers
referenced datasets from a specific satellite?"
o Direct users back to agency data centers for more detailed
information and the actual content and data.
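Both the interdependency example and the satellite metric reduce to simple graph operations once the relationships are recorded. A minimal Python sketch, assuming invented item, paper, dataset, and satellite identifiers; only the EPA, NOAA, and NASA chain comes from the slide:

```python
# The EPA -> NOAA -> NASA chain is from the slide; item names are hypothetical.
OWNER = {"analysis-1": "EPA", "model-1": "NOAA", "observations-1": "NASA"}
DEPENDS_ON = {
    "analysis-1": ["model-1"],
    "model-1": ["observations-1"],
    "observations-1": [],
}

def upstream_agencies(item):
    """Agencies whose products the item depends on, directly or indirectly."""
    seen, stack, agencies = set(), list(DEPENDS_ON[item]), []
    while stack:
        current = stack.pop()
        if current in seen:
            continue
        seen.add(current)
        if OWNER[current] not in agencies:
            agencies.append(OWNER[current])
        stack.extend(DEPENDS_ON[current])
    return agencies

# Hypothetical citation data for the metric example.
PAPER_DATASETS = {
    "paper-1": ["dataset-a"],
    "paper-2": ["dataset-a", "dataset-b"],
    "paper-3": ["dataset-c"],
}
DATASET_SATELLITE = {
    "dataset-a": "satellite-x",
    "dataset-b": "satellite-y",
    "dataset-c": "satellite-y",
}

def papers_citing_satellite(satellite):
    """Count papers that reference at least one dataset from the satellite."""
    return sum(
        1 for datasets in PAPER_DATASETS.values()
        if any(DATASET_SATELLITE[d] == satellite for d in datasets)
    )

print(upstream_agencies("analysis-1"))
print(papers_citing_satellite("satellite-x"))
```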
GCIS “Data Mining”
Structured information with relationships allows
integrated data mining, searching, metrics.
o What projects provided data used to produce figures that were
referenced in the 2013 NCA section about coastal sea level
rise impacts?
o Which data centers hold data referenced by papers related to
forests in the midwest?
o Which agencies have people working on projects related to
societal impacts of extreme weather events?
o Show me the latest papers about health impacts of air quality
in California. Which datasets were used in the analysis of air
quality in California?
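Questions like these become chained pattern matches over the stored relationships. The sketch below answers a simplified version of the second question with an in-memory triple set; the predicate names (hasTopic, cites, heldBy) and all identifiers are hypothetical, and a real deployment would more likely run an equivalent query against an RDF store.

```python
# Hypothetical triples for illustration only.
TRIPLES = {
    ("paper-1", "hasTopic", "forests"),
    ("paper-2", "hasTopic", "air quality"),
    ("paper-1", "cites", "dataset-a"),
    ("paper-2", "cites", "dataset-b"),
    ("dataset-a", "heldBy", "datacenter-1"),
    ("dataset-b", "heldBy", "datacenter-2"),
}

def match(s=None, p=None, o=None):
    """Return triples matching the pattern; None acts as a wildcard."""
    return [
        t for t in TRIPLES
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    ]

def data_centers_for_topic(topic):
    """Data centers holding data cited by papers on the given topic."""
    papers = {s for s, _, _ in match(p="hasTopic", o=topic)}
    datasets = {o for paper in papers for _, _, o in match(s=paper, p="cites")}
    return sorted({o for d in datasets for _, _, o in match(s=d, p="heldBy")})

print(data_centers_for_topic("forests"))
```

Each hop in the function corresponds to one "referenced by" or "hold" phrase in the question, which is why structured relationships make such queries mechanical.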
Be not afraid of informatics
Adopt, adapt, adapt, adapt,…
Coordinate, find gaps, be synergistic.
Project check-in
• There are a few students who have
dropped the course…
• Red: Aayush Chhabra, Eric Dobson, Rikhya Ghosh, Daniel
Zhao
• Orange: Eric Hayden, Ankita Khandelwal, Sisi Liu, Travis
Scavone
• Yellow: Jennifer Chan, Benno Lee, Evan McCarty, James Ryan
• Green: Javier Camino, Lakshmi Chenicheri, Jonathan Dieter,
Melissa Hay
• Blue: Mike Moore, Michael O’Keffe, Ranjani Sargunaraj, Jessie
Sodolo
• Indigo: James Cataldo, Xueyang Guan, Thomas Hughes,
Shruan Li, Amar Vishwanathan Kanna
• Violet: Sarah Cooper, Nicolle Negdely, Anirudh Prabhu,
Renaldo Smith, Dian Yu