The Power of Networks for
Environmental Protection
Environmental Compliance Consortium
Managing Environmental Information Forum
June 21, 2006, Philadelphia, PA
Brand Niemann, US EPA Office of Environmental Information, Enterprise
Architecture Team
http://colab.cim3.net/cgi-bin/wiki.pl?BrandNiemann
June 9, 2006
Also see http://colab.cim3.net/file/work/SICoP/2006-06-21/SICOP206212006.ppt
1
Overview
• Information networking holds enormous promise for enhancing the
quality and quantity of information available to the public, and
communicating that information more openly and quickly, with
attention to the needs of specific audiences.
– Question 1: What potential does information networking hold for
environmental and public health improvements?
• Federal and state environmental regulatory agencies took a first
leap toward building an environmental information network when
they established the National Environmental Information Exchange
Network.
– Question 2: Where is the Exchange Network now, and where does it
need to go? What does that evolution imply about the actions federal,
state, and local environmental agencies need to take?
• Non-governmental organizations are also building information
networks to improve environmental and public health.
– Question 3: What issues, such as content accuracy, do nongovernmental networks need to address? (This session will explore this
emerging issue.)
Source: http://www.complianceconsortium.org/Events/MEI_Forum_June_06_Philadelphia/index.htm
2
Question 1
• What potential does information
networking hold for environmental and
public health improvements?
– Answer 1. Be like Wikipedia.
– Answer 2. Be like Mashups.
– Answer 3. Be like the new CIA.
3
Question 1
• Answer 1. Be like Wikipedia.
– Wikipedia is an international Web-based free-content encyclopedia. It
exists as a wiki, a type of website that allows visitors to edit its content;
the word Wikipedia itself is a portmanteau of wiki and encyclopedia.
Wikipedia is written collaboratively by volunteers, allowing most articles
to be changed by anyone with access to a computer, web browser and
Internet connection.
– The project began on January 15, 2001, as a complement to the
expert-written (and now defunct) Nupedia, and is now operated by the nonprofit Wikimedia Foundation. Wikipedia has more than 3,800,000
articles in many languages, including more than 1,100,000 in the
English-language version. Since its inception, Wikipedia has steadily
risen in popularity and has spawned several sister projects.
– Wikipedia's most notable style policy is that editors are required to
uphold a "neutral point of view", under which notable perspectives are
summarized without an attempt to determine an objective truth.
Source: http://en.wikipedia.org/wiki/Wikipedia
4
Question 1
• Answer 2. Be like Mashups.
– A mashup is a website or web application that seamlessly combines
content from more than one source into an integrated experience.
– Content used in mashups is typically sourced from a third party via a
public interface or API. Other methods of sourcing content for mashups
include Web feeds (e.g. RSS or Atom) and JavaScript.
– Much the way blogs revolutionized online publishing, mashups are
revolutionizing web development by allowing anyone to combine
existing data from sources like eBay, Amazon.com, Google, Windows
Live, and Yahoo in innovative ways. The greater availability of simple
and lightweight APIs has made mashups relatively easy to design.
They require minimal technical knowledge and thus custom mashups
are sometimes created by unlikely innovators, combining available
public data in new and creative ways. While there are many useful
mashups, others are simple novelties or gimmicks, with minimal
practical utility.
– Advocates and supporters of Web 2.0 applications claim that mashups
exemplify this new movement with their active user participation and
interaction.
Source: http://en.wikipedia.org/wiki/Mashup_%28web_application_hybrid%29
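As a rough illustration of the mashup idea on this slide, the sketch below joins a Web feed with records from a public API on a shared identifier. It is only a sketch: the feed contents, the JSON records, the facility identifiers, and the function name build_mashup are all hypothetical, and both sources are held in memory rather than fetched over the network.

```python
import json
import xml.etree.ElementTree as ET

# Hypothetical RSS feed of facility news (source 1).
RSS_FEED = """<rss version="2.0"><channel>
  <item><title>Inspection completed</title><category>FAC-001</category></item>
  <item><title>Permit renewed</title><category>FAC-002</category></item>
</channel></rss>"""

# Hypothetical JSON API response with facility locations (source 2).
API_RESPONSE = (
    '[{"id": "FAC-001", "city": "Philadelphia"},'
    ' {"id": "FAC-002", "city": "Atlanta"}]'
)


def build_mashup(rss_text, api_text):
    """Join feed items to API records on a shared facility identifier."""
    cities = {rec["id"]: rec["city"] for rec in json.loads(api_text)}
    combined = []
    for item in ET.fromstring(rss_text).iter("item"):
        facility_id = item.findtext("category")
        combined.append({
            "facility": facility_id,
            "headline": item.findtext("title"),
            # Fall back to "unknown" when the API has no matching record.
            "city": cities.get(facility_id, "unknown"),
        })
    return combined


if __name__ == "__main__":
    for row in build_mashup(RSS_FEED, API_RESPONSE):
        print(row)
```

The join key is the design choice that matters: a mashup is only as useful as the identifier the sources have in common.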
5
Question 1
• Answer 3: Wikis and Blogs, Panel 1-5: Knowledge in Action, eGov
Knowledge Management Conference, D. Calvin Andrus, Ph.D.,
Chief Technology Officer, Center for Mission Innovation, CIA, April
21, 2006:
– Two recently invented information sharing tools can enable the
Intelligence Community to adapt to changes quickly:
• One is a self-organized, hyperlinked “encyclopedia” called a Wiki, which is
free from personal opinion.
• The other is self-published hyperlinked points of view on the topics of the
day, called a Blog.
– Because they are real-time, self-authored, hyperlinked bodies of
knowledge that are open to everyone on the system, they can adapt as
fast as a person can enter information.
– At CIA, we have created nearly 500 internal blogs in the last 6 months
(a few dozen are active) and we have an internal Wiki that has
generated about 10,000 pages in about a year.
– Once a critical mass is reached, new social, political, and economic
systems start to emerge. This is what authors Downes and Mui call the
Law of Disruption.
Source: http://events.fcw.com/events/2006/KM/downloads/KM06_1-5_Andrus-CIA.pdf
6
Question 2
• Where is the Exchange Network now, and
where does it need to go? What does that
evolution imply about the actions federal,
state, and local environmental agencies
need to take?
– Answer 1. Luttner, Chelen, and Thompson,
November 29, 2005.
– Answer 2. Molly O’Neill, April 19, 2006.
– Answer 3. Brand Niemann, May 19, 2006.
7
Question 2
• Answer 1: Exchanging Data - Future Strategies for the
Environmental Information Exchange Network, Environmental
Information Symposium 2005, Flagship Session Track 3, November
29, 2005, Mark Luttner, Director, Office of Information Collection,
OEI, U.S. EPA, John Chelen, The Hampshire Institute, and Steve
Thompson, Executive Director, Oklahoma Department of
Environmental Quality:
– Over the last several years the Environmental Information Exchange
Network infrastructure has blossomed into a robust mechanism that
supports EPA and 40 state and tribal partners, and flows environmental
data across all major media areas.
• Where are the future opportunities for this partnership?
• Should we create an Exchange "Macro-Network" that extends our
infrastructure and partnership to health organizations, local governments,
and the private sector?
• What are the new value-added services (geospatial, data publishing,
identity management, quality control) that we might provide to our Network
partners?
• Does facility data present an opportunity for new approaches to
environmental data management among Network partners?
Sources: http://epa.gov/oei/proceedings/pdfs/luttner.pdf
and http://epa.gov/oei/proceedings/pdfs/chelen.pdf
8
Question 2
• Answer 2: Molly O'Neill, ECOS, Overview of Other
Networks, in Track 1: Partnering with Other Networks &
Agencies, at the 2006 Exchange Network Users'
Meeting, April 18 - 19, 2006, San Francisco, CA:
– Exchange Network is on course with Federal initiatives.
– Serving business needs will go a long way to staying the course.
– We also need to be cognizant of these other initiatives to make
sure we continue to be in alignment:
• Global Justice Information Sharing Initiative.
• Public Health Information Tracking.
• National Information Exchange Model (part of Global Justice).
• Federal Enterprise Architecture Data Reference Model.
– Commonalities: Service Oriented Architecture, XML, Data
standards and definitions.
Source: http://www.exchangenetwork.net/2006Meeting/Presentations/GlobalNetworks_ONeill.ppt
9
Question 2
• Answer 3: Use Service Oriented Architecture, the
Federal Enterprise Architecture Data Reference Model,
and Semantic Wikis and Information Management:
– February 2006 to present, EPA Data Architecture for Data
Reference Model 2.0.
• Wiki Page: http://colab.cim3.net/cgi-bin/wiki.pl?EPADataArchitectureforDRM2
– May 23-24, 2006, SOA for E-Government Conference, for the
CIO Council’s Architecture & Infrastructure Committee.
• Wiki Page: http://colab.cim3.net/cgi-bin/wiki.pl?SOAforEGovernment_2006_05_2324
– June 13-15, 2006, The Role of the New Data Reference Model
2.0, Invited Presentation at the Internal Government Agency
Communications, Knowledge-Sharing and Collaboration
Conference.
• Wiki Page: http://colab.cim3.net/cgi-bin/wiki.pl?SICoP/SemanticWikisandInformationManagementWG
10
Question 2
• Answer 3 (continued):
– The Exchange Network of data flows
“exchanges” environmental data from closed,
proprietary formats to open, standardized
formats (XML) in a “network” that delivers that
data from state nodes to EPA’s central node
(CDX).
– The environmental community wants access
to information that will answer questions
about facility compliance, environmental
impacts, etc.
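The format conversion described in the first bullet above can be sketched roughly as follows. This is an illustration only: CSV stands in for whatever closed format a state system might export, and the column names, element names, and the function export_to_xml are hypothetical rather than taken from any actual Exchange Network schema.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Hypothetical export from a state system; the column names are illustrative.
STATE_EXPORT = "facility_id,permit,status\nFAC-001,NPDES,In compliance\n"


def export_to_xml(csv_text):
    """Convert each CSV row into a <Facility> element under one root."""
    root = ET.Element("FacilityList")
    for row in csv.DictReader(io.StringIO(csv_text)):
        facility = ET.SubElement(root, "Facility", id=row["facility_id"])
        ET.SubElement(facility, "Permit").text = row["permit"]
        ET.SubElement(facility, "Status").text = row["status"]
    # Return the document as a text string that a node could send onward.
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    print(export_to_xml(STATE_EXPORT))
```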
11
Question 2
• Answer 3 (continued):
– The environmental community will need to
build a “Collaborative Network” of services
that answers the questions it has.
– Semantic Wiki Technology is recommended to
pilot that “collaborative network” of services
so it can be built inexpensively and rapidly by
non-computer programmers (subject matter
experts) based on Service-Oriented
Architecture and the Federal Enterprise
Architecture’s Data Reference Model.
12
Question 3
• What issues, such as content accuracy, do nongovernmental networks need to address?
– Answer 1: The spontaneous creation of the KatrinaHelp.info Web
site early in the crisis was particularly noteworthy. The wiki uses
collaborative software so anyone can add content. By contrast to
the rarely updated federal, state and local government Web
sites, volunteers constantly revised it. It's still the most
comprehensive information source for those affected by Katrina.
– Isn't it risky letting anyone contribute? Yes, but people can
monitor the wiki for accuracy and remove erroneous or malicious
content. After comparing KatrinaHelp.info to its woeful
government counterparts, I think people would agree the risk is
worth it.
13
Question 3
• What issues, such as content accuracy, do nongovernmental networks need to address?
– Answer 1 (continued): The science of "emergent"
behavior is producing a higher level of effective
collaboration than any individual effort can achieve,
especially in unpredictable situations such as
disasters. KatrinaHelp.info is a perfect example.
– Source: W. David Stephenson: Power to the people,
Federal Computer Week, December 5, 2005,
http://www.fcw.com/article91586-12-05-05-Print
14
Question 3
• What issues, such as content accuracy, do nongovernmental networks need to address?
– Answer 2: This is what I would like to pilot using a
Semantic Wiki that addresses the scenario “What is
the compliance status of facility X?”, using the
following data sources:
• US EPA
• State
• Facility Itself
• Other
– Answer 3: An example of how I would do that is
shown in the following slides.
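As a rough illustration of why the scenario in Answer 2 bears on content accuracy, the sketch below groups reports about one facility by the status each source gives and flags disagreement. The source names follow the slide; the status values and the function summarize_status are hypothetical.

```python
# Hypothetical reports about one facility. Source names follow the slide;
# the status values are made up for illustration.
REPORTS = [
    {"source": "US EPA", "status": "in compliance"},
    {"source": "State", "status": "in compliance"},
    {"source": "Facility Itself", "status": "in compliance"},
    {"source": "Other", "status": "violation reported"},
]


def summarize_status(reports):
    """Group sources by the status they report and flag any disagreement."""
    by_status = {}
    for report in reports:
        by_status.setdefault(report["status"], []).append(report["source"])
    return {"sources_agree": len(by_status) == 1, "by_status": by_status}


if __name__ == "__main__":
    print(summarize_status(REPORTS))
```

Keeping the source attached to each status, rather than collapsing to a single answer, lets a reader judge which report to trust.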
15
Question 3
• Current Work Flow Process:
– Search for data and metadata sources.
– Cut-and-paste those data sources to your tools, e.g. Word,
Excel, PowerPoint, etc. and deal with multiple proprietary
formats.
– Resolve the differences in semantic naming and identification
technologies and standards used.
– Model your information, e.g. table of contents, relational data
structure, etc. for analysis and reporting.
– Enter your information in a network collaboration tool to share it
and connect it with others.
• Semantic Wiki Work Flow Process:
– Subscribe to information services.
– Import, integrate, and analyze multiple proprietary formats.
– Discover and connect relationships with Semantic Agents.
– Make part or all of your information available on the public Web
and/or secure private networks in W3C standard formats.
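The "discover and connect relationships" step above might look something like the following sketch. It assumes statements are stored as subject, predicate, object triples (the shape used by the W3C RDF model) and simply follows links outward from one record. The sample data and the function related are hypothetical, and a real semantic agent would do considerably more than this.

```python
# Statements as (subject, predicate, object) triples. All values are
# hypothetical and exist only to illustrate the idea.
TRIPLES = [
    ("FAC-001", "locatedIn", "Philadelphia County"),
    ("FAC-001", "holdsPermit", "PERMIT-77"),
    ("PERMIT-77", "issuedBy", "State Agency"),
    ("FAC-002", "locatedIn", "Fulton County"),
]


def related(triples, start):
    """Return everything reachable from start by following triples."""
    seen = set()
    frontier = [start]
    while frontier:
        node = frontier.pop()
        for subject, _predicate, obj in triples:
            if subject == node and obj not in seen:
                seen.add(obj)
                frontier.append(obj)  # keep following links from this object
    return seen


if __name__ == "__main__":
    print(sorted(related(TRIPLES, "FAC-001")))
```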
16
Question 3
• Background on Pilots:
– This will build on our experience building a major distributed XML
Web Services network: first for FedStats, with statistical data from the
more than 3,000 counties and 50+ states and territories (1999), for which
I won two awards (Gore Hammer Award, 2000, and Mark Forman and
the Quad Council, 2002), including the more than 3,000 LEPCs; continuing
with the 400+ community indicators programs for the NICS (National
Infrastructure for Community Statistics) CoP (2005); and now for the
some 611 regions used by NARC (2006). This work supplements and
complements that for the EPA-State National Environmental Information
Network (NEIN), with its some 45 state nodes.
• Sample Node in Dynamic Knowledge Repository Network (see next
slide):
– Each node (hierarchy on left side), authored in a Semantic Wiki-like tool,
is an XML Web Service that uses SOAP-like messages to communicate
between nodes that are composed in a Service-Oriented Architecture
(SOA) in which the services provided are structure, searchability, and
semantics, that is, DRM 2.0-compliant.
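To make "SOAP-like messages between nodes" concrete, here is a minimal sketch in which one node builds an XML envelope and another reads it and answers from its own content. The envelope namespace is the standard SOAP 1.1 one; the GetTopic element, the repository contents, and both function names are hypothetical, and no real network transport is involved.

```python
import xml.etree.ElementTree as ET

# SOAP 1.1 envelope namespace.
SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"


def make_request(topic):
    """Build a SOAP-like request asking a node for one topic."""
    envelope = ET.Element(f"{{{SOAP_NS}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_NS}}}Body")
    # GetTopic is a hypothetical operation name used only in this sketch.
    ET.SubElement(body, "GetTopic").text = topic
    return ET.tostring(envelope, encoding="unicode")


def handle_request(message, repository):
    """Parse a request on the receiving node and look up the topic."""
    body = ET.fromstring(message).find(f"{{{SOAP_NS}}}Body")
    topic = body.findtext("GetTopic")
    return repository.get(topic, "not found")


if __name__ == "__main__":
    node_content = {"air-quality": "Air quality indicators"}  # hypothetical
    print(handle_request(make_request("air-quality"), node_content))
```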
17
Question 3
See http://web-services.gov and Dynamic Knowledge Repositories, EPA Region 4
18