SGW-I2-NEIC-5-8

Science Gateways
What are they and why are they having such
a tremendous impact on science?
Nancy Wilkins-Diehr
NeIC abstract
• Science gateways, also known as web portals, are
having a tremendous influence on access to high
performance computing. This talk will describe the
launch of science gateways in the US TeraGrid program
in 2004 through the use of XSEDE resources today. I
will highlight gateway efforts more generally, for
example the International Workshop on Science
Gateways and a new workshop beginning in Australia. I
will also discuss the results from a recent 29,000
person US-based survey on gateway use, development
activities and the need for a community support
model.
What is a science gateway?
science gateway /sī′ əns gāt′ wā′/ n.
1. an online community space for science and engineering research and education.
2. a Web-based resource for accessing data, software, computing services, and equipment specific to the needs
of a science or engineering discipline.
Gateways:
A natural result of the impact of the Internet on
worldwide communication and information retrieval
Less than 25 years since the release of Mosaic!
• Implications for the conduct of science continue to evolve
– 1980’s, Early gateways, National Center for Biotechnology Information BLAST
server, search results sent by email, still a working portal today
– 1989 World Wide Web developed at CERN
– 1992 Mosaic web browser developed
– 1995 “International Protein Data Bank Enhanced by Computer Browser”
– 2004 TeraGrid project director Rick Stevens recognized growth in scientific
portal development and proposed the Science Gateway Program
• Today, science gateways are pervasive
– 29,000 person survey indicates wide use and wide participation in
development in the research community
– Explosion of digital data makes gateways a necessity
• Growing analysis needs in many, many scientific areas
• Sensors, telescopes, satellites, digital images, video, genome sequencers
• Exponential growth in computing simulations
• As the Web grows more capable, it is increasingly important to science
vt100 in the 1980s and a
login window on Gordon today
Why are gateways worth the effort?
• Increasing range of expertise needed to tackle the most challenging scientific problems
– How many details do you want each individual scientist to need to know?
• PBS, RSL, Condor
• Coupling multi-scale codes
• Assembling data from multiple sources
• Collaboration frameworks

PBS batch script:

    #! /bin/sh
    #PBS -q dque
    #PBS -l nodes=1:ppn=2
    #PBS -l walltime=00:02:00
    #PBS -o pbs.out
    #PBS -e pbs.err
    #PBS -V
    cd /users/wilkinsn/tutorial/exercise_3
    ../bin/mcell nmj_recon.main.mdl

Globus RSL:

    +(
      &(resourceManagerContact="tg-login1.sdsc.teragrid.org/jobmanager-pbs")
       (executable="/users/birnbaum/tutorial/bin/mcell")
       (arguments=nmj_recon.main.mdl)
       (count=128)
       (hostCount=10)
       (maxtime=2)
       (directory="/users/birnbaum/tutorial/exercise_3")
    )

Condor-G submit file:

    # Full path to executable
    executable=/users/wilkinsn/tutorial/bin/mcell
    # Working directory, where Condor-G will write
    # its output and error files on the local machine.
    initialdir=/users/wilkinsn/tutorial/exercise_3
    # To set the working directory of the remote job, we
    # specify it in this globus RSL, which will be appended
    # to the RSL that Condor-G generates
    globusrsl=(directory='/users/wilkinsn/tutorial/exercise_3')
    # Arguments to pass to executable.
    arguments=nmj_recon.main.mdl
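The scheduler details on this slide (queue name, node count, walltime, output files) are the kind of thing a gateway can keep away from the scientist. As a rough sketch only, and not a description of how any XSEDE gateway is actually implemented, a gateway back end might turn a few web-form fields into a PBS script like this. The `JobRequest` class and `render_pbs_script` function are hypothetical names introduced for illustration; the default values are copied from the PBS script on the slide.

```python
import shlex
from dataclasses import dataclass


@dataclass
class JobRequest:
    """Fields a gateway user might fill in on a web form (hypothetical)."""
    workdir: str
    executable: str
    input_file: str
    nodes: int = 1
    procs_per_node: int = 2
    walltime: str = "00:02:00"
    queue: str = "dque"


def render_pbs_script(job: JobRequest) -> str:
    """Render the text of a PBS batch script from a form submission."""
    lines = [
        "#!/bin/sh",
        f"#PBS -q {job.queue}",
        f"#PBS -l nodes={job.nodes}:ppn={job.procs_per_node}",
        f"#PBS -l walltime={job.walltime}",
        "#PBS -o pbs.out",
        "#PBS -e pbs.err",
        "#PBS -V",
        # Quote user-supplied paths so spaces or shell metacharacters
        # cannot change the meaning of the generated commands.
        f"cd {shlex.quote(job.workdir)}",
        f"{shlex.quote(job.executable)} {shlex.quote(job.input_file)}",
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    request = JobRequest(
        workdir="/users/wilkinsn/tutorial/exercise_3",
        executable="../bin/mcell",
        input_file="nmj_recon.main.mdl",
    )
    print(render_pbs_script(request), end="")
```

The user supplies only a working directory, a program, and an input file; everything scheduler-specific lives in one place on the gateway side, where it can be changed without retraining users.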
More users access supercomputers via
gateways than from the command line in 2014
[Chart: number of NSF XSEDE users, with series "Active users" and "Gateway users"; data labels 2,984 and 2,398]
This is also true at NERSC
Many other gateways taking off with hundreds of thousands of users (HUBzero, Galaxy, materialsproject.org, more)
Proliferation of Science Gateways
These use XSEDE
Cyberinfrastructure for Phylogenetic
Research (CIPRES)
• Most popular science gateway in
XSEDE
– ~40% of all XSEDE users
• In use on 6 continents
• Cited in major journals (Cell,
Nature, PNAS)
• Used at major research institutions
(Stanford, Harvard, Yale)
• Used by ~60 researchers for
curriculum delivery
• Supports hundreds of publications
every year
• Used in 80% of EPSCoR states
• Used by a 15-year-old high school
student who won the
Massachusetts state science fair
with no support from SDSC staff
Gateways changing the
face of scholarship
“To me this is the essence of a research university, but now this is a
global university. It is not just Purdue, or the people in my group, or
the people that run nanoHUB, it's really more than 1,000 content
contributors and 380 tool developers, most of whom are volunteers
that have contributed to the hundreds of tools.”
Gerhard Klimeck
“A former student of mine published eight tools on nanoHUB, serving over
6,000 people with his tools. He then joined a university as a professor and
introduced nanoHUB. Use of the gateway from that university
skyrocketed; he used nanoHUB in existing classes, created new classes,
and infused it in his research.” Ultimately, the professor’s department
head associates his two-year rise to tenure with the notoriety and
innovation he gained through nanoHUB.
Constant evolution in gateway technologies
IPython, RStudio
But uncertain funding can disrupt effectiveness
Typical 3-year research funding cycle
[Diagram: stages of the cycle: new project prototype, early adopters, publicity, wider adoption, funding ends, scientists disillusioned]
Gateways enable research, but are not research
projects themselves
Science Gateways Institute
2012 NSF Software Institute conceptualization award
2015 NSF Software Institute implementation proposal ($15M)
Are you building websites that serve your
science discipline?
Do you wish you could connect with and
learn from others who are doing the same
thing?
We are building an institute to serve you—and others like you—with resources,
services, experts, and ideas for creating and sustaining science gateways. Sign up to
join the conversation: http://sciencegateways.org/volunteer/
Millions of dollars are spent on gateways, but developers face several challenges
• They often work in isolation even though development can be quite similar across domain areas
• They bridge cyberinfrastructure: locally, campus-wide, nationally, and sometimes internationally
• They need foundational building blocks so they can focus on higher-level, grand-challenge functionality
• They struggle to secure sustainable funding because gateways span the worlds of research and infrastructure
• The goal of the institute would be to provide coordinating activities across the National Science Foundation, offering several services and resources to support the gateway development community:
– An incubator service offering consultation and documentation about business planning and software development.
– An extended support team to build gateways and share their expertise.
– A forum to connect members of the development community.
– A modular, layered framework that supports community contributions and allows developers to choose components.
– Workforce development to help train the next generation for careers in this cross-disciplinary area.
• Sharing expertise about technologies and strategies would allow developers to concentrate on the novel, challenging, and cutting-edge development needed by their specific user communities
29,000-person survey on gateways in 2014
4957 responses from across domains
How important are gateways to your work?
Percent answering "somewhat or very important", by specialized resource:

Specialized Resources                                          Percent
Data collections                                               75%
Data analysis tools, including visualization and mining        72%
Computational tools                                            72%
Tools for rapidly publishing and/or finding articles and
  data specific to my domain                                   69%
Educational tools                                              67%
Platforms for fostering group or community collaboration       63%
Simplified interfaces that eliminate the need to learn coding  62%
Citizen science and other public engagement resources          47%
Workflows that automate or capture tasks or processes          42%
Scientific instruments, such as telescopes, microscopes,
  or sensors                                                   39%
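For context on these percentages, the response rate implied by the two figures on this slide (a 29,000-person survey and 4957 responses) can be worked out directly. This is only back-of-the-envelope arithmetic that treats 29,000 as an exact count, which the slides do not confirm.

```python
surveyed = 29_000   # people surveyed, as stated on the slide (treated as exact)
responses = 4_957   # responses received, as stated on the slide

rate = responses / surveyed
print(f"Response rate: {rate:.1%}")  # prints "Response rate: 17.1%"
```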
57% involved in application creation
Type of application created                              Share
Frameworks or platforms                                  6%
Workflows                                                6%
Citizen science resources                                5%
Collaboration tools                                      8%
Interfaces to scientific instruments                     4%
Interfaces to sensor data                                4%
Other                                                    2%
Data collections                                         15%
Data analysis tools, including visualization and mining  16%
Educational tools                                        18%
Computational tools                                      16%
Gateway development projects require many
different types of people
[Bar chart: % of developers (0% to 80%) answering "Yes, we had this", "No, but wished we had this", or "No, did not need" for each role: Usability Consultant; Graphic Designer; Community Liaison/Evangelist; Project Manager; Professional Software Developer; Student or Post-doc Programmer; Security Expert; Quality Assurance and Testing Expert]
What common services would be helpful?

Proposed Service                                        % Interest
Evaluation, impact analysis, website analytics          72%
Adapting technologies                                   67%
Web/visual/graphic design                               67%
Choosing technologies                                   66%
Usability Services                                      66%
Visualization                                           65%
Developing open-source software                         64%
Support for education                                   64%
Community engagement mechanisms                         62%
Keeping your project running                            62%
Legal perspectives                                      61%
Managing data                                           60%
Computational resources                                 59%
Mobile technology                                       59%
Database structure, optimization, and query expertise   59%
Data mining and analysis                                58%
Cybersecurity consultation                              57%
Website construction                                    57%
Software engineering process consultation               53%
Source code review and/or audit                         51%
High-bandwidth networks                                 45%
Scientific instruments or data streams                  44%
Management aspects of a project                         38%
NSF’s Software Infrastructure for Sustained
Innovation program
Gateways are one of two
featured areas in the
software institutes program
Awards anticipated in 2016
The future is bright for
gateways!
Thank you
Questions?