IntelSim - The Quilt
Easing Access for Researchers
Slide 1
Collaborators want an environment where
managing members & access to resources is
FAST and EASY
This!
Not This!
Slide 2
Introductions
• Moderator: Jill Gemmill, Clemson University
• Nathaniel Mendoza, TACC
• Tracy Futhey, Duke University
• Mike Kirgan, Florida International University
Slide 3
TACC Snapshot Today
• Personnel
– ~135 full-time staff (~50 PhD), 15 students
• Funding
– Roughly 85% externally funded, through 7 major projects and more than
20 smaller grants.
– TACC represents a cumulative investment (mostly NSF and UT) of more
than $300M in buildings, systems, and staffing.
• Facilities
– Datacenter upgrade to 10 MW completed in 2012.
– New office facility to be completed by end of 2015 (accommodating
~150 staff total).
• Services
– HPC, HTC, visualization, large-scale data storage, curation and
analysis, web portals and gateways, API design, consulting, etc.
Slide 4
TACC Provides a Comprehensive Computational
Science Ecosystem
• Stampede: HPC jobs; 6400+ nodes; 10 PFlops; 14+ PB storage
• Maverick: vis & analysis; interactive access; 132 K40 GPUs
• Wrangler: data-intensive computations; 10 PB storage; high IOPS
• Lonestar: HTC jobs; 1800+ nodes; 22000+ cores; 146 GB/node
• Corral: data collections; 6 PB storage; databases; iRODS
• Stockyard: shared workspace; 20 PB storage; 1 TB per user; project workspace
• Rodeo: cloud services; user VMs; XXX VCores; XXX PB
• Vis Lab: immersive vis; collaborative touch screen; 3D
• Ranch: tape archive; 160 PB tape; 1TB access cache
Computation: Stampede, Lonestar
Visualization: Maverick, Stallion.
Storage and Archive: Corral, Ranch (160 PB)
Cloud: Rodeo, Chameleon, Jetstream
Data Driven Computing: Wrangler, Rustler
Reasoning Systems: Yellowhat
Connectivity: 100Gb/s to Internet2
Tools, APIs, Algorithms, Consulting, and Team
Slide 5
Enabling New Modes of Computing
• ~3k projects, ~10k direct users through SSH
• But we have ~35k+ (maybe 50-60k) users who access
us through various APIs
– web gateways
– Galaxy, iPlant, DesignSafe, etc.
– Many don’t even know they use TACC – we’re just “the
cloud”.
• Data: ~3PB per month moves in and out of the Center
– More than 1PB through REST API, and that volume is
increasing 20% *per month*.
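To put the 20% per month figure in perspective, here is a small sketch of the compounding arithmetic. It assumes the growth rate stays constant, which the slide does not claim; the function names are illustrative only.

```python
import math

def growth_factor(monthly_rate: float, months: int) -> float:
    """Multiplier on the starting volume after compounding for `months`."""
    return (1.0 + monthly_rate) ** months

def doubling_time_months(monthly_rate: float) -> float:
    """Months needed for the volume to double at a constant monthly rate."""
    return math.log(2.0) / math.log(1.0 + monthly_rate)

# Assumption: the 20% monthly growth quoted on the slide holds for a full year.
MONTHLY_RATE = 0.20

print(f"growth over 12 months: x{growth_factor(MONTHLY_RATE, 12):.1f}")   # x8.9
print(f"doubling time: {doubling_time_months(MONTHLY_RATE):.1f} months")  # 3.8 months
```

Under that assumption, 1 PB per month through the REST API would become roughly 9 PB per month a year later.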
Slide 6
Enabling Apps in Non-traditional ways
• In collaboration with Stanley Watowich at UTMB, TACC developed a web
portal to provide a graphical interface for conducting a screen for
identifying small molecules that bind to a target protein.
– Users can screen against libraries of ligands up to 642k in size.
– Basic screening is done, and results are collated and best matches
returned in a simple download.
• 70 researchers have used more than 5 million hours on Lonestar
through the portal.
• “We report the discovery of a novel small-molecule inhibitor of the
dengue virus (DENV) protease (NS2B-NS3pro) using a newly constructed
Web-based portal (DrugDiscovery@TACC) for structure-based virtual
screening.”
– Journal of Chemical Information and Modeling, “Identification of a
Novel Inhibitor of Dengue Virus Protease through Use of a Virtual
Screening Drug Discovery Web Portal”
Slide 7
Data at the Speed of Trust 1440588
Overall Goal & Primary Use Case
• Facilitate and ease research collaboration on campus and
off campus, including
– access to protected and sensitive data via flexible
computational resources, and
– through the enablement and simplification of identity
and access management and provision of access to
local resources.
• Duke interdisciplinary and inter-institutional collaboration
(NSF DIBBs ACI-1443014) to develop and utilize synthetic
data techniques for analysis of highly sensitive data (20+
years of OPM data on 30,000+ federal employees).
Slide 8
Data at the Speed of Trust 1440588
Preconditions: Pain Points & Opportunities
• Enabling secure access to datasets and analytic tools
without requiring researchers to double as sys admins
– SDN enabled environment (NSF OCI-1246042; CNS
1243315) on an MPLS network with extensive VRF
deployment (xxx VRFs)
– Mature virtualized environment, not just in the data center but also
enabling public lab functionality, student development, etc.
• Sponsoring and managing Guest Accounts for
collaborators
– Extensive use of Shib (1900 enabled applications)
– Extensive use of Grouper (740,000 groups)
Slide 9
Data at the Speed of Trust 1440588
Approaches: Technical and Social
• Use SDN, Switchboard and simple controller to provision
bypass networks for researchers with servers/data in
departmental locations and/or central data centers.
• Enable remote access via web based services, virtualization,
and container technologies to simplify access and improve
performance to protected network resources and to other
services (both Windows and X based services; “no VNC”)
• Extend federation and access via Shib'ed resources and
negotiate and extend rich attribute release for broad or
narrow access rights (all graduate students at Princeton vs.
Research Collaborator Y at University X)
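The broad-versus-narrow access rights in the last bullet can be sketched as a match between released attributes and a list of rules. This is a minimal illustration, not Duke's implementation: the attribute names (`affiliation`, `principal`) and all values are hypothetical stand-ins for whatever a federation actually releases.

```python
from typing import Dict, List

# Hypothetical attribute names and values, for illustration only.
Attributes = Dict[str, str]

def is_authorized(released: Attributes, rules: List[Attributes]) -> bool:
    """True if the released attributes satisfy every pair of at least one rule."""
    return any(
        all(released.get(name) == value for name, value in rule.items())
        for rule in rules
    )

RULES: List[Attributes] = [
    # Broad rule: anyone whose home institution asserts this affiliation.
    {"affiliation": "grad-student@princeton.edu"},
    # Narrow rule: one named collaborator at one institution.
    {"principal": "collaborator-y@university-x.example"},
]

grad = {"principal": "someone@princeton.edu",
        "affiliation": "grad-student@princeton.edu"}
collaborator = {"principal": "collaborator-y@university-x.example",
                "affiliation": "faculty@university-x.example"}
stranger = {"principal": "other@university-x.example",
            "affiliation": "faculty@university-x.example"}

print(is_authorized(grad, RULES))          # True (broad rule)
print(is_authorized(collaborator, RULES))  # True (narrow rule)
print(is_authorized(stranger, RULES))      # False
```

The richer the attribute release negotiated with a partner, the more precisely such rules can be scoped.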
Slide 10
Secure Access for Everyone (SAFE) 1440728
• The Secure Access for Everyone (SAFE) project is
about building a standards-based integrated Identity and
Access Management (IAM) system that supports trusted
collaboration and sharing of CI resources across
multiple institutions
• Drivers
– Slow process for collaborators to get access to resources
– Sharing data and accessing systems were too
cumbersome, especially for infrequent users
– The amount of research requiring collaboration across
multiple institutions keeps increasing
Slide 11
Key Goals of SAFE to Support Multi-Institutional
Collaborations
• Allow collaborators to use their home institution’s
credentials to access campus CI resources, regardless of whether
that resource has a web or command-line login (e.g., ssh)
• Access requests must be processed quickly
• Provide a user friendly way to access campus CI
resources, including a single portal, easy to use
interfaces, and documentation for each system
• Provide a convenient way to share and access data
• Provide all of this in a secure fashion
Slide 12
Components of SAFE
1) Federated Web Portal provides a single place that all
researchers can go to log-in and connect to any SAFE
integrated CI resource. This portal leverages InCommon
for federated authentication.
2) Identity and Access Management system was built into
the SAFE portal and integrated with the campus IAM
system. Together, they process all access requests,
automatically provision accounts, recertify accounts
annually when needed, and link a user’s InCommon
credentials with an internally created FIU account.
Slide 13
Components of SAFE (cont.)
3) Login interfaces - Both customized web and command
line interfaces (such as ssh) are provided through the
SAFE web portal. This secures connections without
requiring the use of VPN. Only VPN and Web Portal
connections are allowed to connect directly to many CI
resources. Local credentials do not need to be known by
the collaborator as the web portal will automatically pass
the appropriate credentials.
4) Other Components – Shibboleth, CAS, LDAP, AD,
Shellinabox, custom IAM & IBM Security Identity Manager
5) Current CI Integrations – HPC, Virtual Computing Lab,
Collaborative Data Storage System, Library Databases
Slide 14
Account Approval & Login Process
• Access requests are approved by the PI online in the SAFE
IAM system, renewed once a year. Local accounts are
created and linked to InCommon credentials.
• The IAM system provisions (and de-provisions when needed)
local accounts to the appropriate CI resources
• SSH logins are supported by passing encrypted local
credentials to a modified open-source web-based SSH
interface, which then automatically logs the collaborator into a
local account on the desired resource.
• Web-based resources either support Shibboleth, have been
modified to do so by the SAFE team, or in some cases are being given a
customized alternate web interface.
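The approval, linking, and yearly renewal steps above can be sketched as a small record type. This is an illustrative model only, not the SAFE code: the class, field names, and account naming are hypothetical, and the 365-day window is an assumption standing in for "renewed once a year".

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import Optional

# Assumption: "renewed once a year" modeled as a fixed 365-day window.
RECERTIFY_AFTER = timedelta(days=365)

@dataclass
class AccessRequest:
    federated_id: str                    # identity asserted by the home institution
    resource: str                        # CI resource being requested
    approved_on: Optional[date] = None
    local_account: Optional[str] = None  # local account linked on approval

    def approve(self, local_account: str, today: date) -> None:
        """PI approval: link the federated identity to a local account."""
        self.local_account = local_account
        self.approved_on = today

    def is_active(self, today: date) -> bool:
        """Active only if approved and not yet due for recertification."""
        if self.approved_on is None:
            return False
        return today - self.approved_on < RECERTIFY_AFTER

    def deprovision(self) -> None:
        """Remove the local account link, e.g. when access is not renewed."""
        self.local_account = None
        self.approved_on = None

request = AccessRequest("researcher@partner.example", "hpc")
request.approve(local_account="safe_u1001", today=date(2015, 1, 15))
print(request.is_active(date(2015, 6, 1)))  # True
print(request.is_active(date(2016, 2, 1)))  # False: recertification is due
```

Keeping the federated identity and the local account in one record is what lets the portal pass local credentials without the collaborator ever seeing them.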
Slide 15
FeduShare: A User-Managed Collaboration
Framework (ACI-1440609)
We have been modeling and designing
campus infrastructure as a closed system
with identities & resources we
What if we modeled and designed
for open collaboration instead?
Slide 16
The Project: Two Use Cases + a Catalog
Use Case 1: Federated access to a
campus HPC cluster via console
logon -- in PRODUCTION
SYSTEMS (Year 1)
Use Case 2: Federated access to
multiple clouds/SDN testbeds (e.g.,
GENI and CloudLab) (Year 2)
Catalog: Open Source Software
candidates to use for FeduShare
framework components (Years 1 & 2)
https://sites.google.com/site/fedushare/
Slide 17
Decision Matrix
                                 CILogon      ECP SSH      ECP PAM   SSH Keys   Web Portal
Software exists today            ✔️           ✔️           ❌        ✔️         ✔️
Password not exposed to server   ✔️           ✔️           ❌        ✔️         ✔️
No extra registration step       ❌ (cert)    ✔️           ✔️        ❌ (key)   ✔️
No new user-managed keys         ❌           ✔️           ✔️        ❌         ✔️
Uses SAML for SSH login          ❌           ✔️           ✔️        ❌         ✔️
Native SSH client                ✔️           ✔️           ✔️        ✔️         ❌ (browser)
No special client software       ❌ (gsissh)  ❌ (ecpssh)  ✔️        ✔️         ✔️
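The decision matrix on this slide can also be encoded as data and queried. The sketch below is a reading of the flattened slide, so the row and column assignment is an assumption; the helper names are illustrative.

```python
from typing import Dict, List

CRITERIA: List[str] = [
    "Software exists today",
    "Password not exposed to server",
    "No extra registration step",
    "No new user-managed keys",
    "Uses SAML for SSH login",
    "Native SSH client",
    "No special client software",
]

# One boolean per criterion, in CRITERIA order (assumed reading of the slide).
OPTIONS: Dict[str, List[bool]] = {
    "CILogon":    [True,  True,  False, False, False, True,  False],
    "ECP SSH":    [True,  True,  True,  True,  True,  True,  False],
    "ECP PAM":    [False, False, True,  True,  True,  True,  True],
    "SSH Keys":   [True,  True,  False, False, False, True,  True],
    "Web Portal": [True,  True,  True,  True,  True,  False, True],
}

def options_meeting(required: List[str]) -> List[str]:
    """Names of the options that satisfy every required criterion."""
    indexes = [CRITERIA.index(name) for name in required]
    return [name for name, marks in OPTIONS.items()
            if all(marks[i] for i in indexes)]

def score(option: str) -> int:
    """Number of criteria an option satisfies."""
    return sum(OPTIONS[option])

print(options_meeting(["Uses SAML for SSH login", "Native SSH client"]))
# ['ECP SSH', 'ECP PAM']
print({name: score(name) for name in OPTIONS})
# {'CILogon': 3, 'ECP SSH': 6, 'ECP PAM': 5, 'SSH Keys': 4, 'Web Portal': 6}
```

No option satisfies all seven criteria, so the choice depends on which requirements a campus treats as mandatory.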
Slide 18
Happy Side Effect: Mobile Logon using Shib-ECP
• Converting our hybrid mobile web app into a
native iOS app (Android next)
• We wanted a login-once paradigm
• Integrate native login with Shibboleth since
most other campus services use it
• FeduShare work at Clemson ensured that our
IDP supported ECP and was configured
properly
https://github.com/OpenClemson/SwiftECP
Slide 19
Panel Discussion Questions
• How is managing access to resources hosted on
campus changing due to the increase in
collaboration? How is your organization
responding to this change?
• What do you consider to be the most novel or
innovative aspect of your project?
• What do you see as the most important barriers to
access for researchers that have yet to be solved?