Scalability - wgiss

International Directory Network (IDN)
Scalability, Security and Interoperability
WGISS, 2006
Tom Northcutt
Systems Administrator: GCMD
September 13, 2006
I. Scalability, Interoperability
# GCMD/IDN Web Page Hits Since January 2003
[Figure: monthly web page hits. Y-axis: # hits, 0 to 8,000,000. X-axis: month, Jan-03 to Mar-06. Annotations on the curve: "Cache opened to Internet"; "Search robots"; "Introduction of the new web page".]
Middleware Search/Retrieval Component
(Integration of spatial, freetext, and controlled queries)
[Diagram: data flow among User, Controller, Multi-Layer Search Component, Cache, Result Processor, and Search Results.]
Multi-Layer Search Component layers:
– Controlled Vocabulary Database Layer
– Freetext (Lucene) Database Index Layer
– Spatial Database Index Layer
Numbered arrows:
1) User performs query
2) (no caption)
3) joins
4) (no caption)
5) Title set information, brief summary, dataset links, etc.
6) Returns to user
7) Refines search
Other captions: "delegates"; "Set of unique IDs"
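A minimal sketch of the query flow on this slide, written under stated assumptions: the "joins" step is taken to mean intersecting the ID sets returned by each layer, and the cached "set of unique IDs" is taken to be what a refinement narrows. The index contents, titles, and function names below are hypothetical and are not part of the GCMD/IDN code.

```python
from typing import Dict, List, Optional, Set

# Hypothetical in-memory stand-ins for the three index layers:
# each maps a search term to the set of record IDs matching it.
CONTROLLED: Dict[str, Set[int]] = {"atmosphere": {1, 2, 3, 5}, "oceans": {4, 6}}
FREETEXT: Dict[str, Set[int]] = {"ozone": {2, 3, 7}, "salinity": {4}}
SPATIAL: Dict[str, Set[int]] = {"antarctica": {3, 5, 7}, "arctic": {1, 2}}

# Hypothetical record store read by the result processor.
TITLES: Dict[int, str] = {i: f"Dataset {i}" for i in range(1, 8)}

# Per-session cache holding the current set of unique IDs.
_id_cache: Dict[str, Set[int]] = {}


def join_layers(controlled: Optional[str] = None,
                freetext: Optional[str] = None,
                spatial: Optional[str] = None) -> Set[int]:
    """Query every layer that received a term and intersect the ID sets."""
    id_sets = []
    for index, term in ((CONTROLLED, controlled),
                        (FREETEXT, freetext),
                        (SPATIAL, spatial)):
        if term is not None:
            id_sets.append(index.get(term, set()))
    if not id_sets:
        return set()
    return set.intersection(*id_sets)


def process_results(ids: Set[int]) -> List[str]:
    """Turn a set of IDs into the brief listing returned to the user."""
    return [TITLES[i] for i in sorted(ids)]


def search(session: str, **terms: Optional[str]) -> List[str]:
    """Initial query: join the layers, cache the IDs, return the listing."""
    ids = join_layers(**terms)
    _id_cache[session] = ids
    return process_results(ids)


def refine(session: str, layer: Dict[str, Set[int]], term: str) -> List[str]:
    """Refinement: narrow the cached ID set instead of re-running the query."""
    ids = _id_cache[session] & layer.get(term, set())
    _id_cache[session] = ids
    return process_results(ids)
```

With this toy data, `search("s1", controlled="atmosphere", freetext="ozone")` yields Dataset 2 and Dataset 3, and `refine("s1", SPATIAL, "antarctica")` narrows the cached set to Dataset 3 without touching the other two layers again.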
Scalability: Core GCMD/IDN Architecture
Complexity: “it is hard to make things look easy.”
– These are complex queries, with very fast search results.
– Another example: data resolution refinement
  ● Difficult to implement
  ● Makes it easier for the user
Scalability
● Conventional clustering approach
  – Load balancing
  – High availability
(source: redhat.com)
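For comparison with what follows, the conventional approach can be sketched as round-robin load balancing with failover for high availability. This is an illustrative sketch only; the class and node names are hypothetical and do not come from the slides or from any Red Hat product.

```python
from typing import Iterable, List, Set


class RoundRobinBalancer:
    """Hand out nodes in rotation, skipping any that are marked down."""

    def __init__(self, nodes: Iterable[str]) -> None:
        self.nodes: List[str] = list(nodes)
        self.healthy: Set[str] = set(self.nodes)
        self._next = 0  # index of the next node to try

    def mark_down(self, node: str) -> None:
        self.healthy.discard(node)

    def mark_up(self, node: str) -> None:
        if node in self.nodes:
            self.healthy.add(node)

    def pick(self) -> str:
        # Try each node at most once per call so a fully failed
        # cluster raises instead of looping forever.
        for _ in range(len(self.nodes)):
            node = self.nodes[self._next]
            self._next = (self._next + 1) % len(self.nodes)
            if node in self.healthy:
                return node
        raise RuntimeError("no healthy nodes available")
```

Note that this scheme keeps no per-user state: consecutive requests from the same user may land on different nodes.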
Scalability: GCMD/IDN Implementation
Stateful, Web Proxy Based Clusters
http://gcmd.nasa.gov/Keywords.do? ...&lbnode=2
Accelerated Caching
http://gcmd.nasa.gov/DocumentBuilder/...
Scalability: Extensibility of Stateful Web Proxy Clusters
– Harvester: http://gcmd.nasa.gov/OAI-script? ...
– SOAP: http://gcmd.nasa.gov/ontology.wsdl, http://gcmd.nasa.gov/soap/http
– XML-RPC: http://gcmd.nasa.gov/xml-rpc
– AJAX: http://gcmd.nasa.gov/ajax/some.jsp
Scalability: Stateful Web Proxy Clusters
How we implemented this architecture:
– Modified version of Squid proxy server
– Custom Perl scripts to implement state and redirection
– Dynamic query caching done on the server end so each refinement uses cached results
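The state and redirection step can be illustrated with a short sketch. It assumes that the `lbnode` query parameter seen in the example URLs names the cluster node holding a user's cached query state, so that a request without it is redirected to a URL that carries it. The slides do not show the actual Perl scripts; the node table and function names here are hypothetical, and the sketch is in Python rather than Perl.

```python
from typing import Callable, Dict, Optional, Tuple
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Hypothetical node table: node number -> backend address.
NODES: Dict[int, str] = {1: "10.0.0.1:8080", 2: "10.0.0.2:8080"}


def node_from_url(url: str) -> Optional[int]:
    """Return the node number carried in the URL, or None if absent or invalid."""
    query = dict(parse_qsl(urlsplit(url).query))
    value = query.get("lbnode")
    if value is not None and value.isdigit() and int(value) in NODES:
        return int(value)
    return None


def with_node(url: str, node: int) -> str:
    """Return the URL with its lbnode parameter set to the given node."""
    parts = urlsplit(url)
    params = [(k, v)
              for k, v in parse_qsl(parts.query, keep_blank_values=True)
              if k != "lbnode"]
    params.append(("lbnode", str(node)))
    return urlunsplit(parts._replace(query=urlencode(params)))


def route(url: str, pick_node: Callable[[], int]) -> Tuple[int, str]:
    """Pin a request to a node.

    A request that already names a valid node stays on it. Any other
    request is assigned a node by pick_node, and the returned URL is the
    redirect target that makes later requests sticky.
    """
    node = node_from_url(url)
    if node is None:
        node = pick_node()
        return node, with_node(url, node)
    return node, url
```

For example, `route("http://example.org/Keywords.do?q=ozone", lambda: 2)` returns node 2 together with the same URL extended by `lbnode=2`; routing that returned URL again keeps the request on node 2 without calling `pick_node`.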
Scalability: Advantages of Web Proxy Clusters for CEOS Partners
● Accelerated caching
● Load-balanced nodes
● Stateful architecture
● Open source
● Multiple uses:
  – Web services
  – Browse imagery
  – Metadata search
  – Data access and retrieval
Scalability: Google Map
● Utility:
  – Using Google Map is a form of spreading the load
  – Utilizes third-party resources for map generation
  – Google’s resources are distributed globally
II. Security
Security: Transparent Bridge Filters
[Diagram: network layout with three elements labeled BRIDGE. Other labels: Project Segregation; Network protection; Firewall; Network monitoring; Port remapping; Intrusion detection; Internal Firewalling; Network monitoring.]
Security: Transparent Bridge Advantages
Applicability for CEOS Partners
● Applicable to heterogeneous environments
  – May assist with GRID network security requirements
● Unobtrusive
  – No changes needed on servers or network controllers
● Ultra secure: invisible at the IP level
● Implements emerging security policies
  – Two-factor authentication
  – Efficient encryption, authentication
  – Port knocking capabilities
● Open source
  – Non-proprietary
  – Universal
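Port knocking, one of the capabilities listed above, can be sketched as a small state tracker: a source is admitted only after it contacts a secret sequence of ports in order within a time window. The slides do not describe how the bridge implements it, so the class name, the port numbers, and the window below are hypothetical.

```python
from typing import Dict, Sequence, Tuple


class KnockTracker:
    """Track each source's progress through a secret port sequence."""

    def __init__(self, sequence: Sequence[int], window_seconds: float = 10.0) -> None:
        if not sequence:
            raise ValueError("sequence must not be empty")
        self.sequence = tuple(sequence)
        self.window = window_seconds
        # source address -> (index of next expected port, time of first knock)
        self._progress: Dict[str, Tuple[int, float]] = {}

    def knock(self, source: str, port: int, now: float) -> bool:
        """Record one knock; return True when the full sequence completes."""
        index, started = self._progress.get(source, (0, now))
        if index > 0 and now - started > self.window:
            index, started = 0, now  # too slow: start over
        if port == self.sequence[index]:
            index += 1
            if index == len(self.sequence):
                self._progress.pop(source, None)
                return True
            self._progress[source] = (index, started)
        elif port == self.sequence[0]:
            # A wrong knock that matches the first port restarts the sequence.
            self._progress[source] = (1, now)
        else:
            self._progress.pop(source, None)
        return False
```

A caller would open the protected port for a source only when `knock` returns True, and would pass in its own clock value as `now`.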
Conclusion
● IDN continues to grow in popularity
  – Users
  – Earth science partnerships
● The system continues to develop
  – Scalability
  – Security
  – Usability
  – Interoperability
End