“Grids and eScience”
Mark Hayes
Technical Director - Cambridge eScience Centre
GEFD Summer School 2004
Outline of this talk
1. What is “eScience”?! (as opposed to just plain science.)
2. A brief history of the Internet
3. Some examples of successful eScience
4. Environmental eScience
5. Where all this is heading…
eScience - a definition
“eScience is about global collaboration in key
areas of science and the next generation of
infrastructure that will enable it.”
Dr. John Taylor, Director General of the Research Councils 1998-2003
eScience - my definition
“eScience is research into new ways of using
the Internet to do science.”
In the beginning…
The computer as a communication device
“The collection of people, hardware, and software...
will become a node in a geographically distributed
computer network…. Through the network... all the large
computers can communicate with one another. And through
them, all the members of the community can communicate
with other people, with programs, with data, or with a selected
combination of those resources.”
J.C.R. Licklider, “The Computer as a Communication Device”
Science and Technology, April 1968
The ARPAnet in 1970
A brief history of the Internet
1962 – Paul Baran of RAND invents packet-switched networking
1968 – Licklider’s vision
1969 – ARPAnet goes online
1973 – Bob Kahn & Vint Cerf invent TCP/IP
1979 – Usenet & MUDs invented
1983 – TCP/IP established as a standard
1987 – number of hosts > 10,000
1989 – number of hosts > 100,000
1989 – Tim Berners-Lee invents the World Wide Web
1992 – number of hosts > 1,000,000
http://www.isoc.org/internet/history/
International connectivity - 1991
International connectivity - 1997
International bandwidth
From “3D geographic network displays” - Cox et al, ACM SIGMOD Record - December 1996
What does the Internet look like?
http://www.cybergeography.org/
Using the Internet to do science
• Online publication of papers, pre-prints
e.g. http://www.arxiv.org http://www.pubmedcentral.org/
• CPU cycle scavenging, e.g. SETI@home, climateprediction.net
• The Human Genome Project: free access to data
• Sloan Digital Sky Survey: online database of astronomical data
http://www.sdss.org/
Early distributed computing
1.2 million CPU years so far...
Brute force attempt to crack strong encryption
Protein folding
SETI@home
The world’s most powerful distributed super-computer
delivered 65 Tflop/s yesterday
(Earth Simulator is 35 Tflop/s)
Latest Stats (14th September 2004)
http://setiathome.ssl.berkeley.edu/totals.html

                             Total                          Last 24 Hours
Users                        5,170,918                      1,934
Results received             1.5 x 10^9                     1.4 x 10^6
Total CPU time               2 x 10^6 years                 1,115 years
Floating point operations    5.6 x 10^21 (5.6 zetta-ops)    5.6 x 10^18 (65 Tflop/s)
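The 65 Tflop/s headline follows directly from the last-24-hours figures. The short Python sketch below redoes that arithmetic. The variable names are illustrative only, and the "busy CPUs" and "operations per result" numbers are derived here from the published totals; they are not figures reported by SETI@home itself.

```python
# Back-of-the-envelope check of the SETI@home "last 24 hours" statistics.
# Inputs are the figures quoted on the slide; everything else is derived.

SECONDS_PER_DAY = 24 * 60 * 60
DAYS_PER_YEAR = 365.25

ops_last_24h = 5.6e18        # floating point operations completed in 24 hours
results_last_24h = 1.4e6     # work units returned in 24 hours
cpu_years_last_24h = 1115    # CPU time contributed in 24 hours

# Sustained rate: operations per second, expressed in units of 10^12 (Tflop/s).
sustained_tflops = ops_last_24h / SECONDS_PER_DAY / 1e12

# 1,115 CPU-years delivered in one day means this many CPUs were busy on average.
average_busy_cpus = cpu_years_last_24h * DAYS_PER_YEAR

# Average amount of arithmetic behind each returned result.
ops_per_result = ops_last_24h / results_last_24h

if __name__ == "__main__":
    print(f"Sustained rate:    {sustained_tflops:.1f} Tflop/s")
    print(f"Average busy CPUs: {average_busy_cpus:,.0f}")
    print(f"Ops per result:    {ops_per_result:.1e}")
```

On these numbers, roughly 400,000 CPUs were busy at any moment, each result cost about 4 x 10^12 operations, and the aggregate rate was nearly double the 35 Tflop/s quoted for the Earth Simulator.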
It’s not just compute cycles...
An exponential growth in data from many areas of science.
Human genome project
1995-2003
Five institutions sequenced the bulk of the human genome,
depositing raw data on public FTP servers within 24 hours of
it being sequenced. Three copies of the data are mirrored in the
UK, US & Japan.
Annotating the data is an ongoing world-wide collaborative
effort. See e.g. http://www.biodas.org/ http://www.ensembl.org/
For more on the human genome project:
http://www.sanger.ac.uk/HGP/overview.shtml
http://www.genome.gov/
Environmental eScience
http://www.climateprediction.net
http://ndg.badc.rl.ac.uk/
(NERC DataGrid)
http://www.earthsystemgrid.org/
Where all this is heading
New science, carried out by “virtual organisations” enabled
by the Internet.
VO = distributed data, compute resources, people
Technology:
Globus - http://www.globus.org/
Condor - http://www.cs.wisc.edu/condor/
Access Grid - http://www.accessgrid.org/
The Access Grid
High-end video conferencing
and collaboration technology.
O(100) nodes world-wide.
[Figure: Access Grid room layout, labelled: presenter mic, presenter camera, ambient mic (tabletop), audience camera]
“...one of the most compelling glimpses into the future I’ve seen since I first saw NCSA Mosaic.”
Larry Smarr
Real-time “what if” scenarios
• An explosion!
• A dangerous chemical escapes!
• Where is the pollutant headed?
• Who needs to be evacuated?
The gViz project, Ken Brodlie et al, Leeds University
http://www.visualization.leeds.ac.uk/gViz/ http://www.allhands.org.uk/proceedings/papers/67.pdf
Coupled models
The GENIE project, Paul Valdes et al
• flexibly couple together “component” models to form a unified Earth System Model (ESM),
• execute the resulting ESM across a computational Grid,
• share the distributed data produced by simulation runs, and
• provide high-level open access to the system, creating and supporting virtual organisations of Earth System modellers.
http://www.genie.ac.uk/
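To make "coupling component models" concrete, here is a deliberately tiny Python sketch. It is not GENIE code and does not use any GENIE interface: the `Component` protocol, the `Box` model and the `run_coupled` driver are all hypothetical names invented for this illustration, and the "physics" is just two temperatures relaxing towards each other. The point is only the pattern: each component hides its internals behind a common `step` interface, and a coupler passes fields between them every time step.

```python
from dataclasses import dataclass
from typing import List, Protocol, Tuple


class Component(Protocol):
    """Hypothetical interface that every component model agrees to implement."""
    temperature: float

    def step(self, forcing: float, dt: float) -> None: ...


@dataclass
class Box:
    """Toy component: a single temperature relaxing towards the forcing it is given."""
    temperature: float  # model state, degrees C
    timescale: float    # relaxation time, days (small = fast, large = slow)

    def step(self, forcing: float, dt: float) -> None:
        # Explicit Euler step of dT/dt = (forcing - T) / timescale
        self.temperature += dt * (forcing - self.temperature) / self.timescale


def run_coupled(atmosphere: Component, ocean: Component,
                dt: float, n_steps: int) -> List[Tuple[float, float]]:
    """Advance both components together, exchanging temperatures each step."""
    history = []
    for _ in range(n_steps):
        # Read both states first so each component sees the other's
        # value from the start of the step, not a half-updated one.
        t_atm, t_ocn = atmosphere.temperature, ocean.temperature
        atmosphere.step(forcing=t_ocn, dt=dt)
        ocean.step(forcing=t_atm, dt=dt)
        history.append((atmosphere.temperature, ocean.temperature))
    return history


if __name__ == "__main__":
    atmosphere = Box(temperature=0.0, timescale=5.0)   # responds quickly
    ocean = Box(temperature=20.0, timescale=50.0)      # responds slowly
    for day, (t_atm, t_ocn) in enumerate(run_coupled(atmosphere, ocean, 1.0, 10), 1):
        print(f"day {day:2d}: atmosphere {t_atm:6.2f}  ocean {t_ocn:6.2f}")
```

Because the driver only relies on the shared interface, either `Box` could be swapped for a more detailed model, or for one running on a remote Grid resource, without changing the coupling loop. That is the design idea behind the first two bullets above; a real coupler must also handle regridding, differing time steps and data transfer.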
How you can get involved...
• NIEeS - http://www.niees.ac.uk/
• National eScience Centre (Edinburgh)
http://www.nesc.ac.uk/
• Your local eScience Centre