“Data GRID” Activity in Japan

Yoshiyuki WATASE
KEK (High energy Accelerator Research Organization)
Tsukuba, Japan
[email protected]
Contents

• Network Infrastructure
• Application Candidates for Data GRID
• High Energy Physics in Japan
• R&D for Data GRID
• Collaboration on Data GRID
• Conclusions
Network Infrastructure

• SuperSINET, a new photonic network for the Japanese academic community, was approved in FY2001 and will be operated by NII (National Institute of Informatics).
• Five research areas have been defined as strategic user groups:
  • High Energy Physics and Fusion Science
  • Astronomy and Space Science
  • Nano Technology
  • GRID Computing
  • Bioinformatics
• NII organizes coordination groups for these areas.
• HEP, Fusion Science, and Astronomy and Space Science are potential users of the Data GRID environment.
SuperSINET

[Figure: SuperSINET topology. A 10 Gbps IP backbone over a photonic network with optical cross-connects (OXC); WDM paths provide peer-to-peer GbE links as end-node services. Hubs at Tokyo, North Kanto, Nagoya, and Osaka (with IP routers or Layer 3 switches) interconnect KEK, Tohoku U, NIG, NIFS, Nagoya U, Kyoto U, ICR Kyoto-U, Osaka U, U Tokyo, NAO, IMS, ISAS, NII Chiba, and NII Hitotsubashi, with Internet connectivity to the US and Europe.]
Application Candidates for Data GRID

• Astronomy and Space Science
  – Radio astronomical observation with realtime VLBI … ongoing
  – Radio telescope array in Chile: ALMA (Atacama Large Millimeter/Submillimeter Array) … 2010
• HEP applications
  – KEKB/Belle experiment … ongoing
  – LHC/ATLAS experiment … 2006
Radio Telescope Array: ALMA

• International joint project: Europe, Japan, and North America are preparing for joint construction of the giant radio telescope "ALMA" in the Atacama desert in Chile.
• 64 × 12 m telescopes within a 14 km² area.
• Distributed data analysis is required.
High Energy Physics

• KEKB/Belle
  – KEKB is an e+ e- collider (3.5 GeV positrons on 8 GeV electrons); the Belle experiment searches for CP violation, similar to the SLAC/BaBar experiment.
  – Data generated: ~0.5 TB/day.
  – The data analysis load is shared between KEK and the universities of Tokyo, Nagoya, and Osaka.
  – Resources at KEK: CPU 12,000 SPECint95; storage 650 TB (tape) + 10 TB (disk).
  – BELLE: http://bsunsrv1.kek.jp/
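
To put the ~0.5 TB/day rate in context, here is a minimal back-of-the-envelope sketch of the replication time to one university site, assuming (this is not a figure from the talk) an effective throughput of about 1 Gbit/s on a SuperSINET GbE path:

    # Transfer-time estimate for daily Belle data replication.
    # Assumption: ~1 Gbit/s effective throughput on a GbE path.
    DATA_PER_DAY_TB = 0.5                 # Belle data volume (from the talk)
    LINK_GBPS = 1.0                       # assumed effective link speed

    bits_per_day = DATA_PER_DAY_TB * 10**12 * 8
    hours = bits_per_day / (LINK_GBPS * 10**9) / 3600
    print(f"One day of Belle data moves in ~{hours:.1f} h")   # ~1.1 h

Under that assumption a single GbE path can comfortably keep up with daily replication to Tokyo, Nagoya, or Osaka.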
B Factory Experiments

[Figure: Belle at KEKB. Electron (8 GeV) and positron (3.5 GeV) beams collide in the Belle detector, producing B B̄ pairs that decay into K, π, γ, e, …. Data flow from KEK over GbE links through the OXC to Tokyo, Nagoya, and Osaka.]
High Energy Physics

• LHC/ATLAS
  – Japan is a member of the ATLAS collaboration.
  – The experiment starts in 2006.
  – A Tier 1 Regional Center is planned at the Univ. of Tokyo.
  – A typical Tier 1 center would have:
    • 200,000 SPECint95
    • Storage for 1/3 of the Event Summary Data: disk 500 TB + tape 250 TB, total ~1 PB
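
A quick arithmetic check of the quoted storage share (the disk and tape figures are from the talk; the implied full-ESD size is an inference, not a quoted number):

    # Sanity check of the Tier 1 storage figures quoted above.
    disk_tb, tape_tb = 500, 250
    share_tb = disk_tb + tape_tb
    print(f"Tier 1 share: {share_tb} TB (~{share_tb/1000:.2f} PB)")  # ~0.75 PB
    # If this share is ~1/3 of the Event Summary Data, the full ESD
    # would be roughly 3x as large (an inference, not from the talk):
    print(f"Implied full ESD: ~{3*share_tb/1000:.2f} PB")            # ~2.25 PB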
R&D for Data GRID

• Bulk data/file transfer over gigabit networks
  – "Data Reservoir" … U. Tokyo + Fujitsu
    • Buffered parallel transfer
    • Data transfer protocol: iSCSI
  – WAN emulator … KEK
    • Long Fat Network emulator with variable latency
• Scalable file system: DAFS (Direct Access File System)

[Figure: Data Reservoir setup. Two server/switch nodes with ~TB buffers exchange data via iSCSI across SuperSINET (1 - 10 Gbps).]
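
Both the buffered parallel transfer and the Long Fat Network emulator revolve around the bandwidth-delay product: a single TCP stream must keep bandwidth × RTT bytes in flight to fill the pipe, which is why bulk movers stage data in large buffers and transfer in parallel. A minimal sketch of the window arithmetic, with an illustrative 1 Gbit/s path and 150 ms trans-Pacific RTT (assumed values, not figures from the talk):

    # Bandwidth-delay product: window needed to fill a long fat pipe.
    link_bps = 1e9          # assumed path bandwidth, bits/s
    rtt_s = 0.150           # assumed round-trip time, seconds

    bdp_bytes = link_bps * rtt_s / 8
    print(f"Window per stream: {bdp_bytes / 2**20:.1f} MiB")   # ~17.9 MiB

    # With TCP's classic 64 KiB window, filling the pipe would take
    # hundreds of parallel streams; window scaling or a few tuned
    # streams with large buffers is the practical alternative.
    streams = bdp_bytes / (64 * 2**10)
    print(f"Streams needed at 64 KiB each: {streams:.0f}")     # ~286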
R&D for Data GRID

• Gfarm (Grid Datafarm): AIST, TITECH, KEK
  1. PC farm with large local disks
  2. Gfarm filesystem: a parallel file system with a metadata database
  3. Large data files are divided and stored onto the local disks in parallel via the Gfarm parallel I/O API (sketched below).
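
A minimal sketch of the data-placement idea behind Gfarm: fragments of a large file land on node-local disks, and a metadata database records where each fragment lives. All names here are hypothetical illustrations of the scheme, not the actual Gfarm API:

    import os

    FRAGMENT_SIZE = 64 * 2**20                 # 64 MiB, arbitrary choice
    NODES = ["node00", "node01", "node02"]     # hypothetical farm nodes
    metadata = {}                              # logical name -> [(node, path)]

    def gfarm_like_store(logical_name, source_path):
        """Split a file into fragments, round-robin across node disks."""
        fragments = []
        with open(source_path, "rb") as src:
            index = 0
            while chunk := src.read(FRAGMENT_SIZE):
                node = NODES[index % len(NODES)]
                # A real deployment writes to the node's local disk over
                # the network; this sketch just simulates the layout.
                frag = os.path.join("/tmp", node, f"{logical_name}.{index}")
                os.makedirs(os.path.dirname(frag), exist_ok=True)
                with open(frag, "wb") as dst:
                    dst.write(chunk)
                fragments.append((node, frag))
                index += 1
        metadata[logical_name] = fragments     # register fragment locations

    def gfarm_like_read(logical_name, dest_path):
        """Reassemble a logical file by concatenating its fragments."""
        with open(dest_path, "wb") as dst:
            for _node, frag in metadata[logical_name]:
                with open(frag, "rb") as f:
                    dst.write(f.read())

The payoff of this layout is that analysis jobs can be scheduled on the node holding each fragment, moving computation to the data rather than the data to the computation.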
Collaboration on Data GRID

• Possible testbed systems for Data GRID
  – KEK, TITECH, Univ. Tokyo, AIST
  – Resources and network:
    • KEK: ~100 CPUs + storage (local disk + HPSS)
    • TITECH: ~500 CPUs, funded by JSPS ($1.2 M)
    • Tokyo: ~50 CPUs
• Collaboration with the LHC Data GRID and iVDGL
  – AIST and TITECH cooperate with iVDGL

AIST: Nat'l Inst. of Advanced Industrial Science and Technology (Tsukuba)
TITECH: Tokyo Institute of Technology (Tokyo)
Collaboration on Data GRID

• ATLAS Data Challenge (DC)
  – Demonstration of distributed data analysis:
    • DC 0: 2001, local test
    • DC 1: Feb. - Jul. 2002, ~1000-PC test (CERN + Regional Centers)
    • DC 2: Spring - Autumn 2003, "1/2 scale" test (CERN + all Regional Centers)
  – International network connections:
    • NII will provide links to the US (planning ~Gbps in 2003).
Conclusions

• Data GRID has a broad range of applications in scientific research.
• A worldwide high-speed network is essential to our success.
• Standardization of GRID middleware is crucial.
• R&D is in progress by the HEP and computer science communities.
• We collaborate with other Data GRID activities.
• The most recent target is the ATLAS Data Challenge.