Belle II Computing and requirement of the network
LHCONE Asia-Pacific workshop @ Nantou, Taiwan
Takanori Hara (KEK), [email protected], 13 Aug. 2014

Outline:
- Belle II computing: resources, design, network
- Network data challenge: Trans-Pacific, Trans-Atlantic
- LHCONE(-like layer) for Belle II?

Luminosity Prospect (slide 2)
- SuperKEKB commissioning starts in 2015; the physics run starts in 2017.
- Operation model: 9 months/year, 20 days/month.
- Belle accumulated ~1 ab-1; target integrated luminosity: 50 ab-1 in 2022.
- Target instantaneous luminosity: 8 x 10^35 /cm2/s (cf. 2.1 x 10^34 /cm2/s reached at KEKB/Belle).
- (Figure: projected integrated luminosity [ab-1] vs. calendar year, 2010-2022.)

Event size and rate at storage (LHC experiments: as seen in the 2011/2012 runs):

  Experiment      Event size [kB]   Rate @ storage [event/s]   Rate @ storage [MB/s]
  Belle II        300               6,000                      1,800 (@ max. luminosity)
  ALICE (Pb-Pb)   50,000            100                        4,000
  ALICE (p-p)     2,000             100                        200
  ATLAS           1,500             600                        700
  CMS             1,500             150                        225 (<~1,000)
  LHCb            55                4,500                      250

Hardware Resources for Belle II (slides 3-4)
- (Figures: yearly integrated luminosity [/ab] and the resulting CPU [kHEPSpec], tape for raw data [PB], and disk [PB] needs, total and integrated, for Year 1 - Year 7; contributions from data, data reprocessing, MC, MC reproduction, analysis, and the data challenge are shown separately; the tape requirement is split between KEK and PNNL.)
- For comparison, 2014 WLCG pledges (http://wlcg-rebus.cern.ch/apps/pledges/summary/):
  CPU:  ATLAS = 975 kHS, CMS = 745 kHS, ALICE = 373 kHS, LHCb = 218 kHS
  Tape: ATLAS = 81 PB,   CMS = 77 PB,   ALICE = 28 PB,   LHCb = 21 PB
  Disk: ATLAS = 100 PB,  CMS = 62 PB,   ALICE = 31 PB,   LHCb = 18 PB

Belle II Collaboration (slide 5)
- 23 countries/regions, 97 institutes, 577 colleagues (as of June 30, 2014).
- cf. ATLAS: 38 countries, 177 institutes, ~3,000 members; CMS: 42 countries, 182 institutes, 4,300 members; ALICE: 36 countries, 131 institutes, 1,200 members; LHCb: 16 countries, 67 institutes, 1,060 members.
- Asia ~45%: Japan 137, Korea 34, Taiwan 22, India 20, Australia 18, China 15.
- Europe ~40%: Germany 83, Italy 59, Russia 37, Slovenia 14, Austria 14, Poland 11.
- North America ~15%: US 63, Canada 17.

Belle II Computing Model, until Year 3 (slide 6)
- Detector -> raw data storage and processing at the KEK Data Center.
- Raw data duplex: a full copy of the raw data goes to the PNNL Data Center for storage and (re)processing.
- Regional Data Centers (Asia, Europe 1, Europe 2) hold mDST (data and MC) and host MC production and physics analysis, including skimming.
- GRID sites, cloud sites, and computer-cluster sites contribute to MC production; user analysis at the Ntuple level runs on local resources.
- (Diagram legend: raw data, mDST data, mDST MC; dashed lines are inputs for Ntuples; CPU/disk/tape icons mark the resources at each center.)
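For scale, the raw-data figures on slide 2 translate directly into the wide-area bandwidth and yearly volume that the KEK-PNNL raw-data duplex has to sustain. The following is a minimal illustrative sketch in Python; the event size, trigger rate, and operation model are taken from the slides above, everything else is simple arithmetic (running at design luminosity for the whole year is an upper bound, since the luminosity ramps up over Year 1 - Year 7).

```python
# Rough scale of the KEK -> PNNL raw-data duplex, using figures from the slides:
#   event size ~300 kB, ~6,000 events/s at design luminosity -> 1,800 MB/s to storage,
#   running 9 months/year, 20 days/month.
EVENT_SIZE_KB = 300          # kB per raw event (slide 2)
EVENT_RATE_HZ = 6_000        # events/s at maximum luminosity (slide 2)
MONTHS_PER_YEAR = 9          # operation model (slide 2)
DAYS_PER_MONTH = 20

rate_mb_s = EVENT_SIZE_KB * EVENT_RATE_HZ / 1_000          # MB/s to storage
rate_gbit_s = rate_mb_s * 8 / 1_000                        # Gbit/s on the WAN if copied in real time
seconds_per_year = MONTHS_PER_YEAR * DAYS_PER_MONTH * 24 * 3600
raw_pb_per_year = rate_mb_s * seconds_per_year / 1e9       # PB of raw data per year (upper bound)

print(f"raw-data rate     : {rate_mb_s:,.0f} MB/s  (~{rate_gbit_s:.1f} Gbit/s)")
print(f"raw data per year : {raw_pb_per_year:.0f} PB at design luminosity")
```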
Current status of computing (slide 7)
- 15 countries/regions, 27 sites (+ 2 non-Belle II sites), 70 kHS available (100 kHS at maximum).
- HEPHY (Vienna) and MPPMU (Munich) joined recently; GRID, cloud, and local cluster resources are available.
- MC production campaigns: 1st: 60M events (first official release of MC samples); 2nd: 560M events; 3rd: 6,200M events; BB generic decay/continuum and tau-pair samples, corresponding to 100 fb-1, with and without beam background.
- Trans-Pacific / trans-Atlantic network data transfer challenge.
- (Figure: scale comparison of the distributed computing systems, Belle II now at ~70 kHS versus ~300 kHS / ~120 sites for LHCb.)

Modified Belle II Computing Model, after Year 4 (raw data part) (slide 8)
- The KEK Data Center keeps the full raw data set (100%) for storage and (re)processing.
- The second raw data copy is distributed over the Raw Data Centers: PNNL (30%) and Canada (10%) in North America; Germany (20%) and Italy (20%) in Europe; India (10%) and Korea (10%) in Asia.
- As before, Regional Data Centers, GRID sites, MC production sites, cloud sites, and computer-cluster sites handle MC production and physics analysis (skims, mDST); user analysis at the Ntuple level runs on local resources.

Raw Data Distribution (slide 9)
- Until Year 3: KEK (100%) and PNNL (100%, copied from KEK).
- Scenario 1: KEK (100%) distributes the second copy directly: PNNL (30%), Canada (10%), Korea (10%), India (10%), Germany (20%), Italy (20%).
- Scenario 2 (two-step copy, KEK -> PNNL -> Europe): KEK (100%) sends 70% to PNNL, which keeps 30% and forwards the European shares (Germany 20%, Italy 20%); Canada (10%), Korea (10%), and India (10%) receive their shares directly, as in Scenario 1.

mDST / MC Data Distribution (slide 10)
- mDST (data) is copied to Asia, Europe, and the USA; for MC it seems natural to adopt a similar structure.
- Considerations: better network(?) within each region; completeness of the dataset in each region; easier maintenance(?); imbalance of resources between regions; data copies needed between the three regions.
- Europe (1 set of mDST + MC): main centers GridKa/DESY (Germany) and CNAF (Italy); SiGNET (Slovenia), CYFRONET/CC1 (Poland), BINP (Russia), HEPHY (Austria), CESNET (Czech Rep.), ISMA (Ukraine), INFN Napoli/Pisa/Frascati/Legnaro/Torino (Italy), ULAKBIM (Turkey); also Spain, Saudi Arabia.
- Asia (1 set of mDST + MC): main center KEK (Japan); KISTI (Korea), NTU (Taiwan), Melbourne U. (Australia), IHEP (China), TIFR (India), many Japanese universities; also Thailand, Vietnam, Malaysia, ...
- Americas (1 set of mDST + MC): main center PNNL; U. Victoria / McGill (Canada), VPI, Hawaii, and many US universities; also Mexico.

Scenario 1: projected network traffic (slide 11)
- (Figures: total in-bound traffic [Gbit/s] to PNNL and to Europe (shown on a scale up to ~8 Gbit/s) and total out-bound traffic [Gbit/s] from KEK and from PNNL (up to ~20 Gbit/s), for Year 1 - Year 7, broken down by site: KEK, PNNL, Germany, Italy, Korea, India, Canada, Slovenia, Australia, Austria, China, Czech Rep., Malaysia, Mexico, Poland, Russia, Saudi Arabia, Spain*, Taiwan, Thailand*, Turkey.)
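To make the two raw-data scenarios concrete, the sketch below works out the out-bound raw-data traffic from KEK and from PNNL implied by the distribution fractions on slides 8-9. The average raw-data export rate used here is an assumed, illustrative figure, not a number from the slides (the real rate ramps up with luminosity over Year 1 - Year 7).

```python
# Out-bound raw-data traffic from KEK and PNNL implied by the two distribution
# scenarios (fractions from slides 8-9).  The average export rate is an assumed,
# illustrative input only.
AVG_RAW_RATE_GBPS = 10.0     # assumed average raw-data export rate from KEK, Gbit/s

shares = {"PNNL": 0.30, "Canada": 0.10, "Korea": 0.10, "India": 0.10,
          "Germany": 0.20, "Italy": 0.20}          # second raw-data copy
assert abs(sum(shares.values()) - 1.0) < 1e-9      # the shares make up one full copy

# Scenario 1: KEK ships every share directly to its destination.
kek_out_s1 = AVG_RAW_RATE_GBPS * sum(shares.values())
pnnl_out_s1 = 0.0

# Scenario 2: KEK ships 70% to PNNL (which keeps 30% and forwards 40% to Europe)
# and the remaining 30% directly to Canada, Korea and India.
kek_out_s2 = AVG_RAW_RATE_GBPS * (0.70 + shares["Canada"] + shares["Korea"] + shares["India"])
pnnl_out_s2 = AVG_RAW_RATE_GBPS * (shares["Germany"] + shares["Italy"])

print(f"Scenario 1: KEK out {kek_out_s1:.1f} Gbit/s, PNNL out {pnnl_out_s1:.1f} Gbit/s")
print(f"Scenario 2: KEK out {kek_out_s2:.1f} Gbit/s, PNNL out {pnnl_out_s2:.1f} Gbit/s")
```

The KEK out-bound load is the same in both scenarios; the two-step copy of Scenario 2 shifts the European transfers onto the PNNL-Europe path, which is what the out-bound plots on the next two slides illustrate.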
Scenario 2: projected network traffic (slide 12)
- (Figures: as for Scenario 1 - total in-bound traffic [Gbit/s] to PNNL and to Europe (shown on a scale up to ~14 Gbit/s) and total out-bound traffic [Gbit/s] from KEK and from PNNL (up to ~20 Gbit/s), for Year 1 - Year 7, with the same per-site breakdown.)

Network Connectivity (slide 13)
Current connectivity:
- Trans-Pacific: 10G Tokyo - LA; 10G Tokyo - NY; 10G Osaka - Washington.
- Trans-Atlantic: 3 x 10G NY - Amsterdam; 3 x 10G Washington - Frankfurt; ANA-100G NY - Amsterdam.
- Trans-Asia: 2.5G Madrid - Mumbai; 2.5G Singapore - Mumbai; 10G Japan - Singapore.
"Planned" connectivity:
- Trans-Pacific: SINET5 100G link to the US in 2016.
- Trans-Atlantic: EEX (ESnet Extension to Europe): 2 x 100G NY - London; 100G Washington - Geneva; 40G Boston - Amsterdam.
- Trans-Asia: 10G Mumbai - GEANT; SINET?

Trans-Pacific data challenge: setup (KEK-PNNL) in 2013 (slide 14)
- (Diagram: KEKCC behind an intrusion detection system, a Nexus 5000, and the KEKCC firewall; Catalyst 6504 at the Tsukuba DC; 40G SINET via the Tokyo DC to LAX; a further firewall in front of the PNNL computer on ESnet.)
- There are "firewalls" between KEK and PNNL.
- We need to understand the reason for the 500 MB/s limitation: firewall? sender/receiver hardware, CPU, disk I/O?

Trans-Pacific data challenge: results (slide 15)
- Path: KEK (10 Gbps) - Tsukuba - Tokyo - SINET4 (10 Gbps) - Seattle / PacificWave Los Angeles (PacificWave: 20 Gbps) - PNNL.
- KEK (Japan) -> PNNL (USA): 500 MB/s is achieved, roughly the required network bandwidth in early 2018.
- Also testing the network from PNNL to Europe: PNNL (USA) -> GridKa (Germany): 100 MB/s.
- But this is not enough for the bandwidth needed from the middle of Year 4 onwards (~2 GB/s): we need a 40-100 Gbps network between Japan and the USA.
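One common contributor to a single long trans-Pacific flow stalling well below line rate is the TCP bandwidth-delay product: a sender can keep at most one window of data in flight per round trip. The back-of-envelope sketch below illustrates the scale; the round-trip time and the per-connection window are assumed typical values for a Japan-US path, not measurements from the slides.

```python
# Bandwidth-delay product for a long trans-Pacific TCP flow.
# RTT and window size are assumed illustrative values, not measured numbers.
LINK_GBPS = 10.0      # nominal link capacity (slide 13: 10G Tokyo - LA)
RTT_MS = 130.0        # assumed round-trip time, Japan <-> US West Coast

bdp_bytes = LINK_GBPS * 1e9 / 8 * (RTT_MS / 1e3)      # bytes in flight to fill the pipe
print(f"BDP at {LINK_GBPS:.0f} Gbit/s, {RTT_MS:.0f} ms RTT: {bdp_bytes/1e6:.0f} MB")

# With a modest per-connection TCP window, a single stream is capped at roughly
# window / RTT, which is why large windows and/or many parallel streams are needed
# to approach 500 MB/s and beyond.
WINDOW_MB = 4.0       # assumed per-connection window, for illustration
per_stream_mb_s = WINDOW_MB / (RTT_MS / 1e3)
print(f"single stream with a {WINDOW_MB:.0f} MB window: ~{per_stream_mb_s:.0f} MB/s")
print(f"streams needed for 500 MB/s: ~{500 / per_stream_mb_s:.0f}")
```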
New setup (KEK-PNNL) (slide 16)
- Belle II testing between PNNL and KEK; the setup stays in place through 30 June 2016.
- L2 (Ethernet VLAN) connection from SINET to CENIC (PacWave) to support L3 BGP peering between SINET (for KEK, VRF, AS2907) and ESnet (AS293); PNNL is AS65428.
- (Diagram: VLAN 954 carries the KEK site test subnet 202.13.197.192/26 from KEK via Tsukuba and Tokyo over the trans-Pacific link to lax-dc-GM1.s4.sinet.ad.jp; VLAN 4000 (202.13.223.117/30 - 202.13.223.118/30) peers SINET with ESnet (pnwg-cr5.es.net, sunn-cr5.es.net, SUNN - LOSA); VLAN 3010 (192.188.41.1/30 - 192.188.41.2/30) and a 10 Gbps best-effort LSP between VRFs connect ESnet to the PNNL site test subnet 198.129.43.0/24 behind PNNL-CE2.)
- The current link is 10GE with shared traffic; an upgrade to 100G is in progress (ETA Aug 15, 2014).

Trans-Atlantic data challenge: setup (slide 17)
- The test was done in May/June 2014.
- US side: a dedicated 10G link between the PNNL DTN (dc.hep.pnnl.org, 192.188.41.20, and further test hosts xxx.hep.pnnl.org behind PNNL-CE, 192.188.41.17/28) and ESnet; a 10G best-effort label-switched path in the ESnet backbone (pnwg-cr5.es.net - aofa-cr5.es.net, 100G); across the Atlantic via the MANLAN exchange and the ANA-100 link using Ethernet VLAN bridging on VLAN 3011 (PNNL side routed through a Brocade MLX, Amsterdam side through a Juniper T4000).
- EU side: SURFnet (AS1103) and GEANT (AS20965: mx1.ams.nl, mx1.fra.de, mx1.gen.ch) hand off to GARR (AS137: rx1.na1.garr.net, rx1.mi1.garr.net, rx1.bo.garr.net) towards INFN Napoli (recasse01.na.infn.it, 193.205.223.100) and CNAF (ds-202-11-03.cnaf.infn.it, 131.154.130.76), and to DFN (AS680: xr-fra1.x-win.dfn.de, kr-fzk.xwin.dfn.de) towards KIT (AS34878): KIT dCache (192.108.46.24), f01-151-10-e.gridka.de (192.108.45.245), f01-151-45-e.gridka.de (192.108.45.246), ppssrm-kit.gridka.de (192.108.45.58), fts3-node1-kit.gridka.de (192.108.45.59).
- Network providers set up the VLAN; local network providers and sites coordinated the final configurations; sites must configure their hardware interfaces to match the destinations. (Network contacts: Chin Guok, Vincenzo Capone, Aleksandr Kurbatov, Mian Usman, Marco Marletta, Thomas Schmid, Hubert Weibel.)
- Test procedure:
  - "traceroute" was used to confirm the routing to each DTN.
  - "iperf" was used for the initial network transfer-rate test.
  - "gridftp" and/or "srm-copy" was used to test each site.
  - The FTS3 server at GridKa was used to schedule data transfers.

Trans-Atlantic data challenge: results (slide 18)
- iperf: several parallel transfers were required to reach network saturation; ~9.6 Gbps was reached, and the monitoring plots show about 1.0 GB/s (= 8 Gbps) output and input, more than twice the Tier-1 EU site requirements.
- FTS3: the FTS3 optimization is not ideal; roughly 1.0 GB/s (= 8 Gbps) outbound and 0.5 GB/s inbound; the transfers reach network saturation but the rate falls very quickly, with a large number of dropped packets.
- This already satisfies the incoming network requirements for Tier-1 EU sites up to calendar "Year 6" (2021 or 2022).

Trans-Atlantic data challenge: site rates and lessons (slide 19)
- KIT: 2.0 Gbps inbound; Napoli: 250 MB/s (= 2.0 Gbps) inbound.
- Challenges encountered:
  - The main issue was the configuration of the local network apparatus.
  - Making sure all servers at each site use/check the proper network route.
  - Hardware limitations (router, storage, etc.).
  - No dedicated setups (shared with ATLAS, etc.).
- To accommodate the increased rates (with ESnet):
  - TCP windows were modified at PNNL and in Italy.
  - Routing hardware interfaces were adjusted.
  - Network interrupts were configured/tuned for multicore hosts.
  - The FTS3 optimization and global timeout were modified.
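The trans-Atlantic tests above were driven with standard tools (traceroute, iperf, GridFTP/srm-copy, and the FTS3 server at GridKa). The Python sketch below only illustrates how such a sequence could be wrapped in a script: the host names, URLs, and the FTS3 endpoint are placeholders, and the exact client options available at a given site may differ, so treat it as a template rather than the procedure actually used.

```python
# Illustrative wrapper around the tools named on the slides (traceroute, iperf,
# globus-url-copy, FTS3 client).  Host names, URLs and the FTS3 endpoint below are
# placeholders; check the client versions installed at your site for exact options.
import subprocess

REMOTE_DTN = "dtn.example-site.org"                            # placeholder data-transfer node
SRC_URL = "gsiftp://dtn.example-kek.jp/belle2/test/1GB.dat"    # placeholder source file
DST_URL = "gsiftp://dtn.example-site.org/belle2/test/1GB.dat"  # placeholder destination
FTS3_ENDPOINT = "https://fts3.example-gridka.de:8446"          # placeholder FTS3 server

def run(cmd):
    """Run a command, echoing it first, and return its exit status."""
    print("+", " ".join(cmd))
    return subprocess.run(cmd).returncode

# 1. Confirm the routing to the remote DTN.
run(["traceroute", REMOTE_DTN])

# 2. Raw TCP throughput with several parallel streams (server side: `iperf -s`).
run(["iperf", "-c", REMOTE_DTN, "-P", "8", "-t", "60"])

# 3. A GridFTP transfer with parallel streams and performance markers.
run(["globus-url-copy", "-p", "8", "-vb", SRC_URL, DST_URL])

# 4. Hand a bulk transfer to the FTS3 server so it can schedule and retry it.
run(["fts-transfer-submit", "-s", FTS3_ENDPOINT, SRC_URL, DST_URL])
```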
LHCONE for Belle II? (slide 20)
LHCONE is for the LHC experiments. In Belle II:
- European sites have already joined LHCONE,
- while KEK and PNNL do not belong to LHCONE at present.
Our thoughts:
- Belle II would prefer to have a closed network like LHCONE.
- If configuring new VRFs for Belle II at each collaborating site and in the related networks is difficult or causes operational problems, one possibility for Belle II is to join LHCONE (if that is allowed).
Considerations, whether to join LHCONE or to configure an LHCONE-like VRF layer:
- Many Belle II computing sites overlap with the computing sites of the LHC experiments:
  - negotiation with each site could be easier under the LHCONE umbrella(?);
  - is it difficult to expand LHCONE to non-LHC experiments?
  - configuring another, LHCONE-like VRF layer for Belle II could be difficult for some sites(??).
- Belle II traffic would share the same bandwidth with the LHC experiments:
  - the WAN traffic may be acceptable(?);
  - the traffic pattern is different from the LHC (Japan <-> US/Europe and US <-> Europe are the main flows);
  - but we do not have any financial support for this in Belle II.
Under these conditions we want to find a better solution (your comments are highly appreciated).

Spare slides (slide 21)

Resources at LHC experiments (slide 22)
- (Figures: CPU [kHS], tape [PB], and disk [PB] for ALICE, ATLAS, CMS, and LHCb, 2009-2015.)

DIRAC (slide 23)
- DIRAC: Distributed Infrastructure with Remote Agent Control (developed by LHCb).
- Pilot jobs; a modular structure makes it possible to submit jobs to different backends (EMI computing clusters, OSG, clouds).
- Interoperability across heterogeneous computing resources. (A minimal job-submission sketch follows at the end of these spare slides.)

Network Connectivity in Asia (slide 24)
- (Figure.)

GÉANT (slide 25)
- www.geant.net - The Pan-European Research and Education Network.
- GÉANT interconnects Europe's National Research and Education Networks (NRENs); together they connect over 50 million users at 10,000 institutions across Europe.
- (Map: GÉANT connectivity as at January 2014, with link capacities from >=1 Gbps up to >=100 Gbps. GÉANT is operated by DANTE on behalf of Europe's NRENs and is co-funded by the European Union within its 7th R&D Framework Programme.)
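As a companion to the DIRAC slide above: a minimal job-submission sketch using the DIRAC Python API. It assumes a working DIRAC client installation and a valid grid proxy; the job name and payload are arbitrary examples, not part of the Belle II setup described in these slides, and method names follow recent DIRAC releases.

```python
# Minimal DIRAC job-submission sketch (see the DIRAC slide above).  Assumes a
# configured DIRAC client and a valid grid proxy; the payload is an arbitrary example.
from DIRAC.Core.Base import Script
Script.parseCommandLine()                       # initialise the DIRAC environment

from DIRAC.Interfaces.API.Dirac import Dirac
from DIRAC.Interfaces.API.Job import Job

job = Job()
job.setName("belle2-connectivity-test")         # arbitrary example name
job.setExecutable("/bin/hostname")              # trivial payload: report the worker node
job.setCPUTime(300)                             # seconds; small test job

dirac = Dirac()
result = dirac.submitJob(job)                   # the pilot-job framework picks a backend
print(result)                                   # S_OK-style dict with the job ID on success
```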