PortLand: A Scalable Fault-Tolerant Layer 2 Data Center Network Fabric
Radhika Niranjan Mysore, Andreas Pamboris, Nathan Farrington, Nelson Huang, Pardis Miri, Sivasankar Radhakrishnan, Vikram Subramanya, and Amin Vahdat
Department of Computer Science and Engineering, University of California San Diego

Background
• Emerging need for massive-scale data centers
• Such environments require many design elements to achieve high performance, scalability, and fault tolerance

Problems
• Traditional DC networks support VM migration poorly: migrating a VM changes its IP address, which breaks pre-existing TCP connections and imposes administrative overhead for handing connections over between VM hosts
• Switches need to be configured before deployment
• Communication between physically distant hosts is inefficient
• Forwarding loops cause inefficiency or, worse, paralyze the entire network
• Physical connectivity failures interfere with existing unicast and multicast sessions
• At data-center scale, many of these problems stem from the existing IP and Ethernet protocols

Solution
• PortLand: an Ethernet-compatible L2 protocol that solves the issues above

A Fat Tree Network
• The network topology targeted by this paper
• A topology commonly used in DC networks

PortLand Design: Fabric Manager
• A user process running on a dedicated machine somewhere in the network, responsible for:
  – Assisting with ARP resolution
  – Fault tolerance
  – Multicast
• Assumptions:
  – The location of the fabric manager is transparent to each switch in the network
  – The fabric manager is a core component of PortLand and is therefore replicated for redundancy

PortLand Design: Positional Pseudo MAC (PMAC) Address
• A virtual MAC address that encodes the location of the host in the network
• Written as pod.position.port.vmid (see the encoding sketch at the end of these notes)
  – pod = pod number
  – position = position within the pod
  – port = switch port number
  – vmid = virtual machine identifier (incremented for each VM added to the host; presumably zero when the host is not virtualized)
• PMAC assignment:
  1. A host is connected to an edge switch
  2. The edge switch creates an address mapping table entry within itself for later forwarding
  3. The edge switch reports the newly added host to the fabric manager

PortLand Design: Proxy-based ARP
• Ethernet by default broadcasts ARP requests to all hosts in the same L2 domain, which is inefficient
• In PortLand, edge switches intercept ARP requests and forward them to the fabric manager, which resolves the IP-to-PMAC mapping without a broadcast

PortLand Design: Distributed Location Discovery
• Every switch periodically broadcasts a Location Discovery Protocol (LDP) message on all of its ports
• A switch that receives an LDP message processes it in its LDP listener thread: a newly connected switch learns its current position in the network, while existing switches update their forwarding tables

PortLand Design: Unicast Fault-Tolerant Routing
1. A switch detects a link failure
2. The switch informs the fabric manager
3. The fabric manager updates its per-link connectivity matrix
4. The fabric manager informs all affected switches about the link failure

Communication overhead for failure detection:
  – Traditional routing protocols: O(n²)
  – PortLand: O(n)

Implementation
• Hardware
  – 20 switches: 4-port NetFPGA PCI-card switches with a Xilinx FPGA for hardware extensions, hosted in 1U dual-core 3.2 GHz Intel Xeon machines with 3 GB RAM
  – OpenFlow used as the switch configuration/control software; flow table entries are held in a 32-entry TCAM and a 32K-entry SRAM
  – 16 end hosts: 1U quad-core 2.13 GHz Intel Xeon machines with 3 GB RAM running Linux 2.6.18-92.1.18.el5

System architecture
• [Figure] Replicated, synchronized fabric managers (FM) communicate with the OpenFlow switches in the DC network fabric over the OpenFlow protocol, via a dedicated fabric manager network

Evaluation
• Convergence time: convergence time with an increasing number of faults, multicast convergence, TCP convergence
• Scalability: fabric manager control traffic, CPU requirements for handling ARP requests

Conclusion
• Redundancy of the fabric manager
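Appendix: PMAC Encoding Sketch
The sketch below is not from the paper; it is a minimal illustration of how a pod.position.port.vmid tuple could be packed into and unpacked from a 48-bit PMAC, assuming the 16/8/8/16-bit field widths used for PMACs in the paper. The function names encode_pmac and decode_pmac are hypothetical and chosen only for this example.

```python
# Illustrative sketch (not the authors' code): pack and unpack a PortLand PMAC.
# Assumed field layout: pod (16 bits) . position (8 bits) . port (8 bits) . vmid (16 bits).

def encode_pmac(pod: int, position: int, port: int, vmid: int) -> str:
    """Pack pod.position.port.vmid into a 48-bit MAC-style string."""
    assert 0 <= pod < 2**16 and 0 <= position < 2**8
    assert 0 <= port < 2**8 and 0 <= vmid < 2**16
    value = (pod << 32) | (position << 24) | (port << 16) | vmid
    octets = [(value >> (8 * i)) & 0xFF for i in reversed(range(6))]
    return ":".join(f"{o:02x}" for o in octets)

def decode_pmac(pmac: str):
    """Recover (pod, position, port, vmid) from a PMAC string."""
    value = int(pmac.replace(":", ""), 16)
    return ((value >> 32) & 0xFFFF,  # pod
            (value >> 24) & 0xFF,    # position within pod
            (value >> 16) & 0xFF,    # switch port number
            value & 0xFFFF)          # virtual machine identifier

if __name__ == "__main__":
    pmac = encode_pmac(pod=2, position=1, port=3, vmid=7)
    print(pmac)               # 00:02:01:03:00:07
    print(decode_pmac(pmac))  # (2, 1, 3, 7)
```

Because the location fields occupy fixed bit positions, an intermediate switch can forward on a PMAC by masking out only the fields it needs (for example, the pod field at the core level), which is what keeps PortLand's forwarding state small.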