Ganeti Private Cloud as Google does it
Helga Velroyen <[email protected]>
LinuxTag Berlin, May 9th, 2014

A Ganeti Cluster
· Instance: a virtualization guest
· Node: a virtualization host
· Nodegroup: a homogeneous set of nodes
· Cluster: a set of nodes, managed as a collective, partitioned by nodegroups

What can it do?
· Manage clusters of physical machines
· Deploy virtual machines on them
  - Resiliency to failure (distributed storage)
  - Live migration
  - Ease of repairs and hardware swaps
  - Cluster balancing

Ideas
· Interact with the cluster as an entity, instead of with the individual machines
· Make the virtualization entry level as low as possible
  - Easy to install/manage
  - Lightweight (no "expensive" dependencies)
  - No specialized hardware needed (e.g. SANs)
  - Start small, grow big
· Scale to enterprise ecosystems
  - Manage from 1 to ~200 host machines simultaneously
  - Access to advanced features (distributed storage, live migration, cluster balancing)

Technologies
· Linux and standard utils (iproute2, bridge-utils, ssh)
· Hypervisors:
  - Xen, KVM, LXC
· Storage:
  - DRBD, LVM, file, distributed storage, Ceph/Gluster
· Programming languages:
  - Python, Haskell

Controlling Ganeti
· Command line (*)
· RAPI (RESTful HTTP interface) (*)
· Web interfaces:
  - Ganeti Web Manager, aimed at admins, but includes "self-service management" for users
  - ganetimgr, a simplified multi-cluster web manager for end users
  - Synnefo, a complete cloud service solution, OpenStack API compatible
(*) Programmable interfaces

Production cluster
As we use it in a Google Datacentre
[Diagram: a Ganeti cluster made up of the current master node and many Ganeti nodes, arranged in node groups / racks. Labels: per-machine monitoring, SSH access, Remote API.]

Fleet at Google
[Diagram: Ganeti clusters in the Zurich office and in Datacenters X, Y and Z. Labels: Fleet Management (Virgil, Euripides, Dradis); VM transfer; cluster types Office (no maint window), General (maint window A or B), Ubiquity (no maint window), Dedicated (maint window A or B).]

Instance provisioning at Google
[Diagram labels: Virgil; alloc request; RAPI interface; Ganeti clusters of type General, Ubiquity and Dedicated; Machine DB; Monitoring; scan capacity; gather capacity.]

Auto node repair at Google
(1) Monitoring detects fault
(2) Send machine to repairs
(3) Mark machine broken
(4) Tell cluster to evacuate the broken machine
(5) Send to repairs
[Diagram labels: Euripides, Virgil, Machine database, Ganeti cluster with one broken machine.]

Auto node readd at Google
(1) Detects machine was repaired
(2) Watches machine for 24hrs
(3) Tells Virgil to reintegrate machine
(4) Configure machine
(5) Tell cluster to add it
(6) Mark machine serving
[Diagram labels: Euripides, Machine DB, Virgil, Dradis, Ganeti cluster with one repaired machine.]

Ganeti 2.8, 2.9
2.8.4
· Downgrading
· Autorepair tool
· Hroller
· Improvements on storage, monitoring
2.9.6
· DRBD 8.4 support
· Continued work on monitoring, storage, hroller

Ganeti 2.10
2.10.3, available in Debian wheezy-backports and Debian jessie
· Cross-cluster instance moves:
  - automatic node allocation on destination cluster
  - convert disk templates on the fly
· Cluster balancing based on CPU load
· KVM: hotplug support, direct access to RBD storage
· Ganeti upgrades!
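
The numbered steps on the auto node repair and auto node readd slides above can be read as two linear workflows. The following is a minimal Python sketch of that reading, not Google's tooling: the `Workflow` class and the constant names are hypothetical, and the step texts are transcribed from the slides. It only checks that steps are carried out in the numbered order.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

# Step texts transcribed from the slides. The names REPAIR_STEPS, READD_STEPS
# and Workflow are hypothetical; they are not part of Ganeti.
REPAIR_STEPS: Tuple[str, ...] = (
    "monitoring detects fault",
    "send machine to repairs",
    "mark machine broken",
    "tell cluster to evacuate the broken machine",
    "send to repairs",
)
READD_STEPS: Tuple[str, ...] = (
    "detect machine was repaired",
    "watch machine for 24hrs",
    "tell Virgil to reintegrate machine",
    "configure machine",
    "tell cluster to add it",
    "mark machine serving",
)


@dataclass
class Workflow:
    """Tracks progress through a fixed, ordered list of steps."""
    steps: Tuple[str, ...]
    done: List[str] = field(default_factory=list)

    @property
    def finished(self) -> bool:
        return len(self.done) == len(self.steps)

    def advance(self, step: str) -> None:
        """Record `step`, rejecting it if it is not the next expected one."""
        if self.finished:
            raise ValueError("workflow already finished")
        expected = self.steps[len(self.done)]
        if step != expected:
            raise ValueError(f"expected {expected!r}, got {step!r}")
        self.done.append(step)
```

A caller would create one `Workflow(REPAIR_STEPS)` per broken machine and call `advance()` as each step completes; an out-of-order step raises `ValueError` instead of being silently accepted.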

Updates
In the past, updating Ganeti was a pain:
    /etc/init.d/ganeti stop                                  # on all nodes
    apt-get install ganeti2=2.7.1-1 ganeti-htools=2.7.1-1    # on all nodes
    /usr/lib/ganeti/tools/cfgupgrade                         # on master
    /etc/init.d/ganeti start                                 # on all nodes
    gnt-cluster redist-conf                                  # on master
    ...                                                      # lots of other steps, depending on the version
    # If something goes wrong, fix the mess manually.
From 2.10 on, Ganeti comes with a built-in upgrade mechanism:
    apt-get install ganeti-2.11      # on all nodes
    gnt-cluster upgrade --to 2.11    # on master
    gnt-cluster upgrade --to 2.10    # to roll back
Note that you still have to install the new packages and uninstall the old ones manually.

Ganeti 2.11
Current stable release: 2.11.0
· RPC security: individual node certificates
· Compression for instance moves / backups / imports
· Configurable SSH ports per node group
· Gluster support (experimental)

Current and Future development
No guarantees!
· Network improvements (IPv6, more flexibility)
· Storage: more work on shared storage
· Heterogeneous clusters
· Improvements on cross-cluster instance moves
Google Summer of Code:
· Make LXC support production-ready
· Conversion between arbitrary disk templates

Open Source Events
Confirmed:
· LinuxCon Japan, Tokyo, May 20th 2014
· GanetiCon, Portland, Oregon, September
Not confirmed yet:
· LinuxCon North America, Chicago, August
· FrOSCon, St. Augustin, Germany, August
· LISA '14, Seattle, November

Thank You!
Questions?
· © 2010 - 2014 Google
· Use under GPLv2+ or CC-by-SA
· Some images borrowed / modified from Lance Albertson, Iustin Pop, and Guido Trotter
· Some slides were borrowed / modified from Tom Limoncelli
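
Returning to the Updates slide: the built-in upgrade flow has a fixed order, package installation on every node first, then a single `gnt-cluster upgrade` on the master. The sketch below only builds the command strings shown on the slide and does not execute anything. The function names `upgrade_plan` and `rollback_command` and the node names are hypothetical, and the sketch assumes the master is one of the listed nodes.

```python
from typing import List, Tuple


def upgrade_plan(nodes: List[str], master: str, target: str) -> List[Tuple[str, str]]:
    """Return (host, command) pairs in the order given on the slide.

    Hypothetical helper: it only formats the commands from the slide,
    it does not run them or contact any node.
    """
    if master not in nodes:
        raise ValueError("master must be one of the cluster nodes")
    # Step 1: install the new version's packages on all nodes.
    plan = [(node, f"apt-get install ganeti-{target}") for node in nodes]
    # Step 2: trigger the built-in upgrade once, on the master.
    plan.append((master, f"gnt-cluster upgrade --to {target}"))
    return plan


def rollback_command(previous: str) -> str:
    """Command from the slide for rolling back, to be run on the master."""
    return f"gnt-cluster upgrade --to {previous}"


if __name__ == "__main__":
    for host, command in upgrade_plan(["node1", "node2"], "node1", "2.11"):
        print(f"{host}: {command}")
    print("rollback:", rollback_command("2.10"))
```

Keeping the plan as data rather than running it directly makes it easy to review before handing it to whatever remote execution tool is in use.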