Puppet Camp Berlin 2015: Andrea Giardini | Configuration Management @ CERN: Going Agile with Style
Post on 15-Jul-2015
Configuration Management @CERN
Going Agile with Style
Andrea Giardini
CERN
andrea.giardini@cern.ch
24 April 2015 - PuppetCamp Berlin
Configuration Management @ CERN 2
Outline
Introduction
- What is CERN
- Datacenters Overview
Puppet @ CERN
- Current Infrastructure
- Modules, Hostgroups and Environments
Configuration Management
- Managing Changes
- Tools
Conclusions
What is CERN
- European Organization for Nuclear Research
- Situated on the border between Switzerland and France
- 21 member states
- Big challenges
Big Challenges - The FCC
The LHC
The Detectors
Data Flow
Datacenters in Numbers
Two datacenters:
- Budapest
- Geneva
Two dedicated links:
- 2 x 100 Gbps
The number of resources is growing year by year. As of today:
- 15k servers
- 100 PB on tape
- 200 PB on disk
Going Agile
Requirements started to grow
- An Agile approach was needed
A few years ago we started using OpenStack to deploy virtual machines for our users and Puppet to configure the services.
Our Setup
We started using Puppet a few years ago and, since then, things have evolved a lot . . .
We changed the configuration of our Puppet masters several times in order to keep up with the requests, and we found out that:
- Puppet scales horizontally quite well
- The NFS filer underneath . . . does not
NFS is used to share configurations and Puppet code between different masters. All the masters used to mount the same shared folder . . .
Clusters and Pools
- Catalog compilation time ∼ 90 sec
- ∼ 180 catalogs / minute
- ∼ 17k Puppet hosts
- Batch ∼ 300 cores
- Interactive ∼ 12 cores
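As a rough cross-check of these figures, the sketch below applies Little's law (work in flight = throughput x time per job). It assumes that the ∼ 180 catalogs / minute is a sustained aggregate rate and that each compilation takes the quoted ∼ 90 seconds; neither assumption is stated on the slide, and the function names are illustrative only.

```python
# Approximate figures quoted on the slide.
COMPILE_SECONDS = 90        # time to compile one catalog
CATALOGS_PER_MINUTE = 180   # aggregate throughput (assumed to be sustained)
PUPPET_HOSTS = 17_000       # hosts managed by Puppet


def concurrent_compilations(throughput_per_min: float, compile_seconds: float) -> float:
    """Little's law: jobs in flight = arrival rate x time spent per job."""
    return throughput_per_min / 60 * compile_seconds


def minutes_for_full_pass(hosts: int, throughput_per_min: float) -> float:
    """Minutes needed to compile one catalog for every host at the given rate."""
    return hosts / throughput_per_min


in_flight = concurrent_compilations(CATALOGS_PER_MINUTE, COMPILE_SECONDS)
full_pass = minutes_for_full_pass(PUPPET_HOSTS, CATALOGS_PER_MINUTE)

print(f"compilations in flight: {in_flight:.0f}")
print(f"full pass over all hosts: {full_pass:.0f} minutes")
```

Under these assumptions about 270 compilations are in flight at any moment, which is of the same order as the ∼ 300 batch cores if one compilation occupies roughly one core.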
A few concepts
- Modules (∼ 280): the various modules available should be viewed as a library that your hostgroup code can reuse.
- Hostgroups (∼ 160): groups of nodes that are part of the same service and have some configuration in common.
- Environments (∼ 180): collections of modules and hostgroups at different development levels.
Environments allow us to . . .
Environment "production" → all modules/hostgroups from the "master" branch
Environment "qa" → all modules/hostgroups from the "qa" branch
Custom environments (for testing purposes):
- Possibility to set a default branch
- Possibility to specify a specific branch for one or more modules/hostgroups
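The branch selection described above can be sketched as a default branch plus per-module and per-hostgroup overrides. This is only an illustration of the idea, not the format actually used at CERN: the `Environment` class, its field names, and the `apache_fix` branch are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Environment:
    """Hypothetical environment definition: a default branch plus overrides."""
    name: str
    default_branch: str = "master"
    module_overrides: dict = field(default_factory=dict)     # module -> branch
    hostgroup_overrides: dict = field(default_factory=dict)  # hostgroup -> branch

    def branch_for_module(self, module: str) -> str:
        """Use the override for this module if present, else the default branch."""
        return self.module_overrides.get(module, self.default_branch)

    def branch_for_hostgroup(self, hostgroup: str) -> str:
        """Use the override for this hostgroup if present, else the default branch."""
        return self.hostgroup_overrides.get(hostgroup, self.default_branch)


# The two standard environments from the slide.
production = Environment("production", default_branch="master")
qa = Environment("qa", default_branch="qa")

# A custom testing environment: everything from master except one module.
testing = Environment(
    "apache_fix",
    default_branch="master",
    module_overrides={"apache": "apache_fix"},
)

print(testing.branch_for_module("apache"))
print(testing.branch_for_module("ntp"))
```

Only the module under test comes from the feature branch; everything else follows the default, which keeps a test environment close to production.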
Manage changes
Three important concepts:
- Modules
- Hostgroups
- Environments
A configuration change has to be approved through a request in Jira.
Every git repo has at least two branches:
- master
- qa
Puppet Run
Jens
Jens creates Puppet environments for the Puppet masters
- Using repository metadata and a list of environment definitions
- Allows dynamic environments and isolates Puppet code for different services
It has recently been open-sourced on GitHub:
https://github.com/cernops/jens
Useful for those running different services under the same Puppet infrastructure
Configuration Change Process
Configuration change process:
- Modify a module on a feature branch
- Create a custom environment and test the module
- Open a ticket on Jira and announce the change
- Merge to qa
- After one week, merge to production
Service managers use the same module for different services: we need to be sure that all the service managers are happy with the change before merging it to production.
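The last two steps amount to a simple gate before the merge to production. The sketch below is a minimal illustration under stated assumptions: the one-week period comes from the slide, while `ticket_announced` and `open_objections` are hypothetical inputs standing in for the Jira ticket and for feedback from other service managers.

```python
from datetime import date, timedelta

# "After one week, merge to production" (from the slide).
QA_PERIOD = timedelta(days=7)


def ready_for_production(merged_to_qa: date, today: date,
                         ticket_announced: bool, open_objections: int) -> bool:
    """Return True when a change sitting in qa may be merged to production.

    ticket_announced and open_objections are hypothetical inputs, not fields
    of any real Jira API.
    """
    waited_long_enough = today - merged_to_qa >= QA_PERIOD
    return ticket_announced and open_objections == 0 and waited_long_enough


# A change merged to qa a full week ago, with no objections.
print(ready_for_production(date(2015, 4, 17), date(2015, 4, 24), True, 0))
# A change merged to qa only four days ago.
print(ready_for_production(date(2015, 4, 20), date(2015, 4, 24), True, 0))
```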
Jenkins and Continuous Integration process
- Machines are built and tested before merging a change to production
- More automation, less manual work
- Still work in progress, but looks promising
Dashboard
Automating procedures - RunDeck
- Tedious, error-prone tasks replaced by executable code
- Handing off operational tasks to others
- Procedures as a list of individual and atomic steps
- Ability to react to failures
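The idea of a procedure as a list of atomic steps with a failure handler can be sketched in a few lines. This is not RunDeck's API or job format; `run_procedure` and the step names are hypothetical and only illustrate the pattern.

```python
from typing import Callable, List, Tuple

# A step is a name plus a callable that raises an exception on failure.
Step = Tuple[str, Callable[[], None]]


def run_procedure(steps: List[Step],
                  on_failure: Callable[[str, Exception], None]) -> List[str]:
    """Run steps in order; stop at the first failure and report it.

    Returns the names of the steps that completed successfully.
    """
    completed = []
    for name, action in steps:
        try:
            action()
        except Exception as exc:
            on_failure(name, exc)
            break
        completed.append(name)
    return completed


log = []
failures = []


def update_dns() -> None:
    # Simulated failure, for illustration only.
    raise RuntimeError("DNS update refused")


steps = [
    ("drain host", lambda: log.append("drained")),
    ("update DNS", update_dns),
    ("reboot", lambda: log.append("rebooted")),
]

done = run_procedure(steps, lambda name, exc: failures.append((name, str(exc))))
print("completed:", done)
print("failed:", failures)
```

Because every step is atomic, the handler knows exactly which step failed and which ones had already completed, so it can retry or roll back from a known state.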
Renaming hosts
MCollective
Framework for server orchestration and parallel job execution
Problems in the past with big clusters (> 3000 nodes)
Latest improvements:
- Direct addressing
- New PuppetDB discovery method
- Threaded mode
- Batched requests
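Batching addresses the large-cluster problem by contacting the nodes in fixed-size groups instead of all at once. The sketch below shows only the generic idea in plain Python; it does not use MCollective's client API, and the host names and batch size are made up.

```python
from typing import Iterator, List, Sequence


def batches(hosts: Sequence[str], size: int) -> Iterator[List[str]]:
    """Yield consecutive groups of at most `size` hosts."""
    if size < 1:
        raise ValueError("batch size must be at least 1")
    for start in range(0, len(hosts), size):
        yield list(hosts[start:start + size])


# Hypothetical cluster slightly above the 3000-node mark mentioned on the slide.
hosts = [f"node{i:04d}.example.org" for i in range(3500)]
sizes = [len(batch) for batch in batches(hosts, 1000)]

print(sizes)
```

Each batch would be addressed and awaited before the next one starts, which bounds the number of replies arriving at the same time.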
Configuration Drifts
Configuration drifts started to be a problem:
- Out-of-sync machines
- Possibility for service managers to have snapshots
- Possibility to freeze their environment
It's not easy to keep all the configuration in sync
Package Inventory
Centralized service for package inventory:
- Using Elasticsearch
- Queryable using a CLI
- Compares a set of hosts
- Reports differences and misalignments
- Package history
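Comparing a set of hosts comes down to finding packages whose version is not identical everywhere. The sketch below works on an in-memory dictionary rather than on Elasticsearch, and the host names and package versions are made-up sample data; it shows the comparison logic only, not the actual service.

```python
from typing import Dict

# host -> {package name: installed version}
Inventory = Dict[str, Dict[str, str]]


def package_differences(inventory: Inventory) -> Dict[str, Dict[str, str]]:
    """Return, per package, the version on each host when hosts disagree.

    A package that is missing on a host is reported as "absent".
    """
    all_packages = set().union(*(pkgs.keys() for pkgs in inventory.values()))
    differences = {}
    for package in sorted(all_packages):
        versions = {host: pkgs.get(package, "absent")
                    for host, pkgs in inventory.items()}
        if len(set(versions.values())) > 1:
            differences[package] = versions
    return differences


# Made-up sample data, for illustration only.
inventory = {
    "host1": {"openssl": "1.0.1e-30", "puppet": "3.7.4"},
    "host2": {"openssl": "1.0.1e-42", "puppet": "3.7.4"},
    "host3": {"puppet": "3.7.4"},
}

diffs = package_differences(inventory)
for package, versions in diffs.items():
    print(package, versions)
```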
Conclusions
Moving from a traditional infrastructure to an Agile one allowed us to:
- Optimize our resources
- Speed up the development cycle
- Reduce intervention time
- Have more free time :)
Conclusions
Puppet gives us the right combination of elasticity and efficiency
- Big community
- Active development
- Highly customizable
Questions?
Andrea Giardini
andrea.giardini@cern.ch
@GiardiniAndrea