
Puppet Camp Silicon Valley 2015: How TubeMogul reached 10,000 Puppet Deployment in one year

Jul 27, 2015

Transcript
Page 1

How TubeMogul reached 10,000 Puppet deployments in one year

May 26th, 2015

Nicolas Brousse | Sr. Director Of Operations Engineering

Julien Fabre | Site Reliability Engineer

Page 2

Who are we?

TubeMogul
● Enterprise software company for digital branding
● Over 27 billion ads served in 2014
● Over 30 billion ad auctions per day
● Bid processed in less than 50 ms
● Bid served in less than 80 ms (including network round trip)
● 5 PB of monthly video traffic served
● 1.3 EB of data stored

Page 3

Who are we?

Operations Engineering
● Ensure the smooth day-to-day operation of the platform infrastructure
● Provide a cost-effective and cutting-edge infrastructure
● Team composed of SREs, SEs and DBAs
● Managing over 2,500 servers (virtual and physical)

Page 4

Our Infrastructure

Public Cloud On Premises

Multiple locations with a mix of Public Cloud and On Premises

Page 5

Technology Hoarders

● Java (a lot!)
● MySQL
● Couchbase
● Vertica
● Kafka
● Storm
● Zookeeper, Exhibitor
● Hadoop, HBase, Hive
● Terracotta
● ElasticSearch, Kibana
● LogStash
● PHP, Python, Ruby, Go...
● Apache httpd
● Nagios
● Ganglia
● Graphite
● Memcached
● Puppet
● HAproxy
● OpenStack
● Git and Gerrit
● Gor
● ActiveMQ
● OpenLDAP
● Redis
● Blackbox
● Jenkins, Sonar
● Tomcat
● Jetty (embedded)
● AWS DynamoDB, EC2, S3...

Page 6

Five Years Of Puppet!

● 2008 - 2010: Use SVN, Bash scripts and custom templates.
● 2010: Managing about 250 instances. Start looking at Puppet.
● 2011: Puppet 0.25 then 2.7 by EOY on 400 servers with 2 contributors.
● 2012: 800 servers managed by Puppet. 4 contributors.
● 2013: 1,000 servers managed by Puppet. 6 contributors.
● 2014: 1,500 servers managed by Puppet. Introduced Continuous Delivery Workflow. 9 contributors. Start 3.7 migration.
● 2015: 2,000 servers managed by Puppet. 13 contributors.

Page 7

Puppet Stats

● 2,000 nodes
● 225 unique node definitions
● 1 puppetmaster
● 112 Puppet modules

Page 8

Where and how do we use Puppet?

● Virtual and Physical Servers Configuration : Master mode
● Building AWS AMI with Packer : Master mode
● Local development environment with Vagrant : Master mode
● OpenStack deployment : Masterless mode

Page 9

Code Review?

Page 10

A Powerful Gerrit Integration

● Gerrit, an industry standard : Eclipse, Google, Chromium, OpenStack, WikiMedia, LibreOffice, Spotify, GlusterFS, etc.
● Fine Grained Permissions Rules
● Plugged to LDAP
● Code Review per commit
● Stream Events
● Use GitBlit
● Integrated with Jenkins and Jira
● Managing about 600 Git repositories

Page 11

Gerrit in Action

Verify -1 when a commit has no ticket number or doesn't pass Jenkins code validation
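
The rule on this slide can be sketched in a few lines of Python. This is only an illustration, not the hook TubeMogul runs: the ticket pattern (a Jira-style key such as OPS-42), the function name and the +1 vote on success are all assumptions, since the slide only states when a -1 is given.

```python
import re

# Assumption: tickets look like Jira keys, e.g. "OPS-42".
TICKET_RE = re.compile(r"\b[A-Z][A-Z0-9]+-\d+\b")


def verified_vote(commit_message, jenkins_passed):
    """Return the vote for a change: -1 if a check fails, +1 otherwise.

    The -1 cases come from the slide; the +1 on success is an assumption.
    """
    has_ticket = TICKET_RE.search(commit_message) is not None
    if not has_ticket or not jenkins_passed:
        return -1
    return 1
```

A commit message without a ticket key gets -1 even when the Jenkins validation passes, which matches the "no ticket #" case above.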

Page 12

Continuous Delivery with Jenkins

● 1 job per module
● 1 job for the manifests and hiera data
● 1 job for the Puppet fileserver
● 1 job to deploy

Global Jenkins stats for the past year
● ~10,000 Puppet deployments
● Over 8,500 Production App Deployments

Page 13

Jenkins job DSL : code your Jenkins jobs

Plugin : github.com/jenkinsci/job-dsl-plugin

● Automate job creation
● Ensure a standard across all the jobs
● Version the configuration
● Apply changes to all your jobs without pain
● Test your configuration changes

Page 14

Team Awareness: HipChat Integration with Hubot

Page 15

Puppet Continuous Delivery Workflow: The Vision

Infrastructure As Code
● Follow standard development lifecycle
● Repeatable and consistent server provisioning

Continuous Delivery
● Iterate quickly
● Automated code review to improve code quality

Reliability
● Improve Production Stability
● Enforce Better Security Practices

Page 16

The Workflow

Page 17

The Workflow : Puppet code logic

Puppet environments
● Dedicated node manifests (*.pp)
● Modules deployed by branch with Git submodules

All the data in Hiera
● Try to avoid params.pp class
● Store everything : modules parameters, classes, keys, passwords, ...

Page 18

Puppet Code Hierarchy

/etc/puppet
├── puppet.conf, hiera.yaml, *.conf
├── hiera
└── environments
    ├── dev
    │   ├── manifests
    │   │   ├── nodes/*.pp
    │   │   └── site.pp
    │   └── modules            (Git submodules, branch dev)
    │       ├── activemq
    │       ...
    │       └── zookeeper
    └── production
        ├── manifests
        │   ├── nodes/*.pp
        │   └── site.pp
        └── modules            (Git submodules, branch production)
            ├── activemq
            …
            └── zookeeper

Page 19

Hiera Configuration

$ cat /etc/puppet/hiera.yaml
---
:backends:
  - eyaml
  - yaml
:yaml:
  :datadir: /etc/puppet/hiera
:eyaml:
  :pkcs7_private_key: /var/lib/puppet/hiera_keys/private_key.pkcs7.pem
  :pkcs7_public_key: /var/lib/puppet/hiera_keys/public_key.pkcs7.pem
:hierarchy:
  - fqdn/%{::fqdn}
  - "%{::zone}/%{::vpc}/%{::hostgroup}"
  - "%{::zone}/%{::vpc}/all"
  - "%{::zone}/%{::hostgroup}"
  - "%{::zone}/all"
  - hostname/%{::hostname}
  - hostgroup/%{::hostgroup}
  - environment/%{::environment}
  - common
:merge_behavior: deeper
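
To make the lookup order concrete, here is a minimal Python sketch of how a hierarchy like this one turns into an ordered list of data sources for a node, and how a simple priority lookup returns the first match. It is not Hiera's implementation: the facts, the data and the function names are made up for the example, and skipping a level whose fact is unset is a simplification.

```python
import re

HIERARCHY = [
    "fqdn/%{::fqdn}",
    "%{::zone}/%{::vpc}/%{::hostgroup}",
    "%{::zone}/%{::vpc}/all",
    "%{::zone}/%{::hostgroup}",
    "%{::zone}/all",
    "hostname/%{::hostname}",
    "hostgroup/%{::hostgroup}",
    "environment/%{::environment}",
    "common",
]

TOKEN = re.compile(r"%\{::(\w+)\}")

# Hypothetical facts and data, for illustration only.
EXAMPLE_FACTS = {
    "fqdn": "bidder01.example.com",
    "hostname": "bidder01",
    "zone": "us-east-1",
    "vpc": "prod",
    "hostgroup": "bidder",
    "environment": "production",
}
EXAMPLE_DATA = {
    "common": {"ntp::server": "ntp.example.com"},
    "us-east-1/all": {"ntp::server": "ntp.us-east-1.example.com"},
}


def resolve_hierarchy(facts):
    """Interpolate facts into each level, most specific first.

    Simplification: a level that needs an unset fact is skipped.
    """
    paths = []
    for level in HIERARCHY:
        names = TOKEN.findall(level)
        if all(facts.get(name) for name in names):
            paths.append(TOKEN.sub(lambda m: facts[m.group(1)], level))
    return paths


def lookup(key, facts, data):
    """Priority lookup: the first level that defines the key wins."""
    for path in resolve_hierarchy(facts):
        if key in data.get(path, {}):
            return data[path][key]
    raise KeyError(key)
```

With the example data, `lookup("ntp::server", EXAMPLE_FACTS, EXAMPLE_DATA)` returns the zone-level value because `us-east-1/all` sits above `common` in the hierarchy.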

Page 20

Encrypt Your Secrets

Hiera eyaml : github.com/TomPoulton/hiera-eyaml

● Hiera backend
● Easy to use
● Powerful CLI : eyaml edit /etc/puppet/hiera/secrets.yaml

$ cat secret.yaml
---
ec2::access_key_id: ENC[PKCS7,MIIBiQYJKoZIhvcNAQcDoIIBejCCAXYCAQAxggEhMIIIBHQIBADAFMAACAQEwDQYJKoZIhvcNAQEBBQAEggEAVIa28OwyaqI5N1TDCvVkBZz3YG+s+Hfzr0lqgcvRCIuJGpq28sQmmuBaQjWY38i86ZSFu0gM6saOHfG64OzVlurO7k/l0CKeL0JfXNaVM4TUqMaN9dSkL5e2vsmpLKrMASawmarqbLYwllTrTe32H4NWxU1e+qWLeUMr9ciBnA3W1Azm4RIo+3bsvgvMfdks....=]

Page 21

Encrypt Files

Blackbox : github.com/StackExchange/blackbox

● Use GPG to encrypt secret files
● Easy to add/delete team members
● No need to change your Puppet code !

# modules/${module_name}/files/credentials.yaml.gpg

# Double quotes on source so that ${module_name} is interpolated.
file { '/etc/app/credentials.yaml':
  ensure => 'file',
  owner  => 'root',
  group  => 'root',
  mode   => '0644',
  source => "puppet:///modules/${module_name}/credentials.yaml",
}

Page 22

The Workflow

Page 23

The Workflow : bottlenecks

● Only Ops team members can commit (SRE, SE)

● Review and validation are done only by an SRE

● Jenkins will verify the code but will not validate the commit

● Static Puppet environments

● Rely a lot on server hostnames

Page 24

Puppet Workflow Reloaded!

Flexibility : R10K (github.com/adrienthebo/r10k)

● Dynamic environments
● No Git submodules anymore ! :-)
● Easy to reproduce any environment
● Can use private and Forge Puppet modules
● Can use branches and tags
● Based on Puppetfile

Page 25

R10K

$ cat Puppetfile
forge "https://forgeapi.puppetlabs.com"

# Forge modules
mod 'pdxcat/collectd'
mod 'puppetlabs/rabbitmq'
mod 'arioch/redis'
mod 'maestrodev/wget'
mod 'puppetlabs/apt'
mod 'puppetlabs/stdlib'

# Tubemogul modules
mod "hosts",
  :git => 'ssh://<gerrit_host>/puppet/modules/hosts',
  :branch => 'dev'
mod "timezone",
  :git => 'ssh://<gerrit_host>/puppet/modules/timezone',
  :branch => 'dev'

...

Page 26

Puppet Workflow Reloaded!

Better code organization : Roles and Profiles

● Represent the business logic : Roles
  o Highest abstraction layer
  o Use Profiles for implementation

● Implement the applications : Profiles
  o Remove potential code duplication
  o Use modules and other Puppet resources

Page 27

Roles/Profiles Pattern

class role::logs {
  include profile::base
  include profile::logstash::server
  include profile::elasticsearch
}

# Named profile::logstash::server to match the include above and the Hiera keys below.
class profile::logstash::server {
  $version    = hiera('profile::logstash::server::version', '1.4.2')
  $es_host    = hiera('profile::logstash::server::es_host', 'es01')
  $redis_host = hiera('profile::logstash::server::redis_host', 'redis01')

  class { 'logstash':
    package_url  => "https://download.elasticsearch.org/logstash/.../logstash_${version}.deb",
    java_install => true,
  }

  logstash::configfile { 'input_redis':
    content => template('logstash/configfile/logstash.input_redis.conf.erb'),
    order   => 10,
  }

  logstash::configfile { 'output_es':
    content => template('logstash/configfile/logstash.output_es.conf.erb'),
    order   => 30,
  }
}

Page 28

Puppet Workflow Reloaded!

Do not rely on hostname : nodeless approach

● Facts to guide Puppet
● No node myawesomeserver { } anymore
● Enforce a cluster vision
● site.pp gives the configuration logic

# /etc/puppet/manifests/site.pp

node default {
  if $::ec2_tag_tm_role {
    notify { "Using role : ${::ec2_tag_tm_role}": }
    include "role::${::ec2_tag_tm_role}"
  } else {
    fail('No role found. Nothing to configure.')
  }
}

Page 29

AWS EC2 tags

● Specify tags during the provisioning
● Retrieve tags with AWS Ruby SDK and create facts
● New hierarchy

$ facter -p | grep ec2_tag
ec2_tag_cluster => rtb-bidder
ec2_tag_nagios_host => mgmt01
ec2_tag_name => bidder
ec2_tag_pupenv => production
ec2_tag_tm_role => rtb::bidder

:hierarchy:
  - "%{::zone}/%{::ec2_tag_vpc}/%{::ec2_tag_cluster}"
  - "%{::zone}/%{::ec2_tag_vpc}/all"
  - "%{::zone}/all"
  - vpc/%{::ec2_tag_vpc}/%{::ec2_tag_cluster}
  - vpc/%{::ec2_tag_vpc}/all
  - environment/%{::environment}
  - common
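
The tag-to-fact step can be illustrated with a short Python sketch. The talk does this with the AWS Ruby SDK inside a custom fact; the code below only mimics the naming scheme visible in the facter output. The function name, the normalization rule (lowercase, non-alphanumerics replaced by "_") and the sample tags are assumptions. The input uses the Key/Value shape in which AWS SDKs return instance tags.

```python
import re


def tags_to_facts(tags):
    """Map EC2 tags ({'Key': ..., 'Value': ...} dicts) to ec2_tag_* facts."""
    facts = {}
    for tag in tags:
        # Assumed normalization: lowercase, non-alphanumerics become "_".
        name = re.sub(r"[^a-z0-9]+", "_", tag["Key"].lower()).strip("_")
        facts["ec2_tag_" + name] = tag["Value"]
    return facts


# Sample tags chosen to reproduce the facter output shown on the slide.
EXAMPLE_TAGS = [
    {"Key": "Name", "Value": "bidder"},
    {"Key": "cluster", "Value": "rtb-bidder"},
    {"Key": "nagios_host", "Value": "mgmt01"},
    {"Key": "pupenv", "Value": "production"},
    {"Key": "tm_role", "Value": "rtb::bidder"},
]

facts = tags_to_facts(EXAMPLE_TAGS)
# The role class that a nodeless site.pp would include for this instance.
role_class = "role::" + facts["ec2_tag_tm_role"]
```

Here `role_class` ends up as `role::rtb::bidder`, so the role of an instance is decided entirely by its tags rather than by its hostname.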

Page 30

New merging and reviewing rules

● Everyone can commit Puppet code

● Allow everyone to review a Puppet change (+1)

● Allow SE and SRE to validate a Puppet change (+2)

● Auto validation/merging in dev if at least 80% of test (+2)
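
The review (+1) and validation (+2) rules can be modeled as a small permission check. This Python sketch is only an illustration of those two rules, not Gerrit's access-control configuration: the group names and function names are hypothetical, and the automatic validation in dev is left out because the slide does not spell out how the 80% threshold is measured.

```python
# Hypothetical group names; the slide only names the SE and SRE roles.
VALIDATORS = {"se", "sre"}


def max_code_review_vote(groups):
    """Highest score a user may give: +2 for SE/SRE, +1 for everyone else."""
    return 2 if VALIDATORS & set(groups) else 1


def is_validated(votes):
    """True once a +2 was given by someone allowed to give it.

    votes: list of (groups, score) pairs, one per reviewer.
    """
    return any(
        score == 2 and max_code_review_vote(groups) >= 2
        for groups, score in votes
    )
```

A change with only +1 votes stays open for review; it counts as validated as soon as an SE or SRE adds a +2.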

Page 31

Next improvements

● Acceptance testing with Beaker and Docker

● Full test provisioning with ServerSpec

● PuppetDB to improve the reporting

● Dedicated Puppet Masters

Page 32

Nicolas Brousse | @orieg

Julien Fabre | @julien_fabre