How Atlassian's Build Engineering Team Has Scaled to 150k Builds Per Month and Beyond – PuppetConf 2015

Post on 22-Jan-2018

Transcript

PETER LESCHEV • TEAM LEAD • ATLASSIAN • @PETERLESCHEV

Build Engineering @ Atlassian: Scaling to 150k builds per month & beyond

PuppetConf 2015

Introduction • Team • Infrastructure • Bamboo Servers • Conclusion

Introduction

Build platform & services used internally within Atlassian to build, test & deliver software

Developers expect a reliable infrastructure & fast CI feedback

Build Engineering today @ Atlassian

• 12 Bamboo Servers
• maven.atlassian.com / 9 Nexus instances / 9 TB
• 7 Nexus proxies for internal traffic
• Monitoring: opsview, graphite, statsd, newrelic, datadog
• 1200 build agents on EC2, which include SCM clients, JDKs, JVM build tools, databases, headless browser testing, Python builds, NodeJS, installers & more
• Maintain 20 AMIs of various build configurations

4 years ago: 21k builds per month

Last month: 186k builds per month

JIRA alone has 49k automated tests

3 stories of gaining maturity to handle Atlassian growth

Team

History of team roles

Individual Engineers

Information silos

Fault investigation, requests for advice, unplanned work

Little project work

Very interrupt driven

Duplication of effort

Limited to customer driven changes

Disturbed role

Knowledge transfer: reduction in duplication of effort, promotes collaboration within the team

More project work: non-disturbed engineers can focus on larger tasks

Context switching: switching between project / disturbed roles is difficult

2 week rotation

Team expands

Build Engineers

Infra Engineers

Developers

Disturbed for Dev & Infra

Too interrupt driven

To encourage knowledge transfer between infra & dev

Staggered changeovers: minimising disruption due to context switching

Disturbed pairing

Couldn’t handle smaller customer raised requests & interrupt driven work

Supporting Developers

team channel

1. Measure the pain

2. Continuous Improvement

Technical Debt

Contact Rate

Contact Rate = (Customer JIRA issues + Hipchat queries + Confluence Questions) ÷ Number of Developers
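The contact rate on this slide divides the sum of customer JIRA issues, Hipchat queries and Confluence Questions by the number of developers. A minimal sketch of that arithmetic follows; the function name, the idea of computing it per reporting period, and the sample figures are assumptions for illustration and do not come from the talk.

```python
def contact_rate(jira_issues: int, hipchat_queries: int,
                 confluence_questions: int, developers: int) -> float:
    """Support contacts per developer for one reporting period."""
    if developers <= 0:
        raise ValueError("developers must be positive")
    contacts = jira_issues + hipchat_queries + confluence_questions
    return contacts / developers


# Hypothetical figures, for illustration only (not from the talk).
rate = contact_rate(jira_issues=120, hipchat_queries=300,
                    confluence_questions=30, developers=900)
print(rate)  # 450 contacts / 900 developers -> 0.5
```

Tracked over time, a falling value would suggest that developers need to contact the team less often per head, even as headcount grows.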

The Shield

http://www.clker.com/cliparts/e/d/c/4/11970889822084687040sinoptik_Medieval_shield.svg.hi.png

Rebranding

Maintenance / Disturbed

Removing the negative attitude towards the old role within the team

Project work

Maintenance

The Shield

“How do we avoid this in the future?” PETER LESCHEV

Fix it now, fix it for the future

Self service

Chat bots

Maven Self Help Tool

Infrastructure

Infrastructure as Code = Puppet + SCM ?

4 years ago…

Started using Puppet

Manually maintained snowflakes

Production rollout

puppetmaster

build agents

Production rollout failure

puppetmaster

build agents

Low confidence of change

atlassian.com/git

Style in Pull Requests

Puppet Lint

https://github.com/rodjek/puppet-lint

Tim Sharpe, @rodjek

Runs checks & posts results, fails if there are any warnings or errors

Automated Build

Automated Style Checking
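The automated style check fails the build if puppet-lint reports any warnings or errors. The sketch below shows one way such a gate could be expressed. It assumes puppet-lint's plain output format, where each finding starts with `WARNING:` or `ERROR:`; the function names and sample findings are hypothetical, and this is not the team's actual build script.

```python
def summarise_lint(output: str) -> dict:
    """Count findings by kind in puppet-lint style output."""
    counts = {"WARNING": 0, "ERROR": 0}
    for line in output.splitlines():
        kind = line.split(":", 1)[0].strip()
        if kind in counts:
            counts[kind] += 1
    return counts


def build_passes(output: str) -> bool:
    """The gate is strict: any warning or error fails the build."""
    counts = summarise_lint(output)
    return counts["WARNING"] == 0 and counts["ERROR"] == 0


# Sample findings, made up for illustration.
SAMPLE = """\
WARNING: double quoted string containing no variables on line 5
ERROR: two-space soft tabs not used on line 9
"""

print(summarise_lint(SAMPLE))
print("pass" if build_passes(SAMPLE) else "fail")
```

In a real build the text would come from running puppet-lint over the manifests in the pull request, and the summary would be posted back as a comment.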

• Coding on Puppet Master
• Culture of manually modifying production - Configuration Drift
• Impact on Builds

Using Staging for Development

puppetmaster

build agents

staging puppet environment

Vagrant

www.vagrantup.com

Mitchell Hashimoto, @mitchellh

Packer

packer.io

Rolling out to staging

Rolling out to production

Broken build agents

Developing locally

Behaviour Driven Development

Cucumber

https://github.com/cucumber/aruba

“But it works on my machine” EVERY DEVELOPER

Continuous Integration: ‘From scratch’ provisioning

Confidence that you can rebuild in disaster

“The Pets: you give nice names, you stroke them, and when they get ill, you nurse them back to health, taking a long time over it. The Cattle: you give them numbers. When they get ill, you shoot them.” TIM BELL, CERN

Provisioning from scratch is slow

Profiling Puppet Runs

Add “--evaltrace” to puppet apply

Collect and show the longest occurrences of: “Evaluated in ([\d\.]+) seconds”
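The profiling approach is to run puppet apply with --evaltrace, then pull out the lines matching “Evaluated in ([\d\.]+) seconds” and list the slowest ones. A minimal sketch of that collection step follows. It assumes each trace line has the form `Info: <resource>: Evaluated in N seconds`; the resource names in the sample log are invented for illustration.

```python
import re

# Matches evaltrace lines; the optional "Info: " prefix is an assumption
# about how the log was captured.
EVAL_RE = re.compile(
    r"^(?:Info: )?(?P<resource>.*?): Evaluated in (?P<secs>[\d\.]+) seconds"
)


def slowest_resources(log_text: str, top: int = 5) -> list:
    """Return the `top` slowest resources as (seconds, resource) pairs."""
    timings = []
    for line in log_text.splitlines():
        match = EVAL_RE.search(line)
        if match:
            timings.append((float(match.group("secs")), match.group("resource")))
    timings.sort(reverse=True)
    return timings[:top]


# Hypothetical log excerpt, for illustration only.
SAMPLE_LOG = """\
Info: /Stage[main]/Jdk/Package[openjdk]: Evaluated in 42.10 seconds
Info: /Stage[main]/Maven/File[/opt/maven]: Evaluated in 0.03 seconds
Info: /Stage[main]/Browsers/Exec[install-firefox]: Evaluated in 17.50 seconds
"""

for secs, resource in slowest_resources(SAMPLE_LOG, top=2):
    print(f"{secs:8.2f}s  {resource}")
```

Sorting by duration puts the resources worth optimising first, which is usually a handful of package installs or downloads.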

Profiling Cucumber runs

http://itshouldbeuseful.wordpress.com/2010/11/10/find-your-slowest-running-cucumber-features/

Delta Provisioning

• Faster local provisioning
• Different class of problems found
• Closer to production

‘from scratch’ provision: provision VM, then on success export the VM box to a fileshare

‘delta’ provision: import the VM box from the fileshare, then provision VM

Broken builds

master

Branch builds

BUILDENG-5670

BUILDENG-5669

master

Infrequent Releases

Painful Puppet Rollouts

• Puppet runs impacted running builds
• Disabling all the build agents
• Manually performing the roll out
• git clone / librarian-puppet / symlink update on puppetmaster
• Kick off puppet on all the build agents
• Enabling all the build agents
• Set of Puppet environments for every Bamboo server

Graceful Service restarts

Bamboo Agent JVM process watches for a touch file & shuts down when idle (written as a Bamboo Plugin)
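The graceful restart works by having the agent watch for a touch file and shut down only once it is idle, so a running build is never interrupted. The sketch below illustrates just that decision in Python. The real implementation is a Bamboo plugin running inside the agent JVM; the file name, the function name and the assumption that something external (for example a Puppet run) creates the file are all hypothetical.

```python
import os
import tempfile


def should_shut_down(touch_file: str, agent_is_idle: bool) -> bool:
    """Shut down only when a restart was requested AND no build is running."""
    return os.path.exists(touch_file) and agent_is_idle


# A temporary directory stands in for the agent host.
with tempfile.TemporaryDirectory() as tmp:
    flag = os.path.join(tmp, "restart-requested")  # hypothetical file name

    before = should_shut_down(flag, agent_is_idle=True)   # nothing requested yet
    open(flag, "w").close()                               # restart is requested
    busy = should_shut_down(flag, agent_is_idle=False)    # build still running
    idle = should_shut_down(flag, agent_is_idle=True)     # safe to stop now

print(before, busy, idle)  # False False True
```

Separating the request (the file) from the action (the shutdown) is what lets configuration changes roll out without disabling every agent first.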

Puppet environments reduced

Before: server1_staging, server1_production, server2_staging, server2_production, server3_staging, server3_production, etc

After: staging, production

Bamboo Deployments

[Diagram: How environments work. A release (for example 1.3) is deployed to an environment such as Production by running that environment's task list (Task 1, Task 2) on available agents against a destination server.]

[Diagram: How artifacts work. Build results (artifacts) from a build plan become versioned releases (1.0, 1.3; n, n+1, n+2) in a deployment project and are deployed to environments such as Development and Production. Labels: JIRA issue, Commit, Code, Trigger, Test & Build, Repository, Build artifacts, Release, Release notes.]

staging

production

build
• git clone
• librarian-puppet

test
• ‘delta’ & ‘from scratch’ vagrant provisions

deploy
• to specific environments
• scp to puppet master & symlink update

build & test AMIs
• Generated using Packer

deploy AMIs
• AMIs on Bamboo Servers updated

Puppet Build, Test & Deploy Pipeline

Terraform Pipeline

Plan & Apply changes of staging & production environments

terraform.io

‘open prs’ Bot

Less human effort through automation = Increased frequency & reliability of releases

Snowflakes, Pets, Cattle, Stateless Machines

Infrastructure consistency is key

Challenges

Lots of packages: large number of constantly updating package dependencies

External dependencies: introduce instability

Bamboo Servers

At scale is hard

12 Bamboo Servers

3500 Build Plans

14k Plan Branches

Bamboo is great, but hard to manage at scale

Build Configuration as code

Bamboo Plugin: Plan Templates

Checked into SCM: changes can be code reviewed

Reusable snippets

Export existing plans: export plans for backup, or move to another Bamboo instance easily

Bulk changes: update 100s of job requirements with a single commit

Pushing Bamboo to its limits

Bamboo Plugin: Agent Smith Wallboard

Trend data sent to Graphite

https://marketplace.atlassian.com/plugins/com.atlassian.bamboo.plugin.agent-smith-wallboard

Add metrics, then alert on them

Bamboo Plugin: Bamboo Monitoring Plugin

Metrics to Graphite

Bamboo Health: ActiveMQ, database connections, Tomcat, JVM memory usage, background thread workers, number of plans / plan branches, plans / plan branches for deletion.
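The monitoring plugin sends Bamboo health metrics to Graphite. As a rough illustration of what that involves, the sketch below formats values in Graphite's plaintext protocol (`<path> <value> <timestamp>` per line, conventionally on TCP port 2003). The metric paths and function names are hypothetical, and the real plugin runs inside Bamboo rather than as a Python script.

```python
import socket
import time


def graphite_line(path: str, value: float, timestamp: int) -> str:
    """Format one metric in Graphite's plaintext protocol."""
    return f"{path} {value} {timestamp}\n"


def send_metrics(lines, host: str, port: int = 2003, timeout: float = 5.0) -> None:
    """Send already formatted lines to a Graphite (carbon) listener over TCP."""
    payload = "".join(lines).encode("ascii")
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(payload)


now = int(time.time())
# Hypothetical metric paths; the counts echo figures quoted in the talk.
lines = [
    graphite_line("bamboo.all.plans.total", 3500, now),
    graphite_line("bamboo.all.plan_branches.total", 14000, now),
]
print("".join(lines), end="")
# send_metrics(lines, host="graphite.example.com")  # needs a reachable server
```

Once the values are in Graphite as time series, alert thresholds can be attached to them, which is the "add metrics, then alert on them" step.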

When a Bamboo Server starts misbehaving…

Infrastructure differences? Is it Bamboo Configuration?

Is it a Bamboo Plugin? Is it Bamboo the product?

How is it being used?

Infrastructure consistency of Bamboo Servers is key

Bamboo Puppet provider + REST API for Administration

REST calls

https://forge.puppetlabs.com/atlassian/bamboo_rest

Hipchat Notification

Managed via Puppet

1-click upgrades of Bamboo Plugins

‘Continuous Plugin Deployment’ Task

All Bamboo Servers

build

Deploy

build & test AMIs

Build

https://marketplace.atlassian.com/plugins/com.atlassian.bamboo.plugins.deploy.continuous-plugin-deployment

1-click upgrades of Bamboo Servers

Using scp / ssh & puppet

Upgrade Bamboo

Build Bamboo

jira-bamboo

servicedesk-bamboo

Infrastructure differences? Is it Bamboo Configuration?

Is it a Bamboo Plugin? Is it Bamboo the product?

How is it being used?

Conclusion

Constant improvement

We’ve matured to handle the growth of Atlassian

Come join us!

Thank you!
