Real-World Rocket Science with Chef and Ostrato
Jul 16, 2015
Computing Resources Have Been Increasing Exponentially
[Chart: millions of servers (axis 20 to 120) by year, 1980 to 2015, split into Physical Hardware and Virtual Nodes. Eras marked: 1980 Mainframe, 1990 Client/Server, 2000 Datacenter, 2010+ Web-Scale.]
Increased Size Leads to Operational Complexity
[Diagram: web servers, application servers, database]
Add 1 server: 20+ changes, 12+ new dependencies
How does Chef work?
• Ensures desired state by continually testing and
repairing individual resources in the system
• You compose policies using a series of simple
declarations
• The Chef client fetches those policies from a
central server and applies them to the local
machine
• The state of the machine is recorded and sent
back to a database, where it is indexed for
search, reporting, and audit.
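The test-and-repair idea in the bullets above can be sketched in a few lines of Python. This is an illustration only, not how the Chef client is implemented: the FileResource class, the converge function, and the in-memory machine dictionary are all hypothetical names invented for the example.

```python
from dataclasses import dataclass


@dataclass
class FileResource:
    """Hypothetical resource: a file that should hold specific content."""
    path: str
    content: str

    def is_converged(self, machine: dict) -> bool:
        # Test: does the current state already match the desired state?
        return machine.get(self.path) == self.content

    def repair(self, machine: dict) -> None:
        # Repair: bring the machine to the desired state.
        machine[self.path] = self.content


def converge(resources, machine):
    """Test each resource and repair only the ones that have drifted.

    Returns the paths that were repaired, standing in for the run
    report that would be sent back to the server.
    """
    repaired = []
    for resource in resources:
        if not resource.is_converged(machine):
            resource.repair(machine)
            repaired.append(resource.path)
    return repaired


# Policy composed as a series of simple declarations (assumed values).
policy = [
    FileResource("/etc/ntp.conf", "server pool.ntp.org\n"),
    FileResource("/etc/motd", "Managed by policy\n"),
]
machine = {"/etc/motd": "hand-edited\n"}  # in-memory stand-in for a node

first_run = converge(policy, machine)   # repairs both files
second_run = converge(policy, machine)  # nothing left to repair
```

Running the same policy twice shows the property that matters: the second run changes nothing because every resource already tests as converged.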
Policy is Stored on a Central Server
Node
Chef Server
"recipe[ntp::client]"
"recipe[users]"
"role[webserver]"
Chef Client Pulls New Policy and Applies It
Chef Server
"recipe[ntp::client]"
"recipe[users]"
"role[webserver]"
The Chef Software Platform
Chef Development Kit: cookbook and policy authoring, test-driven infrastructure
Chef Server: management console, analytics platform, high availability and replication
Chef Client: nodes in the data center and the cloud
Private Cloud
Business Group A
Business Group B
IT
Top Challenges:
- Governance
- Orchestration
- Controlling Costs
- Continuous Delivery
Market Problem
Our Design Philosophy
• Build a powerful cloud service management platform:
• Seamless operations across public & private clouds
• Simple-to-use
• Open Source
• Deliver immediate business value
– Strong, global policies
– Rich product features
– Role-based Access Controls (RBAC)
• Great user experience
• User-specific marketplaces (multi-tenant)
• Same intuitive actions and workflows, regardless of CSP
Self-Service Portals
Governance Engine
Cloud Operations
Our Solution
Ostrato cloudSM
API Abstraction Layer
Self-Service Portals
Governance Engine
Cloud Operations
Our Differentiation
Ostrato cloudSM
API Abstraction Layer
What is Ostrato cloudSM?
GET /parking_calendars
200 OK
[
  {
    "name": "Schedule A",
    "id": <id>,
    "calendar_url": <url>,
    "times": {
With GUI
With API
CONTROL
One Pane to Govern Cloud Services
Automation & Governance in DevOps
• Organizations struggle to combine dev & QA
processes with IT operations (a.k.a. “DevOps”)
• Business problem: Move application changes to
production faster, without sacrificing:
• Quality
• Governance
• Reporting & Visibility
• Cost Controls
• Security
Customer
Expectation
Continuous
Delivery
Key: Policy-Driven Automation
Ostrato
Self-Service Marketplaces
• No Scripting
• Template-driven
Strong Governance
• RBAC-driven Global Policies
• Workflow Approval
CSP-independent Provisioning
• Fast & Repeatable
• API-Driven
Chef
Configuration Management:
• Powerful
• Scalable
Who is Andrew?
• Chief Cloud Officer for OpenWhere
• 20+ years serving Fortune 500, Public Sector, and high
growth new ventures across multiple sectors including
telecommunications, media and entertainment,
remote sensing, defense, and intelligence.
• Held Senior Management Consulting Roles at Ernst &
Young's Center for Technology Enablement and
leadership at various start-up companies.
• Aerospace Engineer
24x7 video streaming from the International Space Station:
Ten months from whiteboard to Initial Operations
[Diagram: Ground Terminals 1, 2, and n; Customer Support Center; Mission Operation Center; Network Operations Center; Cloud Data Center 1; Cloud Data Center 2]
Typical ground system has a high degree of
operational complexity
• 100s to 1,000s of servers
• 4 types of database clusters
• 2 HPC clusters
• 17 VLANs
• 7 internal firewalls
• Hardened Windows and Linux images
• 9 major COTS packages
• 15 custom applications in five languages
• 3 NAS devices with petabyte-level storage
• Multiple locations, including public and private clouds
Systems Engineering and Program Management require
multiple environments throughout the mission...
• Need multiple, concurrent environments: Large systems require multiple copies of the environment to support concurrent activities. The different environments can include functional testing, pre-integration (component-to-component testing), system integration, training, performance, user acceptance, and production simulation.
• Support out-of-cycle, ad hoc testing needs: Emergency production fixes, critical security patches, and other mission events can trigger activities that require on-demand test environments.
• Mimic the production environment: Test environments should be as close to the production configuration as possible in order to validate nonfunctional requirements such as high availability, recovery, and performance.
AWS, Chef, and Ostrato made it possible to
accelerate the development life cycle.
Traditional Approach
• Too expensive to maintain multiple environments
• Over 50% difference in configurations between environments
• Resources are focused on production, alternative environments are secondary
OpenWhere Approach
1. Use AWS for low cost, on-demand infrastructure
2. Use Chef to capture environment configurations as software
3. Use Ostrato to provide self-service and cost management
Our approach required all three capabilities to meet
the requirements while reducing costs & schedule
Infrastructure Configuration, Infrastructure Provisioning, Orchestration & Governance
The three components working together
[Diagram: Version Control, Continuous Integration, Chef Server, Ostrato, Virtual Private Cloud]
1. Create initial system from marketplace using Heat/CloudFormation templates
2. Virtual servers include Chef client, which checks in
3. Developer checks in recipe
4. Continuous integration and unit tests
5. Deploy to Chef Server
Use Purpose-Built Environments
Fixed, Static Environments
– Average 50% variance between
production & lower
environments
– Supporting environments have
variable demand, low overall
utilization, & minimal support
Dynamic, on-demand environments
– Built and scaled for specific
purposes
– Chef ensures no infrastructure
variance between environments
– Ostrato was used to orchestrate
the lifecycle of the environments
Fixed Environments
Development, Integration, Test/QA, Production, Staging, Demonstration, Training, etc.
Dynamic Environments
Development for Sprint 11 (3 weeks)
User Test for User Story US 217 (14 hours)
Regression Test for Defect 42 (1 hour)
Performance Test for release 2.3.1 (4 hours)
Training for release 2.3.2 (8 hours/day)
Production for release 2.3.0 (1 month)
Create Programmatic Bill of Materials
The entire system is captured as software code. This allows the
infrastructure to be version controlled and replicated like any other
software asset.
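As a rough illustration of what "captured as software code" buys you, the sketch below describes a system as plain data and derives a stable fingerprint from it, so two copies can be compared the way two commits are. The field names, the fingerprint helper, and the system name are assumptions made for this example; the component counts borrow from the deck's own description of a requested system, and nothing here is the format Chef or Ostrato actually uses.

```python
import hashlib
import json


def fingerprint(bom: dict) -> str:
    """Return a stable SHA-256 digest of a bill of materials.

    Keys are sorted so that two descriptions with the same content
    always produce the same digest, regardless of key order.
    """
    canonical = json.dumps(bom, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


# Hypothetical bill of materials for one full system.
system_bom = {
    "name": "ground-system",  # assumed name, for illustration only
    "servers": 12,
    "subnets": 2,
    "components": ["database", "NAT", "load balancer"],
    "run_list": ["recipe[ntp::client]", "recipe[users]", "role[webserver]"],
}

# A replica built from the same description has the same fingerprint.
replica = json.loads(json.dumps(system_bom))

# Any drift, such as one extra server, changes the fingerprint.
changed = dict(system_bom, servers=13)

same = fingerprint(replica) == fingerprint(system_bom)
drifted = fingerprint(changed) != fingerprint(system_bom)
```

Because the description is ordinary text, it can live in version control next to the recipes, and a differing fingerprint is a cheap signal that two environments no longer match.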
Provide Self Service for the entire System
(not just servers)
A single server is not a viable unit of work
“In today’s distributed compute environment, developers can’t develop on local workstations.”
Teams want self-service access to full systems, not servers
“I want an entire system, not 12 servers, 2 subnets, database, NAT, load balancer, etc.”
Teams aren’t good about cleanup, so they need guard rails (governance)
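One simple form of guard rail is a lifetime attached to every environment at creation, so cleanup does not depend on anyone remembering. The sketch below is a hypothetical illustration, not Ostrato's governance engine: the Environment class and the to_tear_down function are invented names, the start time is arbitrary, and the purposes and durations are taken from the dynamic environment examples earlier in the deck.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class Environment:
    """Hypothetical record of one purpose-built environment."""
    purpose: str
    created: datetime
    lifetime: timedelta

    def expired(self, now: datetime) -> bool:
        # An environment is overdue once its agreed lifetime has passed.
        return now >= self.created + self.lifetime


def to_tear_down(environments, now):
    """Return the purposes of all environments past their lifetime."""
    return [env.purpose for env in environments if env.expired(now)]


start = datetime(2015, 7, 16, 9, 0)  # arbitrary start time for the example
environments = [
    Environment("Development for Sprint 11", start, timedelta(weeks=3)),
    Environment("User Test for User Story US 217", start, timedelta(hours=14)),
    Environment("Regression Test for Defect 42", start, timedelta(hours=1)),
]

after_two_hours = to_tear_down(environments, start + timedelta(hours=2))
after_one_day = to_tear_down(environments, start + timedelta(days=1))
```

A scheduled job could run a check like this and hand the overdue list to whatever tool destroys the environments, which keeps short-lived test systems from quietly accumulating cost.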
Summary
• First program where infrastructure wasn’t a bottleneck
• Create parallel environments
– De-conflicts development activities
– Reduces schedule pressure
– Increases agility
• Need all three capabilities to be successful
– AWS: Cloud Infrastructure
– Chef: Infrastructure Configuration Management
– Ostrato: Orchestration and governance
Q & A
Contact Information
Dale Wickizer, CTO, Ostrato, @dalewickizer
Nicolas Rycar, Automation Engineer, Chef, @rycar
Andrew Heifetz, Chief Cloud Officer, OpenWhere, @andyheifetz