with Docker on Amazon ECS
Hemanth Jayaraman, Rent-A-Center, Director, DevOps
Aater Suleman, Flux7 Labs Inc., CEO & Co-Founder
December 1, 2016
Today’s Presenter
Hemanth Jayaraman
Sr. Director, DevOps
Rent-A-Center owns 3,000 rent-to-own
retail stores for name-brand furniture,
electronics, appliances, and computers
across the U.S.
http://www.rentacenter.com
Today’s Presenter
Aater Suleman
Co-Founder & CEO Flux7
Faculty, UT Austin
Cloud and DevOps Solutions
Headquartered in Austin, Texas
Team Members
Troy Washburn
James Lucas
Xiaolin Liu
Junhong Liu
Tyson Malik
Samprita Hedge
Ashay Chitnis
Nitin Ayyagari
Juan Mesa
Artem Kobrin
Ali Hussain
Outline
Evolution of DevOps at RAC
The e-commerce platform
○Business case
○Architecture
○Challenges and Lessons Learned
The outcomes
DevOps Timeline
Timeline: 2015 (Q1, Q4) to 2016 (Q1, Q4). Milestones, in slide order:
DevOps Organization at RAC
VAN Project on AWS
Infrastructure as Code/ELK Stack
eCommerce project launch
eCommerce Go-Live
Serverless Computing
Oracle RDS Migration
Business Case for VAN Project
• Secure B2B portal for our Acceptance Now business unit
which enables our partners to help grow their business
by increasing sales and expanding their customer base.
• PII data and PCI compliance requirements
First Success
Security: No last-minute surprises before go-live; least privilege; RDS patching; centralized logging; threat protection; encryption at rest and in motion.
Availability: HA with multi-AZ solution; Auto-Scaling
Innovation: Infrastructure as Code; agility and flexibility; Ansible playbooks as build docs
Evolution: E-commerce Platform
Digital transformation: Give our customers the ability to rent online
Unified view of customer
Self-service account management
SAP Hybris selected as the eCommerce platform
Goals
Performance: Set up an SAP Hybris e-commerce platform to scale to 2 million users a month
Scalability: Ability to support Black Friday traffic
Security: Secure for PCI compliance
High Availability: Stateless infrastructure, HA across all components including DR
Agility: Create an agile developer workflow for rapid execution
CI/CD: No-downtime deployment
Outline
Evolution of DevOps at RAC
The e-commerce platform
○Architecture
○Challenges and Lessons Learned
The outcomes
Process
Phase 1: Assess
Understand the requirements and the current state, architect the desired state, and create a punch list
Phase 2: Attune
Run the 2-week sprints
Phase 3: Knowledge Transfer
Transfer the knowledge at the end of each sprint
High-Level Diagram
[Architecture diagram. Components shown: Route53, CloudFront, WAF, ACM Cert Manager, S3 bucket (static assets), Public subnet, Private subnet, NAT Gateway, ECS, Storefront, Admin, ECR, Aurora, Lambda, CodeCommit, CloudWatch, CloudFormation, CloudTrail, KMS, SES, Direct Connect.]
Each subnet represents a pair in two AZs.
All components configured to span two AZs.
Details of ECS Clusters
Storefront
Admin
Admin
Code Deployment
[Diagram labels: Dev, SCM, Code + Dockerfile, Build, Image, ECR, Update ECS, Deploy, ECS Nodes, CF. The flow spans On-premise and AWS.]
Infrastructure Provisioning
[Diagram labels: DevOps, SCM, Jenkins, Trigger, CloudFormation Templates, Create/Update Stack, EC2, ECS, Lambda, Other AWS Services.]
Deploying Aurora DB with Hybris
Why?
Performance
Scaling
Low management overhead
What?
Use of AWS Aurora DB instead of Oracle or MySQL
How?
Hybris supports MySQL; Aurora worked out of the box
Using AWS WAF (OWASP Top 10)
Why? How?
PCI-ready AWS WAF used to filter traffic per rules
- CloudFront logs written to S3
- S3 triggered Lambda
- Offending IPs were blocked
[Diagram labels: To S3 and ELB; Trigger Lambda; Configure rules.]
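The log-processing step above can be sketched in Python as follows. This is a minimal illustration, not the presenters' Lambda: the function name `find_offending_ips`, the `max_errors` threshold, and the choice to flag IPs by their count of 4xx responses are all assumptions made for the example. It relies only on the CloudFront standard log layout (tab-separated, `#` header lines, client IP in the fifth field, status code in the ninth). Pushing the resulting IPs into a WAF rule is left out.

```python
from collections import Counter

# Field positions in a CloudFront standard (access) log line.
C_IP = 4        # c-ip: client IP address
SC_STATUS = 8   # sc-status: HTTP status code returned to the client


def find_offending_ips(log_text, max_errors=3):
    """Return the sorted IPs with more than max_errors 4xx responses.

    Both the threshold and the 4xx criterion are hypothetical; the
    slides do not say how an offending IP was identified.
    """
    errors = Counter()
    for line in log_text.splitlines():
        # Skip blank lines and the "#Version" / "#Fields" header lines.
        if not line or line.startswith("#"):
            continue
        fields = line.split("\t")
        if len(fields) <= SC_STATUS:
            continue  # malformed or truncated line
        if fields[SC_STATUS].startswith("4"):
            errors[fields[C_IP]] += 1
    return sorted(ip for ip, count in errors.items() if count > max_errors)
```

In a Lambda triggered by S3, the handler would read the new log object, pass its decompressed text to a function like this one, and then update the blocking rule with the returned addresses.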
ECS Auto-scaling
Why?
Servicing seasonal traffic patterns at high performance and low cost
How?
ECS auto-scaling to scale individual services
Lambda function to auto-scale underlying ECS nodes:
- Read stats from ECS
- Decide when to scale up/down
- Trigger the operation
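The "decide when to scale up/down" step can be sketched as a small pure function. This is a hedged illustration under stated assumptions: the node count is sized to the largest per-service desired container count plus a buffer, and the function returns the new desired capacity when a change is warranted and within bounds. The function name, the `extra_capacity` default, and the `None` return convention are hypothetical and not taken from the slides.

```python
def decide_instance_count(desired_containers, desired_cnt, min_cnt, max_cnt,
                          extra_capacity=1):
    """Return the new ASG desired capacity, or None if nothing should change.

    desired_containers: desired container count of each running ECS service
    desired_cnt: current desired number of instances in the ASG
    min_cnt / max_cnt: bounds on the number of instances in the ASG
    extra_capacity: spare instances kept on top (assumed default of 1)
    """
    if not desired_containers:
        return None  # no running services, nothing to size against
    # Size the cluster to the most widely replicated service plus headroom.
    instance_cnt = max(desired_containers) + extra_capacity
    # Only act when the target differs from the current value and is in bounds.
    if instance_cnt != desired_cnt and min_cnt <= instance_cnt <= max_cnt:
        return instance_cnt
    return None
```

For example, `decide_instance_count([2, 4, 3], desired_cnt=4, min_cnt=2, max_cnt=10)` returns 5. A scheduled Lambda would feed it the counts read from ECS and the ASG, then apply any non-None result to the Auto Scaling group.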
ECS Autoscaling (Cont’d)
Trigger Lambda every 5 mins
Read current state of ECS and ASG
Let 0 … n be the running ECS services
Let dc_k be the desired number of containers of service k
Let desiredCnt be the current desired number of instances in ASG
Let minCnt be the minimum number of instances needed in ASG
Let maxCnt be the maximum number of instances allowed in ASG
max ← MAX(dc_0, …, dc_n)
instanceCnt ← max + extraCapacity
If instanceCnt ≠ desiredCnt AND instanceCnt <= maxCnt AND instanceCnt >= minCnt: