Page 1
© 2016, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
December 2, 2016
Automated Governance
of Your AWS Resources with Real-Life Examples
Armando Leite
Global Cloud Security Architect
Prashant Prahlad
Principal Product Manager
DEV302
Page 2
What to Expect from the Session
1. Read all pages: Automating Governance on AWS
https://d0.awsstatic.com/whitepapers/compliance/Automating_Governance_on_AWS.pdf
2. Read all pages: Security Perspective of the AWS Cloud Adoption Framework
https://d0.awsstatic.com/whitepapers/AWS_CAF_Security_Perspective.pdf
3. Read all pages: Security at Scale on AWS
https://d0.awsstatic.com/whitepapers/compliance/AWS_Security_at_Scale_Governance_in_AWS_Whitepaper.pdf
Page 3
What to Expect from the Session
• Implementing automated governance
1. Control: Prevent bad actions
2. Monitor: Make bad configurations visible
   Launched! AWS Config for EC2 Systems Manager (software within EC2 instances)
3. Fix: Force timely fixes directly
   Launched! AWS CloudTrail for S3 Data Events (Amazon S3 object-level APIs)
• Automate governance: Making it real
• Your take-home toolkit
Page 4
Implementing governance:
Where’s the problem?
Page 5
DevOps: Dev==Security?, Ops==Security? Or?
Page 6
Some common problems
1. Someone else does “security stuff”
2. Policies and controls in legalese
   Section 14.2 Security in development and support processes
   Rules governing secure software/systems development should be defined as policy. Changes to systems (both applications and operating systems) should be controlled. Software packages should ideally not be modified, and secure system engineering principles should be followed. The development environment should be secured, and outsourced development should be controlled. System security should be tested and acceptance criteria defined to include security aspects.
3. Cloud as an extension of virtualization or physical DC
   Manual processes, lax configurations, no awareness of the AWS Shared Responsibility Model
4. Not tapping into community
   Not benefiting from practices from your peers, not providing your best practices to others
Page 7
Example 1: Driving too fast
New team delivering a project using a new AWS account
• Everyone needs admin privileges
• Credentials hardcoded in code to get the job done
• CloudTrail logs? That’s for “audit people”
• Open ports: RDP, Telnet, SSH, MySQL
Project launched on time! Mission accomplished
Page 8
Example 2: I <3 experiments, dude!
• My experiments need powerful instances
  Workload characteristics? IO? CPU? I’m really just experimenting
• Idle instances and stale resources
  Pay-per-use means you know what you are using
• Billing is a finance thing
  My usage is so tiny, it doesn’t matter in the grand scheme of things
Page 9
Governance in 3 phases
Control Monitor Fix
Page 10
Phase 1: Control
Prevent actions that could be bad
• AWS CloudFormation
• Service Catalog
• AWS IAM policies
• Disable root credentials
• Check on GitHub for access keys available publicly
Page 11
What is AWS CloudFormation?
• AWS CloudFormation allows you to model,
provision, and update the full breadth of AWS
resources.
• Manage anything from a single Amazon EC2
instance to a multi-tier application.
• Integrates with other development and
management tools.
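As a rough illustration of "model and provision" used as a preventive control, the sketch below builds a small CloudFormation template in Python and serializes it to JSON. The function name, the description text, and the choice of constraining `InstanceType` through `AllowedValues` are assumptions made for this example, not something shown in the session.

```python
import json


def build_instance_template(allowed_types):
    """Return a CloudFormation template (as a dict) for one EC2 instance.

    The InstanceType parameter only accepts the values in allowed_types,
    so a stack cannot be launched with any other instance type.
    """
    return {
        "AWSTemplateFormatVersion": "2010-09-09",
        "Description": "Single EC2 instance with a constrained instance type (illustrative)",
        "Parameters": {
            "InstanceType": {
                "Type": "String",
                "Default": allowed_types[0],
                "AllowedValues": list(allowed_types),
            },
            # The AMI is supplied by whoever launches the stack.
            "ImageId": {"Type": "AWS::EC2::Image::Id"},
        },
        "Resources": {
            "Instance": {
                "Type": "AWS::EC2::Instance",
                "Properties": {
                    "InstanceType": {"Ref": "InstanceType"},
                    "ImageId": {"Ref": "ImageId"},
                },
            }
        },
    }


if __name__ == "__main__":
    print(json.dumps(build_instance_template(["t2.micro", "t2.small"]), indent=2))
```

The printed JSON could be saved as a template file and launched as a stack; a request for any instance type outside the list would be rejected at parameter validation.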
Page 12
AWS Service Catalog: Catalog Creation and Stack Provisioning Workflow
[Workflow diagram with numbered steps 1 to 9; its labels are listed below by role]
• Administrator: authors AWS CloudFormation template, including parameters; creates product; creates portfolio; assigns product; adds constraints and grants access
• Portfolio: ProductX, ProductY, ProductZ
• Users: browse products; populate parameters; launch products
• On launch: deploys stacks; notifications
• Scheduled Lambda functions for automated actions
Page 13
IAM policy to restrict instance types
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "ec2:DescribeInstances", "ec2:DescribeImages",
        "ec2:DescribeKeyPairs", "ec2:DescribeVpcs", "ec2:DescribeSubnets",
        "ec2:DescribeSecurityGroups"
      ],
      "Resource": "*"
    },
    {
      "Effect": "Allow",
      "Action": "ec2:*",
      "Resource": "arn:aws:ec2:us-east-1:232378813418:instance/*",
      "Condition": {
        "StringLike": {
          "ec2:InstanceType": ["t2.*", "m4.*"]
        }
      }
    }
  ]
}
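The behavior of the `StringLike` condition can be approximated locally. The sketch below uses Python's `fnmatchcase` as a stand-in for IAM wildcard matching; this is an assumption for illustration only (IAM supports `*` and `?`, while `fnmatch` additionally treats `[...]` specially), and the function name is hypothetical. It shows why a family prefix needs a wildcard such as `t2.*`: a pattern without one matches only that exact string.

```python
from fnmatch import fnmatchcase


def instance_type_allowed(instance_type, patterns):
    """Approximate an IAM StringLike check on ec2:InstanceType.

    Returns True if the instance type matches at least one pattern.
    """
    return any(fnmatchcase(instance_type, pattern) for pattern in patterns)


if __name__ == "__main__":
    for patterns in (["t2", "m4"], ["t2.*", "m4.*"]):
        for candidate in ("t2.micro", "m4.large", "c4.8xlarge"):
            print(patterns, candidate, instance_type_allowed(candidate, patterns))
```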
Page 14
AWS IAM user access keys: Keep them safe
• Do not generate access keys for the root account
• Use IAM roles
• Attend or view SAC317 – IAM Best Practices to Live By
Code to prevent you from committing secrets and
credentials into Git repositories
https://github.com/awslabs/git-secrets
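To give a feel for what such a check does, here is a minimal sketch that scans text for strings shaped like IAM user access key IDs. It is not the git-secrets implementation; the regular expression (20 characters beginning with `AKIA`) and the function name are assumptions for illustration, and a real scanner covers more patterns, including secret keys.

```python
import re

# Assumed shape of an IAM user access key ID: "AKIA" plus 16 uppercase
# letters or digits. Real tools use broader pattern sets.
ACCESS_KEY_ID = re.compile(r"\bAKIA[A-Z0-9]{16}\b")


def find_access_key_ids(text):
    """Return (line_number, matched_string) pairs for suspected key IDs."""
    hits = []
    for number, line in enumerate(text.splitlines(), start=1):
        for match in ACCESS_KEY_ID.finditer(line):
            hits.append((number, match.group(0)))
    return hits


if __name__ == "__main__":
    # AKIAIOSFODNN7EXAMPLE is the placeholder key ID used in AWS documentation.
    sample = 'region = "us-east-1"\naws_access_key_id = "AKIAIOSFODNN7EXAMPLE"\n'
    print(find_access_key_ids(sample))
```

A check like this could run in a pre-commit hook and reject the commit when the list is not empty.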
Page 15
Example: AWS Config: No IAM permissions
Page 16
CloudTrail: Read-only permissions
Page 17
Governance in 3 phases
Control Monitor Fix
Page 18
Phase 2: Monitor
Get all metadata, apply lifecycle policies to control costs
• CloudTrail
• AWS Config
• Amazon CloudWatch Logs
• VPC Flow Logs
Page 19
What is AWS CloudTrail?
AWS CloudTrail is a fully
managed service that
records API calls made on
your AWS account.
CloudTrail helps you gain
visibility into API activity,
and enables you to
troubleshoot operational
issues, conduct security
analysis, and meet internal
or external compliance
requirements.
Customers are making API calls...
On a growing set of services around the world…
CloudTrail is continuously recording API
calls…
And delivering log files to customers
Page 20
CloudTrail: Recent Delivery
Service Coverage
• Most AWS services are integrated with CloudTrail
• Includes most new services launching at re:Invent 2016
Features
• S3 Data Events: Get timely events for object-level API activity for action and audit
• Event selectors to filter or add event types to a trail
• User identity included in AssumeRole calls, so you can trace the IAM user, even in role-based APIs
• Turn on a trail in all existing and future AWS regions
• Support for 5 trails (previously 1) per region
• Encrypt CloudTrail log files using your AWS KMS key
• Log File Integrity Validation
• PCI, ISO 27001/9001, ISO 27017, 27018, SOC 1, 2, 3
Page 21
AWS Config & Config Rules
[Diagram: changing resources are recorded and normalized by AWS Config, evaluated by Config Rules, and surfaced as history, snapshots, notifications, and API access]
Page 22
AWS Config: Inventory and compliance
Page 23
AWS Config Rules: Evaluate resource Config
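A custom Config rule is typically a Lambda function that receives a configuration item and reports a compliance verdict. The sketch below checks EC2 instance types against an allow list. The rule parameter name `allowedTypes` and the function names are assumptions for this example; the Config client is passed in as an argument so the logic can be exercised without AWS access (in Lambda it would be `boto3.client("config")`).

```python
import json


def evaluate_instance_type(configuration_item, allowed_types):
    """Return a Config compliance type for one configuration item."""
    if configuration_item.get("resourceType") != "AWS::EC2::Instance":
        return "NOT_APPLICABLE"
    instance_type = configuration_item.get("configuration", {}).get("instanceType")
    return "COMPLIANT" if instance_type in allowed_types else "NON_COMPLIANT"


def handle_config_event(event, config_client):
    """Evaluate the item carried by a Config rule invocation and report it."""
    item = json.loads(event["invokingEvent"])["configurationItem"]
    # "allowedTypes" is a hypothetical rule parameter: a comma-separated list.
    params = json.loads(event.get("ruleParameters") or "{}")
    allowed = [t.strip() for t in params.get("allowedTypes", "").split(",") if t.strip()]
    compliance = evaluate_instance_type(item, allowed)
    config_client.put_evaluations(
        Evaluations=[{
            "ComplianceResourceType": item["resourceType"],
            "ComplianceResourceId": item["resourceId"],
            "ComplianceType": compliance,
            "OrderingTimestamp": item["configurationItemCaptureTime"],
        }],
        ResultToken=event["resultToken"],
    )
    return compliance
```

Keeping the verdict logic in a pure function separate from the reporting call makes the rule easy to unit test before it is deployed.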
Page 24
AWS Config + Software Inventory
Assess compliance using Config Rules
Amazon EC2 Systems Manager and AWS Config will capture
• Software Inventory in EC2 instance
• Firewall rules
• Patch level
• Application version
Page 25
Inventory Assessment
Page 26
CloudWatch
See React Diagnose Resolve
Page 27
CloudWatch: See
• Use AWS-generated metrics, logs, and events over time to understand the behavior of your system
• Publish custom metrics, logs, and events for your application-specific telemetry
Page 28
CloudWatch: React
• Trigger automatic notifications based on your own rules and metric thresholds
Page 29
CloudWatch: Diagnose
• Inspect, navigate, zoom, and correlate across time to investigate issues
• Jump to your logs directly from your metrics to perform searches or generate additional metrics from log data
Page 30
CloudWatch: Resolve
• Easily and automatically correct issues via common actions that you control
• Define your own custom actions based on AWS Lambda functions for more fine-grained control
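One example of a "common action" is a CloudWatch alarm that stops an idle EC2 instance. The sketch below only builds the parameters for such an alarm; the thresholds, the alarm name format, and the function name are assumptions chosen for illustration. The resulting dict is meant to be passed to the CloudWatch `put_metric_alarm` call, for instance `boto3.client("cloudwatch").put_metric_alarm(**params)`.

```python
def idle_instance_alarm(instance_id, region, cpu_threshold=5.0, hours=4):
    """Build put_metric_alarm parameters that stop an instance when idle.

    The alarm fires when hourly average CPU stays below cpu_threshold
    percent for `hours` consecutive hours, then runs the EC2 stop action.
    """
    return {
        "AlarmName": f"idle-{instance_id}",
        "Namespace": "AWS/EC2",
        "MetricName": "CPUUtilization",
        "Dimensions": [{"Name": "InstanceId", "Value": instance_id}],
        "Statistic": "Average",
        "Period": 3600,  # seconds per evaluation period
        "EvaluationPeriods": hours,
        "Threshold": cpu_threshold,
        "ComparisonOperator": "LessThanThreshold",
        # Built-in EC2 alarm action that stops the instance.
        "AlarmActions": [f"arn:aws:automate:{region}:ec2:stop"],
    }


if __name__ == "__main__":
    print(idle_instance_alarm("i-0123456789abcdef0", "us-east-1"))
```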
Page 31
Monitor the monitor
Existence check and fix
• Lock down updates to foundational services
• CloudTrail read-only managed policy
• Config read-only managed policy
• Explicit Deny actions in policies
Step 1: Control: Use IAM policies that do not allow updates to management APIs for foundational services
CloudTrail: StartLogging, StopLogging, UpdateTrail, CreateTrail, DeleteTrail
Config: DeleteDeliveryChannel, PutConfigurationRecorder, PutDeliveryChannel, StartConfigurationRecorder, StopConfigurationRecorder
VPC: CreateFlowLogs, DeleteFlowLogs
• Use Config Rules or Lambda to ensure these are not turned off (coming next).
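The explicit Deny described in Step 1 could be generated as follows. This is a sketch, not a policy from the session: the statement ID and function name are made up, and the API names listed on the slide are written with the service prefixes IAM expects (`cloudtrail:`, `config:`, and `ec2:` for the VPC flow log calls).

```python
import json

# Management APIs for the foundational services named on the slide.
PROTECTED_ACTIONS = [
    "cloudtrail:StartLogging", "cloudtrail:StopLogging", "cloudtrail:UpdateTrail",
    "cloudtrail:CreateTrail", "cloudtrail:DeleteTrail",
    "config:DeleteDeliveryChannel", "config:PutConfigurationRecorder",
    "config:PutDeliveryChannel", "config:StartConfigurationRecorder",
    "config:StopConfigurationRecorder",
    "ec2:CreateFlowLogs", "ec2:DeleteFlowLogs",
]


def deny_policy(actions=PROTECTED_ACTIONS):
    """Return an IAM policy document that explicitly denies the given actions."""
    return {
        "Version": "2012-10-17",
        "Statement": [{
            "Sid": "DenyChangesToFoundationalServices",  # hypothetical name
            "Effect": "Deny",
            "Action": list(actions),
            "Resource": "*",
        }],
    }


if __name__ == "__main__":
    print(json.dumps(deny_policy(), indent=2))
```

Because an explicit Deny overrides any Allow, attaching this to project teams keeps the logging pipeline intact even where broad permissions exist; the roles that administer logging would need to be excluded.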
Page 33
Governance in 3 phases
Control Monitor Fix
Page 34
Phase 3: Fix
Wide spectrum of options to fix problems
• Create awareness
• Indirect enforcement: Tickets or offline enforcement
• Direct enforcement: Take corrective actions
Page 35
Phase 3: Fix using AWS services
Spectrum: ease of getting started vs. customization and control
• AWS Trusted Advisor
• AWS Config Managed Rules
• AWS Config Custom Rules with remediation
• CloudWatch Events with Lambda rules
• Lambda code with various triggers
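As one illustration of the "CloudWatch Events with Lambda" option, the sketch below turns CloudTrail logging back on when someone stops it. The event pattern follows the format CloudWatch Events uses for API calls recorded by CloudTrail; reading the trail from `requestParameters["name"]` is an assumption about the event shape, and the function name is hypothetical. The CloudTrail client is injected (in Lambda it would be `boto3.client("cloudtrail")`).

```python
# Event pattern for a CloudWatch Events rule that matches StopLogging calls.
STOP_LOGGING_PATTERN = {
    "source": ["aws.cloudtrail"],
    "detail-type": ["AWS API Call via CloudTrail"],
    "detail": {
        "eventSource": ["cloudtrail.amazonaws.com"],
        "eventName": ["StopLogging"],
    },
}


def restart_trail(event, cloudtrail_client):
    """Restart logging on the trail named in a StopLogging event.

    Returns the trail identifier that was restarted, or None if the
    event is not a StopLogging call or carries no trail name.
    """
    detail = event.get("detail", {})
    if detail.get("eventSource") != "cloudtrail.amazonaws.com":
        return None
    if detail.get("eventName") != "StopLogging":
        return None
    # Assumed location of the trail name or ARN in the recorded request.
    trail = (detail.get("requestParameters") or {}).get("name")
    if not trail:
        return None
    cloudtrail_client.start_logging(Name=trail)
    return trail
```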
Page 36
CloudTrail Data Events for S3
Act on API activity immediately in CloudWatch Events
• Data Events for S3
• Trigger rules that “fix” the problem
• Trace invocations and actions in CloudWatch Logs
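A rule that "fixes" a problem found through S3 data events might look like the sketch below: when an object is written or its ACL changes, the function checks whether the object is readable by everyone and, if so, resets the ACL to private. The event field names (`bucketName`, `key`) and the function name are assumptions about the event shape; the S3 client is injected (in Lambda it would be `boto3.client("s3")`).

```python
# Grantee URI that S3 uses for "everyone" in object ACLs.
ALL_USERS_URI = "http://acs.amazonaws.com/groups/global/AllUsers"


def make_private_if_public(event, s3_client):
    """Reset an object's ACL to private if it grants access to all users.

    Returns True when a fix was applied, False otherwise.
    """
    detail = event.get("detail", {})
    if detail.get("eventName") not in ("PutObject", "PutObjectAcl"):
        return False
    params = detail.get("requestParameters") or {}
    bucket, key = params.get("bucketName"), params.get("key")
    if not bucket or not key:
        return False
    acl = s3_client.get_object_acl(Bucket=bucket, Key=key)
    is_public = any(
        grant.get("Grantee", {}).get("URI") == ALL_USERS_URI
        for grant in acl.get("Grants", [])
    )
    if is_public:
        s3_client.put_object_acl(Bucket=bucket, Key=key, ACL="private")
    return is_public
```

Printing or logging the bucket and key inside the function would leave the trace in CloudWatch Logs mentioned in the last bullet.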
Page 38
Governance in 3 phases
Control Monitor Fix
Page 39
Putting it all together
Page 40
Cloud Adoption Framework: The Security Perspective
CAF control types: Directive, Preventive, Detective, Responsive
Governance phases placed against them: Control, Monitor, Fix, and one left open (?)
Page 41
Automating governance with AWS Services
Rules of the road:
1. Think pipelines, not discrete controls.
2. Gather data and use it.
3. Automate from control to monitoring to fix.
4. The SOP is code.
5. All services are ‘security services’.
Page 42
Demo – event flow
Modes: 1 – Standard, 2 – Enhanced, 3 – Active
[Diagram: Auto Scaling groups with a web server and an app server EC2 instance, each in its own security group. Data sources: CloudWatch syslog, VPC Flow Logs, CloudTrail. Event: SSH Login! (CWE Custom). Check: Logon ok? Result: Logon is OK!]
In standard operation, we are observant.
Control:
- Security agent loaded in instance.
- Logons tracked.
Monitoring:
- We gather data covering API activity (CloudTrail), network (VPC Flow Logs), and in-instance activity (syslog).
Fix:
- We are good.
Page 43
Demo – event flow
Modes: 1 – Standard, 2 – Enhanced, 3 – Active
[Diagram: Auto Scaling groups with a web server and an app server EC2 instance, each in its own security group. Data sources: CloudWatch syslog, VPC Flow Logs, CloudTrail. Event: SSH Login! (CWE Custom). Check: Logon ok? Result: Logon NOT ok. Enhance: Subscribe to Syslog (OS data analysis); Enable Instance level VPC Flow Logs; Subscribe to instance VPC Flow Logs (Network data analysis).]
A logon event occurs. We go to Enhanced surveillance mode.
Control:
- Dynamically add Lambda subscriptions to log feeds.
Monitor:
- In-instance activity (privilege escalation).
- Initiation of forbidden flows.
Fix:
- Alert only. Watchful but passive.
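The three "Enhance" steps (subscribe to syslog, enable instance-level VPC Flow Logs, subscribe to those flow logs) could be sketched as below. All names, filter names, and arguments are hypothetical, and this is not the code from the demo. The clients are injected; with boto3 they would be `boto3.client("logs")` and `boto3.client("ec2")`. The sketch assumes both log groups already exist and that the analysis Lambda function already permits invocation by CloudWatch Logs.

```python
def enter_enhanced_mode(eni_id, syslog_group, flow_log_group,
                        analyzer_lambda_arn, flow_logs_role_arn,
                        logs_client, ec2_client):
    """Switch one instance to enhanced surveillance.

    eni_id is the instance's network interface; analyzer_lambda_arn is the
    Lambda function that analyzes the log streams.
    """
    # 1. Subscribe the analysis function to the instance's syslog feed.
    logs_client.put_subscription_filter(
        logGroupName=syslog_group,
        filterName="enhanced-os-analysis",
        filterPattern="",  # empty pattern forwards every log event
        destinationArn=analyzer_lambda_arn,
    )
    # 2. Enable flow logs for just this instance's network interface.
    ec2_client.create_flow_logs(
        ResourceIds=[eni_id],
        ResourceType="NetworkInterface",
        TrafficType="ALL",
        LogGroupName=flow_log_group,
        DeliverLogsPermissionArn=flow_logs_role_arn,
    )
    # 3. Subscribe the analysis function to the new flow log feed.
    logs_client.put_subscription_filter(
        logGroupName=flow_log_group,
        filterName="enhanced-network-analysis",
        filterPattern="",
        destinationArn=analyzer_lambda_arn,
    )
```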
Page 44
Demo – event flow
Modes: 1 – Standard, 2 – Enhanced, 3 – Active
[Diagram: Elastic Load Balancing in front of an Auto Scaling group of three web app server EC2 instances, each in its own security group, plus an app server security group. CloudWatch syslog data feeds OS data analysis, which detects Root Access. Response steps: Isolate, Preserve, Deregister.]
Page 45
Demo – event flow
Modes: 1 – Standard, 2 – Enhanced, 3 – Active
[Diagram: same fleet; one of the three web app server instances is now marked Anomaly. CloudWatch syslog data feeds OS data analysis. Response steps: Isolate, Preserve, Deregister.]
Page 46
Demo – event flow
Modes: 1 – Standard, 2 – Enhanced, 3 – Active
[Diagram: same fleet with the Anomaly instance. Response steps: Isolate (Block all), Preserve, Deregister.]
Page 47
Demo – event flow
Modes: 1 – Standard, 2 – Enhanced, 3 – Active
[Diagram: same fleet with the Anomaly instance. Response steps: Isolate (Block all), Deregister (Dereg ASG/ELB), Preserve.]
Page 48
Demo – event flow
Modes: 1 – Standard, 2 – Enhanced, 3 – Active
[Diagram: same fleet with the Anomaly instance. Response steps: Isolate (Block all), Deregister (Dereg ASG/ELB), Preserve (CloudWatch Logs, Amazon EBS snapshots).]
Page 49
Demo – event flow
Modes: 1 – Standard, 2 – Enhanced, 3 – Active
[Diagram: Elastic Load Balancing in front of an Auto Scaling group of three web app server EC2 instances, each in its own security group, plus an app server security group. The Anomaly instance sits apart in its own security group.]
An escalation occurred and we switched to Active, i.e., intervene and get it fixed.
Control:
- SG to isolate anomalous instance.
- Preserve instance for both live and offline analysis.
- Deregister application from live use.
Monitoring:
- We continue to monitor all activity as per previous steps.
Fix:
- The control actions leave the ASG 1 instance short; it recovers to the original fleet size from ‘last known good’.
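The isolate, deregister, and preserve actions could be sketched as follows. This is an illustrative outline under stated assumptions, not the demo's code: the function name and arguments are hypothetical, the isolation security group is assumed to exist with no inbound or outbound rules, and the clients are injected (with boto3, `boto3.client("ec2")` and `boto3.client("autoscaling")`).

```python
def respond_to_anomaly(instance_id, asg_name, isolation_sg_id,
                       ec2_client, autoscaling_client):
    """Isolate, deregister, and preserve an anomalous instance.

    Returns the IDs of the EBS snapshots taken for offline analysis.
    """
    # Isolate: replace the instance's security groups with the isolation group.
    ec2_client.modify_instance_attribute(InstanceId=instance_id,
                                         Groups=[isolation_sg_id])
    # Deregister: detach from the Auto Scaling group without shrinking it,
    # so the group launches a replacement from its launch configuration.
    autoscaling_client.detach_instances(
        InstanceIds=[instance_id],
        AutoScalingGroupName=asg_name,
        ShouldDecrementDesiredCapacity=False,
    )
    # Preserve: snapshot every EBS volume attached to the instance.
    volumes = ec2_client.describe_volumes(
        Filters=[{"Name": "attachment.instance-id", "Values": [instance_id]}]
    )["Volumes"]
    snapshot_ids = []
    for volume in volumes:
        snapshot = ec2_client.create_snapshot(
            VolumeId=volume["VolumeId"],
            Description=f"Forensic snapshot of {instance_id}",
        )
        snapshot_ids.append(snapshot["SnapshotId"])
    return snapshot_ids
```

The instance itself is left running in its isolated state, which keeps it available for live analysis alongside the snapshots.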
Page 50
Demo – event flow
Modes: 1 – Standard, 2 – Enhanced, 3 – Active
[Diagram: Auto Scaling groups with a web server and an app server EC2 instance, each in its own security group. Data sources: CloudWatch Syslog, Flowlogs, CloudTrail.]
In standard operation, we are observant.
Control:
- Security agent loaded in instance.
- Logons tracked to TT.
Monitoring:
- We gather data covering API activity (CloudTrail), network (VPC Flow Logs), and in-instance activity (syslog).
Fix:
- We are BACK TO good.
Page 51
Making it happen
First 5 use cases
• Root detection
• Disabling of audit trails
• Activity in unused region
• Addition/removal of gateways
• Changes to immutable parameters
Turning plans into action:
1. Define your MSB.
2. Go for MVP.
3. Mature through iteration.
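Four of these use cases can be expressed as simple checks over a CloudTrail event record. The sketch below is one possible starting point; the finding labels, the chosen event names, and the function name are assumptions for illustration, and "changes to immutable parameters" is left out because it depends on what an organization declares immutable.

```python
# Event names chosen for illustration; extend to suit your baseline.
AUDIT_TRAIL_EVENTS = {"StopLogging", "DeleteTrail"}
GATEWAY_EVENTS = {
    "CreateInternetGateway", "AttachInternetGateway",
    "DetachInternetGateway", "DeleteInternetGateway",
}


def findings(record, approved_regions):
    """Return the list of use-case labels that a CloudTrail record triggers."""
    found = []
    source = record.get("eventSource")
    name = record.get("eventName")
    if record.get("userIdentity", {}).get("type") == "Root":
        found.append("root-activity")
    if source == "cloudtrail.amazonaws.com" and name in AUDIT_TRAIL_EVENTS:
        found.append("audit-trail-disabled")
    if record.get("awsRegion") not in approved_regions:
        found.append("unused-region-activity")
    if source == "ec2.amazonaws.com" and name in GATEWAY_EVENTS:
        found.append("gateway-change")
    return found


if __name__ == "__main__":
    record = {
        "userIdentity": {"type": "Root"},
        "eventSource": "cloudtrail.amazonaws.com",
        "eventName": "StopLogging",
        "awsRegion": "sa-east-1",
    }
    print(findings(record, approved_regions={"us-east-1"}))
```

Starting with a handful of checks like these fits the "go for MVP, mature through iteration" advice: each new use case becomes one more condition and one more test.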
Page 52
Your take-home kits
Kit #1
Armando’s demo in code
https://github.com/awslabs/automating-governance-sample
Kit #2
AWS DevOps Blog: Governance series
https://aws.amazon.com/blogs/devops/it-governance-in-a-dynamic-devops-environment/
(Shashi Prabhakar, AWS Solutions Architect)
GitHub Config Rules
https://github.com/awslabs/aws-config-rules
Page 53
Remember to complete
your evaluations!