The challenge…
On-premises infrastructure leads to static, lowest-common-denominator hardware…
…and to long queues for access, low utilization, or both!
What’s at stake?
Scientific Community Requirements
- Computation on demand
  - A flexible, on-demand and cost-effective infrastructure
- Data Management
  - Handling growth of data
  - Long term storage
  - Data transfer between participants
- Data Analysis
  - A robust infrastructure
  - Scalability
  - Flexibility
- Reproducibility of results
  - A programmable infrastructure
AWS Platform
Your Applications
Building Block Services
Foundation Services
- Compute: Amazon EC2, Auto Scaling
- Storage: Amazon S3, Amazon EBS, AWS Storage Gateway
- Database: Amazon RDS, Amazon SimpleDB, Amazon ElastiCache, Amazon DynamoDB
- Networking: Amazon VPC, Elastic Load Balancing, Amazon Route 53, AWS Direct Connect
Application Platform Services
- Content Distribution: Amazon CloudFront
- Messaging: Amazon SNS, Amazon SQS, Amazon SES
- Parallel Processing: Elastic MapReduce
- Libraries & SDKs: Java, PHP, Python, Ruby, .NET
Management & Administration
- Identity & Access: AWS IAM, Identity Federation, Consolidated Billing
- Web Interface: Management Console
- Monitoring: Amazon CloudWatch
- Deployment & Automation: AWS Elastic Beanstalk, AWS CloudFormation
Simple Workflow Service
AWS Global Infrastructure
- Regions
- Availability Zones
- Edge Locations
AWS Pace of Innovation
New Service Announcements & Updates
Including:
AWS Oregon Region
Elastic Beanstalk (Beta)
Amazon SES (Beta)
AWS CloudFormation
Amazon RDS for Oracle
AWS Direct Connect
AWS GovCloud (US)
Amazon ElastiCache
VPC Virtual Networking
VPC Dedicated Instances
SMS Text Notification
CloudFront Live Streaming
AWS Tokyo Region
SAP RDS on EC2
SAP BO on EC2
Win Srv 2008 R2 on EC2
Win Srv 2003 VM Import
Amazon S3 SSE
Releases per year (chart): 2011 = 74, 2010 = 61, 2009 = 48, 2008 = 24
Including:
Amazon SNS
Amazon CloudFront
Amazon Route 53
S3 Bucket Policies
RDS Multi-AZ Support
RDS Reserved Databases
AWS Import/Export
AWS IAM Beta
AWS Singapore Region
Cluster Instances for EC2
Micro Instances for EC2
Amazon Linux AMI
Oracle Apps on EC2
SUSE Linux on EC2
VM Import for EC2
Including:
Amazon RDS
Amazon VPC
Amazon EMR
EC2 Auto Scaling
EC2 Reserved Instances
EC2 Elastic Load Balance
AWS Import/Export
AWS Mngmt Console
Win Srv 2008 on EC2
IBM Apps on EC2
Including:
Amazon SimpleDB
Amazon CloudFront
Amazon EBS
EC2 Availability Zones
EC2 Elastic IP Addresses
Including:
Amazon FPS
Red Hat Enterprise on EC2
2007 = 9
“AWS is extraordinarily innovative, exceptionally agile and very responsive to the market.”
And wait, there’s more!
Dec 2011 – Feb 2012
ElastiCache in 4 more AWS Regions
CloudFront & Route 53 in 3 new edge locations
S3 announces Multi-object delete
SES now supports SMTP
EMR supports Hadoop 0.20.205 and Pig 0.9.1
New AWS Region in Sao Paulo, Brazil
VPC adds multiple network interfaces
EMR support for cc2.8xlarge
S3 announces Object Expiration
SNS adds support for Delivery Policies
SNS adds support for Message Formatting
Direct Connect adds four new locations
AWS Free Usage Tier for Windows
Amazon DynamoDB
AWS IAM Identity Federation
AWS Storage Gateway
AWS Simple Workflow Service
March 2012
New m1.medium (2 ECUs and 3.75 GB RAM)
Reserved pricing lowered: EC2 by 37% and RDS by 42%
Global Infrastructure for Global Enterprises
AWS Regions
- US East (Northern Virginia)
- US West (Northern California)
- US West (Oregon)
- GovCloud (US ITAR Region)
- South America (Sao Paulo)
- EU (Ireland)
- Asia Pacific (Singapore)
- Asia Pacific (Tokyo)
AWS Edge Locations
AWS Regions and Availability Zones
Customer Decides Where Applications and Data Reside
Built to Enterprise & Gov’t Security Requirements
Security & Compliance Resources
• Security & Compliance Center: http://aws.amazon.com/security
• Security Overview & Best Practices
• AWS Risk & Compliance Whitepaper
• Creating HIPAA Compliant Applications
Hardware, Software & Network
• Systematic change management
• Phased updates deployment
• Safe storage decommission
• Automated monitoring and self-audit
• Advanced network protection systems
Certifications and Accreditations
• ISO 27001
• SSAE 16 / ISAE 3402 / SOC1 (formerly U.S. standard SAS-70 Type II)
• FISMA Moderate Controls; ITAR region
• HIPAA applications certified on AWS
• Payment Card Industry (PCI) Data Security Standard (DSS) Level 1
• DIACAP Controls
Physical
• Datacenters in nondescript facilities
• Physical access strictly controlled
• Must pass two-factor authentication at least twice for floor access
• Physical access logged and audited
Amazon Simple Storage Service (S3)
• Distributed, replicated object store
• 99.999999999% durability
• ~1 trillion objects and > 700,000 requests/second
• Store anything…pictures, XML docs, encrypted blobs
• You determine the AWS region, we replicate across AZs
• AWS Import/Export Service
• AWS Storage Gateway
Amazon Elastic Compute Cloud (EC2)
• Virtual machines running Windows or Linux
• Full Windows admin or Linux root privileges
• Instance types ranging from t1.micro to cc2.8xlarge
• HPC instances have 10 Gb full bisection bandwidth
• Ephemeral storage, Elastic Block Store and SSDs
• We constantly modernize our infrastructure
Amazon EC2 Pricing Models
• On-Demand
• Reserved Instances
  - Light
  - Moderate
  - Heavy
• Spot
• Dedicated Instances
Compute: Amazon EC2 Instances
cc2.8xlarge
- 2 x Intel Xeon E5-2670 (“Sandy Bridge” architecture)
- 16 cores w/ HT
- 60.5 GB RAM
- 3.4 TB disk
- HVM
hi1.4xlarge
- 2 x 1 TB SSD LUNs
- 16 cores
- 60.5 GB RAM
- 35 ECUs
- 10 Gigabit Ethernet
#42
Top-5 Pharma Client
Not just about high performance infrastructure
- Choose the right infrastructure best suited for your applications and pipelines
- A programmable infrastructure
- No longer bound by physical limits - No queued jobs
- Experiment at scale
12.7 Teraflops for < $35/hour!
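As a small worked example of the headline figure, the sketch below turns "12.7 Teraflops for < $35/hour" into a price per teraflop-hour. The slide gives only an upper limit on the hourly price, so the result is an upper limit too; the function name is illustrative.

```python
def cost_per_teraflop_hour(hourly_cost_usd: float, teraflops: float) -> float:
    """Dollars paid for one teraflop of capacity held for one hour."""
    if teraflops <= 0:
        raise ValueError("teraflops must be positive")
    return hourly_cost_usd / teraflops


if __name__ == "__main__":
    # Figures from the slide; $35/hour is quoted as a ceiling, not an exact price.
    ceiling = cost_per_teraflop_hour(35.0, 12.7)
    print(f"under ${ceiling:.2f} per teraflop-hour")
```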
Amazon VPC Architecture
[Diagram labels: Customer’s Network; Secure VPN Connection over the Internet; Amazon Web Services Cloud; VPN Gateway; Router; Subnets; Customer’s isolated AWS resources; Internet; NAT]
Database Options (from self-managed to managed databases)
Database Server on Amazon EC2
- Your choice of database running on Amazon EC2
- Bring Your Own License (BYOL)
Amazon Relational Database Service (RDS)
- Oracle or MySQL offered as a service
- Flexible Licensing: BYOL or License Included
Amazon DynamoDB
- NoSQL data store
- SSD storage
- Seamless scalability with zero administration
Higher-Level Services
Libraries & SDKs
- Developer Centers: your choice of programming language (Java, PHP, Python, Ruby, .NET) and mobile platform (Android, iOS)
Parallel Processing
- Amazon Elastic MapReduce: allows customers to easily and cost-effectively process vast amounts of data using a Hadoop framework running on Amazon EC2 instances
Messaging
- Amazon Simple Queue Service: reliable and highly scalable message queue for cloud applications
- Amazon Simple Notification Service: push notifications from the cloud to subscribers or client applications
- Amazon Simple Email Service: send bulk and transactional emails in a quick and cost-effective manner
Deployment & Administration Services
Deployment
- AWS CloudFormation: use application templates to create a collection of related AWS resources, then provision and update them in an orderly and predictable way
Monitoring
- Amazon CloudWatch: monitor AWS resources and track metrics to gain insight and react immediately to keep applications running smoothly
Automation
- AWS Elastic Beanstalk: provision an Apache Tomcat environment and deploy your Java applications in minutes
Data Collaboration
• Storage Services
  - Amazon S3
  - Amazon EBS
  - Amazon DynamoDB
• Transfer Services
  - AWS Import/Export
  - AWS Storage Gateway
• Identity and Access Management
  - Federation
• Encryption features
  - Amazon S3 Server Side Encryption
  - Client side encryption
  - Key Management (Partners)
Leverage Public Datasets available on AWS
• A centralized repository of public data sets
• Seamless integration with cloud-based applications
• No charge to the community
• http://aws.amazon.com/publicdatasets/
• Some data sets of interest:
  - 1000 Genomes project
  - Ensembl: Annotated Human Genome Data (FASTA)
  - Illumina: Jay Flatley Human Genome Data Set
  - YRI Trio Dataset
  - The Cannabis Sativa Genome
  - GenBank
  - UniGene
  - Influenza Virus (including updated Swine Flu sequences)
AWS Grant Program
• http://aws.amazon.com/education/
• Recipients selected based on:
  - Uniqueness of work
  - Application of Amazon Web Services
  - Ability to disseminate work publicly via papers, events or public relations
• Great way to develop new public data sets and AMIs
Consolidated Billing with IAM
• Allows you to get one bill for multiple accounts
• You can easily track each account's costs and download the cost data in CSV format
• You may be able to reduce costs by combining usage from all the accounts to qualify for volume pricing discounts
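The volume-pricing point can be illustrated with a minimal sketch. The tier sizes, prices and usage figures below are hypothetical and are not actual AWS rates; the sketch only shows why pooled usage can reach a cheaper tier that separate accounts would not reach on their own.

```python
from decimal import Decimal

# Hypothetical tiers: (units in the tier, or None for "everything above", price per unit).
TIERS = [(1000, Decimal("0.10")), (None, Decimal("0.08"))]


def tiered_cost(usage: int, tiers=TIERS) -> Decimal:
    """Charge each slice of usage at the price of the tier it falls into."""
    cost = Decimal("0")
    remaining = usage
    for size, price in tiers:
        if remaining <= 0:
            break
        in_tier = remaining if size is None else min(remaining, size)
        cost += in_tier * price
        remaining -= in_tier
    return cost


if __name__ == "__main__":
    accounts = [600, 600]  # hypothetical usage of two linked accounts
    billed_separately = sum(tiered_cost(u) for u in accounts)
    billed_together = tiered_cost(sum(accounts))
    print(billed_separately, billed_together)
```

Billed separately, neither account leaves the first tier; billed together, the last 200 units fall into the cheaper tier.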
www.adscfd.com AeroDynamic Solutions, Inc. (877) RICHCFD
Air Force Conducts Large-Scale Aerodynamic Simulation with Amazon EC2
To speed the development of more fuel efficient and durable jet engines, the U.S. Air Force Research Laboratory and AeroDynamic Solutions (ADS) partnered with Amazon Web Services to devise an effective design simulation solution. With Amazon Elastic Compute Cloud (Amazon EC2), ADS proved that large-scale aerodynamic simulations can be dialed up on-demand and performed affordably and within the time constraints of commercial design.
Background
For the world’s leading manufacturers of jet engines, product development remains an extremely costly and time-consuming task. Modern designs have pushed traditional analysis methods to the limit, demanding the use of advanced simulation techniques to better tackle performance and durability issues before committing to hardware. To address these concerns, the Turbine Branch of the United States Air Force Research Laboratory (AFRL) and ADS joined forces to advance one such simulation technique—large-scale time accurate simulation—for the U.S. gas turbine industry. The Turbine Branch is responsible for advancing the technical capability of turbo propulsion systems, and ADS provides Computational Fluid Dynamics (CFD) software and analysis services to the world’s manufacturers of jet engines, industrial gas turbines, and compressors.
Time accurate simulation enables designers to understand how time-varying aerodynamic loads can lead to performance loss and structural fatigue. Though this analytical method has long been available, it has largely remained out of reach for commercial design due to its high computational cost and long turnaround time.
To carry out large-scale time accurate simulation, hundreds of clustered processors may often be required, necessitating enormous upfront hardware, software and support personnel costs. As a result, this type of simulation has largely remained out of reach to all but the largest of gas turbine manufacturers. Another problem is that time accurate simulation can take weeks to run, rendering it impractical for commercial design cycles.
Working under an SBIR Phase II award from the U.S. Air Force, ADS enhanced its large-scale time accurate analysis capabilities to tackle these issues. As part of this effort, ADS turned to Amazon Web Services (AWS). Using Amazon EC2, AeroDynamic Solutions gained the capabilities of a large commercial cluster on demand and at a fraction of the cost.
Approach
To demonstrate the capabilities of the ADS/AWS solution, the U.S. Air Force-designed Notre Dame HiLT 1.5 stage turbine was analyzed for unsteady effects. Consisting of 165 passages with 60/70/35 airfoil counts per row, the HiLT turbine is a highly loaded, transonic, low pressure turbine representative of today's modern designs. For the analysis, one-fifth (1/5) of the full wheel (12/14/7) was simulated for a complete revolution. After generating the mesh and completing initial 3-D multi-stage analysis, the EC2-enabled ADS solution performed as follows:
• The mesh (10.6 million elements) was partitioned into 40 blocks for parallel execution on the ADS solver Code Leo.
• 40 processors were dynamically provisioned on Amazon EC2 utilizing five cc1 cluster compute instances.
• Code Leo was invoked across the 40-processor cluster, simulating 10,500 time steps with 20 inner iterations per time step.
• Results were gathered and delivered to local servers for post-processing and analysis.
• EC2 instances were terminated upon completion.
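The figures in the steps above can be cross-checked with a short sketch. All numbers come from the case study; the assumption that the mesh splits into equally sized blocks is an illustration only, and the class and function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SimulationSetup:
    """Figures quoted in the case study."""
    mesh_elements: int = 10_600_000
    blocks: int = 40
    processors: int = 40
    instances: int = 5
    time_steps: int = 10_500
    inner_iterations: int = 20

    def elements_per_block(self) -> int:
        # Assumes equally sized blocks; real partitions are rarely exact.
        return self.mesh_elements // self.blocks

    def processors_per_instance(self) -> int:
        return self.processors // self.instances

    def total_inner_iterations(self) -> int:
        return self.time_steps * self.inner_iterations


def sector_counts(full_wheel, sectors):
    """Airfoil counts per row when simulating 1/sectors of the full wheel."""
    if any(count % sectors for count in full_wheel):
        raise ValueError("airfoil counts must divide evenly into sectors")
    return tuple(count // sectors for count in full_wheel)


if __name__ == "__main__":
    setup = SimulationSetup()
    print(setup.elements_per_block())
    print(setup.processors_per_instance())
    print(setup.total_inner_iterations())
    print(sector_counts((60, 70, 35), 5))
```

The one-fifth sector works because 60, 70 and 35 are all divisible by five, which is what allows the 12/14/7 model to stand in for the full wheel.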
Security issues were addressed as well: SAS 70 Type II certification and VPN-level access were required; uploaded and downloaded data was encrypted; dedicated cc1 instances were provisioned to ensure that data mingling did not occur; and data was purged upon completion of the case.
Results
The results of this case were impressive. Using Amazon EC2 the large-scale, time accurate simulation was turned around in just 72 hours with computing infrastructure costs well below $1,000. Additionally, time accurate analysis revealed critical insights that were not detected using traditional analysis techniques—most notably a 2% drop in efficiency relative to conventional 3-D multi-stage steady predictions.
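A rough sense of the unit cost can be sketched from the figures above: five instances, 40 processors, 72 hours and under $1,000. The sketch assumes all five instances ran for the full 72-hour turnaround, which the case study does not state, so treat the output as an order-of-magnitude estimate rather than an actual price.

```python
def resource_hours(units: int, wall_clock_hours: float) -> float:
    """Total billed hours if every unit runs for the whole period."""
    return units * wall_clock_hours


def implied_rate_ceiling(budget_usd: float, hours: float) -> float:
    """Hourly rate implied by spending the whole budget over the given hours."""
    return budget_usd / hours


if __name__ == "__main__":
    # Assumption: instances ran for the entire 72-hour turnaround.
    inst_hours = resource_hours(5, 72)    # five cluster compute instances
    proc_hours = resource_hours(40, 72)   # forty processors
    print(inst_hours, round(implied_rate_ceiling(1000, inst_hours), 2))
    print(proc_hours, round(implied_rate_ceiling(1000, proc_hours), 2))
```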
Dr. John Clark, Turbine Branch, Turbine Engine Division, Propulsion Directorate of the Air Force Research Laboratory, explains the importance of this case: “Advancing turbine durability and performance remains critical for the U.S. gas turbine industry. The combination of high fidelity time accurate analysis from ADS and on-demand CFD analysis resources from Amazon makes it possible for turbine manufacturers to tackle these issues during design—quickly and without the need for large hardware investment.”
George Fan, CEO of AeroDynamic Solutions, is equally thrilled with the results: “Traditional 3D steady analysis techniques are no longer sufficient to support the design of today’s advanced jet engines. To improve durability and performance, advanced analysis capabilities such as time accurate simulation must be made to work within the accuracy, time, and cost constraints of a commercial design cycle. We’re delighted to be working closely with the Air Force and AWS to make this a reality for designers large and small.”
http://aws.amazon.com/solutions/case-studies/aerodynamic-solutions/
Example: Galaxy
- An open, web-based platform: http://usegalaxy.org
- Perform, reproduce and share complete analyses
- Automatically tracks and manages data provenance and provides support for capturing the context and intent of computational methods
Example: Ion Flux
- Services to analyze DNA sequence data for researchers and health professionals in genomic medicine
http://www.ionflux.com/
For More Information…
• http://www.bigdatahpc.com
• http://aws.amazon.com/ec2/spot-and-science/
• http://aws.amazon.com/hpc-applications/
• http://aws.amazon.com/ec2/instance-types/
• http://aws.typepad.com
Thank you!
Jamie Kinney [email protected] Twitter:@jamiekinney
http://linkedin.com/pub/jamie-kinney/0/b33/668