Cloud Building Blocks and FME
Aug 07, 2015
Overview
Cloud data storage
● AWS Simple Storage Service (S3)
● AWS Aurora
● AWS RDS
Other critical AWS services
● Simple Queue Service (SQS)
● AWS Lambda
Common Architecture
Data Considerations
● Uploading data to AWS is as slow as your internet connection.
  ○ 10 Mbps connection: 23 hours for 100 GB
  ○ Transfer acceleration tools: Tsunami UDP, Aspera
● You can use AWS Import/Export to move large amounts of data into the cloud.
● Moving data around once it is in the cloud is fast.
  ○ Within a region: ~20 MB/s
  ○ Between regions: ~2 MB/s
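The transfer figures above can be sanity-checked with a little arithmetic. The sketch below assumes decimal units (1 GB = 1000 MB) and an ideal link with no protocol overhead, so real transfers will take somewhat longer. The function name is illustrative only.

```python
def transfer_hours(size_gb: float, rate_mbps: float) -> float:
    """Hours needed to move size_gb gigabytes at rate_mbps megabits per second."""
    megabits = size_gb * 1000 * 8  # decimal gigabytes -> megabits
    return megabits / rate_mbps / 3600


# 100 GB over a 10 Mbps uplink (the upload case from the slide)
upload = transfer_hours(100, 10)

# 100 GB inside AWS: ~20 MB/s within a region, ~2 MB/s between regions
in_region = transfer_hours(100, 20 * 8)    # 20 MB/s = 160 Mbps
cross_region = transfer_hours(100, 2 * 8)  # 2 MB/s = 16 Mbps

print(f"upload: {upload:.1f} h, in region: {in_region:.1f} h, "
      f"cross region: {cross_region:.1f} h")
```

The ideal upload time comes out a little over 22 hours, which is in line with the 23 hours quoted once some overhead is allowed for.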
Cloud data storage
AWS Simple Storage Service (S3) - secure, durable, highly scalable object storage.
AWS RDS - a web service that makes it easy to set up, operate, and scale a relational database in the cloud.
AWS Aurora - provides up to 5x better performance than MySQL at a price point one tenth that of a commercial database.
AWS S3 - Overview
● Secure, durable, highly scalable object storage.
● Designed for 99.999999999% durability.
● The service as a whole handles over 3 million requests per second.
● S3 is highly performant: an application can make hundreds of requests a second against it.
● Storage costs $0.0330 per GB per month; 5 TB of data costs about $160 a month.
● A single object can be up to 5 TB.
AWS S3 - How it works
● A simple, persistent key/value object store.
● Every object is identified by a key.
● It is not a disk and doesn't have a directory or folder structure, although the keys can be presented as one.
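To illustrate how a flat key space can be presented as folders, here is a minimal pure-Python sketch of a prefix-and-delimiter listing. It imitates the idea only and does not call S3. The function name and the sample keys are hypothetical.

```python
def list_folder(keys, prefix="", delimiter="/"):
    """Split flat keys into (files, subfolders) directly under prefix."""
    files, folders = [], set()
    for key in keys:
        if not key.startswith(prefix):
            continue
        rest = key[len(prefix):]
        head, sep, _ = rest.partition(delimiter)
        if sep:
            # Something follows a delimiter, so present it as a subfolder.
            folders.add(prefix + head + delimiter)
        else:
            files.append(key)
    return sorted(files), sorted(folders)


# Hypothetical keys: there are no real directories, only strings.
keys = [
    "imagery/tile_a/row_0_col_0.jp2",
    "imagery/tile_a/row_0_col_1.jp2",
    "imagery/index.json",
    "readme.txt",
]

files, folders = list_folder(keys, prefix="imagery/")
root_files, root_folders = list_folder(keys)
```

The "folders" exist only in how the keys are grouped for display; deleting the last key under a prefix makes the folder disappear.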
AWS S3 - Advanced
● You can host a static website on S3 for next to nothing.
● You can trigger events on S3 buckets when files are placed in them.
● Every object in S3 is web addressable:
https://s3.amazonaws.com/<bucket_name>/<keyname>
e.g.
https://s3.amazonaws.com/don.demo.safe.bigdata/imagery/c2775360-75d3-4e1b-b64d-16e39b8f50e8/130105_row_0_col_0.jp2
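As a small sketch of the addressing scheme, the helper below builds a URL in the `https://s3.amazonaws.com/<bucket_name>/<keyname>` form shown in the slides. The function name is made up for illustration, and percent-encoding the key is an assumption about how keys with spaces or other special characters should be handled. Whether the URL can actually be fetched depends on the object's permissions.

```python
from urllib.parse import quote


def s3_object_url(bucket: str, key: str) -> str:
    """Build a path-style URL for an object; '/' in the key is left as-is."""
    return f"https://s3.amazonaws.com/{bucket}/{quote(key)}"


# The example object from the slide
url = s3_object_url(
    "don.demo.safe.bigdata",
    "imagery/c2775360-75d3-4e1b-b64d-16e39b8f50e8/130105_row_0_col_0.jp2",
)
print(url)
```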
FME and S3
Desktop transformers:
● S3Uploader
● S3Downloader
● S3Deleter
● S3ObjectLister
FME Server S3 watcher: Watch public S3 buckets that you do not control.
FME Server S3 subscriber: Send data to S3 once a workspace has completed.
** When running workspaces on FME Cloud, expect transfer speeds averaging 20 MB/s from the instance to the S3 bucket. **
S3 uses
AWS S3 is suited for:
1. Storing file-based datasets to process on FME Cloud.
2. Storing large amounts of raster data; the bounding box of each raster file can be stored in PostGIS with a link to the object in S3.
3. Automating data processing: use S3 event notifications to send a message to FME.
4. Storing web map tiles; every file becomes web addressable, so the tiles can be consumed by an application.
AWS RDS
Amazon RDS makes it easy to set up, operate, and scale database deployments in the cloud. The following databases are supported:
● Oracle
● PostgreSQL (PostGIS)
● Microsoft SQL Server
● MySQL
The AWS RDS service lets you…
● Deploy in minutes
● Apply software patches automatically
● Automate backups
● Scale storage and compute with one click
● Replicate to enhance availability and reliability
Did I mention...
On-Premises vs RDS PostGIS
Machine: 16 cores, 122 GB of RAM, Linux, 1 TB storage
https://awstcocalculator.com/
AWS Aurora
A relational database engine that combines the speed and availability of high-end commercial databases with the simplicity and cost-effectiveness of open source databases.
● Works with the MySQL reader/writer
● 5x better performance than MySQL
● 6 million inserts per minute, 30 million selects per minute
● 6-way replication across three availability zones
● Continually backed up to S3
● Fault tolerant; automatically repairs failures in the background
● Crash recovery takes seconds, not hours
● Database cache survives a database restart
Simple Queue Service (SQS)
Fully managed message queuing service. SQS can be used for:
● Asynchronous communication pipelines
● Buffer queues for databases
● Asynchronous work queues
● S3 event notifications
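An asynchronous work queue on SQS could look roughly like the sketch below. It assumes a boto3 SQS client is passed in as `sqs`; `send_message`, `receive_message` and `delete_message` are boto3 client calls, while the helper names, the queue URL and the job format are hypothetical.

```python
import json


def enqueue_job(sqs, queue_url, job):
    """Producer side: put one job description on the queue."""
    sqs.send_message(QueueUrl=queue_url, MessageBody=json.dumps(job))


def process_jobs(sqs, queue_url, handle):
    """Consumer side: fetch a batch, handle each job, then delete it."""
    response = sqs.receive_message(
        QueueUrl=queue_url,
        MaxNumberOfMessages=10,  # SQS returns at most 10 per call
        WaitTimeSeconds=10,      # long polling avoids busy-waiting
    )
    messages = response.get("Messages", [])  # key is absent when queue is empty
    for message in messages:
        handle(json.loads(message["Body"]))
        # Delete only after successful handling, so failed jobs reappear.
        sqs.delete_message(
            QueueUrl=queue_url, ReceiptHandle=message["ReceiptHandle"]
        )
    return len(messages)
```

With real credentials this would be driven by something like `sqs = boto3.client("sqs")`; the producer and the consumer never talk to each other directly, which is what makes the pipeline asynchronous.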
Lambda (beta)
A compute service that runs your code in response to events. It makes it easy to apply compute to data as it enters or moves through the cloud.
● Run code in response to modifications to objects in Amazon S3 buckets
● Create your own back end that operates at scale
● Extend other AWS services
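As a rough sketch of running code in response to S3 object changes, the handler below pulls the bucket and key out of an S3 event notification. It is written for the Python Lambda runtime, which is an assumption rather than something the slides specify. The sample event is a trimmed, hypothetical payload containing only the fields the handler reads.

```python
from urllib.parse import unquote_plus


def lambda_handler(event, context):
    """Return the bucket/key of every newly created object in the event."""
    created = []
    for record in event.get("Records", []):
        if not record.get("eventName", "").startswith("ObjectCreated"):
            continue  # ignore deletions and other event types
        bucket = record["s3"]["bucket"]["name"]
        # Keys arrive URL-encoded in the notification (spaces become '+').
        key = unquote_plus(record["s3"]["object"]["key"])
        created.append({"bucket": bucket, "key": key})
    return created


# Hypothetical, trimmed-down event for local experimentation
sample_event = {
    "Records": [
        {
            "eventName": "ObjectCreated:Put",
            "s3": {
                "bucket": {"name": "example-bucket"},
                "object": {"key": "imagery/new+tile.jp2"},
            },
        },
        {
            "eventName": "ObjectRemoved:Delete",
            "s3": {
                "bucket": {"name": "example-bucket"},
                "object": {"key": "imagery/old.jp2"},
            },
        },
    ]
}

result = lambda_handler(sample_event, None)
```

From here the handler could pass each bucket/key pair on to whatever should process the file, for example by posting a message that a workspace picks up.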