NoSQL and Cloud Services - Philip Balinow, Komfo
Post on 24-May-2015
Introduction into Cloud Computing
philip.balinov@komfo.com, DevOps Engineer
SOME DEFINITIONS
def. CLOUD COMPUTING - distributed computing over a network: the ability to run a program or application on many connected computers at the same time.
def. BIG DATA - data sets so large and complex that they become difficult to process using traditional data processing applications.
SOME DEFINITIONS, CONTD.
Q: No, seriously. What is the Cloud?
A: Well, there are three types
Infrastructure-as-a-Service
Platform-as-a-Service
Software-as-a-Service
THE CLOUD – PROS AND CONS
WHY CLOUD COMPUTING IS SO GREAT
Better hardware utilization
Economy of scale
Usage-based pricing
Built-in resilience (here be monsters)
No up-front costs
No long-term contracts
KOMFO'S CLIENTS
Retail
Finance
E-Commerce
Telecommunication
B2B
Publishing/Media
Government & NGO
Automotive
Travel
KOMFO PLATFORM WORKFLOW OVERVIEW
EXTERNAL PROVIDERS
OUR TECHNOLOGY
PHP, Python, C
MySQL
MongoDB, Elasticsearch
JavaScript (Node.js)
Ruby
Freedom to use any tool fit for the job
KOMFO MAIN CHALLENGES
Continuously changing workload
Fast feature changes
Big Data
(Predictive) Analytics
Security
Availability
HOW TO USE THE CLOUD
Automation
Horizontal scaling
Break tasks into many small sub-tasks
Synchronize all the workers
Write for eventual consistency
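The "break tasks into many small sub-tasks" idea can be sketched as follows. This is a minimal illustration, not the Komfo implementation: the function names, the chunk size, and the sum-of-squares workload are all hypothetical, and a local thread pool stands in for a fleet of cloud workers.

```python
from concurrent.futures import ThreadPoolExecutor


def chunk(items, size):
    """Split a list into consecutive sub-lists of at most `size` items."""
    return [items[i:i + size] for i in range(0, len(items), size)]


def process_chunk(numbers):
    """Hypothetical sub-task: compute a partial result for one chunk."""
    return sum(n * n for n in numbers)


def run_job(items, chunk_size=100, workers=4):
    """Fan the chunks out to a pool of workers, then combine the partial results."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Threads stand in for independent cloud nodes (an assumption of this sketch).
        partials = list(pool.map(process_chunk, chunk(items, chunk_size)))
    return sum(partials)
```

Because each chunk is independent, adding workers (horizontal scaling) shortens the job without changing the code that combines the results.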
HOW TO USE THE CLOUD, CONTD.
Beware! There are traps*
* Minimally shared with fair weighting. Allows burst when idle resources are available. In rare cases, resources may be throttled back under heavy host contention.
** Disk I/O is shared across the host.
*** A vCPU corresponds to a physical CPU thread.
HOW TO USE THE CLOUD, CONTD.
There are more traps
Coordinate tasks: use a good messaging system (AMQP, DB, Memcached)
Asynchronous task execution (see above, also API callback hooks)
Implement transactions in software
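One common way to "implement transactions in software" is to pair every step with a compensating undo action. The sketch below assumes that approach; the slides do not say which technique is used, and `run_with_compensation` is a hypothetical helper name.

```python
def run_with_compensation(steps):
    """Run (action, undo) pairs in order.

    If any action raises, the undo functions of the steps that already
    completed are called in reverse order and False is returned.
    Returns True when every action succeeds.
    """
    completed_undos = []
    try:
        for action, undo in steps:
            action()
            completed_undos.append(undo)
    except Exception:
        # Roll back only what actually ran; the failing step is assumed
        # to have left no partial effect (an assumption of this sketch).
        for undo in reversed(completed_undos):
            undo()
        return False
    return True
```

Unlike a database transaction, intermediate states are visible to other workers while the steps run, which is why the surrounding code has to tolerate eventual consistency.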
CLOUD ARCHITECTURE
Messaging
Ensure communication between a dynamic number of nodes
Message-oriented middleware
Exactly-once delivery
At-least-once delivery
Transaction-based delivery
Timeout-based delivery
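With at-least-once delivery the same message may arrive more than once, so consumers are usually written to be idempotent. A minimal sketch, assuming every message carries a unique ID (the class and parameter names here are hypothetical, and a real system would keep the seen-ID set in shared storage rather than in memory):

```python
class IdempotentConsumer:
    """Wrap a handler so that redelivered messages are processed only once."""

    def __init__(self, handler):
        self.handler = handler
        self.seen_ids = set()  # in-memory for the sketch; shared storage in practice

    def receive(self, message_id, payload):
        """Return True if the message was processed, False if it was a duplicate."""
        if message_id in self.seen_ids:
            return False
        self.handler(payload)
        # Record the ID only after the handler succeeds, so a message whose
        # handler raised is retried when the broker redelivers it.
        self.seen_ids.add(message_id)
        return True
```

Combining at-least-once delivery with this kind of deduplication gives the effect of exactly-once processing without requiring it from the messaging layer.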
MIX & MATCH
Crunch numbers in the cloud
Application servers
Slow running tasks
Temporary services
Test servers
Automation: automatic deployment of multi-tiered environments
MIX & MATCH, CONTD.
Traditional servers for:
Incompatible apps (single-threaded, memory- or disk-intensive, or dependent on specialized hardware), which do not work well in cloud environments
Database servers, which are best kept on dedicated machines
OK, so we have an (endlessly) scalable cloud app now.
What are we forgetting?
DATABASES, NOSQL
def. NoSQL - a mechanism for storage and retrieval of data that is modeled in means other than the tabular relations used in relational databases.
DATABASES
Postgres, Hadoop, MongoDB, Cassandra, Riak
In-memory dataset for faster operation
No predefined structure
Integrated sharding, load-balancing and failover
Versatility - can be used for anything from data storage to real-time messaging to search indexes
DATABASES
Use the best tool for the job depending on the task
NoSQL Advantages
Some sources generate a lot of data
Complex interconnections, cyclical dependencies
Aggregations must be performed on both new and old data
Structure of foreign sources may change on short notice
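The last two points can be illustrated with schemaless documents whose fields differ between sources and over time. In this sketch plain Python dicts stand in for documents in a document store; the source names, field names, and the "engagement" metric are invented for illustration.

```python
from collections import defaultdict

# Hypothetical documents: old and new records from two sources with different fields.
sample_docs = [
    {"source": "source_a", "likes": 10},
    {"source": "source_b", "favorites": 4, "reposts": 2},
    {"source": "source_a", "likes": 3, "shares": 1},  # newer record with an extra field
]


def engagement(doc):
    """Sum every numeric field except the source tag, whatever the fields are called."""
    return sum(
        value
        for key, value in doc.items()
        if key != "source" and isinstance(value, (int, float))
    )


def totals_by_source(docs):
    """Aggregate engagement per source across documents with differing structure."""
    totals = defaultdict(int)
    for doc in docs:
        totals[doc["source"]] += engagement(doc)
    return dict(totals)
```

When a source adds a field, the new records are stored as they arrive and the aggregation still runs over old and new data, with no schema migration in between.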
DATABASES
Use the best tool for the job depending on the task
NoSQL Disadvantages (Classic SQL advantages)
Often not ACID compliant
Limited or no transactions
No enforced relations between data
Lack of structure can make aggregations slow
DATABASES, LONG TERM STRATEGY
Data quickly becomes irrelevant
Archive it, but keep it accessible
Online Data Warehouse solutions
Amazon Redshift
Keep Everything
Terabytes for pennies
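The "archive it, but keep it accessible" strategy amounts to splitting records by age: recent data stays in the operational database and older data moves to the warehouse. A minimal sketch, assuming each record has a `created` timestamp and a 90-day cutoff (both are assumptions, not figures from the talk):

```python
from datetime import datetime, timedelta


def split_by_age(records, now, max_age_days=90):
    """Return (hot, archive): records newer than the cutoff, and the rest."""
    cutoff = now - timedelta(days=max_age_days)
    hot = [r for r in records if r["created"] >= cutoff]
    archive = [r for r in records if r["created"] < cutoff]
    return hot, archive
```

Nothing is deleted in this scheme; the archive list is what would be loaded into a warehouse such as Amazon Redshift.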
Summary
The cloud rocks, mmkay?
Questions?