Beolink.org: Why do I have to use a Message Queue System? Fabrizio Manfred Furuholmen

Introduction to message_queue

May 10, 2015

One of today's new IT challenges is "Big Data". Many solutions to this problem are available on the market, and some new paradigms have appeared.
In most of these new paradigms the message queue plays a more important part than it did in the past.
This is a short introduction to the use of messaging middleware and an overview of the main open source products available.
Transcript
Page 1: Introduction to message_queue

Beolink.org

Why do I have to use a Message Queue System?
Fabrizio Manfred Furuholmen

Page 2: Introduction to message_queue

EuroPython 2012

Agenda

- Introduction
  - History
  - Basic components
- Message Queue
  - Usage type
  - Advantages
- Implementation
  - Solution
  - Performance
  - Scalability / High Availability
- Big Data
  - Distributed
  - Cloud Computing

Page 3: Introduction to message_queue

9/11/12

Introduction: Example

Process communication, more than 10 years ago.

[Diagram components: Process A, Filesystem, FTP Server, Process B]

- Get Message
- Message: 2012082609000000-Serial-Type-ProcessB.xml
- Ack Message: 2012082609000000-Serial-Type-OK.xml

Page 4: Introduction to message_queue

Introduction: Definition

“…message queueing is a method by which processes (or program instances) can exchange or pass data using an interface to a system-managed queue of messages...”

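The definition above can be illustrated inside a single program. This is a minimal sketch, assuming Python's standard `queue.Queue` stands in for the system-managed queue; the function names and the `None` end-of-stream marker are illustrative choices, not part of the talk.

```python
import queue
import threading


def producer(q, items):
    """Put each item on the queue, then a sentinel to signal the end."""
    for item in items:
        q.put(item)
    q.put(None)  # sentinel (assumption of this sketch): no more messages


def consumer(q, received):
    """Take messages off the queue until the sentinel arrives."""
    while True:
        msg = q.get()
        if msg is None:
            break
        received.append(msg)


def run_demo():
    q = queue.Queue()  # plays the role of the system-managed queue
    received = []
    t_cons = threading.Thread(target=consumer, args=(q, received))
    t_prod = threading.Thread(target=producer, args=(q, ["a", "b", "c"]))
    t_cons.start()
    t_prod.start()
    t_prod.join()
    t_cons.join()
    return received


if __name__ == "__main__":
    print(run_demo())
```

The producer and the consumer never call each other; they only share the queue interface.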
Page 5: Introduction to message_queue

Introduction: Components

[Diagram: the components of the previous example (Process A, Filesystem, FTP Server, Process B), now labeled PRODUCER, BROKER and CONSUMER, with Topic and Queue]

- Get Message
- Message: 2012082609000000-Serial-Type-ProcessB.xml
- Ack Message: 2012082609000000-Serial-Type-OK.xml

Page 6: Introduction to message_queue

Introduction: Broker

“…message broker is an architectural pattern for message validation, message transformation and message routing. It mediates communication amongst applications, minimizing the mutual awareness that applications should have of each other in order to be able to exchange messages, effectively implementing decoupling…”

Message-oriented middleware (MOM)

Page 7: Introduction to message_queue

Is message queue middleware only temporary storage?

Page 8: Introduction to message_queue

Message Queue

- Asynchronous communication
- Lock
- Concurrent Read/Write
- Burst Message
- Decoupling
- Reliability
- Multi platform

[Diagram: P (producer), C (consumer)]

E.g. multimedia converter, SMS gateway

Page 9: Introduction to message_queue

Message Queue: Multi Processing

- Parallel processing
- Load Balancing
- High Availability
- Elastic
- Maintenance operation

[Diagram: three producers (P), four consumers (C)]

E.g. image converter, billing event, user provisioning
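Parallel processing and load balancing can be sketched with several consumers reading from one shared queue. This is a minimal illustration, assuming in-process threads and `queue.Queue`; the squaring "job" and the names `worker` and `process_in_parallel` are placeholders invented for the example.

```python
import queue
import threading


def worker(q, results, lock):
    """Consume jobs until a None sentinel is received."""
    while True:
        job = q.get()
        if job is None:
            q.task_done()
            break
        with lock:
            results.append(job * job)  # stand-in for real work
        q.task_done()


def process_in_parallel(jobs, n_workers=3):
    q = queue.Queue()
    results = []
    lock = threading.Lock()
    workers = [
        threading.Thread(target=worker, args=(q, results, lock))
        for _ in range(n_workers)
    ]
    for w in workers:
        w.start()
    for job in jobs:
        q.put(job)
    for _ in workers:
        q.put(None)  # one sentinel per worker
    q.join()  # wait until every job and sentinel has been handled
    for w in workers:
        w.join()
    return sorted(results)  # completion order is not deterministic
```

Whichever worker is free takes the next job, so adding or removing workers changes capacity without touching the producer.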

Page 10: Introduction to message_queue

Message Queue: Pub/Sub

- Sending messages to many consumers at once
- Event Driven

[Diagram: one producer (P), X, three consumers (C)]

E.g. push notification, chat room
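Publish/subscribe means every subscriber gets its own copy of each message. A minimal in-process sketch follows; the `FanoutExchange` class is hypothetical and only mimics the idea, it is not the API of any broker.

```python
import queue


class FanoutExchange:
    """Hypothetical exchange that copies each message to every subscriber."""

    def __init__(self):
        self._subscribers = []

    def subscribe(self):
        """Create and return a private queue for a new subscriber."""
        q = queue.Queue()
        self._subscribers.append(q)
        return q

    def publish(self, message):
        """Deliver the message to all current subscribers."""
        for q in self._subscribers:
            q.put(message)


if __name__ == "__main__":
    exchange = FanoutExchange()
    alice = exchange.subscribe()
    bob = exchange.subscribe()
    exchange.publish("room opened")
    print(alice.get(), "|", bob.get())
```

Unlike the shared work queue, here a message is not consumed once; each subscriber reads it independently.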

Page 11: Introduction to message_queue

Message Queue: Routing

- Static with routing key
- Pattern based
- Pattern topic
- Dynamic with header evaluation

[Diagram: producer (P), X, Queue A, Queue B, Queue C, three consumers (C)]

E.g. logging collector, user provisioning on target system, info sync
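Topic-pattern routing can be sketched as a small matcher. This example assumes the AMQP topic convention (dot-separated words, `*` matches exactly one word, `#` matches zero or more words); the binding table and queue names are illustrative only.

```python
def _match(pattern_words, key_words):
    """Recursively match pattern words against routing-key words."""
    if not pattern_words:
        return not key_words
    head, rest = pattern_words[0], pattern_words[1:]
    if head == "#":
        # '#' may absorb zero or more words
        return any(_match(rest, key_words[i:]) for i in range(len(key_words) + 1))
    if not key_words:
        return False
    if head == "*" or head == key_words[0]:
        return _match(rest, key_words[1:])
    return False


def topic_matches(pattern, routing_key):
    """Return True if routing_key matches the topic pattern."""
    key_words = routing_key.split(".") if routing_key else []
    return _match(pattern.split("."), key_words)


def route(bindings, routing_key):
    """Return the queue names whose binding pattern matches the key."""
    return [name for pattern, name in bindings if topic_matches(pattern, routing_key)]


if __name__ == "__main__":
    bindings = [("logs.error", "Queue A"), ("logs.*", "Queue B"), ("#", "Queue C")]
    print(route(bindings, "logs.error"))
    print(route(bindings, "audit.login"))
```

A static routing key is the special case of a pattern with no wildcards.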

Page 12: Introduction to message_queue

Message Queue: RPC

- Remote Procedure Call
  - Single queue for Consumer
  - One queue for each Producer
  - Reply to options

[Diagram: P, Queue, C, Tmp Queue]

E.g. distributed scheduler, callback, WAN communication
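The RPC pattern (one request queue for the consumer, one temporary reply queue per producer) can be sketched as follows. This is an in-process illustration under stated assumptions: the `reply_to` and `correlation_id` field names follow common messaging practice, and the doubling "procedure" is a placeholder.

```python
import queue
import threading
import uuid


def server(request_q):
    """Serve requests from the single request queue until a None sentinel."""
    while True:
        req = request_q.get()
        if req is None:
            break
        result = req["body"] * 2  # placeholder remote procedure
        req["reply_to"].put(
            {"correlation_id": req["correlation_id"], "body": result}
        )


def call(request_q, value, timeout=2.0):
    """Send a request and wait for the reply on a temporary queue."""
    reply_q = queue.Queue()  # temporary queue owned by this caller
    corr_id = str(uuid.uuid4())
    request_q.put({"reply_to": reply_q, "correlation_id": corr_id, "body": value})
    reply = reply_q.get(timeout=timeout)
    if reply["correlation_id"] != corr_id:
        raise RuntimeError("reply does not match the request")
    return reply["body"]


def run_demo():
    request_q = queue.Queue()
    t = threading.Thread(target=server, args=(request_q,))
    t.start()
    try:
        return [call(request_q, n) for n in (1, 2, 3)]
    finally:
        request_q.put(None)  # stop the server thread
        t.join()
```

The correlation id lets a caller that reuses one reply queue tell answers apart.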

Page 13: Introduction to message_queue

Message: More …

- Persistent Message
- Queue
  - Priority / Re-ordering
  - Message Group
  - QoS / rating
- Deduplication
- Broker Network
  - Cluster
  - Load distribution over WAN
  - Message routing

Page 14: Introduction to message_queue

“Message Queue scales to any number of cores, avoids all locks, costs little more than conventional single-threaded programming, is easy to learn, and does not crash in strange ways. At least no more strangely than a normal single-threaded program…”

http://msdn.microsoft.com/en-us/magazine/cc817398.aspx
http://www.zeromq.org/blog:_start/p/2
http://ulf.wiger.net/weblog/2008/02/06/what-is-erlang-style-concurrency/

Page 15: Introduction to message_queue

Simple solution to a complicated problem!

Page 16: Introduction to message_queue

Implementation

- Internal implementation
  - Python (Queue), Perl (Thread::Queue) ...
- NoSQL based
  - Redis, MongoDB, Memcache …
- Framework
  - Generic application framework: Gearman
  - STOMP based: ActiveMQ, Apollo…
  - AMQP based: RabbitMQ, Qpid…
  - Other: Kafka…
- Alternative solutions
  - Brokerless (0MQ, Crossroads I/O)
- Services

Page 17: Introduction to message_queue

Implementation: Message Format

- Internal / Object
- STOMP: Simple (or Streaming) Text Oriented Message Protocol is a simple text-based protocol, designed for working with message-oriented middleware.
- AMQP: Advanced Message Queuing Protocol is an application layer protocol, designed to efficiently support a wide variety of messaging applications and communication patterns.
- XMPP: Extensible Messaging and Presence Protocol
- JSON: JavaScript Object Notation, is a text-based

Page 18: Introduction to message_queue

Implementation: NoSQL

Redis internal functions

    self.redis = redis.StrictRedis(…)

    def send(self, queue, message):
        # Append the message to the tail of the list stored under the queue key
        self.redis.rpush(queue, message)

    def recv(self, queue):
        # Block until a message is available; blpop returns a (key, value) pair
        return self.redis.blpop(queue)

- Queue Name = Key
- Message = Value
- Queue = List
- Notify = Event

RestMQ

The HTTP operation on URL /queue/<queuename>:

Post Message

    {"cmd": "add", "queue": "genesis", "value": "abacab"}

Get Message

    {"cmd": "take", "queue": "genesis"}

The message can be formatted as a JSON object, so more complex data can be sent. It was inspired by Amazon SQS.

Demo pub/sub: https://gist.github.com/348262

Page 19: Introduction to message_queue

Implementation: AMQP

RabbitMQ

Producer

    #!/usr/bin/env python
    import pika

    # Connect to the RabbitMQ broker on localhost
    connection = pika.BlockingConnection(
        pika.ConnectionParameters(host='localhost'))
    channel = connection.channel()

    # Make sure the queue exists before publishing
    channel.queue_declare(queue='myqueue')

    # The default exchange ('') routes by queue name
    channel.basic_publish(exchange='',
                          routing_key='myqueue',
                          body='message 1')
    print(" [x] Sent 'message 1'")
    connection.close()

Consumer

    #!/usr/bin/env python
    import pika

    connection = pika.BlockingConnection(
        pika.ConnectionParameters(host='localhost'))
    channel = connection.channel()

    channel.queue_declare(queue='myqueue')

    print(' [*] Waiting for messages. To exit press CTRL+C')

    def callback(ch, method, properties, body):
        print(" [x] Received %r" % (body,))

    # pika 1.x signature; older releases used
    # basic_consume(callback, queue=..., no_ack=True)
    channel.basic_consume(queue='myqueue',
                          on_message_callback=callback,
                          auto_ack=True)

    channel.start_consuming()

Page 20: Introduction to message_queue

Implementation: Alternatives

ZeroMQ

Producer

    #!/usr/bin/env python
    import zmq

    context = zmq.Context()
    socket = context.socket(zmq.REQ)
    # The REQ side connects; only the REP side binds the address
    socket.connect("tcp://127.0.0.1:5000")

    while True:
        msg = b"my msg"
        socket.send(msg)
        print("Send", msg)
        msg = socket.recv()

Consumer

    #!/usr/bin/env python
    import zmq

    context = zmq.Context()
    socket = context.socket(zmq.REP)
    socket.bind("tcp://127.0.0.1:5000")

    while True:
        msg = socket.recv()
        print("Got", msg)
        socket.send(msg)

Page 21: Introduction to message_queue

…, but it is not fast enough …

Page 22: Introduction to message_queue

Performance: Redis

The Linux box is running Linux 2.6 on a Xeon X3320 2.5 GHz.
Test executed using the loopback interface (127.0.0.1).

Page 23: Introduction to message_queue

Performance: RabbitMQ

PowerEdge R610 with dual Xeon E5530s and 40GB RAM

Page 24: Introduction to message_queue

Performance: ActiveMQ/Apollo

- EC2 High-CPU Extra Large Instance (EC2 xlarge)
- 7 GB of memory
- 20 EC2 Compute Units (8 virtual cores with 2.5 EC2 Compute Units each)
- model name: Intel(R) Xeon(R) CPU E5506 @ 2.13GHz
- OS: Amazon Linux 64bit, Linux ip-10-70-206-42 2.6.35.14-97.44.amzn1.
- 5 Consumer, 5 Producer

[Charts: Producer and Consumer panels comparing Apollo and ActiveMQ]

Page 25: Introduction to message_queue

Performance: Kafka

- message size = 200 bytes
- batch size = 200 messages
- fetch size = 1MB
- flush interval = 600 messages

Page 26: Introduction to message_queue

Performance: ZeroMQ

Box 1:
- 8-core AMD Opteron 8356, 2.3GHz
- Mellanox ConnectX MT25408 in 10GbE mode
- Linux/Debian 4.0 (kernel version 2.6.24.7)
- ØMQ version 0.3.1

Box 2:
- 8-core Intel Xeon E5440, 2.83GHz
- Mellanox ConnectX MT25408 in 10GbE mode
- Linux/Debian 4.0 (kernel version 2.6.24.7)
- ØMQ version 0.3.1

Throughput reaches a maximum of 2.8 million messages per second for messages 8 bytes long.

Page 27: Introduction to message_queue

Performance

Page 28: Introduction to message_queue

Performance

- Persistence: with persistent messages, throughput can fall to hundreds of messages per second.
- Bandwidth: message size and acknowledgements increase bandwidth usage.
- Topics: routing based on header values increases the delay.
- Queue: the number of queues increases the delay.
- Cluster: message replication increases the time needed for the acknowledgement.

Page 29: Introduction to message_queue

Big Data

Page 30: Introduction to message_queue

Big Data

Big data spans three dimensions:

- Volume: enterprises are awash with ever-growing data of all types, easily amassing terabytes, even petabytes, of information.
- Velocity: sometimes 2 minutes is too late. For time-sensitive processes such as catching fraud, big data must be used as it streams into your enterprise in order to maximize its value.
- Variety: big data is any type of data, structured and unstructured, such as text, sensor data, audio, video, click streams, log files and more.

Page 31: Introduction to message_queue

Big Data

Big Data dimension and the matching Message Queue feature:

- Volume: Load Balancing
  - with multi-broker configuration
  - with multi-queue configuration
- Velocity: Parallel Processing, High Availability
  - Balance based on time spent
  - Increase capacity on demand
- Variety: Routing Key, Path
  - Header analysis
  - Topic

Page 32: Introduction to message_queue

Big Data: LinkedIn

- End user action tracking
- Operational metrics
- Frontend (Producer) = 100Mb/s
- Backend (Consumer) = 200Mb/s

[Diagram labels: Main Datacenter, Analysis Datacenter, Load Balancer, FD (x3), Brokers, Realtime, Hadoop, DWH]

http://incubator.apache.org/kafka/design.html
http://research.microsoft.com/en-us/um/people/srikanth/netdb11/netdb11papers/netdb11-final12.pdf

Page 33: Introduction to message_queue

Big Data: Soocial.com

- Synchronization between different applications
- Collect tracking data

http://aws.typepad.com/aws/2008/12/

Page 34: Introduction to message_queue

Big Data: CERN Architecture

Component drilldown

- Monitor reliability and availability of the European distributed computing infrastructure (GRID)
- 100K systems monitored

fusesource.com/collateral/download/82/

Page 35: Introduction to message_queue

Big Data: OpenStack

- OpenStack: coordination and provisioning
- RackSpace: cloud management tasks

Page 36: Introduction to message_queue

Cloud Computing: Netflix

Transcoding: media conversion for different devices and channels

Page 37: Introduction to message_queue

Cloud Computing

“Everything fails all the time”
Werner Vogels, CTO of Amazon

Page 38: Introduction to message_queue

Cloud Computing: SLA

SLA
- Amazon’s EC2 availability SLA is 99.95% = 4.38 hours = 15768 sec of downtime per year

RTO
- Restarting time = 5 minutes = 300 sec

Event
- About 52 reboots per year (15768 / 300)

No one declares the MTBF!
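The downtime budget implied by an availability percentage can be recomputed as a quick check. This is a minimal sketch, assuming a 365-day year; the function names are illustrative.

```python
HOURS_PER_YEAR = 365 * 24  # assumption: non-leap year


def downtime_budget_seconds(availability_percent):
    """Seconds per year a service may be down under the given SLA."""
    unavailable_fraction = 1 - availability_percent / 100.0
    return unavailable_fraction * HOURS_PER_YEAR * 3600


def restarts_within_budget(availability_percent, restart_seconds):
    """Whole restarts of the given duration that fit in the yearly budget."""
    return int(downtime_budget_seconds(availability_percent) // restart_seconds)


if __name__ == "__main__":
    budget = downtime_budget_seconds(99.95)
    print(round(budget), "seconds =", round(budget / 3600, 2), "hours")
    print(restarts_within_budget(99.95, 300), "restarts of 5 minutes")
```

For 99.95% and a 300-second restart this works out to roughly 15768 seconds (4.38 hours) and 52 restarts per year.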

Page 39: Introduction to message_queue

Cloud Computing: Design

Design your application architecture for failure. Don’t look for alternatives.

"…split your applications into different components and make sure every component of your application has redundancy with no common points of failure…"

Page 40: Introduction to message_queue

Cloud Computing: Architecture

[Diagram labels: Load Balancer; FD (x3); MOM with two Brokers; BL (x3); Auto Scaling (x2); cache; Back End / Storage]

Page 41: Introduction to message_queue

Cloud Computing: Multi site

[Diagram labels: BROKERS, WAN MOM; SITE A, SITE B and SITE C, each with a LOCAL MOM and a W QUEUE]

Page 42: Introduction to message_queue

Cloud Computing: Design

High Availability!

Page 43: Introduction to message_queue

Cloud Computing: Scalability / HA

- Master-Slave topology: a queue is assigned to a master node, and all changes to the queue are also replicated to a slave node. If the master fails, the slave can take over (e.g. Qpid, ActiveMQ, RabbitMQ).
- Queue Distribution: queues are created and live in a single node, and all nodes know about all the queues in the system. When a node receives a request for a queue that is not available on the current node, it routes the request to the node that has the queue (e.g. RabbitMQ).
- Cluster Connections: clients may define cluster connections giving a list of broker nodes, and messages are distributed across those nodes based on a defined policy (e.g. Fault Tolerance Policy, Load Balancing Policy). It also supports message redistribution, but it plays a minor role in this setup.
- Broker networks: the brokers are arranged in a topology, and subscriptions are propagated through the topology until messages reach a subscriber. Usually, this uses Consumer priority mode, where brokers that are close to the point of origin are more likely to receive the messages. The challenge is how to load balance those messages (e.g. ActiveMQ).
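The cluster-connection idea (a client holds a list of broker nodes and picks one according to a policy) can be sketched as below. The `ClusterConnection` class, the policy names and the `connect` callable are hypothetical, invented for this illustration; real clients expose their own failover options.

```python
class ClusterConnection:
    """Hypothetical client-side helper; not the API of any real broker."""

    def __init__(self, nodes, connect, policy="failover"):
        if policy not in ("failover", "round_robin"):
            raise ValueError("unknown policy: %s" % policy)
        self.nodes = list(nodes)
        self.connect = connect  # callable(node) -> connection, may raise ConnectionError
        self.policy = policy
        self._next = 0

    def _candidates(self):
        """Order in which nodes are tried for the next connection."""
        if self.policy == "failover":
            return list(self.nodes)  # always prefer the first node
        start = self._next % len(self.nodes)
        self._next += 1
        return self.nodes[start:] + self.nodes[:start]  # rotate the start

    def open(self):
        """Return (node, connection) for the first reachable node."""
        for node in self._candidates():
            try:
                return node, self.connect(node)
            except ConnectionError:
                continue  # try the next node in the list
        raise ConnectionError("no broker node reachable")
```

With the fault-tolerance policy the client sticks to the first node and only moves on when it fails; with the load-balancing policy successive connections are spread across the list.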

Page 44: Introduction to message_queue

Cloud Computing: Cluster

[Diagram labels: MOM; Cluster of three Nodes, each with its own Storage; Inter node Communication; Internal Message Routing; three Clients]

Lookup:
- Defined IP
- Multicast
- BootStrap
- Agent

Page 45: Introduction to message_queue

Cloud Computing: Cluster

Configuration:
- Two clusters with one node
- Single cluster with two nodes

RabbitMQ: http://skillsmatter.com/custom/presentations/talk4.rabbitmq_internals.pdf

Page 46: Introduction to message_queue

Cloud Computing: Federation/Shovel

RabbitMQ: http://skillsmatter.com/custom/presentations/talk4.rabbitmq_internals.pdf

Page 47: Introduction to message_queue

Are you happy?

Page 48: Introduction to message_queue

Critical Points

- Dimension
  - Message Size
  - Number of Queues
  - Persistence
  - Delay of the queue
- Persistence only when you need it
- Cluster on client side or via bootstrap
- Acknowledge only when you need it
- Topic vs Queue
- Queue Length
- Performance Test

E = (P/C - 1) * T
L = (P - C) * T

Page 49: Introduction to message_queue

More complex

ρ = λ / µ

- Exponential probability density
- All customers have the same value
- Any arbitrary probability distribution

- Processing Delay
- Transmission Delay
- Propagation Delay
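The utilization ρ = λ / µ can be turned into concrete numbers under one extra assumption. This sketch assumes an M/M/1 queue (exponential arrivals and service times, a single server), which the slide does not state; the function and key names are illustrative.

```python
def mm1_metrics(arrival_rate, service_rate):
    """Steady-state figures for an M/M/1 queue (assumed model).

    arrival_rate is lambda and service_rate is mu, both in messages per second.
    """
    if arrival_rate >= service_rate:
        raise ValueError("unstable queue: arrival rate must be below service rate")
    rho = arrival_rate / service_rate
    return {
        "utilization": rho,
        "mean_in_system": rho / (1 - rho),
        "mean_time_in_system": 1 / (service_rate - arrival_rate),
    }


if __name__ == "__main__":
    print(mm1_metrics(80.0, 100.0))
```

As ρ approaches 1 the mean number of messages in the system grows without bound, which is why a consumer running close to full capacity builds long queues.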

Page 50: Introduction to message_queue

Last but not ..

- Message Queue Performance: http://hiramchirino.com/stomp-benchmark/
- Kafka: http://research.microsoft.com/en-us/um/people/srikanth/netdb11/netdb11papers/netdb11-final12.pdf
- RestMQ: http://restmq.com/
- ActiveMQ: http://www.activemq.org
- RabbitMQ: http://www.rabbitmq.com
- ZeroMQ: https://zeromq.org
- Cloud application:
  - http://www.eecs.qmul.ac.uk/~luca/pi_meets_industry/slides/Richardson.pdf
  - http://www.slideshare.net/gojkoadzic/achieving-scale-with-messaging-and-the-cloud
  - http://www.slideshare.net/AmazonWebServices/highly-available-websites-in-aws

Many Vendor Products

- Red Hat: key sponsor of Qpid
- Red Hat acquired FuseSource
- VMware acquired RabbitMQ
- Message Queue as a Service (Cloud)

Page 51: Introduction to message_queue

Conclusion

The science of programming:

“…make building blocks that people can understand and use easily, and people will work together to solve the very largest problems.”

Page 52: Introduction to message_queue

I look forward to meeting you…

XVII European AFS meeting 2012
University of Edinburgh
October 16th to 18th

Who should attend:
- Everyone interested in deploying a globally accessible file system
- Everyone interested in learning more about real world usage of Kerberos authentication in single realm and federated single sign-on environments
- Everyone who wants to share their knowledge and experience with other members of the AFS and Kerberos communities
- Everyone who wants to find out the latest developments affecting AFS and Kerberos

More Info: http://openafs2012.inf.ed.ac.uk/

Page 53: Introduction to message_queue

Thank you
[email protected]
http://www.beolink.org

Page 54: Introduction to message_queue

Message Queue Length

Which is the right size?

…30% extra capacity?

Page 55: Introduction to message_queue

Basic Formula

E = (P/C - 1) * T

- E: Elapsed
- P: Peak
- C: Capacity
- T: Burst Time

Page 56: Introduction to message_queue

Basic Formula

L = (P - C) * T

- L: Length
- P: Peak
- C: Capacity
- T: Burst Time
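The two basic formulas can be evaluated together for a sizing exercise. This is a minimal sketch; it reads E as the extra time needed to drain the backlog once the burst ends, which is an interpretation (the slide only labels it "Elapsed"), and the clamping to zero is an addition of this example.

```python
def backlog_length(peak_rate, capacity, burst_time):
    """L = (P - C) * T: messages queued at the end of a burst."""
    # Clamped at zero (assumption): no backlog if capacity covers the peak
    return max(0.0, (peak_rate - capacity) * burst_time)


def drain_time(peak_rate, capacity, burst_time):
    """E = (P / C - 1) * T: time to work off the backlog at rate C."""
    return max(0.0, (peak_rate / capacity - 1) * burst_time)


if __name__ == "__main__":
    # Hypothetical figures: 1500 msg/s peak, 1000 msg/s capacity, 60 s burst
    print(backlog_length(1500, 1000, 60), "messages queued")
    print(drain_time(1500, 1000, 60), "seconds to drain")
```

The two results are linked by E = L / C, so a queue sized for L messages also bounds the worst-case extra delay.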

Page 57: Introduction to message_queue

Cloud Computing