Wissbi osdc pdf

Post on 18-Dec-2014










Wissbi: A toolset for distributed event processing


Batch Processing


Streaming Processing

Successful Stories

• "Wissbi is an easy-to-use tool for distributed event processing" -- Scott Wang, Zillians (奇群科技)

• "Wissbi provides great utilities for logging, debugging, and monitoring your distributed system" -- an engineer surnamed Wang, formerly of Trend Micro

• "Wissbi lets you easily manage your data workflow in a pipe and filter style with intuitive commands" -- lunastorm, Open Source Developer

Wissbi: Past and Present

• Wissbi is a highly available and scalable distributed message routing framework like ZeroMQ that took a different path from traditional MQ middlewares.

• The toolkit supports an elegant and intuitive integration model, making it as easy to write event-driven applications as "$ tail -f log | grep error > error.log".

• Scott Wang a.k.a. lunastorm, Sr. Engineer at Zillians

• Trend Message Exchange (TME) is a highly available and scalable distributed message routing framework that took a different path from traditional MQ middlewares.

• TME's client toolkit, MIST, supports an elegant and intuitive integration model, making it as easy to write event-driven applications as "$ tail -f log | grep error > error.log".

• Scott Wang a.k.a. lunastorm, Sr. Engineer at Trend Micro


I need a messaging system


Deployment Guide

System Requirement

• A working ZooKeeper deployment; a version newer than 3.3.2 is required

• You have set your hostname and hostname resolution correctly on each machine. Running hostname -i should return an IP other than

• Create a user account TME on each machine, for example, execute useradd -m TME

Standalone Deployment

Standalone means that all of the TME components (including ZooKeeper) are deployed on the same host. This is mostly for development purposes. Because all default configurations work in standalone mode, you only have to install the packages and bring up the daemons to test and develop.


You will need the following 3rd party packages which are not in the official repository:

1. jdk (get the RPM from http://www.oracle.com/technetwork/java/javase/downloads/index.html)
2. monit (http://pkgs.org/search/?keyword=monit)
3. nodejs (for portal, http://pkgs.org/search/?keyword=nodejs)
4. ruby (for portal, https://github.com/lunastorm/ruby19_centos/downloads; it has to be at least 1.9.2, and you can build it from https://github.com/imeyer/ruby-1.9.2-rpm)
5. ruby-bundler (for portal, https://github.com/lunastorm/ruby19_centos/downloads)

Then you can follow the following steps to install:

1. Install Sun's JDK first
2. Download the dependency RPMs mentioned above
3. Download the TME RPM binaries and place them in the same folder as the dependencies
4. yum --nogpgcheck install *.rpm


1. Grab all the deb files you would like to install
2. sudo dpkg -i tme-*.deb
3. sudo apt-get update
4. sudo apt-get -f install

TME web portal only supports Ruby 1.9.2+, and Ubuntu 10.04 only ships Ruby 1.9.1

You have to follow these steps to use RVM to install Ruby 1.9.2:

1. aptitude install build-essential libssl-dev libreadline5 libreadline5-dev zlib1g zlib1g-dev
2. bash -s stable < <(curl -s https://raw.github.com/wayneeseguin/rvm/master/binscripts/rvm-installer)
3. source /etc/profile.d/rvm.sh
4. rvm install 1.9.2 ; rvm default 1.9.2
5. Edit /opt/trend/tme/conf/portal-web/portal-web-conf.sh , add "source /etc/profile.d/rvm.sh"

The web portal requires a JavaScript runtime installed. For example, you can install Node.js on the machine.

On All Distributions

Finally, execute /opt/trend/tme/bin/create_zookeeper_nodes.sh /tme2 to initialize the essential information on ZooKeeper before you start the components.

Distributed Deployment

You can choose to deploy different components on different machines of different hardware specs. Typically, the brokers are required to be deployed on more powerful machines than others. They will handle a large amount of client connections and deliver messages so they consume more memory and use more CPU. On the other hand, the clients need not use too much computing power, but it depends on the applications' needs. The administration packages can be deployed on multiple machines for redundancy.

1. First, you will probably have a distributed ZooKeeper set up, for example, zk1.mydomain:2181,zk2.mydomain:2181,zk3.mydomain:2181

2. Follow the same way described in the standalone guide above to install the packages on the machines of your choice.

3. Initialize the essential information on ZooKeeper. Execute:

/opt/trend/tme/bin/create_zookeeper_nodes.sh zk1.mydomain:2181,zk2.mydomain:2181,zk3.mydomain:2181 /tme_root_prefix

The first argument is the ZooKeeper quorum, and the second argument is a prefix path on ZooKeeper chosen by you. By separating the prefix path on ZooKeeper, we can have multiple TME environments coexist in one ZooKeeper deployment. Remember these two

Service Start / Stop

You can choose one of the following ways to start or stop the components.

service and chkconfig

1. sudo service tme-broker {start / stop / restart}
2. sudo service tme-mistd {start / stop / restart}
3. sudo service tme-graph-editor {start / stop / restart}
4. sudo service tme-portal-collector {start / stop / restart}
5. sudo service tme-portal-web {start / stop / restart}

If you wish to start the services upon boot, then you can use chkconfig to turn on the services.
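As a rough sketch of the chkconfig step, the loop below prints one command per service listed above. It is a dry run that only echoes the commands; the helper name enable_cmds and the variable TME_SERVICES are made up for this example.

```shell
#!/bin/sh
# Dry-run sketch: print the command that would enable each TME service at boot.
# Service names come from the list above; enable_cmds is a hypothetical helper.
TME_SERVICES="tme-broker tme-mistd tme-graph-editor tme-portal-collector tme-portal-web"

enable_cmds() {
    for svc in $TME_SERVICES; do
        # Drop the leading "echo" to actually run the command.
        echo sudo chkconfig "$svc" on
    done
}

enable_cmds
```

Printing first makes it easy to review the commands, or to trim the list on machines that host only some of the components.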


If you have installed and enabled monit, then you can use it to ensure the services are running.

Under the configuration folders of the components, there are monit watchdog scripts that can be modified to fulfill your need, for example, send a notification when a daemon stops working.

You can use the helper scripts to start / stop the components:

1. sudo /opt/trend/tme/bin/{install|remove}_tme-broker.sh
2. sudo /opt/trend/tme/bin/{install|remove}_tme-mistd.sh
3. sudo /opt/trend/tme/bin/{install|remove}_tme-graph-editor.sh
4. sudo /opt/trend/tme/bin/{install|remove}_tme-portal-collector.sh
5. sudo /opt/trend/tme/bin/{install|remove}_tme-portal-web.sh



After the MIST daemon is started and configured correctly, execute mist-session --list to show session information:

$ mist-session -l
0 sessions 0 connections

You should get a response like the one above. If the MIST daemon is not started correctly, you may receive the following error response:

$ mist-session -l
Error connecting to MIST daemon!

If the MIST daemon is running correctly, then you can send your first Hello World message. Execute the following to send a message to a queue named test:

$ session_id=`mist-session` && echo 'Hello World!' | mist-encode --wrap test --line | mist-sink $session_id --attach ; mist-session --destroy $session_id
destroyed 1453444792

Then execute the following to receive one message from the queue named test:

$ session_id=`mist-session` && mist-source $session_id --mount test && mist-source $session_id --attach --limit 1 | mist-decode --line ; mist-session --destroy $session_id
exchange queue:test mounted
Hello World!
destroyed 1453444796

Congratulations! You can now transmit messages.


Open a browser to access http://**portal.host**:**portal.port** to see if it shows correctly.

Graph Editor

I don’t always read the deployment guide

When I screw up something

I call the developers

Rethink Your Problem



Do I really need a broker?

Do I really need synchronization?

What Do I Need?

• A message publisher

• A message subscriber

• A directory service for publishers to know where the subscribers are

[Diagram: the directory service holds a table of names (Foo, Bar) and their addresses. A publisher gets all subscribers from the directory service, then sends the message.]

Directory Services

• DNS

• First attempt


• $&@?!&@!?

• ...

What if I use a directory for the directory service?

Using Filesystem for Directory Service

• Store metadata on the filesystem

• Follows the philosophy "Everything is a file"

• Use standard Unix commands to manage it

• mkdir, touch, rm, ln, ...

• Just like /proc and /sys

Using Filesystem for Directory Service

• You get authorization for free

• chown, chmod, ...

• Good for testing

• Launch any number of clusters just by using different metadata directories
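A minimal sketch of the idea, using only standard Unix commands. The layout <root>/<name>/<host:port> and the helper names are assumptions invented for illustration; the slides do not show how Wissbi actually lays out its metadata.

```shell
#!/bin/sh
# Sketch of a filesystem-backed directory service.
# ASSUMPTION: the layout <root>/<name>/<host:port> is hypothetical,
# not necessarily how Wissbi stores its metadata.

# Register a subscriber address under a destination name.
register_sub() {    # usage: register_sub ROOT NAME ADDRESS
    mkdir -p "$1/$2" && touch "$1/$2/$3"
}

# Remove a subscriber registration.
unregister_sub() {  # usage: unregister_sub ROOT NAME ADDRESS
    rm -f "$1/$2/$3"
}

# List every subscriber address registered for a name, one per line.
list_subs() {       # usage: list_subs ROOT NAME
    if [ -d "$1/$2" ]; then
        ls "$1/$2"
    fi
}

# Each cluster is just a different metadata directory.
root=$(mktemp -d)
register_sub "$root" foo 10.0.0.1:5000
register_sub "$root" foo 10.0.0.2:5000
register_sub "$root" bar 10.0.0.3:6000
list_subs "$root" foo       # a publisher would send to each address listed
unregister_sub "$root" foo 10.0.0.1:5000
chmod 750 "$root/bar"       # authorization is plain file permissions
rm -rf "$root"
```

Because the registry is an ordinary directory tree, a test cluster only needs its own temporary root, and ls, find, or chmod are the whole admin interface.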


[Diagram: the directory service with a single entry, Foo]




A management approach even your grandma can use... if your grandma knows Unix.

Related Work

Plumber (Program)

• The plumber, in the Plan 9 from Bell Labs and Inferno operating systems, is a mechanism for reliable uni- or multicast inter-process communication of formatted textual messages. It uses the Plan 9 network file protocol, 9P, rather than a special-purpose IPC mechanism.

• Any number of clients may listen on a named port (a file) for messages. Ports and port routing are defined by plumbing rules. These rules are dynamic. Each listening program receives a copy of matching messages. For example, if the data /sys/lib/plumb/basic is plumbed with the standard rules, it is sent to the edit port. The port will write a copy of the message to each listener. In this case, all running editors will interpret this message as a file name, and open the file.

• The plumber is the 9P file server that provides this service. Clients may use libplumb to format messages. Since the messages are 9P, they are network transparent.


Design Philosophy

Minimum Dependency

$ ldd wissbi-pub
linux-vdso.so.1
libpthread.so.0
libstdc++.so.6
libm.so.6
libgcc_s.so.1
libc.so.6
/lib64/ld-linux-x86-64.so.2

The only dependency is a compiler which supports C++11!

Easy to Test


Basic Commands

• wissbi-sub [message source]

• wissbi-pub [message destination]

Live Demo

http://ascii.io/a/2913


Write your filter in any language you like
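A filter in this pipe-and-filter style is any program that reads lines on stdin and writes lines to stdout. The sketch below is one possible shell filter; the name shout_filter is made up, and the pipeline in the final comment assumes that wissbi-sub writes messages to stdout and wissbi-pub reads them from stdin, with placeholder source and destination names.

```shell
#!/bin/sh
# A filter only needs to read lines from stdin and write lines to stdout.
# shout_filter is a hypothetical example that upper-cases every message.
shout_filter() {
    # Handle one line at a time so each message is emitted as soon as it arrives.
    while IFS= read -r line; do
        printf '%s\n' "$line" | tr '[:lower:]' '[:upper:]'
    done
}

# Try it locally without any messaging system.
printf 'hello\nworld\n' | shout_filter

# Assumed usage inside a pipeline (names are placeholders):
#   wissbi-sub some.source | shout_filter | wissbi-pub some.destination
```

The same contract holds for a filter written in Python, Ruby, or C, which is what makes the filter language a free choice.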



















You Will Have Multiple Data Pipelines



















Daemonize Your Filters!

• Programming with config

• Daemon start / stop / restart / status

• start / stop upon boot / shutdown

• Watchdogs
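The start / stop / status operations can be sketched with a plain pidfile. This is a generic pattern shown for illustration, not Wissbi's actual daemon template; PIDFILE, FILTER_CMD, and the function names are hypothetical, and sleep stands in for a real filter.

```shell
#!/bin/sh
# Sketch of pidfile-based start / stop / status for a long-running filter.
# ASSUMPTION: generic pattern, not Wissbi's actual daemon template.
PIDFILE="${PIDFILE:-/tmp/filter-sketch.pid}"
FILTER_CMD="${FILTER_CMD:-sleep 30}"   # stand-in for a real filter pipeline

# True if the pidfile exists and the recorded process is still alive.
is_running() {
    [ -f "$PIDFILE" ] && kill -0 "$(cat "$PIDFILE")" 2>/dev/null
}

start() {
    if is_running; then
        echo "already running"
        return 0
    fi
    $FILTER_CMD >/dev/null 2>&1 &      # run the filter in the background
    echo $! > "$PIDFILE"               # remember its process id
    echo "started"
}

stop() {
    if is_running; then
        kill "$(cat "$PIDFILE")"
    fi
    rm -f "$PIDFILE"
    echo "stopped"
}

status() {
    if is_running; then echo "running"; else echo "not running"; fi
}

start; status; stop; status
```

A watchdog can be as small as a periodic call to status that runs start again when the answer is "not running".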

<?xml version="1.0"?>
<!DOCTYPE Configure PUBLIC "-//Mort Bay Consulting//DTD Configure//EN" "http://jetty.mortbay.org/configure.dtd">

<!-- =============================================================== -->
<!-- Configure the Jetty Server -->
<!-- -->
<!-- Documentation of this file format can be found at: -->
<!-- http://docs.codehaus.org/display/JETTY/jetty.xml -->
<!-- -->
<!-- =============================================================== -->

<Configure id="Server" class="org.mortbay.jetty.Server">

  <!-- =========================================================== -->
  <!-- Server Thread Pool -->
  <!-- =========================================================== -->
  <Set name="ThreadPool">
    <!-- Default bounded blocking threadpool -->
    <New class="org.mortbay.thread.BoundedThreadPool">
      <Set name="minThreads">10</Set>
      <Set name="maxThreads">50</Set>
      <Set name="lowThreads">25</Set>
    </New>

    <!-- New queued blocking threadpool : better scalability
    <New class="org.mortbay.thread.QueuedThreadPool">
      <Set name="minThreads">10</Set>
      <Set name="maxThreads">25</Set>
      <Set name="lowThreads">5</Set>
      <Set name="SpawnOrShrinkAt">2</Set>
    </New>
    -->

Programming with Config: The Java Edition

# How many filter instances will be run in parallel
WISSBI_FILTER_COUNT="1"

# How to run the filter
WISSBI_FILTER_CMD="sed --unbuffered -e \"s/^/[ / ; s/$/ ]/\""

# If you run multiple instances in parallel, you can use the instance id \$i in the command
# WISSBI_FILTER_CMD="sed --unbuffered -e \"s/^/\$i: [ / ; s/$/ ]/\""

# The message source's name, leave it empty if the filter is a message generator
WISSBI_FILTER_SOURCE="test.in"

# The message sink's name, leave it empty if the filter is a message terminal
WISSBI_FILTER_SINK="test.out"

WISSBI_FILTER_LOG_PREFIX="/tmp/filter-example"
WISSBI_FILTER_PID_PREFIX="/tmp/filter-example"

# If WISSBI_DEBUG_DUMP is set, message recording will be enabled, and 50 messages
# before the filter is terminated is dumped to the specified file.
# If WISSBI_DEBUG_DUMP is set to empty, then a random dump filename will be used
#WISSBI_DEBUG_DUMP=""

wissbi filter example
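To see what the example filter command does to each message, the sed expression from the config can be run on its own, without Wissbi. The function names below are made up, and passing the instance id as an argument is an assumption, since the slides do not show how the template sets $i.

```shell
#!/bin/sh
# What the example WISSBI_FILTER_CMD does to each message line, run locally.
# The config passes --unbuffered (GNU sed) so each line is flushed immediately;
# it is omitted here because the input is finite.
bracket_filter() {
    sed -e 's/^/[ / ; s/$/ ]/'
}

# Variant with an instance id, like the commented-out line in the config.
# ASSUMPTION: the id is passed as $1 here; the real template's mechanism
# for setting $i is not shown in the source.
bracket_filter_with_id() {
    sed -e "s/^/$1: [ / ; s/\$/ ]/"
}

printf 'hello\nworld\n' | bracket_filter
printf 'hello\n' | bracket_filter_with_id 0
```

With this config, messages read from test.in would reach test.out wrapped in square brackets.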

The easiest way to build a distributed pipeline

Live Demo: A Calculator Service

http://ascii.io/a/2914

The Wissbi Ecosystem

Log Collector

#!/bin/sh
WISSBI_FILTER_COUNT="1"
if [ -e "./wissbi_log_collector.sh" ]
then
    WISSBI_FILTER_CMD="./wissbi_log_collector.sh"
else
    WISSBI_FILTER_CMD="/usr/bin/wissbi_log_collector.sh"
fi
WISSBI_FILTER_SOURCE=""
WISSBI_FILTER_SINK="wissbi.log"
WISSBI_FILTER_LOG_PREFIX="/tmp/wsblogcollectord"
WISSBI_FILTER_PID_PREFIX="/tmp/wsblogcollectord"
. /usr/bin/wissbi_filter_template.sh



[Diagram: several hosts each run "tail -F | collector | wissbi-pub"; the collected logs go to solr / splunk / hadoop /]


Metric Collector

• Ganglia Integration

• Write message count into rrd files

• Also uses wissbi as the transporting tool

• Demo if we have time...

http://aws.amazon.com

Very useful metric to scale your server farm

Current Limitation

• Message size < 4k

• One listening port is occupied per consumer

Platform Support

• Ubuntu 12.04 with gcc 4.6.3

• BSD branch with clang++ on the way...

• OSX (waiting on the one above)

• Windows

• Wat?

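Conclusion

• The poor man's abalone -- 九孔 (small abalone)

• The poor man's bird's nest -- 白木耳 (white fungus)

• The poor man's atomic bomb -- biochemical weapons

• The poor man's Message Bus -- Wissbi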

• Minimize operating effort and cost

• Maximize workflow flexibility

                TME               Wissbi
Spokesperson    無發哥, 伍佰       Labor hero
Language        丁ava             Energy-saving, carbon-reducing C++
Operation       Complex           Simple
Version         2.5               A humble 0.1


Thank You
