Gearman: 'The manager', "since it dispatches jobs to be done, but does not do anything useful itself."
Jan 15, 2015
Presentation based on information from:
http://www.slideshare.net/pcdinh/gearman-and-asynchronous-processing-in-php-applications-6135047
http://assets.en.oreilly.com/1/event/45/The%20Gearman%20Cookbook%20Presentation.pdf
http://www.gearman.org
http://nz.php.net/manual/en/book.gearman.php
others...
Scalable Solutions
More Hardware
Caching
Precalculated Data
Load Balancing
Multi-tier application
Job Queue
History
Created by Danga Interactive.
The same company that developed Memcached.
Original implementation in Perl (2005).
Rewritten in C in 2008 by Brian Aker.
PHP extension by James Luedke.
Used by
Digg: 45+ Servers, 400K jobs/day
Yahoo: 60+ servers, 6M jobs/day
And many others.
Installing
Compiling:
$ tar xzf gearmand-X.Y.tar.gz
$ cd gearmand-X.Y
$ ./configure
$ make
$ make install
Starting server:
$ gearmand -d
PECL extension:
$ tar xzf gearman-X.Y.tgz
$ cd gearman-X.Y
$ phpize
$ ./configure
$ make
$ make install
Add to php.ini:
extension="gearman.so"
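After restarting PHP, a quick check can confirm that the extension is visible. This is a minimal sketch: gearmanStatus() is a hypothetical helper name, and the only extension call used is gearman_version().

```php
<?php
// Report whether the gearman extension is loaded, and its version if so.
// gearmanStatus() is a hypothetical helper name used for illustration.
function gearmanStatus(): string
{
    if (!extension_loaded('gearman')) {
        return 'gearman extension not loaded; check php.ini';
    }
    // gearman_version() is provided by the PECL extension.
    return 'gearman extension loaded, version ' . gearman_version();
}

echo gearmanStatus(), PHP_EOL;
```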
Terminology
Client: Creates jobs to be run and sends them to a job server.
Worker: Runs the jobs handed out by the job server.
Job Server: Manages the job queue, passing jobs from clients to workers.
“A massively distributed, massively fault tolerant fork mechanism.”
- Joe Stump, SimpleGeo
Gearman is...
Open Source.
Simple & Fast.
Multi-language.
Flexible application design.
Load Balancing.
No single point of failure.
Features
[Diagram: clients submit jobs to one or more job servers, which hand them to workers]
Memory
Memcached
MySQL/Drizzle
PostgreSQL
SQLite
Tokyo Cabinet
Queue Options
Foreground (synchronous)
or
Background (asynchronous)
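// Background (asynchronous): returns a job handle immediately.
Fishpond_Controller_Front::getResource('gearman')
    ->getGearmanClient()
    ->doBackground("updateCompetitorPrice", $this->_barcode);
// Foreground (synchronous): blocks until the worker returns the result.
Fishpond_Controller_Front::getResource('gearman')
    ->getGearmanClient()
    ->do("updateCompetitorPrice", $this->_barcode);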
GearmanClient::do() - Run a single task and return a result
GearmanClient::doLow() - Run a single low priority task
GearmanClient::doBackground() - Run a task in the background
GearmanClient::doHighBackground() - Run a high priority task in the background
GearmanClient::doLowBackground() - Run a low priority task in the background
Gearman Client
Scatter / Gather.
Map / Reduce.
Asynchronous Queues.
Pipeline Processing.
Strategies
Scatter / Gather
[Diagram: the client fans out to four workers: Price Calculation, Image Resize, Recommendations, Product Detail]
$client = Fishpond_Controller_Front::getResource('gearman')->getGearmanClient();
// Set the callback first: it only applies to tasks added afterwards
$client->setCompleteCallback(array($this, "complete"));
// Add the Gearman tasks (serialize() takes a single value, so wrap the arguments in an array)
$client->addTask("getProductDetail", $barcode);
$client->addTask("getPrice", $barcode);
$client->addTask("resizeImage", serialize(array($barcode, 100, 100)));
$client->addTask("getRecomendations", $barcode);
// Run the tasks in parallel
$client->runTasks();

/** Callback invoked when a task is complete */
public function complete($task) {
    $data = $task->data();
}
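On the worker side, the "resizeImage" workload has to be decoded again. The sketch below assumes the client sent serialize(array($barcode, $width, $height)). decodeResizeWorkload() and resizeImageHandler() are hypothetical names, and the actual resizing is replaced by a placeholder return value.

```php
<?php
// Decode the workload sent for "resizeImage".
// Assumes the client sent serialize(array($barcode, $width, $height)).
function decodeResizeWorkload(string $workload): array
{
    // allowed_classes => false stops unserialize() from instantiating objects.
    $args = @unserialize($workload, ['allowed_classes' => false]);
    if (!is_array($args) || count($args) !== 3) {
        throw new InvalidArgumentException('Expected [barcode, width, height]');
    }
    [$barcode, $width, $height] = array_values($args);
    return [
        'barcode' => (string) $barcode,
        'width'   => (int) $width,
        'height'  => (int) $height,
    ];
}

// Job handler: the real resizing is out of scope here, so it only
// returns a placeholder string describing the requested image.
function resizeImageHandler(GearmanJob $job): string
{
    $args = decodeResizeWorkload($job->workload());
    return sprintf('%s_%dx%d', $args['barcode'], $args['width'], $args['height']);
}
```

A worker would register the handler with GearmanWorker::addFunction('resizeImage', 'resizeImageHandler') and then loop on work().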
GearmanClient::addTaskHigh() - Add a high priority task to run in parallel
GearmanClient::addTaskLow() - Add a low priority task to run in parallel
GearmanClient::addTaskBackground() - Add a background task to be run in parallel
GearmanClient::addTaskHighBackground() - Add a high priority background task to be run in parallel
GearmanClient::addTaskLowBackground() - Add a low priority background task to be run in parallel
GearmanClient::runTasks() - Run a list of tasks in parallel
Task Methods
GearmanClient::setDataCallback() - Callback function when there is a data packet for a task
GearmanClient::setCompleteCallback() - Set a function to be called on task completion
GearmanClient::setCreatedCallback() - Set a callback for when a task is queued.
GearmanClient::setExceptionCallback() - Set a callback for worker exceptions.
GearmanClient::setFailCallback() - Set callback for job failure.
GearmanClient::setStatusCallback() - Set a callback for collecting task status.
GearmanClient::setWarningCallback() - Set a callback for worker warnings.
GearmanClient::setWorkloadCallback() - Set a callback for accepting incremental data updates
Client Callback
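Several callbacks can be combined to gather results and failures in one place. The following is a sketch only: TaskResults and runWithCallbacks() are hypothetical names, and it assumes each task is given a unique key so results can be told apart.

```php
<?php
// Collects per-task outcomes. TaskResults is a hypothetical helper class.
class TaskResults
{
    public $completed = [];
    public $failed = [];

    // Called for each task that finishes successfully.
    public function onComplete($task)
    {
        $this->completed[$task->unique()] = $task->data();
    }

    // Called for each task whose job failed.
    public function onFail($task)
    {
        $this->failed[] = $task->unique();
    }
}

// Wire the collector to a client and run the given tasks in parallel.
// $tasks maps a unique key to [function name, workload].
function runWithCallbacks(GearmanClient $client, array $tasks): TaskResults
{
    $results = new TaskResults();
    // Callbacks are set before the tasks are added so they apply to all of them.
    $client->setCompleteCallback([$results, 'onComplete']);
    $client->setFailCallback([$results, 'onFail']);
    foreach ($tasks as $unique => [$function, $workload]) {
        $client->addTask($function, $workload, null, (string) $unique);
    }
    $client->runTasks();
    return $results;
}
```

For example, runWithCallbacks($client, ['price' => ['getPrice', $barcode]]) would leave the price under $results->completed['price'] once the task completes.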
Concurrent tasks with different workers.
All tasks finish in the time taken by the longest-running one.
Must have enough workers available.
Scatter / Gather
Map/Reduce
[Diagram: the client submits Task T, which is split into Tasks T-0, T-1, T-2 and T-3; one subtask is split further into Tasks T-00, T-01 and T-02]
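The split and combine steps might be sketched as follows. This is an illustration under assumptions: 'countInStock' is a hypothetical worker function that receives a JSON list of barcodes and returns a count, and splitTask(), reduceResults() and mapReduce() are names made up for the example.

```php
<?php
// Map step: split the items into at most $parts chunks (the T-0..T-n subtasks).
function splitTask(array $items, int $parts): array
{
    if ($items === [] || $parts < 1) {
        return [];
    }
    return array_chunk($items, (int) ceil(count($items) / $parts));
}

// Reduce step: combine the partial counts returned by the workers.
function reduceResults(array $partials): int
{
    return array_sum($partials);
}

// Submit one task per chunk, run them in parallel, then reduce.
// 'countInStock' is a hypothetical Gearman function name.
function mapReduce(GearmanClient $client, array $items, int $parts): int
{
    $partials = [];
    $client->setCompleteCallback(function ($task) use (&$partials) {
        $partials[] = (int) $task->data();
    });
    foreach (splitTask($items, $parts) as $chunk) {
        $client->addTask('countInStock', json_encode($chunk));
    }
    $client->runTasks();
    return reduceResults($partials);
}
```

A worker can itself act as a client and call mapReduce() on its own chunk, which gives the second level of subtasks.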
Asynchronous Queues
Not everything needs immediate processing.
Competitor pricing. Emails. Whole price engine. Logging. Etc.
Example:
$gearmanClient = Fishpond_Controller_Front::getResource('gearman')->getGearmanClient();
$gearmanClient->doBackground("updateCompetitorPrice", $this->_barcode);
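The job queued above needs a worker that registers "updateCompetitorPrice". A minimal sketch, assuming the workload is the barcode string and a job server on the default 127.0.0.1:4730; the body of updateCompetitorPrice() is a placeholder, not the real price update.

```php
<?php
// Placeholder for the real price update: it only validates the barcode
// and reports what would be updated. The body is an assumption.
function updateCompetitorPrice(string $barcode): string
{
    $barcode = trim($barcode);
    if ($barcode === '') {
        throw new InvalidArgumentException('Empty barcode');
    }
    return "updated competitor price for $barcode";
}

// Register the function and process jobs until work() reports an error.
// Not called automatically; run it from a long-lived CLI script.
function runPriceWorker(): void
{
    $worker = new GearmanWorker();
    $worker->addServer(); // defaults to 127.0.0.1:4730
    $worker->addFunction('updateCompetitorPrice', function (GearmanJob $job) {
        try {
            return updateCompetitorPrice($job->workload());
        } catch (InvalidArgumentException $e) {
            $job->sendFail(); // mark the job as failed
        }
    });
    while ($worker->work()) {
        if ($worker->returnCode() !== GEARMAN_SUCCESS) {
            break;
        }
    }
}
```

Because the client used doBackground(), the return value is not delivered to the client; any result has to be stored by the worker itself.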
Logging
# Client side: each access-log line is piped to the Gearman function "logger"
<VirtualHost *:80>
    ServerName example.com
    DocumentRoot /var/www/
    CustomLog "| gearman -n -f logger" common
</VirtualHost>
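The other half of this setup is a worker that registers "logger" and writes each line it receives. A sketch, assuming one log line per job and a writable target file; appendLogLine() and runLoggerWorker() are hypothetical names.

```php
<?php
// Append one log line to $file, making sure it ends with a single newline.
function appendLogLine(string $file, string $line): void
{
    $line = rtrim($line, "\r\n") . "\n";
    // LOCK_EX keeps lines from several workers from interleaving.
    file_put_contents($file, $line, FILE_APPEND | LOCK_EX);
}

// Register "logger" and write every received workload to $file.
// Not called automatically; run it from a CLI script.
function runLoggerWorker(string $file): void
{
    $worker = new GearmanWorker();
    $worker->addServer(); // defaults to 127.0.0.1:4730
    $worker->addFunction('logger', function (GearmanJob $job) use ($file) {
        appendLogLine($file, $job->workload());
    });
    while ($worker->work());
}
```

With several web servers pointing at the same job server, one such worker collects all access logs in a single file.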
Pipeline Processing
[Diagram: the client submits Task T, which passes through Worker Operation 1, Worker Operation 2 and Worker Operation 3 to produce the output]
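One way to chain the operations is for each worker to queue its result as a background job for the next stage. This is a sketch under assumptions: the stage names are hypothetical Gearman function names, each operation maps a string to a string, and the last stage is responsible for storing the output.

```php
<?php
// Return the stage that follows $stage in $stages, or null if it is the last.
function nextStage(array $stages, string $stage): ?string
{
    $i = array_search($stage, $stages, true);
    if ($i === false) {
        throw new InvalidArgumentException("Unknown stage: $stage");
    }
    return $stages[$i + 1] ?? null;
}

// Build a job handler for one stage: apply $operation, then hand the
// result to the next stage as a background job.
function stageHandler(array $stages, string $stage, callable $operation, GearmanClient $client): callable
{
    return function (GearmanJob $job) use ($stages, $stage, $operation, $client) {
        $result = $operation($job->workload());
        $next = nextStage($stages, $stage);
        if ($next !== null) {
            $client->doBackground($next, $result); // queue for the next worker
        }
        return $result;
    };
}
```

Each worker would register its own stage, for example $worker->addFunction('operation2', stageHandler($stages, 'operation2', 'strtoupper', $client)).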
Questions?