Beanstalk, Pheanstalk and Priorities

I've got an application that uses Beanstalkd to queue up messages, and some PHP worker scripts that grab messages from the queue and process them. Messages get added by the web application, but can also be added by cron - and when I add a bunch of messages via cron, I don't want to swamp what the web application is doing! Those cron-added jobs are mostly pretty low priority, generating reports, sending weekly update emails, that kind of thing. Beanstalkd has a concept of priority, so I can create lower priority jobs by using code like this:

<?php

define("LOW_PRIORITY", 2048);  // default is 1024 

$queue =  new Pheanstalk_Pheanstalk($config['beanstalkd']['host'] . ":" . $config['beanstalkd']['port']);

$queue->useTube("scorem")->put(json_encode(array("action" => "my_important_task")), LOW_PRIORITY);

This will add the job to the queue, but anything with a higher priority value (where 1 is the highest priority!) will take precendence. This way I can add as many non-urgent jobs as I want to to the queue without impacting my website performance. My setup has multiple workers and also I tend to write a script that puts loads of tiny jobs on the queue rather than putting one monster task on there. I find this approach a bit more fault-tolerant and also means that incoming tasks can get a chance to get serviced rather than waiting for some crazy huge thing to finish.

I had real issues finding information about the priority settings for beanstalkd and PHP, so hopefully if anyone is looking for it, they will find this post :)

Working with PHP and Beanstalkd

I have just introduced Beanstalkd into my current PHP project; it was super-easy so I thought I'd share some examples and my thoughts on how a job queue fits in with a PHP web application.

The Scenario

I have an API backend and a web frontend on this project (there may be apps later. It's a startup, there could be anything later). Both front and back ends are PHP Slim Framework applications, and there's a sort of JSON-RPC going on in between the two.

The job queue will handle a few things we don't want to do in real time on the application, such as:

  • updating counts of things like comments; when a comment is made, a job gets created and we can return to the user. At some point the job will get processed updating the counts of how many comments are on that thing, how many comments the user made, adding to a news feed of activities ... you get the idea. Continue reading

Getting Started with Beanstalkd

I had a good experience implementing beanstalkd for the first time last week, so I thought I'd write down how that went and what I learned.

Why Beanstalkd?

My requirements were simply to add both asynchronous (for processing things like recalculating counts) and periodic (mostly for garbage collection) tasks to a PHP application. The application has a separate web application and backend API, both made of PHP's Slim framework, and the API talks to MySQL. It's all very lightweight and scalable, and I was looking for something to fit in with what we have, with good PHP support.

Enter beanstalkd, it's a super-simple job queue and has great PHP support in the shape of Pheanstalk (I'm saving my PHP + beanstalkd examples for another day because this post would get too long to read otherwise!). I've used gearman in the past but beanstalkd seemed lighter, and when I started looking at their documentation I discovered that I had a working installation in about the time it would take me to fall off a log - which is always a good indicator of a tool that will be fun to work with :) Continue reading