wiki.techinc.nl/includes/job
2012-12-20 14:00:39 +00:00
..
jobs [JobQueue] Cleaned up DuplicateJob factory function. 2012-12-17 17:11:15 -08:00
Job.php [JobQueue] Improved refreshLinks/htmlCacheUpdate job de-duplication. 2012-11-28 09:29:41 +00:00
JobQueue.php Merge "[JobQueue] Pushed stats down to job queue subclasses." 2012-12-11 08:20:51 +00:00
JobQueueDB.php Fixed comment typo. 2012-12-19 16:00:29 -08:00
JobQueueGroup.php Add numerous missing @throws to method documentation 2012-12-09 03:09:48 +00:00
README [JobQueue] README file for job queue classes. 2012-10-24 11:59:45 -07:00

/*!
\ingroup JobQueue
\page jobqueue_design Job queue design

Notes on the Job queuing system architecture.

\section intro Introduction

The data model consists of the following main components:

* The Job object represents a particular deferred task that happens in the
  background. All jobs subclass the Job object and put the main logic in the
  function called run().
* The JobQueue object represents a particular queue of jobs of a certain type.
  For example there may be a queue for email jobs and a queue for squid purge
  jobs.

Each job type has its own queue and is associated with a storage medium. One
queue might save its jobs in redis while another one uses a database.

Storage media are defined by queue classes. Before using a queue class, you
must define in $wgJobTypeConf a mapping of the job type to that queue class.
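
A minimal sketch of such a mapping follows. It assumes the layout in which each
key of $wgJobTypeConf is a job type and each value is an array whose 'class'
entry names the queue class, with a 'default' key covering job types that have
no entry of their own. These key names and the 'exampleJobType' job type are
assumptions for illustration; check them against the settings documentation of
your MediaWiki version.

```php
// LocalSettings.php fragment (sketch only; key names are assumptions).

// Job types without a specific entry use the database-backed queue.
$wgJobTypeConf['default'] = array( 'class' => 'JobQueueDB' );

// A hypothetical job type mapped explicitly to a queue class.
$wgJobTypeConf['exampleJobType'] = array( 'class' => 'JobQueueDB' );
```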

The factory class JobQueueGroup provides helper functions for:
- getting the queue for a given job type
- routing new job insertions to the proper queue

The following queue classes are available:
* JobQueueDB (stores jobs in the `job` table in a database)

All queue classes support some basic operations (though some may be no-ops):
* enqueueing a batch of jobs
* dequeueing a single job
* acknowledging a job is completed
* checking if the queue is empty
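
The operations above, together with the routing done by the queue group, can be
sketched with a toy in-memory model. This is not MediaWiki's code: ToyJob,
ToyEmailJob, ToyQueue and ToyQueueGroup are hypothetical stand-ins, and the
method names are assumptions chosen to mirror the list.

```php
<?php
// Toy model only: hypothetical stand-ins, not the MediaWiki classes.

abstract class ToyJob {
    public $type;    // job type, used to pick the queue
    public $params;  // job parameters

    public function __construct( $type, array $params = [] ) {
        $this->type = $type;
        $this->params = $params;
    }

    // Main logic of the deferred task; returns true on success.
    abstract public function run();
}

class ToyEmailJob extends ToyJob {
    public function __construct( array $params ) {
        parent::__construct( 'email', $params );
    }

    public function run() {
        // A real job would send mail here; the toy just reports success.
        return true;
    }
}

class ToyQueue {
    private $jobs = [];

    // Enqueue a batch of jobs.
    public function batchPush( array $jobs ) {
        foreach ( $jobs as $job ) {
            $this->jobs[] = $job;
        }
    }

    // Dequeue a single job, or return false if the queue is empty.
    public function pop() {
        return $this->jobs ? array_shift( $this->jobs ) : false;
    }

    // Acknowledge completion; a no-op here because pop() already removed the job.
    public function ack( ToyJob $job ) {
    }

    public function isEmpty() {
        return count( $this->jobs ) === 0;
    }
}

class ToyQueueGroup {
    private $queues = [];

    // Get the queue for a job type, creating it on first use.
    public function get( $type ) {
        if ( !isset( $this->queues[$type] ) ) {
            $this->queues[$type] = new ToyQueue();
        }
        return $this->queues[$type];
    }

    // Route each job to the queue matching its type.
    public function push( array $jobs ) {
        foreach ( $jobs as $job ) {
            $this->get( $job->type )->batchPush( [ $job ] );
        }
    }
}

// Usage: enqueue through the group, then act as a job runner.
$group = new ToyQueueGroup();
$group->push( [ new ToyEmailJob( [ 'to' => 'a@example.org' ] ) ] );
$queue = $group->get( 'email' );
$job = $queue->pop();
$ok = $job->run();
$queue->ack( $job );
```

Note that ack() is a no-op in this toy queue, which is one of the cases the
list allows for.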

Some queue classes (like JobQueueDB) may dequeue jobs in random order while other
queues might dequeue jobs in exact FIFO order. Callers should thus not assume jobs
are executed in FIFO order.

Also note that not all queue classes will have the same reliability guarantees.
In-memory queues may lose data when restarted depending on snapshot and journal
settings (including journal fsync() frequency). Some queue types may totally remove
jobs when dequeued while leaving the ack() function as a no-op; with such a queue,
if a job runner dequeues a job and then crashes before completing it, the job is
lost. Some jobs, like purging squid caches after a template change, may not
require durable queues, whereas other jobs might be more important.
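
The lost-job case can be illustrated with a toy queue that deletes a job as
soon as it is dequeued. ToyFireAndForgetQueue and crashingRunner() are
hypothetical names used only for this sketch, and the crash is simulated with
an exception.

```php
<?php
// Toy model only: hypothetical classes illustrating the failure mode.

// Queue that removes a job as soon as it is dequeued; ack() is a no-op.
class ToyFireAndForgetQueue {
    private $jobs = [];

    public function push( $job ) {
        $this->jobs[] = $job;
    }

    public function pop() {
        return $this->jobs ? array_shift( $this->jobs ) : false;
    }

    public function ack( $job ) {
    }

    public function size() {
        return count( $this->jobs );
    }
}

// Runner that dequeues a job and then dies before finishing it.
function crashingRunner( ToyFireAndForgetQueue $queue ) {
    $job = $queue->pop();
    // Simulated crash: ack() is never reached.
    throw new RuntimeException( "crashed while running $job" );
}

$queue = new ToyFireAndForgetQueue();
$queue->push( 'purge:Template:Foo' );

$crashed = false;
try {
    crashingRunner( $queue );
} catch ( RuntimeException $e ) {
    $crashed = true;
}

// The queue no longer knows about the job, so nothing can retry it.
$remaining = $queue->size();
```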

Callers should also try to make jobs maintain correctness when executed twice.
This matters for queues that actually implement ack(), since they may recycle
dequeued but un-acknowledged jobs back into the queue to be attempted again. If
a runner dequeues a job, runs it, but then crashes before calling ack(), the
job may be returned to the queue and run a second time. Jobs like cache purging can
happen several times without any correctness problems. A pathological case,
however, is a bug that makes a job fail systematically so that it keeps being
retried. For example, a job may always throw a DB error at the end of run(), after
its side effects have already happened. This problem is trickier to solve and is
more obnoxious for things like email jobs, which would send the same message on
every attempt. For such jobs, it might be useful to use a queue that does not
retry jobs.
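
Both situations can be sketched with a toy queue that recycles unacknowledged
jobs. ToyRecyclingQueue, recycle() and the $maxAttempts cap are hypothetical
and exist only for this illustration; they are not a description of any real
queue class. Setting the cap to 1 models a queue that does not retry jobs.

```php
<?php
// Toy model only: hypothetical classes, not MediaWiki's implementation.

// Queue that keeps dequeued jobs claimed until ack() and can recycle them.
class ToyRecyclingQueue {
    private $ready = [];
    private $claimed = [];
    private $attempts = [];
    private $maxAttempts;

    public function __construct( $maxAttempts ) {
        $this->maxAttempts = $maxAttempts;
    }

    public function push( $id ) {
        $this->ready[] = $id;
        $this->attempts[$id] = 0;
    }

    public function pop() {
        if ( !$this->ready ) {
            return false;
        }
        $id = array_shift( $this->ready );
        $this->attempts[$id]++;
        $this->claimed[$id] = true;
        return $id;
    }

    public function ack( $id ) {
        unset( $this->claimed[$id] );
    }

    // Return unacknowledged jobs to the queue, dropping those out of attempts.
    public function recycle() {
        foreach ( array_keys( $this->claimed ) as $id ) {
            unset( $this->claimed[$id] );
            if ( $this->attempts[$id] < $this->maxAttempts ) {
                $this->ready[] = $id;
            }
        }
    }
}

// An idempotent job: purging a cache entry twice leaves the same state.
$cache = [ 'Foo' => '<html>', 'Bar' => '<html>' ];
$purge = function ( $page ) use ( &$cache ) {
    unset( $cache[$page] ); // unsetting a missing key is harmless
};

$queue = new ToyRecyclingQueue( 3 );
$queue->push( 'Foo' );

$id = $queue->pop();
$purge( $id );      // first run completes its work...
$queue->recycle();  // ...but the runner died before ack(), so the job returns
$id = $queue->pop();
$purge( $id );      // second run changes nothing further
$queue->ack( $id );

// A job that fails on every attempt is retried until the cap is reached.
$queue->push( 'Broken' );
$runs = 0;
while ( ( $id = $queue->pop() ) !== false ) {
    $runs++;            // run() would throw here every time
    $queue->recycle();  // the job is never acknowledged
}
```

With a non-idempotent job such as sending email, each of those repeated runs
would have a visible side effect, which is why a queue without retries can be
the better fit there.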