wiki.techinc.nl/includes/jobqueue
Roan Kattouw 1da7573bb7 WatchedItemStore: Use batching in setNotificationTimestampsForUser
Update rows in batches, using the same logic as is used by
removeWatchBatchForUser().

Also remove the functionality for updating all rows, and move that to
resetAllNotificationTimestampsForUser() instead. To that end, add a
timestamp parameter to that method and to the job it uses, and make
setNotificationTimestampsForUser() behave like a backwards-compatibility
wrapper around resetAllNotificationTimestampsForUser() when no list of
titles is specified.

Bug: T207941
Change-Id: I58342257395de6fcfb4c392b3945b12883ca1680
Follows-Up: I2008ff89c95fe6f66a3fd789d2cef0e8fe52bd93
2019-03-21 04:41:42 +00:00
aggregator Move class JobQueueAggregatorNull to own file 2019-03-08 20:19:26 +01:00
exception Move exceptions JobQueueError to own file 2019-02-06 19:39:20 +01:00
jobs WatchedItemStore: Use batching in setNotificationTimestampsForUser 2019-03-21 04:41:42 +00:00
utils Make PurgeJobUtils::invalidatePages avoid waiting on replication for no reason 2019-03-09 01:53:01 +00:00
IJobSpecification.php Move interface IJobSpecification to own file 2019-02-04 21:04:12 +01:00
Job.php Job::factory should throw an InvalidArgumentException, not MWException 2019-02-11 23:35:47 +00:00
JobQueue.php Merge "Rename WikiMap DB domain ID methods to reduce confusion with web domains" 2019-02-07 01:42:09 +00:00
JobQueueDB.php jobqueue: allow direct server configuration arrays to JobQueueDB 2019-03-09 03:40:20 +00:00
JobQueueFederated.php build: Updating mediawiki/mediawiki-codesniffer to 24.0.0 2019-02-07 18:39:42 +00:00
JobQueueGroup.php Rename WikiMap DB domain ID methods to reduce confusion with web domains 2019-02-06 12:28:45 -08:00
JobQueueMemory.php Use DB domain in JobQueueGroup and make WikiMap domain ID methods stricter 2018-11-07 04:46:56 +00:00
JobQueueRedis.php Rename WikiMap DB domain ID methods to reduce confusion with web domains 2019-02-06 12:28:45 -08:00
JobRunner.php Create JobQueueEnqueueUpdate class to call JobQueueGroup::pushLazyJobs() 2018-10-28 22:19:06 +00:00
JobSpecification.php Move interface IJobSpecification to own file 2019-02-04 21:04:12 +01:00
README

/*!
\ingroup JobQueue
\page jobqueue_design Job queue design

Notes on the Job queuing system architecture.

\section intro Introduction

The data model consists of the following main components:
* The Job object represents a particular deferred task that happens in the
  background. All jobs subclass the Job object and put the main logic in the
  function called run().
* The JobQueue object represents a particular queue of jobs of a certain type.
  For example there may be a queue for email jobs and a queue for CDN purge
  jobs.

\section jobqueue Job queues

Each job type has its own queue and is associated with a storage medium. One
queue might save its jobs in redis while another one would use a database.

Storage media are defined by queue classes. Before a queue class can be used,
you must define in $wgJobTypeConf a mapping of the job type to that queue class.
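
As a rough illustration, a LocalSettings.php fragment along the following lines
maps job types to queue classes. The `default` entry is the fallback for job
types without their own entry. The choice of `htmlCacheUpdate` as the overridden
type and the redis server address are placeholders, and the exact keys accepted
by each queue class should be checked against its constructor.

```php
// Fallback for every job type that has no entry of its own.
$wgJobTypeConf['default'] = [
	'class' => 'JobQueueDB',
	'order' => 'random',
	'claimTTL' => 3600,
];

// Example override: keep one job type in redis.
// The server address below is a placeholder, not a recommendation.
$wgJobTypeConf['htmlCacheUpdate'] = [
	'class' => 'JobQueueRedis',
	'redisServer' => '127.0.0.1:6379',
	'redisConfig' => [],
	'claimTTL' => 3600,
	// Assumed here: JobQueueRedis expects a separate job runner service.
	'daemonized' => true,
];
```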

The factory class JobQueueGroup provides helper functions for:
- getting the queue for a given job type
- routing new job insertions to the proper queue

The following queue classes are available:
* JobQueueDB (stores jobs in the `job` table in a database)
* JobQueueRedis (stores jobs in a redis server)

All queue classes support some basic operations (though some may be no-ops):
* enqueueing a batch of jobs
* dequeueing a single job
* acknowledging a job is completed
* checking if the queue is empty
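
The four operations above can be sketched with a toy in-memory queue. This is
not the MediaWiki JobQueue API: the class name ToyQueue, its method signatures
and the use of plain arrays as jobs are all assumptions made for illustration.
The toy happens to dequeue in insertion order, which real queue classes do not
guarantee.

```php
<?php
// Hypothetical toy queue, for illustration only.
final class ToyQueue {
	// Jobs waiting to be dequeued, keyed by job ID.
	private $ready = [];
	// Jobs handed to a runner but not yet acknowledged, keyed by job ID.
	private $claimed = [];
	private $nextId = 1;

	// Enqueue a batch of jobs (plain arrays in this sketch).
	public function push( array $jobs ) {
		foreach ( $jobs as $job ) {
			$this->ready[$this->nextId++] = $job;
		}
	}

	// Dequeue a single job, or return null when nothing is ready.
	public function pop() {
		if ( !$this->ready ) {
			return null;
		}
		$id = array_key_first( $this->ready );
		$job = $this->ready[$id];
		unset( $this->ready[$id] );
		// Remember the claim so that ack() has something to clear.
		$this->claimed[$id] = $job;
		return [ 'id' => $id ] + $job;
	}

	// Acknowledge that the job with this ID was completed.
	public function ack( $id ) {
		unset( $this->claimed[$id] );
	}

	// Only looks at unclaimed jobs; claimed jobs do not count as available.
	public function isEmpty() {
		return !$this->ready;
	}
}
```

A runner built on this sketch would loop over pop(), run the job, and call
ack() with the returned ID once the job has finished.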

Some queue classes (like JobQueueDB) may dequeue jobs in random order while other
queues might dequeue jobs in exact FIFO order. Callers should thus not assume jobs
are executed in FIFO order.

Also note that not all queue classes will have the same reliability guarantees.
In-memory queues may lose data when restarted depending on snapshot and journal
settings (including journal fsync() frequency). Some queue types remove jobs
entirely when they are dequeued and leave the ack() function as a no-op; if a
job runner dequeues a job from such a queue and then crashes before completing
it, the job is lost. Some jobs, like purging CDN caches after a template change,
may not require durable queues, whereas other jobs might be more important.
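
The job loss described above can be sketched as follows. FireAndForgetQueue and
runOnce() are hypothetical names, not MediaWiki classes or functions, and the
crash is simulated by a handler that throws.

```php
<?php
// Hypothetical queue that deletes a job as soon as it is dequeued.
final class FireAndForgetQueue {
	private $jobs = [];

	public function push( array $jobs ) {
		foreach ( $jobs as $job ) {
			$this->jobs[] = $job;
		}
	}

	// Removes and returns the next job, or null when the queue is empty.
	public function pop() {
		return array_shift( $this->jobs );
	}

	public function ack( $job ) {
		// Intentionally a no-op: the job was already removed by pop().
	}

	public function isEmpty() {
		return !$this->jobs;
	}
}

// Dequeue one job and run it; returns true only if the job completed.
function runOnce( FireAndForgetQueue $queue, callable $handler ) {
	$job = $queue->pop();
	if ( $job === null ) {
		return false;
	}
	try {
		$handler( $job );
	} catch ( RuntimeException $e ) {
		// The job is already gone from the queue, so nothing can retry it.
		return false;
	}
	$queue->ack( $job );
	return true;
}
```

If the handler fails, the queue is empty afterwards even though the job never
completed, which is acceptable only for jobs that can safely be dropped.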

\section aggregator Job queue aggregator

The aggregators are used by nextJobDB.php, which is a script that will return a
random ready queue (on any wiki in the farm) that can be used with runJobs.php.
This can be used in conjunction with any scripts that handle wiki farm job queues.
Note that $wgLocalDatabases defines what wikis are in the wiki farm.

Since each job type has its own queue, and wiki farms may have many wikis,
there might be a large number of queues to keep track of. To avoid wasting
large amounts of time polling empty queues, aggregators exist to keep track
of which queues are ready.
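
The idea can be sketched with a toy aggregator that records which (wiki, job
type) pairs are believed to be non-empty. ToyAggregator is a hypothetical class
that keeps its state in a PHP array; a real aggregator would keep it in shared
storage so that all wikis in the farm see the same data. The method names are
chosen for this sketch and should not be read as the actual aggregator API.

```php
<?php
// Hypothetical in-process aggregator, for illustration only.
final class ToyAggregator {
	// Map of job type => [ wiki ID => true ] for queues believed to be ready.
	private $ready = [];

	public function notifyQueueNonEmpty( $wiki, $type ) {
		$this->ready[$type][$wiki] = true;
	}

	public function notifyQueueEmpty( $wiki, $type ) {
		unset( $this->ready[$type][$wiki] );
		// Drop the job type entirely once no wiki has ready jobs of that type.
		if ( isset( $this->ready[$type] ) && !$this->ready[$type] ) {
			unset( $this->ready[$type] );
		}
	}

	// Returns a map of job type => list of wiki IDs with ready queues.
	public function getAllReadyWikiQueues() {
		return array_map( 'array_keys', $this->ready );
	}

	// Pick one ready [ wiki, type ] pair at random, or null if none is ready.
	public function pickReadyQueue() {
		$pairs = [];
		foreach ( $this->ready as $type => $wikis ) {
			foreach ( array_keys( $wikis ) as $wiki ) {
				$pairs[] = [ $wiki, $type ];
			}
		}
		return $pairs ? $pairs[array_rand( $pairs )] : null;
	}
}
```

A runner can then ask for one ready pair instead of polling every queue on
every wiki.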

The following queue aggregator classes are available:
* JobQueueAggregatorRedis (uses a redis server to track ready queues)

Some aggregators cache data for a few minutes while others may be always up to date.
This can be an important factor for jobs that need a low pickup time (or latency).

\section jobs Jobs

Callers should try to make jobs maintain correctness when executed twice.
This is useful for queues that actually implement ack(), since they may recycle
dequeued but un-acknowledged jobs back into the queue to be attempted again. If
a runner dequeues a job, runs it, but then crashes before calling ack(), the
job may be returned to the queue and run a second time. Jobs like cache purging can
happen several times without any correctness problems. However, a pathological case
arises when a bug makes a job fail systematically, so that it keeps being retried.
For example, a job may always throw a DB error at the end of run(). This problem is
trickier to solve and more obnoxious for things like email jobs, where each retry
may send the message again. For such jobs, it might be useful to use a queue that
does not retry jobs.
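
The difference between jobs that are safe to run twice and jobs that are not
can be sketched as follows. The functions and the arrays standing in for
external state are hypothetical. The last function shows one possible guard,
assuming each job carries a unique `id`; it still sends twice if the runner
crashes between sending and recording the ID.

```php
<?php
// Idempotent: marking a page as purged twice equals doing it once.
function runPurgeJob( array $job, array &$purged ) {
	$purged[$job['page']] = true;
}

// Not idempotent: every run sends another message.
function runEmailJob( array $job, array &$outbox ) {
	$outbox[] = $job['to'];
}

// Guarded variant: skip jobs whose ID was already recorded as handled.
function runEmailJobOnce( array $job, array &$outbox, array &$done ) {
	if ( isset( $done[$job['id']] ) ) {
		return;
	}
	$outbox[] = $job['to'];
	$done[$job['id']] = true;
}
```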