wiki.techinc.nl/includes/jobqueue
Tim Starling 68c433bd23 Hooks::run() call site migration
Migrate all callers of Hooks::run() to use the new
HookContainer/HookRunner system.

General principles:
* Use DI if it is already used. We're not changing the way state is
  managed in this patch.
* HookContainer is always injected, not HookRunner. HookContainer
  is a service, it's a more generic interface, it is the only
  thing that provides isRegistered() which is needed in some cases,
  and a HookRunner can be efficiently constructed from it
  (confirmed by benchmark). Because HookContainer is needed
  for object construction, it is also needed by all factories.
* "Ask your friendly local base class". Big hierarchies like
  SpecialPage and ApiBase have getHookContainer() and getHookRunner()
  methods in the base class, and classes that extend that base class
  are not expected to know or care where the base class gets its
  HookContainer from.
* ProtectedHookAccessorTrait provides protected getHookContainer() and
  getHookRunner() methods, getting them from the global service
  container. The point of this is to ease migration to DI by ensuring
  that call sites ask their local friendly base class rather than
  getting a HookRunner from the service container directly.
* Private $this->hookRunner. In some smaller classes where accessor
  methods did not seem warranted, there is a private HookRunner property
  which is accessed directly. Very rarely (two cases), there is a
  protected property, for consistency with code that conventionally
  assumes protected=private, but in cases where the class might actually
  be overridden, a protected accessor is preferred over a protected
  property.
* The last resort: Hooks::runner(). Mostly for static, file-scope and
  global code. In a few cases it was used for objects with broken
  construction schemes, out of horror or laziness.

Constructors with new required arguments:
* AuthManager
* BadFileLookup
* BlockManager
* ClassicInterwikiLookup
* ContentHandlerFactory
* ContentSecurityPolicy
* DefaultOptionsManager
* DerivedPageDataUpdater
* FullSearchResultWidget
* HtmlCacheUpdater
* LanguageFactory
* LanguageNameUtils
* LinkRenderer
* LinkRendererFactory
* LocalisationCache
* MagicWordFactory
* MessageCache
* NamespaceInfo
* PageEditStash
* PageHandlerFactory
* PageUpdater
* ParserFactory
* PermissionManager
* RevisionStore
* RevisionStoreFactory
* SearchEngineConfig
* SearchEngineFactory
* SearchFormWidget
* SearchNearMatcher
* SessionBackend
* SpecialPageFactory
* UserNameUtils
* UserOptionsManager
* WatchedItemQueryService
* WatchedItemStore

Constructors with new optional arguments:
* DefaultPreferencesFactory
* Language
* LinkHolderArray
* MovePage
* Parser
* ParserCache
* PasswordReset
* Router

setHookContainer() now required after construction:
* AuthenticationProvider
* ResourceLoaderModule
* SearchEngine

Change-Id: Id442b0dbe43aba84bd5cf801d86dedc768b082c7
2020-05-30 14:23:28 +00:00
exception Move exceptions JobQueueError to own file 2019-02-06 19:39:20 +01:00
jobs Hooks::run() call site migration 2020-05-30 14:23:28 +00:00
utils Coding style: Auto-fix MediaWiki.Classes.UnsortedUseStatements.UnsortedUse 2020-01-10 09:32:25 -08:00
GenericParameterJob.php jobqueue: add GenericParameterJob and RunnableJob interface 2019-04-08 11:05:23 -07:00
IJobSpecification.php jobqueue: add GenericParameterJob and RunnableJob interface 2019-04-08 11:05:23 -07:00
Job.php Fix method/function names case mismatch in core files 2019-08-31 23:17:51 +00:00
JobQueue.php Fix more PSR12.Properties.ConstantVisibility.NotFound 2020-05-15 00:33:32 +01:00
JobQueueDB.php Fix more PSR12.Properties.ConstantVisibility.NotFound 2020-05-15 00:33:32 +01:00
JobQueueFederated.php jobqueue: remove unused "aggregator" field reference in JobQueueFederated 2019-07-08 22:56:25 -07:00
JobQueueGroup.php Fix more PSR12.Properties.ConstantVisibility.NotFound 2020-05-15 00:33:32 +01:00
JobQueueMemory.php Improve param docs 2019-11-28 19:08:59 +01:00
JobQueueRedis.php Fix more PSR12.Properties.ConstantVisibility.NotFound 2020-05-15 00:33:32 +01:00
JobRunner.php Convert JobRunner into a service and use DI 2020-02-27 08:04:48 -08:00
JobSpecification.php Return deduplication to CategoryMembershipJob 2019-10-29 06:10:22 +00:00
README.md docs: Remove ingroup tag from Markdown files 2019-11-12 16:11:30 -08:00
RunnableJob.php Fix more PSR12.Properties.ConstantVisibility.NotFound 2020-05-15 00:33:32 +01:00

JobQueue Architecture

Notes on the Job queuing system architecture.

Introduction

The data model consists of the following main components:

  • The Job object represents a particular deferred task that happens in the background. All jobs subclass the Job class and put their main logic in the run() method.
  • The JobQueue object represents a particular queue of jobs of a certain type. For example, there may be a queue for email jobs and a queue for CDN purge jobs.

Job queues

Each job type has its own queue and is associated with a storage medium. One queue might save its jobs in redis while another might use a database.

Each storage medium is implemented by a queue class. Before using a queue, you must define in $wgJobTypeConf a mapping of the job type to a queue class.
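As a rough illustration of such a mapping, the fragment below sketches what a LocalSettings.php entry might look like. The job type name, the server address, and the specific option values are assumptions chosen for the example; check the option names against the constructor documentation of the queue class you use.

```php
// Fragment for LocalSettings.php (sketch; values are illustrative assumptions).

// Fallback for every job type without its own entry: store jobs in the database.
$wgJobTypeConf['default'] = [
	'class' => 'JobQueueDB',
	'order' => 'random',   // dequeue order; callers must not rely on FIFO
	'claimTTL' => 3600,    // seconds before an unacknowledged job may be retried
];

// One specific job type routed to a redis-backed queue instead.
$wgJobTypeConf['htmlCacheUpdate'] = [
	'class' => 'JobQueueRedis',
	'redisServer' => '127.0.0.1:6379', // placeholder address (assumption)
	'redisConfig' => [],
	'daemonized' => true,
];
```

Job types not listed explicitly fall back to the 'default' entry, so only the types that need a different storage medium have to be named.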

The factory class JobQueueGroup provides helper functions:

  • getting the queue for a given job
  • routing new job insertions to the proper queue

The following queue classes are available:

  • JobQueueDB (stores jobs in the job table in a database)
  • JobQueueRedis (stores jobs in a redis server)

All queue classes support some basic operations (though some may be no-ops):

  • enqueueing a batch of jobs
  • dequeueing a single job
  • acknowledging a job is completed
  • checking if the queue is empty

Some queue classes (like JobQueueDB) may dequeue jobs in random order while other queues might dequeue jobs in exact FIFO order. Callers should thus not assume jobs are executed in FIFO order.

Also note that not all queue classes have the same reliability guarantees. In-memory queues may lose data when restarted, depending on snapshot and journal settings (including journal fsync() frequency). Some queue types remove jobs entirely when they are dequeued and leave the ack() function as a no-op; if a job runner dequeues a job from such a queue and crashes before completing it, the job is lost. Some jobs, like purging CDN caches after a template change, may not require durable queues, whereas other jobs might be more important.
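The difference between a queue that tracks claimed jobs and one that treats ack() as a no-op can be sketched with a toy model. This is not MediaWiki's JobQueue API: ToyQueue and all of its methods are hypothetical names used only to show what happens when a runner crashes between dequeueing and acknowledging.

```php
<?php
// Toy model only: not MediaWiki's JobQueue API. All names are hypothetical.
final class ToyQueue {
	/** @var array<int,string> jobs waiting to be claimed */
	private $ready = [];
	/** @var array<int,string> jobs claimed but not yet acknowledged */
	private $claimed = [];
	/** @var bool whether dequeued jobs are remembered until ack() */
	private $tracksClaims;
	/** @var int next job ID to hand out */
	private $nextId = 1;

	public function __construct( bool $tracksClaims ) {
		$this->tracksClaims = $tracksClaims;
	}

	public function push( string $payload ): void {
		$this->ready[$this->nextId++] = $payload;
	}

	/** @return array|null [ id, payload ] or null if nothing is ready */
	public function pop(): ?array {
		if ( !$this->ready ) {
			return null;
		}
		reset( $this->ready );
		$id = key( $this->ready );
		$payload = $this->ready[$id];
		unset( $this->ready[$id] );
		if ( $this->tracksClaims ) {
			$this->claimed[$id] = $payload;
		}
		return [ $id, $payload ];
	}

	/** Mark a job as completed; a no-op when claims are not tracked. */
	public function ack( int $id ): void {
		unset( $this->claimed[$id] );
	}

	/** Put unacknowledged jobs back in the ready list; returns how many. */
	public function recycle(): int {
		$count = count( $this->claimed );
		$this->ready += $this->claimed;
		$this->claimed = [];
		return $count;
	}

	/** True if no job is available to be claimed. */
	public function isEmpty(): bool {
		return !$this->ready;
	}
}

// A runner pops a job and crashes before calling ack().
$durable = new ToyQueue( true );
$durable->push( 'sendEmail' );
$durable->pop();
$recycled = $durable->recycle(); // the job is ready to be claimed again

$lossy = new ToyQueue( false );
$lossy->push( 'purgeCdn' );
$lossy->pop();
$lost = $lossy->recycle(); // nothing to recycle: the job is gone
```

In the tracking queue the crashed job reappears after recycle(), while in the non-tracking queue it is silently lost, which is acceptable only for jobs that do not need durability.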

Job queue aggregator

The aggregators are used by nextJobDB.php, a script that returns a random ready queue (on any wiki in the farm) for use with runJobs.php. This can be used in conjunction with any scripts that handle wiki farm job queues. Note that $wgLocalDatabases defines which wikis are in the wiki farm.

Since each job type has its own queue, and wiki farms may have many wikis, there might be a large number of queues to keep track of. To avoid wasting large amounts of time polling empty queues, aggregators exist to keep track of which queues are ready.

The following queue aggregator classes are available:

  • JobQueueAggregatorRedis (uses a redis server to track ready queues)

Some aggregators cache data for a few minutes while others may be always up to date. This can be an important factor for jobs that need a low pickup time (or latency).
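The idea of tracking ready queues instead of polling every queue can be sketched with a small in-memory model. This is only an illustration under stated assumptions: ToyAggregator, its methods, and the wiki and job type names are hypothetical and are not the JobQueueAggregator API.

```php
<?php
// Toy model only: not the JobQueueAggregator API. All names are hypothetical.
final class ToyAggregator {
	/** @var array job type => [ wiki ID => true ] for queues with ready jobs */
	private $ready = [];

	/** Record that the queue for this wiki and job type has ready jobs. */
	public function notifyQueueNonEmpty( string $wiki, string $type ): void {
		$this->ready[$type][$wiki] = true;
	}

	/** Record that the queue for this wiki and job type has been drained. */
	public function notifyQueueEmpty( string $wiki, string $type ): void {
		unset( $this->ready[$type][$wiki] );
		if ( isset( $this->ready[$type] ) && !$this->ready[$type] ) {
			unset( $this->ready[$type] );
		}
	}

	/** @return array|null a random [ wiki, type ] pair, or null if nothing is ready */
	public function pickReadyQueue(): ?array {
		if ( !$this->ready ) {
			return null;
		}
		$type = array_rand( $this->ready );
		$wiki = array_rand( $this->ready[$type] );
		// Array keys that look numeric come back as integers, so cast to string.
		return [ (string)$wiki, (string)$type ];
	}
}

$agg = new ToyAggregator();
$agg->notifyQueueNonEmpty( 'enwiki', 'htmlCacheUpdate' );
$agg->notifyQueueNonEmpty( 'dewiki', 'sendMail' );
$agg->notifyQueueEmpty( 'dewiki', 'sendMail' );
$picked = $agg->pickReadyQueue(); // only one ready queue remains

$agg->notifyQueueEmpty( 'enwiki', 'htmlCacheUpdate' );
$none = $agg->pickReadyQueue(); // nothing is ready any more
```

A runner script can ask the aggregator for one ready queue and work on it, rather than checking every job type on every wiki. An aggregator that caches this state trades freshness for fewer lookups.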

Jobs

Callers should also try to make jobs maintain correctness when executed twice. This is useful for queues that actually implement ack(), since they may recycle dequeued but unacknowledged jobs back into the queue to be attempted again. If a runner dequeues a job, runs it, but then crashes before calling ack(), the job may be returned to the queue and run a second time. Jobs like cache purging can happen several times without any correctness problems. However, a pathological case arises when a bug makes the failure repeat systematically; for example, a job may always throw a DB error at the end of run(), so it is retried over and over. This problem is trickier to solve and more obnoxious for things like email jobs. For such jobs, it might be useful to use a queue that does not retry jobs.
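One way to keep a job correct when it runs twice is to record that its side effect has already happened and skip it on a retry. The sketch below is a toy model of that idea only: ToyMailer, sendOnce() and the job ID are hypothetical names, not MediaWiki APIs, and a real job would have to keep the marker in durable storage.

```php
<?php
// Toy model only: class and method names are hypothetical, not MediaWiki APIs.
final class ToyMailer {
	/** @var string[] messages actually sent */
	public $sent = [];
	/** @var array job ID => true for jobs whose side effect already happened */
	private $done = [];

	/** Send at most once per job ID, so that a retried job is harmless. */
	public function sendOnce( string $jobId, string $message ): bool {
		if ( isset( $this->done[$jobId] ) ) {
			return false; // duplicate run: skip the side effect
		}
		$this->sent[] = $message;
		$this->done[$jobId] = true;
		return true;
	}
}

$mailer = new ToyMailer();
$first = $mailer->sendOnce( 'job-42', 'Your page was edited' );
// The runner crashes before ack(); the queue recycles the job and it runs again.
$second = $mailer->sendOnce( 'job-42', 'Your page was edited' );
```

The second run returns false and sends nothing. A crash between sending and recording the marker can still cause a duplicate, which is why a queue that does not retry may remain the safer choice for email jobs.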