Migrate all callers of Hooks::run() to use the new
HookContainer/HookRunner system.
General principles:
* Use DI if it is already used. We're not changing the way state is
managed in this patch.
* HookContainer is always injected, not HookRunner. HookContainer
is a service, it's a more generic interface, it is the only
thing that provides isRegistered() which is needed in some cases,
and a HookRunner can be efficiently constructed from it
(confirmed by benchmark). Because HookContainer is needed
for object construction, it is also needed by all factories.
* "Ask your friendly local base class". Big hierarchies like
SpecialPage and ApiBase have getHookContainer() and getHookRunner()
methods in the base class, and classes that extend that base class
are not expected to know or care where the base class gets its
HookContainer from.
* ProtectedHookAccessorTrait provides protected getHookContainer() and
getHookRunner() methods, getting them from the global service
container. The point of this is to ease migration to DI by ensuring
that call sites ask their local friendly base class rather than
getting a HookRunner from the service container directly.
* Private $this->hookRunner. In some smaller classes where accessor
methods did not seem warranted, there is a private HookRunner property
which is accessed directly. Very rarely (two cases), there is a
protected property, for consistency with code that conventionally
assumes protected=private, but in cases where the class might actually
be overridden, a protected accessor is preferred over a protected
property.
* The last resort: Hooks::runner(). Mostly for static, file-scope and
global code. In a few cases it was used for objects with broken
construction schemes, out of horror or laziness.
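A minimal sketch of the "inject HookContainer, build HookRunner from it" and "private $this->hookRunner" principles above. DemoHookContainer, DemoHookRunner, DemoSaver and the DemoSave hook are simplified stand-ins invented for this illustration. They are not the real MediaWiki classes, and their signatures are assumptions.

```php
<?php
// Simplified stand-ins for illustration only; not the real MediaWiki classes.
class DemoHookContainer {
	/** @var callable[][] Handlers keyed by hook name */
	private $handlers = [];

	public function register( string $hook, callable $handler ): void {
		$this->handlers[$hook][] = $handler;
	}

	public function isRegistered( string $hook ): bool {
		return !empty( $this->handlers[$hook] );
	}

	/** Run all handlers in order; stop and return false if one returns false. */
	public function run( string $hook, array $args = [] ): bool {
		foreach ( $this->handlers[$hook] ?? [] as $handler ) {
			if ( $handler( ...$args ) === false ) {
				return false;
			}
		}
		return true;
	}
}

// Thin typed wrapper that is cheap to construct from a container.
class DemoHookRunner {
	private $container;

	public function __construct( DemoHookContainer $container ) {
		$this->container = $container;
	}

	// Hypothetical hook, named for this sketch.
	public function onDemoSave( string $title ): bool {
		return $this->container->run( 'DemoSave', [ $title ] );
	}
}

// A small class: the container is injected, the runner is a private property.
class DemoSaver {
	private $hookContainer;
	private $hookRunner;

	public function __construct( DemoHookContainer $hookContainer ) {
		$this->hookContainer = $hookContainer;
		$this->hookRunner = new DemoHookRunner( $hookContainer );
	}

	public function save( string $title ): bool {
		// isRegistered() is only available on the container.
		if ( !$this->hookContainer->isRegistered( 'DemoSave' ) ) {
			return true;
		}
		return $this->hookRunner->onDemoSave( $title );
	}
}

$container = new DemoHookContainer();
$saver = new DemoSaver( $container );
$seen = [];
$container->register( 'DemoSave', function ( string $title ) use ( &$seen ) {
	$seen[] = $title;
} );
$ok = $saver->save( 'Main_Page' );
```

Because only the container is passed around, a factory that creates DemoSaver objects needs just one extra constructor argument.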
Constructors with new required arguments:
* AuthManager
* BadFileLookup
* BlockManager
* ClassicInterwikiLookup
* ContentHandlerFactory
* ContentSecurityPolicy
* DefaultOptionsManager
* DerivedPageDataUpdater
* FullSearchResultWidget
* HtmlCacheUpdater
* LanguageFactory
* LanguageNameUtils
* LinkRenderer
* LinkRendererFactory
* LocalisationCache
* MagicWordFactory
* MessageCache
* NamespaceInfo
* PageEditStash
* PageHandlerFactory
* PageUpdater
* ParserFactory
* PermissionManager
* RevisionStore
* RevisionStoreFactory
* SearchEngineConfig
* SearchEngineFactory
* SearchFormWidget
* SearchNearMatcher
* SessionBackend
* SpecialPageFactory
* UserNameUtils
* UserOptionsManager
* WatchedItemQueryService
* WatchedItemStore
Constructors with new optional arguments:
* DefaultPreferencesFactory
* Language
* LinkHolderArray
* MovePage
* Parser
* ParserCache
* PasswordReset
* Router
setHookContainer() now required after construction:
* AuthenticationProvider
* ResourceLoaderModule
* SearchEngine
Change-Id: Id442b0dbe43aba84bd5cf801d86dedc768b082c7
Done:
* Replace LanguageConverter::newConverter by LanguageConverterFactory::getLanguageConverter
* Remove LanguageConverter::newConverter from all subclasses
* Add LanguageConverterFactory integration tests which cover all languages by their code.
* Caching of LanguageConverters in factory
* Make all tests run (hopefully that will be enough)
* Uncomment the deprecated functions.
* Rename FakeConverter to TrivialLanguageConverter
* Create ILanguageConverter to have shared ancestor
* Make the LanguageConverter class abstract.
* Create a table mapping lang codes to converters instead of relying on a naming convention
* ILanguageConverter @internal
* Clean up code
Change-Id: I0e4d77de0f44e18c19956a1ffd69d30e63cf51bf
Bug: T226833, T243332
In all these cases, the foreach() loop already specifies a variable for
the current value. We don't need two ways to access the same value;
using both makes the code harder to read.
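A minimal illustration of the pattern being cleaned up. The array and variable names are made up for this sketch.

```php
<?php
$counts = [ 'Cats' => 3, 'Dogs' => 5 ];

// Before: the loop already binds $count, yet the body indexes the array again.
$before = 0;
foreach ( $counts as $name => $count ) {
	$before += $counts[$name];
}

// After: use the loop variable directly; the key is no longer needed.
$after = 0;
foreach ( $counts as $count ) {
	$after += $count;
}
```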
Change-Id: I6ed7a518439963b7091057194de993a7e977be32
Until I70473280, integer literals were always quoted as strings, because
the databases we support all have no problem with casting
string-literals for comparisons and such.
However, PHP associative arrays don't preserve
the types of keys: integer-like strings are converted into actual
integers, which can result in errors:
WikiPage::updateCategoryCounts localhost 1292
Truncated incorrect DOUBLE value: 'A String Category' (localhost)
UPDATE `category` SET cat_pages = cat_pages - 1 WHERE cat_title IN
(143434343434,'14343434string')
#0 includes\libs\rdbms\database\Database.php(1587):
Wikimedia\Rdbms\Database->getQueryExceptionAndLog(string, integer,
string, string)
#1 includes\libs\rdbms\database\Database.php(1166):
Wikimedia\Rdbms\Database->reportQueryError(string, integer, string,
string, boolean)
#2 includes\libs\rdbms\database\Database.php(2217):
Wikimedia\Rdbms\Database->query(string, string)
#3 includes\libs\rdbms\database\DBConnRef.php(68):
Wikimedia\Rdbms\Database->update(string, array, array, string)
#4 includes\libs\rdbms\database\DBConnRef.php(380):
Wikimedia\Rdbms\DBConnRef->__call(string, array)
#5 includes\page\WikiPage.php(3689):
Wikimedia\Rdbms\DBConnRef->update(string, array, array, string)
#6 includes\deferred\LinksUpdate.php(420):
WikiPage->updateCategoryCounts(array, array, integer)
#7 includes\deferred\LinksUpdate.php(315):
LinksUpdate->updateCategoryCounts(array, array)
#8 includes\deferred\LinksUpdate.php(193):
LinksUpdate->doIncrementalUpdate()
#9 includes\deferred\DeferredUpdates.php(416): LinksUpdate->doUpdate()
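The key coercion behind the failing query can be reproduced in a few lines. This is a sketch under assumptions: demoQuoteAll() is a hypothetical helper, not the rdbms API, its escaping is deliberately simplistic, and casting keys back to strings is only one possible remedy. The integer result assumes a 64-bit PHP build.

```php
<?php
// Category titles used as array keys, as a caller might build them.
$titles = [
	'143434343434' => true, // integer-like string
	'14343434string' => true,
];

// PHP has silently turned the first key into an int (on 64-bit builds).
$keys = array_keys( $titles );
$types = array_map( 'gettype', $keys );

// Hypothetical helper: cast every value to string and quote it as a literal.
// Real code must use the database layer's own quoting.
function demoQuoteAll( array $values ): string {
	$quoted = array_map( static function ( $v ) {
		return "'" . str_replace( "'", "''", (string)$v ) . "'";
	}, $values );
	return implode( ',', $quoted );
}

$inList = demoQuoteAll( $keys );
```

With every value quoted, the IN list no longer mixes an integer literal with string literals, which is the mix that makes MySQL cast the string column to DOUBLE.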
Also update some param docs
Change-Id: If77cf924af01a215977bfdc8f085c4e1f4c96cad
Some of the errors are suppressed because they're phan false positives.
The idea behind this is that they'll be fixed in a future version of
phan, and we'll just have to remove the suppressions.
Note: I'm disabling UnusedPluginSuppression so that we can start suppressing
issues even if they're still disabled. The sniff should be re-enabled
as soon as we upgrade phan.
Bug: T231636
Change-Id: I0f7fa06a9e03fbb86c7a5eb6e50a850bb258a7f7
Knowing whether recursion is enabled might help extensions implementing
LinksUpdate hooks make decisions.
E.g. CirrusSearch would like to know whether a particular update needs to go to a
prioritized queue or not (template transclusion).
Change-Id: I0a0de0d4621ed302b4fb550a1ddecd4ac8c5775a
These callers just need to load some data from DB_MASTER.
Subsequent code needing that latest title data should also use the
required flags, rather than relying on flakey global cache state.
Change-Id: I53248ea4b5bf1cd953f956c41b8244831ec5ef04
The purge() method handles purging of both file cache and CDN, using
a PRESEND deferred update. This avoids code duplication and missing
file cache purge calls.
Also:
* Migrate HTMLCacheUpdate callers to just directly using HTMLCacheUpdateJob
* Add HtmlFileCacheUpdate class and defer such updates just like with CDN
* Simplify HTMLCacheUpdate constructor parameters
* Remove BacklinkCache::clear() calls which do nothing since the backlink
query does not actually happen until the job runs
Change-Id: Ic453b189a40109a73a9426538608eea87a76befa
LinksUpdate does not match RefreshLinksJob since the former is only a subset
of the latter. Also, DeferredUpdates::doUpdates() only runs in "enqueue" mode
for cases in MediaWiki::restInPeace() if there is no post-send support.
In a future commit, the deferred callback in which LinksUpdate currently
runs will be abstracted into its own deferred update, which
will then bring back EnqueueableDataUpdate for this update.
Bug: T206283
Change-Id: I0680be445e8b8e8d0dba85df135b84640f4fcb81
* Remove logic for saving slow-to-render parser output. This has
not worked ever since DerivedPageDataUpdater was introduced.
* Make the logic to use cached output actually work. This was
also broken since DerivedPageDataUpdater was added. In order
to pass the output, add a known-revision-output parameter
to both WikiPage::doSecondaryUpdates() and
DerivedPageDataUpdater::prepareUpdate().
* Also factored out some helper methods from runForTitle() in
RefreshLinksJob to make it more readable and avoid the need
for multiple transaction round commit calls. This makes the
case of multiple-title jobs less likely to break again.
* Make use of RefreshLinksJob::runForTitle() return value.
* Add unit tests for multiple-title job case.
Change-Id: I0cd13c424a87653b5a7253c42cd48fe43befd692
MWNamespace::clearCaches() has been removed entirely, along with the
$rebuild parameter to MWNamespace::getCanonicalNamespaces(). The rest of
MWNamespace is deprecated.
Diff best viewed with -C1 so git notices that NamespaceInfo is a copy of
MWNamespace.
Depends-On: Icb7a4a2a5d19fb1f2453b4b57a5271196b0e316d
Depends-On: Ib3c914fc99394e4876ac9fe27317a1eafa2ff69e
Change-Id: I1a03d4e146f5414ae73c7d1a5807c873323e8abc
This is proposed as an alternative to I0b636dc144f34bb.
The idea is to ensure that page deletion causes the exact same
database updates to be performed, and the exact same hooks to
be fired, as a page edit.
Bug: T216249
Change-Id: I665320e27da8edc2867b47d181cc0f324e75d102
Make DerivedPageDataUpdater bundle all the related DataUpdate tasks
on page change with a RefreshSecondaryDataUpdate wrapper. If one of
the DataUpdate tasks fails, then the entire bundle of updates can be
re-run in the form of enqueueing a RefreshLinksJob instance (these
jobs are idempotent). If several of the bundled tasks fail, it is easy
for DeferredUpdates to know that only one RefreshLinksJob should be
enqueued.
The goal is to make DataUpdate tasks more reliable and resilient.
Most of these deferred update failures are due to ephemeral problems
like lock contention. Since the job queue is already able to reliably
store and retry jobs, and the time that a regular web request can spend
in post-send is more limited, it makes the most sense to just enqueue
tasks as jobs if they fail post-send.
Make LinksUpdate no longer enqueueable as a RefreshLinksJob
since they are not very congruent (LinksUpdate only does some of the
work that RefreshLinksJob does). Only the wrapper, with the bundle of
DataUpdate instances, is congruent to RefreshLinksJob.
This change does not itself implement the enqueue-on-failure logic
in DeferredUpdates, but is merely a prerequisite.
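To illustrate why treating the bundle as one unit makes a later enqueue-on-failure step simple, here is a minimal sketch. DemoBundledUpdate, getRetryJobSpec() and the array-based queue are hypothetical names for this example. They are not the code in this change, and not the real DeferredUpdates or JobQueue API.

```php
<?php
// Hypothetical wrapper around several update tasks for one page.
class DemoBundledUpdate {
	private $pageId;
	private $updates;

	/**
	 * @param int $pageId
	 * @param callable[] $updates Tasks to run in order
	 */
	public function __construct( int $pageId, array $updates ) {
		$this->pageId = $pageId;
		$this->updates = $updates;
	}

	/** Run each task in order; the first exception aborts the bundle. */
	public function doUpdate(): void {
		foreach ( $this->updates as $update ) {
			$update();
		}
	}

	/** One idempotent job description that redoes the whole bundle. */
	public function getRetryJobSpec(): array {
		return [ 'type' => 'refreshLinks', 'pageId' => $this->pageId ];
	}
}

$ran = [];
$bundle = new DemoBundledUpdate( 42, [
	function () use ( &$ran ) {
		$ran[] = 'links';
	},
	function () {
		throw new RuntimeException( 'lock contention' );
	},
	function () {
		throw new RuntimeException( 'also broken' );
	},
] );

// The caller sees one unit, so it enqueues at most one retry job,
// no matter how many of the bundled tasks would have failed.
$queue = [];
try {
	$bundle->doUpdate();
} catch ( RuntimeException $e ) {
	$queue[] = $bundle->getRetryJobSpec();
}
```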
Bug: T206288
Change-Id: I191103c1aeff4c9fedbf524ee387dad9bdf5fab8
This adds a method to LinkFilter to build the query conditions necessary
to properly use it, and adjusts code to use it.
This also takes the opportunity to clean up the calculation of el_index:
IPs are handled more sensibly and IDNs are canonicalized.
Also weird edge cases for invalid hosts like "http://.example.com" and
corresponding searches like "http://*..example.com" are now handled more
regularly instead of being treated as if the extra dot were omitted,
while explicit specification of the DNS root like "http://example.com./"
is canonicalized to the usual implicit specification.
Note that this patch will break link searches for links where the host
is an IP or IDN until refreshExternallinksIndex.php is run.
Bug: T59176
Bug: T130482
Change-Id: I84d224ef23de22dfe179009ec3a11fd0e4b5f56d
While RefreshLinksJob is de-duplicated by page-id, it is possible
for two jobs to run for the same page ID if the second one was queued
after the first one started running. In that case the newer
one must not be skipped or ignored because it will have newer
information to record to the database, but it also has no way
to stop the old one, and we can't run them concurrently.
Instead of letting the lock exception mark the job as error,
making it implicitly retry, do this more explicitly, which avoids
logspam.
Bug: T170596
Co-Authored-By: Aaron Schulz <aschulz@wikimedia.org>
Change-Id: Id2852d73d00daf83f72cf5ff778c638083f5fc73
In some functions MediaWikiServices::getInstance() was called twice or
in loops. Extract it into a variable to reduce calls.
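A small before/after sketch of the refactoring. DemoServices is a hypothetical stand-in that counts getInstance() calls. It is not MediaWikiServices, and formatTitle() is invented for the example.

```php
<?php
// Hypothetical stand-in for a service locator; counts getInstance() calls.
class DemoServices {
	public static $calls = 0;
	private static $instance = null;

	public static function getInstance(): self {
		self::$calls++;
		if ( self::$instance === null ) {
			self::$instance = new self();
		}
		return self::$instance;
	}

	public function formatTitle( string $title ): string {
		return str_replace( ' ', '_', $title );
	}
}

$titles = [ 'Main Page', 'Help Desk', 'Village Pump' ];

// Before: one getInstance() call per iteration.
$before = [];
foreach ( $titles as $title ) {
	$before[] = DemoServices::getInstance()->formatTitle( $title );
}
$callsBefore = DemoServices::$calls;

// After: fetch the instance once, outside the loop.
DemoServices::$calls = 0;
$services = DemoServices::getInstance();
$after = [];
foreach ( $titles as $title ) {
	$after[] = $services->formatTitle( $title );
}
$callsAfter = DemoServices::$calls;
```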
Change-Id: I2705db11d7a9ea73efb9b5a5c40747ab0b3ea36f
The invalid UTF-8 could cause incorrect sorting of affected pages in
category lists on wikis using UCA collations. On my local testing
wiki, the generated cl_sortkey was just 0x30 regardless of the value
of cl_sortkey_prefix.
This doesn't fix existing bad data in the database. It will only be
updated when the affected page is edited (or null-edited).
The cl_timestamp field will also be updated when that happens, which
apparently may affect Wikinews' DynamicPageList extension, according
to comments on T27254. This is not easily avoidable.
Bug: T200623
Change-Id: I4baa9ea3c7f831ff3c9c51e6b8e5d66e7da42a91
This method returns the value used as cl_type for category links that
are "from" pages within the namespace, and is added to avoid duplication
of code across a few classes.
Change-Id: I4e55932a5a27858cfedb12009b455fcd02f9b5df
Adds a maintenance script to populate the field, runs it
automatically during update.php, and drops the no-longer-needed
default value on the column (where possible: mssql has some sort of
constraint thing going on, and I have no idea how it works).
Bug: T59176
Change-Id: I971edf013a1a39466aca3b6e34c915cb24fd3aa7
The hook handlers are likely to write to secondary databases, in which
case it is better to wrap the callback in its own transaction round.
This lowers the chance of pending write warnings happening in
runMasterTransactionIdleCallbacks() as well as DBTransactionError
exceptions in LBFactory due to recursion during commit.
Bug: T191282
Bug: T193668
Change-Id: Ie207ca312888b6bb076f783d41f05b701f70a52e