Consolidate more logic into JobRunner::execute() and make
it public. Add "caught" field to the resulting map. The
intended use case for this method is JobExecutor. Calling
this method from there could cut down on code duplication.
Also:
* Use try/finally to restore state instead of ScopedCallback.
* Use more generic Throwable instead of Exception.
* Reorganize JobRunner::run() slightly for readability.
* Set class constant visibility and improve code comments.
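As a rough illustration of the try/finally and Throwable points above,
here is a minimal sketch. RunnerState and executeWithMode() are
hypothetical names invented for this example, not the actual JobRunner
code; only the "caught" key mirrors the field named in this message.

```php
<?php
// Hypothetical stand-in for process-wide state that a job may change.
final class RunnerState {
	public static $mode = 'web';
}

/**
 * Run a callback under a temporary mode and always restore the old mode.
 * The returned map has a "caught" entry listing caught throwable classes.
 */
function executeWithMode( callable $job, string $mode ): array {
	$oldMode = RunnerState::$mode;
	RunnerState::$mode = $mode;
	$status = false;
	$caught = [];
	try {
		$status = (bool)$job();
	} catch ( Throwable $e ) {
		// Throwable also covers Error subclasses, not only Exception.
		$caught[] = get_class( $e );
	} finally {
		// Runs on success and on failure, so no scope guard object is needed.
		RunnerState::$mode = $oldMode;
	}
	return [ 'status' => $status, 'caught' => $caught ];
}
```

A job that triggers an Error (for example intdiv( 1, 0 )) is still
reported in "caught", and the previous mode is restored either way.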
Bug: T243492
Change-Id: I90566a49c603aa78f45b35c0d3fc1925d2cfe2f8
Until I70473280, integer literals were always quoted as strings, because
the databases we support all have no problem with casting
string-literals for comparisons and such.
But it turned out that gave MySQL/MariaDB's planner problems in some
queries, so we changed it to not quote actual PHP integers.
But then we ran into the fact that PHP associative arrays don't preserve
the types of keys: PHP converts integer-like strings into actual
integers. And when those are passed to the DB unquoted for comparison
with a string-typed column, MySQL/MariaDB screws up the comparison while
PostgreSQL simply throws an error. Sigh.
This patch adds string casting to direct uses of array_keys() to supply
values for such query conditions. It doesn't change uses where the field
being compared is a numeric field.
If anyone knows of a good way to find indirect uses of array_keys() for
passing as $conds to IDatabase methods, please speak up!
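The key-casting problem and the string cast can be sketched as follows.
quoteValue() is a hypothetical, simplified stand-in for the quoting
behaviour described above (integers unquoted, everything else quoted);
it is not the real IDatabase implementation.

```php
<?php
// Hypothetical, simplified quoting: PHP integers go in unquoted.
function quoteValue( $value ): string {
	return is_int( $value ) ? (string)$value : "'" . addslashes( $value ) . "'";
}

// Build the value list of an IN (...) condition.
function makeInList( array $values ): string {
	return implode( ',', array_map( 'quoteValue', $values ) );
}

// Both keys were written as strings, but PHP stores '123' as int(123).
$counts = [ '123' => 1, 'Foo' => 1 ];

// Direct use of array_keys(): the first value ends up unquoted.
$unsafe = makeInList( array_keys( $counts ) );

// Casting the keys back to strings keeps every value quoted.
$safe = makeInList( array_map( 'strval', array_keys( $counts ) ) );

echo $unsafe, "\n"; // 123,'Foo'
echo $safe, "\n";   // '123','Foo'
```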
Change-Id: Ie72ee33437d492904e1495b3f4ebb1fcf0118f49
Until I70473280, integer literals were always quoted as strings, because
the databases we support all have no problem with casting
string-literals for comparisons and such.
Since then, PHP integers are no longer quoted. But PHP associative
arrays don't preserve the types of keys: PHP converts integer-like
strings into actual integers, which can result in errors:
WikiPage::updateCategoryCounts localhost 1292
Truncated incorrect DOUBLE value: 'A String Category' (localhost)
UPDATE
`category` SET cat_pages = cat_pages - 1 WHERE cat_title IN
(143434343434,'14343434string')
#0 includes\libs\rdbms\database\Database.php(1587):
Wikimedia\Rdbms\Database->getQueryExceptionAndLog(string, integer,
string, string)
#1 includes\libs\rdbms\database\Database.php(1166):
Wikimedia\Rdbms\Database->reportQueryError(string, integer, string,
string, boolean)
#2 includes\libs\rdbms\database\Database.php(2217):
Wikimedia\Rdbms\Database->query(string, string)
#3 includes\libs\rdbms\database\DBConnRef.php(68):
Wikimedia\Rdbms\Database->update(string, array, array, string)
#4 includes\libs\rdbms\database\DBConnRef.php(380):
Wikimedia\Rdbms\DBConnRef->__call(string, array)
#5 includes\page\WikiPage.php(3689):
Wikimedia\Rdbms\DBConnRef->update(string, array, array, string)
#6 includes\deferred\LinksUpdate.php(420):
WikiPage->updateCategoryCounts(array, array, integer)
#7 includes\deferred\LinksUpdate.php(315):
LinksUpdate->updateCategoryCounts(array, array)
#8 includes\deferred\LinksUpdate.php(193):
LinksUpdate->doIncrementalUpdate()
#9 includes\deferred\DeferredUpdates.php(416): LinksUpdate->doUpdate()
Also update some param docs
Change-Id: If77cf924af01a215977bfdc8f085c4e1f4c96cad
In several places, we're including rc_timestamp or other fields in a
query selecting on rc_this_oldid because there was historically no index
on the column.
The needed index was created by I0ccfd26d and deployed by T202167, so
let's remove the hacks.
Bug: T139012
Bug: T239772
Change-Id: Ic99760075bde6603c9f2ab3ee262f5a2878205c7
Repeating the variable name doesn't do anything. Documentation
generators don't need it. It's more stuff to read that doesn't add new
information. And it can become outdated.
Note there are two types of @var docs. When used inline (and not on a
class property) the variable name is needed.
Change-Id: If5a520405efacd8cefd90b878c999b842b91ac61
It is converted to a valid SQL string by the abstract database layer.
Also use an array for GROUP BY and the column alias.
Change-Id: I293a563607d115a42c8456c9b9ac66665d71d943
This patch fixes the nesting issue in the Doxygen navigation
by removing the @ingroup tag from class-level Markdown files.
These files will now appear as top-level files in the
navigation and are discoverable via links on the class pages.
Bug: T87796
Change-Id: I370bfea3bf2a6816724d04b15107658f1c336f0f
The FileRepo, FileBackend, and JobQueue classes include documentation
files that don’t appear in the generated Doxygen docs. This PR:
* Converts these files to Markdown
* Links to each file from the respective class description
* Adds an ingroup tag so the files appear in the sidebar at the
module level
* Updates the exclude pattern in the Doxyfile to surface these pages
Bug: T87796
Change-Id: I94f0636ab489d741ab505f15da43a5d63c1ca61a
After I86d26e494924eec24e7b1fb32c424ac1284be478 the job is
no longer instantiated on submission, only upon execution,
so deduplication flags and dedup info are no longer available
to the Kafka queue.
Bug: T204761
Depends-On: Ieb2604e65177736606aed351c6658b7df748dcee
Change-Id: Ibf95638a2ad218a83347db6749e2e7c9e8dbe0db
Use this in JobRunner to avoid overly sensitive lag timeouts and
log spam. The 3 second timeout is between the regular web default
and the CLI default.
Follow-up to e8df0fbab1.
Bug: T235244
Change-Id: I92f657a638031d913b0575d74bf48c3e3a63cd17
These were all checked with codesearch to ensure nothing is overriding
these methods.
For the most part, I've updated the signature to use nullable types; for
two Pagers, I've just made all parameters non-optional, because you're
already forced to pass them with a required parameter at the end.
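A minimal sketch of the nullable-type variant, assuming the affected
signatures had an optional parameter ahead of a required one (a pattern
PHP 8.0 deprecates). formatRow() is a hypothetical function used only
for illustration.

```php
<?php
// Before (hypothetical): the default is useless, since callers must
// still pass $title in order to reach the required $row.
// function formatRow( $title = null, $row ) { ... }

// After: a nullable type still accepts null, without a default value.
function formatRow( ?string $title, array $row ): string {
	return ( $title ?? '(untitled)' ) . ': ' . implode( ', ', $row );
}

echo formatRow( null, [ 'a', 'b' ] ), "\n"; // (untitled): a, b
```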
Bug: T231636
Change-Id: Ie047891f55fcd322039194cfa9a8549e4f1f6f14
After recent refactors of the jobs, the job params will contain
the title information if it's relevant. So, the getDeduplicationInfo
method of the job class no longer includes page namespace/title
explicitly, but it was never removed from the JobSpecification
class.
See fc5d51f129 (I9c9d0726d4066bb0a).
Bug: T204761
Change-Id: Ieb2604e65177736606aed351c6658b7df748dcee
These callers just need to load some data from DB_MASTER.
Subsequent code needing that latest title data should also use the
required flags, rather than relying on flaky global cache state.
Change-Id: I53248ea4b5bf1cd953f956c41b8244831ec5ef04
PHP doesn't care much, but I think we humans do, because we should
call methods by the names we gave them. Methods fixed:
- isOk() -> isOK()
- setOk() -> setOK()
- teardown() -> tearDown()
Change-Id: I6b3f0cf3902887058efa426968da380803869e0b
If JobRunner is called when replica transactions exist, the first job
would previously use the stale REPEATABLE-READ snapshot data.
Also clear any master connection snapshots via commitMasterChanges().
This makes the code more similar to DeferredUpdates::attemptUpdate().
Change-Id: I2157a91fb01ea8c233f964b1f3164e8c3b1a07ca
When a column specified in GROUP BY is both a table column and a
SELECT-field, MySQL and PostgreSQL both interpret it as the table column
rather than the SELECT-field.
In PostgreSQL this raises an error, which is good since it lets us know
it needs fixing. But MySQL goes ahead and groups by the table field
which gives us the wrong result.
Bug: T230618
Change-Id: Id500556b2795b86849329eece3b651b08e29a7f7
The purge() method handles purging of both file cache and CDN, using
a PRESEND deferred update. This avoids code duplication and missing
file cache purge calls.
Also:
* Migrate HTMLCacheUpdate callers to use HTMLCacheUpdateJob directly
* Add HtmlFileCacheUpdate class and defer such updates just like with CDN
* Simplify HTMLCacheUpdate constructor parameters
* Remove BacklinkCache::clear() calls which do nothing since the backlink
query does not actually happen until the job runs
Change-Id: Ic453b189a40109a73a9426538608eea87a76befa
File objects can contain closures which can't be serialized.
Instead, add makeWarningsSerializable(), which converts the warnings
to a serializable array. Make ApiUpload::transformWarnings() act on this
serializable array instead. For consistency, ApiUpload::getApiWarnings()
also needs to convert the result of checkWarnings() before transforming
it.
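To illustrate why the conversion is needed, here is a rough sketch.
FakeFile, canSerialize() and warningsToArray() are hypothetical names
made up for this example; the real makeWarningsSerializable() may work
differently.

```php
<?php
// Hypothetical stand-in for a File object that holds a closure.
class FakeFile {
	public $name;
	public $loader;

	public function __construct( string $name ) {
		$this->name = $name;
		$this->loader = function () {
			return 'data';
		};
	}
}

// serialize() throws an Exception when it reaches a Closure.
function canSerialize( $value ): bool {
	try {
		serialize( $value );
		return true;
	} catch ( Exception $e ) {
		return false;
	}
}

// Hypothetical conversion: swap objects for plain, serializable values.
function warningsToArray( array $warnings ): array {
	foreach ( $warnings as $key => $value ) {
		if ( $value instanceof FakeFile ) {
			$warnings[$key] = [ 'fileName' => $value->name ];
		}
	}
	return $warnings;
}
```

With a warnings array that contains a FakeFile, canSerialize() returns
false before the conversion and true after it.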
Bug: T228749
Change-Id: I8236aaf3683f93a03a5505803f4638e022cf6d85
If the root job timestamp keys are lost or otherwise unknown, they
will now be deductively recached with the best known values as jobs
are popped and executed. This means that running any of the many child
jobs of a root job can restore the root timestamp if it was lost.
This does not need to use the main stash given this fact.
Bug: T227376
Change-Id: Iae0f3af15803af048ff49f3bf281b2bde18c87f2
This is the preferred method as it enforces read-only mode for DB_REPLICA
and handles LoadBalancer::reuseConnection() calls automatically.
Change-Id: Iab9439ba8e0810fa14c302661ed7a3534f6bfc0d
* Remove logic for saving slow-to-render parser output. This has
not worked ever since DerivedPageDataUpdater was introduced.
* Make the logic to use cached output actually work. This was
also broken since DerivedPageDataUpdater was added. In order
to pass the output, add a known-revision-output parameter
to both WikiPage::doSecondaryUpdates() and
DerivedPageDataUpdater::prepareUpdate().
* Factor out some helper methods from runForTitle() in
RefreshLinksJob to make it more readable and avoid the need
for multiple transaction round commit calls. This makes the
case of multiple-title jobs less likely to break again.
* Make use of RefreshLinksJob::runForTitle() return value.
* Add unit tests for multiple-title job case.
Change-Id: I0cd13c424a87653b5a7253c42cd48fe43befd692