617 commits

Author SHA1 Message Date
Petr Pchelko
c8136454cd Add test for JobRunner
Bug: T220127
Change-Id: I35ff5f97d3e8e677c5c236723df6f74b5e21214d
2020-02-07 11:32:17 -05:00
Aaron Schulz
6aa41cbd37 jobqueue: cleanup JobRunner for readability and code reuse
Consolidate more logic into JobRunner::execute() and make
it public. Add "caught" field to the resulting map. The
intended use case for this method is JobExecutor. Calling
this method from there could cut down on code duplication.

Also:
* Use try/finally to restore state instead of ScopedCallback.
* Use more generic Throwable instead of Exception.
* Reorganize JobRunner::run() slightly for readability.
* Set class constant visibility and improve code comments.

Bug: T243492
Change-Id: I90566a49c603aa78f45b35c0d3fc1925d2cfe2f8
2020-01-29 21:00:39 +00:00
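Note on the commit above: the try/finally and Throwable points can be illustrated with a minimal sketch. The function name, the `timeout` state key, and the shape of the returned map are hypothetical and are not taken from JobRunner::execute(); in particular, what the real "caught" field holds is an assumption here.

```php
<?php
// Hypothetical helper for illustration only; not MediaWiki's JobRunner::execute().
function executeWithRestoredState( callable $job, array &$state ): array {
	$oldTimeout = $state['timeout'];
	$state['timeout'] = 3; // temporary value while the job runs
	$caught = [];
	try {
		$ok = (bool)$job();
	} catch ( Throwable $e ) {
		// Throwable covers both Exception and Error (e.g. TypeError).
		$ok = false;
		$caught[] = get_class( $e );
	} finally {
		// Runs on success and on failure, so the state is always restored.
		$state['timeout'] = $oldTimeout;
	}
	return [ 'ok' => $ok, 'caught' => $caught ];
}
```

Catching only Exception would let a TypeError escape; with try/finally the old state still comes back without a separate scoped-callback object.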
James D. Forrester
0958a0bce4 Coding style: Auto-fix MediaWiki.Usage.IsNull.IsNull
Change-Id: I90cfe8366c0245c9c67e598d17800684897a4e27
2020-01-10 14:17:13 -08:00
James D. Forrester
4f2d1efdda Coding style: Auto-fix MediaWiki.Classes.UnsortedUseStatements.UnsortedUse
Change-Id: I94a0ae83c65e8ee419bbd1ae1e86ab21ed4d8210
2020-01-10 09:32:25 -08:00
Umherirrender
b583747df8 Set method visibility for Job::run implementation
Change-Id: Iee7ccd8765542f243d3555abbe805d75d4b1ea2c
2019-12-21 21:33:32 +00:00
Brad Jorsch
b26248235b Add string casts when using array_keys() with SQL query conditions
Until I70473280, integer literals were always quoted as strings, because
the databases we support all have no problem with casting
string-literals for comparisons and such.

But it turned out that gave MySQL/MariaDB's planner problems in some
queries, so we changed it to not quote actual PHP integers.

But then we run into the fact that PHP associative arrays don't preserve
the types of keys; PHP converts integer-like strings into actual
integers. And when those are passed to the DB unquoted for comparison
with a string-typed column, MySQL/MariaDB screws up the comparison while
PostgreSQL simply throws an error. Sigh.

This patch adds string casting to direct uses of array_keys() to supply
values for such query conditions. It doesn't change uses where the field
being compared is a numeric field.

If anyone knows of a good way to find indirect uses of array_keys() for
passing as $conds to IDatabase methods, please speak up!

Change-Id: Ie72ee33437d492904e1495b3f4ebb1fcf0118f49
2019-12-16 16:05:18 -05:00
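Note on the commit above: the key-type behaviour it describes can be reproduced in plain PHP. This is a sketch with made-up category names; casting with array_map( 'strval', ... ) is one way to do the string cast and is assumed here rather than quoted from the patch.

```php
<?php
// Hypothetical category titles used as array keys; one looks like an integer.
$categories = [ '2019' => true, 'Example_category' => true ];

// PHP stores the key '2019' as the integer 2019, so array_keys() returns an int.
$keys = array_keys( $categories );

// Casting every key back to a string keeps the values quoted as strings
// when they are later compared against a string-typed column.
$stringKeys = array_map( 'strval', $keys );
```

Only keys that are canonical decimal integers are converted; a key such as '0123' stays a string.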
Umherirrender
8acea5491d Handle database error with LinksUpdate and numeric category names
Until I70473280, integer literals were always quoted as strings, because
the databases we support all have no problem with casting
string-literals for comparisons and such.

PHP associative arrays don't preserve
the types of keys; PHP converts integer-like strings into actual
integers, which can result in errors:

WikiPage::updateCategoryCounts	localhost	1292
Truncated incorrect DOUBLE value: 'A String Category' (localhost)
UPDATE
`category` SET cat_pages = cat_pages - 1 WHERE cat_title IN
(143434343434,'14343434string')
#0 includes\libs\rdbms\database\Database.php(1587):
Wikimedia\Rdbms\Database->getQueryExceptionAndLog(string, integer,
string, string)
#1 includes\libs\rdbms\database\Database.php(1166):
Wikimedia\Rdbms\Database->reportQueryError(string, integer, string,
string, boolean)
#2 includes\libs\rdbms\database\Database.php(2217):
Wikimedia\Rdbms\Database->query(string, string)
#3 includes\libs\rdbms\database\DBConnRef.php(68):
Wikimedia\Rdbms\Database->update(string, array, array, string)
#4 includes\libs\rdbms\database\DBConnRef.php(380):
Wikimedia\Rdbms\DBConnRef->__call(string, array)
#5 includes\page\WikiPage.php(3689):
Wikimedia\Rdbms\DBConnRef->update(string, array, array, string)
#6 includes\deferred\LinksUpdate.php(420):
WikiPage->updateCategoryCounts(array, array, integer)
#7 includes\deferred\LinksUpdate.php(315):
LinksUpdate->updateCategoryCounts(array, array)
#8 includes\deferred\LinksUpdate.php(193):
LinksUpdate->doIncrementalUpdate()
#9 includes\deferred\DeferredUpdates.php(416): LinksUpdate->doUpdate()

Also update some param docs

Change-Id: If77cf924af01a215977bfdc8f085c4e1f4c96cad
2019-12-06 19:17:56 +01:00
jenkins-bot
1f77235c14 Merge "Replace deprecated lSize with lLen" 2019-12-05 19:51:14 +00:00
jenkins-bot
2b04ef6657 Merge "Set method visibility for various constructors" 2019-12-05 10:23:34 +00:00
Brad Jorsch
152376376e Remove hacks for lack of index on rc_this_oldid
In several places, we're including rc_timestamp or other fields in a
query selecting on rc_this_oldid because there was historically no index
on the column.

The needed index was created by I0ccfd26d and deployed by T202167, so
let's remove the hacks.

Bug: T139012
Bug: T239772
Change-Id: Ic99760075bde6603c9f2ab3ee262f5a2878205c7
2019-12-04 16:00:02 -05:00
Umherirrender
0688dd7c6d Set method visibility for various constructors
Change-Id: Id3c88257e866923b06e878ccdeddded7f08f2c98
2019-12-03 20:17:30 +01:00
Paladox
fac9054e3f Replace deprecated lSize with lLen
lSize is an alias to lLen according to [1]

[1] 9f4ededa41/README.markdown (L2148)

Bug: T239734
Change-Id: I5b72fbe61e313511b69e8d2e96c2042742370b85
2019-12-03 18:20:15 +00:00
Paladox
afad1a43e4 Avoid using deprecated phpredis::delete() alias
Bug: T227461
Change-Id: I5eb2fa42d61e4757b11b6eb909c04dafb40923a1
2019-12-02 22:56:27 +00:00
jenkins-bot
dce7e7c384 Merge "Remove duplicate variable name from class property PHPDocs" 2019-12-02 16:09:39 +00:00
Thiemo Kreuz
78ca9eff4a Remove duplicate variable name from class property PHPDocs
Repeating the variable name doesn't do anything. Documentation
generators don't need it. It's more stuff to read that doesn't add new
information. And it can become outdated.

Note there are two types of @var docs. When used inline (and not on a
class property) the variable name is needed.

Change-Id: If5a520405efacd8cefd90b878c999b842b91ac61
2019-12-02 12:58:29 +00:00
Umherirrender
2cc97d844a Use array for 'ORDER BY'
The abstract database layer converts it to a valid SQL string.

Also use arrays for GROUP BY and column aliases.

Change-Id: I293a563607d115a42c8456c9b9ac66665d71d943
2019-11-29 23:01:07 +01:00
jenkins-bot
af4b49e67c Merge "Improve param docs" 2019-11-28 19:36:24 +00:00
Umherirrender
c7ad21c25f Improve param docs
Change-Id: I746a69f6ed01c3ff000da125457df62b02d13b34
2019-11-28 19:08:59 +01:00
apaskulin
4d24ad65c9 docs: Remove ingroup tag from Markdown files
This patch fixes the nesting issue in the Doxygen navigation
by removing the @ingroup tag from class-level Markdown files.
These files will now appear as top-level files in the
navigation and are discoverable via links on the class pages.

Bug: T87796
Change-Id: I370bfea3bf2a6816724d04b15107658f1c336f0f
2019-11-12 16:11:30 -08:00
Thiemo Kreuz
50dca8ed1e Replace some oldskool @see with @inheritDoc
This patch also adds some missing newlines at the beginning of files.

Change-Id: Ifcdf75396c96f17b7bfb103f54bfdf4ba4dfbccc
2019-11-08 18:00:27 +00:00
jenkins-bot
605bf24772 Merge "docs: Convert class-level documentation files to Markdown" 2019-11-01 01:43:29 +00:00
apaskulin
80221c0aff docs: Convert class-level documentation files to Markdown
The FileRepo, FileBackend, and JobQueue classes include documentation
files that don’t appear in the generated Doxygen docs. This PR:

* Converts these files to Markdown
* Links to each file from the respective class description
* Adds an ingroup tag so the files appear in the sidebar at the
  module level
* Updates the exclude pattern in the Doxyfile to surface these pages

Bug: T87796
Change-Id: I94f0636ab489d741ab505f15da43a5d63c1ca61a
2019-10-31 11:03:13 -07:00
Petr Pchelko
a738dd647a Return deduplication to CategoryMembershipJob
After I86d26e494924eec24e7b1fb32c424ac1284be478 the job is
no longer instantiated on submission, only upon execution,
so deduplication flags and dedup info are no longer available
to the Kafka queue.

Bug: T204761
Depends-On: Ieb2604e65177736606aed351c6658b7df748dcee
Change-Id: Ibf95638a2ad218a83347db6749e2e7c9e8dbe0db
2019-10-29 06:10:22 +00:00
jenkins-bot
7badf41667 Merge "rdbms: add ILBFactory::setDefaultReplicationWaitTimeout() method" 2019-10-27 00:44:31 +00:00
James D. Forrester
2bc660c95a Collapse uses of now-deprecated wfGetRusage()
Change-Id: I9a2b5d1234ebb458b6cd29797de3f387d1399e6f
2019-10-22 11:32:06 +01:00
Aaron Schulz
d5f5dd2a52 rdbms: add ILBFactory::setDefaultReplicationWaitTimeout() method
Use this in JobRunner to avoid overly sensitive lag timeouts and
log spam. The 3 second timeout is between the regular web default
and the CLI default.

Follow-up to e8df0fbab1.

Bug: T235244
Change-Id: I92f657a638031d913b0575d74bf48c3e3a63cd17
2019-10-21 20:05:30 -07:00
Daimona Eaytoy
95dc119527 Fix new phan errors, part 2
Still mostly doc-only.

Bug: T231636
Change-Id: I65cec6c716ce6859e14da00a12ef71e03603e59a
2019-10-12 10:35:09 +00:00
Daimona Eaytoy
e3412efac3 Unsuppress PhanParamReqAfterOpt, use PHP71 nullable types
These were all checked with codesearch to ensure nothing is overriding
these methods.
For the most part, I've updated the signature to use nullable types; for
two Pagers, I've just made all parameters non-optional, because you're
already forced to pass them with a required parameter at the end.

Bug: T231636
Change-Id: Ie047891f55fcd322039194cfa9a8549e4f1f6f14
2019-10-10 11:53:58 +02:00
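Note on the commit above: a minimal sketch of the signature change it describes, using a hypothetical function rather than any method touched by the patch. An optional parameter placed before a required one can never actually be omitted, so a PHP 7.1 nullable type states the same intent more honestly.

```php
<?php
// Before (the pattern flagged by PhanParamReqAfterOpt):
//   function describeRange( $start = null, $end ) { ... }
// After: $start is required but may be null.
function describeRange( ?int $start, int $end ): string {
	return ( $start === null ? '' : (string)$start ) . '..' . $end;
}
```

Callers pass null explicitly, e.g. describeRange( null, 5 ), which they already had to do under the old signature.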
Petr Pchelko
ef51ecc6db jobqueue: Remove 'title' and 'namespace' from JobSpecification dedup info
After recent refactors of the jobs, the job params will contain
the title information if it's relevant. So, the getDeduplicationInfo
method of the job class no longer includes page namespace/title
explicitly, but it was never removed from the JobSpecification
class.

See fc5d51f129 (I9c9d0726d4066bb0a).

Bug: T204761
Change-Id: Ieb2604e65177736606aed351c6658b7df748dcee
2019-10-02 18:08:26 +00:00
Aaron Schulz
a5c7fd0db2 Move callers away from Title::GAID_FOR_UPDATE
These callers just need to load some data from DB_MASTER.
Subsequent code needing that latest title data should also use the
required flags, rather than relying on flakey global cache state.

Change-Id: I53248ea4b5bf1cd953f956c41b8244831ec5ef04
2019-09-09 13:19:08 -07:00
Daimona Eaytoy
e2e543f7c2 Unsuppress more phan issues (part 5)
Bug: T231636
Depends-On: I6e5fba7bd273219b1206559420b5bdb78734aa84
Change-Id: I50377746f01749b058c39fd8229f9d566224cc43
2019-09-01 09:48:31 +00:00
Derick Alangi
52a21ace03 Fix method/function names case mismatch in core files
PHP doesn't care much but I think we humans do because we should
call methods by the name we give them. Methods fixed are:

- isOk() -> isOK()
- setOk() -> setOK()
- teardown() -> tearDown()

Change-Id: I6b3f0cf3902887058efa426968da380803869e0b
2019-08-31 23:17:51 +00:00
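Note on the commit above: PHP resolves method names case-insensitively, which is why the mismatched calls worked before the fix. The class below is a hypothetical stand-in, not MediaWiki's Status class.

```php
<?php
// Hypothetical class for illustration only.
class ExampleStatus {
	private $ok = true;

	public function isOK(): bool {
		return $this->ok;
	}

	public function setOK( bool $ok ): void {
		$this->ok = $ok;
	}
}

$status = new ExampleStatus();
$status->setOk( false );          // wrong case, still resolves to setOK()
$viaWrongCase = $status->isOk();  // wrong case, still resolves to isOK()
$viaDeclaredCase = $status->isOK();
```

Both calls return the same value, so the mismatch is a readability and tooling problem rather than a runtime error.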
Derick Alangi
b5445185c3 jobqueue: Avoid usage of deprecated MWHttpRequest::factory()
Change-Id: I58c007436d38e4d0edd1ce14034b2f3bfb536df9
2019-08-30 21:55:04 +00:00
Daimona Eaytoy
f18af0b61f Remove more Oracle and Mssql leftovers
Follows-up 4d10bb14e8 and 807d793ab9.

According to codesearch [0], these were the last usages. Note that this
patch leaves two constants in place, IDatabase::DBO_SYSDBA and
DBO_DDLMODE. These are public constants used "mostly for oracle" according
to the docs, but maybe we could find other use cases in the future (?).

[0] - https://codesearch.wmflabs.org/core/?q=oracle%7Cmssql&i=fosho&files=%5C.%5B%5Ej%5Cd%5D%7Cen%5C.json&repos=

Bug: T230418
Change-Id: Ibfb748b4b23b885a77f4de161af4bf2ab9649a89
2019-08-25 17:21:49 +00:00
Aaron Schulz
0844db0b6d Make the JobRunner flushReplicaSnapshots() call cover the first job
If JobRunner is called when replica transactions exist, the first job
would previously use the stale REPEATABLE-READ snapshot data.

Also clear any master connection snapshots via commitMasterChanges().
This makes the code more similar to DeferredUpdates::attemptUpdate().

Change-Id: I2157a91fb01ea8c233f964b1f3164e8c3b1a07ca
2019-08-25 14:07:24 +00:00
DannyS712
a4835b43c7 docs: Fix typos for 'parameter' and 'perform'
Bug: T201491
Change-Id: I37ed48907bf7c1a1d4ebab7b10b41a77623eba8a
2019-08-20 09:45:52 +00:00
Brad Jorsch
2479041060 RecentChangesUpdateJob: Fix GROUP BY
When a column specified in GROUP BY is both a table column and a
SELECT-field, MySQL and PostgreSQL both interpret it as the table column
rather than the SELECT-field.

In PostgreSQL this raises an error, which is good since it lets us know
it needs fixing. But MySQL goes ahead and groups by the table field
which gives us the wrong result.

Bug: T230618
Change-Id: Id500556b2795b86849329eece3b651b08e29a7f7
2019-08-16 14:02:56 -04:00
Aaron Schulz
97cbaa349e Cleanup JobQueueDB::recycleAndDeleteStaleJobs() use of IDatabase::affectedRows()
Bug: T229456
Change-Id: Ie22085ddba66f122e59e93baaf9b53c76b5ce448
2019-08-08 10:59:26 +00:00
Daniel Kinzler
aa4da3c2e8 Revert "Add small HtmlCacheUpdater service class to normalize purging code"
This reverts commit 35da1bbd7c.

Reason for revert: wrong tab, wrong patch. Ooops.

Change-Id: I5828fff6308d43460a3b2b10f60996409181f8b3
2019-08-07 13:56:30 +00:00
Aaron Schulz
35da1bbd7c Add small HtmlCacheUpdater service class to normalize purging code
The purge() method handles purging of both file cache and CDN, using
a PRESEND deferred update. This avoids code duplication and missing
file cache purge calls.

Also:
* Migrate HTMLCacheUpdate callers to just directly using HTMLCacheUpdateJob
* Add HtmlFileCacheUpdate class and defer such updates just like with CDN
* Simplify HTMLCacheUpdate constructor parameters
* Remove BacklinkCache::clear() calls which do nothing since the backlink
  query does not actually happen until the job runs

Change-Id: Ic453b189a40109a73a9426538608eea87a76befa
2019-08-06 13:45:27 -07:00
Tim Starling
51e837f68f Don't try to store File objects to the upload session
File objects can contain closures which can't be serialized.

Instead, add makeWarningsSerializable(), which converts the warnings
to a serializable array. Make ApiUpload::transformWarnings() act on this
serializable array instead. For consistency, ApiUpload::getApiWarnings()
also needs to convert the result of checkWarnings() before transforming
it.

Bug: T228749
Change-Id: I8236aaf3683f93a03a5505803f4638e022cf6d85
2019-07-26 16:15:30 +10:00
jenkins-bot
5476aa9bea Merge "Switch various LoadBalancer::getConnection() callers to getConnectionRef()" 2019-07-14 00:22:39 +00:00
jenkins-bot
672808c859 Merge "jobqueue: migrate root job deduplication to the WAN cache" 2019-07-13 23:30:56 +00:00
Aaron Schulz
43c78e83d7 jobqueue: migrate root job deduplication to the WAN cache
If the root job timestamp keys are lost or otherwise unknown, they
will now be deductively recached with the best known values as jobs
are popped and executed. This means that running any of the many child
jobs of a root job can restore the root timestamp if it was lost.
This does not need to use the main stash given this fact.

Bug: T227376
Change-Id: Iae0f3af15803af048ff49f3bf281b2bde18c87f2
2019-07-13 15:38:17 -07:00
Aaron Schulz
f72ae0f6e6 Switch various LoadBalancer::getConnection() callers to getConnectionRef()
This is the preferred method as it enforces read-only mode for DB_REPLICA
and handles LoadBalancer::reuseConnection() calls automatically.

Change-Id: Iab9439ba8e0810fa14c302661ed7a3534f6bfc0d
2019-07-12 10:56:30 -07:00
Aaron Schulz
d4cb1968c8 Reduce contention of getScopedLockAndFlush() callers by using the DB domain in the key
Change-Id: Ie9fb6a9ff384c72cca559f74d8e409d108207ae3
2019-07-11 22:23:09 +00:00
jenkins-bot
4d393de9d8 Merge "jobqueue: fix IDEA warnings in JobQueueRedis" 2019-07-11 21:02:08 +00:00
Aaron Schulz
1758a57245 jobqueue: fix IDEA warnings in JobQueueRedis
Change-Id: I7258191cbae22028d76a52c005f44b7347bd86aa
2019-07-11 07:22:32 +00:00
jenkins-bot
316509d908 Merge "Various fixes and simplifications to RefreshLinksJob::runTitle()" 2019-07-11 06:19:24 +00:00
Aaron Schulz
f588586e16 Various fixes and simplifications to RefreshLinksJob::runTitle()
* Remove logic for saving slow-to-render parser output. This has
  not worked ever since DerivedPageDataUpdater was introduced.
* Make the logic to use cached output actually work. This was
  also broken since DerivedPageDataUpdater was added. In order
  to pass the output, add a known-revision-output parameter
  to both WikiPage::doSecondaryUpdates() and
  DerivedPageDataUpdater::prepareUpdate().
* Also factored out some helper methods from runForTitle() in
  RefreshLinksJob to make it more readable and avoid the need
  for multiple transaction round commit calls. This makes the
  case of multiple-title jobs less likely to break again.
* Make use of RefreshLinksJob::runForTitle() return value.
* Add unit tests for multiple-title job case.

Change-Id: I0cd13c424a87653b5a7253c42cd48fe43befd692
2019-07-11 06:06:02 +00:00