* DeferrableUpdate classes can implement MergeableUpdate.
Duplicate updates will be merged via the merge() method.
* Make SquidUpdate support merge() so that duplicate URL
purges are caught across the entire pre-send request
execution (see the sketch below).
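A minimal sketch of the pattern (the class name and purge logic
here are illustrative, not the actual SquidUpdate code):

    class ExamplePurgeUpdate implements DeferrableUpdate, MergeableUpdate {
        /** @var string[] */
        private $urls;

        public function __construct( array $urls ) {
            $this->urls = $urls;
        }

        public function merge( MergeableUpdate $update ) {
            // Fold the other update's URLs into this one, dropping
            // duplicates so each URL is purged at most once per request.
            /** @var ExamplePurgeUpdate $update */
            $this->urls = array_unique( array_merge( $this->urls, $update->urls ) );
        }

        public function doUpdate() {
            // Send one de-duplicated batch of purges.
        }
    }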
Change-Id: Idffdd3e71d89e4a0f28281e65a881113caae497c
* PRESEND/POSTSEND constants can now be used in addUpdate()
and addCallableUpdate() to control when the update runs.
This is useful for updates that may report errors the client
should see, or just to get a head start on queued or pubsub-based
updates like CDN purges (see the example after this list). The
OutputPage::output() method can easily take a few hundred
milliseconds.
* Removed some argument backwards-compatibility code from
doUpdates().
* Also moved DeferrableUpdate to a separate file.
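For example (a hedged sketch; the callback body and URL list are
placeholders):

    // Runs before the response is flushed, so errors can reach
    // the client.
    DeferredUpdates::addCallableUpdate( function () {
        // ... work whose failure the client should see ...
    }, DeferredUpdates::PRESEND );

    // Runs after the response is flushed; fine for CDN purges.
    DeferredUpdates::addUpdate( new SquidUpdate( $urls ), DeferredUpdates::POSTSEND );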
Change-Id: I9831fe890f9f68f9ad8c4f4bba6921a8f29ba666
* Recursive link updates no longer mention any category changes.
It's hard to avoid either duplicate mentions of changes or
conflating explicit and automatic category changes.
* LinksUpdate no longer handles this logic, but rather WikiPage
decides to spawn this update when needed in doEditUpdates().
* Fix race conditions with calculating category deltas. Do not
rely on the link tables for the read used to determine these
writes, as they may be out-of-date due to slave lag. Using the
master would still not be good enough, since that would assume
FIFO and serialized job execution, which is not guaranteed.
Use the parser output of the relevant revisions to determine
the RC rows (see the sketch after this list). If 3 users
quickly edit a page's categories, the old way could
misattribute who actually changed what.
* Make sure RC rows are inserted in an order that matches that
of the corresponding revisions.
* Avoid mentioning time-based (parser function) category
changes so they don't get attributed to the next editor.
* Also wait for slaves between RC row insertions if there were
many category changes (in theory there could be well over
10K rows).
* Using a separate job better separates concerns as LinksUpdate
should not have to care about recent changes updates.
* Added more docs to $wgRCWatchCategoryMembership.
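A sketch of the delta calculation (simplified; the real logic
lives in the new job and also handles attribution, bot flags,
etc.):

    // Diff the categories of consecutive revisions using their
    // parser output rather than the possibly lagged links tables.
    $oldCategories = array_keys( $oldOutput->getCategories() );
    $newCategories = array_keys( $newOutput->getCategories() );
    $added = array_diff( $newCategories, $oldCategories );
    $removed = array_diff( $oldCategories, $newCategories );
    // Insert RC rows for $added/$removed, attributed to the
    // revision that actually made each change.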
Bug: T95501
Change-Id: I5863e7d7483a4fd1fa633597af66a0088ace4c68
If the SearchEngine does not support search-update, then
$indexTitle is not used, so there is no need to create it in
that case.
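Roughly (a sketch; the helper call stands in for the actual
title normalization code):

    if ( !$search->supports( 'search-update' ) ) {
        return; // engine ignores updates; skip building $indexTitle
    }
    $indexTitle = $this->indexTitle( $search ); // only built when needed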
Change-Id: I487d06274e921223a3bcb5af846b48b7c2b8065e
* This puts these updates behind the JobRunner lag checks.
* Added an HTMLCacheUpdateJob::newForBacklinks() method, which,
along with lazyPush(), obsoletes this class.
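Usage sketch (the 'templatelinks' table is just an example):

    JobQueueGroup::singleton()->lazyPush(
        HTMLCacheUpdateJob::newForBacklinks( $title, 'templatelinks' )
    );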
Bug: T95501
Change-Id: I934de63bb6fe9e9b7abae157f0b3b2dbfd3188e1
Do the LinksUpdateComplete hook updates in a separate
transaction as they may do slow SELECTs and updates.
Many DBPerformance warnings were triggered by such cases.
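A simplified sketch of the change's shape (the real patch manages
the transaction boundaries more carefully):

    // Flush our own writes first so slow hook handlers run in
    // their own transaction instead of extending ours.
    $this->mDb->commit( __METHOD__, 'flush' );
    Hooks::run( 'LinksUpdateComplete', [ &$this ] );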
Bug: T95501
Change-Id: Ie4e6b7f6aefc21bafba270282c55571ff5385fe0
So extensions like Echo are able to attribute post-edit link updates to
the specific users who triggered them.
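A sketch of the intended flow (assuming a setTriggeringUser() /
getTriggeringUser() pair on LinksUpdate):

    $update = new LinksUpdate( $title, $parserOutput, $recursive );
    $update->setTriggeringUser( $user ); // who caused this update
    DeferredUpdates::addUpdate( $update );

    // Later, e.g. in a LinksUpdateComplete hook handler:
    $who = $update->getTriggeringUser();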
Bug: T116485
Change-Id: I083736a174b6bc15e3ce60b2b107c697d0ac13da
* Focus on updating links that would *not* already be updated
by jobs, not those that already *will* be updated.
* Place the jobs into a dedicated queue so they don't wait
behind jobs that actually have to parse every time. This
helps avoid queue buildup (see the sketch after this list).
* Make Job::factory() set the command field to match the value
it had when enqueued. This makes it easier to have the same
job class used for multiple queues.
* Given the above, remove the RefreshLinksJob 'prioritize' flag.
This worked by overriding getType() so that the job went to a
different queue. This required both the special type *and* the
flag to be set when using JobSpecification; otherwise ack()
would route to the wrong queue and fail, or the job would go
in the regular queue. This was too messy and error-prone.
Cirrus jobs
using the same pattern also had ack() failures for example.
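A sketch of enqueueing into the dedicated queue (parameters
illustrative):

    JobQueueGroup::singleton()->lazyPush( new JobSpecification(
        'refreshLinksPrioritized', // dedicated queue, same job class
        [ 'rootJobTimestamp' => wfTimestampNow() ],
        [ 'removeDuplicates' => true ],
        $title
    ) );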
Change-Id: I5941cb62cdafde203fdee7e106894322ba87b48a
The new enqueue method of DeferredUpdates was turning LinksUpdate
updates into Jobs. However, RefreshLinksJob was not properly
reconstructing the secondary updates as recursive (when they
were recursive). As a result, pages that use a template were
not being updated when that template changed.
See also related T116001.
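The shape of the fix, roughly (a simplified sketch; param names
as used by RefreshLinksJob):

    // The recursive flag must survive the update-to-job round
    // trip, or pages embedding the template never get re-parsed.
    $job = new RefreshLinksJob( $title, [
        'recursive' => true,
        'table' => 'templatelinks',
    ] );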
Change-Id: Ia06246efb2034fdfe07232fd8c2334160edbcf02
* All callers are either using commit already or would be fine
using it (e.g. Maintenance scripts and JobRunner that have
no real transaction open).
Change-Id: I9f54b27619da6dac2cb63d255995aabc4ee78002
Previously the 'prioritize' flag was set in getAsJobSpecification()
but the job still went into the regular 'refreshLinks' queue. Since
the parser cache was just saved, these jobs should be fast to
run. The
separate queue stops them from getting stuck behind template updates.
Change-Id: I46c5760702d862dc9288618ee5eff3897f6e838b
This is part of a chain that reverts:
e412ff5ecc.
NOTE:
- The feature is disabled by default
- User settings default to hiding changes
- T109707 Touching a file on wikisource adds and
removes it from a category, even when the page
has no changes. See the linked issue, marked as
stalled with a possible way forward for this
patch.
@see https://gerrit.wikimedia.org/r/#/c/235467/
Changes since version 1:
- T109604 - Page names in the comment are no longer
URL-encoded and no longer contain underscores
- T109638 & T110338 - Reserved username now used
when we can't determine a username for the change
(we could perhaps set the user and id to be blank
in the RC table, but who knows what this might do)
- T109688 - History links are now disabled in RC
(acceptable for the introduction; can be worked
on more in the future)
- Categorization changes are now always patrolled
- Touching on T109672: with this change, emails will
never be sent regarding categorization changes
(this can of course be changed in a follow-up)
- Added $wgRCWatchCategoryMembership defaulting to true
for enabling / disabling the feature
- T109700 - for cases when no revision was retrieved
for a category change, set the bot flag to true.
This means all changes caused by parser functions
& Lua will be marked as bot, as will changes that
can't find their revision due to slave lag.
Bug: T9148
Bug: T109604
Bug: T109638
Bug: T109688
Bug: T109700
Bug: T110338
Bug: T110340
Change-Id: I51c2c1254de862f24a26ef9dbbf027c6c83e9063
* Remove $wgMaxSquidPurgeTitles; silently discarding URLs is
the wrong way to limit things. The purge methods already batch,
with HTCP pushing one at a time. Jobs already batch and have
$wgJobBackoffThrottling. If code does purge 500 URLs at once
with no limiting (flooding UDP), then it is already broken
(see the sketch after this list).
* Make the HTCP method protected.
* Rename $urlArr field.
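A sketch of the preferred shape (the batch size is illustrative):

    // Batch rather than discard; throttling paces the sends.
    foreach ( array_chunk( $urls, 256 ) as $batch ) {
        // ... send one purge batch ...
    }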
Change-Id: I17cced187fe7e93f5a5188022f12202a7746bdb7