WikiPage: Document triggerOpportunisticLinksUpdate and related code

== History of WikiPage::triggerOpportunisticLinksUpdate ==

* 2007 (r19095; T10575; b3a8d488a8)

  Introduces the "cascading protection" feature.

  This commit added code to Article.php, in a conditional branch
  where we encountered a ParserCache "miss" and thus have done a
  fresh parse. The code in question would query which templates
  we ended up using, and if that differed from what the database
  said (e.g. stored during the last actual edit or links update),
  then a new LinksUpdate is ad-hoc constructed and executed.

  I could not find it spelled out anywhere, but my best guess is
  that this exists for pages whose wikitext transcludes a different
  page based on the current date and time (such as how most Wikipedia
  main pages transclude news and "Did you know" information from
  dated subpages that are prepared in advance). For such pages, we
  don't just want to re-render the page after a day has passed, we
  also want to re-do the links update to ensure the search index,
  category links, and "WhatLinksHere" are correct, and thus by
  extension, to make sure that cascading protection from the main
  page does in fact apply to the "current" set of subpages and
  templates actually in use.

* 2007 (r19227; 0c0c0eff81)

  This adds an optimisation to the added logic that limits it to
  pages that satisfy `mTitle->areRestrictionsCascading()`.

  Thus for most articles, which aren't protected at all, we don't
  run LinksUpdate mid-request after a cache miss page view.

  Because of this commit, the pre-2007 status quo remained unaltered
  and remains unaltered to this very day: We don't re-index
  categories, WhatLinksHere, etc. unless an article edit or a
  propagating template edit takes place.

* 2009 (r52888; 1353a8ba29)

  Introduces the PoolCounter feature.

  The logic in question moves to Article::doCascadeProtectionUpdates().

* 2015 (Iea952d4d2e66; df5ef8b5d7)

  The logic in question is changed, motivated by wanting to avoid
  DB writes during page views.

  * Instead of executing LinksUpdate mid-request, we now queue a
    RefreshLinksJob on the JobQueue, and utilize a newly added
    `prioritize => true` parameter.

  This commit also introduces a new feature, which is to queue
  RefreshLinksJob also for pages that do not have cascading
  protection, but that do satisfy a new boolean method
  called `$parserOutput->hasDynamicContent()`, which is set when
  the Parser encounters TTL-reducing magic words and functions
  such as {{CURRENTDAY}} and {{#time}}. For this new case, however,
  the `prioritize` parameter is not set, and this feature is disabled
  in WMF production (and other farms that enable wgMiserMode).

  This commit also renamed doCascadeProtectionUpdates()
  to triggerOpportunisticLinksUpdate().

  This commit also removed various documentation comments, which
  I've partly restored in this patch.
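
As a rough illustration of the decision described in the 2015 entry above,
here is a dependency-free sketch. It is not the MediaWiki implementation:
the function name, its parameters, and the returned labels are made up for
this example, and it leaves out the read-only check and the further
conditions the real method applies in the non-cascading branch.

```php
<?php
// Hypothetical sketch only. It models which kind of RefreshLinksJob
// (if any) gets queued after a ParserCache miss.

/**
 * @param bool $cascading Stand-in for $title->areRestrictionsCascading()
 * @param bool $miserMode Stand-in for the wgMiserMode setting
 * @param int $cacheExpiry Stand-in for $parserOutput->getCacheExpiry(), in seconds
 * @param int $defaultExpiry Stand-in for $wgParserCacheExpireTime, in seconds
 * @return string 'prioritized', 'normal', or 'none' (labels invented here)
 */
function sketchOpportunisticJob(
	bool $cascading,
	bool $miserMode,
	int $cacheExpiry,
	int $defaultExpiry
): string {
	if ( $cascading ) {
		// Cascade-protecting pages always get the prioritized job.
		return 'prioritized';
	}
	// Same comparison as hasReducedExpiry() in this patch.
	$hasReducedExpiry = $cacheExpiry < $defaultExpiry;
	if ( !$miserMode && $hasReducedExpiry ) {
		// Non-prioritized job; never reached when wgMiserMode is on.
		return 'normal';
	}
	return 'none';
}

echo sketchOpportunisticJob( true, true, 86400, 86400 ), "\n";
echo sketchOpportunisticJob( false, false, 3600, 86400 ), "\n";
echo sketchOpportunisticJob( false, true, 3600, 86400 ), "\n";
```

The second branch is the one that matters for T280605: any extension that
lowers the cache expiry lands in it, whether or not its output is "dynamic".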

== Actual changes ==

* Rename hasDynamicContent() to hasReducedExpiry() and keep the
  previous method as a non-deprecated wrapper.

  This change is motivated by T280605, in which I intend to make use
  of a Parser hook that reduces the cache expiry. There are numerous
  extensions in WMF production that already do this, and thus the
  assumption that these have "dynamic content" is already false in
  some cases. I'm not yet sure how or if to refactor this so as to
  allow reducing the TTL *without* causing this side-effect, but as a
  first step we can make the method more obvious in its impact
  and behaviour.

  I've also updated two of the callers that I think will benefit from
  this more explicit name and (current) implementation detail.

Bug: T280605
Change-Id: I85bdff7f86911f8ea5b866e3639f08ddd3f3bf6f
Timo Tijhof 2021-05-05 02:03:16 +01:00
parent 23cc232837
commit 481f1a49d6
4 changed files with 47 additions and 10 deletions

@@ -3629,10 +3629,16 @@ class WikiPage implements Page, IDBAccessObject, PageRecord {
}
/**
* Opportunistically enqueue link update jobs given fresh parser output if useful
* Opportunistically enqueue link update jobs after a fresh parser output was generated.
*
* This method should only be called by PoolWorkArticleViewCurrent, after a page view
* experienced a miss from the ParserCache, and a new ParserOutput was generated.
* Specifically, for load reasons, this method must not get called during page views that
* use a cached ParserOutput.
*
* @param ParserOutput $parserOutput Current version page output
* @since 1.25
* @internal For use by PoolWorkArticleViewCurrent
* @param ParserOutput $parserOutput Current version page output
*/
public function triggerOpportunisticLinksUpdate( ParserOutput $parserOutput ) {
if ( wfReadOnly() ) {
@@ -3653,11 +3659,23 @@ class WikiPage implements Page, IDBAccessObject, PageRecord {
];
if ( $this->mTitle->areRestrictionsCascading() ) {
// If the page is cascade protecting, the links should really be up-to-date
// In general, MediaWiki does not re-run LinksUpdate (e.g. for search index, category
// listings, and backlinks for Whatlinkshere), unless either the page was directly
// edited, or was re-generated following a template edit propagating to an affected
// page. As such, during page views when there is no valid ParserCache entry,
// we re-parse and save, but leave indexes as-is.
//
// We make an exception for pages that have cascading protection (perhaps for a wiki's
// "Main Page"). When such a page is re-parsed on-demand after a parser cache miss, we
// queue a high-priority LinksUpdate job, to ensure that we really protect all
// content that is currently transcluded onto the page. This is important, because
// wikitext supports conditional statements based on the current time, which enables
// transcluding of a different sub page based on which day it is, and then show that
// information on the Main Page, without the Main Page itself being edited.
JobQueueGroup::singleton()->lazyPush(
RefreshLinksJob::newPrioritized( $this->mTitle, $params )
);
} elseif ( !$config->get( 'MiserMode' ) && $parserOutput->hasDynamicContent() ) {
} elseif ( !$config->get( 'MiserMode' ) && $parserOutput->hasReducedExpiry() ) {
// Assume the output contains "dynamic" time/random based magic words.
// Only update pages that expired due to dynamic content and NOT due to edits
// to referenced templates/files. When the cache expires due to dynamic content,

@@ -116,6 +116,10 @@ class CacheTime implements ParserCacheMetadata, JsonUnserializable {
*
* Avoid using 0 if at all possible. Consider JavaScript for highly dynamic content.
*
* NOTE: Beware that reducing the TTL for reasons that do not relate to "dynamic content",
* may have the side-effect of incurring more RefreshLinksJob executions.
* See also WikiPage::triggerOpportunisticLinksUpdate.
*
* @param int $seconds
*/
public function updateCacheExpiry( $seconds ) {

@@ -757,8 +757,8 @@ class Parser {
}
$limitReport .= 'Cached time: ' . $this->mOutput->getCacheTime() . "\n";
$limitReport .= 'Cache expiry: ' . $this->mOutput->getCacheExpiry() . "\n";
$limitReport .= 'Dynamic content: ' .
( $this->mOutput->hasDynamicContent() ? 'true' : 'false' ) .
$limitReport .= 'Reduced expiry: ' .
( $this->mOutput->hasReducedExpiry() ? 'true' : 'false' ) .
"\n";
$limitReport .= 'Complications: [' . implode( ', ', $this->mOutput->getAllFlags() ) . "]\n";

@@ -1328,19 +1328,34 @@ class ParserOutput extends CacheTime {
}
/**
* Check whether the cache TTL was lowered due to dynamic content
* Check whether the cache TTL was lowered from the site default.
*
* When content is determined by more than hard state (e.g. page edits),
* such as template/file transclusions based on the current timestamp or
* extension tags that generate lists based on queries, this returns true.
*
* This method mainly exists to facilitate the logic in
* WikiPage::triggerOpportunisticLinksUpdate. As such, beware that reducing the TTL for
* reasons that do not relate to "dynamic content", may have the side-effect of incurring
* more RefreshLinksJob executions.
*
* @internal For use by Parser and WikiPage
* @since 1.37
* @return bool
*/
public function hasReducedExpiry() : bool {
global $wgParserCacheExpireTime;
return $this->getCacheExpiry() < $wgParserCacheExpireTime;
}
/**
* @see ParserOutput::hasReducedExpiry
* @return bool
* @since 1.25
*/
public function hasDynamicContent() {
global $wgParserCacheExpireTime;
return $this->getCacheExpiry() < $wgParserCacheExpireTime;
return $this->hasReducedExpiry();
}
/**