== History of WikiPage::triggerOpportunisticLinksUpdate ==
* 2007 (r19095; T10575; b3a8d488a8)
Introduces the "cascading protection" feature.
This commit added code to Article.php, in a conditional branch
where we encountered a ParserCache "miss" and thus have done a
fresh parse. The code in question would query which templates
we ended up using, and if that differed from what the database
said (e.g. stored during the last actual edit or links update),
then a new LinksUpdate is ad-hoc constructed and executed.
I could not find it anywhere explicitly spelled out, but my best
guess is that the reason for this is to make sure that if the page
in question contains wikitext that trancludes a different page based
on the current date and time (such as how most Wikipedia main pages
transclude news information and "Did you know" information based on
dated subpages that are prepared in advance), then we don't just
want to re-render the page after a day has passed, we also want to
re-do the links update to ensure the search index, category links,
and "WhatLinksHere" is correct, and thus by extent, to make sure
that cascading protection from the main page does in fact apply
to the "current" set of subpages and templates actually in-use.
* 2007 (r19227; 0c0c0eff81)
This adds an optimisation to the added logic that limits it to
pages that satisfy `mTitle->areRestrictionsCascading()`.
Thus for most articles, which aren't protected at all, we don't
run LinksUpdate mid-request after a cache miss page view.
Because of this commit, the pre-2007 status quo remained unaltered
and has remains unaltered to this very day: We don't re-index
categories and WhatLinksHere etc, unless an article edit or
propagating template edit takes place.
* 2009 (r52888; 1353a8ba29)
Introduces the PoolCounter feature.
The logic in question moves to Article::doCascadeProtectionUpdates().
* 2015 (Iea952d4d2e66; df5ef8b5d7).
The logic in question is changed, motivated by wanting to avoid
DB writes during page views.
* Instead of executing LinksUpdate mid-request, we now queue a
RefreshLinksJob on the JobQueue, and utilize a newly added
`prioritize => true` parameter.
This commit also introduces a new feature, which is to queue
RefreshLinksJob also for pages that do not have cascading
protection, but that do satisfy a new boolean method
called `$parserOutput->hasDynamicContent()`, which is set when
the Parser encounters TTL-reducing magic words and functions
such as {{CURRENTDAY}} and {{#time}}. For this new case, however,
the `prioritize` parameter is not set, and this feature is disabled
in WMF production (and other farms that enable wgMiserMode).
This commit also renamed doCascadeProtectionUpdates()
to triggerOpportunisticLinksUpdate().
This commit also removed various documentation comments, which
I've partly restored in this patch, the patch you're looking at
now.
== Actual changes ==
* Rename hasDynamicContent() to hasReducedExpiry() and keep the
previous method as a non-deprecated wrapper.
This change is motivated by T280605, in which I intent to make use
of a Parser hook that reduces the cache expiry. There are numerous
extensions in WMF production that already do this, and thus the
assumption that these have "dynamic content" is already false in
some cases. I'm not yet sure how or if to refactor this so to allow
reducing of the TTL *without* causing this side-effect, but as a
first step we can make the method more obvious in its impact
and behaviour.
I've also updated two of the callers that I think will benefit from
this more explicit name and (current) implementation detail.
Bug: T280605
Change-Id: I85bdff7f86911f8ea5b866e3639f08ddd3f3bf6f
This is micro-optimization of closure code to avoid binding the closure
to $this where it is not needed.
Created by I25a17fb22b6b669e817317a0f45051ae9c608208
Change-Id: I0ffc6200f6c6693d78a3151cb8cea7dce7c21653
Apparently I've missed a few instances where dynamic properties
were still written. To fix those we need some grace period for
ParserCache to expire when we would need to have get fallback,
so undeprecate reading dynamic properties.
Bug: T263851
Change-Id: I123605f7b5b907cc1b0ae6f183f3015b41835e8b
One major difference with what we've had before is that now we
actually write class names into the serialization - given that
this new mechanism is extencible, we can't establish any kind
of mapping of allowed classes. I do not think it's a problem
though.
Bug: T264394
Change-Id: Ia152f3b76b967aabde2d8a182e3aec7d3002e5ea
CacheTime::mUsedOptions and ParserOutput::mAccessedOptions
do exactly the same thing and has to be merged into a single property.
This patch adds forward-compatibility and needs to be deployed
at least one train before the patch which actually merges the properties.
Change-Id: Ic9d71a443994e2545ebf2a826b9155c82961cb88
This introduces a mechanism for encoding binary data in
strings set via setProperty(). This is needed to accommodate compressed
data as used by TemplateData, which uses gzip compression to make the
data fit into the page_props table.
Bug: T266200
Change-Id: I19fa0dea8c25d93fcdec9dc5ddd6f3c9c162b621
This adds JSON serialization and deserialization capabilities
to CacheTime and ParserOutput.
NOTE: JSON serialization is disabled for now. Merging this patch
should not change behavior in production.
Bug: T263579
Change-Id: I18187e8bce573d21f6f1bd29106e07c63a6d2f4d
This reverts commit a4dc6d82af.
I've reverted the merged patch since I didn't do enough testing
on serialized/reserialized ParserOutput and CacheTime. Now I'm
confident serialization/deserialization works.
Changes since original reverted version:
- Use __get/__set instead of DeprecationHelper in order to
avoid $deprecateProperties array to be serialized.
- Add test for old format serialization new format deserialization.
Change-Id: Ic911c2724ad709931d3316e609781fb89b5b7b28
This reverts commit 799c10b7eb.
Reason for revert: Didn't test how this would work with deserializing stored ParserOutput.
Change-Id: I4221bc26282f3b4bd044f0ab50d00e77eb57ede0
When ParserOutput encounters a bad page title in an editsection
placeholder, this should not cause a fatal error. We can just not
produce an edit link and continue.
It's still worth logging though, since the parser shouldn't be putting
invalid links into editsection placeholders.
Bug: T261347
Change-Id: I154e85aec4b408e659e6281b02473c51f370865d
* In preparation for ParserCache/Parsoid integration, it's nice to
do some cleanups. Will untie our hands a bit more.
* Verified no usages in extensions deployed at wikimedia, other then
Flow, fixed in the dependent patch.
Change-Id: Idd78413a36887e2ff5c902d410e55691cafb736b
Three new checks are now applied to user signatures in preferences:
* Disallow invalid HTML and lint errors (T140606)
Since 15e0e9bb4b we can rely on Parsoid to check the signature for
lint errors. (The old PHP Parser doesn't have this capability.)
Most importantly, this will disallow unclosed HTML tags. Unclosed
formatting tags like `<i>` (and also wikitext markup like `''`)
could affect the entire page with the bad markup.
New configuration variable $wgSignatureAllowedLintErrors is added
to allow ignoring some errors. The default value ignores the
'obsolete-tag' error (caused by HTML tags like `<font>` and `<tt>`.)
* Require a link to user page, talk page or contributions (T237700)
Various tools don't work correctly when such a link is missing. For
example, Echo notifications are not sent, DiscussionTools will not
allow replying to these comments, English Wikipedia's SineBot treats
these comments as unsigned.
Such requirement has been present for a long time in many Wikimedia
wikis' policies, but it was not enforced by software.
* Disallow "nested" substitution in signature (T230652)
Clever abuse of "subst" markup and tildes allows users to save edits
containing wikitext in which substitution occurs again when the page
is next saved. Disallow this in signatures, at least.
New configuration variable $wgSignatureValidation is added to control
what we do about the result of the validation described above. The
options are:
* 'warning':
Only displays a warning near the field on Special:Preferences if
the current signature is invalid. Signatures can still be changed
regardless of validity and will be used when signing comments.
* 'new':
In addition to the above, if a user tries to change their signature,
the new one must be valid. Existing invalid signatures are still
used when signing comments.
* 'disallow':
In addition to the above, existing invalid signatures are no longer
used when signing comments.
Bug: T140606
Bug: T237700
Bug: T230652
Change-Id: I07c575c2d9d2afe7a89c4847d16ac044417297bf
A terminating line break has not been required in wfDebug() since 2014,
however no migration was done. Some of these line breaks found their way
into LoggerInterface::debug() calls, where they mess up the formatting
of the debug log.
So, remove terminating line breaks from wfDebug() and
LoggerInterface::debug() calls.
Also:
* Fix the stripping of leading line breaks from the log header emitted
by Setup.php. This feature, accidentally broken in 2014, allows
requests to be distinguished in the log file.
* Avoid using the global variable $self.
* Move the logging of the client IP back to Setup.php. It was moved to
WebRequest in the hopes that it would not always be needed, however
$wgRequest->getIP() is now called unconditionally a few lines up in
Setup.php. This means that it is put in its proper place after the
"start request" message.
* Wrap the log header code in a closure so that variables like $name do
not leak into global scope.
* In Linker.php, remove a few instances of an unnecessary second
parameter to wfDebug().
Change-Id: I96651d3044a95b9d210b51cb8368edc76bebbb9e
Migrate all callers of Hooks::run() to use the new
HookContainer/HookRunner system.
General principles:
* Use DI if it is already used. We're not changing the way state is
managed in this patch.
* HookContainer is always injected, not HookRunner. HookContainer
is a service, it's a more generic interface, it is the only
thing that provides isRegistered() which is needed in some cases,
and a HookRunner can be efficiently constructed from it
(confirmed by benchmark). Because HookContainer is needed
for object construction, it is also needed by all factories.
* "Ask your friendly local base class". Big hierarchies like
SpecialPage and ApiBase have getHookContainer() and getHookRunner()
methods in the base class, and classes that extend that base class
are not expected to know or care where the base class gets its
HookContainer from.
* ProtectedHookAccessorTrait provides protected getHookContainer() and
getHookRunner() methods, getting them from the global service
container. The point of this is to ease migration to DI by ensuring
that call sites ask their local friendly base class rather than
getting a HookRunner from the service container directly.
* Private $this->hookRunner. In some smaller classes where accessor
methods did not seem warranted, there is a private HookRunner property
which is accessed directly. Very rarely (two cases), there is a
protected property, for consistency with code that conventionally
assumes protected=private, but in cases where the class might actually
be overridden, a protected accessor is preferred over a protected
property.
* The last resort: Hooks::runner(). Mostly for static, file-scope and
global code. In a few cases it was used for objects with broken
construction schemes, out of horror or laziness.
Constructors with new required arguments:
* AuthManager
* BadFileLookup
* BlockManager
* ClassicInterwikiLookup
* ContentHandlerFactory
* ContentSecurityPolicy
* DefaultOptionsManager
* DerivedPageDataUpdater
* FullSearchResultWidget
* HtmlCacheUpdater
* LanguageFactory
* LanguageNameUtils
* LinkRenderer
* LinkRendererFactory
* LocalisationCache
* MagicWordFactory
* MessageCache
* NamespaceInfo
* PageEditStash
* PageHandlerFactory
* PageUpdater
* ParserFactory
* PermissionManager
* RevisionStore
* RevisionStoreFactory
* SearchEngineConfig
* SearchEngineFactory
* SearchFormWidget
* SearchNearMatcher
* SessionBackend
* SpecialPageFactory
* UserNameUtils
* UserOptionsManager
* WatchedItemQueryService
* WatchedItemStore
Constructors with new optional arguments:
* DefaultPreferencesFactory
* Language
* LinkHolderArray
* MovePage
* Parser
* ParserCache
* PasswordReset
* Router
setHookContainer() now required after construction:
* AuthenticationProvider
* ResourceLoaderModule
* SearchEngine
Change-Id: Id442b0dbe43aba84bd5cf801d86dedc768b082c7
Updated Doxygen markup in several .php files triggering warnings when mwdocgen.php is executed. Removed
obsolete settings MSCGEN_PATH and TCL_SUBST from Doxyfile. The former would generate a warning in 1.8.16
while TCL support was removed in 1.8.18. Since TCL_SUBST was blank anyway, it was removed prior to getting
to .18 in production. Increased DOT_GRAPH_MAX_NODES from 50 to 200 since Doxygen complained about it being
too low for API and Maintenance.
Bug: T248706
Change-Id: I9c67f0807d1b43089d351263d4f591dee5501f36
This adds methods to ParserOutput ::addExtraCSPStyleSrc,
::addExtraCSPDefaultSrc, and ::addExtraCSPScriptSrc, to easily
allow parser tags/functions to add additional CSP sources if their
tag needs it. Previously such an extension would need to use
and OutputPage hook. This is modeled on how addModules() works.
The immediate use case is for Kartographer (T240960), although
its expected that lots of extensions might do something like this,
especially extensions used outside of Wikimedia.
Change-Id: I24e5f0b4edff58025a0c2a3e1a9aa3f62eb7db7b
Apparently the section edit links may depend on state that is
available through context in the Skin object, but not necessarily
through the global context, such as the current user and page title.
Allow ParserOutput::getText() to take a 'skin' option for this purpose.
Bug: T234868
Change-Id: Iaa83e5f801c7776bf8218d8ce7484e2485b227d4
Using @see is not enough description
Enable the php sniffs for now, but skip /tests/ to fix it later.
That avoids new issues in future patch sets
Change-Id: I49cb341a2880bfaeefb6bbfbb1717051ea3a4b16
And add a test which is confirmed to fail on HHVM prior to this change
with the error message "serialize(): "" returned as member variable from
__sleep() but does not exist".
Bug: T229366
Change-Id: I236bb4d64bc2e9f7756885e8c418399804eac5e1
This works similarly to speculative rev IDs with {{REVISIONID}}.
Re-parses can be avoided if the page ID is correctly guessed.
Also make the {{PAGEID:X}} parser function set vary-page-id.
Bug: T226785
Change-Id: I0b19be45e6ddd6cde330bfcd09d243e4e5beda01
This can be used to avoid double parsed on save if the prior output
can be reused in-spite of involving a self content reference.
Change-Id: Idcd30a3fa3f7012dac76ce8bbf46625453ae331f
DerivedPageDataUpdater::prepareContent already locks in the revision
timestamp before insertion, so inject that into the parser options
used for any pre-save parse (e.g for edit filters).
This means that a reparse is no longer needed within in the same save
request to get the post-save canonical output. A parse will still be
required if the edit filter output used an edit stash output, since
the revision timestamp is not set at stash time.
Instead of using vary-revision, add a vary-revision-timestamp flag
for the revision timestamp words. The month/day/hour variants retain
their prior optimizations for allowing edit stash output reuse for
the post-save canonical output.
Change-Id: Ic2c13db4d21197c79a89de0de56745ca32918eb6
Output::sectionEditLinksEnabled(), ParserOutput::getEditSectionTokens() and
::getTOCEnabled(), ::setEditSectionTokens(), ::setTOCEnabled have been removed.
Change-Id: I7fe927776e2451bafb96ef5c4ee500497ec3734c