Commit graph

253 commits

Author SHA1 Message Date
Petr Pchelko
92564edc7c Use Message::page instead of Message::title
Also modified new APIs added to ApiErrorFormatter to
use PageReference instead of Title.

Change-Id: I093c89f8e1e6d383603f887358be6ece70f23a02
2021-06-09 13:18:22 +00:00
Timo Tijhof
481f1a49d6 WikiPage: Document triggerOpportunisticLinksUpdate and related code
== History of WikiPage::triggerOpportunisticLinksUpdate ==

* 2007 (r19095; T10575; b3a8d488a8)

  Introduces the "cascading protection" feature.

  This commit added code to Article.php, in a conditional branch
  where we encountered a ParserCache "miss" and thus have done a
  fresh parse. The code in question would query which templates
  we ended up using, and if that differed from what the database
  said (e.g. stored during the last actual edit or links update),
  then a new LinksUpdate is ad-hoc constructed and executed.

  I could not find it anywhere explicitly spelled out, but my best
  guess is that the reason for this is to make sure that if the page
  in question contains wikitext that transcludes a different page based
  on the current date and time (such as how most Wikipedia main pages
  transclude news information and "Did you know" information based on
  dated subpages that are prepared in advance), then we don't just
  want to re-render the page after a day has passed, we also want to
  re-do the links update to ensure the search index, category links,
  and "WhatLinksHere" are correct, and thus by extension, to make sure
  that cascading protection from the main page does in fact apply
  to the "current" set of subpages and templates actually in-use.

* 2007 (r19227; 0c0c0eff81)

  This adds an optimisation to the added logic that limits it to
  pages that satisfy `mTitle->areRestrictionsCascading()`.

  Thus for most articles, which aren't protected at all, we don't
  run LinksUpdate mid-request after a cache miss page view.

  Because of this commit, the pre-2007 status quo remained unaltered
  and remains unaltered to this very day: We don't re-index
  categories and WhatLinksHere etc, unless an article edit or
  propagating template edit takes place.

* 2009 (r52888; 1353a8ba29)

  Introduces the PoolCounter feature.

  The logic in question moves to Article::doCascadeProtectionUpdates().

* 2015 (Iea952d4d2e66; df5ef8b5d7)

  The logic in question is changed, motivated by wanting to avoid
  DB writes during page views.

  * Instead of executing LinksUpdate mid-request, we now queue a
    RefreshLinksJob on the JobQueue, and utilize a newly added
    `prioritize => true` parameter.

  This commit also introduces a new feature, which is to queue
  RefreshLinksJob also for pages that do not have cascading
  protection, but that do satisfy a new boolean method
  called `$parserOutput->hasDynamicContent()`, which is set when
  the Parser encounters TTL-reducing magic words and functions
  such as {{CURRENTDAY}} and {{#time}}. For this new case, however,
  the `prioritize` parameter is not set, and this feature is disabled
  in WMF production (and other farms that enable wgMiserMode).

  This commit also renamed doCascadeProtectionUpdates()
  to triggerOpportunisticLinksUpdate().

  This commit also removed various documentation comments, which
  I've partly restored in this patch, the patch you're looking at
  now.
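
The branch conditions described in the history above can be sketched roughly as follows. This is a language-neutral model written in Python, not MediaWiki's PHP. `PageState`, `plan_opportunistic_update` and the returned dictionary are hypothetical names introduced for illustration; only the conditions themselves come from the description above.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class PageState:
    """Hypothetical stand-in for what is known after a ParserCache miss."""
    restrictions_cascading: bool  # models mTitle->areRestrictionsCascading()
    has_reduced_expiry: bool      # models $parserOutput->hasReducedExpiry()


def plan_opportunistic_update(page: PageState, miser_mode: bool) -> Optional[dict]:
    """Return parameters for a job to queue, or None when nothing is queued."""
    if page.restrictions_cascading:
        # Cascading protection: queue the job and request prioritization.
        return {"job": "RefreshLinksJob", "prioritize": True}
    if page.has_reduced_expiry and not miser_mode:
        # Reduced-expiry content: queue the job, `prioritize` is not set.
        # This branch is skipped entirely when wgMiserMode is enabled.
        return {"job": "RefreshLinksJob"}
    # Most articles: a page view does not trigger any links update.
    return None
```

Under these assumptions, an unprotected page with `{{CURRENTDAY}}` on a farm with wgMiserMode enabled yields `None`, which matches the note that the feature is disabled in WMF production.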

== Actual changes ==

* Rename hasDynamicContent() to hasReducedExpiry() and keep the
  previous method as a non-deprecated wrapper.

  This change is motivated by T280605, in which I intend to make use
  of a Parser hook that reduces the cache expiry. There are numerous
  extensions in WMF production that already do this, and thus the
  assumption that these have "dynamic content" is already false in
  some cases. I'm not yet sure how or if to refactor this so as to allow
  reducing of the TTL *without* causing this side-effect, but as a
  first step we can make the method more obvious in its impact
  and behaviour.

  I've also updated two of the callers that I think will benefit from
  this more explicit name and (current) implementation detail.

Bug: T280605
Change-Id: I85bdff7f86911f8ea5b866e3639f08ddd3f3bf6f
2021-05-05 02:03:30 +01:00
Umherirrender
8de3b7d324 Use static closures where safe to use
This is a micro-optimization of closure code to avoid binding the closure
to $this where it is not needed.

Created by I25a17fb22b6b669e817317a0f45051ae9c608208

Change-Id: I0ffc6200f6c6693d78a3151cb8cea7dce7c21653
2021-02-11 00:13:52 +00:00
Daniel Kinzler
48172f794d Revert "Hard-deprecate all public property access on CacheTime and ParserOutput."
This reverts commit b1a30eb0c4.

Reason for revert: T269396

Bug: T269396
Change-Id: I374ca13ccc30418b8fe3bf98f5090f7643aac4d7
2020-12-04 11:47:55 +00:00
Petr Pchelko
869962e7bb ParserOutput: temporary undeprecate getting dynamic properties.
Apparently I've missed a few instances where dynamic properties
were still written. To fix those we need a grace period for
ParserCache to expire, during which we need a get fallback,
so undeprecate reading dynamic properties.

Bug: T263851
Change-Id: I123605f7b5b907cc1b0ae6f183f3015b41835e8b
2020-12-02 15:35:40 +00:00
Petr Pchelko
b1a30eb0c4 Hard-deprecate all public property access on CacheTime and ParserOutput.
Bug: T263851
Change-Id: I3d3ff7b5b6899150df836e445b56896dfd5b887e
2020-11-19 10:12:41 -07:00
Petr Pchelko
b956c77d27 Merge CacheTime and ParserOutput accessedOptions properties
Change-Id: I5785596d68e8923f8bcbd182ace0b1991bd75c9a
2020-11-19 10:12:39 -07:00
Petr Pchelko
7c68ae9296 Safe ParserOutput extension data and JsonUnserializable helper.
One major difference with what we've had before is that now we
actually write class names into the serialization - given that
this new mechanism is extensible, we can't establish any kind
of mapping of allowed classes. I do not think it's a problem
though.

Bug: T264394
Change-Id: Ia152f3b76b967aabde2d8a182e3aec7d3002e5ea
2020-11-10 11:21:09 -07:00
Petr Pchelko
017cfcf016 Forward-compat for merging CacheTime and ParserOutput mOptions
CacheTime::mUsedOptions and ParserOutput::mAccessedOptions
do exactly the same thing and have to be merged into a single property.
This patch adds forward-compatibility and needs to be deployed
at least one train before the patch which actually merges the properties.

Change-Id: Ic9d71a443994e2545ebf2a826b9155c82961cb88
2020-11-10 07:09:41 -07:00
daniel
cac89b547c ParserOutput: add support for binary properties in JSON.
This introduces a mechanism for encoding binary data in
strings set via setProperty(). This is needed to accommodate compressed
data as used by TemplateData, which uses gzip compression to make the
data fit into the page_props table.
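
Why such a mechanism is needed can be sketched as follows: gzip output is arbitrary bytes, which a JSON string cannot carry directly. The sketch is written in Python rather than MediaWiki's PHP, and the marker keys (`_encoding_`, `_data_`) and function names are assumptions made for illustration, not necessarily what ParserOutput emits.

```python
import base64
import gzip
import json


def encode_property(value):
    """Wrap bytes in a JSON-safe dict; pass every other value through unchanged."""
    if isinstance(value, bytes):
        return {
            "_encoding_": "base64",
            "_data_": base64.b64encode(value).decode("ascii"),
        }
    return value


def decode_property(value):
    """Reverse encode_property() for wrapped values."""
    if isinstance(value, dict) and value.get("_encoding_") == "base64":
        return base64.b64decode(value["_data_"])
    return value


# A compressed blob, similar in spirit to what TemplateData stores in page_props.
blob = gzip.compress(b'{"description":"example"}')
properties = {"templatedata": blob, "displaytitle": "Example"}

# Round trip: encode each property, serialize to JSON, parse, decode again.
as_json = json.dumps({k: encode_property(v) for k, v in properties.items()})
restored = {k: decode_property(v) for k, v in json.loads(as_json).items()}
```

One limitation of this simplified version: a genuine property value that happens to be a dictionary with an `_encoding_` key would be misread on decode, so a real implementation needs a marker that cannot collide with user data.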

Bug: T266200
Change-Id: I19fa0dea8c25d93fcdec9dc5ddd6f3c9c162b621
2020-11-04 18:52:09 +01:00
Petr Pchelko
09c14b9dd0 Move serializability validation from ParserOutput to ParserCache
Bug: T263579
Change-Id: Iac2dbc817c2e7af4a6d112f01bd380a04354db22
2020-10-15 13:15:30 -07:00
daniel
600f64029f Use JSON for parser cache
This adds JSON serialization and deserialization capabilities
to CacheTime and ParserOutput.

NOTE: JSON serialization is disabled for now. Merging this patch
should not change behavior in production.

Bug: T263579
Change-Id: I18187e8bce573d21f6f1bd29106e07c63a6d2f4d
2020-10-13 16:28:52 -07:00
Petr Pchelko
13574e8404 Deprecate ParserCache::getKey and replace it with getMetadata
Bug: T263689
Change-Id: I4a71e5a7eb1c25cd53b857c115883cd00160736b
2020-10-13 08:31:22 -07:00
Petr Pchelko
1c70cca3ee Check if non-JSON-serializable data passed to ParserOutput
Bug: T264394
Change-Id: I6eedd03a81b95f6f55d25c00b31e01cbd8658d43
2020-10-05 10:54:08 -06:00
Ppchelko
3254e41a4c Revert "Revert "Revert "Hard deprecate all public properties in CacheTime and ParserOutput"""
This reverts commit deacee9088.

Bug: T264257
Change-Id: Ie68d8081a42e7d8103e287b6d6857a30dc522f75
2020-10-01 12:03:41 -06:00
jenkins-bot
b43b4c728f Merge "Revert "Revert "Hard deprecate all public properties in CacheTime and ParserOutput"""
2020-09-24 16:26:17 +00:00
Ppchelko
deacee9088 Revert "Revert "Hard deprecate all public properties in CacheTime and ParserOutput""
This reverts commit a4dc6d82af.

I've reverted the merged patch since I didn't do enough testing
on serialized/deserialized ParserOutput and CacheTime. Now I'm
confident serialization/deserialization works.

Changes since original reverted version:
 - Use __get/__set instead of DeprecationHelper in order to
   avoid the $deprecateProperties array being serialized.
 - Add a test for old-format serialization and new-format deserialization.

Change-Id: Ic911c2724ad709931d3316e609781fb89b5b7b28
2020-09-24 07:55:18 -07:00
Ppchelko
a4dc6d82af Revert "Hard deprecate all public properties in CacheTime and ParserOutput"
This reverts commit 799c10b7eb.

Reason for revert: Didn't test how this would work with deserializing stored ParserOutput.

Change-Id: I4221bc26282f3b4bd044f0ab50d00e77eb57ede0
2020-09-23 22:46:33 +00:00
daniel
e6f37dc1d8 ParserOutput: don't throw on bad editsection
When ParserOutput encounters a bad page title in an editsection
placeholder, this should not cause a fatal error. We can just not
produce an edit link and continue.

It's still worth logging though, since the parser shouldn't be putting
invalid links into editsection placeholders.

Bug: T261347
Change-Id: I154e85aec4b408e659e6281b02473c51f370865d
2020-09-23 22:30:59 +00:00
Petr Pchelko
799c10b7eb Hard deprecate all public properties in CacheTime and ParserOutput
* In preparation for ParserCache/Parsoid integration, it's nice to
  do some cleanups. Will untie our hands a bit more.
* Verified no usages in extensions deployed at Wikimedia, other than
  Flow, fixed in the dependent patch.

Change-Id: Idd78413a36887e2ff5c902d410e55691cafb736b
2020-09-23 07:17:13 -07:00
Ed Sanders
7683f7d839 Use strict (in)equality with namespaces constants when LHS is definitely an integer
Change-Id: I8fede00dfe1270d93c5d78d3c36e788cddfc8a99
2020-07-31 18:03:28 +01:00
Bartosz Dziewoński
df7231ad89 preferences: Signature validation (lint errors, user links, nested subst)
Three new checks are now applied to user signatures in preferences:

* Disallow invalid HTML and lint errors (T140606)

  Since 15e0e9bb4b we can rely on Parsoid to check the signature for
  lint errors. (The old PHP Parser doesn't have this capability.)

  Most importantly, this will disallow unclosed HTML tags. Unclosed
  formatting tags like `<i>` (and also wikitext markup like `''`)
  could affect the entire page with the bad markup.

  New configuration variable $wgSignatureAllowedLintErrors is added
  to allow ignoring some errors. The default value ignores the
  'obsolete-tag' error (caused by HTML tags like `<font>` and `<tt>`).

* Require a link to user page, talk page or contributions (T237700)

  Various tools don't work correctly when such a link is missing. For
  example, Echo notifications are not sent, DiscussionTools will not
  allow replying to these comments, English Wikipedia's SineBot treats
  these comments as unsigned.

  Such a requirement has been present for a long time in many Wikimedia
  wikis' policies, but it was not enforced by software.

* Disallow "nested" substitution in signature (T230652)

  Clever abuse of "subst" markup and tildes allows users to save edits
  containing wikitext in which substitution occurs again when the page
  is next saved. Disallow this in signatures, at least.

New configuration variable $wgSignatureValidation is added to control
what we do about the result of the validation described above. The
options are:

* 'warning':
  Only displays a warning near the field on Special:Preferences if
  the current signature is invalid. Signatures can still be changed
  regardless of validity and will be used when signing comments.

* 'new':
  In addition to the above, if a user tries to change their signature,
  the new one must be valid. Existing invalid signatures are still
  used when signing comments.

* 'disallow':
  In addition to the above, existing invalid signatures are no longer
  used when signing comments.
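
The cumulative behaviour of the three options above can be sketched as a small decision table. This is a Python model for illustration only, not the PHP implementation: the function names are hypothetical, and raising an error for an unlisted mode is a choice of this sketch rather than documented behaviour.

```python
VALIDATION_MODES = ("warning", "new", "disallow")


def _check_mode(mode):
    # Sketch-only guard: reject anything other than the three listed options.
    if mode not in VALIDATION_MODES:
        raise ValueError(f"unknown validation mode: {mode!r}")


def shows_warning(mode, current_is_valid):
    """All three modes warn on Special:Preferences about an invalid signature."""
    _check_mode(mode)
    return not current_is_valid


def may_save_signature(mode, new_is_valid):
    """'warning' accepts any change; 'new' and 'disallow' require validity."""
    _check_mode(mode)
    return True if mode == "warning" else new_is_valid


def uses_existing_signature(mode, current_is_valid):
    """Only 'disallow' stops an existing invalid signature from being used."""
    _check_mode(mode)
    return current_is_valid if mode == "disallow" else True
```

For example, under 'new' a user with an old invalid signature keeps signing with it (`uses_existing_signature("new", False)` is true) but cannot replace it with another invalid one (`may_save_signature("new", False)` is false).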

Bug: T140606
Bug: T237700
Bug: T230652
Change-Id: I07c575c2d9d2afe7a89c4847d16ac044417297bf
2020-06-24 01:20:05 +02:00
Tim Starling
47a1619027 Remove terminating line breaks from debug messages
A terminating line break has not been required in wfDebug() since 2014,
however no migration was done. Some of these line breaks found their way
into LoggerInterface::debug() calls, where they mess up the formatting
of the debug log.

So, remove terminating line breaks from wfDebug() and
LoggerInterface::debug() calls.

Also:
* Fix the stripping of leading line breaks from the log header emitted
  by Setup.php. This feature, accidentally broken in 2014, allows
  requests to be distinguished in the log file.
* Avoid using the global variable $self.
* Move the logging of the client IP back to Setup.php. It was moved to
  WebRequest in the hopes that it would not always be needed, however
  $wgRequest->getIP() is now called unconditionally a few lines up in
  Setup.php. This means that it is put in its proper place after the
  "start request" message.
* Wrap the log header code in a closure so that variables like $name do
  not leak into global scope.
* In Linker.php, remove a few instances of an unnecessary second
  parameter to wfDebug().

Change-Id: I96651d3044a95b9d210b51cb8368edc76bebbb9e
2020-06-03 12:01:16 +10:00
Tim Starling
68c433bd23 Hooks::run() call site migration
Migrate all callers of Hooks::run() to use the new
HookContainer/HookRunner system.

General principles:
* Use DI if it is already used. We're not changing the way state is
  managed in this patch.
* HookContainer is always injected, not HookRunner. HookContainer
  is a service, it's a more generic interface, it is the only
  thing that provides isRegistered() which is needed in some cases,
  and a HookRunner can be efficiently constructed from it
  (confirmed by benchmark). Because HookContainer is needed
  for object construction, it is also needed by all factories.
* "Ask your friendly local base class". Big hierarchies like
  SpecialPage and ApiBase have getHookContainer() and getHookRunner()
  methods in the base class, and classes that extend that base class
  are not expected to know or care where the base class gets its
  HookContainer from.
* ProtectedHookAccessorTrait provides protected getHookContainer() and
  getHookRunner() methods, getting them from the global service
  container. The point of this is to ease migration to DI by ensuring
  that call sites ask their local friendly base class rather than
  getting a HookRunner from the service container directly.
* Private $this->hookRunner. In some smaller classes where accessor
  methods did not seem warranted, there is a private HookRunner property
  which is accessed directly. Very rarely (two cases), there is a
  protected property, for consistency with code that conventionally
  assumes protected=private, but in cases where the class might actually
  be overridden, a protected accessor is preferred over a protected
  property.
* The last resort: Hooks::runner(). Mostly for static, file-scope and
  global code. In a few cases it was used for objects with broken
  construction schemes, out of horror or laziness.

Constructors with new required arguments:
* AuthManager
* BadFileLookup
* BlockManager
* ClassicInterwikiLookup
* ContentHandlerFactory
* ContentSecurityPolicy
* DefaultOptionsManager
* DerivedPageDataUpdater
* FullSearchResultWidget
* HtmlCacheUpdater
* LanguageFactory
* LanguageNameUtils
* LinkRenderer
* LinkRendererFactory
* LocalisationCache
* MagicWordFactory
* MessageCache
* NamespaceInfo
* PageEditStash
* PageHandlerFactory
* PageUpdater
* ParserFactory
* PermissionManager
* RevisionStore
* RevisionStoreFactory
* SearchEngineConfig
* SearchEngineFactory
* SearchFormWidget
* SearchNearMatcher
* SessionBackend
* SpecialPageFactory
* UserNameUtils
* UserOptionsManager
* WatchedItemQueryService
* WatchedItemStore

Constructors with new optional arguments:
* DefaultPreferencesFactory
* Language
* LinkHolderArray
* MovePage
* Parser
* ParserCache
* PasswordReset
* Router

setHookContainer() now required after construction:
* AuthenticationProvider
* ResourceLoaderModule
* SearchEngine

Change-Id: Id442b0dbe43aba84bd5cf801d86dedc768b082c7
2020-05-30 14:23:28 +00:00
Reedy
b038d6333a Fix even more PSR12.Properties.ConstantVisibility.NotFound
Change-Id: I6d98efcfac1f1c0ab6a442e0af6d5daa6ef7801a
2020-05-16 00:28:41 +00:00
Holger Knust
471d2371ab doxygen: Changed Doxygen tags causing warnings during documentation generation
Updated Doxygen markup in several .php files triggering warnings when mwdocgen.php is executed. Removed
obsolete settings MSCGEN_PATH and TCL_SUBST from Doxyfile. The former would generate a warning in 1.8.16
while TCL support was removed in 1.8.18. Since TCL_SUBST was blank anyway, it was removed prior to getting
to .18 in production. Increased DOT_GRAPH_MAX_NODES from 50 to 200 since Doxygen complained about it being
too low for API and Maintenance.

Bug: T248706
Change-Id: I9c67f0807d1b43089d351263d4f591dee5501f36
2020-04-14 03:25:19 +00:00
Brian Wolff
89be2c5820 Allow storing additional CSP sources in ParserOutput
This adds the ParserOutput methods ::addExtraCSPStyleSrc,
::addExtraCSPDefaultSrc, and ::addExtraCSPScriptSrc, to easily
allow parser tags/functions to add additional CSP sources if their
tag needs it. Previously such an extension would need to use
an OutputPage hook. This is modeled on how addModules() works.

The immediate use case is for Kartographer (T240960), although
it's expected that lots of extensions might do something like this,
especially extensions used outside of Wikimedia.

Change-Id: I24e5f0b4edff58025a0c2a3e1a9aa3f62eb7db7b
2020-03-12 17:39:51 -07:00
jenkins-bot
8a19838915 Merge "ApiParse: Use the right Skin object for building section edit links"
2020-01-24 22:01:28 +00:00
Bartosz Dziewoński
965b788178 ApiParse: Use the right Skin object for building section edit links
Apparently the section edit links may depend on state that is
available through context in the Skin object, but not necessarily
through the global context, such as the current user and page title.

Allow ParserOutput::getText() to take a 'skin' option for this purpose.

Bug: T234868
Change-Id: Iaa83e5f801c7776bf8218d8ce7484e2485b227d4
2020-01-24 18:53:20 +01:00
James D. Forrester
0958a0bce4 Coding style: Auto-fix MediaWiki.Usage.IsNull.IsNull
Change-Id: I90cfe8366c0245c9c67e598d17800684897a4e27
2020-01-10 14:17:13 -08:00
Umherirrender
b1a38362f3 Add missing @param and @return to documentation
Using @see alone is not enough of a description.

Enable the php sniffs for now, but skip /tests/ to fix it later.
That avoids new issues in future patch sets.

Change-Id: I49cb341a2880bfaeefb6bbfbb1717051ea3a4b16
2019-11-13 17:26:55 +01:00
James D. Forrester
2bc660c95a Collapse uses of now-deprecated wfGetRusage()
Change-Id: I9a2b5d1234ebb458b6cd29797de3f387d1399e6f
2019-10-22 11:32:06 +01:00
Max Semenik
8a98dd9d59 Convert some private static arrays to constants
Remove @since for some private ones as we don't guarantee anything
about private class members.

Change-Id: Ifb898353c02082e9ef69d67f69339345c6cd154d
2019-10-16 01:30:54 +00:00
Daimona Eaytoy
95dc119527 Fix new phan errors, part 2
Still mostly doc-only.

Bug: T231636
Change-Id: I65cec6c716ce6859e14da00a12ef71e03603e59a
2019-10-12 10:35:09 +00:00
Tim Starling
bdedfb8ffa Suppress notice from ParserOutput::__sleep()
Bug: T229366
Change-Id: I8f0a537f0b6b76aac0c52e691ec4653c51c49940
2019-08-01 10:11:12 +10:00
Tim Starling
212ae934cd Revert rename of mSpeculativeRevId to speculativeRevIdUsed
And add a test which is confirmed to fail on HHVM prior to this change
with the error message "serialize(): "" returned as member variable from
__sleep() but does not exist".

Bug: T229366
Change-Id: I236bb4d64bc2e9f7756885e8c418399804eac5e1
2019-07-31 02:42:27 +00:00
Tim Starling
ae116da889 Code cleanup related to initSpeculativePageId()
Change-Id: I5b97c6292a28df6633c573a05c89210b096db5a8
2019-07-26 16:41:00 +10:00
Aaron Schulz
5099ee9f72 parser: add speculative page IDs to use with {{PAGEID}}
This works similarly to speculative rev IDs with {{REVISIONID}}.
Re-parses can be avoided if the page ID is correctly guessed.

Also make the {{PAGEID:X}} parser function set vary-page-id.

Bug: T226785
Change-Id: I0b19be45e6ddd6cde330bfcd09d243e4e5beda01
2019-07-26 16:41:00 +10:00
Aaron Schulz
dd6ed7840f parser: add vary-revision-sha1 and related ParserOutput methods
This can be used to avoid double parses on save if the prior output
can be reused in spite of involving a self content reference.

Change-Id: Idcd30a3fa3f7012dac76ce8bbf46625453ae331f
2019-07-17 05:12:18 +00:00
jenkins-bot
20e65a1915 Merge "parser: inject the time for {{REVISIONTIMESTAMP}} on pre-save parse"
2019-06-23 22:13:48 +00:00
Reedy
2c35b5be5f Add some @since tags to ParserOutput::SUPPORTS_ constants
Change-Id: I2f6588fe563ed5c1dc5ef2a70e2ed59fdca99018
2019-06-20 15:33:00 +01:00
Aaron Schulz
e85fe191c9 parser: inject the time for {{REVISIONTIMESTAMP}} on pre-save parse
DerivedPageDataUpdater::prepareContent already locks in the revision
timestamp before insertion, so inject that into the parser options
used for any pre-save parse (e.g. for edit filters).

This means that a reparse is no longer needed within the same save
request to get the post-save canonical output. A parse will still be
required if the edit filter output used an edit stash output, since
the revision timestamp is not set at stash time.

Instead of using vary-revision, add a vary-revision-timestamp flag
for the revision timestamp words. The month/day/hour variants retain
their prior optimizations for allowing edit stash output reuse for
the post-save canonical output.

Change-Id: Ic2c13db4d21197c79a89de0de56745ca32918eb6
2019-06-09 13:12:57 +01:00
Aaron Schulz
19ab538705 parser: list the vary-* flags in the NewPP report HTML comment
Change-Id: I5a4afba2bfdb5b5b56ba0a01ed8ff444a67fbb1a
2019-05-29 10:58:56 -07:00
Reedy
9f2ffdfbd4 Remove "Squiz.WhiteSpace.FunctionSpacing" from phpcs exclusions
Change-Id: I78b3315f26ab91b6b443f5b028a635552f82f5a3
2019-05-11 02:44:26 +01:00
jenkins-bot
bec3717780 Merge "Remove deprecated unused method getModuleScripts()"
2019-05-10 17:00:06 +00:00
jenkins-bot
d28eb17a50 Merge "Missing space between variable name and docstring"
2019-05-10 08:51:37 +00:00
Adam Wight
1fa31b61a0 Missing space between variable name and docstring
Change-Id: I308a0de17da5058691ceea983e60c064b0bfc16c
2019-05-10 10:25:56 +02:00
Derick Alangi
820d33ac96 Remove deprecated unused method getModuleScripts()
Deprecated in 1.33 and no longer used. Removed from OutputPage and
from ParserOutput.

Usage
=====

https://codesearch.wmflabs.org/search/?q=%5CbgetModuleScripts%5Cb&i=nope&files=&repos=

Bug: T220656
Change-Id: Ifddea94504d0c749d3a77daf967d5fec95b50339
2019-05-09 23:33:53 +01:00
Derick Alangi
b4e557f8f8 Remove several deprecated unused methods from OutputPage & ParserOutput
OutputPage::sectionEditLinksEnabled(), ParserOutput::getEditSectionTokens(),
::getTOCEnabled(), ::setEditSectionTokens() and ::setTOCEnabled() have been removed.

Change-Id: I7fe927776e2451bafb96ef5c4ee500497ec3734c
2019-05-09 16:39:28 +01:00
Fomafix
f17c297624 Use short assignment operator in PHP
Use
  $var .= $foo
instead of
  $var = $var . $foo

Change-Id: I5dcdd7278e618c14968e5ac1fb8ea43ac2200deb
2019-03-07 09:55:49 +01:00