Gated behind the flag $wgParserEnableLegacyMediaDOM. The scattershot
usage of it is a little unfortunate but isn't expected to live very long
so maybe that's acceptable.
Further details can be found at,
https://www.mediawiki.org/wiki/Parsing/Media_structure
Bug: T51097
Bug: T266148
Bug: T271129
Change-Id: I978187f9f6e9e0a105521ab3e26821e36a96b911
This matches Parsoid and shouldn't be an issue since nothing in core is
generating <figure>s yet, presumably.
Bug: T51097
Change-Id: I9f489b13d5d4db0415a3f0f0dbb936c8e89461fa
ParserOptions::__construct() and ::newCanonical()
no longer accept null and fallback to the global
$wgUser - instead, ::__construct() has a typehint
for a UserIdentity, and ::newCanonical() will throw
an exception.
Bug: T284977
Change-Id: I35865e160190582ab10abaa696c6fc6686cc8989
Updates for the removal of the Revision class itself
and the various methods/hooks/variables removed in the
process, including:
- Update some documentation removing most references
to the Revision class and updating the MCR migration
notes to reflect the past tense for Revision methods.
- Change some capitalization from "Revision" to "revision"
to make it clear comments are about revisions in general,
not the Revision class in particular.
- Minor code tweaks including removing unused variables that
were around for the old hooks that were removed, and
removing the use of DeprecatablePropertyArray where no
longer needed for anything.
- Fix incorrect documentation for PageUpdater::getStatus(),
the status value changed a while ago to have revision-record
in addition to revision, and recently to only have the
revision-record, but ironically PageUpdater was never updated.
- Removed Parser::$mRevisionObject, used to be a Revision object
and was deprecated in 1.35, missed earlier because it was no
longer being set to Revision objects, always null.
- Add RevisionRecord typehints in DummyLinker to match those
in the corresponding Linker methods
This should be a no-op in terms of functionality.
Bug: T247143
Change-Id: I03bbb94fc29085855448780b1a5ad9063911ecc4
… including PHPDoc tags like `@return <type> $variableName`.
A return value doesn't have a variable name. I can see that
some people do this intentionally, repeating the variable
name that was used in the final `return $var;` at the end
of a method. This can indeed be helpful. I leave a lot of
these untouched and removed them only when it's obviously
wrong, or does not provide any additional information in
addition to what the code already says.
Change-Id: Ia18cd9f25ef658b08ad25b97a744897e2a8deffc
array_fill_keys() was introduced in PHP 5.2.0 and works like
array_flip() except that it does only one thing (copying keys) instead
of two things (copying keys and values). That makes it faster and more
obvious.
When array_flip() calls were paired, I left them as is, because that
pattern is too cute. I couldn't kill something so cute.
Sometimes it was hard to figure out whether the values in array_flip()
result were used. That's the point of this change. If you use
array_fill_keys(), the intention is obvious.
Change-Id: If8d340a8bc816a15afec37e64f00106ae45e10ed
== History of WikiPage::triggerOpportunisticLinksUpdate ==
* 2007 (r19095; T10575; b3a8d488a8)
Introduces the "cascading protection" feature.
This commit added code to Article.php, in a conditional branch
where we encountered a ParserCache "miss" and thus have done a
fresh parse. The code in question would query which templates
we ended up using, and if that differed from what the database
said (e.g. stored during the last actual edit or links update),
then a new LinksUpdate is ad-hoc constructed and executed.
I could not find it anywhere explicitly spelled out, but my best
guess is that the reason for this is to make sure that if the page
in question contains wikitext that trancludes a different page based
on the current date and time (such as how most Wikipedia main pages
transclude news information and "Did you know" information based on
dated subpages that are prepared in advance), then we don't just
want to re-render the page after a day has passed, we also want to
re-do the links update to ensure the search index, category links,
and "WhatLinksHere" is correct, and thus by extent, to make sure
that cascading protection from the main page does in fact apply
to the "current" set of subpages and templates actually in-use.
* 2007 (r19227; 0c0c0eff81)
This adds an optimisation to the added logic that limits it to
pages that satisfy `mTitle->areRestrictionsCascading()`.
Thus for most articles, which aren't protected at all, we don't
run LinksUpdate mid-request after a cache miss page view.
Because of this commit, the pre-2007 status quo remained unaltered
and has remains unaltered to this very day: We don't re-index
categories and WhatLinksHere etc, unless an article edit or
propagating template edit takes place.
* 2009 (r52888; 1353a8ba29)
Introduces the PoolCounter feature.
The logic in question moves to Article::doCascadeProtectionUpdates().
* 2015 (Iea952d4d2e66; df5ef8b5d7).
The logic in question is changed, motivated by wanting to avoid
DB writes during page views.
* Instead of executing LinksUpdate mid-request, we now queue a
RefreshLinksJob on the JobQueue, and utilize a newly added
`prioritize => true` parameter.
This commit also introduces a new feature, which is to queue
RefreshLinksJob also for pages that do not have cascading
protection, but that do satisfy a new boolean method
called `$parserOutput->hasDynamicContent()`, which is set when
the Parser encounters TTL-reducing magic words and functions
such as {{CURRENTDAY}} and {{#time}}. For this new case, however,
the `prioritize` parameter is not set, and this feature is disabled
in WMF production (and other farms that enable wgMiserMode).
This commit also renamed doCascadeProtectionUpdates()
to triggerOpportunisticLinksUpdate().
This commit also removed various documentation comments, which
I've partly restored in this patch, the patch you're looking at
now.
== Actual changes ==
* Rename hasDynamicContent() to hasReducedExpiry() and keep the
previous method as a non-deprecated wrapper.
This change is motivated by T280605, in which I intent to make use
of a Parser hook that reduces the cache expiry. There are numerous
extensions in WMF production that already do this, and thus the
assumption that these have "dynamic content" is already false in
some cases. I'm not yet sure how or if to refactor this so to allow
reducing of the TTL *without* causing this side-effect, but as a
first step we can make the method more obvious in its impact
and behaviour.
I've also updated two of the callers that I think will benefit from
this more explicit name and (current) implementation detail.
Bug: T280605
Change-Id: I85bdff7f86911f8ea5b866e3639f08ddd3f3bf6f
The following methods no longer support Revision parameters:
- CategoryMembershipChange::__construct
- ContentHandler::getUndoContent
- DerivedPageDataUpdater::prepareUpdate
- DifferenceEngine::getRevisionHeader
The following methods were removed entirely:
- Title::countAuthorsBetween
The following methods return arrays that formerly include
a 'revision' key that would emit deprecation warnings when
accessed and return a Revision object. The Revision object
has been removed from the arrays, and the 'revision-record'
key should be used to get the relevant RevisionRecord instead:
- PageUpdater::doModify
- PageUpdater::doCreate
- Parser::statelessFetchTemplate
The ParserOptions `templateCallback` option is a callback
that is called in Parser::fetchTemplateAndTitle() and should
return an array - the 'revision' key to that array used to
be a Revision object and was used if no 'revision-record'
was returned - it is now ignored.
Bug: T247143
Change-Id: I163ada88d649c75697aff4fa31a3a3c0bdef78b7
All hooks were previously hard deprecated
in 1.35. Affected hooks:
* ArticleRevisionUndeleted - use RevisionUndeleted
* ArticleRollbackComplete - use RollbackComplete
* DiffRevisionTools - use DiffTools
* DiffViewHeader - use DifferenceEngineViewHeader
* HistoryRevisionTools - use HistoryTools
* NewRevisionFromEditComplete - use RevisionFromEditComplete
* PageContentInsertComplete - use PageSaveComplete
* PageContentSaveComplete - use PageSaveComplete
* ParserFetchTemplate - use BeforeParserFetchTemplateRevisionRecord
* RevisionInsertComplete - use RevisionRecordInserted
* TitleMoveComplete - use PageMoveComplete
* TitleMoveCompleting - use PageMoveCompleting
* UndeleteShowRevision - no replacement
Includes a fix for setting the associated rev id
of page protections, which previously was only done
using $nullRevision which was a Revision object created
if any hooks needed it; those hooks were hard deprecated
and so for WMF prod the rev id was not being set.
Bug: T247143
Depends-On: Idfa345193ae99fb2f1c9a8f8d28d8d540a6e3d62
Change-Id: I519167f76a5a3c1f5410415b2721462a3dcc3ec8
Typehinting parameters that take the return value of these methods
with Language is not safe as they may return global $wgLang which
may or may not be instance of Language.
Bug: T278429
Change-Id: Ia5a71e4c39124f4427bd816e6e19207bb371cc6b
Cache misses in metadata were miscounted as miss.unserialize.
Count them as miss.absent.metadata instead.
Change-Id: Idff062325a34445478a4543709a9f2b3cc365f60
CachedBagOStuff caches negatives, so it breaks PoolCounter.
We only need to cache metadata in-process, since it's commonly
used twice within the request.
Bug: T277829
Change-Id: I11a147c24b6cdb275b521b48802d6f3d0e1a4387
Our PortableInfobox extension uses the HTML5 <aside> tag in its generated HTML.
This tag isn't recognized as a block element (in the way e.g. <div> is) by the
legacy parser, resulting in some spurious empty paragraphs in the output.
As a fix, make the legacy parser aware of <aside> tags to avoid unnecessary
p-wrapping. Also add <aside> to the Sanitizer's internal attribute check.
I3e57f55ac69d2c1ee8a1d41c21b692e56fc7e628 takes care of updating Parsoid-PHP
accordingly.
Bug: T278565
Change-Id: I89dbdf7770e13e1b62320228a366c64e64217b0b
ParserOptions not updated cause they depend on Title::getLanguage
implementation.
Tests converted to not require a DB anymore. Can't be proper unit
tests yet due to globals in ParserOptions and fake time hacks,
but exec time does go down from 70 seconds to 9 seconds.
Page content model is still emitted in the metrics since
it was considered useful. Should be removed when we get
something like a page type concept.
Change-Id: Ib16fd0b5b87ffc3cb4d21f4aa43d1203cb7206d2
ParserOutput object wraps revision ID and revision timestamp
of the parsed revision. Currently ParserCache sets these properties,
but it's not at all it's job - whatever generates the ParserOutput
knows much better what revision it parsed. This also allows us to
simplify ParserCache and easier switch it to PageRecord.
I've only removed setting the timestamp inside ParserCache
cause it's a blocker for page record, I will do followupus
to remove the $revId parameter from ParserCache as well.
cacheRevisionId should also be renamed, but later.
Bug: T278284
Change-Id: I9a82e9fd154b29a81d1f7a3c4abb073c9a27314e
We still need a lot of refactoring in ParserOptions
constructions, but for now converting the public interface
should be enough.
Change-Id: I04663c39ca037129b827b33555c3f59def5f9b59
Initializing the preprocessor in the constructor allows better
dependency injection, and removes code complexity caused by
lazy initialization. Any use of the parser is going to end
up creating the preprocessor in any case, so deferring the
initialization doesn't save any performance. (Best performance
is given by not creating the Parser in the first place if it
is not needed, which is what DI allows.)
Old code tried to unbreak cyclic dependencies by setting the
preprocessor to null. This is somewhat of a lost cause,
since there are a number of other cyclic dependencies
involving the parser, including StripState, LinkHolders,
etc. The code complexity is not worth it, given how
ineffective it is in any case.
This is part of T275160 in so far as it allows
Parser::getPreprocessor() to be a simple getter, and thus
(once this patch is merged) we can safely replace any
direct access to Parser::$mPreprocessor with a call to
Parser::getPreprocessor().
Bug: T275160
Change-Id: I38c6fe7d5a97badffdbf34d8b9d725756ed86514
This property was deprecated in 1.35.
Code search:
https://codesearch.wmcloud.org/search/?q=mTagHooks&i=nope&files=&excludeFiles=&repos=
Dependencies below are for WMF-deployed code. Other uses in
non-WMF-deployed code have been patched in:
* I435b0d1ccae9d9bf6fff85dc3e79d3c4b447eb37
* I85ef0e6ce3f0c818df85809d39259d13b56d966c
* Idab6c9475f78ff4040061f2f317560bbe41666d8
Bug: T275160
Depends-On: Ic5445471d770e396421a4fb2bcfbe1490a77e1bf
Depends-On: Ib708e3f84aa871de84aa56561c875f4a85bb000c
Change-Id: I42e23b101e870b66d169cbb731a0359e90f46265
This has effectively been the case since 1.35; this just cleans up the
remaining code which assumed it still needed to explicitly call
Parser::firstCallInit() on a newly-constructed Parser.
Bug: T250444
Change-Id: I340947c721172f12ff413322b4283627c0b0b3a4
A number of different argument variants were deprecated in 1.34,
and direct calls to the Parser constructor were deprecated at the
same time (a ParserFactory should be used instead). These were
hard-deprecated in 1.35. They should be safe to remove now.
Code search:
https://codesearch.wmcloud.org/deployed/?q=new%20Parser%5C%28&i=nope&files=%5C.php%24&excludeFiles=&repos=
Bug: T236811
Change-Id: I58f7b3ba1b1d62851b2db71197a8d9129e8d473d
Also copied the tests that used to be in TidyTest into
RemexDriverTest, so that we're not losing coverage when MWTidy is
eventually removed.
Bug: T198214
Change-Id: I0b301f6c98d0943ce4b6dc224f1066cb7bf244d1