Implicitly marking parameter $... as nullable is deprecated in PHP 8.4;
the explicit nullable type must be used instead.
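As a rough illustration of the kind of change involved (a sketch, not code
from this patch; describeTitle() is a hypothetical function), the deprecated
implicit form is shown in a comment and the explicit form is the one that runs:

```php
<?php
declare( strict_types=1 );

// Deprecated in PHP 8.4: the null default implicitly makes the type nullable.
//   function describeTitle( string $title = null ): string { ... }

// Explicit nullable type; behaviour is unchanged, only the declaration differs.
// describeTitle() is a hypothetical example, not a MediaWiki function.
function describeTitle( ?string $title = null ): string {
	return $title === null ? '(no title)' : "Title: $title";
}

echo describeTitle(), "\n";
echo describeTitle( 'Main Page' ), "\n";
```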
Created with autofix from Ide15839e98a6229c22584d1c1c88c690982e1d7a
Break one long line in SpecialPage.php
Bug: T376276
Change-Id: I807257b2ba1ab2744ab74d9572c9c3d3ac2a968e
This is the third patch of a series of patches to remove
ParserOutput::getText() calls from core. This series of patches should
be functionally equivalent to I2b4bcddb234f10fd8592570cb0496adf3271328e.
Here we temporarily introduce runOutputPipeline in ParserOutput. It
creates and runs the pipeline with default options, and is called by
getText. (This is not entirely truthful because we go through a
runPipelineInternal transient method for null-argument-passing reasons,
but let's not over-complicate this commit message.)
getText is responsible for maintaining the current behaviour,
that is, "disallow cloning of the ParserOutput and put the text back
as it was", to mitigate T353257. As we get rid of getText, this
behaviour should be moved, if necessary, to the call site.
The new method is currently added to ParserOutput so that further
refactorings are, for the moment, simpler. It will eventually be moved
to another place within the Content framework.
We also rename 'suppressClone' to 'allowClone' (which is actually its
negation) to avoid multiple levels of negations that make the code
confusing. Note that the default value of 'allowClone' is true, and is
currently overridden in two places: getText and
OutputPage::getParserOutputText (which calls the pipeline directly and
not through ParserOutput).
Bug: T293512
Bug: T371022
Change-Id: Ibf04af1079aaa1934dc78685b00e636ff4d38a9a
The refactorings in I45951a49e57a8031887ee6e4546335141d231c18 replaced
calls to ParserOutput::getText() with direct invocations of the pipeline,
including in OutputPage::getParserOutputText(). However, the direct
invocation skipped the implicit initialization of the options array
previously done in ParserOutput::getText(). Ensure that the options
array gets appropriate default values; in particular 'isParsoidContent'
is expected to always be set.
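One common PHP idiom for this kind of fix is the array union operator, which
fills in missing keys without touching caller-supplied ones. The sketch below
is an assumption about the general shape of such a fix, not the patch itself:
withDefaultOptions() is hypothetical, and `false` as the default for
'isParsoidContent' is assumed purely for illustration.

```php
<?php
declare( strict_types=1 );

// Hypothetical helper: fill in defaults for keys the caller did not set.
// The default value shown for 'isParsoidContent' is an assumption.
function withDefaultOptions( array $options ): array {
	// Array union keeps every key already present in the left operand.
	return $options + [
		'isParsoidContent' => false,
	];
}

var_dump( withDefaultOptions( [ 'allowClone' => false ] ) );
```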
Bug: T293512
Bug: T373405
Change-Id: Ib8d540b4221f7c00f6047706c4e3bfd88a2cb8cc
Previously, it assumed Parsoid content and loaded/stored data attributes
unconditionally. As a result, if this stage was subclassed for use in a
non-Parsoid pipeline, the DOM would undesirably be dirtied with Parsoid
ids or data-parsoid attributes.
Change-Id: I2f1af43d9c39140ce215e2145e51cc3b02f68923
Adds an experimental configuration to allow extensions to define
OutputPipelineStages to include in the DefaultOutputPipeline.
There are a lot of open questions about this API, such as ordering of
execution, but adding it as @experimental will help surface the
requirements.
Bug: T370541
Needed-By: I6dc92af0611c680b6e55605a7c9ff8a3fc1dfa26
Change-Id: I64baea40a1687c7a06fbcda9efe9f9a159b0ae8d
The rest of the pipeline tries to use the same defaults in the pipeline
built for (what is still) getText as the default options of the
pipeline stages. This is currently not the case for
AddWrapperDivClass; this patch fixes that.
Change-Id: I791d679a7b7309dfeb90c9736ef0e4848b038e08
A comment in I8744382dd24b28c623d0dc6569f800fb5489e6c1 mentions that two
tests are skipped. This patch fixes one of these skips, and makes the
other one more explicit.
Change-Id: Id5680fc163a9bfacfe797af619e40032cdee38b1
Changes to the use statements done automatically via script
Addition of missing use statement done manually
Change-Id: Ia35b2d3105880631dd26ec974068b000ac7f4b6b
When re-injecting the page bundle into the newly created ParserOutput, we
were omitting the version, headers and contentmodel data of that page
bundle. This patch fixes that.
Note that it will silence places where getText should typically not be
called, but that's a larger problem that needs to be addressed on the
calling places, and doesn't detract from the fact that we needed to fix
this loss of information on the bundle anyway.
Bug: T365433
Depends-On: I2a87a8233b9e42cbafdba63bdf513abe00d826ce
Change-Id: I7f57ddc76b9d3b24226f8b5da1b70bc83134856f
Rather than have DefaultOutputPipelineFactory::CONSTRUCTOR_OPTIONS be a
union of all the options needed by all the stages, allow each stage to
define its own CONSTRUCTOR_OPTIONS and pass a Config object to the
DefaultOutputPipelineFactory service.
In the process, move the $options and $logger properties into the
abstract superclass, since they are passed to every stage.
Bug: T363764
Followup-To: I64aeb81b395ba84e1d839dfbd31decf16c337cd0
Change-Id: I7d386b22c7d8e99b6dfe4cf798069914ac9af373
The legacy parser does not run ExpandToAbsoluteUrls unless it's doing
?action=render. ExpandToAbsoluteUrls doesn't work for mobile urls,
which seems to be captured in T171398 / T195494. Since relative urls
aren't resolved in legacy output though, the browser uses the mobile
url.
Parsoid, however, does ExtractBody which has its own expandRelativeAttrs
pass, which resolves relative urls against the baseHref in the document
head. The baseHref is taken from MainConfigNames::Server, which
presumably suffers the same issue as the above task. But also maybe MFE
is transforming cached html, where the non-mobile baseHref is desirable.
In any case, to produce the same urls as the legacy parser, transform
the baseHref to one that conforms to the mobile url template.
Bug: T365483
Change-Id: I32800f5ea848d70b6ef67ec9102c432b9626afcb
Parsoid abstracts the specific DOM implementation it is using, in
practice (currently) using subclasses of the built-in \DOMDocument
classes using the \DOMDocument::registerNodeClass() mechanism.
Parsoid's own phan configuration uses stubs for its abstract DOM
classes to encourage the use of "standard" DOM methods -- but core
doesn't use Parsoid's phan configuration and doesn't really understand
the way that ::registerNodeClass() works, and so gets confused by code
such as:
    $el = $document->createElement('div');
In actual practice this is a Wikimedia\Parsoid\DOM\Document (a
subclass of \DOMDocument) which creates a
Wikimedia\Parsoid\DOM\Element (a subclass of \DOMElement) via the
::registerNodeClass() mechanism, but phan sees only the base
\DOMDocument::createElement() signature and assumes this creates a
\DOMElement *not* a Wikimedia\Parsoid\DOM\Element. If you do
"element-y" things on this, phan has no complaints, but if you pass
this back to a Parsoid method which expects the abstract
Wikimedia\Parsoid\DOM\Element type then phan (spuriously) complains.
This type error can be hard to understand.
Work around this issue by simply aliasing Parsoid's abstract DOM types
to the built-in \DOMDocument etc. types. The alternative would be to
use Parsoid's stubs, but it seems cleaner (for now) to avoid reaching
into
    vendor/wikimedia/parsoid/.phan/stubs
to get them.
Change-Id: I90b33c5d65bde1582be9a452a144808b6d53d914
When going through a ContentDOMTransformStage, we try to move the
PageBundle when transforming the document from and to DOM. In the
current version of this code, this adds DataParsoid, a non-serializable
class, to ExtensionData, which breaks on ParserCache storage in later
steps.
This patch is pretty hacky, but it transforms the PageBundle structure
back to a stdClass so that it can be re-serialized before cache
insertion. The added test fails without this patch.
Hopefully we'll get rid of these hacks when using a HTMLHolder later.
Bug: T365036
Change-Id: Icc74edd43ea5098faebc21a084b6d483d6ab99d1
When running a ContentDOMTransformStage, we effectively clone the input
ParserOutput, which is in contradiction with the current expectations of
the pipeline. This patch slightly modifies the logic by making it
possible to apply PageBundle data to an existing ParserOutput without
having to create a new one.
Bug: T364597
Change-Id: I633fc33485f22cf645acd41650a6983df3b0a534
This is an output transform to resolve the mw:I18n and mw:LocalizedAttrs
to their localized forms.
Bug: T358191
Change-Id: Id32bc05ff72eb2d9fba7f8c2f192c9f7812cbc70
Legacy parser can now output headings using a more accessible markup,
which is also identical to the markup used by the Parsoid parser.
Changes to client-side JS and CSS necessary to support the new markup
have already been merged in earlier commits.
includes/skins/Skin.php
includes/ServiceWiring.php
* Define a new skin option, 'supportsMwHeading', which can be used
to toggle the new markup per-skin.
* Update the built-in fallback skin to enable it. This affects the
output in parser tests.
docs/config-schema.yaml
includes/config-schema.php
includes/config-vars.php
includes/MainConfigNames.php
includes/MainConfigSchema.php
* Add a new configuration setting, 'ParserEnableLegacyHeadingDOM',
which can be used to toggle the new markup per-site.
includes/OutputTransform/Stages/HandleSectionLinks.php
* Output new heading HTML for skins that enabled the option.
tests/*
* Duplicate parser tests that cover heading generation to cover both
new and old markup. Update other parser tests to use new markup.
* Add some unit and integration tests for the behavior of the skin
option and some parser tests for edge cases of the new markup.
Bug: T13555
Change-Id: I1180169a8e83af834c2984ba16089e6277f2a8dd
Adding a data-mw-parsoid-version attribute to the wrapper div helps to
unambiguously mark parsoid-generated output in a way which is compatible
with CSS rules and client-side JavaScript.
By embedding the current version of parsoid in the data attribute,
sophisticated CSS rules can match against a specific version of
Parsoid in order to facilitate proper behavior; for example:
    div[data-mw-parsoid-version^="0.20.0"]
This could be useful in deployment scenarios where the parser cache
might contain content generated by older or newer versions of Parsoid,
for roll-forward or roll-back deployment scenarios, respectively.
Bug: T363378
Change-Id: I941d31479eebb12ea1f4dcdb0a1737033ddc8ac1
This is a non-default option that will add a <div> wrapper around
section contents to allow client-side collapsing. This is intended
for use by MobileFrontEnd, but could eventually be enabled for
desktop read views as well.
Since this parser option is in the "cache-varying options" set, any
caller who sets this option will fork the cache for that page, which
is reasonable as the parser option sets a ParserOutput property.
In the future our caching strategy will get smarter and we'll add
code which avoids the cache split and just transfers the appropriate
values from ParserOptions to ParserOutput flags after the cached
output is retrieved.
Bug: T359001
Change-Id: Ie93959a056ed15a728404eb293e4bb6eeaeb15c0
* Followup to 9a466310
* I had previously added page title info to ParserOutput as part of
6e5413b1, but while working on 9a466310, we didn't realize that.
* Removed urldecode(..) since output of Title::getPrefixedDBKey
isn't urlencoded and urldecode converts "+" into " ". A new test
ensures that edge case works properly.
* Simplify testing + add additional test to ensure title normalization
doesn't trip up the transform.
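The "+" edge case can be reproduced in isolation. This is a minimal sketch
using only PHP's built-in urldecode(); the 'C++' key is an illustrative
value, not taken from the new test:

```php
<?php
declare( strict_types=1 );

// A DB key is not URL-encoded, so a literal "+" is part of the title.
$dbKey = 'C++';

// urldecode() treats "+" as an encoded space and corrupts the key.
$decoded = urldecode( $dbKey );
var_dump( $decoded === 'C  ' );

// Using the key as-is preserves it.
var_dump( $dbKey === 'C++' );
```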
Bug: T358242
Change-Id: I9a0cb00bdf9d104a4b327d72b1ec94cf509883a2
* This ensures that when you have query params like (?useparsoid=1),
all cite links no longer take you to the non-Parsoid page but
resolve internally.
* Additionally, this also unbreaks reference previews in local testing
- not yet sure if this will fix all breakage in production.
* We don't have ready access to the title string and so this patch
extracts it from a link tag in the <head> of Parsoid HTML. That is
guaranteed to be correct and reliably present.
But, if in the future, this changes (whether by adding it to
ParserOptions, ParserOutput, or the $opts array), we can use that
directly.
* Added new unit tests that verify the new expectations.
Bug: T358242
Change-Id: Iaf482cc9803564b4cf4ae04f975573f61ff3b0e4
This is just a cleanup change. The exception should never happen,
but if it does, this can be reverted.
Change-Id: I26a7c4105d39d83015c09b779a2de3fd1ddacec1
Follow-up to Ibce512b3c4a52f74b2d2124f0159e306f2689ea5.
HEADING_REGEX will now correctly match opening tags when one of the
attributes contains an unencoded > character.
In a better world, this would not use regular expressions. However,
while implementing it as a DOM transformation is easy enough, doing so
causes never-ending test failures due to changes in HTML serialization,
so we gave up on it for now after discussion on the original patch.
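To illustrate the class of bug, the sketch below contrasts a naive opening-tag
pattern with a quote-aware one. Both patterns are assumptions written for
this example and are not the actual HEADING_REGEX:

```php
<?php
declare( strict_types=1 );

$html = '<h2 id="a" data-x="1 > 0">Heading</h2>';

// Naive: stops at the first ">", even inside a quoted attribute value.
$naive = '~<h([1-6])([^>]*)>~';

// Quote-aware: a quoted attribute value is consumed as a unit,
// so a ">" inside it does not end the tag.
$quoteAware = '~<h([1-6])((?:[^>"\']|"[^"]*"|\'[^\']*\')*)>~';

preg_match( $naive, $html, $m1 );
preg_match( $quoteAware, $html, $m2 );

echo $m1[0], "\n"; // truncated inside the data-x value
echo $m2[0], "\n"; // the complete opening tag
```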
Bug: T358810
Change-Id: Ibad4b29a988c2a4911ebe6512791042c46dd1a9b
Discussion Tools runs *before* this stage runs, and so we end up
wrapping headings which have already been wrapped by discussion tools.
Check for an existing wrapper to avoid this.
In the future, we will probably add a new post-cache transform hook
which is at the very *end* of the pipeline, instead of in the middle,
to avoid this sort of ordering dependency between extensions and core.
Bug: T357826
Change-Id: I8cd28a3b42e55844be1258d639e605862952806f
This was formerly used by the REST API, but that code now just
uses ParserOutput::getRawText() when it needs the full HTML document.
This option has been broken, with various passes like RenderDebugInfo
and AddWrapperDiv adding content in inappropriate places if
bodyContentOnly was false.
Change-Id: Ib45f95ded59c81c16d61803f977d1edbfe82b262
Make ContentDOMTransformStage handle Parsoid markup with PageBundle
information embedded in the ParserOutput.
Much of the complexity of this code should move to either Parsoid's
ContentUtils or else into the HtmlHolder abstraction (T347062).
Change-Id: Ib35ae38d84adc7df613d4c7de8930ed80e535634
I realized that this code path is also triggered by a special page
transclusion that outputs headings, e.g. `{{Special:RecentChanges}}`.
It doesn't seem worth it to try to handle all these cases distinctly.
Follow-up to b26db1f866.
Change-Id: I389ea9210fcc184f41b6731409331dbd3d34d2ca
The resource attribute is used in read views for magnify links and
imagemap description links. See Id46d1b2ab1af3baebff13e10f1485f3cfd9a4b37
and I20130fd39135dfd5074590ee9c2b6e01693384e4
Bug: T357573
Change-Id: I974701ba9eb77e8d0abc894d1091fcdd63b84684
Split off from I4eae18d9d16f54391daba0de82ad05e50f07f9eb for
forward-compatibility, in case that patch needs to be reverted.
See that change for tests and explanation.
Bug: T13555
Change-Id: Ibce512b3c4a52f74b2d2124f0159e306f2689ea5