Previously, it assumed Parsoid content and loaded/stored data attributes
unconditionally. The result being that, if this stage was subclassed to
be used an non-Parsoid pipeline, the dom would undesirably be dirtied
with Parsoid ids or data-parsoid attributes.
Change-Id: I2f1af43d9c39140ce215e2145e51cc3b02f68923
A comment in I8744382dd24b28c623d0dc6569f800fb5489e6c1 mentions that two
tests are skipped. This patch fixes one of these skips, and makes the
other one more explicit.
Change-Id: Id5680fc163a9bfacfe797af619e40032cdee38b1
In the spirit of T332865, data providers should be made static methods
for its performance benefits.
Follow-up on: I7da0823d4686238003579a.
Change-Id: Ia08fcdd2c27f5cdc13ca8d12132a90c259ca2447
* Use $this->assertEquals as per PHPUnit docs and other code in
this repository.
* Remove redundant doc comments for data providers, as per
current PHPCS presets and matching other code in the repo.
Our latest PHPCS no longer requires native+doc blocks when there is
no added information in the doc block. Since data providers aren't
explicitly called by anyone, presumably no audience for it either.
* Widen `@covers` tags. Rationale in other commits
at https://gerrit.wikimedia.org/r/q/owner:Krinkle+is:merged+message:Widen
Follows-up 8de2e66ca7.
Change-Id: I7da0823d4686238003579afe425a635541e9baf6
Now that use Ib263439ae221232ffe0902a0c58d155402fb7a17 is merged,
we can use it instead of keeping it in ParsoidLocalizationTest.
It solves the issue pointed out in
I7da0823d4686238003579afe425a635541e9baf6.
Change-Id: I8744382dd24b28c623d0dc6569f800fb5489e6c1
Rather than have DefaultOutputPipelineFactory::CONSTRUCTOR_OPTIONS be a
union of all the options needed by all the stages, allow each stage to
define its own CONSTRUCTOR_OPTIONS and pass a Config object to the
DefaultOutputPipelineFactory service.
In the process, move the $options and $logger properties into the
abstract superclass, since they are passed to every stage.
Bug: T363764
Followup-To: I64aeb81b395ba84e1d839dfbd31decf16c337cd0
Change-Id: I7d386b22c7d8e99b6dfe4cf798069914ac9af373
When going through a ContentDOMTransformStage, we try to move the
PageBundle when transforming the document from and to DOM. In the
current version of this code, this adds DataParsoid, a non-serializable
class, to ExtensionData, which breaks on ParserCache storage in later
steps.
This patch is pretty hacky, but it transforms the PageBundle structure
back to a stdClass so that it can be re-serialized before cache
insertion. The added test fails without this patch.
Hopefully we'll get rid of these hacks when using a HTMLHolder later.
Bug: T365036
Change-Id: Icc74edd43ea5098faebc21a084b6d483d6ab99d1
This is an output transform to resolve the mw:I18n and mw:LocalizedAttrs
to their localized forms.
Bug: T358191
Change-Id: Id32bc05ff72eb2d9fba7f8c2f192c9f7812cbc70
Legacy parser can now output headings using a more accessible markup,
which is also identical to the markup used by the Parsoid parser.
Changes to client-side JS and CSS necessary to support the new markup
have already been merged in earlier commits.
includes/skins/Skin.php
includes/ServiceWiring.php
* Define a new skin option, 'supportsMwHeading', which can be used
to toggle the new markup per-skin.
* Update the built-in fallback skin to enable it. This affects the
output in parser tests.
docs/config-schema.yaml
includes/config-schema.php
includes/config-vars.php
includes/MainConfigNames.php
includes/MainConfigSchema.php
* Add a new configuration setting, 'ParserEnableLegacyHeadingDOM',
which can be used to toggle the new markup per-site.
includes/OutputTransform/Stages/HandleSectionLinks.php
* Output new heading HTML for skins that enabled the option.
tests/*
* Duplicate parser tests that cover heading generation to cover both
new and old markup. Update other parser tests to use new markup.
* Add some unit and integration tests for the behavior of the skin
option and some parser tests for edge cases of the new markup.
Bug: T13555
Change-Id: I1180169a8e83af834c2984ba16089e6277f2a8dd
A number of tests have hardcoded expections that pass only in WMF CI
where Quibble has LocalSettings.php with $wgScript and $wgArticlePath
set a certain way.
We could fix these by adding setMwGlobals() in their tests, as we
often do, but these are so often forgotten that I'd rather we just
add them to TestSetup.php so that it is simply impossible to write a
test that that passes locally for you (if you have the same config)
but not for someone else.
There is a larger project in there somewhere about expanding this
slowly such that we basically only pluck DB-settings and extension
enablement from LocalSettings and otherwise run the tests with the
default settings in PHPUnit. Pretty much by definition, any (other)
setting you have in LocalSettings is irrelevant because it either:
1. has no effect on the test (majority, harmless either way),
2. has a custom default via TestSetup.php (which has precedence over
LocalSettings.php),
3. is relevant to the code being tested and the test case correctly
calls setMwGlobals() to ensure a consistent value during test.
4. is relevant to the tested code but has no override, thus only
passes if you happen to have the "right" value set for it
(undesirable).
Case 4 is already categorically impossible for the most common config
settings that influence random code because we give them a value
in TestSetup.php. This patch expands that to include $wgScript
and $wgArticlePath. Perhaps in the future we can think about a way
to do this automatically by either re-applying MainConfigSchema
(sans db settings) or by only selectively applying LocalSettings.php
in the first place.
This patch follows-up I072ddf89562fe, which added a test case in
WikitextContentHandlerIntegrationTest.php that assumed "/index.php"
as the value of $wgScript. This passes in WMF CI since Quibble uses
that value, but the tests failed in most local development installs
since those tend to use "/w" instead.
Rather than one-off fixing that one test with overrideConfigValues(),
switch to a more general fixture, since the precise values don't
matter for this test.
Bug: T349087
Bug: T277470
Change-Id: If4304b7ca4a838bd892d4516a0b5c6dfbc30986e
The legacy parser only allows for a single insertion of TOC (it drops
later __TOC__ magic words). On Parsoid, we can end up with multiple TOC
markers (which we want to keep around for round-trip reasons), so we
need to discard them in the HandleTOCMarkers phase.
Bug: T359882
Change-Id: I60fdfc2c52680ed53e48d1931fd7f5c937b437a2
This is a non-default option that will add a <div> wrapper around
section contents to allow client-side collapsing. This is intended
for use by MobileFrontEnd, but could eventually be enabled for
desktop read views as well.
Since this parser option is in the "cache-varying options" set, any
caller who sets this option will fork the cache for that page, which
is reasonable as the parser options sets a ParserOutput property.
In the future our caching strategy will get smarter and we'll add
code which avoids the cache split and just transfers the appropriate
values from ParserOptions to ParserOutput flags after the cached
output is retrieved.
Bug: T359001
Change-Id: Ie93959a056ed15a728404eb293e4bb6eeaeb15c0
* Followup to 9a466310
* I had previously added page title info to ParserOutput as part of
6e5413b1, but while working on 9a466310, we didn't realize that.
* Removed urldecode(..) since output of Title::getPrefixedDBKey
isn't urlencoded and urldecode converts "+" into " ". A new test
ensures that edge case works properly.
* Simplify testing + add additional test to ensure title normalization
doesn't trip up the transform.
Bug: T358242
Change-Id: I9a0cb00bdf9d104a4b327d72b1ec94cf509883a2
* This ensures that when you have query params like (?useparsoid=1),
all cite links no longer take you to the non-Parsoid page but
resolve internally.
* Additionally, this also unbreaks reference previews in local testing
- not yet sure if this will fix all breakage in production.
* We don't have ready access to the title string and so this patch
extracts it from a link tag in the <head> of Parsoid HTML. That is
guaranteed to be correct and reliably present.
But, if in the future, this changes (whether by adding it to
ParserOptions, ParserOutput, or the $opts array), we can use that
directly.
* Added new unit tests that verify the new expectations.
Bug: T358242
Change-Id: Iaf482cc9803564b4cf4ae04f975573f61ff3b0e4
This is just a cleanup change. The exception should never happen,
but if it does, this can be reverted.
Change-Id: I26a7c4105d39d83015c09b779a2de3fd1ddacec1
Follow-up to Ibce512b3c4a52f74b2d2124f0159e306f2689ea5.
HEADING_REGEX will now correctly match opening tags when one of the
attributes contains an unencoded > character.
In a better world, this would not use regular expressions. However,
while implementing it as a DOM transformation is easy enough, doing so
causes never-ending test failures due to changes in HTML serialization,
so we gave up on it for now in after discussion on the original patch.
Bug: T358810
Change-Id: Ibad4b29a988c2a4911ebe6512791042c46dd1a9b
Why:
* The ExecutePostCacheTransformHooksTest::testTransform test was
failing due to needing to use the DB. This was addressed in
7358ddd62f but then caused the
assertion in the test to fail as VisualEditor modified the
output causing the test failure.
* Disabling the SkinEditSectionLinks hook for the test should fix
the test and does not cause test failures on my local machine.
What:
* Call ::clearHook with the 'SkinEditSectionLinks' hook in the
ExecutePostCacheTransformHooksTest::testTransform test.
Bug: T358103
Change-Id: Ia05cfd1eb572639c117fd264e3c05265adb38e32
Why:
* The ExecutePostCacheTransformHooksTest core test is not currently
a database test but is an integration test case.
* However, ::testTransform calls a hook and VisualEditor provides
a handler for SkinEditSectionLinks that reads from the DB which
is called by this test.
* Adding the test class to the database group will fix this by
allowing VisualEditor to use the database in the handler as part
of the test.
What:
* Add `@group Database` to ExecutePostCacheTransformHooksTest.php
Bug: T358103
Change-Id: Ib3b361f07d5411e4951156059dee11dc5367dffb
Discussion Tools runs *before* this stage runs, and so we end up
wrapping headings which have already been wrapped by discussion tools.
Check for an existing wrapper to avoid this.
In the future, we will probably add a new post-cache transform hook
which is at the very *end* of the pipeline, instead of in the middle,
to avoid this sort of ordering dependency between extensions and core.
Bug: T357826
Change-Id: I8cd28a3b42e55844be1258d639e605862952806f
Needed to create a mock Skin for one test case in order to avoid using
the ServiceContainer prematurely.
Change-Id: Iaa33dfd2b187ac3a1fc44ea46f3b88ef29a62098
[Previously attempted in de0646843a,
reverted in e72e1cd16368346b66853f68e2d13f9b416d5a11.]
Previously, Parser.php used Linker::makeHeadline() in order to
generate the `<h2><span class="mw-headline" id="...">...</span></h2>`
markup for section headings, and this was saved in the parser cache.
Now it generates heading tags with placeholder attributes like
`<h2 data-mw-...="..." ...>...</h2>`, and they are replaced in a
post-cache transform to generate the final heading markup, similarly
to how section edit links already worked.
The purpose of these changes is to allow changing the final markup
depending on skin options without splitting the parser cache (T13555).
Deployment and undeployment safety:
* The new post-cache transform has been already added in commit
Ibce512b3c4a52f74b2d2124f0159e306f2689ea5 for forward-compatibility
(so that if this patch is reverted, new parser cache entries
will still be shown correctly).
Implementation notes:
* There are many ways to keep the temporary information other than
`data-mw-...` attributes, but this way is the easiest to handle
in a post-cache transform (everything is on the DOM node we want
to modify), is compatible with other heading-enhancing code in
DiscussionTools and MobileFrontend, and remains human-readable
if the post-cache transform doesn't run.
* Sadly this code can't be reused to add section heading markup and
section edit links to Parsoid (T269630), because it lacks some of
the necessary metadata, and exposes the rest in ways that are
trickier to handle in a post-cache transform (on other DOM nodes
or outside the document).
Depends-On: If85f89c40834618f23dc0ace2e599efb3b6d5ed4
Bug: T13555
Change-Id: If04d72f427ec3c3730e757cbb3ade8840c09f7d3
Previously, Parser.php used Linker::makeHeadline() in order to
generate the `<h2><span class="mw-headline" id="...">...</span></h2>`
markup for section headings, and this was saved in the parser cache.
Now it generates heading tags with placeholder attributes like
`<h2 data-mw-...="..." ...>...</h2>`, and they are replaced in a
post-cache transform to generate the final heading markup, similarly
to how section edit links already worked.
The purpose of these changes is to allow changing the final markup
depending on skin options without splitting the parser cache (T13555).
Deployment and undeployment safety:
* The new post-cache transform has been already added in commit
Ibce512b3c4a52f74b2d2124f0159e306f2689ea5 for forward-compatibility
(so that if this patch is reverted, new parser cache entries
will still be shown correctly).
Implementation notes:
* There are many ways to keep the temporary information other than
`data-mw-...` attributes, but this way is the easiest to handle
in a post-cache transform (everything is on the DOM node we want
to modify), is compatible with other heading-enhancing code in
DiscussionTools and MobileFrontend, and remains human-readable
if the post-cache transform doesn't run.
* Sadly this code can't be reused to add section heading markup and
section edit links to Parsoid (T269630), because it lacks some of
the necessary metadata, and exposes the rest in ways that are
trickier to handle in a post-cache transform (on other DOM nodes
or outside the document).
Bug: T13555
Change-Id: I4eae18d9d16f54391daba0de82ad05e50f07f9eb
This was formerly used by the REST api, but instead that code just
uses ParserOutput::getRawText() when it needs the full HTML document.
This option has been broken, with various passes like RenderDebugInfo
and AddWrapperDiv adding content in inappropriate places if
bodyContentOnly was false.
Change-Id: Ib45f95ded59c81c16d61803f977d1edbfe82b262
Abstract test classes are no longer allowed to end in "Test" as of
PHPUnit 9.6.
Follow-up: I53551ec6d6
Bug: T342110
Change-Id: I9638c2937f8b702851d080ab217fbc34620fabb6
This reverts commit 82da9cf14b.
Passing through Remex seems to have unexpected consequences to be
investigated but, for the sake of unbreaking the UBN, let's revert this
first.
Bug: T353920
Change-Id: Iaac7942aa77aee5ab525852ac5b41dd516ff13c9
The previous implementation was using an ad-hoc regular expression which
was matching inside the data-mw attribute of Parsoid output, eg:
<sup about="#mwt42" [...] typeof="mw:Extension/ref mw:Error" data-mw="{"name":"ref","attrs":{"name":"infobox_stats_ref_rail"},"body":{"html":"<style data-mw-deduplicate=\"TemplateStyles:r1133582631\" typeof=\"...">
After substitution, the <link> element inserted contained " instead of
" and so broke out of the attribute.
Instead use a proper HTML tokenizer (via wikimedia/remex-html) so that
we don't allow bogus matches inside attribute values.
To fix up tests:
* Don't deduplicate styles when parsing UX messages (also helps performance)
* Don't deduplicate styles in ContentHandler integration tests
* Don't deduplicate styles by default in parser tests
(unless explicit option is set)
Depends-On: Id9801a9ff540bd818a32bc6fa35c48a9cff12d3a
Depends-On: I5111f1fdb7140948b82113adbc774af286174ab3
Followup-To: Ic0b17e361bf6eb0e71c498abc17f5f67f82318f8
Change-Id: I32d3d1772243c3819e1e1486351d16871b6e21c4