* Followup to 9a466310
* Page title info was already added to ParserOutput as part of
6e5413b1, but we overlooked that while working on 9a466310.
* Removed urldecode(..) since output of Title::getPrefixedDBKey
isn't urlencoded and urldecode converts "+" into " ". A new test
ensures that edge case works properly.
* Simplified the tests and added a test to ensure title normalization
doesn't trip up the transform.
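
The urldecode point above can be illustrated with a minimal sketch. It uses
Python's urllib.parse.unquote_plus as a stand-in for PHP's urldecode (both
percent-decode and turn "+" into a space); the sample DB key and the helper
name are assumptions made for illustration, not code from this patch.

```python
from urllib.parse import unquote_plus


def decode_like_urldecode(db_key: str) -> str:
    """Stand-in for PHP's urldecode(): percent-decode and map '+' to a space."""
    return unquote_plus(db_key)


# A prefixed DB key is plain text, not URL-encoded, so it needs no decoding.
db_key = "C++_(programming_language)"

# Decoding it anyway silently corrupts titles that contain a literal "+".
corrupted = decode_like_urldecode(db_key)
print(repr(corrupted))
```

Leaving the key undecoded keeps "+" intact, which is the edge case the new
test covers.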
Bug: T358242
Change-Id: I9a0cb00bdf9d104a4b327d72b1ec94cf509883a2
* This ensures that when you have query params like (?useparsoid=1),
all cite links no longer take you to the non-Parsoid page but
resolve internally.
* Additionally, this also unbreaks reference previews in local testing
- not yet sure if this will fix all breakage in production.
* We don't have ready access to the title string and so this patch
extracts it from a link tag in the <head> of Parsoid HTML. That is
guaranteed to be correct and reliably present.
If this changes in the future (whether by adding the title to
ParserOptions, ParserOutput, or the $opts array), we can use that
directly.
* Added new unit tests that verify the new expectations.
Bug: T358242
Change-Id: Iaf482cc9803564b4cf4ae04f975573f61ff3b0e4
This is just a cleanup change. The exception should never happen,
but if it does, this can be reverted.
Change-Id: I26a7c4105d39d83015c09b779a2de3fd1ddacec1
Follow-up to Ibce512b3c4a52f74b2d2124f0159e306f2689ea5.
HEADING_REGEX will now correctly match opening tags when one of the
attributes contains an unencoded > character.
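
A minimal sketch of the idea, in Python rather than PHP. The two patterns
below are illustrative assumptions and are not the actual HEADING_REGEX: the
first stops at the first ">", while the second consumes quoted attribute
values as a unit.

```python
import re

# Naive pattern: ends the tag at the first ">", even inside a quoted value.
NAIVE_OPEN_TAG = re.compile(r"<h[1-6][^>]*>")

# Quote-aware pattern: a double- or single-quoted attribute value is matched
# as a whole, so a ">" inside it does not terminate the opening tag.
QUOTE_AWARE_OPEN_TAG = re.compile(r"""<h[1-6](?:[^>"']|"[^"]*"|'[^']*')*>""")

html = '<h2 data-example="a > b" id="x">Title</h2>'

naive = NAIVE_OPEN_TAG.match(html).group(0)
aware = QUOTE_AWARE_OPEN_TAG.match(html).group(0)

print(naive)  # truncated in the middle of the attribute value
print(aware)  # the complete opening tag
```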
In a better world, this would not use regular expressions. However,
while implementing it as a DOM transformation is easy enough, doing so
causes never-ending test failures due to changes in HTML serialization,
so we gave up on it for now after discussion on the original patch.
Bug: T358810
Change-Id: Ibad4b29a988c2a4911ebe6512791042c46dd1a9b
Why:
* The ExecutePostCacheTransformHooksTest::testTransform test was
failing because it needed to use the DB. This was addressed in
7358ddd62f, but the assertion in the test then failed because
VisualEditor modified the output.
* Disabling the SkinEditSectionLinks hook for the test should fix
the test and does not cause test failures on my local machine.
What:
* Call ::clearHook with the 'SkinEditSectionLinks' hook in the
ExecutePostCacheTransformHooksTest::testTransform test.
Bug: T358103
Change-Id: Ia05cfd1eb572639c117fd264e3c05265adb38e32
Why:
* The ExecutePostCacheTransformHooksTest core test is not currently
a database test but is an integration test case.
* However, ::testTransform calls a hook, and VisualEditor provides
a handler for SkinEditSectionLinks that reads from the DB. This
test ends up calling that handler.
* Adding the test class to the database group will fix this by
allowing VisualEditor to use the database in the handler as part
of the test.
What:
* Add `@group Database` to ExecutePostCacheTransformHooksTest.php
Bug: T358103
Change-Id: Ib3b361f07d5411e4951156059dee11dc5367dffb
Discussion Tools runs *before* this stage runs, and so we end up
wrapping headings which have already been wrapped by discussion tools.
Check for an existing wrapper to avoid this.
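
A rough sketch of such a check, in Python over a well-formed fragment. The
wrapper class name "mw-heading" and the helper name are assumptions made for
illustration; the actual transform is PHP and may differ.

```python
import xml.etree.ElementTree as ET

WRAPPER_CLASS = "mw-heading"  # assumed class name, for illustration only


def wrap_headings(root: ET.Element) -> None:
    """Wrap each <h2> in a <div>, skipping headings that are already wrapped."""
    # ElementTree has no parent pointers, so build a child -> parent map first.
    parents = {child: parent for parent in root.iter() for child in parent}
    for heading in list(root.iter("h2")):
        parent = parents.get(heading)
        if parent is None:
            continue  # the heading is the root; there is nothing to wrap it in
        already_wrapped = (
            parent.tag == "div"
            and WRAPPER_CLASS in parent.get("class", "").split()
        )
        if already_wrapped:
            continue  # an earlier stage wrapped it; do not wrap twice
        wrapper = ET.Element("div", {"class": WRAPPER_CLASS})
        # Keep any text that followed the heading outside the new wrapper.
        wrapper.tail, heading.tail = heading.tail, None
        index = list(parent).index(heading)
        parent.remove(heading)
        wrapper.append(heading)
        parent.insert(index, wrapper)


root = ET.fromstring(
    '<body><h2>A</h2><div class="mw-heading"><h2>B</h2></div></body>'
)
wrap_headings(root)
result = ET.tostring(root, encoding="unicode")
print(result)
```

Running the function a second time leaves the tree unchanged, which is the
property the wrapper check is meant to provide.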
In the future, we will probably add a new post-cache transform hook
which is at the very *end* of the pipeline, instead of in the middle,
to avoid this sort of ordering dependency between extensions and core.
Bug: T357826
Change-Id: I8cd28a3b42e55844be1258d639e605862952806f
Needed to create a mock Skin for one test case in order to avoid using
the ServiceContainer prematurely.
Change-Id: Iaa33dfd2b187ac3a1fc44ea46f3b88ef29a62098
[Previously attempted in de0646843a,
reverted in e72e1cd16368346b66853f68e2d13f9b416d5a11.]
Previously, Parser.php used Linker::makeHeadline() in order to
generate the `<h2><span class="mw-headline" id="...">...</span></h2>`
markup for section headings, and this was saved in the parser cache.
Now it generates heading tags with placeholder attributes like
`<h2 data-mw-...="..." ...>...</h2>`, and they are replaced in a
post-cache transform to generate the final heading markup, similarly
to how section edit links already worked.
The purpose of these changes is to allow changing the final markup
depending on skin options without splitting the parser cache (T13555).
Deployment and undeployment safety:
* The new post-cache transform has already been added in commit
Ibce512b3c4a52f74b2d2124f0159e306f2689ea5 for forward-compatibility
(so that if this patch is reverted, new parser cache entries
will still be shown correctly).
Implementation notes:
* There are many ways to keep the temporary information other than
`data-mw-...` attributes, but this way is the easiest to handle
in a post-cache transform (everything is on the DOM node we want
to modify), is compatible with other heading-enhancing code in
DiscussionTools and MobileFrontend, and remains human-readable
if the post-cache transform doesn't run.
* Sadly this code can't be reused to add section heading markup and
section edit links to Parsoid (T269630), because it lacks some of
the necessary metadata, and exposes the rest in ways that are
trickier to handle in a post-cache transform (on other DOM nodes
or outside the document).
Depends-On: If85f89c40834618f23dc0ace2e599efb3b6d5ed4
Bug: T13555
Change-Id: If04d72f427ec3c3730e757cbb3ade8840c09f7d3
Previously, Parser.php used Linker::makeHeadline() in order to
generate the `<h2><span class="mw-headline" id="...">...</span></h2>`
markup for section headings, and this was saved in the parser cache.
Now it generates heading tags with placeholder attributes like
`<h2 data-mw-...="..." ...>...</h2>`, and they are replaced in a
post-cache transform to generate the final heading markup, similarly
to how section edit links already worked.
The purpose of these changes is to allow changing the final markup
depending on skin options without splitting the parser cache (T13555).
Deployment and undeployment safety:
* The new post-cache transform has already been added in commit
Ibce512b3c4a52f74b2d2124f0159e306f2689ea5 for forward-compatibility
(so that if this patch is reverted, new parser cache entries
will still be shown correctly).
Implementation notes:
* There are many ways to keep the temporary information other than
`data-mw-...` attributes, but this way is the easiest to handle
in a post-cache transform (everything is on the DOM node we want
to modify), is compatible with other heading-enhancing code in
DiscussionTools and MobileFrontend, and remains human-readable
if the post-cache transform doesn't run.
* Sadly this code can't be reused to add section heading markup and
section edit links to Parsoid (T269630), because it lacks some of
the necessary metadata, and exposes the rest in ways that are
trickier to handle in a post-cache transform (on other DOM nodes
or outside the document).
Bug: T13555
Change-Id: I4eae18d9d16f54391daba0de82ad05e50f07f9eb
This was formerly used by the REST API, but that code now just
uses ParserOutput::getRawText() when it needs the full HTML document.
This option has been broken, with various passes like RenderDebugInfo
and AddWrapperDiv adding content in inappropriate places if
bodyContentOnly was false.
Change-Id: Ib45f95ded59c81c16d61803f977d1edbfe82b262
Abstract test classes are no longer allowed to end in "Test" as of
PHPUnit 9.6.
Follow-up: I53551ec6d6
Bug: T342110
Change-Id: I9638c2937f8b702851d080ab217fbc34620fabb6
This reverts commit 82da9cf14b.
Passing through Remex seems to have unexpected consequences to be
investigated but, for the sake of unbreaking the UBN, let's revert this
first.
Bug: T353920
Change-Id: Iaac7942aa77aee5ab525852ac5b41dd516ff13c9
The previous implementation was using an ad-hoc regular expression which
was matching inside the data-mw attribute of Parsoid output, e.g.:
<sup about="#mwt42" [...] typeof="mw:Extension/ref mw:Error" data-mw="{"name":"ref","attrs":{"name":"infobox_stats_ref_rail"},"body":{"html":"<style data-mw-deduplicate=\"TemplateStyles:r1133582631\" typeof=\"...">
After substitution, the <link> element inserted contained " instead of
&quot; and so broke out of the attribute.
Instead use a proper HTML tokenizer (via wikimedia/remex-html) so that
we don't allow bogus matches inside attribute values.
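
The difference can be sketched as follows. This uses Python's standard
html.parser as a stand-in for wikimedia/remex-html, and the sample markup,
pattern, and class name are assumptions made for illustration only.

```python
import re
from html.parser import HTMLParser

# A <sup> whose data-mw attribute value contains text that looks like a
# <style> tag, followed by a real <style> element.
html = (
    '<sup data-mw="{&quot;html&quot;:&quot;<style data-mw-deduplicate=r1>&quot;}">1</sup>'
    '<style data-mw-deduplicate="r2">.a{}</style>'
)

# Ad-hoc regex: also matches the tag-like text inside the attribute value.
regex_hits = re.findall(r"<style\b[^>]*>", html)


class StyleTagCollector(HTMLParser):
    """Collect data-mw-deduplicate keys from real <style> start tags only."""

    def __init__(self):
        super().__init__()
        self.keys = []

    def handle_starttag(self, tag, attrs):
        if tag == "style":
            key = dict(attrs).get("data-mw-deduplicate")
            if key is not None:
                self.keys.append(key)


collector = StyleTagCollector()
collector.feed(html)
collector.close()

print(len(regex_hits))  # the regex sees two "tags"
print(collector.keys)   # the tokenizer sees only the real one
```

A tokenizer treats the attribute value as opaque text, so a substitution
driven by it cannot land inside data-mw.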
To fix up tests:
* Don't deduplicate styles when parsing UX messages (also helps performance)
* Don't deduplicate styles in ContentHandler integration tests
* Don't deduplicate styles by default in parser tests
(unless explicit option is set)
Depends-On: Id9801a9ff540bd818a32bc6fa35c48a9cff12d3a
Depends-On: I5111f1fdb7140948b82113adbc774af286174ab3
Followup-To: Ic0b17e361bf6eb0e71c498abc17f5f67f82318f8
Change-Id: I32d3d1772243c3819e1e1486351d16871b6e21c4