This replaces the previous `variant` name for this option to help avoid
confusion: Parsoid maintains both an 'html variant' and a 'wikitext variant'
and this helps us ensure we're all talking about the same thing.
Change-Id: I01b5eef72e8b4433510bc8cb9b684b1e37715821
It is difficult to distinguish this method from OutputPage::getCategories()
in code search:
https://codesearch.wmcloud.org/deployed/?q=%5BOo%5Dut%28put%29%3F%28%5C%28%5C%29%29%3F-%3EgetCategories%5C%28&files=&excludeFiles=&repos=
We generally try to replace $output with $parserOutput or $pOutput
as we touch code to improve the ability of codesearch to dig up
deprecated ParserOutput methods.
Bug: T305161
Depends-On: I02dd4f61c43c225b0ef6dc51c3e4f9d967a0a272
Depends-On: I61d2d77591579d825ad9d37f902e40366be55dd6
Depends-On: I91155106b7a9e10d3334f95ba4936d02851bfb11
Depends-On: Iaca745c79d9587571af03b23b21d76a6cba0ebf1
Depends-On: Id10a171c44411b1233ee4d6cf8fbd3dc57744eef
Depends-On: I47a25c011d9bd4b1a15dda4e673e32c25eb64f2b
Depends-On: I683fc768aba50b801f46467fcfa1668fa8731ea6
Change-Id: I5a2ac1c99b8b199102e12f0d32dd6ec5cdc24054
The parser test reader (imported from Parsoid) already converts
comma-separated lists to an array; we get an exception from explode()
if we then try to split that array by comma.
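As a language-neutral illustration (Python here, not the PHP touched by
this change), a caller that may receive either form can normalize the
value instead of splitting it unconditionally. `as_list` is a
hypothetical helper, not a function from the test runner:

```python
def as_list(value):
    """Return a list of trimmed items from a list or a comma-separated string."""
    if isinstance(value, list):
        # Already split by the reader: do not split again.
        return [str(item).strip() for item in value]
    # Plain string: split on commas ourselves.
    return [part.strip() for part in value.split(",")]
```

With this guard, `as_list("a, b")` and `as_list(["a", "b"])` yield the
same list, so it no longer matters which reader produced the option.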
Change-Id: Id17ed22d81cc880f41f584406369323232c00bdf
The method should never be called directly, so make it throw an exception.
Nonetheless, mark it as deprecated and detect overrides in the
constructor, so that anyone who tries to override this method will see a
warning.
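A minimal sketch of the "detect overrides in the constructor" idea, in
Python rather than PHP. The class and method names (`Base`,
`legacy_hook`) are hypothetical and do not come from this change:

```python
import warnings


class Base:
    def __init__(self):
        # If a subclass defines its own legacy_hook, the function found on
        # the concrete class differs from the one defined on Base.
        if type(self).legacy_hook is not Base.legacy_hook:
            warnings.warn(
                "Overriding legacy_hook() is deprecated",
                DeprecationWarning,
                stacklevel=2,
            )

    def legacy_hook(self):
        # The method must never be called directly.
        raise RuntimeError("legacy_hook() must not be called directly")
```

Constructing `Base` itself emits nothing, while constructing a subclass
that redefines `legacy_hook` emits the deprecation warning once per
instance.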
Fix the few tests that were relying on the existence of the test page.
Bug: T342428
Depends-On: Ic64ded5e2c0b59e7c888ece9566076058a125be4
Change-Id: I308617427309815062d54c14f3438cab31b08a73
Was introduced in I6c1e9bfad9790cf805809c28a3f8d45952cbb981 and
later rendered unused in Ie62250242965d3d90873909795ced2cbda506ddb.
Since we want to make sure we're using the right instance passed in
closures, let's keep getting it from the global service container
and then keep it in a local variable.
Change-Id: I834b09933efc8835c73b9fafaad80cc7041757b6
The interaction between the title cache, the link cache and the parser
tests is very strange. With different parser tests and different
extensions enabled, it can fail, and the failures do not seem to be
deterministic.
Follow-Up: Ie4b67106512fb1a3a1b595dc4f6036276db96378
Change-Id: I55501eea7de739cac044b22caec150089183620b
The instance in the title cache does not see the reset of the id (the
lazily loaded Title::mArticleId). When a Title instance obtained via
Title::newFromText comes from the cache, it can assume that the title
does not exist even though the page has been created. Remove the cached
instance so that the next lookup returns a fresh instance, which can do
a fresh db lookup for the id when needed.
This should fix the wrong title states seen in the parser tests.
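The stale-instance problem can be sketched with a toy model in Python.
The `Title` and `TitleCache` classes below are simplified stand-ins
written for this illustration (a dict plays the role of the database);
they are not MediaWiki's classes:

```python
class Title:
    def __init__(self, text, db):
        self.text = text
        self._db = db
        self._article_id = None  # lazily loaded, like Title::mArticleId

    def exists(self):
        if self._article_id is None:
            # First access: look the id up once and remember it.
            self._article_id = self._db.get(self.text, 0)
        return self._article_id != 0


class TitleCache:
    def __init__(self, db):
        self._db = db
        self._instances = {}

    def new_from_text(self, text):
        # Reuse the cached instance if there is one.
        if text not in self._instances:
            self._instances[text] = Title(text, self._db)
        return self._instances[text]

    def forget(self, text):
        # Drop the cached instance so the next lookup starts fresh.
        self._instances.pop(text, None)
```

In this model, an instance that was asked `exists()` before the page was
created keeps answering "no" until `forget()` removes it from the cache.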
Follow-Up: Id056580c7b869ae4984de5e2c89fb4687eecf7bd
Follow-Up: Ibc8e0ddbe9e53c3334b9c26ec2d1eda976c2a62b
Change-Id: Ie4b67106512fb1a3a1b595dc4f6036276db96378
Avoid cloning the Parser object; create a new one instead.
Reset the UrlUtils service after changing the server setting:
it is a dependency of the ParserFactory and now gets initialized earlier.
Bug: T250448
Change-Id: Ie62250242965d3d90873909795ced2cbda506ddb
In the parser tests for LabelledSectionTransclusion a page transcludes
another page. When the first page is created, the second page does not
exist yet. The parser parses the first page, and after that the title
cache contains a title object for the second page with an article id of
0. After the second page is created, the article id in the title object
needs to be reset, as the object gets reused in the test.
Before 880fc5da the title object was in the title cache on page
creation, as the addArticle function was using Title::newFromText, which
results in the correct reset of the id in the cached object.
Bug: T342875
Change-Id: Id056580c7b869ae4984de5e2c89fb4687eecf7bd
Parsoid CI broke with b42062e7. It looks like the title cache clear
is still needed.
For anyone interested in digging deeper into this, you can revert
this patch and run the command below to see a failing test.
* composer phpunit:entrypoint -- --testsuite parsertests --filter 'multiple templates'
Change-Id: Ibc8e0ddbe9e53c3334b9c26ec2d1eda976c2a62b
Promote the deprecation to an error in the context of PHPUnit tests. The
point of hard deprecations is to make tests fail and this will help with
that, and also with eventually promoting the deprecation to an error
outside of tests.
Adjust code in parser tests that was accessing MediaWikiServices via
Title too early.
Avoid the hack of resetting the error handler after loading Setup.php,
and conditionally install MW's handler instead. This is particularly
important in scenarios where an exception is thrown before the handler
is reset, because MW's exception handler may also access
MediaWikiServices.
Bug: T227900
Bug: T273261
Change-Id: I7c5234046379cf4abd25d65e78c0a99ac9f32600
This reapplies commit b4e797510c with some fixes for the Parsoid tests;
the fixes can be reviewed by diffing patchset 1 against the latest
patchset.
- Factored out the resetting of services that relate only to language
and variant into a function, and sorted all services roughly by their
dependency relationship.
Also, only reset when the language or variant is configured by the
test case.
- Replaced the manual redefinition of the ContentLanguage service with
a reset, like the other services.
- Reset the necessary services after setting the default language code
in staticSetup(), so the override in addArticles() can be removed.
- Use the 'skin' param of getText() for setting the skin, so we don't
need to touch the context.
- Removed the override of wgUser, wgLang, and wgOut. They no longer
live in parser-related code.
- Setting the user option is not the correct approach here. The problem
is that UserOption(Lookup|Manager) didn't get reset, so the default
user language was wrong, and the context had a cached Language object.
When the default language option is updated, purging the cache with
setUser() should be sufficient.
Change-Id: I4ebeaef98ed9b7682701c0385c68145ee1e78951
This reverts commit b4e797510c.
That commit breaks Parsoid CI, as demonstrated by tests on empty
commits Iad5e05eda4b94ce9f5708c84526c59e25cafa7a0 (passing,
depending on this patch) and I8a85aa11a29bcc568dd8079bce01320b087e04ac
(failing, not depending on this patch).
Change-Id: Ifaf295d45c00783a37e056e01bee98567d4a7cf5
- Factored out the resetting of services that relate only to language
and variant into a function, and sorted all services roughly by their
dependency relationship.
Also, only reset when the language or variant is configured by the
test case.
- Replaced the manual redefinition of the ContentLanguage service with
a reset, like the other services.
- Reset the necessary services after setting the default language code
in staticSetup(), so the override in addArticles() can be removed.
- Use the 'skin' param of getText() for setting the skin, so we don't
need to touch the context.
- Removed the override of wgUser, wgLang, and wgOut. They no longer
live in parser-related code.
- Setting the user option is not the correct approach here. The problem
is that UserOption(Lookup|Manager) didn't get reset, so the default
user language was wrong, and the context had a cached Language object.
When the default language option is updated, purging the cache with
setUser() should be sufficient.
Change-Id: I94103b86a02d6b971f70a0bb7ece1f22cd16e715
The Hooks class contains deprecated functions and the whole class is
going to get removed, so remove the convenience function and inline the
code.
Bug: T335536
Change-Id: I8ef3468a64a0199996f26ef293543fcacdf2797f
Note that the metadata isn't even checked unless wt2html is passing, so
it may take several runs to get all the tests updated.
Follows-Up: Ieaca9152b9f0d0a853c0dfaff1bdca808110539e
Change-Id: I10f5b54a8ebffaf10111d57aa66e1220c5418ca7
Follow up to I854f89bd823aab297efe29cd4fdee675afd77752
Returns the behaviour to what it was before that patch.
Change-Id: I743fa1118c4c78863f3857f4dc70d82f6bf4f0ac
Clean up to bypass skipped tests early in both legacy and
parsoid test runs without duplicating the skipped test check.
Also got rid of two FIXMEs with this refactoring.
Change-Id: I854f89bd823aab297efe29cd4fdee675afd77752
This is now enabled in production (Ic5a4a9950d51f63b17f4c5e70516bec87b981aa5)
and not something we want to remain configurable.
It is removed from Parsoid in I52ddfd21ff2e72a34cb5eb68742e3dfb85c6ccf6
Change-Id: I6a4d7d33fb42270fc5da3a922aa0a959180fb33f
The TOC used to be language-converted in ParserOutput::getText(), but
it wasn't possible to apply custom rules defined in the wikitext
article body at ::getText() time. Remove the various hacks that we'd
added in an attempt to do so, which were made unnecessary by
I321cd31dae64bbf845d53282e5d28a55bc4ec319.
Bug: T306862
Change-Id: Ib12cd02e9ade91d5794462e8833f2aa3b45a51f2
* ParserTestRunner: LocalisationCache needs to be reset since it has a
reference to LanguageNameUtils which has a copy of
$wgUsePigLatinVariant. Also factor out some
MediaWikiServices::getInstance() calls.
* In some other tests, set the variable.
Change-Id: I6c1e9bfad9790cf805809c28a3f8d45952cbb981
Two bugs here: first, we were silently skipping a needed file update in
::updateKnownFailures() if the file didn't previously exist. We're going
to still avoid writing the file if it didn't previously exist, out of an
abundance of caution, but at least we'll now fail noisily so the problem
can be fixed.
Second, the `--parsoid` flag was overriding the result of
::getFileSkipMessage() so we were processing files for
`--updateKnownFailures` that should have been skipped because they
are not marked parsoid-compatible. This override made sense when
we were still debugging the integrated-mode parsoid support in the
ParserTestRunner, but it is not needed anymore.
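The first fix (fail noisily instead of silently skipping) can be
sketched as follows, in Python rather than PHP. The function name and
the JSON file layout are assumptions made for this illustration, not
the runner's actual code:

```python
import json
import os


def update_known_failures(path, failures):
    """Rewrite an existing known-failures file; refuse to create a new one."""
    if not os.path.exists(path):
        # Out of caution we still do not create the file, but we no longer
        # skip the update silently: the caller sees an error it can fix.
        raise FileNotFoundError(
            f"Known failures file {path} does not exist; create it first"
        )
    with open(path, "w", encoding="utf-8") as handle:
        json.dump(failures, handle, indent=4, sort_keys=True)
        handle.write("\n")
```

The design choice mirrors the description above: a missing file is
treated as a problem to report, never as a reason to do nothing.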
Change-Id: Iba961ea327e54bb6bdc87399dbcba87cd57b6b20
This adds support for various options which add metadata
information to the parser test output, including 'showtocdata'.
This builds on I845694d4f2109a8b9125410e8533ca69bbea50fa in treating
the metadata output as a separate section.
Bug: T270312
Depends-On: I8023931d31e494df325b16d1b922539e20b58c51
Change-Id: I0c42ec2dc93c358f1cddab77324b229bcc163e83
This provides a bit of isolation from the actual layout and names
of properties in the object, as well as being a touch more readable
when debugging test failures.
Change-Id: I5ddca850f577b2ac24e237a2518f03983e79a51d
If a ParserTest mixes HTML output and metadata properties, it can
complicate HTML normalization and other test processes, especially
for Parsoid-mode bidirectional tests.
Support splitting metadata output into a separate section, named
`!! metadata`, with the standard options for legacy and parsoid
variants, like `!! metadata/php` and `!! metadata/parsoid` and
`!! metadata/parsoid+integrated` etc.
For compatibility, if the metadata flags are present on the test
and the new section is not present, we'll continue to handle the
metadata output as we have before, i.e. by appending or prepending the
metadata to the HTML.
Code search for uses of these options (uses in parsoid and core can
be ignored; uses of 'pst' are harmless when they are not combined
with another option):
https://codesearch.wmcloud.org/search/?q=%28%5E%7C%20%29%28%28showtitle%7Cshowindicators%7Cill%7Ccat%7Cpst%7Cshowflags%29%28%20%7C%24%29%7C%28extension%3D%7Cproperty%3D%29%29&i=nope&files=%5Etests%2Fparser%2F.*%5C.txt&excludeFiles=&repos=
Change-Id: I845694d4f2109a8b9125410e8533ca69bbea50fa
This is a clean up refactor to keep the metadata handling code in one place,
and to allow Parsoid to share it when running in integrated mode.
Change-Id: Ic4fda0397977413b9d742d47ab1fc5a7bc6f6b96
In 24949480eb (Oct 2021) injection of
the Table of Contents was moved from Parser to
ParserOutput::getText(); that is, from parse time to "postprocess text
possibly fetched from the cache" time. Unfortunately, this meant that
language conversion wasn't done on the table of contents (!), for
either traditional skins or the vector-2022 skin. This was fixed for
traditional skins by 059e62cde6 (Nov
2021), later amended by 0955046ca5 (Mar
2022), which added explicit language conversion to the TOC injection
process in ParserOutput::getText(). This fix was still not complete,
however, since editor-defined custom language-conversion rules defined
in the article body were no longer available to the language converter
when conversion was done in ParserOutput::getText(); the ToC title was
also being double-converted. Further, neither of these short-term
fixes addressed the output of ParserOutput::getSections() (now
ParserOutput::getTOCData()) which was used by vector-2022 to generate
the ToC in the sidebar and which remained entirely unconverted.
With 439656e019 (Jan 2023), we started
using the ::getSections()/::getTOCData() output for main article text
as well, but we kept the previous hack which post-converted the
generated HTML. This kept old skins at parity with the post-Oct-2021
status, but also didn't address the conversion issue for vector-2022.
The solution here is to perform language conversion on the ToC lines
at parse time along with the rest of the language conversion, and
store *converted* headings in TOCData. This has a number of side
effects:
1. The ToC information array available via the action API
is now language converted. This is *probably* what you wanted in the
first place, but could potentially be disruptive.
2. The ToC is consistently converted with the full set of
editor-defined custom conversion rules. Before Oct 2021, the ToC was
converted using the set of custom conversion rules *active at the
point at which the ToC was inserted* (which was usually near the
beginning of the article). When all conversion rules appear at the
very top of the article (best practice!):
    -{en:Foo; en-x-piglatin:Bar;}-
    Lead section text
    == Introduction ==
    == Foo ==
There should be no difference between pre-Oct 2021 behavior and the
behavior after this patch: in both cases the rule defined in the
article body will be applied both to the heading and to the TOC, and
they will be consistent. (After Oct 2021 and before this patch, Foo
would be converted in the heading but not in the table of contents.)
But in cases where conversion rules are defined after the
TOC insertion point, the section heading as it appears in the body
text could appear different from the section heading as it appears in
the ToC. For example, if you defined a conversion rule just before
using a term in a heading:
    == Introduction ==
    -{en:Foo; en-x-piglatin:Bar;}-
    == Foo ==
Before Oct 2021, this rule would be applied to the heading, but not to
the TOC (because the TOC insertion point was before the rule
definition). This would also be the behavior before this patch (since
rules defined in the article body are currently not applied to the TOC
at all).
After this patch, the rule will be applied to both the heading and the
TOC (because the rule application location is effectively "at the very
end of the article"). In the rare cases when rules are not defined in
glossaries at the top of the article, this type of usage (definition
immediately preceding first use) is expected to be the most common,
and the behavior after this patch is more correct.
But alternatively, if you defined a conversion rule *after* using
the term in a heading:
    == Introduction ==
    == Foo ==
    -{en:Foo; en-x-piglatin:Bar;}-
Before Oct 2021, this rule wouldn't be applied to the heading *or* the
TOC. Before this patch, this would also be the case (because rules
defined in the article body are not applied to the TOC at all). After this
patch, the rule will be applied to the ToC but not the heading, since
the application point for the TOC is effectively at the end of the
article. This inconsistency is probably not desirable, but this case
is expected to be rare, and (assuming the editor intended 'Foo' to be
unconverted) the editor can work around the inconsistency by
explicitly protecting 'Foo' from conversion:
    == -{Foo}- ==
    -{en:Foo; en-x-piglatin:Bar;}-
And if the editor /intended/ Foo to be converted, the rule definition
should be moved earlier in the article. Again, putting all rules at
the top of the article is the preferred style, and works better with
the glossary style used by the zhwiki community (see also
https://www.mediawiki.org/wiki/Requests_for_comment/Scoped_language_converter
).
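The three cases above can be reproduced with a toy model in Python. This
is only a sketch of the rule-application points described in this
message: headings in the body see the rules defined so far, while TOC
lines see every rule in the article. The parsing is deliberately naive
(it does not handle `-{Foo}-` protection) and `render` is a hypothetical
name, not part of the parser:

```python
import re

# Matches a rule line such as: -{en:Foo; en-x-piglatin:Bar;}-
RULE = re.compile(r"^-\{en:(\w+); en-x-piglatin:(\w+);\}-$")
# Matches a level-2 heading such as: == Foo ==
HEADING = re.compile(r"^==\s*(.*?)\s*==$")


def convert(text, rules):
    """Apply every known rule to the text (toy en-x-piglatin conversion)."""
    for source, target in rules.items():
        text = text.replace(source, target)
    return text


def render(lines):
    """Return (body headings, TOC lines) for a list of wikitext lines."""
    rules = {}
    raw_headings = []
    body_headings = []
    for line in lines:
        rule = RULE.match(line)
        if rule:
            rules[rule.group(1)] = rule.group(2)
            continue
        heading = HEADING.match(line)
        if heading:
            raw_headings.append(heading.group(1))
            # Body headings only see the rules defined before them.
            body_headings.append(convert(heading.group(1), rules))
    # TOC lines are converted "at the very end", with all rules known.
    toc = [convert(text, rules) for text in raw_headings]
    return body_headings, toc
```

With the rule at the top or just before the heading, both results show
`Bar`; with the rule after the heading, the body heading stays `Foo`
while the TOC line becomes `Bar`, which is the inconsistency discussed
above.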
Bug: T306862
Depends-On: I0c9c9fec920f7cb028d935e552a8f11475a23ba7
Change-Id: I321cd31dae64bbf845d53282e5d28a55bc4ec319