722 commits

Author SHA1 Message Date
Isabelle Hurbain-Palatin
7f63d5250e Revert "Use Remex for DeduplicateStyles transform"
This reverts commit 82da9cf14b.

Passing through Remex seems to have unexpected consequences to be
investigated but, for the sake of unbreaking the UBN, let's revert this
first.

Bug: T353920
Change-Id: Iaac7942aa77aee5ab525852ac5b41dd516ff13c9
2023-12-22 11:26:09 +01:00
C. Scott Ananian
82da9cf14b Use Remex for DeduplicateStyles transform
The previous implementation was using an ad-hoc regular expression which
was matching inside the data-mw attribute of Parsoid output, eg:

 <sup about="#mwt42" [...] typeof="mw:Extension/ref mw:Error" data-mw="{&quot;name&quot;:&quot;ref&quot;,&quot;attrs&quot;:{&quot;name&quot;:&quot;infobox_stats_ref_rail&quot;},&quot;body&quot;:{&quot;html&quot;:&quot;<style data-mw-deduplicate=\&quot;TemplateStyles:r1133582631\&quot; typeof=\&quot;...">

After substitution, the <link> element inserted contained " instead of
&quot; and so broke out of the attribute.

Instead use a proper HTML tokenizer (via wikimedia/remex-html) so that
we don't allow bogus matches inside attribute values.
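
The failure mode can be reproduced outside MediaWiki. The following is a
minimal Python sketch, not the Remex-based code from this change: the sample
markup is a simplified stand-in for the Parsoid output above, and the names
NAIVE and StyleCounter are made up for illustration. It only shows that a
regular expression also matches text inside an attribute value, while a
tokenizer reports real elements only.

```python
import re
from html.parser import HTMLParser

# Simplified stand-in for Parsoid output (assumption, not real output):
# the first "<style ...>" is only text inside the data-mw attribute value,
# the second one is a real element.
HTML = (
    '<sup data-mw="{&quot;html&quot;:&quot;'
    '<style data-mw-deduplicate=TemplateStyles:r1>&quot;}">x</sup>'
    '<style data-mw-deduplicate="TemplateStyles:r1">.a{}</style>'
)

# Ad-hoc pattern in the spirit of the old implementation (hypothetical).
NAIVE = re.compile(r'<style\s+data-mw-deduplicate=')


class StyleCounter(HTMLParser):
    """Counts <style data-mw-deduplicate=...> start tags seen by the tokenizer."""

    def __init__(self):
        super().__init__()
        self.count = 0

    def handle_starttag(self, tag, attrs):
        if tag == "style" and any(name == "data-mw-deduplicate" for name, _ in attrs):
            self.count += 1


def count_with_regex(html):
    # Matches anywhere in the string, including inside attribute values.
    return len(NAIVE.findall(html))


def count_with_tokenizer(html):
    # Only real start tags reach handle_starttag().
    parser = StyleCounter()
    parser.feed(html)
    parser.close()
    return parser.count
```

On this sample the regular expression sees one more "style tag" than the
tokenizer does, which is the bogus match described above.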

To fix up tests:
* Don't deduplicate styles when parsing UX messages (also helps performance)
* Don't deduplicate styles in ContentHandler integration tests
* Don't deduplicate styles by default in parser tests
  (unless explicit option is set)

Depends-On: Id9801a9ff540bd818a32bc6fa35c48a9cff12d3a
Depends-On: I5111f1fdb7140948b82113adbc774af286174ab3
Followup-To: Ic0b17e361bf6eb0e71c498abc17f5f67f82318f8
Change-Id: I32d3d1772243c3819e1e1486351d16871b6e21c4
2023-12-15 17:49:21 +01:00
James D. Forrester
9bfb75ff90 Namespace ParserOutput
Most used non-namespaced class!

Bug: T353458
Change-Id: I4c2cbb0a808b3881a4d6ca489eee5d8c8ebf26cf
2023-12-14 14:57:34 -05:00
C. Scott Ananian
4b83285954 ParserOutput: Allow passing LinkTarget to title-related methods
Broadened the argument type to allow passing LinkTarget to:
* ParserOutput::addCategory()
* ParserOutput::addLanguageLink()
* ParserOutput::addLink()
* ParserOutput::addImage()
* ParserOutput::addTemplate()

This allows for a tighter interface with Parsoid's
ContentMetadataCollector class and avoids errors caused by passing the
wrong form of string title ("text" with spaces versus "dbkey" with
underscores).
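
As a rough illustration of why a typed target avoids the text/dbkey mixup,
here is a Python sketch. LinkTargetSketch and add_category are hypothetical
names, and real title normalization does more than swap spaces and
underscores; this is not the ParserOutput API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LinkTargetSketch:
    """Hypothetical stand-in for a link target: namespace number plus dbkey."""
    namespace: int
    dbkey: str  # underscore form, never spaces

    @classmethod
    def from_text(cls, namespace, text):
        # Accept either form and store only the dbkey form.
        return cls(namespace, text.replace(" ", "_"))

    def text(self):
        # Display form with spaces.
        return self.dbkey.replace("_", " ")


categories = {}


def add_category(target, sort_key=""):
    # Keyed by dbkey, so "Living people" and "Living_people" cannot diverge.
    categories[target.dbkey] = sort_key
```

With plain strings, the two spellings would have produced two separate
entries; with the typed target they collapse into one.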

There are a few performance problems remaining after this patch, which
only apply to use by Parsoid (not the legacy parser):

1. ::addLink() does inefficient db requests to fetch the page id for
each link if the optional $id parameter is not passed.  These lookups
should be deferred and a LinkBatch used.  (The legacy parser always
passes $id.)

2. ::addTemplate() similarly requires $page_id (and $rev_id) to be
passed, so is not currently usable by Parsoid.

3. ::addLanguageLink() uses Title::getFullText() which is not present
in LinkTarget and is currently implemented as a full Title lookup.
This is not an issue for the legacy parser, because it already has a
Title object so the lookup is a no-op, but could be improved for
Parsoid's use.

Bug: T296023
Change-Id: If21ec8563c8a619bdde7c0cb6534bb9009480a21
2023-12-08 17:50:29 -05:00
jenkins-bot
b7fc1b2f43 Merge "Only cache expensive renderings" 2023-11-30 21:24:34 +00:00
daniel
e3fb964439 Only cache expensive renderings
Pages that are fast to render can be omitted from the parser cache
to preserve disk space and cache write operations.

The threshold is configurable per namespace, so the tradeoff can
be evaluated based on different access patterns. For example, pages
that are accessed rarely, like file description pages on Commons,
may have a high threshold configured, while pages that are read
frequently, like Wikipedia articles, may be configured to be always
cached, using a 0 threshold.

Filtering is based on a time profile recorded in the ParserOutput.
A generic mechanism for capturing the timing profile is implemented
in the ContentHandler base class. Subclasses may implement a more
rigorous capture mechanism.
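
The filtering rule can be sketched in a few lines of Python. The threshold
values, the THRESHOLDS mapping and should_cache() are assumptions made for
illustration (including the choice of ">=" so that a 0 threshold always
caches); they are not the configuration or code added by this change.

```python
# Hypothetical per-namespace thresholds in seconds; 0 means "always cache".
THRESHOLDS = {
    0: 0.0,  # articles: read often, always cached
    6: 1.0,  # file description pages: cached only if rendering was slow
}
DEFAULT_THRESHOLD = 0.0


def should_cache(namespace, render_seconds, thresholds=THRESHOLDS):
    """Return True if a rendering was expensive enough to be worth caching."""
    return render_seconds >= thresholds.get(namespace, DEFAULT_THRESHOLD)
```

A fast file-page rendering is skipped, while any article rendering is kept.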

Bug: T346765
Change-Id: I38a6f3ef064f98f3ad6a7c60856b0248a94fe9ac
2023-11-30 20:56:12 +00:00
Martin Urbanec
29af4dd074 Move user options related classes into its own namespace
There are a couple of user options related classes already,
and the T321527 work on dynamic defaults is going to add
even more. Let's move them into a separate namespace
to make core a bit more organized.

Old name is kept as an alias for compatibility purposes.

Bug: T321527
Bug: T352284
Change-Id: I9822eb1553870b876d0b8a927e4e86c27d83bd52
2023-11-29 13:27:13 +01:00
thiemowmde
10a828ba72 Deprecate MagicWordFactory::getSubstIDs
The main motivation is to further reduce the complexity of the class:
* There is no code that ever writes to $this->mSubstIDs. It's
  effectively a constant.
* According to CodeSearch the getSubstIDs() method is not used
  anywhere. It's @internal to the parser.
* I find it weird that the parser needs to call 2 factory methods to
  do 1 thing.
* I still find it a good idea to keep the knowledge encapsulated in
  the factory and not have the [ 'subst', 'safesubst' ] array in the
  parser. That's why I propose the new method.

Change-Id: I5c147c75200c3c34a410d93a0328b56ea00a050f
2023-11-13 11:10:24 +01:00
Timo Tijhof
d0a96db0f9 parser: Move lang/dir and mw-content-ltr to ParserOutput::getText
== Skin::wrapHTML ==

Skin::wrapHTML no longer has to perform any guessing of the
ParserOutput language. Nor does it have to special-case wiki pages vs
special pages in this regard. Yay, code removal.

== ImagePage ==

On URLs like /wiki/File:Example.jpg, the main output handler is
ImagePage::view. This calls the parent Article::view to handle most of
its output. Article::view obtains the ParserOptions, and then fetches
ParserOutput, and then adds `<div class=mw-parser-output>` and its
metadata to OutputPage.

Before this change, ImagePage::view was creating a wrapper based
on "predicting" what language the ParserOutput will contain. It
couldn't call the new OutputPage::getContentLanguage or some
equivalent as Article::view wouldn't have populated that yet.

This leaky abstraction is fixed by this change as now the `<div>`
from ParserOutput no longer comes with a "please wrap it properly"
contract that Article subclasses couldn't possibly implement correctly
(it couldn't wrap it after the fact because Article::view writes to
OutputPage directly).

RECENT (T310445):

A special case was recently added for file pages about translated SVGs.
For those, we decide which language to use for the "fullMedia" thumb
atop the page. This was recently changed as part of T310445 from a
hardcoded $wgLanguageCode (site content lang) to new problematic
Title::getPageViewLanguage, which tries to guesstimate the page
language of the rendered ParserOutput and then gets the preferred
variant for the current user. The motivation for this was to support
language variants but used Title::getPageViewLanguage as a kitchen
sink to achieve that minor side-effect. The only part of this
now-deprecated method that we actually need is
LanguageConverter::getPreferredVariant().

Test plan: Covered by ImagePageTest.

== Skin mainpage-title ==

RECENT (T331095, T298715):

A special case was added to Skin::getTemplateData that powers the
mainpage-title interface message feature. This is empty by default,
but when created via MediaWiki:mainpage-title allows interface admins
to replace the H1 with a custom and localised page heading.

A few months ago, in Ifc9f0a7174, Title::getPageViewLanguage was
applied here to support language variants. Replace with the same
fix as for ImagePage. Revert back to Message::inContentLanguage()
but refactor to inLanguage() via MediaWikiServices::getContentLanguage
so that LanguageConverter::getPreferredVariant can be applied.

== EditPage ==

This was doing similar "predicting" of the ParserOutput language to
create an empty preview placeholder for use by preview.js. Now that
ApiParse (via ParserOutput::getText) returns a usable element without
any secret "you magically know the right class, lang, and dir" contract,
this placeholder is no longer needed.

Test Plan:

* EditPage: Default preview
  1. index.php?title=Main_Page&action=edit
  2. Show preview
  3. Assert <div class="mw-content-ltr mw-parser-output" lang=en dir=ltr>

* EditPage: JS preview
  1. Preferences > Editing > Show preview without reload
  2. index.php?title=Main_Page&action=edit
  3. Show preview
  4. Assert <div class="mw-content-ltr mw-parser-output" lang=en dir=ltr>
  5. Type something and 'Show preview' again
  6. Assert old element gone, new text is shown, and new element
     attributes are the same as the above.

== McrUndoAction ==

Same as EditPage basically, but without the JS preview use case.

== DifferenceEngine ==

Test:

1. Open /w/index.php?title=Main_Page&diff=0
   (this shows the latest diff, can do manually by viewing
   /wiki/Main_Page, click "View history", click "Compare selected revisions")
2. Assert <div class="mw-content-ltr mw-parser-output" lang=en dir=ltr>
3. Open /w/index.php?title=Main_Page&diff=0&action=render
4. Assert <div class="mw-content-ltr mw-parser-output" lang=en dir=ltr>

== Special:ExpandTemplates ==

Test:

1. /wiki/Special:ExpandTemplates
2. Write "Hello".
3. "OK"
4. Assert <div class="mw-content-ltr mw-parser-output" lang=en dir=ltr>

Bug: T341244
Depends-On: Icd9c079f5896ee83d86b9c2699636dc81d25a14c
Depends-On: I4e7484b3b94f1cb6062e7cef9f20626b650bb4b1
Depends-On: I90b88f3b3a3bbeba4f48d118f92f54864997e105
Change-Id: Ib130a055e46764544af0f1a46d2bc2b3a7ee85b7
2023-11-03 19:24:47 -04:00
jenkins-bot
761bdee61e Merge "tests: Use fallback skin for ParserOutput/DefaultOutputTransform tests" 2023-10-30 22:40:07 +00:00
Timo Tijhof
08ddbf3465 parser: deprecate unused MagicWord::getId, improve docs and tests
* MagicWord::getId was added in r24808 (164bb322f2) but never used.
  At the time, access modifiers like 'private' were not yet in use.
  Deprecate the method with warnings, for removal in a future release.

* Fix zero coverage for MagicWord. Because its constructor is
  internal, the class is only intended to be created via the array and
  factory classes. Let their tests cover this class.

* Remove redundant file-level description and ensure the class desc
  and ingroup tag are on the class block instead.
  Ref https://gerrit.wikimedia.org/r/q/owner:Krinkle+message:ingroup

* Mark constructor `@internal` (was already implied by
  stable interface policy), and explain where to get the object
  instead.

* Mark load() `@internal`. Method was introduced in 1.1 when the
  class (and PHP) did not yet use visibility modifiers for private
  methods. The only way to get an instance of MagicWord
  (MagicWordFactory::get) already calls load(), the method is not
  a no-op if called a second time, and (fortunately) there exist no
  callers to this outside this class that I could find.

* MagicWordArray::getBaseRegex was marked as internal
  in change I17f1b7207db8d2203c904508f3ab8a64b68736a8.

Change-Id: I4084f858bb356029c142fbdb699f91cf0d6ec56f
2023-10-26 16:07:20 +01:00
thiemowmde
6447dbc37b parser: Use more specific exceptions in MagicWord classes
… instead of the generic MWException and even more generic Exception.
Most, if not all of these should be unreachable anyway. I.e. these
are what we call "unchecked" exceptions, see T240672.

We also have a polyfill for preg_last_error_msg. No need to wrap it
in a function_exists (any more).

Change-Id: Ie26bef3b4371d011ec3f1874986072605692f486
2023-10-25 15:34:03 +02:00
Bartosz Dziewoński
154e9a444c tests: Use fallback skin for ParserOutput/DefaultOutputTransform tests
This matches the behavior of parserTests.txt again (in which
the fallback skin is used by ParserTestRunner::runLegacyTest).
The extra <span> wrappers were added by the Vector skin
(and could be affected by future changes to the Vector skin).

Follow-up to Ief6a6ee03ada8207fc5c60ea438412fa2d529022.

Change-Id: I33729b5026fcfbdbacc0e3fdfef91c9e6b461e6c
2023-10-24 19:02:23 +02:00
Jon Robson
9ef28e8e0e Skin: Separate generation of edit section data from HTML
The SkinMustache class now accepts a skin option that allows
callers to specify a template that can be used to render
the edit section link.

Additional change:
* Parser tests updated, as the edit link label is now wrapped
in a span when rendered in Vector 2022, consistent with other
links.

Bug: T346944
Change-Id: Ief6a6ee03ada8207fc5c60ea438412fa2d529022
2023-10-23 21:08:33 +00:00
Isabelle Hurbain-Palatin
36b4ab44f6 Refactor ParserOutput::getText into DefaultOutputTransform service
This also introduces the ephemeral field "$mTransformedText" to store
the result of transformation in ParserOutput.

This is a first step before the transformation uses HtmlHolder as input
and output.

Bug: T348253
Change-Id: I312f3748ebfb0373ee3542ba0abdeefe7db1d488
2023-10-16 13:11:38 +02:00
C. Scott Ananian
02852b813d Remove implicit setter for ParserOutput::mTOCHTML
The ::setTOCHTML() and ::getTOCHTML() methods have been deprecated
since 1.40; there's no reason we should be updating ::$mTOCHTML
behind their backs.

Bug: T348134
Change-Id: I9396bc0a2caeb974a06c5b47075b3e2bb9f4278a
2023-10-04 15:10:58 -04:00
C. Scott Ananian
d20663259f Hard-deprecate ParserOutput::getCategories(), deprecated in 1.40
It is difficult to distinguish this method from OutputPage::getCategories()
in code search:

   https://codesearch.wmcloud.org/deployed/?q=%5BOo%5Dut%28put%29%3F%28%5C%28%5C%29%29%3F-%3EgetCategories%5C%28&files=&excludeFiles=&repos=

We generally try to replace $output with $parserOutput or $pOutput
as we touch code to improve the ability of codesearch to dig up
deprecated ParserOutput methods.

Bug: T305161
Depends-On: I02dd4f61c43c225b0ef6dc51c3e4f9d967a0a272
Depends-On: I61d2d77591579d825ad9d37f902e40366be55dd6
Depends-On: I91155106b7a9e10d3334f95ba4936d02851bfb11
Depends-On: Iaca745c79d9587571af03b23b21d76a6cba0ebf1
Depends-On: Id10a171c44411b1233ee4d6cf8fbd3dc57744eef
Depends-On: I47a25c011d9bd4b1a15dda4e673e32c25eb64f2b
Depends-On: I683fc768aba50b801f46467fcfa1668fa8731ea6
Change-Id: I5a2ac1c99b8b199102e12f0d32dd6ec5cdc24054
2023-09-29 15:25:50 -04:00
James D. Forrester
bda8e89073 Drop Sanitizer::escapeIdReferenceList(), deprecated since 1.36
Change-Id: Idc9398a3bc7f6378965e5b6350f6c52fd05cef99
2023-09-27 22:40:23 +00:00
jenkins-bot
3cc61694cc Merge "Update 'validateParserCacheSerializationTestData' maintenance script" 2023-09-25 17:46:06 +00:00
James D. Forrester
468e69bccc Namespace Sanitizer under \MediaWiki\Parser
Bug: T166010
Change-Id: Id13dcbf7a0372017495958dbc4f601f40c122508
2023-09-21 05:39:23 +00:00
James D. Forrester
1d0b7ae1e2 Namespace User under \MediaWiki\User
Bug: T166010
Change-Id: I7257302b485588af31384d4f7fc8e30551f161f1
2023-09-19 19:18:16 +00:00
jenkins-bot
14e52d187d Merge "Parser: use PHPDoc comments on properties, typed private properties" 2023-09-19 05:53:21 +00:00
James D. Forrester
b16be7a36c Namespace TitleFormatter under \MediaWiki\Title
One of the big ones, so doing this alone.

Bug: T166010
Change-Id: Ic2d59eb6764b1a273ed7162ecabf641f638b8f66
2023-09-19 05:17:18 +00:00
James D. Forrester
a8a6cfd966 Namespace NamespaceInfo under \MediaWiki\Title
One of the big ones, so doing this alone.

Bug: T166010
Change-Id: Ibe103cd362535d3cb94cb8931e95fc74099d1497
2023-09-19 05:17:04 +00:00
jenkins-bot
30f54f6322 Merge "Namespace TitleValue under \MediaWiki\Title" 2023-09-18 21:34:29 +00:00
jenkins-bot
3751d36211 Merge "Namespace remaining 'specialpage' files under \MediaWiki\SpecialPage" 2023-09-18 21:06:01 +00:00
James D. Forrester
94ece673b2 Namespace TitleValue under \MediaWiki\Title
One of the big ones, so doing this alone.

Bug: T166010
Change-Id: I4c901d5c32696d8334ec30cede7d9b6f3d8d645e
2023-09-18 18:24:39 +01:00
James D. Forrester
459cbb0494 Namespace remaining 'specialpage' files under \MediaWiki\SpecialPage
SpecialPageFactory is already here, but none of the others were yet.

Bug: T166010
Change-Id: I9689bf0a1ab329625e23669b99f019b96295fffd
2023-09-18 18:23:13 +01:00
C. Scott Ananian
d421ab57f8 Remove ParserOutput::addOutputHook() and related code
ParserOutput::addOutputHook() has been deprecated since 1.38, and without
any calls to ::addOutputHook() the associated ::getOutputHooks() and
$wgParserOutputHooks configuration do nothing.

Bug: T292321
Bug: T305161
Change-Id: Ib770c680d5e0697980e7e36a323ec56ba1d806b8
2023-09-18 11:34:02 -04:00
C. Scott Ananian
83e197d817 Remove ParserOutput::addTrackingCategory(), deprecated since 1.38
Instead use either Parser::addTrackingCategory() or the TrackingCategories
service.

Bug: T305161
Change-Id: I19e0f67e377e6c68f54f6d5bb4f079110d1e61fc
2023-09-18 11:34:02 -04:00
tacsipacsi
6cf91bbb4c Parser: use PHPDoc comments on properties, typed private properties
Many private and even public properties and class constants were
documented using #-style comments, which were not available in Doxygen
documentation and editor tooling. Move these comments to PHPDoc comments
to make them accessible.

Add type declarations to private and internal properties wherever
possible. Remove PHPDoc documentation made redundant by this, but
add/keep PHPDoc documentation where it provides additional value
(human-readable documentation, array types, union types). Don’t add type
declarations to non-internal public properties as it potentially causes
breakage in case some external code not only uses the deprecated
property, but even writes it. These type declarations should be added
when the properties are made private or internal.

Change-Id: I247643b9bf0cabdc92a7e893d653edeaed9a1307
2023-09-17 21:51:59 +02:00
Umherirrender
790ae736c1 tests: Move test cases from /includes/ into sub folder
Follow move of the tested class
Most moves are part of T321882

Change-Id: I74ab45d6a5331dcb2ff0b65dc2cc7c6315146646
2023-09-13 00:09:05 +02:00
C. Scott Ananian
22f8397694 Update 'validateParserCacheSerializationTestData' maintenance script
Transitions the validateParserCacheSerializationTestData maintenance
script to the new maintenance script mechanism based on
maintenance/run.php

While we're here, also fix a minor bug that made the `--create` option
crash if this was the very first time serialization files for a
particular test case were being generated (ie, there was no prior
existing serialization on disk yet).

Change-Id: Ic0dadce750a2b390739ae657bab7f899860d1078
2023-09-07 20:37:36 +00:00
Reedy
a1144dc7c5 mark various anonymous functions as static
Change-Id: Iefe896769359f0d32e52bf20aa03e1c3715d5074
2023-08-22 19:38:38 +00:00
Amir Sarabadani
15a278189f Reorg: Move MWTimestamp to MediaWiki\Utils
Bug: T321882
Change-Id: I48c10343295c4eb3d9ef8037343b0070e928f040
2023-08-19 05:53:40 +02:00
C. Scott Ananian
7a8dd531b2 Remove ParserOutput::addWarning, deprecated since 1.38
Replaced with ParserOutput::addWarningMsg()

Bug: T305161
Change-Id: I137b35a2e8250ea7c10059d04071a98a4f968038
2023-08-07 11:57:07 -04:00
Daimona Eaytoy
6b1a62e169 Fix more non-database tests accessing the database
Mock the needed services, or set fixed values to avoid DB lookups, when
possible. Add the test to the Database group otherwise, e.g. for things
like Skin and Parser that use global state all over the place.

Change-Id: I8d87013d89accaf04d0ac19cb4b7216290383eb5
2023-08-06 15:30:41 +00:00
jenkins-bot
549961495b Merge "Hard-deprecate ParserOutput::{get,set}Flag()" 2023-08-02 17:48:18 +00:00
Daimona Eaytoy
2a0de02aab phpunit: Avoid TestUser in non-database tests
TestUser creates the user and therefore needs the database. Avoid using
it in non-database tests.

Add ApiQueryBlockInfoTraitTest to the Database group because it needs
the database.

Add DeleteUserEmailTest to the Database group because since 3bedffa8
the default user is not created any more in non-database tests

Change-Id: Iff438964dde47a47a2fa4a314d55010bd8c7fee5
2023-07-29 14:26:50 +00:00
C. Scott Ananian
e22d93a6bb Hard-deprecate ParserOutput::{get,set}Flag()
These were deprecated in 1.38; users are expected to use
ParserOutput::{get,set}OutputFlag() instead, which helps eliminate a
confusing aliasing of many MW methods named "flag".

Original deprecation: 06ab90f163

Code search:
    https://codesearch.wmcloud.org/search/?q=%5BOo%5Dut%28put%29%3F%28%5C%28%5C%29%29%3F-%3E%28g%7Cs%29etFlag%5C%28&i=nope&files=&excludeFiles=&repos=

Patches for non-production extensions:
 PageProperties: I592d43e2c912df635cd9162180ed20a6136535f1
 CIForms: I238a6c557891bb6d271d2641261ef69542b7957e

Bug: T292868
Bug: T305161
Change-Id: I4525443ab0932241b0cf64ab606f7ab7d6d70b6e
2023-07-28 13:51:02 -04:00
jenkins-bot
1929084e47 Merge "Rename newly-added ParserOutput::appendOutputString() method" 2023-07-28 17:29:12 +00:00
jenkins-bot
4a4da63cd2 Merge "Fix incomplete/broken ParserFactoryTest & ParserTest" 2023-07-28 17:08:30 +00:00
C. Scott Ananian
ea51801f79 Rename newly-added ParserOutput::appendOutputString() method
Tweaked the pluralization of the newly-added
ParserOutput::appendOutputString() method (now ::appendOutputStrings()
and ::getOutputStrings()), and name of the ParserOutputStrings class
(now ParserOutputStringSets), in an effort to continue repainting
bikesheds until the color is juuuust right.

Also extended the new method to cover ::addModules() and ::addModuleStyles()
and added support for these string sets in ::collectMetadata().

(These methods and the enumeration class were originally added in
b2cfa31eb6173e9f5e8607eadd126c33f8ce440b.)

Depends-On: I8bdffa55498d90e990af5bfc3332e3028b0a3539
Change-Id: Ibd41485d5db7779f01642e2144c50ed49d409812
2023-07-28 12:10:56 -04:00
thiemowmde
8a2b869945 Fix incomplete/broken ParserFactoryTest & ParserTest
Some details:
* Just use a real MagicWord object. It doesn't do anything that
  needs mocking.
* Add missing methods to mocks.
* Remove not needed details from mocks.
* Remove duplicate test that does the same.
* Remove pointless assertions that are impossible to ever fail.

Change-Id: I177242429a528d2c7109ca757840b538b772711c
2023-07-28 14:22:46 +00:00
Isabelle Hurbain-Palatin
b2cfa31eb6 Add append/getOutputString to ParserOutput
This aims at providing an interface similar to setOutputFlag for string
sets, such as the ones used in CSP properties.

Change-Id: I6f103bd88802e66611e483403a2f8a540d54aae9
2023-07-27 11:37:11 +02:00
thiemowmde
8a9dd67139 Avoid calling overrideConfigValue() multiple times
Same as I7a82951.

overrideConfigValue() and overrideConfigValues() both call
setMwGlobals(), which calls resetServices(). This is surprisingly
expensive. It's much better to call it once with an array.
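
The cost difference can be modelled with a toy test case in Python.
FakeTestCase and its methods are hypothetical stand-ins, not the MediaWiki
test base class; the only point is that each separate override pays for one
service reset.

```python
class FakeTestCase:
    """Toy model: every override call triggers one (expensive) service reset."""

    def __init__(self):
        self.config = {}
        self.resets = 0

    def _reset_services(self):
        # Stand-in for the expensive reset; here it is only counted.
        self.resets += 1

    def override_config_values(self, values):
        self.config.update(values)
        self._reset_services()

    def override_config_value(self, key, value):
        # The single-value form delegates, so it resets once per call.
        self.override_config_values({key: value})


one_by_one = FakeTestCase()
for key, value in {"A": 1, "B": 2, "C": 3}.items():
    one_by_one.override_config_value(key, value)

batched = FakeTestCase()
batched.override_config_values({"A": 1, "B": 2, "C": 3})
```

Both instances end up with the same configuration, but the batched one
resets services once instead of three times.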

Change-Id: I4ff2f6b902b1a1e0b554ce6fc76f3b612f703fae
2023-07-20 14:59:42 +02:00
Umherirrender
d2a09384a7 tests: Change some setMwGlobals to overrideConfigValue
Change-Id: I21b9bf907e313947360b1607f11ae9917488f109
2023-07-17 23:02:32 +02:00
Daimona Eaytoy
5035ecd2f7 CoreParserFunctionsTest: Avoid username pattern reserved for temp users
The leading "*" is currently used as the username pattern for temp
users, meaning this test will fail if

  $wgAutoCreateTempUser['enabled'] = true;

Put the star at the end instead, and use a variable for the username
instead of repeating it multiple times.

Change-Id: Ie0414de5f9d9054dfec540f14bd0dc9ec7b4cb72
2023-07-16 19:55:50 +02:00
daniel
c4033734db HookContainer: deprecate old hook handler formats
This reduces the acceptable forms for hook handlers to three things:
* a callable (in the form of a string, an array, or a closure)
* an object, which is expected to have a public "on" method that
  matches the hook name.
* an array containing an object spec in the "handler" key, for use
  with ExtensionRegistry.

All other forms will trigger a deprecation warning.
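
A loose Python analogue of that classification is sketched below.
classify_handler() is a made-up helper, a dict stands in for the PHP array
with a "handler" key, and Python callables stand in for PHP callables; the
real checks in HookContainer are not reproduced here.

```python
def classify_handler(handler, hook_name):
    """Return which accepted form a handler takes, or 'deprecated'."""
    # Form 3: an object spec wrapped in a "handler" key.
    if isinstance(handler, dict) and "handler" in handler:
        return "object-spec"
    # Form 1: a plain callable (function, bound method, closure).
    if callable(handler):
        return "callable"
    # Form 2: an object exposing a method named "on" + hook name.
    if callable(getattr(handler, "on" + hook_name, None)):
        return "object"
    # Anything else would trigger a deprecation warning.
    return "deprecated"


class ExampleHandler:
    """Hypothetical handler object for a hook called PageSaved."""

    def onPageSaved(self):
        return True
```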

Bug: T339167
Depends-On: I980f2d45e6bb8c6a04058e68c758f71bbcf709de
Depends-On: Ieae405f70caa01d84602583cc214b0ee3fadc796
Depends-On: If15df4b598c02ed9bda5eea0ae89a16ebbf4f2e2
Depends-On: Id70276fa1e1821bd400dc0ae5cea722a21d524d5
Change-Id: I83bc81d1b3033c38b9313884a9c70a187fdde227
2023-06-21 11:40:10 +00:00
Daimona Eaytoy
518a5da533 Replace deprecated MWException
Bug: T328220
Change-Id: I0408575ee71e58d1c9e9ebedabab35bd3813f515
2023-06-12 12:27:49 +00:00
Umherirrender
d36073cdcf tests: Make some PHPUnit data providers static
Initially used a new sniff with autofix (T333745),
but some providers are defined non-static in TestBase class
and need more work to make them static in a compatible way

Bug: T332865
Change-Id: I889d33424f0c01fb26f2d86f8d4fc3de3e568843
2023-05-20 01:05:27 +02:00
Volker E
2c1729e4e9 HTML: Remove self-closing XHTML syntax from core
Syntactical leftover with no significance in modern web.

Bug: T309150
Depends-On: I3a029ca950db42b938962b2452ad136ae8ddea6f
Depends-On: Id0557ac19583de36d7226b14a4c06933da47fe97
Depends-On: I17580a72e4a9384d7d774866e610197e950900cb
Change-Id: I4bbfa47fbf6e30fb90d920d6d02cdf6e0b1cdb46
2023-05-03 10:44:41 +02:00
thiemowmde
bee13a2a6d Avoid calling setMwGlobals multiple times
Turns out this method is rather expensive because of the final
resetServices() it does internally. It's much better to call it with
an array.

Change-Id: I7a82951e281512d535ffc5a86929f4441f3ddc4e
2023-05-02 15:48:12 +02:00
Umherirrender
997726c4ee tests: Use array_fill_keys instead of array_combine/array_fill
Change-Id: I3bee4452b182a982b99017beed4ff929e96a10c6
2023-04-29 15:51:03 +02:00
jenkins-bot
65812ee715 Merge "parser: Make all LinkHolderArray properties private" 2023-04-08 22:28:31 +00:00
Aaron Schulz
366a0afd63 parser: improve cache TTL accuracy for CURRENT*/LOCAL* magic words
Consolidate cache TTL handling within CoreMagicVariables.

Make the TTL account for how many seconds away the value is from changing.
For example, CURRENTHOUR should change soon after the next hour is reached.
There is a minimum adjustment TTL to avoid parser-after-save delays.

This allows for longer caching in most cases, as well as more up-to-date
rendering when the hour/day/week/year is about to change. Previously, there
were blind TTLs, which were either way too pessimistic or way too generous.

This commit does not change the CURRENTTIME, CURRENTTIMESTAMP, LOCALTIME,
and LOCALTIMESTAMP words, since there is no reasonable way to cache output
while keeping them up-to-date.
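
The idea for an hour-based word such as CURRENTHOUR can be sketched in
Python as follows. MIN_TTL and ttl_until_next_hour() are assumptions for
illustration; the actual floor and the code in CoreMagicVariables may
differ.

```python
from datetime import datetime, timedelta, timezone

MIN_TTL = 15  # hypothetical floor in seconds


def ttl_until_next_hour(now, min_ttl=MIN_TTL):
    """Seconds until the hour changes, but never less than min_ttl."""
    next_hour = now.replace(minute=0, second=0, microsecond=0) + timedelta(hours=1)
    remaining = int((next_hour - now).total_seconds())
    # The floor avoids near-zero TTLs right before the boundary.
    return max(remaining, min_ttl)


mid_hour = datetime(2023, 3, 28, 22, 35, 17, tzinfo=timezone.utc)
near_boundary = datetime(2023, 3, 28, 22, 59, 58, tzinfo=timezone.utc)
```

Mid-hour renderings get a long TTL; renderings just before the boundary fall
back to the floor.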

Bug: T320668
Change-Id: I9acb42b0d9ff67798a1624cbf9c7cac99c8fbe2f
2023-03-28 22:35:17 +00:00
C. Scott Ananian
cfd9c516e1 Allow setting a ParserOption to generate Parsoid HTML
This is an initial quick-and-dirty implementation.  The
ParsoidParser class will eventually inherit from \Parser,
but this is an initial placeholder to unblock other Parsoid
read views work.

Currently Parsoid does not fully implement all the ParserOutput
metadata set by the legacy parser, but we're working on it.

This patch also addresses T300325 by ensuring that the Page HTML
APIs use ParserOutput::getRawText(), which will return the entire
Parsoid HTML document without post-processing.  This is what
the Parsoid team refers to as "edit mode" HTML. The
ParserOutput::getText() method returns only the <body> contents
of the HTML, and applies several transformations, including
inserting Table of Contents and style deduplication; this is
the "read views" flavor of the Parsoid HTML.

We need to be careful of the interaction of the `useParsoid` flag with
the ParserCacheMetadata.  Effectively `useParsoid` should *always* be
marked as "used" or else the ParserCache will assume its value doesn't
matter and will serve legacy content for parsoid requests and
vice-versa.  T330677 is a follow up to address this more thoroughly by
splitting the parser cache in ParserOutputAccess; the stop gap in this
patch is fragile and, because it doesn't fork the ParserCacheMetadata
cache, may corrupt the ParserCacheMetadata in the case when Parsoid
and the legacy parser consult different sets of options to render a
page.
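
Why the flag must count as "used" can be shown with a toy cache key in
Python. cache_key() and the key format are invented for this sketch and do
not mirror the real ParserCache key layout; the point is only that options
absent from the "used" set do not vary the key.

```python
def cache_key(page_id, options, used_options):
    """Build a key from only those options that were marked as used."""
    parts = [f"{name}={options[name]}" for name in sorted(used_options)]
    return f"page:{page_id}:" + "!".join(parts)


legacy = {"useParsoid": False, "userlang": "en"}
parsoid = {"useParsoid": True, "userlang": "en"}

# Flag not marked as used: both renderings share one key (collision).
collide_a = cache_key(1, legacy, {"userlang"})
collide_b = cache_key(1, parsoid, {"userlang"})

# Flag always marked as used: the keys differ, so content is not mixed up.
split_a = cache_key(1, legacy, {"userlang", "useParsoid"})
split_b = cache_key(1, parsoid, {"userlang", "useParsoid"})
```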

Bug: T300191
Bug: T330677
Bug: T300325
Change-Id: Ica09a4284c00d7917f8b6249e946232b2fb38011
2023-03-26 21:46:05 -04:00
Tim Starling
5e30a927bc tests: Make some PHPUnit data providers static
Just methods where adding "static" to the declaration was enough, I
didn't do anything with providers that used $this.

Initially by search and replace. There were many mistakes which I
found mostly by running the PHPStorm inspection which searches for
$this usage in a static method. Later I used the PHPStorm "make static"
action which avoids the more obvious mistakes.

Bug: T332865
Change-Id: I47ed6692945607dfa5c139d42edbd934fa4f3a36
2023-03-24 02:53:57 +00:00
thiemowmde
4ebc778eb7 parser: Make all LinkHolderArray properties private
I could not find any use outside of core, or even outside of this
class.

The class is instantiated a single time in core:
https://codesearch.wmcloud.org/search/?q=new%5CW%2BLinkHolderArray&files=%5C.php%24
This instance is not used anywhere else:
https://codesearch.wmcloud.org/search/?q=mLinkHolders&files=%5C.php%24

I would argue this doesn't really qualify as a breaking change. This
was always meant to be private.

Change-Id: I4c614dae1fe1d61c9cf8b7a03c37eb93fae33873
2023-03-15 10:44:04 +01:00
jenkins-bot
6de76f1fad Merge "Add ParserOutput::getLanguage()" 2023-03-13 14:18:47 +00:00
jenkins-bot
bd5cccf7c4 Merge "Deprecate ParserOutput::{get,set}TOCHTML()" 2023-03-12 21:41:20 +00:00
libraryupgrader
7375f3a5fe build: Updating mediawiki/mediawiki-codesniffer to 41.0.0
The following sniffs are failing and were disabled:
* MediaWiki.Usage.ForbiddenFunctions.eval

Change-Id: I6fd0a9296c88a77c3abec6e5e8d568bb469c2d6e
2023-03-11 19:04:09 +00:00
C. Scott Ananian
29853113f7 Deprecate ParserOutput::{get,set}TOCHTML()
No uses in deployed code outside mediawiki-core:

 https://codesearch.wmcloud.org/deployed/?q=%5Bgs%5DetTOCHTML%5C%28&i=nope&files=&excludeFiles=&repos=

Bug: T293513
Change-Id: I3fd82150ac581afbeb94f401672702063586fff0
2023-03-10 20:34:33 -05:00
C. Scott Ananian
183a6da420 Add ParserOutput::getLanguage()
Provide a way for backend code to determine the primary language of a
ParserOutput, eg for setting the Content-Language header of an API
response.

This is read-only and backed by extension data at the moment for
transition purposes; if this API sticks we'll graduate it to a
"real" property in the future, with appropriate serialization
to/from JSON (T303329).

Similarly, this patch only includes the most basic code to handle
the various ParserOutput merge cases in
ParserOutput::merge{Internal,Html,Tracking}MetaDataFrom(),
ParserOutput::collectMetadata(), and
OutputPage::addParserOutput{Content,Metadata,Text,}(); mostly
inherited from the fact that the storage is backed by extension
data at the moment.

Generally only the "top-level" parser output gets to set the
primary language; we'll presumably need to ensure that the
language is consistent during merge.

Change-Id: I767daba22805a877d9b806fd77334e508902844b
2023-03-10 18:42:29 -05:00
James D. Forrester
ad06527fb4 Reorg: Namespace the Title class
This is moderately messy.

Process was principally:

* xargs rg --files-with-matches '^use Title;' | grep 'php$' | \
  xargs -P 1 -n 1 sed -i -z 's/use Title;/use MediaWiki\\Title\\Title;/1'
* rg --files-without-match 'MediaWiki\\Title\\Title;' . | grep 'php$' | \
  xargs rg --files-with-matches 'Title\b' | \
  xargs -P 1 -n 1 sed -i -z 's/\nuse /\nuse MediaWiki\\Title\\Title;\nuse /1'
* composer fix

Then manual fix-ups for a few files that don't have any use statements.

Bug: T166010
Follows-Up: Ia5d8cb759dc3bc9e9bbe217d0fb109e2f8c4101a
Change-Id: If8fc9d0d95fc1a114021e282a706fc3e7da3524b
2023-03-02 08:46:53 -05:00
Amir Sarabadani
0f13e81a15 Reorg: Move five page-related classes to page/ out of includes/
These classes:
 - MergeHistory
 - MovePage
 - ProtectionForm
 - BadFileLookup (to MediaWiki\Page\File)
 - FileDeleteForm (to MediaWiki\Page\File)

Bug: T321882
Change-Id: Ibeb488ba322c62a34042a0307bbb5562773bcad1
2023-02-23 17:03:49 +01:00
C. Scott Ananian
d5b39490ca Remove back-compatibility code for ToC marker
Before 1.39 we used <mw:toc> and in 1.39 we switched to <mw:tocplace/>
(commit 24949480eb).  This was changed to a <meta> tag in 1.40 (commits
0b10563895 and fa8646ca7b) and the old content has long since expired
from the ParserCache.  Clean up the old ParserCache transition code.

Change-Id: I3254d0acba31e107b50767797a2b0ad28aba59ee
2023-02-10 00:03:54 -05:00
Amir Sarabadani
c8116223b4 Reorg: Move category-related classes from includes/ to Category/
Bug: T321882
Change-Id: I0b86acfdeaa3a2a0a14b7763fd088122820bafdc
2023-02-09 20:18:54 +01:00
C. Scott Ananian
439656e019 Generate TOC HTML on demand in ParserOutput::getText()
* Rather than computing TOC HTML in Parser and setting it in
  ParserOutput, compute it on demand based on section metadata.

  This will let Parsoid set section metadata in ParserOutput
  and have the TOC generated automatically.

* This required fixing some "bugs" in Linker's generateTOC
  which didn't properly close tags and relied on Tidy to fix
  up unclosed li and ul tags.

* This patch relies on converting section metadata objects to
  array objects, but Linker::generateTOC could be converted to
  use TOC data instead.

* Since TOC generation is now moved to getText(), this is done
  post-PC load and this eliminates the parser cache split on
  user language for TOC heading localization.

Bug: T293513
Change-Id: Ief1bba326d3612b40930440c872a61abadffab10
2023-01-25 16:42:16 -05:00
jenkins-bot
8220c7dce3 Merge "Generate/set/get TOCData/SectionMetadata objects instead of arrays" 2023-01-19 21:36:56 +00:00
Subramanya Sastry
d8d6ecd39f Generate/set/get TOCData/SectionMetadata objects instead of arrays
* ParserOutput::setSections()/::getSections() are expected
  to be deprecated. Uses in extensions and skins will need to be
  migrated in follow up patches once the new interface has stabilized.

* In the skins code, the metadata is converted back to an array.
  Downstream skin TOC consumers will need to be migrated as well
  before we can remove the toLegacy() conversion.

* Fixed SerializationTestTrait's validation method
  - Not sure if this is overkill but should handle all future
    complex objects we might stuff into the ParserCache.

* This patch emits a backward-compatible Sections property in order to
  avoid changing the parser cache serialization format. T327439 has
  been filed to eventually use the JsonCodec support for object
  serialization, but for this initial patch it makes sense to avoid
  the need for a concurrent ParserCache format migration by using a
  backward-compatible serialization.

* TOCData is nullable because the intent is that
  ParserOutput::setTOCData() is MW_MERGE_STRATEGY_WRITE_ONCE; that is,
  only the top-level fragment composing a page will set the TOCData.
  This will be enforced in the future via wfDeprecated() (T327429),
  but again our first patch is as backward-compatible as possible.

Bug: T296025
Depends-On: I1b267d23cf49d147c5379b914531303744481b68
Co-Authored-By: C. Scott Ananian <cananian@wikimedia.org>
Co-Authored-By: Subramanya Sastry <ssastry@wikimedia.org>
Change-Id: I8329864535f0b1dd5f9163868a08d6cb1ffcb78f
2023-01-19 16:18:13 -05:00
C. Scott Ananian
96e4f5d840 JsonCodec: fix en/decoding of nested objects and stdClass objects
Add a type annotation when encoding `stdClass` objects so that we can
be sure to decode them as objects instead of arrays.

This avoids issues such as that seen in the Graph extension (T312589)
where an extension data key is stored as a stdClass.  If ParserOutput
was computed fresh, a subsequent getExtensionData(..) call will return
a stdClass object, but if the ParserOutput was cached, getExtensionData()
would return an array.  After this change the return type is always
consistent.

Properly handle nested objects: encode all object values returned by
JsonSerializable::jsonSerialize() (so that client is not responsible
for implementing this correctly), and decode all object values *before*
calling JsonUnserializable::newFromJsonArray (again, so that the
client is not responsible for decoding its property values).  The new
behavior matches how serialize/unserialize is handled in the 'naive'
JsonUnserializable{Sub,Super}Class test cases; ParserOutput (the only
user of JsonCodec in core) was doing an extra manual decode for
the ExtensionData array in ParserOutput::initFromJson that is no longer
necessary.
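
As an illustration of the two ideas above (a type annotation for plain
objects, and recursive en/decoding of nested values), here is a minimal
Python sketch. It is not the JsonCodec implementation: the `_type_` key,
the function names, and the use of SimpleNamespace as a stand-in for
stdClass are all assumptions made for this example.

```python
import json
from types import SimpleNamespace

# Hypothetical annotation key, chosen for this sketch only.
TYPE_KEY = "_type_"


def encode(value):
    """Recursively convert a value to JSON-safe data, tagging plain objects."""
    if isinstance(value, SimpleNamespace):
        out = {k: encode(v) for k, v in vars(value).items()}
        out[TYPE_KEY] = "object"  # remember that this was an object, not an array
        return out
    if isinstance(value, dict):
        return {k: encode(v) for k, v in value.items()}
    if isinstance(value, list):
        return [encode(v) for v in value]
    return value


def decode(value):
    """Recursively rebuild values; nested values are decoded before wrapping."""
    if isinstance(value, list):
        return [decode(v) for v in value]
    if isinstance(value, dict):
        fields = {k: decode(v) for k, v in value.items() if k != TYPE_KEY}
        if value.get(TYPE_KEY) == "object":
            return SimpleNamespace(**fields)
        return fields
    return value


def round_trip(value):
    """Simulate storing a value in a JSON-backed cache and loading it again."""
    return decode(json.loads(json.dumps(encode(value))))
```

With the annotation in place, a value that started as an object comes
back as an object after the round trip, and a value that started as an
array (dict) comes back as an array, whether or not it went through the
cache.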

The GrowthExperiments and SemanticMediaWiki extensions were working
around the non-recursive nature of JsonCodec; this patch depends on
patches to GrowthExperiments to make it agnostic about whether object
unserialization occurs before or after ::newFromJsonArray() is called,
which can then be further cleaned up once this is released.
A pull request for SemanticMediaWiki has also been submitted.

Bug: T312589
Depends-On: I3413609251f056893d3921df23698aeed40754ed
Change-Id: Id7d0695af40b9801b42a9b82f41e46118da288dc
2023-01-12 14:12:32 -05:00
jenkins-bot
d3ecbc93a3 Merge "parser: Optimize regex patterns used in LinkHolderArray" 2023-01-07 16:21:50 +00:00
thiemowmde
69c5757243 parser: Optimize regex patterns used in LinkHolderArray
Two micro-optimizations are done in this patch:

1. We know exactly how these placeholders are built in the makeHolder()
method. In »<!--IWLINK'" 1-->« it's guaranteed to be a single number
and in »<!--LINK'" 1:2-->« it's two numbers.

The most extreme synthetic micro benchmark I did cuts the runtime of
these regular expressions down to about 25%. It won't make much of a
difference in real-world scenarios but is still worth it, I believe.

It also makes the code more specific and less confusing (see below).

2. We don't need to use the full string »<!--LINK'" 1:2-->« as array
key when the only thing that matters is the part »1:2«. Note the same
is done just a few lines below in the replaceInterwiki() method.
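
The two points above can be sketched in Python as follows. This is an
illustration only, not the LinkHolderArray code: the function name and
the shape of the replacement table are assumptions for this example.

```python
import re

# Specific patterns: exactly one number for interwiki placeholders and
# exactly two numbers (namespace:id) for internal link placeholders.
LINK_RE = re.compile(r"""<!--LINK'" (\d+:\d+)-->""")
IWLINK_RE = re.compile(r"""<!--IWLINK'" (\d+)-->""")


def replace_links(text, replacements):
    """Replace internal link placeholders.

    `replacements` is keyed by the short "ns:id" part only, not by the
    full placeholder string. Unknown placeholders are left untouched.
    """
    return LINK_RE.sub(lambda m: replacements.get(m.group(1), m.group(0)), text)
```

Keying the table by the captured group keeps the lookup small and makes
it obvious which part of the placeholder actually carries information.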

This code does have outstanding test coverage via all the parser tests,
I believe. Any change here that doesn't make a test fail should be safe.

Note the unit tests have been written many years later via I2c12cc7,
using "dummy" strings and such instead of the expected numeric
namespace and link ids. Most of this is already fixed via previous
patches. The last mistake addressed in this patch is that
getPrefixedDBkey() is supposed to be a title. It can't contain one of
these placeholders.

Follow-Up: I2c12cc76a9bf01eb527db3ea038e4adc59446cac
Change-Id: Ie994059092df8861ddb97c098acd082698d45c53
2023-01-07 13:25:33 +00:00
Amir Sarabadani
523ab7cff8 Reorg: Move RawMessage to under language/
To follow Message. This is approved as part of RFC T166010.

Also namespace it but doing it properly with PSR-4 would require
namespacing every class under language/ and that will take some time.

Bug: T321882
Change-Id: I195cf4c67bd51410556c2dd1e33cc9c1033d5d18
2022-12-16 11:30:19 +01:00
Umherirrender
fd516a98e1 Fix whitespaces after comma
Change-Id: Ide6de0a53661e6f650099d7b1f274a02699441df
2022-12-15 01:24:14 +01:00
jenkins-bot
be2ff28b48 Merge "Reorg: Move MagicWord related files to under parser/" 2022-12-11 18:15:48 +00:00
Amir Sarabadani
a1b4699fea Reorg: Move MagicWord related files to under parser/
This is approved as part of T166010 RFC.

Bug: T321882
Change-Id: Ia4498c0a20e38a6a288dc14065ea8242c84fbc49
2022-12-09 13:48:35 +01:00
thiemowmde
800fd1d4c4 Fix bogus nextLinkID in LinkHolderArrayIntegrationTest
Parser::nextLinkID cannot return a string. It returns a positive
integer number.

Note a very similar mistake was already fixed before via I7e71ffc.

Change-Id: Ifce71d0f4db31787bf0eb84e621cfdeb07c674ef
2022-12-09 11:45:09 +01:00
Reedy
0cb2c3c106 Fix casing of class and function name usages
Bug: T253628
Change-Id: I5c64f436d3cf757390b751ce3e34bfc7872bc176
2022-12-04 19:09:30 +00:00
Subramanya Sastry
bcb7009c41 Use real section metadata in tests
* Most of the files were generated from the validate* script.
* Post-processing of these generated files to fix problems:
  - Some of the files were binary-edited via "vi -b" to fix some
    issues with bad property names used in the prior step.
    1.36, 1.38, 1.39 files were all fixed up this way.
  - In addition, the 1.36 file had bad data (not sure if the wrong
    php version was used) but I fixed this by splicing in data
    from the 1.38 file to revert incorrect changes to "Categories"
    and "IndexPolicy" properties.
  - The 1.35 data file was binary edited by splicing data from the
    now 1.36 version.

Change-Id: I4e22b94ce30c2ad9b1f544c15e1c3cd0dd0bce6b
2022-11-23 12:45:27 -05:00
Subramanya Sastry
623625e8f2 Followup to fb747bc0: Fix bad property names
Change-Id: I362b0cf8feca13a91fd91961d400579f2e4ea97e
2022-11-18 16:12:06 -06:00
Subramanya Sastry
fb747bc038 Add section metadata parsercache serialization tests for MW 1.40
* Generate data files for 1.40 only since the new formats only
  showed up in 1.40 and won't be present in the parser cache
  for older MW versions.

Change-Id: I6f297e3091ec2faab7c2203c138800551b01e32a
2022-11-17 15:48:15 -06:00
daniel
118d4980b2 Track the reason for rendering.
Allow the causeAction that triggers page rendering to be looped through
to ParserCache, so we can count what causes writes to the cache.

Change-Id: I6ad8e105a3ce457e3ab4f85cd154f47a32085e0d
2022-11-09 09:38:57 +00:00
daniel
8c1c1ae35a Enable pig-latin variant for testing
Having pig-latin enabled by default in dev environments is convenient
for manual testing. More importantly, it will allow us to write
end-to-end tests for variant conversion.

Depends-On: I9dc2f743ac487b0f7cfb667150c0f6950d5e7fce
Depends-On: I85b66c85be3959d48a048733af17197bc4cf70af
Change-Id: Ia80ad33cbf5e311fa8b84bd765a8df8d156f4c38
2022-11-08 17:45:51 +05:30
Tim Starling
0077c5da15 Use short array destructuring instead of list()
Introduced in PHP 7.1. Because it's shorter and looks nice.

I used regex replacement.

Change-Id: I0555e199d126cd44501f859cb4589f8bd49694da
2022-10-21 15:33:37 +11:00
C. Scott Ananian
d96207ab86 Auto-discover core parser test files
Make parser test discovery in core work the same way as it does in
extensions: any file ending with *.txt under tests/parser is run
as a parser test file.

This search is recursive, which is motivation to also move some
unrelated files under tests/parser/preprocess over to
tests/phpunit/data/preprocess where they belong; they are used
by tests/phpunit/includes/parser/PreprocessorTest.php and are
unrelated to the parser test infrastructure.
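
The discovery rule described above (every *.txt file under a root
directory, searched recursively) can be sketched in Python like this.
The function name is hypothetical and the real test runner is written
in PHP; this only illustrates the rule.

```python
from pathlib import Path


def discover_parser_tests(root):
    """Return every *.txt file below `root`, searched recursively, sorted."""
    return sorted(p for p in Path(root).rglob("*.txt") if p.is_file())
```

Because the search is recursive, any unrelated *.txt file left under
the root would be picked up as a parser test, which is why such files
have to live elsewhere.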

Change-Id: I8c84b4b853e1309929dceb700aab1e79a598d8ab
2022-10-13 10:41:15 -04:00
Jon Robson
d1662dca59 Parser: Use linkAnchor in section definition as well as anchor
The anchor property comes from Sanitizer::escapeIdForAttribute() and
should be used if you want to (eg) look up an element by ID using
document.getElementById(). The linkAnchor property comes from
Sanitizer::escapeIdForLink() and contains additional escaping
appropriate for use in a URL fragment, and should be used (eg) if you
are creating the href attribute of an <a> tag.

Bug: T315222
Change-Id: Icecf9640a62117c2729dca04af343fb1ddaaf8f8
2022-09-14 12:54:36 -04:00
jenkins-bot
61cbd18ff3 Merge "parser: Use a <meta> tag for the internal TOC_PLACEHOLDER" 2022-09-09 21:12:34 +00:00
Subramanya Sastry
c8a944a94b Add support to enable Scribunto & Parsoid to handle nowikis properly
* Lua modules have been written to inspect nowiki strip state markers
  and extract nowiki content to further process them. Callers might have
  used nowikis in arguments for any number of reasons including needing
  to have the argument be treated as raw text instead of wikitext.

  While we might add first-class typing features to wikitext, templates,
  extensions, and the like in the future which would let Parsoid process
  template arguments based on type info (rather than as wikitext always),
  we need a solution now to enable modules to work properly with Parsoid.

* The core issue is the decoupled model used by Parsoid where
  transclusions are preprocessed before further processing. Since
  nowikis cannot be processed and stripped during preprocessing,
  Lua modules don't have access to nowiki strip markers in this model.

* In this patch, we change extension tag processing for nowikis.

  When generating HTML, nowikis are replaced with a 'nowiki' strip
  marker with the nowiki's "innerXML" (only tag contents).

  In this patch, during preprocessing, instead of adding a 'general'
  strip marker with the "outerXML" (tag contents and the tag wrapper),
  we add a 'nowiki' strip marker with its "outerXML".

* Since Parsoid (and any clients using the preprocessed output) will
  unstrip all strip markers, the shift from a general to nowiki
  strip marker won't make a difference.

* To support Scribunto and Lua modules' unstrip usage, this patch adds
  new functionality to StripState to replace the (preprocessing-)nowiki
  strip markers with whatever its users want. So, Scribunto could
  pass in a callback that replaces these with the "innerXML" by
  stripping out the tag wrapper.
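
  A rough Python sketch of that callback idea follows. It is not the
  StripState API: the class, the method names, and the marker string
  used in the usage note are hypothetical and exist only to show how a
  caller-supplied callback can turn the stored "outerXML" into the
  "innerXML".

```python
import re

# Matches a complete <nowiki ...>...</nowiki> wrapper (the "outerXML").
NOWIKI_OUTER = re.compile(
    r"^<nowiki(?:\s[^>]*)?>(.*)</nowiki>$", re.DOTALL | re.IGNORECASE
)


def inner_xml(outer):
    """Strip the tag wrapper; a self-closing <nowiki/> has no contents."""
    match = NOWIKI_OUTER.match(outer)
    return match.group(1) if match else ""


class ToyStripState:
    """Toy stand-in for a strip state holding preprocessing nowiki markers."""

    def __init__(self):
        self.nowiki = {}  # marker string -> stored outerXML

    def add_nowiki(self, marker, outer):
        self.nowiki[marker] = outer

    def replace_nowikis(self, text, callback):
        """Replace each nowiki marker with whatever the callback returns."""
        for marker, outer in self.nowiki.items():
            text = text.replace(marker, callback(outer))
        return text
```

  For example, after `state.add_nowiki("\x7fM0\x7f", "<nowiki>[[raw]]</nowiki>")`,
  calling `state.replace_nowikis("x \x7fM0\x7f y", inner_xml)` yields
  the text with the marker replaced by `[[raw]]`.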

* Hat tip to Tim Starling for recommending this strategy.

* Updated strip state tests.

Bug: T272507
Bug: T299103
Depends-On: Id6ea611549e98893f53094116a3851e9c42b8dc8
Change-Id: Ied0295feab06027a8df885b3215435e596f0353b
2022-09-01 21:04:42 +00:00
Bartosz Dziewoński
f7158c396d Add markup to page titles to distinguish the namespace and the main text
Pages outside of the main namespace now have the following markup in
their <h1> page titles, using 'Talk:Hello' as an example:

<h1>
  <span class="mw-page-title-namespace">Talk</span>
  <span class="mw-page-title-separator">:</span>
  <span class="mw-page-title-main">Hello</span>
</h1>
(line breaks and spaces added for readability)

Pages in the main namespace only have the last part, e.g. for 'Hello':

<h1>
  <span class="mw-page-title-main">Hello</span>
</h1>
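
A minimal Python sketch of how the inner markup above could be
assembled (without the surrounding <h1>). This is an illustration, not
the code of Parser::formatPageTitle(); the function name and signature
are assumptions for this example.

```python
from html import escape


def format_page_title(namespace, main, separator=":"):
    """Build the title spans; `namespace` is empty for the main namespace."""
    main_html = f'<span class="mw-page-title-main">{escape(main)}</span>'
    if not namespace:
        # Main-namespace pages only get the last part.
        return main_html
    return (
        f'<span class="mw-page-title-namespace">{escape(namespace)}</span>'
        f'<span class="mw-page-title-separator">{escape(separator)}</span>'
        + main_html
    )
```

Keeping the separator in its own span is what allows the
language-specific tweaks to the colon mentioned below to be done in CSS
or by a targeted replacement.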

The change is motivated by a desire to style the titles differently on
talk pages in the DiscussionTools extension (T313636), but it could
also be used for other things:
* Language-specific tweaks (e.g. adding typographically-correct spaces
  around the colon separator: T249149, or replacing it with a
  different character: T36295)
* Site-specific tweaks (e.g. de-emphasize or emphasize specific
  namespaces like 'Draft': T62973 / T236215)

The markup is also added to automatically language-converted titles.

It is not added when the title is overridden using the wikitext
`{{DISPLAYTITLE:…}}` or `-{T|…}-` forms. I think this is a small
limitation, as those forms are mostly used in the main namespace, where
the extra markup isn't very helpful anyway. This may be improved in
the future. As a workaround, users could also just add the same HTML
markup to their wikitext (as those forms accept it).

It is also not added when the title is overridden by an extension
like Translate. Maybe we'll have a better API before anyone wants
to do that. If not, one could un-mark Parser::formatPageTitle()
as @internal, and use that method to add the markup themselves.

Bug: T306440
Change-Id: I62b17ef22de3606d736e6c261e542a34b58b5a05
2022-08-16 23:36:21 +00:00
C. Scott Ananian
0b10563895 parser: Use a <meta> tag for the internal TOC_PLACEHOLDER
Split out from the I44045b3b9e78e change.

This is consistent with what Parsoid will use for the TOC marker.

Bug: T287767
Bug: T270199
Bug: T311502
Depends-On: I1f607cf1ef1b61fb4d2e1880de756fb94d5a6b22
Change-Id: Ie63eed07b9bca1bfa07d4c256aba3728cedd8f93
2022-08-16 06:05:17 +00:00
C. Scott Ananian
fa8646ca7b parser: Prepare to use a <meta> tag for the internal TOC_PLACEHOLDER
Split out from the I44045b3b9e78e and Ie63eed07b9bca changes.  We
first add code to handle the new tag as well as the old tag in
ParserCache contents. This will allow us to safely rollback if needed
when deploying the follow-on patch which actually changes the tag
used.

Bug: T287767
Bug: T270199
Bug: T311502
Change-Id: Ib3e5e010b9f5ca2c4ea7c4fe28080170b6a88812
2022-08-15 18:54:52 -04:00
Derick Alangi
5e8cd2c838 Migrate from setMwGlobals() to overrideConfigValue(s)
Change-Id: I3f167d0e7d59a5aa091c3095a7d96c889d6e7e78
2022-08-02 10:14:10 +01:00
Brian Wolff
f79ea41072 parser: Mock WikiPage::getContentModel in ParserCacheTest to fix php8.1
PHP 8.1 doesn't like this returning null.

Bug: T313663
Change-Id: I59eb21301aab946b6362fea956b398337af8d971
2022-07-25 20:51:51 +00:00
Thiemo Kreuz
61ae7504df Replace trivial usage of mock builder with createMock() shortcut
createMock() does the same, but is much easier to read.

A small difference is that some of the replacements made in this
patch didn't use disableOriginalConstructor() before. In case this
was relevant we should see the respective test fail. If not we can
save some CPU cycles and skip these constructors.

Change-Id: Ib98fb06e0fe753b7a53cb087a47e1159515a8ad5
2022-07-15 16:43:48 +00:00
Umherirrender
246bc931f6 tests: Set wgLang with MediaWikiIntegrationTestCase::setUserLang
Change-Id: Ic1247a6719032b3a0ea1f76514edc5ffd5a7854a
2022-07-13 00:59:46 +02:00
Umherirrender
047c184bfe tests: Use Title::makeTitle instead of Title::newFromText
Avoid parsing known titles in tests to improve performance

Change-Id: Ibfccfe696f0b8bfda0b99abae324e60bbecef7d8
2022-07-06 00:44:00 +02:00
Derick Alangi
d01e3ed739 Replace deprecated calls ParserOptions::newCanonical( 'canonical' )
This is a quick find & replace of calls to the deprecated method
ParserOptions::newCanonical() when the context is the string literal
'canonical'. This can be safely replaced by calling newFromAnon().

Change-Id: If7bb68459b11e0c5f5de188f10fdae85ad1a78bf
2022-06-16 14:22:24 +01:00
jenkins-bot
b494330aa7 Merge "ParserCache: always use JSON" 2022-06-07 14:12:29 +00:00