Even though this JSON property is unused on master, the previous
train release read it from the JSON (and threw the value away).
In order to provide error-free roll-forward and roll-back of the
train, temporarily write an empty string as the value of TOCHTML
so that the read from `$jsonData['TOCHTML']` won't cause a PHP
notice in the logs if we roll back.
This patch is only needed for one train release, and can then
be removed.
Bug: T363107
Change-Id: I77add3bd7f00941cb81481f738bc59d6008c2406
Before this method name gets baked forever into the 1.42 release, rename
the ParserOutput::setIndexedPageProperty() and ::setUnindexedPageProperty()
methods to ::setNumericPageProperty() and ::setUnsortedPageProperty() to
try to address some confusion about whether the *presence* of the page
property is still indexed (it is!), in contrast to whether there's an
additional "sort key" associated with the *value* assigned to the page
property.
This naming is compatible with the feature request in T357783 to have
the sort key and property value specified independently. The new
method signature in that case would be:
...setSortedPageProperty( string $name, string $value, int|float $sortKey )
Although PHP 8.0 will throw a TypeError if a non-numeric type is coerced
to numeric using `0 + ...`, use an explicit is_numeric check to obtain
the same behavior in PHP 7.x.
Change-Id: Ia94c192c429d0482c58467bed787fd2e0aca052f
Not *all* ParserOutputs represent parsed articles, and describe the
merging operations on ParserOutputs in more depth. The interaction
with Content and ContentHandlers is also described (thanks, Daniel!).
Followup-To: Id2e3124652315a74869f504056fa8a99ad794350
Change-Id: I5c1016532eba1b71dc4d3d5d5d0c46775713efb5
If a placeholder value is needed, it is recommended to use the empty string
to avoid wasting database space unnecessarily. Operationalize this
recommendation by providing a default value for the method argument.
Bug: T305158
Bug: T350224
Change-Id: I9ea8d93298d771c2d38fdfb451a2817220ca679a
Deprecate non-string values to ::setPageProperty(), which introduce easy
traps for programmers to fall into. Instead if page properties are intended
to be indexed, use the new ::setIndexedPageProperty() instead. Also add
::setUnindexedPageProperty() for symmetry, with a tighter string type on
the value.
Bug: T305158
Bug: T350224
Change-Id: I8a39a7c90341dfee932aa819c9a0a637a8782f69
This ensures uniform treatment of all places that call `addCategory`
without duplicating the `defaultsort` code; it also ensures that the
effect of the {{DEFAULTSORT}} parser function is independent of page
position.
Bug: T40435
Bug: T353530
Change-Id: I4480a6d59e766fa4eddc9ec9117c58b66771bb47
Fixed SkinModuleTest::provideGetFeatureFilePathsOrder as nesting of
arrays for parameters is wrong
Change-Id: I9875008adf62d284c48662ebfbd245d72e5be064
ParserOutput::getText() is not a simple getter, but does
transformations on the "text" of the ParserOutput; the simple getter
is named ::getRawText().
To maintain consistency, rename ParserOutput::setText() to
::setRawText() and the property name ParserOutput::$mText to
::$mRawText so future readers are not confused.
The JSON property name as it appears in the serialized ParserCache
is left as 'Text' so that we don't have any forward- or backward-
rollback issues.
Change-Id: I3ef34814ab9473cc70d0a6806e8c5a4a02b73491
Non-scalar values passed to ParserOutput::setPageProperty() have never
"worked"; they've been stringified (and null has been stored as an empty
string). Emit a warning so we can fail harder in future releases.
Bug: T305158
Depends-On: Ib36787d04c0ca713587dc8b814ca1c5a827f6f72
Change-Id: I38234084fdc7427ca577bb33a7fce1541581188d
String and non-string values behave very differently when passed to
::setPageProperty(), resulting in some unexpected gotchas for the
unaware caller.
Bug: T350224
Bug: T305158
Change-Id: I23b35b250f27a117d1353ea8a26d2b3f77c568e7
We closed T296023 and opened a new task for the work remaining, so
update the comments in the code to match.
The task relating to `addLanguageLink` is actually T296019.
Change-Id: I28b942a57ed41751d44d8565a290d925f6d7f180
This was formerly used by the REST api, but instead that code just
uses ParserOutput::getRawText() when it needs the full HTML document.
This option has been broken, with various passes like RenderDebugInfo
and AddWrapperDiv adding content in inappropriate places if
bodyContentOnly was false.
Change-Id: Ib45f95ded59c81c16d61803f977d1edbfe82b262
This will allow the Translate extension to set this parser option
in the ArticleParserOptions hook, instead of mutating $options passed
to ParserOutput::getText() in the ParserOutputPostCacheTransform hook.
It ought to also help to handle the many places which call:
... = $parserOutput->getText( [
'enableSectionEditLinks' => false,
] );
by allowing them to set the appropriate ParserOption instead
of passing arguments to ::getText().
Bug: T350626
Change-Id: I719c115194059060f7f888608417a194ac80cc92
This class belongs with the rest of the Parsoid output stash code.
This class has been marked @unstable since 1.39 and thus the move
does not need release notes.
Change-Id: I16061c0c28b1549fbe90ea082cc717fee4a09a6e
This avoids confusion with the "render timestamp" held by the cache,
and is consistent with ::get*RevisionId() etc.
The old ::getTimestamp() and ::setTimestamp() methods have been
deprecated.
Change-Id: Idb5e687709c98086c5d3075d31885c58a0723197
Set the render ID for each parse stored into cache so that we are able
to identify a specific parse when there are dependencies (for example
in an edit based on that parse). This is recorded as a property added
to the ParserOutput, not the parent CacheTime interface. Even though
the render ID is /related/ to the CacheTime interface, CacheTime is
also used directly as a parser cache key, and the UUID should not be
part of the lookup key.
In general we are trying to move the location where these cache
properties are set as early as possible, so we check at each location
to ensure we don't overwrite a previously-set value. Eventually we
can convert most of these checks into assertions that the cache
properties have already been set (T350538). The primary location for
setting cache properties is the ContentRenderer.
Moved setting the revision timestamp into ContentRenderer as well, as
it was set along the same code paths. An extra parameter was added to
ContentRenderer::getParserOutput() to support this.
Added merge code to ParserOutput::mergeInternalMetaDataFrom() which
should ensure that cache time, revision, timestamp, and render id are
all set properly when multiple slots are combined together in MCR.
In order to ensure the render ID is set on all codepaths we needed to
plumb the GlobalIdGenerator service into ContentRenderer, ParserCache,
ParserCacheFactory, and RevisionOutputCache. Eventually (T350538) it
should only be necessary in the ContentRenderer.
Bug: T350538
Bug: T349868
Followup-To: Ic9b7cc0fcf365e772b7d080d76a065e3fd585f80
Change-Id: I72c5e6f86b7f081ab5ce7a56f5365d2f75067a78
When we are merging a ParserOutput into a ContentMetadataCollector,
convert categories to LinkTarget, which is the preferred parameter
type of CMC::addCategory().
This also reverts the temporary fix in
I0715f4fbc870e401e5759dd7c7a3c19077c40a6a.
Note that the category names *should* be in dbkey form for proper
deduplication, but both TitleValue:tryNew() and
CategoryLinksTable::setParserOutput() will renormalize if needed
(see I2b08edd90666e0fa4eafe91444a58806909b02d6 / T328477).
Depends-On: Iea894aa2cee90f4ca5c7688493b0654e4605ce23
Change-Id: I5a903396edb4da0900ecef37cb3bf4bd03b5ba68
Due to a botched signature change on the Parsoid side, in -a8 Parsoid
only accepts `string|int` for ContentMetadataCollector::addCategory()
and in -a9 Parsoid only accept `LinkTarget`. The ParserOutput in
core, of course, accepts both. So move the code which merges
categories into the section of ContentMetadataCollector::collectMetadata()
where we know that the CMC we're merging with is really a ParserOutput.
Change-Id: I0715f4fbc870e401e5759dd7c7a3c19077c40a6a
OutputPage::getParserOutputText/addParserOutputContent expects
ParserOutput to be mutated (e.g. by
PostCacheTransformHookRunner). Hence, cloning it before running the
pipeline is breaking DiscussionTools, probably among others.
Suppress the clone for the case where the output pipeline is invoked
from ParserOutput::getText() (which is a deprecated method anyway) and
additionally suppress the side-effects to ParserOutput::$mText on that
code path.
Bug: T353257
Co-Authored-By: C. Scott Ananian <cananian@wikimedia.org>
Co-Authored-By: Isabelle Hurbain-Palatin <ihurbainpalatin@wikimedia.org>
Change-Id: I85c690fd37b781cb27c21970467639e852113b2a
Broadened the argument type to allow passing LinkTarget to:
* ParserOutput::addCategory()
* ParserOutput::addLanguageLink()
* ParserOutput::addLink()
* ParserOutput::addImage()
* ParserOutput::addTemplate()
This allows for a tighter interface with Parsoid's
ContentMetadataCollector class and avoids errors caused by passing the
wrong form of string title ("text" with spaces versus "dbkey" with
underscores).
There are a few performance problems remaining after this patch, which
only apply to use by Parsoid (not the legacy parser):
1. ::addLink() does inefficient db requests to fetch the page id for
each link if the optional $id parameter is not passed. These lookups
should be deferred and a LinkBatch used. (The legacy parser always
passes $id.)
2. ::addTemplate() similarly requires $page_id (and $rev_id) to be
passed, so is not currently usable by Parsoid.
3. ::addLanguageLink() uses Title::getFullText() which is not present
in LinkTarget and is currently implemented as a full Title lookup.
This is not an issue for the legacy parser, because it already has a
Title object so the lookup is a no-op, but could be improved for
Parsoid's use.
Bug: T296023
Change-Id: If21ec8563c8a619bdde7c0cb6534bb9009480a21
Pages that are fast to render can be omitted from the parser cache
to preserve disk space and cache write operations.
The threshold is configurable per namespace, so the tradeoff can
be evaluated based on different access patterns. For example, pages
that are accessed rarely, like file description pages on commons,
may have a high threshold configured, while pages that are read
frequently, like wikipedia articles, may be configured to be always
cached, using a 0 threshold.
Filtering is based on a time profile recorded in the ParserOutput.
A generic mechanism for capturing the timing profile is implemented
in the ContentHandler base class. Subclasses may implement a more
rigorous capture mechanism.
Bug: T346765
Change-Id: I38a6f3ef064f98f3ad6a7c60856b0248a94fe9ac
Instead of waiting until ParserOutput::mergeList() is called to uniquify
the list of modules, use an array<string,true> to ensure that the
modules and module styles are a set.
Change-Id: I49673bc369dec373bce23fe7b831e6be5a256c46
== Skin::wrapHTML ==
Skin::wrapHTML no longer has to perform any guessing of the
ParserOutput language. Nor does it have to special wiki pages vs
special pages in this regard. Yay, code removal.
== ImagePage ==
On URLs like /wiki/File:Example.jpg, the main output handler is
ImagePage::view. This calls the parent Article::view to handle most of
its output. Article::view obtains the ParserOptions, and then fetches
ParserOutput, and then adds `<div class=mw-parser-output>` and its
metadata to OutputPage.
Before this change, ImagePage::view was creating a wrapper based
on "predicting" what language the ParserOutput will contain. It
couldn't call the new OutputPage::getContentLanguage or some
equivalent as Article::view wouldn't have populated that yet.
This leaky abstraction is fixed by this change as now the `<div>`
from ParserOutput no longer comes with a "please wrap it properly"
contract that Article subclasses couldn't possibly implement correctly
(it coudln't wrap it after the fact because Article::view writes to
OutputPage directly).
RECENT (T310445):
A special case was recently added for file pages about translated SVGs.
For those, we decide which language to use for the "fullMedia" thumb
atop the page. This was recently changed as part of T310445 from a
hardcoded $wgLanguageCode (site content lang) to new problematic
Title::getPageViewLanguage, which tries to guestimate the page
language of the rendered ParserOutput and then gets the preferred
variant for the current user. The motivation for this was to support
language variants but used Title::getPageViewLanguage as a kitchen
sink to achieve that minor side-effect. The only part of this
now-deprecated method that we actually need is
LanguageConverter::getPreferredVariant().
Test plan: Covered by ImagePageTest.
== Skin mainpage-title ==
RECENT (T331095, T298715):
A special case was added to Skin::getTemplateData that powers the
mainpage-title interface message feature. This is empty by default,
but when created via MediaWiki:mainpage-title allows interface admins
to replace the H1 with a custom and localised page heading.
A few months ago, in Ifc9f0a7174, Title::getPageViewLanguage was
applied here to support language variants. Replace with the same
fix as for ImagePage. Revert back to Message::inContentLanguage()
but refactor to inLanguage() via MediaWikiServices::getContentLanguage
so that LanguageConverter::getPreferredVariant can be applied.
== EditPage ==
This was doing similar "predicting" of the ParserOutput language to
create an empty preview placeholder for use by preview.js. Now that
ApiParse (via ParserOutput::getText) returns a usable element without
any secret "you magically know the right class, lang, and dir" contract,
this placeholder is no longer needed.
Test Plan:
* EditPage: Default preview
1. index.php?title=Main_Page&action=edit
2. Show preview
3. Assert <div class="mw-content-ltr mw-parser-output" lang=en dir=ltr>
* EditPage: JS preview
1. Preferences > Editing > Show preview without reload
2. index.php?title=Main_Page&action=edit
3. Show preview
4. Assert <div class="mw-content-ltr mw-parser-output" lang=en dir=ltr>
5. Type something and 'Show preview' again
6. Assert old element gone, new text is shown, and new element
attributes are the same as the above.
== McrUndoAction ==
Same as EditPage basically, but without the JS preview use case.
== DifferenceEngine ==
Test:
1. Open /w/index.php?title=Main_Page&diff=0
(this shows the latest diff, can do manually by viewing
/wiki/Main_Page, click "View history", click "Compare selected revisions")
2. Assert <div class="mw-content-ltr mw-parser-output" lang=en dir=ltr>
3. Open /w/index.php?title=Main_Page&diff=0&action=render
4. Assert <div class="mw-content-ltr mw-parser-output" lang=en dir=ltr>
== Special:ExpandTemplates ==
Test:
1. /wiki/Special:ExpandTemplates
2. Write "Hello".
3. "OK"
4. Assert <div class="mw-content-ltr mw-parser-output" lang=en dir=ltr>
Bug: T341244
Depends-On: Icd9c079f5896ee83d86b9c2699636dc81d25a14c
Depends-On: I4e7484b3b94f1cb6062e7cef9f20626b650bb4b1
Depends-On: I90b88f3b3a3bbeba4f48d118f92f54864997e105
Change-Id: Ib130a055e46764544af0f1a46d2bc2b3a7ee85b7
This also introduces the ephemeral field "$mTransformedText" to store
the result of transformation in ParserOutput.
This is a first step before the transformation uses HtmlHolder as input
and output.
Bug: T348253
Change-Id: I312f3748ebfb0373ee3542ba0abdeefe7db1d488