For language links, when there are conflicts between namespaces and
interwiki prefixes, it is important to use TitleValue for language links
rather than to try to reparse the Title. Language links also preserve
fragments, unlike other link types in ParserOutput; added tests to
document this.
Added handling for interwiki links and template links.
Bug: T363538
Change-Id: I6e8ff8ed7f8819000cc3f80e49c0739b568217a4
Add doc-typehints to class properties found by the PropertyDocumentation
sniff to improve the documentation.
Once the sniff is enabled, it will prevent new code from omitting type
declarations. This is focused on documentation and does not change code.
Change-Id: I3afaba387663320187c49ff1cdb2ff3ae01681ad
This is the fourth patch of a series of patches to remove
ParserOutput::getText() calls from core. This series of patches should
be functionally equivalent to I2b4bcddb234f10fd8592570cb0496adf3271328e.
Here we replace calls to getText where a ContentRenderer is available
close by with a temporary ParserOutput::runOutputPipeline that will
eventually be replaced by a call to (probably) ContentRenderer
(T371004). Doing this work in stages allows us to separate the work of
"bringing ParserOptions to the call site" from the work of "bringing
ContentRenderer(ish) to the call site", since both need to be done
to make ParserOutput a value object (T293512).
Change-Id: Ib4f9357293dc230df6e0ca2379a1e2a4cc1b91b7
Bug: T293512
This is the third patch of a series of patches to remove
ParserOutput::getText() calls from core. This series of patches should
be functionally equivalent to I2b4bcddb234f10fd8592570cb0496adf3271328e.
Here we temporarily introduce runOutputPipeline in ParserOutput. It
creates and runs the pipeline with default options, and is called by
getText. (This is not entirely truthful because we go through a
runPipelineInternal transient method for null-argument-passing reasons,
but let's not over-complicate this commit message.)
getText is responsible for maintaining the current behaviour,
that is, "disallow the cloning of the ParserOutput and put the text back
as it was" to mitigate T353257. As we get rid of getText, this
behaviour should be moved, if necessary, to the call site.
The new method is currently added to ParserOutput so that further
refactorings are, for the moment, simpler. It will eventually be moved
to another place within the Content framework.
We also rename 'suppressClone' to 'allowClone' (which is actually its
negation) to avoid multiple levels of negations that make the code
confusing. Note that the default value of 'allowClone' is true, and is
currently overridden in two places: getText and
OutputPage::getParserOutputText (which calls the pipeline directly and
not through ParserOutput).
Bug: T293512
Bug: T371022
Change-Id: Ibf04af1079aaa1934dc78685b00e636ff4d38a9a
ParserOutput::setPageProperty() was deprecated for use with non-string
values in 1.42 but there are still callers out there; handle these cases
without the implicit cast to string which ::setUnsortedPageProperty()
would do via its argument type hint.
Bug: T374046
Bug: T373920
Followup-To: I68c28b0d5d23decc058a46c55e767a83c80452f8
Followup-To: I9a235ae828c2cadc9d2c619760f759e51ba73874
Change-Id: I52ecce78fcee8b18cf9d7ea848946f29e2d8b51b
The ParserOutput::collectMetadata() method is used to transfer parsing
metadata from the legacy parser (ParserOutput) to Parsoid
(ContentMetadataCollector). Several new methods were added to Parsoid's
ContentMetadataCollector class but weren't being transferred from
the ParserOutput.
Change-Id: If2b933005c1ebd0f8b33884242a1c97b94f97a2b
It is difficult to distinguish this method from OutputPage::addJsConfigVars()
in code search:
https://codesearch.wmcloud.org/search/?q=%5BOo%5Dut%28put%29%3F%28%5C%28%5C%29%29%3F-%3EaddJsConfigVars%5C%28
We generally try to replace $output with $parserOutput or $pOutput
as we touch code, to improve the ability of codesearch to dig up
deprecated ParserOutput methods.
A future project will unify those parts of OutputPage which duplicate
ParserOutput: T301020.
Bug: T300307
Bug: T305161
Depends-On: I39ae7d7a40190eedaa024097a6442cd02b6a02e7
Depends-On: I2c660972b289bbad730ceee1325d70d5ba75d27e
Change-Id: I53c28ee7c80b889c893c1d00f37678e716e55783
Versions are changed in 8e940c4f21,
but that makes the version wrong
Follow-Up: I7f85d931d3b79da23e87b4e5692b2e14be8fcaa0
Change-Id: Iae43725b8e0fffc4d44bf57f6227334b41290bd9
MessageValue and friends are pure value objects and newable, so
it makes sense for them to be (de)serializable too. There are some
places where we want to serialize messages, such as in ParserOutput.
The structure of the resulting JSON is inspired by the way we
represent Message objects as plain values elsewhere in MediaWiki,
e.g. StatusValue::getStatusArray().
Co-Authored-By: C. Scott Ananian <cscott@cscott.net>
Depends-On: Ia32f95a6bdf342262b4ef044140527f0676402b9
Depends-On: I7bafe80cd36c2558517f474871148286350a4e76
Change-Id: Id47d58b5e26707fa0e0dbdd37418c0d54c8dd503
This is to make it clearer that they're related to converting serialized
content back into JSON, rather than stating that things are not
representable in JSON.
Change-Id: Ic440ac2d05b5ac238a1c0e4821d3f2d858bc3d76
The serialization test cases look for files based on the name of the
class they are testing. After the namespacing of ParserOutput, they
were looking for files named like:
1.42-MediaWiki\Parser\ParserOutput-binaryPageProperties.json
The embedded backslashes in these filenames would raise havoc on Windows
machines. What's more, none of the existing ParserOutput tests will
actually be checked anymore because the filenames don't match up
with what is expected after namespacing.
Fix this by stripping the namespace from the classname when forming
the test file names.
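A minimal sketch of the filename fix, written in Python rather than the
PHP of the actual test suite. The function names and parameters are
hypothetical; only the "version-Class-testCase.json" shape is taken from
the example above.

```python
def short_class_name(fqcn: str) -> str:
    """Return the class name without its namespace ('\\' is the PHP separator)."""
    return fqcn.rsplit("\\", 1)[-1]


def fixture_file_name(version: str, fqcn: str, case: str, ext: str = "json") -> str:
    """Build a test file name that contains no backslashes."""
    return f"{version}-{short_class_name(fqcn)}-{case}.{ext}"


name = fixture_file_name(
    "1.42", "MediaWiki\\Parser\\ParserOutput", "binaryPageProperties"
)
print(name)  # 1.42-ParserOutput-binaryPageProperties.json
```

A class name without any namespace passes through unchanged, so
fixtures written before the namespacing keep matching.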
When this is done, the test cases for GhostFieldAccess begin running
again, revealing that they were broken when GhostFieldTestClass was
re-namespaced. Add a class alias for the GhostFieldTestClass to fix
this.
Finally, PHP <= 8.1 does not deserialize private properties correctly
after a class is renamed and aliased, because the internal name of the
private property contains the "old" class name in the serialization.
Add a new ::restoreAliasedGhostField() method to the
GhostFieldAccessTrait to work around this issue and restore proper
deserialization of ParserOutput.
Bug: T365060
Followup-To: I9c64a631b0b4e8e4fef8a72ee0f749d35f918052
Followup-To: I4c2cbb0a808b3881a4d6ca489eee5d8c8ebf26cf
Change-Id: I7bafe80cd36c2558517f474871148286350a4e76
'string|int|float|bool' (in any order) can be replaced by 'scalar'.
'string|int|float|bool|null' (likewise) can be replaced by '?scalar'.
This is convenient for functions that can accept any primitive value,
which comes up sometimes when serializing things as SQL, JSON etc.
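The substitution rule can be sketched as follows. This is an
illustration in Python, not the tooling touched by this change;
`simplify_union` is a hypothetical helper that only shows the
order-independent matching.

```python
SCALAR_MEMBERS = frozenset({"string", "int", "float", "bool"})


def simplify_union(type_hint: str) -> str:
    """Collapse a doc type union to 'scalar' or '?scalar' on an exact match."""
    members = frozenset(part.strip() for part in type_hint.split("|"))
    if members == SCALAR_MEMBERS:
        return "scalar"
    if members == SCALAR_MEMBERS | {"null"}:
        return "?scalar"
    # Anything else (a subset, or a union with extra types) is left alone.
    return type_hint
```

Comparing sets rather than strings is what makes the rule independent
of the order in which the union members are written.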
Change-Id: I4a711ee59611d76d6745f3640e4aa6bebec02918
This is a non-default option that will add a <div> wrapper around
section contents to allow client-side collapsing. This is intended
for use by MobileFrontEnd, but could eventually be enabled for
desktop read views as well.
Since this parser option is in the "cache-varying options" set, any
caller who sets this option will fork the cache for that page, which
is reasonable as the parser option sets a ParserOutput property.
In the future our caching strategy will get smarter and we'll add
code which avoids the cache split and just transfers the appropriate
values from ParserOptions to ParserOutput flags after the cached
output is retrieved.
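As a rough illustration of why setting the option forks the cache, here
is a sketch in Python under simplified assumptions: the key format and
the option name `wrapSections` are hypothetical, and the real parser
cache derives its key differently.

```python
def cache_key(page_id: int, options: dict, defaults: dict) -> str:
    """Build a key that includes every option set to a non-default value."""
    varying = sorted((k, v) for k, v in options.items() if defaults.get(k) != v)
    suffix = "!".join(f"{k}={v}" for k, v in varying)
    return f"page:{page_id}:{suffix}" if suffix else f"page:{page_id}"


defaults = {"wrapSections": False}
canonical = cache_key(1, {"wrapSections": False}, defaults)
forked = cache_key(1, {"wrapSections": True}, defaults)
```

Callers that leave the option at its default share the canonical
entry; a caller that enables it gets a separate entry for the same
page.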
Bug: T359001
Change-Id: Ie93959a056ed15a728404eb293e4bb6eeaeb15c0
Even though this JSON property is unused on master, the previous
train release read it from the JSON (and threw the value away).
In order to provide error-free roll-forward and roll-back of the
train, temporarily write an empty string as the value of TOCHTML
so that the read from `$jsonData['TOCHTML']` won't cause a PHP
notice in the logs if we roll back.
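The compatibility concern can be sketched like this, in Python instead
of PHP. The function names are hypothetical, and a missing key raises
KeyError here where PHP would only log a notice.

```python
import json


def to_json(raw_text: str) -> str:
    """New-release writer: TOCHTML is unused but still written, as ''."""
    return json.dumps({"Text": raw_text, "TOCHTML": ""})


def old_release_reader(serialized: str) -> str:
    """Previous-release reader: reads TOCHTML unconditionally, then drops it."""
    data = json.loads(serialized)
    _unused = data["TOCHTML"]  # fails here if the key is missing
    return data["Text"]
```

Because the writer keeps emitting the key, entries cached by the new
release remain readable by the previous one after a roll-back.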
This patch is only needed for one train release, and can then
be removed.
Bug: T363107
Change-Id: I77add3bd7f00941cb81481f738bc59d6008c2406
Before these method names get baked forever into the 1.42 release, rename
the ParserOutput::setIndexedPageProperty() and ::setUnindexedPageProperty()
methods to ::setNumericPageProperty() and ::setUnsortedPageProperty() to
try to address some confusion about whether the *presence* of the page
property is still indexed (it is!), in contrast to whether there's an
additional "sort key" associated with the *value* assigned to the page
property.
This naming is compatible with the feature request in T357783 to have
the sort key and property value specified independently. The new
method signature in that case would be:
...setSortedPageProperty( string $name, string $value, int|float $sortKey )
Although PHP 8.0 will throw a TypeError if a non-numeric type is coerced
to numeric using `0 + ...`, use an explicit is_numeric check to obtain
the same behavior in PHP 7.x.
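The explicit check can be sketched as below. This is a Python
approximation: `is_numeric` here is only a rough stand-in for the PHP
function of the same name (edge cases differ), and
`set_numeric_page_property` is a hypothetical free function, not the
real method.

```python
def is_numeric(value) -> bool:
    """Rough stand-in for PHP's is_numeric(): numbers and numeric strings."""
    if isinstance(value, bool):
        return False
    if isinstance(value, (int, float)):
        return True
    if isinstance(value, str):
        try:
            float(value)
        except ValueError:
            return False
        return True
    return False


def set_numeric_page_property(props: dict, name: str, value) -> None:
    """Reject non-numeric values up front instead of relying on coercion."""
    if not is_numeric(value):
        raise TypeError(f"page property {name!r} needs a numeric value")
    if isinstance(value, str):
        value = float(value)
    props[name] = value
```

Checking first means the failure no longer depends on how the runtime
handles arithmetic on a non-numeric operand.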
Change-Id: Ia94c192c429d0482c58467bed787fd2e0aca052f
Clarify that not *all* ParserOutputs represent parsed articles, and
describe the merging operations on ParserOutputs in more depth. The interaction
with Content and ContentHandlers is also described (thanks, Daniel!).
Followup-To: Id2e3124652315a74869f504056fa8a99ad794350
Change-Id: I5c1016532eba1b71dc4d3d5d5d0c46775713efb5
If a placeholder value is needed, it is recommended to use the empty string
to avoid wasting database space unnecessarily. Operationalize this
recommendation by providing a default value for the method argument.
Bug: T305158
Bug: T350224
Change-Id: I9ea8d93298d771c2d38fdfb451a2817220ca679a
Deprecate non-string values to ::setPageProperty(), which introduce easy
traps for programmers to fall into. If page properties are intended
to be indexed, use the new ::setIndexedPageProperty() instead. Also add
::setUnindexedPageProperty() for symmetry, with a tighter string type on
the value.
Bug: T305158
Bug: T350224
Change-Id: I8a39a7c90341dfee932aa819c9a0a637a8782f69
This ensures uniform treatment of all places that call `addCategory`
without duplicating the `defaultsort` code; it also ensures that the
effect of the {{DEFAULTSORT}} parser function is independent of page
position.
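One way to get position independence is to resolve the default sort key
only when the categories are read back, not when they are added. The
Python sketch below assumes that approach; the class and method names
are hypothetical, and falling back to the page title when no default
sort is set is an assumption of the sketch.

```python
class CategorySketch:
    def __init__(self) -> None:
        self._categories = {}  # category name -> explicit sort key, or None
        self._default_sort = None

    def add_category(self, name: str, sort_key=None) -> None:
        self._categories[name] = sort_key

    def set_default_sort(self, key: str) -> None:
        self._default_sort = key

    def category_sort_keys(self, page_title: str) -> dict:
        """Fill in missing sort keys at read time, after the whole page is parsed."""
        fallback = self._default_sort if self._default_sort is not None else page_title
        return {
            name: (key if key is not None else fallback)
            for name, key in self._categories.items()
        }
```

A category added before the default sort is set ends up with the same
key as one added after it, and every caller of `add_category` shares
the one fallback rule.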
Bug: T40435
Bug: T353530
Change-Id: I4480a6d59e766fa4eddc9ec9117c58b66771bb47
Fixed SkinModuleTest::provideGetFeatureFilePathsOrder, as the nesting of
arrays for parameters was wrong.
Change-Id: I9875008adf62d284c48662ebfbd245d72e5be064
ParserOutput::getText() is not a simple getter, but does
transformations on the "text" of the ParserOutput; the simple getter
is named ::getRawText().
To maintain consistency, rename ParserOutput::setText() to
::setRawText() and the property name ParserOutput::$mText to
::$mRawText so future readers are not confused.
The JSON property name as it appears in the serialized ParserCache
is left as 'Text' so that we don't have any forward- or backward-
rollback issues.
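The split between the in-code name and the serialized name can be
sketched as follows, in Python. `RawTextHolder` is a hypothetical
stand-in; only the 'Text' key comes from this change.

```python
import json
from dataclasses import dataclass


@dataclass
class RawTextHolder:
    raw_text: str  # renamed in code; the serialized key is unchanged

    def to_json(self) -> str:
        # Keep the old key so entries written before and after the rename
        # are readable by either version of the code.
        return json.dumps({"Text": self.raw_text})

    @classmethod
    def from_json(cls, serialized: str) -> "RawTextHolder":
        return cls(raw_text=json.loads(serialized)["Text"])
```

Only the property and accessor names change for readers of the code;
the bytes stored in the cache stay the same.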
Change-Id: I3ef34814ab9473cc70d0a6806e8c5a4a02b73491
Non-scalar values passed to ParserOutput::setPageProperty() have never
"worked"; they've been stringified (and null has been stored as an empty
string). Emit a warning so we can fail harder in future releases.
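A sketch of the warn-but-keep-working behaviour, in Python.
`set_page_property` is a hypothetical free function, and Python's
str() is only an analogy for the PHP string cast.

```python
import warnings


def set_page_property(props: dict, name: str, value) -> None:
    """Store the value; warn on non-scalars but keep the old stringification."""
    if not isinstance(value, (str, int, float, bool)):
        warnings.warn(
            f"non-scalar value for page property {name!r}",
            DeprecationWarning,
            stacklevel=2,
        )
        # None becomes the empty string; everything else is stringified.
        value = "" if value is None else str(value)
    props[name] = value
```

Existing callers keep their current results while the warning makes
them visible, so the check can become a hard failure later.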
Bug: T305158
Depends-On: Ib36787d04c0ca713587dc8b814ca1c5a827f6f72
Change-Id: I38234084fdc7427ca577bb33a7fce1541581188d
String and non-string values behave very differently when passed to
::setPageProperty(), resulting in some unexpected gotchas for the
unaware caller.
Bug: T350224
Bug: T305158
Change-Id: I23b35b250f27a117d1353ea8a26d2b3f77c568e7
We closed T296023 and opened a new task for the work remaining, so
update the comments in the code to match.
The task relating to `addLanguageLink` is actually T296019.
Change-Id: I28b942a57ed41751d44d8565a290d925f6d7f180
This was formerly used by the REST API, but that code now just
uses ParserOutput::getRawText() when it needs the full HTML document.
This option has been broken, with various passes like RenderDebugInfo
and AddWrapperDiv adding content in inappropriate places if
bodyContentOnly was false.
Change-Id: Ib45f95ded59c81c16d61803f977d1edbfe82b262
This will allow the Translate extension to set this parser option
in the ArticleParserOptions hook, instead of mutating $options passed
to ParserOutput::getText() in the ParserOutputPostCacheTransform hook.
It ought to also help to handle the many places which call:
... = $parserOutput->getText( [
'enableSectionEditLinks' => false,
] );
by allowing them to set the appropriate ParserOption instead
of passing arguments to ::getText().
Bug: T350626
Change-Id: I719c115194059060f7f888608417a194ac80cc92
This class belongs with the rest of the Parsoid output stash code.
This class has been marked @unstable since 1.39 and thus the move
does not need release notes.
Change-Id: I16061c0c28b1549fbe90ea082cc717fee4a09a6e
This avoids confusion with the "render timestamp" held by the cache,
and is consistent with ::get*RevisionId() etc.
The old ::getTimestamp() and ::setTimestamp() methods have been
deprecated.
Change-Id: Idb5e687709c98086c5d3075d31885c58a0723197
Set the render ID for each parse stored into cache so that we are able
to identify a specific parse when there are dependencies (for example
in an edit based on that parse). This is recorded as a property added
to the ParserOutput, not the parent CacheTime interface. Even though
the render ID is /related/ to the CacheTime interface, CacheTime is
also used directly as a parser cache key, and the UUID should not be
part of the lookup key.
In general we are trying to move the location where these cache
properties are set as early as possible, so we check at each location
to ensure we don't overwrite a previously-set value. Eventually we
can convert most of these checks into assertions that the cache
properties have already been set (T350538). The primary location for
setting cache properties is the ContentRenderer.
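The two rules above (the render ID stays out of the lookup key, and a
previously-set value is never overwritten) can be sketched in Python.
All names here are hypothetical, and `uuid.uuid4()` stands in for the
GlobalIdGenerator service.

```python
import uuid
from dataclasses import dataclass
from typing import Optional


@dataclass
class CachedOutputSketch:
    page_id: int
    revision_id: int
    render_id: Optional[str] = None

    def lookup_key(self) -> str:
        # The render ID identifies one particular parse, so it is
        # deliberately left out of the key used to find the entry.
        return f"{self.page_id}:{self.revision_id}"


def ensure_render_id(output: CachedOutputSketch) -> None:
    """Assign a render ID only if an earlier code path has not set one."""
    if output.render_id is None:
        output.render_id = str(uuid.uuid4())
```

Calling `ensure_render_id` at several locations is harmless: the first
caller wins, which is what would let the later checks become
assertions.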
Moved setting the revision timestamp into ContentRenderer as well, as
it was set along the same code paths. An extra parameter was added to
ContentRenderer::getParserOutput() to support this.
Added merge code to ParserOutput::mergeInternalMetaDataFrom() which
should ensure that cache time, revision, timestamp, and render id are
all set properly when multiple slots are combined together in MCR.
In order to ensure the render ID is set on all codepaths we needed to
plumb the GlobalIdGenerator service into ContentRenderer, ParserCache,
ParserCacheFactory, and RevisionOutputCache. Eventually (T350538) it
should only be necessary in the ContentRenderer.
Bug: T350538
Bug: T349868
Followup-To: Ic9b7cc0fcf365e772b7d080d76a065e3fd585f80
Change-Id: I72c5e6f86b7f081ab5ce7a56f5365d2f75067a78