Previously:
* It was unclear that generate-html is an optional optimization
* Most of MediaWiki core was doing $parserOutput->setText('') if
html wasn't generated. However this is wrong and will cause
$parserOutput->hasText() to return true and also potentially cause
cache pollution if a content handler both does that and supports
parser cache (like MassMessage; see T299896)
* The default value of mText in the constructor was '', and most
of the time MW used that default. This doesn't seem right. If
setText() is never called, the ParserOutput should not be considered
to have text
* It was impossible to set mText to null, as $parserOutput->setText(null)
was a no-op. Docs implied you were supposed to do this, so it was very
confusing.
This patch clarifies docs, changes the default value for ParserOutput::$mText
from '' to null, and makes $parserOutput->setText(null) do what you
expect it to. The last two are arguably breaking changes, although
the previous behaviours were unexpected, mostly undocumented and,
based on a code search, do not appear to be relied on.
It seems like the main reason this only broke MassMessage is that most
content handlers either don't support generateHtml or don't
support parser cache.
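The intended semantics can be sketched as follows. This is a minimal
stand-in, not the ParserOutput code itself: the class name
TextHolderSketch is hypothetical and it models only the text field.

```php
<?php
// Hypothetical stand-in for ParserOutput; only the text field is modelled.
final class TextHolderSketch {
	/** @var string|null null means "no text was ever set" */
	private $text = null;

	public function setText( ?string $text ): void {
		// null is stored as-is instead of being silently ignored
		$this->text = $text;
	}

	public function hasText(): bool {
		return $this->text !== null;
	}
}

$fresh = new TextHolderSketch();
var_dump( $fresh->hasText() ); // bool(false)

$empty = new TextHolderSketch();
$empty->setText( '' );
var_dump( $empty->hasText() ); // bool(true): '' still counts as text

$empty->setText( null );
var_dump( $empty->hasText() ); // bool(false)
```

Under this model, a content handler that skips HTML generation should
leave the text unset (or set it to null) rather than set it to ''.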
Bug: T306591
Change-Id: I49cdf21411c6b02ac9a221a13393bebe17c7871e
Depends-On: I68ad491735b2df13951399312a4f9c37b63a08fa
Part 1, proof of concept. Hundreds of files left to go. These changes
brought to you in large part by vim macros.
Bug: T305805
Change-Id: I44789091e9f6394c800a11b29f22528c8dcacf71
As part of the project of enforcing uniform semantics for
combining ParserOutput objects (T300979) use standard boolean flags
for the 'index' and 'noindex' index policy metadata.
The forward-compatibility "1.39_wmf.7-ParserCache-*" serialization
test cases have been renamed to "1.39-ParserCache-*" in this commit;
backward compatibility with the prior representation of index policy
will continue to be tested via the "1.38-ParserCache-*" cases.
Bug: T300979
Change-Id: I683e5ae054a0425b03c60a4af8c845b576414c1d
Instead of ParserOutput::$mIndexPolicy, a future MW version
(I683e5ae054a0425b03c60a4af8c845b576414c1d) will use two boolean fields
ParserOutput::$mIndexSet and ::$mNoIndexSet. For parser cache migration
purposes, ensure that core can deserialize the new version so that
rollbacks are safe.
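A reader that tolerates both shapes might look like the sketch below.
The array keys ('IndexPolicy', 'IndexSet', 'NoIndexSet') and the
function name are hypothetical, and the rule that 'noindex' wins when
both flags are set is an assumption, not something stated here.

```php
<?php
// Hypothetical array keys; the real serialized field names may differ.
function readIndexPolicySketch( array $data ): string {
	// New shape: two independent booleans.
	if ( isset( $data['NoIndexSet'] ) || isset( $data['IndexSet'] ) ) {
		// Assumption: 'noindex' takes precedence when both are set.
		if ( !empty( $data['NoIndexSet'] ) ) {
			return 'noindex';
		}
		return !empty( $data['IndexSet'] ) ? 'index' : '';
	}
	// Old shape: a single string, '' when no policy was set.
	return $data['IndexPolicy'] ?? '';
}

var_dump( readIndexPolicySketch( [ 'IndexPolicy' => 'noindex' ] ) );
// string(7) "noindex"
var_dump( readIndexPolicySketch( [ 'IndexSet' => true, 'NoIndexSet' => false ] ) );
// string(5) "index"
var_dump( readIndexPolicySketch( [] ) );
// string(0) ""
```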
Add serialization test cases with the new boolean fields as
"1.39_wmf.7-ParserOutput-*"; compatibility with the existing
"mIndexPolicy" serialization will continue to be tested with the
"1.38-ParserOutput-*" cases.
Change-Id: I5e4fc68cea18b31ecb028b3867537dcbd86b93cd
A number of extensions assume that the ParserAfterParse hook is called
exactly once per top-level page, and use that hook to "finalize"
various write-once properties in ParserOutput, including jsconfigvars.
Unfortunately, ParserAfterParse can't be supported properly in Parsoid
(T303630), which results in legacy extensions using this hook
overwriting extensiondata and jsconfigvars multiple times.
Instead of throwing an exception, restore the previous Parsoid
behavior where the last write wins. This doesn't fix the root cause,
but at least it doesn't regress. Eventually we'll have to deprecate
the ParserAfterParse hook, and when we do so we can add deprecation
warnings to these code paths in ParserOutput::collectMetadata() as
well and eventually remove them.
Nothing uses ParserOutput::collectMetadata() except Parsoid at the
moment.
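The difference between the two behaviours can be sketched as below.
ConfigVarSinkSketch is a hypothetical class used only to contrast a
strict write-once policy with last-write-wins; it is not the code in
ParserOutput::collectMetadata().

```php
<?php
// Hypothetical sink contrasting "throw on conflict" with "last write wins".
final class ConfigVarSinkSketch {
	/** @var array<string,mixed> */
	private $vars = [];
	/** @var bool */
	private $strict;

	public function __construct( bool $strict ) {
		$this->strict = $strict;
	}

	public function set( string $name, $value ): void {
		if ( $this->strict
			&& array_key_exists( $name, $this->vars )
			&& $this->vars[$name] !== $value
		) {
			throw new LogicException( "Conflicting write to $name" );
		}
		// Non-strict mode: the most recent write replaces the earlier one.
		$this->vars[$name] = $value;
	}

	public function get( string $name ) {
		return $this->vars[$name] ?? null;
	}
}

// A hook that fires twice writes the same key twice.
$lenient = new ConfigVarSinkSketch( false );
$lenient->set( 'wgFoo', 'first' );
$lenient->set( 'wgFoo', 'second' );
var_dump( $lenient->get( 'wgFoo' ) ); // string(6) "second"

$strict = new ConfigVarSinkSketch( true );
$strict->set( 'wgFoo', 'first' );
try {
	$strict->set( 'wgFoo', 'second' );
	$threw = false;
} catch ( LogicException $e ) {
	$threw = true;
}
var_dump( $threw ); // bool(true)
```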
Bug: T303014
Bug: T303015
Change-Id: I9d1f0f6bab1305552a0350667d6142a24bc04049
This allows Parsoid to properly merge jsconfigvars via the external API
(ie, when Parsoid is run in 'standalone mode') when an extension uses
the new-in-1.38 ParserOutput::appendJsConfigVar() method.
Change-Id: I974d9ecfb4ca8b22361d25c4c70fc5e55c39d5ed
This patch exports the necessary information from the Parser into the
ParserOutput to ensure that the Table of Contents can be properly
language-converted: both ensuring that the target language is correct
(in cases where it differs from the content language) and that various
conversion-suppression mechanisms are functional. When the
ParserCache does not (yet) have the new properties from Parser, the
behavior is unchanged from before (the content language is used, and
its "preferred variant").
This is a follow up to the "quick fix" deployed in
Ic14b3a49a8ee7ed600485d4f8a363a206035a847 to fix an UBN regression.
Parser tests have also been added to verify that ToC conversion
is correctly done (T299973).
Task T303329 has been opened to (eventually) rename the
'core:target-lang' and 'core:target-lang-variant' properties added to
the ParserOutput in this patch.
Bug: T303235
Bug: T295187
Bug: T299973
Followup-To: Ic14b3a49a8ee7ed600485d4f8a363a206035a847
Followup-To: Ib273f88531c340b561072ee9f616aa60725091e6
Change-Id: Ie0f1d7b6daffc8ff47228f6f086a257518f72717
This reverts commit 0fdd607a84.
This attempt to fix T295187 caused other issues (T303235). A proper
fix is in Ie0f1d7b6daffc8ff47228f6f086a257518f72717.
Bug: T303235
Change-Id: Ib273f88531c340b561072ee9f616aa60725091e6
Content language of specific pages can be changed manually or by the Translate extension.
Bug: T295187
Change-Id: I714711201ba71a2234d625c2e71505973655f36e
This has core implement an abstract interface defined by Parsoid in order
to allow Parsoid to record metadata in ParserOutput without introducing
a cyclic dependency.
Bug: T287216
Followup-To: Ia02c6774c87b13d1ae5a8ed1e55cdd8c88c19b9e
Depends-On: Ie0e358a4910c1946eb4added76318fcacf9308df
Change-Id: I15c0e81185b9957fe097c82e6609a200742ee7d1
The old ParserOutput::getProperty() method returned `false` when a property
was missing. This requires callers to use the `?:` syntax to supply default
values, which then causes any falsey value to be treated as missing.
So, for example, setting the defaultsort to '0' will cause the default
sort to be ignored.
Modern PHP convention is to use `null` for missing values, and the `??`
syntax is a better/more restrictive alternative to `?:`.
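The '0' case can be reproduced in plain PHP. The two lookup functions
below are hypothetical stand-ins for the old and new return
conventions; they do not call into MediaWiki.

```php
<?php
// Hypothetical lookups standing in for the old and new return conventions.
function lookupOld( array $props, string $name ) {
	return $props[$name] ?? false; // old convention: false when missing
}

function lookupNew( array $props, string $name ) {
	return $props[$name] ?? null; // new convention: null when missing
}

$props = [ 'defaultsort' => '0' ];

// '0' is falsy, so ?: discards it and falls through to the default.
$old = lookupOld( $props, 'defaultsort' ) ?: 'fallback';
// ?? only falls through on null, so '0' survives.
$new = lookupNew( $props, 'defaultsort' ) ?? 'fallback';
// A genuinely missing property still gets the default.
$missing = lookupNew( $props, 'displaytitle' ) ?? 'fallback';

var_dump( $old );     // string(8) "fallback"
var_dump( $new );     // string(1) "0"
var_dump( $missing ); // string(8) "fallback"
```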
We renamed `ParserOutput::getProperty()` to `::getPageProperty()` in
1.38 (Ie963eea5aa0f0e984ced7c4dfa0fd65d57313cfa/T287216) but kept the
return value convention. Before this actually makes it into a 1.38
release, take the opportunity to fix the return value for the new
`ParserOutput::getPageProperty()` method to return `null` when the
property is missing.
We need to do some temporary workarounds to the places we'd
already swapped over to use the new `::getPageProperty()` method
to allow them to handle either `false` or `null` as a return value;
we'll clean that up once this is merged.
Code search:
https://codesearch.wmcloud.org/deployed/?q=-%3EgetPageProperty%5C%28|T301915&i=nope&files=&excludeFiles=&repos=
Bug: T301915
Depends-On: I3f11ce604970e47b41fc1c123792df8c3045626f
Depends-On: Ie7533f49fe4cad01ebfda29760d23c61e9867b10
Depends-On: Ic5c09f5caa4c897bc553c614fbae9cee159566a2
Depends-On: I0278b2eafd90e77e4fee41c45a1165fb79ddf47e
Depends-On: I383abb6b7dc5e96c0061af13957609f6e31a1065
Depends-On: I79f9f4078e415284af29b15047bafd1c823d7f5b
Depends-On: I02276c48c49f5d2d241a69eb0a6cdf439b572d8b
Depends-On: I71628661b4539a4e35ae32846e719f92bcf782e0
Depends-On: I7e215cb43de0ce150a6bcc00f92481dcdcfed383
Change-Id: Iaa25c390118d2db2b6578cdd558f2defd5351d15
ContentMetadataCollector is a write-only interface defined by Parsoid
that performs the metadata collection functions of ParserOutput. In
order to support asynchronous and out-of-order parses,
ContentMetadataCollector is write-only and merges of fragments are
defined to be independent of merge order.
This provides an initial implementation of ParserOutput::collectMetadata()
which transfers metadata from a ParserOutput to a ContentMetadataCollector.
It is intended that the flags and accumulators in ParserOutput will be
(incrementally) made more regular so that ::collectMetadata() grows
simpler over time.
An optional $strategy argument is added to ::appendExtensionData() and
::appendJsConfigVar() to allow future expansion of merge strategies,
although only `union` is supported for the moment.
The MW_MERGE_STRATEGY_UNION constant will be upstreamed into Parsoid's
ContentMetadataCollector class as MERGE_STRATEGY_UNION; we've added a
prefix to ParserOutput's copy for now to avoid a conflict with the
constant which Parsoid will define.
Bug: T300979
Change-Id: I4e20b84eb590296fb3c011bb4d658d7a65082a11
Just added the low-hanging fruit: the methods where the return type was
obvious from local inspection.
Change-Id: If6aabfc8f0dacb156167745808fd5c57cdb3eb23
As the @note says, only types which can be array keys are currently supported
as values. Fix the phan @param to match.
(In the future we might use a different data representation which would
allow richer types, but we're starting simple.)
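One plausible reading of the array-key restriction is that appended
values are kept as the keys of a set, which makes a `union` merge
independent of order. The sketch below assumes that representation;
UnionAccumulatorSketch and its method names are hypothetical.

```php
<?php
// Hypothetical accumulator; not the ParserOutput implementation.
final class UnionAccumulatorSketch {
	/** @var array<string,array<int|string,true>> values stored as keys */
	private $data = [];

	/** @param int|string $value Must be usable as an array key */
	public function append( string $key, $value ): void {
		$this->data[$key][$value] = true;
	}

	public function mergeFrom( self $other ): void {
		foreach ( $other->data as $key => $set ) {
			foreach ( $set as $value => $unused ) {
				$this->data[$key][$value] = true;
			}
		}
	}

	/** @return array Sorted, so the result does not depend on insertion order */
	public function get( string $key ): array {
		$values = array_keys( $this->data[$key] ?? [] );
		sort( $values );
		return $values;
	}
}

$a = new UnionAccumulatorSketch();
$a->append( 'k', 'x' );
$a->append( 'k', 'y' );

$b = new UnionAccumulatorSketch();
$b->append( 'k', 'y' );
$b->append( 'k', 'z' );

// Merge in both orders; the results should be identical.
$ab = clone $a;
$ab->mergeFrom( $b );
$ba = clone $b;
$ba->mergeFrom( $a );

var_dump( $ab->get( 'k' ) === $ba->get( 'k' ) ); // bool(true)
```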
Change-Id: I141dc5381a8b260a3a99553b7855a1fd01b0170f
Soft-deprecate the use of ::setExtensionData() to destructively update
the value stored under a single key. Add the new
::appendExtensionData() method to use where multiple values are
desired. This accommodates the asynchronous and incremental parsing
goals on the Parsoid roadmap.
Bug: T300981
Change-Id: I2dea4ba71ea506428854a9983c1abd906b2efd5f
Deprecate ParserOutput::addJsConfigVars() and add setter methods which
better ensure that the ParserOutput contents are independent of parse
order. This accommodates the asynchronous and incremental parsing goals
on the Parsoid roadmap.
Bug: T300307
Change-Id: I4f08d1098da211f7bf5c43c08c620de224cbf37f
We always implicitly converted a string argument to an array anyway; just
ask the caller to do this instead so that we can have a simpler and
more straightforward method signature which matches the plural form
of the method name.
Part of the ParserOutput API cleanup / Parsoid unification discussed
in T287216.
In a number of places we also rename $out to $parserOutput, to make it
easier for codesearch (and human readers) to distinguish between
ParserOutput and OutputPage methods.
Code search:
https://codesearch.wmcloud.org/deployed/?q=p%28arser%29%3F%28Out%7Cout%29%28put%29%3F-%3EaddModule%28Style%29%3Fs%5C%28&i=nope&files=&excludeFiles=&repos=
https://codesearch.wmcloud.org/deployed/?q=arser-%3EgetOutput%5C%28%5C%29-%3EaddModule%28Style%29%3Fs%5C%28&i=nope&files=&excludeFiles=&repos=
Bug: T296123
Depends-On: Iedea960bd450474966eb60ff8dfbf31c127025b6
Depends-On: I7900c5746a9ea75ce4918ffd97d45128038ab3f0
Depends-On: If29dc1d696b3a4c249fa9b150cedf2a502796ea1
Depends-On: I8f1bc7233a00382123a9b1b0bb549bd4dbc4a095
Depends-On: I52dda72aee6c7784a8961488c437863e31affc17
Depends-On: Ia1dcc86cb64f6aa39c68403d37bd76f970e55b97
Depends-On: Ib89ef9c900514d50173e13ab49d17c312b729900
Depends-On: If54244a0278d532c8553029c487c916068e1300f
Depends-On: I8d9b34f5d1ed5b1534bb29f5cd6edcdc086b71ca
Depends-On: I068f9f8e85e88a5c457d40e6a92f09b7eddd6b81
Depends-On: Iced2fc7b4f3cda5296532f22d233875bbc2f5d1b
Depends-On: If14866f76703aa62d33e197bb18a5eacde7a55c0
Depends-On: I9b7fe5acee73c3a378153c0820b46816164ebf21
Depends-On: I95858c08bce0d90709ac7771a910f73d78cc8be4
Depends-On: If9a70e8f8545d4f9ee3b605ad849dbd7de742fc1
Depends-On: I982c81e1ad73b58a90649648e19501cf9172d493
Depends-On: I53a8fd22b22c93bba703233b62377c49ba9f5562
Depends-On: Ic532bca4348b17882716fcb2ca8656a04766c095
Depends-On: If34330acf97d2c4e357b693b086264a718738fb1
Change-Id: Ie4d6bbe258cc483d5693f7a27dbccb60d8f37e2c
The hasDynamicContent() method was renamed in 1.37 to ::hasReducedExpiry() and only has a
single use in core (and no uses outside of core, as far as I can
tell). Rename the single use and deprecate the old name.
Code search:
https://codesearch.wmcloud.org/search/?q=hasDynamicContent&i=nope
Change-Id: Ie2bea78e31433a01a5590becc06f32294b04522e
This reverts commit 2bcb3fe567.
Reason for revert: this is a good change,
just needed more work to not break CI
Change-Id: I23768bee242e3cf81b1493a740cf070e7ad1e224
This does not move the actual limit report data into
ParserOptions yet; that should be done separately
given that it will require serialization changes.
Let's get this change settled first before messing
with serialization.
This unifies canonical and non-canonical ParserOptions,
so ParserCache can now be used with both. It is hard
to say how this will affect the ParserCache capacity,
so we should monitor it after releasing this.
Change-Id: I154c0a77a5b0287b5572614d56339fb57ac56c33
We moved the ToC insertion from the parser to ParserOutput::getText()
in T287767 but forgot to ensure that the ToC contents are properly
language converted -- this happens *after* the call to
ParserOutput::setTOCHTML() in the old Parser code.
This is a quick and dirty fix, which does the language conversion
but probably misses a few corner cases of the original behavior
(marked by XXX comment). For example, it doesn't disable language
conversion on interface messages -- but there shouldn't be any
ToC on interface messages. Not heeding __NOCONTENTCONVERT__
in the article is a legit problem, but probably not as bad as
the UBN regression we're fixing. We'll clean this up in
a followup (T295209), but it will involve passing some additional
information from the Parser to ParserOutput which won't be
present in "old" parser cache entries anyway.
This is an UBN and this patch is the quickest way to ensure that
existing parser cache content renders correctly. It's
preferable to the alternative
(Iffcff96fd9b4749794ac78414c1801979a652792) which handles all the
corner cases but can't fix up existing parser cache content,
which has "always" been stored without language conversion.
Bug: T295187
Change-Id: Ic14b3a49a8ee7ed600485d4f8a363a206035a847
* Do not store table of contents in parser output
* Instead inject table of contents via strpos where needed
inside Article based on Skin "toc" option
* Use <mw:tocplace> as a TOC placeholder; for Parsoid compatibility
this will be replaced with a <meta> tag in a followup patch.
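The strpos-based injection can be sketched roughly as follows. The
exact placeholder spelling and the helper name injectTocSketch() are
assumptions; only the <mw:tocplace> tag name comes from this change.

```php
<?php
// Assumed placeholder spelling; only the tag name is given above.
const TOC_PLACEHOLDER_SKETCH = '<mw:tocplace></mw:tocplace>';

/**
 * Hypothetical helper: splice $tocHtml in at the first placeholder.
 * Passing '' removes the placeholder (e.g. when the skin shows no ToC).
 */
function injectTocSketch( string $html, string $tocHtml ): string {
	$pos = strpos( $html, TOC_PLACEHOLDER_SKETCH );
	if ( $pos === false ) {
		return $html; // no placeholder, nothing to do
	}
	return substr_replace(
		$html, $tocHtml, $pos, strlen( TOC_PLACEHOLDER_SKETCH )
	);
}

$body = '<p>Intro</p><mw:tocplace></mw:tocplace><h2>One</h2>';

echo injectTocSketch( $body, '<div id="toc">...</div>' ), "\n";
// <p>Intro</p><div id="toc">...</div><h2>One</h2>

echo injectTocSketch( $body, '' ), "\n";
// <p>Intro</p><h2>One</h2>
```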
Bug: T287767
Change-Id: I44045b3b9e78e7ab793da3f37e3c0dbc91cd7d39
Encourage localization and factor out common code by taking a message
key as the first argument to ::addWarningMsg() instead of a wikitext
string. This also plays nicer with Parsoid by separating out the
localization code from the parse.
Bug: T293515
Change-Id: I6a7c04c67ac586ab00d4edcbb3d09485a7794e23
This is a uniform mechanism to access a number of bespoke boolean
flags in ParserOutput. It allows extensibility in core (by adding new
field names to ParserOutputFlags) without exposing new getter/setter
methods to Parsoid. It replaces the ParserOutput::{get,set}Flag()
interface which (a) doesn't allow access to certain flags, and (b) is
typically called with a string rather than a constant, and (c) has a
very generic name. (Note that Parser::setOutputFlag() already called
these "output flags".)
In the future we might unify the representation so that we store
everything in $mFlags and don't have explicit properties in
ParserOutput, but those representation details should be invisible to
the clients of this API. (We might also use a proper enumeration
for ParserOutputFlags, when PHP supports this.)
There is some overlap with ParserOutput::{get,set}ExtensionData(), but
I've left those methods as-is because (a) they allow for non-boolean
data, unlike the *Flag() methods, and (b) it seems worthwhile to
distinguish properties set by extensions from properties used by core.
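The shape of such an interface might look like the sketch below. The
class names, the two flag constants and the method names are
illustrative assumptions; the real flag list lives in
ParserOutputFlags.

```php
<?php
// Hypothetical flag names; a constants class stands in for an enumeration.
final class OutputFlagsSketch {
	public const NO_GALLERY = 'no-gallery';
	public const SHOW_TOC = 'show-toc';
}

// Hypothetical holder; callers never see how a flag is stored.
final class FlagHolderSketch {
	/** @var array<string,true> */
	private $flags = [];

	public function setOutputFlag( string $name, bool $value = true ): void {
		if ( $value ) {
			$this->flags[$name] = true;
		} else {
			unset( $this->flags[$name] );
		}
	}

	public function getOutputFlag( string $name ): bool {
		return isset( $this->flags[$name] );
	}
}

$holder = new FlagHolderSketch();
$holder->setOutputFlag( OutputFlagsSketch::SHOW_TOC );

var_dump( $holder->getOutputFlag( OutputFlagsSketch::SHOW_TOC ) );   // bool(true)
var_dump( $holder->getOutputFlag( OutputFlagsSketch::NO_GALLERY ) ); // bool(false)
```

Using constants instead of bare strings means a misspelled flag name
fails loudly instead of silently reading as unset.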
Code search:
https://codesearch.wmcloud.org/search/?q=%5BOo%5Dut%28put%29%3F%28%5C%28%5C%29%29%3F-%3E%28g%7Cs%29etFlag%5C%28&i=nope&files=&excludeFiles=&repos=
Bug: T292868
Change-Id: I39bc58d207836df6f328c54be9e3330719cebbeb
The getAllFlags() function, added in 1.34, does not appear to be used outside core:
https://codesearch.wmcloud.org/search/?q=-%3EgetAllFlags\%28\%29&i=nope&files=&excludeFiles=&repos=
It currently exposes representation details of ParserOutput (that is,
whether a boolean flag is stored in an explicit property, in the
$mFlags array, or in the $mExtensionData array) and so it would be
best not to expose it outside core so as to facilitate any future
change in the internal representation of ParserOutput.
Bug: T292868
Change-Id: I7b6d309425ff01dc211334b848068d0b9c0f9261