This avoids a type mismatch found by phan strict checks
(Ifc7fc64ac26a756f181b7d0155f13a6500114f5e) -- the passed timestamp
from Parser was a string in unix format (ie, an integer as a string)
but was declared as an integer. It was then passed to
MWTimestamp::getInstance() which expected a string.
However, the 'simple' fix for this issue still caused unnecessary
conversions to/from timestamp format. We took the string (nominally
in TS_MW format), ran a regexp against it to convert it to an
MWTimestamp instance, then converted that MWTimestamp to UNIX format
and exported that as a string, just to take that string and run four
different regexps against it *again* to convert it back to an
MWTimestamp instance so we can format it.
Better to just pass the MWTimestamp directly. Only two wrinkles:
1. the ParserGetVariableValueTs hook expects to be passed a string
in TS_UNIX format and then to be able to mutate it. Nothing in production
uses that hook, so only do this conversion if the hook is registered.
2. Parsoid would like to use the definitions in CoreMagicVariables
in the future as well. So pass the timestamp as the not-MW-@internal
ConvertibleTimestamp class instead of directly as a MWTimestamp.
Change-Id: Ib2c5fa45630c54c2716897370a0580ed48d27242
This patch exports the necessary information from the Parser into the
ParserOutput to ensure that the Table of Contents can be properly
language-converted: both ensuring that the target language is correct
(in cases where it differs from the content language) and that various
conversion-suppression mechanisms are functional. When the
ParserCache does not (yet) have the new properties from Parser, the
behavior is unchanged from before (the content language is used, and
its "preferred variant").
This is a follow up to the "quick fix" deployed in
Ic14b3a49a8ee7ed600485d4f8a363a206035a847 to fix an UBN regression.
Parser tests have also been added to verify that ToC conversion
is correctly done (T299973).
Task T303329 has been opened to (eventually) rename the
'core:target-lang' and 'core:target-lang-variant' properties added to
the ParserOutput in this patch.
Bug: T303235
Bug: T295187
Bug: T299973
Followup-To: Ic14b3a49a8ee7ed600485d4f8a363a206035a847
Followup-To: Ib273f88531c340b561072ee9f616aa60725091e6
Change-Id: Ie0f1d7b6daffc8ff47228f6f086a257518f72717
The functions returning null or the class property is set explict null.
Some function should not accept null or return null.
Found by phan strict checks
Change-Id: Ie50f23249282cdb18caa332f562a3945a58d86ff
In modern mediawiki these methods are just wrappers for the 'defaultsort'
page property, and don't need a parser property of their own.
Change-Id: I18bdffd4d6565733fb52cbff409cc25d49a76b65
php internal functions like floor/round/ceil documented to return
float, most cases the result is used as int, added casts
Found by phan strict checks
Change-Id: I92daeb0f7be8a0566fd9258f66ed3aced9a7b792
Rename Sanitizer::removeHTMLtags() into an @internal method named
::internalRemoveHtmlTags() so that we can deprecate external use.
Code search:
https://codesearch.wmcloud.org/deployed/?q=removeHTMLtags&i=nope&files=&excludeFiles=&repos=
Followup-To: Ic864c01471c292f11799c4fbdac4d7d30b8bc50f
Depends-On: Iaca83ed06e9c61d8366579cd2283cba653c82319
Depends-On: I1963bfe9a99198ea02ca482a5769467ce806cd58
Depends-On: I83923d8b38d33f3638cd53958dd10f257ec21f7c
Depends-On: I018b34bb5f6e113056da9b04cc72d4318422adce
Change-Id: I202826f8b27519f7be89643e24eda47a6e3fc9f6
* Some functions accept only string, cast ints and floats to string
* After preg_matches or explode() casts numbers to int to do maths
* Cast unix timestamps to int to do maths
* Cast return values from timestamp format function to int
* Cast bitwise operator to bool when needed as bool
* php internal functions like floor/round/ceil documented to return
float, most cases the result is used as int, added casts
Found by phan strict checks
Change-Id: Icb2de32107f43817acc45fe296fb77acf65c1786
This makes the call to getImageLinkMTOParams consistent between thumbs
and not thumbs by now passing the parser in the former. The result
being that the nofollow relationship is now added for external thumb
links as well.
Linker::getImageLinkMTOParams sets the parser-extlink-target based on
the same getExternalLinkTarget call. This also makes it clearer when
parser-extlink-rel is modified by the target in getExternalLinkAttribs.
custom-target-link does have a higher precedence but it looks like this
was added in 1a4957e as a fallback to respect the global config so the
change seems fine.
A few parserTests cover this with the wgExternalLinkTarget config but a
case for thumbs with external links was added.
The phan annotations look like a false positive similar to
https://github.com/phan/phan/issues/3645
from the use of `$params[$type][$paramName] = $value;`
Change-Id: Icf887b13d046b0f610b1984d641f248d1dec5226
Don't catch and discard exceptions from the RequestTimeout library,
except when the exception is properly handled and the code seems to be
trying to wrap things up.
In most cases the exception is rethrown. Ideally it should instead be
done by narrowing the catch, and this was feasible in a few cases. But
sometimes the exception being caught is an instance of the base class
(notably DateTime::__construct()). Often Exception is the root of the
hierarchy of exceptions being thrown and so is the obvious catch-all.
Notes on specific callers:
* In the case of ResourceLoader::respond(), exceptions were caught for API
correctness, but processing continued. I added an outer try block for
timeout handling so that termination would be more prompt.
* In LCStoreCDB the Exception being caught was Cdb\Exception not
\Exception. I added an alias to avoid confusion.
* In ImageGallery I added a special exception class.
* In Message::__toString() the rationale for catching disappears
in PHP 7.4.0+, so I added a PHP version check.
* In PoolCounterRedis, let the shutdown function do its thing, but
rethrow the exception for logging.
Change-Id: I4c3770b9efc76a1ce42ed9f59329c36de04d657c
Loops ServiceOptions through to CoreParserFunctions and CoreTagHooks to
avoid access to the main config from static methods.
Bug: T294739
Change-Id: Ia6c97f2d0952964c2ad6189f8053ad127589b37c
In PHP 8.1 the default $flags argument to htmlspecialchars() has changed
from ENT_COMPAT to ENT_QUOTES | ENT_SUBSTITUTE | ENT_HTML401. This
breaks some tests.
I changed all the calls that break unit tests, and some others
based on a quick code review. A lot of callers just use the default for
convenience, and were already over-quoting, so the default should still
be good enough for them.
Change-Id: Ie9fbeae6f0417c6cf29dceaf429243a135f9fecb
Use the value from corresponding services,
for consistency if services are injected from outside of service wiring.
Change-Id: Ib0f6af20df8dbc0deae71023e5493524d43ce211
PageImages is expensively loading and reparsing the lead section during
LinksUpdate because the parser's image link hooks do not give enough
context to tell whether the image is in the lead section.
So, add a hook which allows PageImages to add a marker to image links.
The marker can be associated with the section number in ParserAfterTidy.
Bug: T296895
Bug: T176520
Change-Id: I24528381e8d24ca8d138bceadb9397c83fd31356
This method was renamed in 1.37 to ::hasReducedExpiry() and only has a
single use in core (and no uses outside of core, as far as I can
tell). Rename the single use and deprecate the old name.
Code search:
https://codesearch.wmcloud.org/search/?q=hasDynamicContent&i=nope
Change-Id: Ie2bea78e31433a01a5590becc06f32294b04522e
This reverts commit 2bcb3fe567.
Reason for revert: this is a good change,
just needed more work to not break CI
Change-Id: I23768bee242e3cf81b1493a740cf070e7ad1e224
This does not move the actual limit report data into
ParserOptions yet, that should be done separately
given that it will require serialization changes.
Let's get this change settled first before messing
with serialization.
This unifies canonical and non-canonical ParserOptions,
so ParserCache can now be used with both. It is hard
to say how this will affect the ParserCache capacity,
so we should monitor it after releasing this.
Change-Id: I154c0a77a5b0287b5572614d56339fb57ac56c33
* Do not store table of contents in parser output
* Instead inject table of contents via strpos where needed
inside Article based on Skin "toc" option
* Use <mw:tocplace> as a TOC placeholder; for Parsoid compatibility
this will be replaced with a <meta> tag in a followup patch.
Bug: T287767
Change-Id: I44045b3b9e78e7ab793da3f37e3c0dbc91cd7d39
Encourage localization and factor out common code by taking a message
key as the first argument to ::addWarningMsg() instead of a wikitext
string. This also plays nicer with Parsoid by separating out the
localization code from the parse.
Bug: T293515
Change-Id: I6a7c04c67ac586ab00d4edcbb3d09485a7794e23
This is a uniform mechanism to access a number of bespoke boolean
flags in ParserOutput. It allows extensibility in core (by adding new
field names to ParserOutputFlags) without exposing new getter/setter
methods to Parsoid. It replaces the ParserOutput::{get,set}Flag()
interface which (a) doesn't allow access to certain flags, and (b) is
typically called with a string rather than a constant, and (c) has a
very generic name. (Note that Parser::setOutputFlag() already called
these "output flags".)
In the future we might unify the representation so that we store
everything in $mFlags and don't have explicit properties in
ParserOutput, but those representation details should be invisible to
the clients of this API. (We might also use a proper enumeration
for ParserOutputFlags, when PHP supports this.)
There is some overlap with ParserOutput::{get,set}ExtensionData(), but
I've left those methods as-is because (a) they allow for non-boolean
data, unlike the *Flag() methods, and (b) it seems worthwhile to
distingush properties set by extensions from properties used by core.
Code search:
https://codesearch.wmcloud.org/search/?q=%5BOo%5Dut%28put%29%3F%28%5C%28%5C%29%29%3F-%3E%28g%7Cs%29etFlag%5C%28&i=nope&files=&excludeFiles=&repos=
Bug: T292868
Change-Id: I39bc58d207836df6f328c54be9e3330719cebbeb
This moves the implementation of ParserOutput::addTrackingCategory()
to the TrackingCategories class as a non-static method. This makes
invocation from ParserOutput awkward, but when invoking as
Parser::addTrackingCategory() all the necessary services are
available. As a result, we've also soft-deprecated
ParserOutput::addTrackingCategory(); new users should use the
TrackingCategories::addTrackingCategory() method, or else
Parser::addTrackingCategory() if the parser object is available.
The Parser class is already kind of bloated as it is (alas), but there
aren't too many callsites which invoke
ParserOutput::addTrackingCategory() and don't have the corresponding
Parser object handy; see:
https://codesearch.wmcloud.org/search/?q=%5BOo%5Dutput%28%5C%28%5C%29%29%3F-%3EaddTrackingCategory%5C%28&i=nope&files=&excludeFiles=&repos=
Change-Id: I697ce188a912e445a6a748121575548e79aabac6
The ::getProperty() naming is too generic and doesn't clearly indicate
that these are "page properties" (which have their own table in the DB).
As part of refactoring a clean API out of ParserOutput which can be used
by Parsoid, clean up the naming here.
Soft-deprecation in this patch, there are a handful of external users
which need to be cleaned up before we hard-deprecate.
Bug: T287216
Change-Id: Ie963eea5aa0f0e984ced7c4dfa0fd65d57313cfa
The value in the attribute displaytitle must contain valid HTML. The
sanitizer of the {{DISPLAYTITLE}} parser ensures that only valid HTML
is accepted.
If there is no {{DISPLAYTITLE}} in the wikitext then displaytitle
falls back to $title->getPrefixedText(). Here an HTML encoding of
special characters is necessary. This affects only the replacement of
& by & because other special characters like < and > are not
allowed in the title.
This change affects the displaytitle fallback on the following places:
* ApiParse
* ApiQueryInfo
* InfoAction
* Parser
The displaytitle fallback in OutputPage is also updated to this
behavior although
Sanitizer::normalizeCharReferences( Sanitizer::removeHTMLtags( $html )
also replaces & by &.
Also add test cases with & in the displaytitle to:
* ApiParseTest
* ApiQueryInfoTest
* parserTests
Bug: T291985
Change-Id: I8ee1e2731d9bfa49725d663b34986e7e3073e4ca
Because in PHP is "0" == false.
Also
* Combine $this->mOutput->setTitleText calls.
* Avoid inverted logic. Use
if ( !A && !B && !C && D )
instead of
if ( !( A || B || C || !D ) )
* Document false as possible return value.
Change-Id: I92c343b74a9b313b10a2c9b31717a3727aed4cde
Update/Create override classes of ContentHandler.
Soft-deprecate and remove method from Content and classes that override them.
Bug: T287158
Change-Id: Idfcfbfe1a196cd69a04ca357281d08bb3d097ce2