This also protects naked external links, which
LanguageConverter::markNoConversion internally wraps in `-{R|...}-`.
Originally found in failed tests in I7fa2d85d6.
Bug: T54190
Change-Id: I9b099273203482ffb570a5654d8ba50c833e526d
A protected version of explode is factored out as
`StringUtils::delimiterExplode`, since it will be used in follow-up
patches in this series. The `delimiterExplode` implementation creates
an intermediate array of the exploded results, which is reasonable as
the number of image options is small; but since an Iterator is
returned, the implementation can be upgraded in the future (at the
cost of additional complexity) to avoid this. The additional code in
that case would be similar to ExplodeIterator.
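For illustration, here is a minimal sketch of the intended call when
splitting image link options (the delimiters and parameter order are
assumptions, not necessarily the final signature):

    // Split on '|' but ignore '|' inside protected delimiters.
    $bits = StringUtils::delimiterExplode( '-{', '}-', '|', $optionsText );
    foreach ( $bits as $bit ) {
        // each $bit is one complete image option
    }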
Bug: T146305
Change-Id: I1327685e9e8c07ef476dceaa6f6dae4ba40989ef
Changes:
- uses int instead of number as param and return value type
- uses stdClass instead of stdObject
- fixes ResourceLoaderClientHtml constructor's $target param type:
  it is string|null, not an array (previously misspelled as "aray")
- changes the type of XML parser references in the XMP library to
  resource instead of the nonexistent XMLParser
Change-Id: I98c363ebc6658d1f4dcabad97a9a92f3fcd7ea8c
This is a pure documentation change. It mostly removes empty lines
from comments (and entirely empty comments), adds a few missing
documentation blocks, and fixes a minor mistake. I hope it's OK to
have this in one patch; if not, tell me and I can split it.
Change-Id: I9668338602ac77b903ab6b02ff56bd52743c37c4
This change resulted in unreasonable feature loss (the human-readable
limit report was gone). Three months and multiple follow-ups later,
the functionality is still not completely restored. Given the lack
of response from the original author, I think it is time to revert
and reconsider, especially since the 1.28 release is soon.
A machine-readable limit report would be a very useful feature,
but not at the cost of losing the human-readable one.
This reverts the following commits:
* Move NewPP limit report HTML comments to JS variables
b7c4c8717f
* Only pretty-print the parser report JS vars
28adc4d7ee
* Show wgPageParseReport on page previews too
1255654ed5
* Re-add human readable parser limit report
0051f108b9
* Restore hooks.txt for ParserLimitReportFormat
4663e7a737
Resolved minor merge conflicts in OutputPage (with 80e5b160)
and release notes.
Bug: T110763
Bug: T142210
Change-Id: Id88c8066fae3f369e8977b4b7488f67071bdeeb7
Use HTTPS instead of HTTP where the HTTP link is a redirect to the
HTTPS link. Also update some broken links.
Change-Id: Ic3a5eac910d098ed5c2a21e9f47c9b6ee06b2643
This adds 3 tracking categories, one for each type of magic link (ISBN,
RFC, PMID). This will allow wikis to gauge usage and identify pages that
need migrating.
These will only show up if the respective magic links are enabled via
$wgEnableMagicLinks.
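Roughly, the parser-side check could look like this sketch (the
ParserOptions accessor and message key names are assumptions for
illustration):

    if ( $this->mOptions->getMagicRFCLinks() ) {
        // render the RFC magic link, then record its use
        $this->addTrackingCategory( 'magiclink-tracking-rfc' );
    }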
Change-Id: Ic483f0c493112bf6373e1b37961e1241c20c3582
This means we can't check whether a parser limit was exceeded while
trying to expand the content of a tag, but that's probably not a huge
loss. It'll just result in potentially strange output rather than an
exception.
Bug: T149622
Change-Id: I7910dfa0f61b1cc9168c7ed1498b2bda27c47f0e
The most critical one is if the marker name is bad, since that causes
StripState to throw an exception as of I798d31af. But we may as well
check the other expand calls in this function too, to avoid outputting
broken wikitext.
Bug: T136401
Change-Id: I1cb353d74f9a46168055e1abeb22cf569fe9354a
Apparently it is possible for Parser::mParserOptions
to not be set in some cases. I'll try again later.
This reverts commit bda74bff6e.
Bug: T146433
Change-Id: Idb6d1b20995d5f86b712abb386ab987356c4f560
wfEscapeWikiText() used $wgEnableMagicLinks, but that could result in
an inconsistency when something modifies the magic-link-related
ParserOptions.
In general, most uses of wfEscapeWikiText() are in parser functions or
during message parsing, so the Parser is a logical place for it.
A future patch will make it easy to use Parser::escapeWikitext() in
message parameters.
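As a hedged sketch of the intended call pattern (assuming the method
lands as named above):

    // Escapes wikitext consistently with this Parser's own options,
    // instead of reading the global $wgEnableMagicLinks.
    $safe = $parser->escapeWikitext( $userInput );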
Change-Id: I0fd4d5c135541971b1384a20328f1302b03d715f
The magic link functionality is "old backwards-compatibility baggage"
that we probably want to get rid of eventually. The first step to doing
so would be making it configurable and allowing it to be turned off on
wikis that don't use it.
This adds each of the 3 magic link types as individual parser options,
which can be controlled by the $wgEnableMagicLinks setting.
Additionally, wfEscapeWikiText() was updated to only escape enabled
magic link types.
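A LocalSettings.php sketch of the new setting (key names per this
change; the particular values are only an example):

    // Keep ISBN magic links, turn off RFC and PMID ones.
    $wgEnableMagicLinks = [
        'ISBN' => true,
        'RFC' => false,
        'PMID' => false,
    ];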
Bug: T47942
Change-Id: If63965f31d17da4b864510146e0018da1cae188c
Since git's file rename detection failed in several cases with an
all-in-one commit, I split the renames out into their own commit to
make review easier. Some changes here won't make complete sense
without the following commit.
* Moved TestsAutoLoader to tests/common/. It will be joined by a friend.
* Renamed ParserTest to ParserTestRunner, since the former name was
overly generic.
* Renamed TestFileIterator to TestFileReader. Please see the subsequent
commit for rationale.
* Moved parserTests.php to tests/parser/. It was the only file left in
tests/, and it should have been moved to tests/parser years ago,
analogous to phpunit.php.
* Renamed NewParserTest to ParserIntegrationTest. This was a tricky
  one: apparently the name has to end in "Test" or else the structure
  test will fail. Analogous to ParserMethodsTest etc. Rationale:
  because it's not new anymore.
* Renamed MediaWikiParserTest to ParserTestTopLevelSuite and moved it to
the suites directory. A more descriptive name. Being in suites/
shields it from StructureTests, and is correct anyway.
Change-Id: Iddc6eaf815fdd64b3addb8570b4b6303ab99d634
This is more consistent with LoadBalancer, more modern, and inclusive
of master/master MySQL, NDB Cluster, and MariaDB Galera Cluster.
The old constant is an alias now.
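For example, at a typical call site:

    $dbr = wfGetDB( DB_REPLICA ); // formerly DB_SLAVE, still accepted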
Change-Id: I0b37299ecb439cc446ffbe8c341365d1eef45849
Inverse flame graphs show revision lookups as one of the
big three queries (Revision, LinkCache, getTitleInfo of
ResourceLoaderWikiModule).
This works via a new Revision::newKnownCurrent() method, which
needs both the page and revision ID from the DB (to avoid
invalidation) and fetches the user name and rev_deleted only if
needed (again to avoid invalidation). The Parser does not care about
those fields anyway in the template path.
Also improved cross-wiki support a bit, and fixed up some
docs and IDEA errors.
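A hedged sketch of a lookup via the new method (the signature as
merged may differ slightly):

    $db = wfGetDB( DB_REPLICA );
    // Both IDs are supplied by the caller, so the cached row can be
    // used without extra invalidation checks.
    $rev = Revision::newKnownCurrent(
        $db,
        $title->getArticleID(),
        $title->getLatestRevID()
    );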
Change-Id: Icad602dba5de18c7758b77fd23b0a450ff21d09f
For simple pages that transclude special pages, like user pages
including Special:PrefixIndex, the TTL is allowed to drop to 15
seconds if the page parses fast enough.
Bug: T139893
Change-Id: If41885ded648d68352fe3d06336d98aa0ab53966
The code that normalizes line endings ("\r\n" and "\r" to "\n") and
trims trailing whitespace is buried in Parser::preSaveTransform(), and
was duplicated to TextContent in 96b6afb31d, as non-wikitext content
models should still normalize line endings.
This splits the duplicated code into
TextContent::normalizeLineEndings() and utilizes it in the Parser.
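The shared helper boils down to something like this sketch (the
actual implementation may differ in detail):

    public static function normalizeLineEndings( $text ) {
        // "\r\n" and bare "\r" become "\n"; trailing whitespace
        // is trimmed.
        return str_replace( [ "\r\n", "\r" ], "\n", rtrim( $text ) );
    }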
Additionally, expand the documentation of
TextContent::preSaveTransform() to document that subclasses should make
sure they normalize line endings during the PST stage.
Also remove an rtrim() call from WikitextContent that did nothing.
Change-Id: I9094c671d4bbd23d75436f8f1d682d6dd6e6d2fc
rawurldecode was being run on unclosed internal links,
which could allow an attacker to insert arbitrary
HTML into the page.
See also related: r13302
Bug: T137264
Change-Id: I4e112a9e918df9fe78b62c311939239b483a21f5
This does the same normalization of newlines that
Parser::preSaveTransform() does. This should be appropriate for any text
content type, especially considering that EditPage uses
WebRequest::getText() which does a less-strict version of this same
transformation.
This also cleans up the code for doing that newline replacement
to be a bit less verbose.
Bug: T142805
Change-Id: I462afcda502f031a8b0360d982ce2398a0383a96
Doxygen requires the fully qualified name of the class in a comment
or in the @param/@return annotation, otherwise the class isn't linked
in the resulting output[1]. This commit changes the LinkRenderer
annotations in SpecialPage and Parser to \MediaWiki\Linker\LinkRenderer.
[1] https://doc.wikimedia.org/mediawiki-core/master/php/classSpecialPage.html#a3560214f63fc2f20c63b4025db5cd81d
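For example, an annotation of the kind this commit fixes
(illustrative):

    /**
     * A bare "LinkRenderer" here would not be linked by Doxygen;
     * the fully qualified name is.
     *
     * @return \MediaWiki\Linker\LinkRenderer
     */
    public function getLinkRenderer() {
        return $this->linkRenderer;
    }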
Change-Id: I74cedcd764a6053cc5a0c6d2eedbedb72651f57c
We have two hacks which are used when Tidy is not available: one in
Sanitizer::removeHTMLtags(), and the second here as a late Parser pass
equivalent to Tidy itself. But the Sanitizer one was enabled only if
MWTidy::isEnabled() returned false, whereas the Parser one was enabled
also when Tidy was disabled in ParserOptions. This patch makes them
both consistent: it enables the bug 2702 hack only when
MWTidy::isEnabled() returns false, and when Tidy is disabled in parser
options, the output is simply passed through.
This allows tidying to be done separately on the ParserOutput, as is
required by the proposed ParserMigration extension (I24d0776a933fa3f).
Eventually the bug 2702 hack will be removed in favour of a pure-PHP
HTML 5 parser, but it looks like it is too early for that.
Change-Id: I94be6c9dec531c23ef80cb36732243bd6858bf22
* Instead of having messy code to create a hidden HTML
comment of English strings at the bottom of the page,
expose the structured data of the parse information
to JS so tools can use it.
* Make makeConfigSetScript() use pretty output so these
variables are also easy to read in "view source".
* Remove ParserLimitReportFormat hook, since the data
is not formatted to HTML anymore.
Bug: T110763
Change-Id: I2783c46c6d80f828f9ecf5e71fc8f35910454582
We originally imagined rolling out the display of empty elements
simultaneously with Html5Depurate, but we have since added support for
marking empty elements to Html5Depurate and plan on having some sort
of longer migration period. So, move the relevant CSS to content.css,
and remove the concept of CSS dependent on the tidy driver.
Add a body class which will allow the effect to be toggled in a gadget or
extension. Actual toggling in the CSS will be in the stage 2 patch, to be
deployed after the varnish cache and parser cache have expired.
I originally imagined that there would be a gadget that overrides the
rule with an !important declaration, but that method does not allow
you to recover the original display property, which is often
overridden by the style attribute or site CSS to be "inline".
Also, in RaggettWrapper, switch to the new class mw-empty-elt, following
Html5Depurate, instead of mw-empty-li. The old class will be removed in
the stage 2 patch.
Change-Id: Ic0f432c43a006629ca5a1a7c2dda3552ceb4dc4f
Some pages use constructs like `<b/>` or `<span/>` to protect spaces or
special characters at the beginning/end of templates. This syntax is
incompatible with HTML5 parsing rules, which dictate that these should
be treated as open tags, and instead rely on an unusual quirk of the
`tidy` program that removes invalid constructs.
This syntax is deprecated as part of the process of reconciling `tidy`
with modern HTML5 parsing semantics. Authors can use ` ` or `<nowiki/>`
as valid replacements.
In order to provide time to transition existing content, templates
and pages using self-closing tags in violation of the HTML5 parsing
specification will be added to a new tracking category.
After these uses are fixed, we will change the sanitizer to treat these
as normal open tags, to be consistent with the HTML5 parsing spec.
Note that this construct is already disallowed if tidy is disabled; it
is rendered as `<b/>`. We add a tracking category in the no-tidy
case as well, in preparation for eventually making the no-tidy and
with-tidy behaviors consistent.
Bug: T134423
Change-Id: Ie1cf3aa40d5483bf395ece539f0240b694ff04ab
During both the edit stash and the first parse on page save,
guess what the rev_id will be and use that instead of null.
Only reparse if it turns out to be wrong. This avoids extra
parsing on wikis that have low-to-medium traffic, and does not
cost much. The parsing that can be avoided is:
a) in doEditContent() by using the stash
b) in doEditUpdates() by using the doEditContent() result,
   whether or not that was itself able to use the stash
Also improved the parse operation logging in save paths.
Bug: T137900
Change-Id: Ic6faae70a78b4e223e4d3585cefd482c0fa00677
Also add SpecialPage::setLinkRenderer(), so the Parser can pass on
its LinkRenderer instance when special pages are being included in a
page.
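The hand-off itself is a one-liner; a sketch of the intended wiring:

    // In the transclusion path, before executing the special page:
    $specialPage->setLinkRenderer( $parser->getLinkRenderer() );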
Change-Id: If9a9c648ab670b824ce534e7cf0d20d41e1bfd12
In galleries, bad images are rendered as links. This change makes the
same behavior occur in wikitext image links, rather than the current
behavior of not rendering anything.
Change-Id: I1a074bff7cb661b5b4e6db9503eb6a5de702ee2f
Few maintained extensions still rely on this, and it is
bad practice to use it for handling cache correctness.
Change-Id: I2de481198bbff5c4f3dd81fc6d1b137e4c37b93f
Previously {{Special:Foo}} would cause the parser cache to be
disabled; now there is a method in SpecialPage to control this
behaviour and set arbitrary caching times.
Note: this does not affect caching of direct views to the special
page.
The new default disables the cache if not in miser mode, and
otherwise sets it to 1 hour, except for Special:Recentchanges
and Special:Newpages, which are set to 5 minutes. These values are
possibly really low, but for now I think it is best to stay close
to the old behaviour. We had 0 caching for these things for years,
and afaik it hasn't caused any big issues. Part of me wonders if
Special:Recentchanges should stay at 0, but that sounds crazy.
This change also causes transcluded special pages to not be
"per-user" if they are being cached (specifically, $wgUser et al.
become 127.0.0.1).
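A hedged sketch of how a special page could opt into a custom
transclusion TTL (the class is hypothetical and the method name is an
assumption):

    class SpecialMyReport extends SpecialPage {
        // Cache {{Special:MyReport}} transclusions for 5 minutes.
        public function maxIncludeCacheTime() {
            return 60 * 5;
        }
    }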
Bug: 60561
Change-Id: Id9ce987adeaa69d886eb1c5cd74c01072583e84d
Previously, no TTL at all was used, which is quite harsh on
performance and had downstream effects like disabling edit
stashing for affected pages.
Bug: T136678
Change-Id: I2462057aa189cfb05fe65d0b3c081a9fd10066a2
* Do not change the result to a null editing user anymore.
* Use a new vary-user flag instead of vary-revision. This
will only cause a reparse on null edits. Normal edits
can still use the prepared output now.
* Edit stashing now applies for pages with this magic word.
* Fixed a bug where the second prepareContentForEdit() call
  (due to vary-X flags) would still check the edit stash.
Bug: T135261
Bug: T136678
Change-Id: Id1733443ac3bf053ca61e5ae25db3fbf4499e9f9
Just always use the input size for new revisions. If they are
saved, then that should be the revision size. If they are just
null edits, then the size must have matched the current revision.
This also enables edit stashing for this case.
Change-Id: I428c0cc87750eeddd1d7dcebd1a2b03817cec441