* in links (T103714)
* in indicators (T104196)
This change removes the automatic Sanitizer::decodeCharReferences from
Sanitizer::escapeId and Sanitizer::escapeIdInternal. Where decoding of
HTML entities are wanted an explicit call to
Sanitizer::decodeCharReferences is added.
Explicit decode HTML entities in non local autocomments. (T104311)
Bug: T103714
Bug: T104196
Bug: T104311
Change-Id: I88e8e2077e6f5eec2b232391f7818370894a62dc
This special case complicates wikitext semantics and ought to be
unnecessary. Parsoid doesn't include this special case; if this patch
to the PHP parser isn't merged, we should write one for Parsoid to
implement the missing special case logic.
Bug: T175416
Change-Id: I3865c51b21de9d63ac5d06dcc3a3fa9108129d6c
Previous attempt was I943cd9bec0855d9a326b0b50739d686a29995370, reverted in
e687f2da3e due to T174639.
There's still a weird behavior with newline stripping between links, which
I'll try to tackle in a follow-on patch (T175416).
Bug: T2087
Bug: T10897
Bug: T87753
Bug: T174639
Change-Id: I8228cdd3b80faf899000adb511a983edc454bc76
* Parser::getRandomString() (deprecated in 1.26) was removed.
* Parser::uniqPrefix() (deprecated in 1.26) was removed.
* Parser::extractTagsAndParams() now only accepts three arguments. The fourth, $uniq_prefix was deprecated in 1.26 and has now been removed.
Bug: T61113
Change-Id: I7333fff4eb8b9a754b4596992f2a69bbdaac664d
- mostly auto fixes
- some too long lines fixed
- ignore amp space in one case passing by reference
Change-Id: I6472f83bc3cbf4bd629d83050cc3319b19ec465c
It adds the ability to replace the current section ID escaping
schema (.C0.DE) with a HTML5-compliant escaping schema that is
displayed as Unicode in many modern browsers.
See the linked bug for discussion of various options that were
considered before the implementation. A few remarks:
* Because Sanitizer::escapeId() is used in a bunch of places without
escaping, I'm deprecating it without altering its behavior.
* The bug described in comments for Parser::guessLegacySectionNameFromWikiText()
is still there in some Edge versions that display mojibake.
Bug: T152540
Change-Id: Id304010a0342efbb7ef2d56c5b8b244f2e4fb2c5
And auto-fix all errors.
The `<exclude-pattern>` stanzas are now included in the default ruleset
and don't need to be repeated.
Change-Id: I928af549dc88ac2c6cb82058f64c7c7f3111598a
* Right now, one or two are permitted. This patch limits it to one.
The current behaviour seems more a byproduct of refactoring than an
explicit goal.
* Note that this will break links on a handful of pages surfaced in
Parsoid's roundtrip testing.
Change-Id: Icabd34bbf15781bb891bd8e0c079d1a65eb28595
The used phpcs has a bug, so the version 0.9.0 could not be enforced at the moment.
Will be fixed in next version, see T167168
Changed:
- Remove duplicate newline at end of file
- Add space between function and ( for closures
- and -> &&, or -> ||
Change-Id: I4172fb08861729bccd55aecbd07e029e2638d311
This is to safeguard against issues like T167238, where the callback was invalid,
but was only noticed when the magic word was used in an article.
In general, type checks should be applied when things get
registered for later use. Fail early, stay sane.
Change-Id: Ifb7005ee214829c98cec534261c0db7d13f50f35
Save the backtrace when locking, so that if some code tries locking again,
we can print the lock owner's backtrace for easier debugging.
Change-Id: I6e352b4aa5e7cb35825a66592f6c066d9e8b95c9
This was the only addModules() call ever to be inside Parser.
Introduced in a54ef1a203. Prior to that, mediawiki.toc had always been loaded
by OutputPage (via mediawiki.util; and before that, via wikibits).
This patch restores that, and also fixes T130632 by making OutputPage get
it from the Skin, instead of hardcoding this somewhere in addParserOutput().
* Remove deprecated method OutputPage::enableTOC().
* Move mEnableTOC to addParserOutputText().
Bug: T130632
Change-Id: Iaad84d241a4c4348c712ac1087a664b8c9c46da4
This will allow CSS to target just the parser output, without also
accidentally targeting the edit form, diff tables, and so on.
Bug: T37247
Change-Id: If4eb5bf71f94fa366ec4eddb6964e8f4df6b824a
Depends-On: I330c6aa4aaee045614b1801ed34bc9e03be69650
Depends-On: I52a518fa44e017841fe78474012cd69823e0a41d
Move link normalization directly into addExternalLink() method,
since you always need to do it - having it separate is just
inviting people to forget to normalize a link.
Additionally, links weren't properly registered for <gallery>.
This was somewhat unnoticed, as the call to recursiveTagParse()
would register free links, but it wouldn't work for example with
protocol relative links.
Issue originally reported by MZMcBride.
Bug: T48143
Change-Id: I557fb3b433ef9d618097b6ba4eacc6bada250ca2
* This was introduced in 4d3446a8e3 when galleries were tables.
However, in 05579cf0e6, it switched to ul's, but missed updating the
sanitization.
* As an example, the test shows that summary is currently wrongly
permitted.
Change-Id: I8c52477dc65499d0c8a1ee5cc661a5f9ae78cc07
U+0000 is not allowed in HTML5, there's no reason to allow it in wikitext.
It simplifies our code if we can just strip them at the start. Strip in
PST as well so they don't sneak into our database either.
Tweaked the EXT_LINK URLs to account for the fact that invalid characters
get transformed into U+FFFD when using Preprocessor_DOM. See 73649741ed
(r65967) for context on that change.
Bug: T159174
Change-Id: I3f67e92b61aacc87a40c3662085c84d1dac08bfb
I was bored. What? Don't look at me that way.
I mostly targetted mixed tabs and spaces, but others were not spared.
Note that some of the whitespace changes are inside HTML output,
extended regexps or SQL snippets.
Change-Id: Ie206cc946459f6befcfc2d520e35ad3ea3c0f1e0
$index is definitely not a int here, see the big switch( $index )-case
statement below. It switches for strings, not numbers. Also, note that
this is lowercase, one might expect it to be uppercase as this is how
magic words are written in wikitext.
Bug: T96633
Change-Id: Iea93c3796fdee4ed7abbb7608e89b627ca95aead
Use of &$this doesn't work in PHP 7.1. For callbacks to methods like
array_map() it's completely unnecessary, while for hooks we still need
to pass a reference and so we need to copy $this into a local variable.
Bug: T153505
Change-Id: I8bbb26e248cd6f213fd0e7460d6d6935a3f9e468
The leading spaces on the link only cause us problems, such as for the
$noforce check 20 lines later.
Bug: T129218
Change-Id: I93a8da1f73b38fa3da362f8f27479b3039ed3f13
This also protects naked external links, which are internally surrounded by
`-{R|...}-` by LanguageConverter::markNoConversion.
Originally found in failed tests in I7fa2d85d6.
Bug: T54190
Change-Id: I9b099273203482ffb570a5654d8ba50c833e526d
A protected version of explode is factored out as
`StringUtils::delimiterExplode`, since it will be used in follow-up
patches in this series. The `delimiterExplode` implementation creates
an intermediate array of the exploded results, which is reasonable as
the number of image options is small; but since an Iterator is
returned the implementation can be upgraded in the future (at the cost
of additional complexity) to avoid this. The additional code in that
case would be similar to ExplodeIterator.
Bug: T146305
Change-Id: I1327685e9e8c07ef476dceaa6f6dae4ba40989ef