Commit graph

1543 commits

Author SHA1 Message Date
jenkins-bot
3cb14f56bd Merge "Show dimensions in TraditionalImageGallery" 2017-06-15 17:16:13 +00:00
Brad Jorsch
da43a0ae34 ParserCache: Delete old-style key when saving
It was noticed that disk usage on the parser cache machines was
increasing since shortly after wmf.4 was redeployed everywhere on the
9th. One theory is that I7fb9ffca9 causes this by making reparses for an
existing old-style cache entry start writing the new-style key where
they would previously have overwritten the old-style key. On that
theory, let's delete that old-style key (that should now be useless) on
save.

I'm assuming here that firing a blind delete for keys that probably
don't exist in the cache (i.e. every new edit) isn't going to hurt
anything. If that's not the case, we'd need to check existence before
deleting.

Bug: T167784
Change-Id: Ie5efb05722cb7da2a90da195a1f244468177175d
2017-06-14 13:42:36 +00:00
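A minimal Python sketch of the blind delete described in da43a0ae34, assuming a plain dict as the cache backend. The class, method, and key names are hypothetical and are not MediaWiki's ParserCache API.

```python
class SketchParserCache:
    """Toy in-memory stand-in for a parser cache (hypothetical names)."""

    def __init__(self):
        self.store = {}

    def save(self, new_key, old_key, value):
        # Write the entry under the new-style key only.
        self.store[new_key] = value
        # Blind delete of the old-style key: no existence check first,
        # so a key that was never written is silently ignored.
        self.store.pop(old_key, None)


cache = SketchParserCache()
cache.store["old:Main_Page"] = "<p>stale</p>"
cache.save("new:Main_Page", "old:Main_Page", "<p>fresh</p>")
```

Whether a blind delete is cheap enough depends on the real backend; the commit message itself flags this as an assumption.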
Brad Jorsch
46c0c39514 ParserOptions: Fix handling of 'editsection'
The handling of the 'editsection' option prior to I7fb9ffca9 was
unusual: it was included in the cache key, but the getter didn't ever
flag it as "used". This was overlooked in I7fb9ffca9.

This fixes the handling to restore that behavior. It's no longer
considered to be a real parser option, so changing it won't make
isSafeToCache() fail, and reading it won't flag it as "used".

But to keep Wikibase working (see T85252), if 'editsection' is supplied
in $forOptions, optionsHash() will still include it in the hash so
whatever Wikibase is doing by forcing that doesn't break. The hash when
it is included is the same as was used in I7fb9ffca9 to reuse keys.

Once optionsHashPre30() is removed, Wikibase should be changed to use
some other method to fix T85252 so we can remove that hack from
optionsHash().

Change-Id: I77b5519c5a1122a1fafbfc523b77b2268c0efeb1
2017-06-14 04:52:36 +00:00
Brad Jorsch
0facbe3e3d Try harder to avoid parser cache pollution
* ParserOptions is reorganized so it knows all the options and their
  defaults, and can report whether the non-key options are at their
  defaults.
* Definition of the "canonical" ParserOptions (which is unfortunately
  different from the "default" ParserOptions) is moved from
  ContentHandler to ParserOptions.
* WikiPage uses this to throw an exception if it's asked to cache
  with options that aren't used in the cache key.
* ParserCache gets some temporary code to try to avoid a massive cache
  stampede on upgrade.

Bug: T110269
Change-Id: I7fb9ffca96e6bd04db44d2d5f2509ec96ad9371f
Depends-On: I4070a8f51927121f690469716625db4a1064dea5
2017-06-05 14:17:28 +00:00
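A rough Python sketch of the idea in 0facbe3e3d: refuse to cache when an option that is not part of the cache key differs from its default. The option names, defaults, and function names below are hypothetical illustrations, not the actual ParserOptions or WikiPage code.

```python
# Hypothetical option table: every known option with its default value.
DEFAULTS = {"userlang": "en", "thumbsize": 5, "is_preview": False}
# Hypothetical subset of options that take part in the cache key.
KEY_OPTIONS = {"userlang", "thumbsize"}


def is_safe_to_cache(options):
    """True if every non-key option is still at its default."""
    return all(
        options.get(name, default) == default
        for name, default in DEFAULTS.items()
        if name not in KEY_OPTIONS
    )


def cache_key(page, options):
    """Build a key from the page name and the key options only."""
    parts = [f"{name}={options.get(name, DEFAULTS[name])}" for name in sorted(KEY_OPTIONS)]
    return page + "!" + "!".join(parts)


def save_to_cache(cache, page, options, html):
    # Caching with a non-default, non-key option would pollute the cache,
    # because the key could not tell this output apart from the normal one.
    if not is_safe_to_cache(options):
        raise ValueError("options are not safe to cache")
    cache[cache_key(page, options)] = html
```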
jenkins-bot
c5fa9b4e64 Merge "Parser: Better debugging of lock errors ("Did you call Parser::parse recursively?")" 2017-05-30 19:01:45 +00:00
Kunal Mehta
ff8a0c788b parser: Avoid deprecated wfMemcKey()
Tested that parser cache keys stay the same, before and after this
change.

Also use the more obvious ObjectCache::getLocalClusterInstance() instead
of looking up the main cache type in config and using
ObjectCache::getInstance().

Change-Id: Icef646b3c05e732ef4079d6900e6bce111debf2b
2017-05-25 12:05:49 -07:00
jenkins-bot
5f764ec180 Merge "Move loading of mediawiki.toc from Parser to Skin" 2017-05-23 18:42:44 +00:00
C. Scott Ananian
186a182a15 Protect language converter markup in the preprocessor (take 2).
This revises 2877402276, which was reverted in master due to
unexpected issues with `-{{...}}` markup on translatewiki and enwiki.
Test cases are added to ensure that this
is parsed as a template, not as language converter markup.

https://www.mediawiki.org/wiki/Preprocessor_ABNF is the canonical
documentation for the preprocessor; this will be updated after this
patch is merged.  The basic principles described in that page are
maintained in this patch:

* Rightmost opening structure has precedence: `-{{` is parsed as a
dash followed by template opening.

* `{{{` has precedence over `{{` and `-{`: `-{{{{` is parsed as
`-{` `{{{` since we first grab the rightmost `{{{`.

A bunch of test cases were added to verify the "ideal precedence"
order described on that wiki page.

This patch introduced some minor incompatibilities in existing
markup, in particular with chemical formulae in templates.
Fixes for these are being tracked at
https://www.mediawiki.org/wiki/Parsoid/Language_conversion/Preprocessor_fixups

Bug: T146304
Bug: T153761
Change-Id: I2f0c186c75e392c95e1a3d89266cae2586349150
2017-05-23 15:43:49 +01:00
Bartosz Dziewoński
baab085b32 Parser: Better debugging of lock errors ("Did you call Parser::parse recursively?")
Save the backtrace when locking, so that if some code tries locking again,
we can print the lock owner's backtrace for easier debugging.

Change-Id: I6e352b4aa5e7cb35825a66592f6c066d9e8b95c9
2017-05-23 14:50:13 +02:00
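The debugging technique from baab085b32 can be sketched in Python as follows, assuming a simple non-reentrant lock flag. The class and method names are hypothetical; only the idea of storing the owner's backtrace at lock time comes from the commit message.

```python
import traceback


class SketchParser:
    """Toy parser with a non-reentrant lock (hypothetical names)."""

    def __init__(self):
        # Backtrace of whoever currently holds the lock, or None if unlocked.
        self._lock_owner_trace = None

    def lock(self):
        if self._lock_owner_trace is not None:
            # Report where the lock was first taken, not just where it failed.
            raise RuntimeError(
                "Parser is locked; did you call parse() recursively?\n"
                "Lock owner's backtrace:\n" + self._lock_owner_trace
            )
        # Save the current call stack so a later conflict can print it.
        self._lock_owner_trace = "".join(traceback.format_stack())

    def unlock(self):
        self._lock_owner_trace = None
```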
Timo Tijhof
c7e00974c7 Move loading of mediawiki.toc from Parser to Skin
This was the only addModules() call ever to be inside Parser.
Introduced in a54ef1a203. Prior to that, mediawiki.toc had always been loaded
by OutputPage (via mediawiki.util; and before that, via wikibits).

This patch restores that, and also fixes T130632 by making OutputPage get
it from the Skin, instead of hardcoding this somewhere in addParserOutput().

* Remove deprecated method OutputPage::enableTOC().
* Move mEnableTOC to addParserOutputText().

Bug: T130632
Change-Id: Iaad84d241a4c4348c712ac1087a664b8c9c46da4
2017-05-21 19:06:43 +02:00
Brion Vibber
33e4ac5b22 Add \b to regexes in BlockLevelPass to avoid confusing tr & track
With TimedMediaHandler in video.js mode, videos can be inline,
without a wrapper div.

Previously, in this mode two paragraphs where one contained a
video would end up merged into one paragraph, due to BlockLevelPass
matching "<track .../>" against "<tr" in its regexes.

Added \b to a couple of the regexes to protect against such errors,
and corrected a parser test case that had bad output listed, where
"<link .../>" matched against "<li".

Bug: T165817
Change-Id: I06e82b881f5ebddae5e7df7fb940adfa54f6b659
2017-05-20 00:53:05 +02:00
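The effect of the `\b` fix in 33e4ac5b22 can be illustrated with Python's `re` module. The two patterns below are simplified stand-ins, not the actual BlockLevelPass regexes.

```python
import re

# Simplified stand-in for a pattern without a word boundary:
# "<tr" also matches the start of "<track", and "<li" the start of "<link".
without_boundary = re.compile(r"<(tr|li)", re.IGNORECASE)

# With \b the tag name must end right after "tr" or "li".
with_boundary = re.compile(r"<(tr|li)\b", re.IGNORECASE)


def looks_like_block_tag(pattern, text):
    """Return True if text starts with a tag the pattern treats as block-level."""
    return pattern.match(text) is not None
```

With these stand-ins, `<track .../>` and `<link .../>` are only misclassified by the pattern that lacks `\b`.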
Brad Jorsch
ecb4c0e3fe ParserOptions: Include wrapping class in options hash
Avoids polluting the cache when things take advantage of the option.

Bug: T165115
Bug: T165161
Change-Id: I5be25c6de68012df58b6a0cbf92e2f972be2b68a
2017-05-15 05:55:51 +00:00
Brad Jorsch
1aac0a2992 Wrap parser output in <div class="mw-parser-output">
This will allow CSS to target just the parser output, without also
accidentally targeting the edit form, diff tables, and so on.

Bug: T37247
Change-Id: If4eb5bf71f94fa366ec4eddb6964e8f4df6b824a
Depends-On: I330c6aa4aaee045614b1801ed34bc9e03be69650
Depends-On: I52a518fa44e017841fe78474012cd69823e0a41d
2017-05-08 05:32:03 +00:00
Paladox
54c56da85a Fix php code style
Preparation change for updating mediawiki code sniffer to 0.8.0

Change-Id: Ib0b3fe4afea9096ffa3a1347b4f7e07d3398b0b2
2017-05-05 12:03:54 +00:00
Umherirrender
84132a2613 Remove unused var assign in Parser::getTemplateDom
Change-Id: If11d7a2568d4235df6888e4001500bdf45f58eae
2017-05-02 21:18:54 +02:00
Fomafix
5c41b29993 Use isSpecialPage() where possible
Change-Id: Ie4d0838acf96a7ed4a1fe4cfdc901c77d3312174
2017-04-29 22:31:42 +02:00
Tim Starling
448be2ed3e Add benchmarkTidy.php, to benchmark tidy drivers
Plus representative input file

Change-Id: I254793fc55c57a98c07ae1e4c27e6005965c9a20
2017-04-21 01:02:22 +00:00
Umherirrender
bf51e3542b Add grep infos to Parser::getImageParams
Comments for grep make searching easier

Change-Id: I98e93baf6bd89df36185d535d6e63c51c6f65bc9
2017-04-14 23:42:15 +02:00
Brian Wolff
17e7bc0235 SECURITY: Always normalize link url before adding to ParserOutput
Move link normalization directly into addExternalLink() method,
since you always need to do it - having it separate is just
inviting people to forget to normalize a link.

Additionally, links weren't properly registered for <gallery>.
This was somewhat unnoticed, as the call to recursiveTagParse()
would register free links, but it wouldn't work for example with
protocol relative links.

Issue originally reported by MZMcBride.

Bug: T48143
Change-Id: I557fb3b433ef9d618097b6ba4eacc6bada250ca2
2017-04-06 13:44:44 -07:00
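The design choice in 17e7bc0235, normalizing inside the method that registers the link so callers cannot forget, might look like this in Python. The normalization shown (lowercasing scheme and host) is only a placeholder assumption, and the class and function names are hypothetical rather than the ParserOutput API.

```python
from urllib.parse import urlsplit, urlunsplit


def normalize_url(url):
    """Placeholder normalization: trim whitespace, lowercase scheme and host."""
    parts = urlsplit(url.strip())
    return urlunsplit(
        (parts.scheme.lower(), parts.netloc.lower(), parts.path, parts.query, parts.fragment)
    )


class SketchParserOutput:
    """Toy container for registered external links (hypothetical names)."""

    def __init__(self):
        self.external_links = set()

    def add_external_link(self, url):
        # Normalization happens here, so every caller gets it for free,
        # including callers that pass protocol-relative links.
        self.external_links.add(normalize_url(url))
```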
jenkins-bot
4f43a9608f Merge "Valid tags on a gallery should correspond to unordered list" 2017-04-03 18:25:37 +00:00
Arlo Breault
aea349488c Valid tags on a gallery should correspond to unordered list
* This was introduced in 4d3446a8e3 when galleries were tables.
  However, in 05579cf0e6, it switched to ul's, but missed updating the
  sanitization.

* As an example, the test shows that summary is currently wrongly
  permitted.

Change-Id: I8c52477dc65499d0c8a1ee5cc661a5f9ae78cc07
2017-04-01 09:59:21 -04:00
Brian Wolff
1c7889446d SECURITY: Disable <html> tag on system messages despite $wgRawHtml = true;
System messages may take parameters from untrusted sources. This
may include taking parameters from urls given by unauthenticated
users even if the wiki is a read-only wiki. Allowing <html> tags
in such a context seems like an accident waiting to happen.

Bug: T156184
Change-Id: I661f482986d319cf41da1d3e7b20a0f028a42e90
2017-03-28 21:51:44 +00:00
Matěj Suchánek
ec25c79139 Add a tracking category when a template loop is detected
Bug: T160743
Change-Id: Ib888634af281fc2347eaa389db4141782a98c15c
2017-03-17 11:52:38 +00:00
jenkins-bot
980c688c2b Merge "Add RemexHtml to the list of available Tidy drivers" 2017-03-08 23:33:17 +00:00
Tim Starling
50fe941457 Add RemexHtml to the list of available Tidy drivers
Change-Id: I5a87a6ed24ca3ef7c5fdb21e74f9eb410bf74b4c
2017-03-09 10:19:23 +11:00
Matthias Mullie
ebb1680359 Show dimensions in TraditionalImageGallery
Bug: T121869
Change-Id: Ie2cb3f1594302f1726ae3d9d2d668c81b7e6b0f1
2017-03-07 13:09:00 +01:00
C. Scott Ananian
3e32d21210 Strip U+0000 in wikitext
U+0000 is not allowed in HTML5, there's no reason to allow it in wikitext.

It simplifies our code if we can just strip them at the start.  Strip in
PST as well so they don't sneak into our database either.

Tweaked the EXT_LINK URLs to account for the fact that invalid characters
get transformed into U+FFFD when using Preprocessor_DOM.  See 73649741ed
(r65967) for context on that change.

Bug: T159174
Change-Id: I3f67e92b61aacc87a40c3662085c84d1dac08bfb
2017-03-06 22:23:38 +00:00
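As a small illustration of 3e32d21210, stripping U+0000 once at the start of processing could look like this in Python. The function name is hypothetical, and the real change also covers the pre-save transform (PST).

```python
def strip_nul(wikitext: str) -> str:
    """Remove every U+0000 character before any further processing."""
    return wikitext.replace("\x00", "")
```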
jenkins-bot
aa3319c4c0 Merge "Miscellaneous indentation tweaks" 2017-02-28 18:38:36 +00:00
Bartosz Dziewoński
ecdef925bb Miscellaneous indentation tweaks
I was bored. What? Don't look at me that way.

I mostly targeted mixed tabs and spaces, but others were not spared.
Note that some of the whitespace changes are inside HTML output,
extended regexps or SQL snippets.

Change-Id: Ie206cc946459f6befcfc2d520e35ad3ea3c0f1e0
2017-02-27 19:23:54 +01:00
James D. Forrester
9635dda73a includes: Replace implicit Bugzilla bug numbers with Phab ones
It's unreasonable to expect newbies to know that "bug 12345" means "Task T14345"
except where it doesn't, so let's just standardise on the real numbers.

Change-Id: I6f59febaf8fc96e80f8cfc11f4356283f461142a
2017-02-21 18:13:24 +00:00
Eddie Greiner-Petter
635040c91d Fixed documentation on Parser::getVariableValue
$index is definitely not an int here; see the big switch( $index )-case
statement below. It switches on strings, not numbers. Also, note that
this is lowercase, although one might expect it to be uppercase, as this
is how magic words are written in wikitext.

Bug: T96633
Change-Id: Iea93c3796fdee4ed7abbb7608e89b627ca95aead
2017-02-18 22:41:12 +00:00
C. Scott Ananian
3a8a986e35 Don't bail on single-line definition list due to excess close tags.
When parsing a single line definition list, we track nested tags so that:

	; <b>foo:bar</b>: baz

breaks before `baz`, not between `foo` and `bar`.  But we currently bail
out of this algorithm entirely if we see a mismatched close tag.  We should
just ignore the unmatched tag, like Parsoid does.

Change-Id: I6306dcad6347abeb6ab001d35562f1ab9f374bd1
2017-02-17 16:34:55 -05:00
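The behaviour described in 3a8a986e35 can be approximated with a short Python sketch: track tag nesting while looking for the colon that separates term from definition, and ignore an unmatched close tag instead of giving up. The function name and the simplified tag regex are assumptions; void and self-closing tags are not handled here.

```python
import re

# Simplified tag matcher: optional "/", a tag name, then anything up to ">".
TAG = re.compile(r"<(/?)([a-zA-Z][a-zA-Z0-9]*)[^>]*>")


def find_split_colon(line):
    """Return the index of the first colon outside any open tag, or -1."""
    depth = 0
    i = 0
    while i < len(line):
        m = TAG.match(line, i)
        if m:
            if m.group(1):
                # Close tag: only pop if something is open. An unmatched
                # close tag is ignored rather than aborting the scan.
                if depth > 0:
                    depth -= 1
            else:
                depth += 1
            i = m.end()
            continue
        if line[i] == ":" and depth == 0:
            return i
        i += 1
    return -1
```

For the term text `<b>foo:bar</b>: baz`, the colon inside `<b>` is skipped and the split happens before `baz`.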
C. Scott Ananian
ee002d67c9 Protect -{...}- variant constructs in definition lists.
Given the wikitext:

	;-{zh-cn:AAA;zh-tw:BBB}-

Prevent `doBlockLevels` from trying to split the definition list at the
embedded colon and using `AAA;zh-tw:BBB}-` as the `<dd>` portion.

Bug: T153135
Change-Id: I3a4d02f1fbd0d0fe8278d6b7c66005f0dd3dd36b
2017-02-17 15:52:44 -05:00
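A comparable Python sketch for ee002d67c9: skip over `-{...}-` constructs when searching for the colon, so a colon inside variant markup never splits the definition list. The function name is hypothetical and nested constructs are not handled.

```python
def find_colon_outside_variants(line):
    """Return the index of the first colon not inside -{...}-, or -1."""
    i = 0
    while i < len(line):
        if line.startswith("-{", i):
            end = line.find("}-", i + 2)
            if end != -1:
                # Jump past the whole protected construct.
                i = end + 2
                continue
            # Unterminated "-{": treat it as plain text and keep scanning.
            i += 2
            continue
        if line[i] == ":":
            return i
        i += 1
    return -1
```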
Brad Jorsch
fb3ae6fbe3 Replace use of &$this
Use of &$this doesn't work in PHP 7.1. For callbacks to methods like
array_map() it's completely unnecessary, while for hooks we still need
to pass a reference and so we need to copy $this into a local variable.

Bug: T153505
Change-Id: I8bbb26e248cd6f213fd0e7460d6d6935a3f9e468
2017-01-31 23:01:54 -05:00
Max Semenik
13054a4c70 refreshLinks.php: allow refreshing by categories, tracking or not
Needed for selective updates of pages using a particular feature.
Intended to be run in production, so needs to scale.

Bug: T149723
Change-Id: If20fb1f91de8d4227def5b07d6d52b91161ed3fd
2017-01-23 14:30:16 -08:00
This, that and the other
79cbecae83 Parser: Trim leading whitespace from links before checking for leading :
The leading spaces on the link only cause us problems, such as for the
$noforce check 20 lines later.

Bug: T129218
Change-Id: I93a8da1f73b38fa3da362f8f27479b3039ed3f13
2017-01-21 23:55:02 +00:00
Max Semenik
4ca09bd76f Make most of DateFormatter private
As discussed in https://gerrit.wikimedia.org/r/#/c/332702/ , these
methods and fields shouldn't have been marked public in the first place.
No outside users. Also, declare a couple of fields and remove unused ones.

Change-Id: I7775978c87d983784a484ee2ad901d25c42499b3
2017-01-19 13:39:18 -08:00
jenkins-bot
5eaf2e1f15 Merge "Un-blacklist PhanUndeclaredVariable" 2017-01-19 19:00:58 +00:00
Max Semenik
8cf5c2a37c Remove deprecated Parser::replaceUnusualEscapes()
Deprecated since 1.24, no callers.

Change-Id: Ib780f1a7b77d3ce624112f59c8e57820fecb6bf2
2017-01-18 19:19:00 -08:00
Erik Bernhardson
e5b8bf4942 Un-blacklist PhanUndeclaredVariable
Undeclared variables are a very common error type that we want to catch
as often as possible. To avoid needing to refactor a variety of global
level code (mostly in old-style maintenance scripts) this ignores
undeclared variables in global scope. This is still a good improvement
over what was happening previously.

Change-Id: I50b41d571724244552074b9408abbdf6160aca59
2017-01-18 13:07:39 -08:00
divadsn
e8d13ad1a2 Add a new {{PAGELANGUAGE}} variable for use in wikitext
Returns the language code of the page being parsed.

Bug: T59603
Change-Id: I229edd6251cf1120b3395d1811dbb9d96d9cd8ee
2017-01-07 02:03:53 +00:00
Aaron Schulz
b03b387e5a Include JS variable for NewPP report
Adapted from reverted commit b7c4c8717f.

Bug: T110763
Change-Id: If249b679c534879bfac622592a1d2fa913a0cf9d
2017-01-05 19:11:38 -08:00
jenkins-bot
704f307289 Merge "parser: Update outdated comment about ImageGallery" 2017-01-05 23:06:57 +00:00
Fomafix
d2997347a2 PHP code style: No space after unary not operator
Change-Id: I4d3df0cfcda4d88e405164123893e57786fbe15e
2017-01-05 16:00:59 +00:00
Timo Tijhof
7cd37c9f0e parser: Update outdated comment about ImageGallery
Follows-up f90634a6.

Change-Id: Ic17dc03cc37b85f222f3bb525e4cb39afc6f22ae
2017-01-03 18:15:40 -08:00
C. Scott Ananian
046c463635 Revert "Protect language converter markup in the preprocessor."
This effectively reverts commit 2877402276 in
order to unblock the deploy train.  The underlying behavior might not be
incorrect, but it was unexpected.

Bug: T153761
Change-Id: Ifc9c7cf3482dd5d222ff4da24a6d4cc401e9d965
2017-01-03 17:23:28 -05:00
C. Scott Ananian
23fd64afde Don't parse language converter markup as a cell parameter in tables.
Bug: T153140
Change-Id: I799363727162a0f337652b26bb69fe35c61a8553
2016-12-22 11:09:50 -05:00
Niklas Laxström
c9d638d6c7 CoreParserFunctions: Use Title::inNamespace instead of manual comparison
Change-Id: I60c02bc68ef0d48b1dc66ba0961275feec5789fb
2016-12-21 10:55:14 +01:00
Nikerabbit
a2fadd0619 Merge "Parser functions now format numbers according to page language (2nd attempt)" 2016-12-21 07:58:26 +00:00
C. Scott Ananian
ae934157b2 Protect -{...}- variant constructs in galleries
This also protects naked external links, which are internally surrounded by
`-{R|...}-` by LanguageConverter::markNoConversion.

Originally found in failed tests in I7fa2d85d6.

Bug: T54190
Change-Id: I9b099273203482ffb570a5654d8ba50c833e526d
2016-12-20 22:14:37 +00:00