Commit graph

856 commits

Author SHA1 Message Date
jenkins-bot
704f307289 Merge "parser: Update outdated comment about ImageGallery" 2017-01-05 23:06:57 +00:00
Fomafix
d2997347a2 PHP code style: No space after unary not operator
Change-Id: I4d3df0cfcda4d88e405164123893e57786fbe15e
2017-01-05 16:00:59 +00:00
Timo Tijhof
7cd37c9f0e parser: Update outdated comment about ImageGallery
Follows-up f90634a6.

Change-Id: Ic17dc03cc37b85f222f3bb525e4cb39afc6f22ae
2017-01-03 18:15:40 -08:00
C. Scott Ananian
23fd64afde Don't parse language converter markup as a cell parameter in tables.
Bug: T153140
Change-Id: I799363727162a0f337652b26bb69fe35c61a8553
2016-12-22 11:09:50 -05:00
C. Scott Ananian
ae934157b2 Protect -{...}- variant constructs in galleries
This also protects naked external links, which are internally surrounded by
`-{R|...}-` by LanguageConverter::markNoConversion.

Originally found in failed tests in I7fa2d85d6.

Bug: T54190
Change-Id: I9b099273203482ffb570a5654d8ba50c833e526d
2016-12-20 22:14:37 +00:00
C. Scott Ananian
51d54b4b91 Protect -{...}- variant constructs in images.
A protected version of explode is factored out as
`StringUtils::delimiterExplode`, since it will be used in follow-up
patches in this series.  The `delimiterExplode` implementation creates
an intermediate array of the exploded results, which is reasonable as
the number of image options is small; but since an Iterator is
returned the implementation can be upgraded in the future (at the cost
of additional complexity) to avoid this.  The additional code in that
case would be similar to ExplodeIterator.

Bug: T146305
Change-Id: I1327685e9e8c07ef476dceaa6f6dae4ba40989ef
2016-12-20 22:08:36 +00:00
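A minimal sketch of how the delimiter-aware explode described above might be called, splitting image option text on `|` while leaving `-{...}-` constructs intact. The parameter order (start delimiter, end delimiter, separator, subject) is an assumption inferred from the commit message, not taken from the patch itself.

```php
// Hypothetical illustration only; parameter order is assumed, not verified.
$optionText = 'thumb|-{R|some|text}-|300px';
$parts = StringUtils::delimiterExplode( '-{', '}-', '|', $optionText );
foreach ( $parts as $part ) {
	// Yields "thumb", "-{R|some|text}-", "300px": the "|" characters inside
	// the -{...}- construct do not split the string.
	var_dump( $part );
}
```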
Leszek Manicki
95b9d82a3a Fix parameter type docs
Changes:
 - uses int instead of number as param and return value type,
 - uses stdClass instead of stdObject
 - fixes ResourceLoaderClientHtml constructor's $target param type:
   it is string|null, not an array (previously misspelled as "aray")
 - changes the type of references to XML parser in XMP lib to resource
   instead of the non-existent XMLParser

Change-Id: I98c363ebc6658d1f4dcabad97a9a92f3fcd7ea8c
2016-12-14 17:01:47 +01:00
Sébastien Santoro
42d1b3c169 Convert legacy bug IDs to Phabricator task IDs for Parser class
Side edits to comments:
  - update a @fixme as T10068 has been declined
  - fix spelling

Change-Id: I7f9f191ff68bb56de72563dde957ccf4731267e4
2016-12-13 03:15:02 +00:00
Thiemo Mättig
00c3f09566 Remove empty lines from PHP and JavaScript comment blocks
This is a pure documentation change. It mostly removes empty lines from
comments (and entirely empty comments), as well as adds a few missing
documentation blocks and fixes a minor mistake. I hope it's ok to have
this in one patch. I can split it, please tell me.

Change-Id: I9668338602ac77b903ab6b02ff56bd52743c37c4
2016-12-09 09:01:06 +00:00
jenkins-bot
2fdcd7bfdd Merge "Add ParserFetchTemplate hook" 2016-11-15 23:36:18 +00:00
Bartosz Dziewoński
0e15a6068a Revert "Move NewPP limit report HTML comments to JS variables" and followups
This change resulted in unreasonable feature loss (human-readable
limit report was gone). Three months and multiple followups later,
the functionality is still not completely restored. Given lack
of response from the original author, I think it is time to revert
and reconsider, especially since the 1.28 release is soon.

A machine-readable limit report would be a very useful feature,
but not at the cost of losing the human-readable limit report.

This reverts the following commits:

* Move NewPP limit report HTML comments to JS variables
  b7c4c8717f
* Only pretty-print the parser report JS vars
  28adc4d7ee
* Show wgPageParseReport on page previews too
  1255654ed5
* Re-add human readable parser limit report
  0051f108b9
* Restore hooks.txt for ParserLimitReportFormat
  4663e7a737

Resolved minor merge conflicts in OutputPage (with 80e5b160)
and release notes.

Bug: T110763
Bug: T142210
Change-Id: Id88c8066fae3f369e8977b4b7488f67071bdeeb7
2016-11-08 22:35:15 +01:00
jenkins-bot
69ae945e8d Merge "Update weblinks in comments from HTTP to HTTPS" 2016-11-08 21:32:00 +00:00
Fomafix
202f695f67 Update weblinks in comments from HTTP to HTTPS
Use HTTPS instead of HTTP where the HTTP link is a redirect to the HTTPS link.

Also update some defective links.

Change-Id: Ic3a5eac910d098ed5c2a21e9f47c9b6ee06b2643
2016-11-07 15:24:46 +01:00
Kunal Mehta
0ca1563644 Add tracking categories when magic links are used
This adds 3 tracking categories, one for each type of magic link (ISBN,
RFC, PMID). This will allow wikis to gauge usage and identify pages that
need migrating.

These will only show up if the respective magic links are enabled via
$wgEnableMagicLinks.

Change-Id: Ic483f0c493112bf6373e1b37961e1241c20c3582
2016-11-04 02:20:05 +00:00
Brad Jorsch
b33a5c013e Don't parse <nowiki><span class="error"></nowiki>
Which means we can't check if a parser limit was exceeded while trying
to expand the content of a tag, but that's probably not a huge loss.
It'll just result in potentially strange output rather than an exception.

Bug: T149622
Change-Id: I7910dfa0f61b1cc9168c7ed1498b2bda27c47f0e
2016-10-31 20:10:36 -07:00
Kunal Mehta
0d4e0135c4 Remove tracking category stuff that accidentally slipped into 61adc1e14
Bug: T149310
Change-Id: I0a3725a72b1467c57280ae1880935dd5fa54ae9e
2016-10-27 17:59:10 +00:00
Kunal Mehta
61adc1e146 Use namespaced ScopedCallback
The un-namespaced \ScopedCallback is deprecated.

Change-Id: Ie014d5a775ead66335a24acac9d339915884d1a4
2016-10-17 15:46:05 -07:00
Brad Jorsch
d86056fee6 Avoid blowing up inside Parser::extensionSubstitution() when PP limits are exceeded
The most critical one is if the marker name is bad, because that causes
StripState to throw an exception since I798d31af. But we may as well
check the other expand calls in this function too to avoid outputting
broken wikitext.

Bug: T136401
Change-Id: I1cb353d74f9a46168055e1abeb22cf569fe9354a
2016-10-13 13:06:47 -04:00
jenkins-bot
2d63ce056d Merge "Parser: Allow <s> and <strike> in table of contents" 2016-09-28 21:31:33 +00:00
Bartosz Dziewoński
c60e85c4dc Parser: Allow <s> and <strike> in table of contents
Bug: T35715
Change-Id: Iec6a05e3e6bb622f477e6ebeb57e9f65da5f22bd
2016-09-28 21:12:01 +00:00
Max Semenik
068e0e6ca0 Remove/actualize unused imports
Change-Id: I6ef19d5d982aa45dbf5554107ad9ee720442f466
2016-09-26 17:03:26 -07:00
Legoktm
3172dfe21e Revert "Move wfEscapeWikiText() to Parser::escapeWikitext()"
Apparently it is possible for Parser::mParserOptions 
to not be set in some cases. I'll try again later.

This reverts commit bda74bff6e.

Bug: T146433
Change-Id: Idb6d1b20995d5f86b712abb386ab987356c4f560
2016-09-23 00:29:21 +00:00
Kunal Mehta
bda74bff6e Move wfEscapeWikiText() to Parser::escapeWikitext()
wfEscapeWikiText() used $wgEnableMagicLinks, but that could result in an
inconsistency when something modifies the magic link related
ParserOptions.

In general, most uses of wfEscapeWikiText() are in parser functions or
when message parsing, so the Parser is a logical place for it.

A future patch will make it easy to use Parser::escapeWikitext() in
message parameters.

Change-Id: I0fd4d5c135541971b1384a20328f1302b03d715f
2016-09-13 22:34:24 -07:00
Kunal Mehta
78debba3aa Parser: Allow disabling magic link functionality
The magic link functionality is "old backwards-compatibility baggage"
that we probably want to get rid of eventually. The first step to doing
so would be making it configurable and allowing it to be turned off on
wikis that don't use it.

This adds each of the 3 magic link types as individual parser options,
which can be controlled by the $wgEnableMagicLinks setting.

Additionally, wfEscapeWikiText() was updated to only escape enabled
magic link types.

Bug: T47942
Change-Id: If63965f31d17da4b864510146e0018da1cae188c
2016-09-12 22:00:05 -07:00
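A hedged sketch of how the per-type switch described above might be set in LocalSettings.php; the array keys simply mirror the three magic link types named in the commit message and are an assumption about the setting's shape.

```php
// LocalSettings.php -- illustrative sketch; key names are assumed from the
// three magic link types named above.
$wgEnableMagicLinks = [
	'ISBN' => false, // bare "ISBN 0-306-40615-2" no longer auto-links
	'RFC'  => false, // bare "RFC 2324" no longer auto-links
	'PMID' => true,  // keep PMID magic links enabled
];
```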
Tim Starling
df29a359f8 Renames preparatory to parser tests refactor
Since in several cases, with an all-in-one commit, git's file rename
detection failed, I split the renames out into their own commit to
make review easier. Some changes here won't make complete sense without
the following commit.

* Moved TestsAutoLoader to tests/common/. It will be joined by a friend.
* Renamed ParserTest to ParserTestRunner, since the former name was
  overly generic.
* Renamed TestFileIterator to TestFileReader. Please see the subsequent
  commit for rationale.
* Moved parserTests.php to tests/parser/. It was the only file left in
  tests/, and it should have been moved to tests/parser years ago,
  analogous to phpunit.php.
* Renamed NewParserTest to ParserIntegrationTest. This was a tricky one,
  apparently the name has to end in "Test" or else the structure test
  will fail. Analogous to ParserMethodsTest etc. Rationale: because it's
  not new anymore.
* Renamed MediaWikiParserTest to ParserTestTopLevelSuite and moved it to
  the suites directory. A more descriptive name. Being in suites/
  shields it from StructureTests, and is correct anyway.

Change-Id: Iddc6eaf815fdd64b3addb8570b4b6303ab99d634
2016-09-12 15:46:15 +10:00
Aaron Schulz
950cf6016c Rename DB_SLAVE constant to DB_REPLICA
This is more consistent with LoadBalancer, more modern, and inclusive
of master/master MySQL, NDB Cluster, and MariaDB Galera Cluster.

The old constant is an alias now.

Change-Id: I0b37299ecb439cc446ffbe8c341365d1eef45849
2016-09-05 22:55:53 -07:00
Aaron Schulz
d957cb7347 Cache revision lookups done by Parser
Inverse flame graphs show revision lookups as one of the
big three queries (Revision, LinkCache, getTitleInfo of
ResourceLoaderWikiModule).

This works via a new Revision::newKnownCurrent() method, which
needs both the page and rev ID from the DB (to avoid invalidation)
and fetches the user name and rev_deleted if needed (again
to avoid invalidation). Parser does not care about those fields
anyway in the template path.

Also improved cross-wiki support a bit, and fixed up some
docs and IDEA errors.

Change-Id: Icad602dba5de18c7758b77fd23b0a450ff21d09f
2016-09-05 02:22:51 +00:00
Aaron Schulz
97f004694c Adapt the ParserOutput cache TTL when including special pages
For simple pages that transclude special pages, like user pages
including Special:PrefixIndex, the TTL is allowed to drop to 15
seconds if the page parses fast enough.

Bug: T139893
Change-Id: If41885ded648d68352fe3d06336d98aa0ab53966
2016-08-31 17:17:38 +00:00
Glaisher
8c5aa2d645 Add ParserFetchTemplate hook
This allows extensions to add custom content for transclusion
text if they want.

Bug: T47096
Change-Id: I0de1c96bb968a99a2c81a9977655780a78988a20
2016-08-27 21:56:47 +05:00
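A rough sketch of how an extension might register for the new hook; the handler parameter list below is an assumption based on the commit description, not copied from hooks.txt.

```php
// Illustrative sketch only: the hook parameters are assumed, not verified.
$wgHooks['ParserFetchTemplate'][] = function ( $parser, $title, $rev, &$text, &$deps ) {
	// Supply replacement transclusion text for a hypothetical namespace.
	if ( $title && $title->getNamespace() === NS_PROJECT ) {
		$text = '<!-- transclusion text supplied by a hypothetical extension -->';
	}
	return true; // let normal processing continue
};
```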
Amir Sarabadani
6b221fa96a Clean up array() syntax in docs, part IV
Change-Id: If626409a93d31bf90c054c9bf7ba44a78ea9a621
2016-08-26 16:06:58 +04:30
Kunal Mehta
85034abca5 content: Refactor normalization of line endings code
The code that normalizes line endings ("\r\n" and "\r" to "\n") and
trims trailing whitespace is buried in Parser::preSaveTransform(), and
was duplicated to TextContent in 96b6afb31d, as non-wikitext content
models should still be normalizing line endings.

This splits the duplicated code into
TextContent::normalizeLineEndings() and utilizes it in the Parser.
Additionally, expand the documentation of
TextContent::preSaveTransform() to document that subclasses should make
sure they normalize line endings during the PST stage.

And remove a useless rtrim() call from WikitextContent that did nothing.

Change-Id: I9094c671d4bbd23d75436f8f1d682d6dd6e6d2fc
2016-08-23 11:09:59 -07:00
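A small behavioural stand-in for the normalization described above (convert "\r\n" and "\r" to "\n", then trim trailing whitespace); it illustrates the behaviour and is not the actual TextContent implementation.

```php
// Behavioural stand-in for the normalization described above; not the
// actual TextContent::normalizeLineEndings() code.
function normalizeLineEndingsSketch( $text ) {
	$text = str_replace( [ "\r\n", "\r" ], "\n", $text );
	return rtrim( $text );
}

var_dump( normalizeLineEndingsSketch( "foo\r\nbar\r" ) ); // string(7) "foo\nbar"
```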
Brian Wolff
e2a6fe5711 SECURITY: XSS in unclosed internal links
rawurldecode was being run on unclosed internal links
which could allow an attacker to insert arbitrary
html into the page.

See also related: r13302

Bug: T137264
Change-Id: I4e112a9e918df9fe78b62c311939239b483a21f5
2016-08-23 03:39:36 +00:00
Brad Jorsch
96b6afb31d TextContent: Normalize newlines in preSaveTransform()
This does the same normalization of newlines that
Parser::preSaveTransform() does. This should be appropriate for any text
content type, especially considering that EditPage uses
WebRequest::getText() which does a less-strict version of this same
transformation.

This also cleans up the code for doing that newline replacement
to be a bit less verbose.

Bug: T142805
Change-Id: I462afcda502f031a8b0360d982ce2398a0383a96
2016-08-16 10:21:32 -04:00
Florian
794bb8bb25 Fix comment of get/setLinkRenderer in doxygen
Doxygen requires the fully qualified name of the class in a comment
or in the @param/@return annotation, otherwise the class isn't linked
in the resulting output[1]. This commit changes the LinkRenderer
annotations in SpecialPage and Parser to \MediaWiki\Linker\LinkRenderer.

[1] https://doc.wikimedia.org/mediawiki-core/master/php/classSpecialPage.html#a3560214f63fc2f20c63b4025db5cd81d

Change-Id: I74cedcd764a6053cc5a0c6d2eedbedb72651f57c
2016-08-09 17:23:00 +02:00
Tim Starling
134f8c4513 Don't run the non-Tidy "bug 2702" hack unless Tidy is really missing
We have two hacks which are used when Tidy is not available: one in
Sanitizer::removeHTMLtags(), and the second here as a late Parser pass
equivalent to Tidy itself. But the Sanitizer one was enabled only if
MWTidy::isEnabled() returned false, whereas the Parser one was enabled
also when tidy was disabled in ParserOptions. This patch makes them both
consistent: it enables the bug 2702 hack only when MWTidy::isEnabled()
returns false, and when Tidy is disabled in parser options, the output
is simply passed through.

This allows tidying to be done separately on the ParserOutput, as is
required by the proposed ParserMigration extension (I24d0776a933fa3f).

Eventually the bug 2702 hack will be removed in favour of a pure-PHP
HTML 5 parser, but it looks like it is too early for that.

Change-Id: I94be6c9dec531c23ef80cb36732243bd6858bf22
2016-07-27 14:47:36 +10:00
Aaron Schulz
b7c4c8717f Move NewPP limit report HTML comments to JS variables
* Instead of having messy code to create a hidden HTML
  comment of English strings at the bottom of the page,
  expose the structured data of the parse information
  to JS so tools can use it.
* Make makeConfigSetScript() use pretty output so these
  variables are also easy to read in "view source".
* Remove ParserLimitReportFormat hook, since the data
  is not formatted to HTML anymore.

Bug: T110763
Change-Id: I2783c46c6d80f828f9ecf5e71fc8f35910454582
2016-07-26 11:31:20 -07:00
Tim Starling
d3d682fb45 Hide marked empty elements by default (stage 1)
We originally imagined rolling out the display of empty elements
simultaneously with the Html5Depurate, but now we have added support for
marking empty elements to Html5Depurate and plan on having some sort of
longer migration period. So, move the relevant CSS to content.css, and
remove the concept of CSS dependent on the tidy driver.

Add a body class which will allow the effect to be toggled in a gadget or
extension. Actual toggling in the CSS will be in the stage 2 patch, to be
deployed after the varnish cache and parser cache have expired.

I originally imagined that there would be a gadget that overrides the
rule with an !important selector, but that method does not allow you to
recover the original display property, which is often overridden by the
style attribute or site CSS to be "inline".

Also, in RaggettWrapper, switch to the new class mw-empty-elt, following
Html5Depurate, instead of mw-empty-li. The old class will be removed in
the stage 2 patch.

Change-Id: Ic0f432c43a006629ca5a1a7c2dda3552ceb4dc4f
2016-07-14 14:24:27 -07:00
C. Scott Ananian
6cdae80513 Add tracking category when editors use the deprecated self-closed tag hack.
Some pages use constructs like `<b/>` or `<span/>` to protect spaces or
special characters at the beginning/end of templates.  This syntax is
incompatible with HTML5 parsing rules, which dictate that these should
be treated as open tags, and instead rely on an unusual quirk of the
`tidy` program that removes invalid constructs.

This syntax is deprecated as part of the process of reconciling `tidy`
with modern HTML5 parsing semantics.  Authors can use `&#32;` or `<nowiki/>`
as valid replacements.

In order to provide time to transition existing content, pages using
self-closing tags in violation of the HTML5 parsing specification
will have their templates/pages added to a new tracking category.
After these uses are fixed, we will change the sanitizer to treat these
as normal open tags, to be consistent with the HTML5 parsing spec.

Note that this construct is already disallowed if tidy is disabled; it
is rendered as `&lt;b/>`.  We add a tracking category in the no-tidy
case as well, in preparation for eventually making the no-tidy and
with-tidy behaviors consistent.

Bug: T134423
Change-Id: Ie1cf3aa40d5483bf395ece539f0240b694ff04ab
2016-07-12 14:18:04 +10:00
Aaron Schulz
005b4d6fff Try to predict the rev_id when preparing edits
During both the edit stash and the first parse on page save,
guess what the rev_id will be and use that instead of null.
Only reparse if it turns out to be wrong. This avoids extra
parsing on wikis that have low-medium traffic, and does not
cost much. The parsing that can be avoided is:
a) in doEditContent() by using the stash
b) in doEditUpdates() by using the doEditContent() result,
   whether that was able to use the stash or not itself

Also improved the parse operation logging in save paths.

Bug: T137900
Change-Id: Ic6faae70a78b4e223e4d3585cefd482c0fa00677
2016-06-29 05:39:33 -07:00
Kunal Mehta
fb04b0ce28 Parser: Use LinkRenderer for building ISBN magic links
Instead of manually building the <a> tag, use LinkRenderer to create it.

Change-Id: Iaefe85527307a8399e9f52dde58fb2c24c4753c2
2016-06-23 14:11:24 +00:00
Kunal Mehta
8a6326c211 Add SpecialPage::getLinkRenderer()
And SpecialPage::setLinkRenderer(), so the Parser can pass on its
LinkRenderer instance for when special pages are being included in a
page.

Change-Id: If9a9c648ab670b824ce534e7cf0d20d41e1bfd12
2016-06-22 23:32:00 +02:00
Jackmcbarn
d05dde4329 Render bad images in wikitext as links
In galleries, bad images are rendered as links. This causes the same behavior
to occur in wikitext, rather than the current behavior of not rendering
anything.

Change-Id: I1a074bff7cb661b5b4e6db9503eb6a5de702ee2f
2016-06-19 03:24:57 +00:00
Aaron Schulz
7d42e96748 Deprecate Parser::disableCache
Few maintained extensions still rely on this, and it is
bad practice to use it for handling cache correctness.

Change-Id: I2de481198bbff5c4f3dd81fc6d1b137e4c37b93f
2016-06-18 19:55:43 +00:00
Brian Wolff
7730dee63b Make transcluded special pages not disable cache in miser mode.
Previously {{Special:Foo}} would cause parser cache to be disabled,
now have a method in SpecialPage to control this behaviour and set
arbitrary caching times.

Note: This does not affect caching of direct views to the special page

The new default is now disabling cache if not in miser mode,
otherwise setting to 1 hour, except for Special:Recentchanges
and Special:Newpages, which are set to 5 minutes. These values are
possibly really low, but for now I think best to be close to the
old behaviour. We had 0 caching for these things for years, and
afaik it hasn't caused any big issues. Part of me wonders if
Special:Recentchanges should stay at 0, but that sounds crazy.

This change also causes transcluded special pages to not be
"per-user" if they are being cached (Specificly $wgUser et al
become 127.0.0.1).

Bug: 60561
Change-Id: Id9ce987adeaa69d886eb1c5cd74c01072583e84d
2016-06-14 20:46:32 -07:00
Aaron Schulz
879ebfb18a Use a low TTL for parser output when special pages are included
Previously, no TTL at all was used, which is quite harsh on
performance and had downstream effects like disabling edit
stashing for affected pages.

Bug: T136678
Change-Id: I2462057aa189cfb05fe65d0b3c081a9fd10066a2
2016-06-14 17:48:04 -07:00
Aaron Schulz
147f79eedd Improvements to {{REVISIONUSER}} handling
* Do not change the result to a null editing user anymore.
* Use a new vary-user flag instead of vary-revision. This
  will only cause a reparse on null edits. Normal edits
  can still use the prepared output now.
* Edit stashing now applies for pages with this magic word.
* Fixed bug where the second prepareContentForEdit() call
  (due to vary-X flags) would still check the edit stash.

Bug: T135261
Bug: T136678
Change-Id: Id1733443ac3bf053ca61e5ae25db3fbf4499e9f9
2016-06-14 19:28:09 +00:00
Timo Tijhof
8eca3b5027 parser: Remove redundant comment about revisionsize cache vary
Follows-up 457431b.

Change-Id: Iac3e4d6c11de3737155e7f7ff35ec7a6a3873865
2016-06-14 01:26:37 +02:00
Aaron Schulz
457431b57b Avoid setting vary-revision for {{REVISIONSIZE}}
Just always use the input size for new revisions. If they are
saved, then that should be the revision size. If they are just
null edits, then the size must have matched the current revision.

This also enables edit stashing for this case.

Change-Id: I428c0cc87750eeddd1d7dcebd1a2b03817cec441
2016-06-13 23:00:05 +00:00
Kunal Mehta
d671429e41 Parser: Pass Title onto Linker::makeExternalLink()
Otherwise $wgNoFollowNsExceptions functionality won't work.

Change-Id: I2e1c5ad41f94568bff7f24a400d555b604cfe22e
2016-05-31 22:47:51 -07:00
Kunal Mehta
b07eb85267 Make $url parameter to Parser::getExternalLinkAttribs() required
All callers in Gerrit pass $url in.

Change-Id: I36246f6510db414dcc7023f8779796c060c3eba5
2016-05-31 21:25:18 -07:00