Commit graph

141 commits

Author SHA1 Message Date
Fomafix
43244db9a2 Use PHP 7 '??' operator instead of if-then-else
Change-Id: If9d4be5d88c8927f63cbb84dfc8181baf62ea3eb
2018-10-21 21:46:46 +02:00
Bartosz Dziewoński
485f66f174 Use PHP 7 '??' operator instead of '?:' with 'isset()' where convenient
Find: /isset\(\s*([^()]+?)\s*\)\s*\?\s*\1\s*:\s*/
Replace with: '\1 ?? '

(Everywhere except includes/PHPVersionCheck.php)
(Then, manually fix some line length and indentation issues)

Then manually reviewed the replacements for cases where confusing
operator precedence would result in incorrect results
(fixing those in I478db046a1cc162c6767003ce45c9b56270f3372).

Change-Id: I33b421c8cb11cdd4ce896488c9ff5313f03a38cf
2018-05-30 18:06:13 -07:00
Kunal Mehta
f093fb4192 Don't globally disable PHPCS's prohibition of assert()
Whitelist the remaining usages of assert(), and reinstate the PHPCS sniff
that forbids usage of it. Add FIXME comments as well, so any casual readers
of the code will not think that the disabling and usage is intentional.

Change-Id: I7cabe715c0e6aa6a9ef3ffe5657f3de7fd8e662b
2018-05-07 17:15:16 +00:00
Arlo Breault
f27c50b580 Clarify -{ => {{ transition
Ensure we have the correct rule on the stack.

Change-Id: Ie814df7b759a2381be0b815eeefdb5d1f7adcde0
2018-03-15 13:04:59 -04:00
Arlo Breault
7c450dff64 Remove "dash" case in preprocessToObj
This was introduced in 2877402 and removed in 186a182

Change-Id: Ibfa1ae1597bfc50ae6ea49402c7966ca042f12e5
2018-03-09 17:46:53 -05:00
Reedy
39f0f919c5 Update suppressWarning()/restoreWarning() calls
Bug: T182273
Change-Id: I9e1b628fe5949ca54258424c2e45b2fb6d491d0f
2018-02-10 08:50:12 +00:00
Umherirrender
3124a990a2 Use ::class to resolve class names in includes files
This helps to find renamed or misspelled classes earlier.
Phan will check the class names

Change-Id: I07a925c2a9404b0865e8a8703864ded9d14aa769
2018-01-27 20:34:29 +01:00
Umherirrender
255d76f2a1 build: Updating mediawiki/mediawiki-codesniffer to 15.0.0
Clean up use of @codingStandardsIgnore
- @codingStandardsIgnoreFile -> phpcs:ignoreFile
- @codingStandardsIgnoreLine -> phpcs:ignore
- @codingStandardsIgnoreStart -> phpcs:disable
- @codingStandardsIgnoreEnd -> phpcs:enable

For phpcs:disable always the necessary sniffs are provided.
Some start/end pairs are changed to line ignore

Change-Id: I92ef235849bcc349c69e53504e664a155dd162c8
2018-01-01 14:10:16 +01:00
Umherirrender
3d560be428 Change php extract() to explicit code
Avoid php magic and make var settings more visible

Change-Id: I223874fd871104b0ac6a80d7f39c6dd997d0551d
2017-12-08 14:46:33 +01:00
WMDE-Fisch
6df9ed1ad6 update mediawiki-codesniffer to 0.11.0 and fix issues
- mostly auto fixes
- some too long lines fixed
- ignore amp space in one case  passing by reference

Change-Id: I6472f83bc3cbf4bd629d83050cc3319b19ec465c
2017-08-11 22:27:51 +02:00
Umherirrender
b5cddfb27b Remove empty lines at begin of function, if, foreach, switch
Organize phpcs.xml a bit

Change-Id: Ifb767729b481b4b686e6d6444cf48b1f580cc478
2017-07-01 11:34:16 +00:00
C. Scott Ananian
186a182a15 Protect language converter markup in the preprocessor (take 2).
This revises 2877402276, which was
reverted in master due to unexpected issues with `-{{...}} ` markup
on translatewiki and enwiki.  Test cases are added to ensure that this
is parsed as a template, not as language converter markup.

https://www.mediawiki.org/wiki/Preprocessor_ABNF is the canonical
documentation for the preprocessor; this will be updated after this
patch is merged.  The basic principles described in that page are
maintained in this patch:

* Rightmost opening structure has precedence: `-{{` is parsed as a
dash followed by template opening.

* `{{{` has precedence over `{{` and `-{`: `-{{{{` is parsed as
`-{` `{{{` since we first grab the rightmost `{{{`.

A bunch of test cases were added to verify the "ideal precedence"
order described on that wiki page.

This patch introduced some minor incompatibilities in existing
markup, in particular with chemical formulae in templates.
Fixes for these are being tracked at
https://www.mediawiki.org/wiki/Parsoid/Language_conversion/Preprocessor_fixups

Bug: T146304
Bug: T153761
Change-Id: I2f0c186c75e392c95e1a3d89266cae2586349150
2017-05-23 15:43:49 +01:00
James D. Forrester
9635dda73a includes: Replace implicit Bugzilla bug numbers with Phab ones
It's unreasonable to expect newbies to know that "bug 12345" means "Task T14345"
except where it doesn't, so let's just standardise on the real numbers.

Change-Id: I6f59febaf8fc96e80f8cfc11f4356283f461142a
2017-02-21 18:13:24 +00:00
C. Scott Ananian
046c463635 Revert "Protect language converter markup in the preprocessor."
This effectively reverts commit 2877402276 in
order to unblock the deploy train.  The underlying behavior might not be
incorrect, but it was unexpected.

Bug: T153761
Change-Id: Ifc9c7cf3482dd5d222ff4da24a6d4cc401e9d965
2017-01-03 17:23:28 -05:00
C. Scott Ananian
2877402276 Protect language converter markup in the preprocessor.
This ensures that `{{echo|-{R|foo}-}}` is parsed correctly as
a template invocation with a single argument, not as two separate
arguments split by the `|`.

Bug: T146304
Change-Id: I709d007c70a3fd19264790055042c615999b2f67
2016-12-15 23:50:44 +00:00
Tim Starling
6a04d86149 Remove all assert() calls with string parameters
These fail when HHVM is in RepoAuthoritative mode

Change-Id: Ifb1628f8269b2b651154b740b95cc14163a1b186
2016-08-15 23:11:18 +00:00
Bartosz Dziewoński
674e8388cb Preprocessor: Don't allow unclosed extension tags (matching until end of input)
(Previously done in f51d0d9a81 and
reverted in 543f46e9c08e0ff8c5e8b4e917fcc045730ef1bc.)

I think it's saner to treat this as invalid syntax, and output the
mismatched tag code verbatim. The current behavior is particularly
annoying for <ref> tags, which often swallow everything afterwards.

This does not affect HTML tags, though. Assuming Tidy is enabled, they
are still auto-closed at the end of the page content. (For tags that
"shadow" a HTML tag name, this results in the tag being treated as a
HTML tag. This currently only affects <pre> tags: if unclosed, they
are still displayed as preformatted text, but without suppressing
wikitext formatting.)

It also does not affect <includeonly>, <noinclude> and <onlyinclude>
tags. Changing this behavior now would be too disruptive to existing
content, and is the reason why previous attempt was reverted. (They
are already special-cased enough that this isn't too weird, for example
mismatched closing tags are hidden.)

Related to T17712 and T58306. I think this brings the PHP parser closer
to Parsoid's interpretation.

It reduces performance somewhat in the worst case, though. Testing with
https://phabricator.wikimedia.org/F3245989 (a 1 MB page starting with
3000 opening tags of 15 different types), parsing time rises from
~0.2 seconds to ~1.1 seconds on my setup. We go from O(N) to O(kN),
where N is bytes of input and k is the number of types of tags present
on the page. Maximum k shouldn't exceed 30 or so in reasonable setups
(depends on installed extensions, it's 20 on English Wikipedia).

Change-Id: Ide8b034e464eefb1b7c9e2a48ed06e21a7f8d434
2016-04-05 12:28:10 -07:00
Thiemo Mättig
6906de45c1 Fix @param and @return types on all PPFrame::getArgument methods
This is about template parameters. They can be indexed by position (int) or
name (string). The returned value is always a string, or false (bool) on
failure.

Change-Id: I565210ad485505281246ef2bb3086a675b905976
2016-03-29 06:12:18 +00:00
Kunal Mehta
6e9b4f0e9c Convert all array() syntax to []
Per wikitech-l consensus:
 https://lists.wikimedia.org/pipermail/wikitech-l/2016-February/084821.html

Notes:
* Disabled CallTimePassByReference due to false positives (T127163)

Change-Id: I2c8ce713ce6600a0bb7bf67537c87044c7a45c4b
2016-02-17 01:33:00 -08:00
Legoktm
543f46e9c0 Revert "Preprocessor: Don't allow unclosed extension tags (matching until end of input)"
This reverts commit f51d0d9a81.

Breaks templates with non-closed </noinclude> tags, which
were previously acceptable.

Bug: T125754
Change-Id: I8bafb15eefac4e1d3e727c1c84782636d8b82c2b
2016-02-04 00:38:35 +00:00
Bartosz Dziewoński
f51d0d9a81 Preprocessor: Don't allow unclosed extension tags (matching until end of input)
I think it's saner to treat this as invalid syntax, and output the
mismatched tag code verbatim. The current behavior is particularly
annoying for <ref> tags, which often swallow everything afterwards.

This does not affect HTML tags, though. Assuming Tidy is enabled, they
are still auto-closed at the end of the page content.

Related to T17712 and T58306. I think this brings the PHP parser closer
to Parsoid's interpretation.

It reduces performance somewhat in the worst case, though. Testing with
https://phabricator.wikimedia.org/F3245989 (a 1 MB page starting with
3000 opening tags of 15 different types), parsing time rises from
~0.2 seconds to ~1.1 seconds on my setup. We go from O(N) to O(kN),
where N is bytes of input and k is the number of types of tags present
on the page. Maximum k shouldn't exceed 30 or so in reasonable setups
(depends on installed extensions, it's 20 on English Wikipedia).

To consider:
* Should we keep previous behavior for unclosed <includeonly> /
  <noinclude>? This would be particularly disruptive for these if
  someone relied on the old behavior, and they're already
  special-cased in places.
* Unclosed <pre> tags are now treated as HTML tags, and are still
  displayed as preformatted text, but without suppressing wikitext
  formatting.

Change-Id: Ia2f24dbfb3567c4b0778761585e6c0303d11ddd0
2016-01-21 04:22:34 +00:00
Ori Livneh
447f40d2e9 Move brace matching rules to Preprocessor class
Instead of declaring the array of rules within both Preprocessor_DOM:: and
Preprocessor_Hash::preprocessToXml(), declare it as a protected property of the
parent Preprocessor class.

Change-Id: I6193de66566c164fe85cdd6a88c04fa9c565f1a9
2015-11-03 02:57:05 +00:00
Ori Livneh
1559be9f7a Consolidate common Preprocessor caching code
* Consolidate nearly-identical caching code in Preprocessor_DOM and
  Preprocessor_Hash by making Preprocessor an abstract class rather than an
  interface and by implementing Preprocessor::cacheSetTree() and
  Preprocessor::cacheGetTree().
* Cache trees for wikitext blobs that have length equal or greater to
  PreprocessorCacheThreshold. Previously they needed to be greater than
  PreprocessorCacheThreshold, so this changes the requirement by one character.
  I did it because it seems more natural.
* Modernize the code to use singleton service objects rather than globals.

We spend a lot of time in the Preprocessor, so it would be nice for this code
to be well-factored and clear.

Change-Id: Ib71c29f14a28445a505e12c774a24ad964330b95
2015-10-25 23:06:48 +00:00
umherirrender
c5ab19bf31 Use line comments for @codingStandardsIgnoreStart
In Preprocess_DOM.php and Preprocess_Hash.php the
@codingStandardsIgnoreStart is inside a doc comment, but phpcs does not
see this tag and does not ignore the error. Using line comments fix this
problems.

See
https://integration.wikimedia.org/ci/job/mediawiki-core-phpcs/842/console

Change-Id: Id0edf6edb2902466748165c2e820d2cf4b7fcf75
2015-10-07 20:15:31 +02:00
umherirrender
be0d31ace0 Use variable documentation in Preprocessor_DOM.php
Instead of having comments behind variable declaration.

This also avoids mixed tabs and spaces at begin of line

Change-Id: Iba62430f4413fd52bac1d51f5c5df4cb6479284d
2015-09-28 19:48:33 +02:00
Vivek Ghaisas
c54766586a Fix issues identified by SpaceBeforeSingleLineComment sniff
Change-Id: I048ccb1fa260e4b7152ca5f09b053defdd72d8f9
2015-09-26 23:06:52 +00:00
Kevin Israel
20cbd0f226 Make PPFrame::RECOVER_COMMENTS actually work
Because of a missing condition, it generally only had an effect on
output type Parser::OT_WIKI, and thus {{msgnw:}} would strip comments
except when substituted during a pre-save transform.

Bug: T98841
Change-Id: I1e47696434fe87475f9902e6bfb8990566456e2f
2015-08-15 04:12:23 -04:00
umherirrender
70f3afd548 Remove unneeded empty lines at begin of if/else/foreach body
An if body must not begin with an empty line

Change-Id: I62b058be337fcc85a120fcd3dadce564db59a271
2015-06-19 20:05:45 +02:00
Kunal Mehta
f6e5079a69 Use mediawiki/at-ease library for suppressing warnings
wfSuppressWarnings() and wfRestoreWarnings() were split out into a
separate library. All usages in core were replaced with the new
functions, and the wf* global functions are marked as deprecated.

Additionally, some uses of @ were replaced due to composer's autoloader
being loaded even earlier.

Ie1234f8c12693408de9b94bf6f84480a90bd4f8e adds the library to
mediawiki/vendor.

Bug: T100923
Change-Id: I5c35079a0a656180852be0ae6b1262d40f6534c4
2015-06-11 18:49:29 +00:00
Ori Livneh
12571bde26 Use a fixed marker prefix string in the Parser and MWTidy
Generating one-time, unique strip markers hurts us in multiple ways:

* The strip marker regexes don't benefit from JIT compilation, so they are
  slower to execute than they could be.
* Although the regexes don't benefit from JIT compilation, they are still
  compiled, because HHVM bets on regexes getting reused. This extra work is
  fairly costly (1-2% of CPU usage on the app servers) and doesn't pay off.
* The size of the PCRE JIT cache is finite, and the caching of one-off regexes
  displaces from the cache regexes which are in fact reused.

Tim's preferred solution (per his review comment on
https://gerrit.wikimedia.org/r/167530/) is to use fixed strip markers.
So:

* Replace usage of $parser->mUniqPrefix with Parser::MARKER_PREFIX, which
  complements the existing Parser::MARKER_SUFFIX.
* Deprecate Parser::mUniqPrefix and its accessor, Parser::uniqPrefix().
* Deprecate Parser::getRandomString(), since it is no longer useful.
* In Preprocessor_*:preprocessToObj() and Parser::fetchTemplateAndTitle,
  replace any occurences of \x7f with '?', to prevent strip marker forgery.
  \x7f is not valid input anyway.
* Deprecate the $prefix parameter for StripState::__construct, since a custom
  prefix may no longer be specified.

Change-Id: I31d4556bbb07acb72c33fda335fa5a230379a03f
2015-05-31 19:33:36 -07:00
Jackmcbarn
9a805b816d Warn when duplicate arguments are found
Currently, duplicate arguments result in a categorization but not a
warning, and it's often difficult to find where in the template hierarchy
the problem lies. This causes a warning to be provided containing the
calling page's name, the called template's name, and the parameter's name.

Bug: T85352
Change-Id: I26b9a7ed5a2f246d00a49a5f6effe40b4443a9d0
2015-05-28 13:36:50 -04:00
Kunal Mehta
13975fe76a Use wikimedia/utfnormal library, add backwards-compatability layer
This drops support for the custom utf8 normal PHP extension in favor
of the intl extension.

Bug: T90825
Change-Id: Ifbaeb2ef684217cf6187ccc4fb4d303f89608300
2015-03-24 12:59:26 -07:00
Ricordisamoa
2ae155da52 Fix phpcs errors in includes/
Mostly Squiz.WhiteSpace.SuperfluousWhitespace.EmptyLines

Change-Id: I678b2f0902f11cd1dfa1611b9da24e7237df9122
2015-01-08 20:15:07 +01:00
Aaron Schulz
4ff8136807 Removed remaining profile calls
Change-Id: I31c81c78715048004fc8fca0f27d09c1fa71c118
2015-01-08 02:49:33 -08:00
Chad Horohoe
aa21e125a3 Remove obvious function-level profiling
Xhprof generates this data now. Custom profiling of various
sub-function units are kept.

Calls to profiler represented about 3% of page execution
time on Special:BlankPage (1.5% in/out); after this change
it's down to about 0.98% of page execution time.

Change-Id: Id9a1dc9d8f80bbd52e42226b724a1e1213d07af7
2015-01-07 11:14:24 -08:00
Reedy
4d9143c7f5 Add lots of @throws
Change-Id: I09d0c13070f966fcf23d2638d8fc1328279a5995
2014-12-24 13:49:20 +00:00
Jackmcbarn
05b7a51966 Add a tracking category for duplicate arguments
If a page accidentally duplicates an argument, such as
{{foo|bar=1|bar=2}} or {{foo|bar|1=baz}}, add it to a tracking category.

Bug: 69964
Change-Id: I3b6eeff8b51859bc7af0ea985f6f7528c2e9d220
2014-10-16 04:38:14 +00:00
Brad Jorsch
6a7b192a0a Handle multiple ownerDocuments for args in Preprocessor_DOM
As long as Preprocessor_DOM::newPartNodeArray returns nodes with
different roots when called multiple times, PPFrame_DOM::newChild should
be prepared to receive such.

Bug: 70046
Change-Id: Ie048d8dbd3042f19d934ff0dd8d32b4c46f9f952
2014-08-26 15:39:04 +00:00
umherirrender
6b4c44c2db Add missing @param to function docs
Change-Id: Ib26407bc55dff7969d8a3b1e2ae51751b202d8fb
2014-08-18 16:24:59 +00:00
addshore
99d7ca6bfc Add @codingStandardsIgnore tags to parser classes
Change-Id: I16d19de3d2b2461a68030afd3a79aa59c9e948d4
2014-08-12 01:01:11 +00:00
addshore
61c989cfc0 Fix phpcs issues in parser
This fixes all issues except for:
 - class names
 - line length

Change-Id: Ie91b010d5b3eec49d3b80b6e93b125a901ef43c6
2014-08-12 01:00:15 +00:00
Jackmcbarn
368aa5dc67 Make RECOVER_ORIG preserve extension tags
Add PPFrame::NO_TAGS, set by PPFrame::RECOVER_ORIG, to preserve extension
tags rather than expanding them.

Bug: 22683
Change-Id: I427333a20d32eb711a7b5d5ac8b780ef89c752a1
2014-06-19 18:12:14 +00:00
Jackmcbarn
18d15fa138 Add PPFrame::getTTL() and setTTL()
Add functions to frames to control the TTL of their output, and expose
this via expandtemplates in the API.

Bug: 49803
Change-Id: I412febf3469503bf4839fb1ef4dca098a8c79457
2014-06-09 20:40:22 +00:00
Brad Jorsch
d18ba4e9df Add PPFrame::isVolatile and PPFrame::setVolatile
Most wikitext is safe to parse once and then cache for when that same
wikitext is used again, such as for multiple transclusions of the same
template within a page. There are occasions, though, where some piece of
wikitext has side effects and so should not be cached; a prominent
example of such wikitext is the <ref> and <references> tags in Cite.php.

This change adds PPFrame::setVolatile so parser hooks such as <ref> and
<references> can indicate that they have done something that should not
be cached, and PPFrame::isVolatile so that callers of PPFrame::expand
can know when to avoid caching.

Bug: 46815
Bug: 31834
Change-Id: I95b3cf8781cf047cdb63da221cef45f3e7d1632e
2014-05-30 14:07:06 -04:00
Jackmcbarn
2094e578b4 Restrict empty-frame cache entries to their parent
Remove the parser's global $mTplExpandCache, and replace it with an
alternative that is separated by parent frame. This allows the integrity
of the empty-frame expansion cache to be maintained while also allowing
parent frame access.

A page with 3 copies of 
http://ja.wikipedia.org/wiki/%E4%B8%AD%E5%A4%AE%E7%B7%9A_(%E9%9F%93%E5%9B%BD) 
has the following statistics: Without this change, there are 4625 cache hits
on this page, and a sample of 3 parses took 16.6, 16.9, and 16.8 seconds.
With this change, there are 2588 cache hits, and a sample of 3 parses took
16.7, 16.7, and 17.0 seconds.

Change-Id: I621e9075e0f136ac188a4d2f53418b7cc957408d
2014-05-30 01:38:15 +00:00
jenkins-bot
49952a4050 Merge "Make phpcs-strict pass on includes/ (7/7)" 2014-05-19 19:38:53 +00:00
Ori.livneh
df983f6642 Revert "Declare visibility on class properties of includes/parser/"
See https://bugzilla.wikimedia.org/65375#c4

This reverts commit f359cdf614.

Bug: 65375
Change-Id: I12a60b5cc52a07a6deabcbf47c7c99cd2faac3c3
2014-05-16 00:52:24 +00:00
Siebrand Mazeland
a7fbdd6503 Make phpcs-strict pass on includes/ (7/7)
Change-Id: Ia9baaf0b3cdbe1a3c6b50ef8c4fe86fead88f909
2014-05-15 20:07:09 +02:00
Brad Jorsch
ff78abc1a1 Preprocessor_DOM::newPartNodeArray should check that loadXML succeeded
If something manages to get invalid UTF-8 into
Preprocessor_DOM::newPartNodeArray, or anything else that somehow is
invalid XML, it should handle it in the same way that
Preprocessor_DOM::preprocessToObj does rather than having something
further down the line blow up on a PPNode_DOM with a null node.

Bug: 65081
Change-Id: Ic24db455808106e17d49a11e41df33ec170f1206
2014-05-12 03:44:23 +00:00
Siebrand Mazeland
dfc7416fbe Various documentation updates for includes/parser/
Change-Id: I16dd3a792cc83f8c80b3652d42c055730f6d177a
2014-05-11 18:18:26 +02:00