Because of a missing condition, it generally only had an effect on
output type Parser::OT_WIKI, and thus {{msgnw:}} would strip comments
except when substituted during a pre-save transform.
Bug: T98841
Change-Id: I1e47696434fe87475f9902e6bfb8990566456e2f
wfSuppressWarnings() and wfRestoreWarnings() were split out into a
separate library. All usages in core were replaced with the new
functions, and the wf* global functions are marked as deprecated.
Additionally, some uses of @ were replaced due to composer's autoloader
being loaded even earlier.
Ie1234f8c12693408de9b94bf6f84480a90bd4f8e adds the library to
mediawiki/vendor.
Bug: T100923
Change-Id: I5c35079a0a656180852be0ae6b1262d40f6534c4
Generating one-time, unique strip markers hurts us in multiple ways:
* The strip marker regexes don't benefit from JIT compilation, so they are
slower to execute than they could be.
* Although the regexes don't benefit from JIT compilation, they are still
compiled, because HHVM bets on regexes getting reused. This extra work is
fairly costly (1-2% of CPU usage on the app servers) and doesn't pay off.
* The size of the PCRE JIT cache is finite, and the caching of one-off regexes
displaces from the cache regexes which are in fact reused.
Tim's preferred solution (per his review comment on
https://gerrit.wikimedia.org/r/167530/) is to use fixed strip markers.
So:
* Replace usage of $parser->mUniqPrefix with Parser::MARKER_PREFIX, which
complements the existing Parser::MARKER_SUFFIX.
* Deprecate Parser::mUniqPrefix and its accessor, Parser::uniqPrefix().
* Deprecate Parser::getRandomString(), since it is no longer useful.
* In Preprocessor_*:preprocessToObj() and Parser::fetchTemplateAndTitle,
replace any occurences of \x7f with '?', to prevent strip marker forgery.
\x7f is not valid input anyway.
* Deprecate the $prefix parameter for StripState::__construct, since a custom
prefix may no longer be specified.
Change-Id: I31d4556bbb07acb72c33fda335fa5a230379a03f
Currently, duplicate arguments result in a categorization but not a
warning, and it's often difficult to find where in the template hierarchy
the problem lies. This causes a warning to be provided containing the
calling page's name, the called template's name, and the parameter's name.
Bug: T85352
Change-Id: I26b9a7ed5a2f246d00a49a5f6effe40b4443a9d0
This drops support for the custom utf8 normal PHP extension in favor
of the intl extension.
Bug: T90825
Change-Id: Ifbaeb2ef684217cf6187ccc4fb4d303f89608300
Xhprof generates this data now. Custom profiling of various
sub-function units are kept.
Calls to profiler represented about 3% of page execution
time on Special:BlankPage (1.5% in/out); after this change
it's down to about 0.98% of page execution time.
Change-Id: Id9a1dc9d8f80bbd52e42226b724a1e1213d07af7
If a page accidentally duplicates an argument, such as
{{foo|bar=1|bar=2}} or {{foo|bar|1=baz}}, add it to a tracking category.
Bug: 69964
Change-Id: I3b6eeff8b51859bc7af0ea985f6f7528c2e9d220
As long as Preprocessor_DOM::newPartNodeArray returns nodes with
different roots when called multiple times, PPFrame_DOM::newChild should
be prepared to receive such.
Bug: 70046
Change-Id: Ie048d8dbd3042f19d934ff0dd8d32b4c46f9f952
Add PPFrame::NO_TAGS, set by PPFrame::RECOVER_ORIG, to preserve extension
tags rather than expanding them.
Bug: 22683
Change-Id: I427333a20d32eb711a7b5d5ac8b780ef89c752a1
Add functions to frames to control the TTL of their output, and expose
this via expandtemplates in the API.
Bug: 49803
Change-Id: I412febf3469503bf4839fb1ef4dca098a8c79457
Most wikitext is safe to parse once and then cache for when that same
wikitext is used again, such as for multiple transclusions of the same
template within a page. There are occasions, though, where some piece of
wikitext has side effects and so should not be cached; a prominent
example of such wikitext is the <ref> and <references> tags in Cite.php.
This change adds PPFrame::setVolatile so parser hooks such as <ref> and
<references> can indicate that they have done something that should not
be cached, and PPFrame::isVolatile so that callers of PPFrame::expand
can know when to avoid caching.
Bug: 46815
Bug: 31834
Change-Id: I95b3cf8781cf047cdb63da221cef45f3e7d1632e
Remove the parser's global $mTplExpandCache, and replace it with an
alternative that is separated by parent frame. This allows the integrity
of the empty-frame expansion cache to be maintained while also allowing
parent frame access.
A page with 3 copies of
http://ja.wikipedia.org/wiki/%E4%B8%AD%E5%A4%AE%E7%B7%9A_(%E9%9F%93%E5%9B%BD)
has the following statistics: Without this change, there are 4625 cache hits
on this page, and a sample of 3 parses took 16.6, 16.9, and 16.8 seconds.
With this change, there are 2588 cache hits, and a sample of 3 parses took
16.7, 16.7, and 17.0 seconds.
Change-Id: I621e9075e0f136ac188a4d2f53418b7cc957408d
If something manages to get invalid UTF-8 into
Preprocessor_DOM::newPartNodeArray, or anything else that somehow is
invalid XML, it should handle it in the same way that
Preprocessor_DOM::preprocessToObj does rather than having something
further down the line blow up on a PPNode_DOM with a null node.
Bug: 65081
Change-Id: Ic24db455808106e17d49a11e41df33ec170f1206
Remaining are the classes containing underscores and possibly a few other
issues that will be addressed soonish.
Change-Id: Icf56374c71afc134420ebbcfecf12dcb29dc9564
Swapped some "$var type" to "type $var" or added missing types
before the $var. Changed some other types to match the more common
spelling. Makes beginning of some text in captial.
Also added some missing @param.
Change-Id: I49f8f48b521878de7abd9cc40efdeff6cf9a37e0
The Line continuation Coding conventions prefers the closing parenthesis
on the same line than the beginning curly braces. This is done for ifs
and functions.
Also move some boolean operator from the end of a line to the beginning
and changed some indentation to make the condition hopefully better
readable.
Change-Id: Id0437b06bde86eb5a75bc59eefa19e7edb624426
We originally allowed only spaces around comments. Now allow tabs as
well. This ought to affect very few pages, but it helps predictability
and to maintain consistency between the PHP preprocessor and parsoid.
Change-Id: Icb3ff6eec08aaa83ae332d03c910c13995c9c9ee
After this patch, 'a', 'b', and 'c' are all treated as members of the
same list in the following wikitext:
*a
<!--x-->
*b
<!--x--> <!--y-->
*c
The old comment-removal rule was "trim a comment which is both
preceded and followed by a newline (ignoring spaces)". This only works
if there is a single comment on the line, and was often surprising
to users. The new rule allows any number of whitespace-separated
comments on the line.
Bug: 41756
Change-Id: I6030086226e1eeece59643c29dbb4361668b4bd6
And added/removed spaces around some other tokens,
like +, -, *, /, <, >, =, !
Fixed windows newline style
Change-Id: I0b9c8c408f3f6bfc0d685a074d7ec468fb848fc8
* Removed spaces around array index
* Removed double spaces or added spaces to begin or end of function
calls, method signature, conditions or foreachs
* Added braces to one-line ifs
* Changed multi line conditions to one line conditions
* Realigned some arrays
Change-Id: Ia04d2a99d663b07101013c2d53b3b2e872fd9cc3
Doxygen expects parameter types to come before the
parameter name in @param tags. Used a quick regex
to switch everything around where possible. This
only fixes cases where a primitve variable (or a
primitive followed by other types) is the variable
type. Other cases will need to be fixed manually.
Change-Id: Ic59fd20856eb0489d70f3469a56ebce0efb3db13
Added/removed spaces around logical/arithmetic operator
Reduced multiple empty lines to one empty line
Removed wrong tabs before comments at end of line
Removed too many spaces in assigments
Change-Id: I2bba4e72f9b5f88c53324d7b70e6042f1aad8f6b
Parser tests also included, test case and original patch supplied by
Bergi on bugzilla. Tested against the current version.
Change-Id: Id7ec4e694783dd0f682f65f39d8b9e59f82e58aa
To prevent large template DOM caches from sending servers into swap,
throw an exception when more than some number of DOM elements are
parsed. Unfortunately, it wasn't possible to return a normal error
message, because it broke PST and extractSections and corrupted the
article text. It's safer to refuse to save the edit, and we don't
have decent ways to do that short of throwing an exception.
Ideally we would like to have an upstream patch that hooks libxml to
allocate memory from PHP's request pool, then a fatal error would be
raised instead of swapping.
Change-Id: I4cb4f6fd313e1e0940b56cc5e586afd1bea9267a
The $text is constant and that means, the length of $text is also
constant, store it in a local var is easy than.
Change-Id: I9631b862f40eef7f8b18559ffd474a0037077d18