1296 commits

Author SHA1 Message Date
Kunal Mehta
e4c41d5126 Document that ParserCache::get() may be passed a WikiPage or Article
This is terrible, but at least it is no longer lying.

Change-Id: Id1cc1616b60dbde45a12ce9a23b76282efd1c6a9
2015-06-24 01:21:10 +00:00
Ori Livneh
207dfd2adf Add RejectParserCacheValue hook
Add a new hook, 'RejectParserCacheValue', which allows extensions to reject an
otherwise-successful parser cache lookup. The intent is to allow extensions to
manage the eviction of archaic HTML output from the cache.

Change-Id: I660679a48c46608f859bd52b31d6a888aabcc9ac
2015-06-23 11:23:57 -07:00
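The idea behind this hook can be sketched in a few lines. The following is a minimal Python illustration, not the MediaWiki implementation: the class `SketchParserCache`, the handler signature, and the `html_version` field are all hypothetical, and it assumes that a handler returning `False` turns a successful lookup into a miss.

```python
from typing import Callable, Dict, List, Optional

# Hypothetical handler type: return False to reject a cached value.
RejectHook = Callable[[str, dict], bool]


class SketchParserCache:
    """Dict-backed stand-in for a parser cache (illustration only)."""

    def __init__(self) -> None:
        self._store: Dict[str, dict] = {}
        self.reject_hooks: List[RejectHook] = []

    def save(self, key: str, value: dict) -> None:
        self._store[key] = value

    def get(self, key: str) -> Optional[dict]:
        value = self._store.get(key)
        if value is None:
            return None  # ordinary cache miss
        # The lookup succeeded; give every handler a chance to veto it.
        for hook in self.reject_hooks:
            if hook(key, value) is False:
                return None  # reported as a miss, so the caller re-parses
        return value


def reject_old_markup(key: str, value: dict) -> bool:
    # Example policy (assumed): accept only output with html_version >= 2.
    return value.get("html_version", 0) >= 2


cache = SketchParserCache()
cache.reject_hooks.append(reject_old_markup)
cache.save("Main_Page", {"html": "<p>old</p>", "html_version": 1})
cache.save("Sandbox", {"html": "<p>new</p>", "html_version": 2})
```

With this policy, `cache.get("Main_Page")` behaves like a miss while `cache.get("Sandbox")` returns the stored value, so stale output is evicted lazily as pages are requested.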
umherirrender
70f3afd548 Remove unneeded empty lines at begin of if/else/foreach body
An if body must not begin with an empty line

Change-Id: I62b058be337fcc85a120fcd3dadce564db59a271
2015-06-19 20:05:45 +02:00
Vivek Ghaisas
9f5b6f5aeb Fix whitespace issues around parentheses
Fix issues found by MediaWiki.WhiteSpace.SpaceyParenthesis sniff.

Bug: T102617
Change-Id: Iec7f71e64081659fba373ec20d9d2006306a98f4
2015-06-16 22:14:02 +03:00
Kunal Mehta
f6e5079a69 Use mediawiki/at-ease library for suppressing warnings
wfSuppressWarnings() and wfRestoreWarnings() were split out into a
separate library. All usages in core were replaced with the new
functions, and the wf* global functions are marked as deprecated.

Additionally, some uses of @ were replaced due to composer's autoloader
being loaded even earlier.

Ie1234f8c12693408de9b94bf6f84480a90bd4f8e adds the library to
mediawiki/vendor.

Bug: T100923
Change-Id: I5c35079a0a656180852be0ae6b1262d40f6534c4
2015-06-11 18:49:29 +00:00
Aaron Schulz
8af83f4ff8 Use instanceof in ParserCache::getKey to help IDEs
Change-Id: I772f53ee28ade5da499fe05259a17fed5cc52adb
2015-06-10 14:09:20 -07:00
Aaron Schulz
6b0163391b Avoid parser cache miss that often occurs post-save
* This should not happen as doEditContent() saves the parser cache,
  so only the rare case of incompatible options should have misses
* The bug could also cause post-save misses with edit stashing
* Avoid the second page parse post-redirect by making sure cache
  timestamps match up instead of calling time() at several points
* Likewise for null edits, which used a different code path
* Removed redundant purge in onArticleCreate() as the new row sets _touched
* Removed pointless purge in onArticleDelete() as there is no row to update
  (the method no-ops in that case to avoid contention already)

Change-Id: I178fe334a3f8691ffd9452bec30561a0c5d37c6c
2015-06-09 01:01:03 +00:00
Ori Livneh
b31e567b78 hierarchicalize(!) stat names
Graphite expects name components to be dot-separated, so our habit of using
dashes doesn't really make sense. Change metric names to be more compatible
with Graphite, except the job queue's, since that will require a gdash
dashboard definition migration.

Change-Id: I77d0ff7606a8fc88434e4352d23415a9a8f4725a
2015-06-03 16:27:13 -07:00
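The renaming described here amounts to swapping the separator. A minimal Python sketch follows; the function name, the metric names, and the `exempt_prefixes` parameter are hypothetical and only illustrate the dash-to-dot conversion with an opt-out for names that must stay unchanged.

```python
from typing import Iterable


def hierarchical_stat_name(name: str, exempt_prefixes: Iterable[str] = ()) -> str:
    """Convert a dash-separated metric name to a dot-separated one."""
    # Names under an exempt prefix (for example a hypothetical "job-")
    # are returned unchanged.
    if any(name.startswith(prefix) for prefix in exempt_prefixes):
        return name
    return name.replace("-", ".")
```

Dot-separated components let Graphite treat each part as a level in its metric tree, which is the compatibility the commit message refers to.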
jenkins-bot
0e1c80e6e1 Merge "Check result of preg_match_all in Parser.php" 2015-06-02 22:08:42 +00:00
Ori Livneh
12571bde26 Use a fixed marker prefix string in the Parser and MWTidy
Generating one-time, unique strip markers hurts us in multiple ways:

* The strip marker regexes don't benefit from JIT compilation, so they are
  slower to execute than they could be.
* Although the regexes don't benefit from JIT compilation, they are still
  compiled, because HHVM bets on regexes getting reused. This extra work is
  fairly costly (1-2% of CPU usage on the app servers) and doesn't pay off.
* The size of the PCRE JIT cache is finite, and the caching of one-off regexes
  displaces from the cache regexes which are in fact reused.

Tim's preferred solution (per his review comment on
https://gerrit.wikimedia.org/r/167530/) is to use fixed strip markers.
So:

* Replace usage of $parser->mUniqPrefix with Parser::MARKER_PREFIX, which
  complements the existing Parser::MARKER_SUFFIX.
* Deprecate Parser::mUniqPrefix and its accessor, Parser::uniqPrefix().
* Deprecate Parser::getRandomString(), since it is no longer useful.
* In Preprocessor_*:preprocessToObj() and Parser::fetchTemplateAndTitle,
  replace any occurrences of \x7f with '?', to prevent strip marker forgery.
  \x7f is not valid input anyway.
* Deprecate the $prefix parameter for StripState::__construct, since a custom
  prefix may no longer be specified.

Change-Id: I31d4556bbb07acb72c33fda335fa5a230379a03f
2015-05-31 19:33:36 -07:00
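A rough Python sketch of the approach, under stated assumptions: the marker constants below are illustrative values, and `FixedMarkerStripState` and `sanitize_input` are hypothetical names rather than the Parser's API. It shows why a fixed prefix allows one reusable compiled pattern, and why replacing \x7f in the input prevents forged markers.

```python
import re

# Assumed marker values for illustration only.
MARKER_PREFIX = "\x7fUNIQ-"
MARKER_SUFFIX = "-QINU\x7f"

# The prefix is fixed, so the pattern is compiled once and reused.
MARKER_RE = re.compile(
    re.escape(MARKER_PREFIX) + r"([^\x7f]+)" + re.escape(MARKER_SUFFIX)
)


def sanitize_input(text: str) -> str:
    # \x7f is not valid input, so replacing it keeps user text from
    # containing anything that looks like a marker.
    return text.replace("\x7f", "?")


class FixedMarkerStripState:
    """Stores stripped content and restores it later (illustration only)."""

    def __init__(self) -> None:
        self._items = {}
        self._counter = 0

    def add(self, content: str) -> str:
        marker_id = f"item-{self._counter}"
        self._counter += 1
        self._items[marker_id] = content
        return MARKER_PREFIX + marker_id + MARKER_SUFFIX

    def unstrip(self, text: str) -> str:
        # Unknown marker ids are left untouched.
        return MARKER_RE.sub(
            lambda m: self._items.get(m.group(1), m.group(0)), text
        )
```

Because uniqueness no longer comes from a random prefix, the input sanitization step is what keeps markers unforgeable in this sketch.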
umherirrender
c430850154 Check result of preg_match_all in Parser.php
preg_match_all can return false on failure, which then results in
undefined index access.
Check the result and, on failure, treat it as no matches by processing an
empty array.

Change-Id: I1f11894240dc6869506d68d3513715abdc3abb5d
2015-05-29 05:16:08 +00:00
Jackmcbarn
9a805b816d Warn when duplicate arguments are found
Currently, duplicate arguments result in a categorization but not a
warning, and it's often difficult to find where in the template hierarchy
the problem lies. This causes a warning to be provided containing the
calling page's name, the called template's name, and the parameter's name.

Bug: T85352
Change-Id: I26b9a7ed5a2f246d00a49a5f6effe40b4443a9d0
2015-05-28 13:36:50 -04:00
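The check can be illustrated with a short Python sketch. The function name and the warning wording are hypothetical, and the sketch assumes that the last value given for a duplicated parameter is the one that is kept.

```python
from typing import Dict, List, Tuple


def collect_args(
    page: str, template: str, args: List[Tuple[str, str]]
) -> Tuple[Dict[str, str], List[str]]:
    """Collect template arguments and warn about duplicated names."""
    seen: Dict[str, str] = {}
    warnings: List[str] = []
    for name, value in args:
        if name in seen:
            # Name the calling page, the called template, and the parameter
            # so the problem can be located in the template hierarchy.
            warnings.append(
                f'{page} calls {template} with more than one value '
                f'for the "{name}" parameter'
            )
        seen[name] = value  # assumption: the last value wins
    return seen, warnings
```

For example, calling it with `[("name", "a"), ("size", "1"), ("name", "b")]` yields one warning and keeps `"b"` for `name`.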
daniel
d39e1e24d1 Introduce ParserCacheSaveComplete hook.
Rationale: give extensions a way to track which "renderings"
of a page exist in the cache. This is particularly relevant
for multi-lingual wikis that split the parser cache by user
language on some pages. In that case, hooking into
ParserAfterParse or LinksUpdateComplete is insufficient to
track all language specific renderings.

Bug: T99511
Change-Id: Iebf526098ca837a7df637c650097119495000c81
2015-05-25 13:35:23 +00:00
Timo Tijhof
9b6ee1da59 resourceloader: Remove only=messages
This isn't needed.

* Deprecate 'modulemessages' API, return empty result and warning.
* Remove half-implemented only=templates.

Change-Id: Ia6c87d687c6670b3ebf24251572191732651e8f5
2015-05-13 20:17:35 +01:00
Jackmcbarn
62c3fe221f Allow running code during unstrip
When adding strip markers, allow closures to be passed in place of text.
The closure is then called during unstrip. Also, add a hook that runs
after unstripGeneral. This is needed for Extension:Cite's I0e136f952.

Change-Id: If83b0623671fd67e5ccc9deaaaab456a6679af8f
2015-05-13 02:44:20 +00:00
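A minimal Python sketch of deferred unstripping, assuming a simplified strip state keyed by marker string. `DeferredStripState` and `add_item` are hypothetical names; the point is only that a stored callable is invoked when the marker is replaced, not when it is added.

```python
from typing import Callable, Dict, Union

# A stored item is either literal text or a zero-argument callable.
Stored = Union[str, Callable[[], str]]


class DeferredStripState:
    """Strip state whose items may be computed lazily (illustration only)."""

    def __init__(self) -> None:
        self._items: Dict[str, Stored] = {}

    def add_item(self, marker: str, value: Stored) -> None:
        self._items[marker] = value

    def unstrip(self, text: str) -> str:
        for marker, value in self._items.items():
            if marker in text:
                # Callables run here, at unstrip time, so they can use
                # state gathered during the rest of the parse.
                replacement = value() if callable(value) else value
                text = text.replace(marker, replacement)
        return text
```

This is useful for output such as a reference list, which can only be rendered once every reference on the page has been seen.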
Aaron Schulz
d9505b9dc1 Updated ParserCache doc types
Change-Id: I71fead62a4a498e40b2aa57e6d2701409bf7c7c0
2015-05-01 23:07:18 -07:00
Timo Tijhof
d06855ecbe Parser: Say tildes instead of ~~~ in comment to fix Doxygen fatal
Doxygen was unable to parse the file past validateSig().

> Parser.php:6397: warning: reached end of file while inside a ~~~ block!
> The command that should end the block seems to be missing!

Change-Id: I3d1b547968302611d2bd78a7c11dd0738b40d23a
2015-04-06 12:32:25 +00:00
Max Semenik
08762b02de Minor cleanups
* Declare undeclared variables
* Kill unused variables
* Fix comments including PHPDoc

Change-Id: I60015f6b6740aa9088bda3745f4dc4e65e29fcb1
2015-04-02 16:22:42 +00:00
jenkins-bot
d1adda0be9 Merge "Fixed {{REVISION(TIMESTAMP|USER|SIZE)}} on new revisions" 2015-03-31 18:00:30 +00:00
Aaron Schulz
ac0de3c430 Fixed {{REVISION(TIMESTAMP|USER|SIZE)}} on new revisions
* This makes use of the injected new revision object used elsewhere
  in Parser to solve this problem.

Bug: T94407
Change-Id: I7881583cf7cb2bc799c89ffaa2a344a2d4ca3a4e
2015-03-30 21:10:09 -07:00
Kunal Mehta
13975fe76a Use wikimedia/utfnormal library, add backwards-compatibility layer
This drops support for the custom utf8 normal PHP extension in favor
of the intl extension.

Bug: T90825
Change-Id: Ifbaeb2ef684217cf6187ccc4fb4d303f89608300
2015-03-24 12:59:26 -07:00
Arlo Breault
78c3f2f4b1 Tidy up tidy usage
* There's a branch path in the sanitizer that depends on $wgUseTidy,
  which means the test output differs from on wiki.

* In general, we should set these variables to match the wiki behaviour
  in tests.

* Exposes T92892, Sanitizer removes empty tags when tidy is disabled.

* Tweaked tests for T19663 to use an extension tag to show that
  HTML5 tags with non-word characters make it through the parser
  intact (before being ultimately sanitized).

Change-Id: I09c72fd739e11a8b757f37dc4c790758d782ad73
2015-03-16 16:33:50 -04:00
jenkins-bot
4e004124cd Merge "Removed obsolete "containsOldMagic" code" 2015-03-04 06:02:04 +00:00
Chad Horohoe
c33f4de066 Profile all external HTTP requests from MW
Change-Id: Ie980b080da2ef21ec7d9fc32f1accc55710de140
2015-03-03 20:54:30 -08:00
daniel
95c85f71b1 Remove getSecondaryDataUpdates and friends from ParserOutput.
This is a hard deprecation, with getSecondaryDataUpdates returning an
empty array and addSecondaryDataUpdate throwing an exception. This seems
prudent since there are no known users of these methods, and they
interfere with the parser cache:

DataUpdates are basically jobs, they need access to services to
function. That makes them inherently non-serializable. This interferes
with the function of the parser cache, which serializes ParserOutput
objects in order to persist them.

This could be solved by splitting DataUpdates into DataUpdateDefinitions
and DataUpdateHandlers, similar to how JobSpecification works with
wgJobClasses. That however seems pointless and overkill, since
ParserOutput already has a mechanism for storing arbitrary data,
including any info needed by an UpdateJob: the setExtensionData method.

After this change, the preferred method to introduce custom data updates
is to store any relevant data using setExtensionData and 
implement Content::getSecondaryDataUpdates() if possible. If not,
use the 'SecondaryDataUpdates' hook to construct the necessary update
objects from the info stored using setExtensionData.

Change-Id: I0f6f49e61fa3d8904e55f42c99f342a3dc357495
2015-02-24 11:01:16 +01:00
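The preferred pattern can be sketched as follows. This is a Python illustration under stated assumptions: every class and function name here is a hypothetical stand-in, and JSON stands in for whatever serialization the cache uses. Only plain data is stored on the output object, and update objects are rebuilt from that data after the output comes back from the cache.

```python
import json
from typing import List, Optional


class SketchParserOutput:
    """Holds only serializable extension data (illustration only)."""

    def __init__(self) -> None:
        self._ext = {}

    def set_extension_data(self, key: str, value) -> None:
        self._ext[key] = value

    def get_extension_data(self, key: str):
        return self._ext.get(key)

    def serialize(self) -> str:
        return json.dumps(self._ext)

    @classmethod
    def unserialize(cls, blob: str) -> "SketchParserOutput":
        out = cls()
        out._ext = json.loads(blob)
        return out


class GeoIndexUpdate:
    """Hypothetical update; it is never stored in the cache itself."""

    def __init__(self, coords: List[float]) -> None:
        self.coords = coords

    def do_update(self, index: list) -> None:
        index.append(tuple(self.coords))


def secondary_data_updates(output: SketchParserOutput) -> List[GeoIndexUpdate]:
    # Rebuild the update objects from the plain data stored earlier.
    coords: Optional[List[float]] = output.get_extension_data("geo-coords")
    return [GeoIndexUpdate(coords)] if coords is not None else []
```

Since the update object is constructed only after unserialization, nothing that needs access to services ever has to be serialized.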
jenkins-bot
cb4f6e9341 Merge "Removed doCascadeProtectionUpdates method to avoid DB writes on page views" 2015-02-23 01:18:05 +00:00
Aaron Schulz
df5ef8b5d7 Removed doCascadeProtectionUpdates method to avoid DB writes on page views
* Use special prioritized refreshLinksJobs instead, which triggers when
  transcluded pages are changed
* Also added a triggerOpportunisticLinksUpdate() method to handle
  dynamic transcludes

bug: T89389
Change-Id: Iea952d4d2e660b7957eafb5f73fc87fab347dbe7
2015-02-22 13:36:13 -08:00
Erik Bernhardson
e73f17527e Correct misleading documentation
Change-Id: Ib020467488616eeaa9b53672e5cc45c72f240a54
2015-02-20 19:55:11 +00:00
Aude
2664ccdc43 Revert "Removed doCascadeProtectionUpdates method to avoid DB writes on page views"
due to breakage at least in phpunit tests for mysql:

https://travis-ci.org/wikimedia/mediawiki-extensions-Wikibase/jobs/51490784

This reverts commit 132f7bb89f.

Change-Id: I85d19ab5ad30e8d13a956d7b7467a94c9e73219d
2015-02-20 13:17:41 +00:00
Aaron Schulz
132f7bb89f Removed doCascadeProtectionUpdates method to avoid DB writes on page views
* Use special prioritized refreshLinksJobs instead, which triggers when
  transcluded pages are changed
* Also added a triggerOpportunisticLinksUpdate() method to handle
  dynamic transcludes

bug: T89389
Change-Id: I8e5a6ddb643c12e0fb5c1c68bc83f912944e6e8d
2015-02-20 03:16:18 +00:00
Jackmcbarn
cab99af90e Fix TOC anchor name collisions in edge cases
Currently, the parser adds a "_2" to the second of two identical headlines to
avoid collisions, but there's still a collision if another headline actually
ends in "_2". This change causes the new headline to also be checked for a
collision, and advances to "_3" or beyond if there is one.

Bug: T26787
Change-Id: Id0a55aa4c1917bac2f8f0d4863fcb85bd3dff1ca
2015-02-17 20:59:33 +00:00
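The fix described above can be sketched in Python. `unique_anchor` is a hypothetical helper, not the parser's code, and the sketch ignores details such as case handling; it only shows that each candidate is re-checked against the set of anchors already in use.

```python
from typing import Set


def unique_anchor(anchor: str, used: Set[str]) -> str:
    """Return an anchor not yet in `used`, adding a numeric suffix if needed."""
    candidate = anchor
    suffix = 2
    # Re-check each suffixed candidate as well: "_2" may already belong
    # to a headline that literally ends in "_2".
    while candidate in used:
        candidate = f"{anchor}_{suffix}"
        suffix += 1
    used.add(candidate)
    return candidate
```

For the headlines "Foo_2", "Foo", "Foo" in that order, the second "Foo" skips the occupied "Foo_2" and becomes "Foo_3".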
Aaron Schulz
4111ff0dc3 Removed obsolete "containsOldMagic" code
Change-Id: Id225347e0599a6f79b30b0793cce7d97daed46f2
2015-02-15 14:41:49 -08:00
Timo Tijhof
d62a2b76b1 Replace dev.w3.org with more permanent or stable urls
* Sanitizer: dev.w3.org/html5/spec-preview
  Follows-up 8e8b15afc6.
  Use stable reference to www.w3.org/TR/html5 (currently
  from October 2014) instead of an old preview branch from 2012.

* parserTests: dev.w3.org/html5
  Follows-up 959aa336a1.
  Url is now a dead end. Replaced with link to a draft from around
  that time. The relevant section no longer exists in the current
  spec as it got split off into a separate spec. Maybe this one:
  https://url.spec.whatwg.org/#percent-encoded-bytes

* Parser, HTMLIntField: dev.w3.org/html5
  Use stable reference to www.w3.org/TR/html5 instead.

* HTMLFloatField.php: dev.w3.org/html5
  Url is now a dead end. Draft from around that time:
  http://www.w3.org/TR/2011/WD-html5-20110525/common-microsyntaxes.html#real-numbers
  The section "Real numbers" no longer exists in the current spec,
  but the Infrastructure chapter has a section on floating point
  numbers that describes the same sequence now.

Change-Id: I7dcd49b6cd39785fb1b294e4eeaf39bda52337b2
2015-02-14 14:21:33 +00:00
Sam Reed
f41e2ddb6a Don't split regex string unnecessarily
Change-Id: Id5912e64916ce5c7be2991478c32531596917540
2015-01-28 16:17:41 +00:00
m4tx
aa72c4e0d2 Add missing documentation in DateFormatter.php
Change-Id: Ic5c04bdb88bc57a7c44159d7858ef791c24354c4
2015-01-26 17:58:50 +00:00
Kunal Mehta
247ecab445 SpecialTrackingCategories: Read from the extension registry
This demonstrates how we can transition from extensions putting
things into the global scope ($wgTrackingCategories) to instead
storing them in the extension registry. This will increase the
overall performance of the extension registry since it no
longer needs to do an array_merge with $wgTrackingCategories.

For extensions already converted to using the registry
no change is needed as the schema is still the same.

Change-Id: Ie0df4c20b123dac784a1c02eb991edc609a911b6
2015-01-23 10:33:45 -08:00
Aaron Schulz
6921770414 Updated some try-catch statements: MWException -> Exception
Change-Id: I76601a86e30f4984e3b1a8c8ec5ef5a0f652433a
2015-01-09 17:20:22 -08:00
daniel
f10b8df598 Fix ApiStashEdit wrt custom DataUpdates.
My previous patch broke this: ApiStashEdit would stash ParserOutput
with no custom DataUpdates, but calling getSecondaryDataUpdates still
failed after unserialization. This patch should fix that.

Bug: T86305
Change-Id: Ic114e521c5dfd0d3c028ea7d16e93eace758deef
2015-01-09 19:19:13 +00:00
jenkins-bot
dfc2775848 Merge "Skip ApiStashEdit if custom DataUpdates are present." 2015-01-09 16:22:25 +00:00
daniel
d509361e67 Skip ApiStashEdit if custom DataUpdates are present.
Bug: T86305
Change-Id: I423ba39a46a08edf2862b8439169ff91338fb6eb
2015-01-09 15:51:15 +00:00
Ricordisamoa
2ae155da52 Fix phpcs errors in includes/
Mostly Squiz.WhiteSpace.SuperfluousWhitespace.EmptyLines

Change-Id: I678b2f0902f11cd1dfa1611b9da24e7237df9122
2015-01-08 20:15:07 +01:00
Aaron Schulz
4ff8136807 Removed remaining profile calls
Change-Id: I31c81c78715048004fc8fca0f27d09c1fa71c118
2015-01-08 02:49:33 -08:00
Chad Horohoe
aa21e125a3 Remove obvious function-level profiling
Xhprof generates this data now. Custom profiling of various
sub-function units is kept.

Calls to profiler represented about 3% of page execution
time on Special:BlankPage (1.5% in/out); after this change
it's down to about 0.98% of page execution time.

Change-Id: Id9a1dc9d8f80bbd52e42226b724a1e1213d07af7
2015-01-07 11:14:24 -08:00
jenkins-bot
7746e1458b Merge "Use preview content when it transcludes itself" 2014-12-31 16:19:24 +00:00
Jackmcbarn
779f1024c1 Use preview content when it transcludes itself
When a page transcludes itself, such as <noinclude>foo
{{:{{FULLPAGENAME}}}}</noinclude><includeonly>bar</includeonly>, use the
preview content in its own transclusions. This code was basically ripped
straight from Extension:TemplateSandbox.

Bug: T85408
Bug: T7278
Change-Id: I1aa091a395a4f7b7b744e09e0bed59bc2e1176d0
2014-12-30 12:59:16 -05:00
Amir E. Aharoni
144d741196 Shorten lines to pass phpcs test
Change-Id: I5588e1f16f1a23d77160cd180058bd2000a93ab6
2014-12-29 17:14:08 +02:00
Derk-Jan Hartman
e20e64eb6b Parser: Add <bdi> to the whitelist for TOC links
Bug: 72884
Change-Id: Id5aa9a4eb32fb185881141e55de700ae36f806c5
2014-12-27 21:24:42 +01:00
Reedy
4d9143c7f5 Add lots of @throws
Change-Id: I09d0c13070f966fcf23d2638d8fc1328279a5995
2014-12-24 13:49:20 +00:00
C. Scott Ananian
54a8199f87 Don't allow embedded newlines in magic links, but do allow &nbsp;
This continues the work started in T67278 to make magic link parsing
more consistent with wiki text parsing in general, and closes two
long-standing bugs.

Bug: T30950
Bug: T31025
Change-Id: I71f8b337543163569c64bbfdec154eb9b69d7264
2014-12-22 04:14:55 +00:00
Jackmcbarn
c05b4c9bc4 Re-emit unknown tags from #tag
When #tag is given a tag that it doesn't recognize, re-emit it as a
regular tag instead of giving an error. This allows for it to be used with
transparent tags and HTML tags.

Change-Id: I0ceee8a4fdaf2d3142054a108f445ff06597c31a
2014-12-18 23:06:22 -05:00
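A small Python sketch of the fallback behaviour. The registry `KNOWN_TAGS`, the `ref` handler, and `tag_function` are hypothetical and do not reflect how the parser registers extension tags; the sketch only shows an unrecognized tag being emitted as regular markup instead of producing an error.

```python
from html import escape
from typing import Callable, Dict

# Hypothetical registry of tags that have a dedicated handler.
KNOWN_TAGS: Dict[str, Callable[[str, Dict[str, str]], str]] = {
    "ref": lambda inner, attrs: f"[ref:{inner}]",
}


def tag_function(name: str, inner: str, attrs: Dict[str, str]) -> str:
    """Dispatch to a known handler, or re-emit the tag as regular markup."""
    handler = KNOWN_TAGS.get(name.lower())
    if handler is not None:
        return handler(inner, attrs)
    # Unknown tag: rebuild it so later parsing stages can treat it as an
    # ordinary tag.
    attr_text = "".join(
        f' {key}="{escape(value, quote=True)}"' for key, value in attrs.items()
    )
    return f"<{name}{attr_text}>{inner}</{name}>"
```

Here `tag_function("span", "hi", {"class": "x"})` produces a plain `<span>` element, while `ref` still goes through its handler.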