Add a new hook, 'RejectParserCacheValue', which allows extensions to reject an
otherwise-successful parser cache lookup. The intent is to allow extensions to
manage the eviction of archaic HTML output from the cache.
Change-Id: I660679a48c46608f859bd52b31d6a888aabcc9ac
wfSuppressWarnings() and wfRestoreWarnings() were split out into a
separate library. All usages in core were replaced with the new
functions, and the wf* global functions are marked as deprecated.
Additionally, some uses of @ were replaced due to composer's autoloader
being loaded even earlier.
Ie1234f8c12693408de9b94bf6f84480a90bd4f8e adds the library to
mediawiki/vendor.
Bug: T100923
Change-Id: I5c35079a0a656180852be0ae6b1262d40f6534c4
* This should not happen as doEditContent() saves the parser cache,
so only the rare casing if incompatible options should have misses
* The bug could also cause post-save misses with edit stashing
* Avoid the second page parse post-redirect by making sure cache
timestamps match up instead of calling time() at several points
* Likewise for null edits, which used a different code path
* Removed redundant purge in onArticleCreate() as the new row sets _touched
* Removed pointless purge in onArticleDelete() as there is no row to update
(the method no-ops in that case to avoid contention already)
Change-Id: I178fe334a3f8691ffd9452bec30561a0c5d37c6c
Graphite expects name components to be dot-separated, so our habit of using
dashes doesn't really make sense. Change metric names to be more compatible
with Graphite, except the job queue's, since that will require a gdash
dashboard definition migration.
Change-Id: I77d0ff7606a8fc88434e4352d23415a9a8f4725a
Generating one-time, unique strip markers hurts us in multiple ways:
* The strip marker regexes don't benefit from JIT compilation, so they are
slower to execute than they could be.
* Although the regexes don't benefit from JIT compilation, they are still
compiled, because HHVM bets on regexes getting reused. This extra work is
fairly costly (1-2% of CPU usage on the app servers) and doesn't pay off.
* The size of the PCRE JIT cache is finite, and the caching of one-off regexes
displaces from the cache regexes which are in fact reused.
Tim's preferred solution (per his review comment on
https://gerrit.wikimedia.org/r/167530/) is to use fixed strip markers.
So:
* Replace usage of $parser->mUniqPrefix with Parser::MARKER_PREFIX, which
complements the existing Parser::MARKER_SUFFIX.
* Deprecate Parser::mUniqPrefix and its accessor, Parser::uniqPrefix().
* Deprecate Parser::getRandomString(), since it is no longer useful.
* In Preprocessor_*:preprocessToObj() and Parser::fetchTemplateAndTitle,
replace any occurences of \x7f with '?', to prevent strip marker forgery.
\x7f is not valid input anyway.
* Deprecate the $prefix parameter for StripState::__construct, since a custom
prefix may no longer be specified.
Change-Id: I31d4556bbb07acb72c33fda335fa5a230379a03f
preg_match_all can return false on failure, which than results in
undefined index access.
Check the result and just keep it as nothing found by processing an
empty array
Change-Id: I1f11894240dc6869506d68d3513715abdc3abb5d
Currently, duplicate arguments result in a categorization but not a
warning, and it's often difficult to find where in the template hierarchy
the problem lies. This causes a warning to be provided containing the
calling page's name, the called template's name, and the parameter's name.
Bug: T85352
Change-Id: I26b9a7ed5a2f246d00a49a5f6effe40b4443a9d0
Rationale: give extensions a way to track which "renderings"
of a page exist in the cache. This is particularly relevant
for multi-lingual wikis that splpit the parser cache by user
language on some pages. In that case, hooking into
ParserAfterParse or LinksUpdateComplete is insufficient to
track all language specific renderings.
Bug: T99511
Change-Id: Iebf526098ca837a7df637c650097119495000c81
When adding strip markers, allow closures to be passed in place of text.
The closure is then called during unstrip. Also, add a hook that runs
after unstripGeneral. This is needed for Extension:Cite's I0e136f952.
Change-Id: If83b0623671fd67e5ccc9deaaaab456a6679af8f
Doxygen was unable to parse the file past validateSig().
> Parser.php:6397: warning: reached end of file while inside a ~~~ block!
> The command that should end the block seems to be missing!
Change-Id: I3d1b547968302611d2bd78a7c11dd0738b40d23a
* This makes use of the injected new revision object used elsewhere
in Parser to solve this problem.
Bug: T94407
Change-Id: I7881583cf7cb2bc799c89ffaa2a344a2d4ca3a4e
This drops support for the custom utf8 normal PHP extension in favor
of the intl extension.
Bug: T90825
Change-Id: Ifbaeb2ef684217cf6187ccc4fb4d303f89608300
* There's a branch path in the sanitizer that depends on $wgUseTidy,
which means the test output differs from on wiki.
* In general, we should set these variables to match the wiki behaviour
in tests.
* Exposes T92892, Sanitizer removes empty tags when tidy is disabled.
* Tweaked tests for T19663 to use an extension tag to show that
HTML5 tags with non-word characters make it through the parser
intact (before being ultimately sanitized).
Change-Id: I09c72fd739e11a8b757f37dc4c790758d782ad73
This is a hard deprecation, with getSecondaryDataUpdates returning an
empty array and addSecondaryDataUpdate throwing an exception. This seems
prudent since there are no known users of these methods, and they
interfere with the parser cache:
DataUpdates are basically jobs, they need access to services to
function. That makes them inherently non-serializable. This interferes
with the function of the parser cache, which serializes ParserOutput
objects in order to persist them.
This could be solved by splitting DataUpdates into DataUpdateDefinitions
and DataUpdateHandlers, similar to how JobSpecification works with
wgJobClasses. That however seems pointless and overkill, since
ParserOutput already has a mechanism for storing arbitrary data,
including any info needed by an UpdateJob: the setExtensionData method.
After this change, the preferred method to introduce custom data updates
is to store any relevant data using setExtensionData and
implement Content::getSecondaryDataUpdates() if possible. If not,
use the 'SecondaryDataUpdates' hook to construct the necessary update
objects from the info stored using setExtensionData.
Change-Id: I0f6f49e61fa3d8904e55f42c99f342a3dc357495
* Use special prioritized refreshLinksJobs instead, which triggers when
transcluded pages are changed
* Also added a triggerOpportunisticLinksUpdate() method to handle
dynamic transcludes
bug: T89389
Change-Id: Iea952d4d2e660b7957eafb5f73fc87fab347dbe7
* Use special prioritized refreshLinksJobs instead, which triggers when
transcluded pages are changed
* Also added a triggerOpportunisticLinksUpdate() method to handle
dynamic transcludes
bug: T89389
Change-Id: I8e5a6ddb643c12e0fb5c1c68bc83f912944e6e8d
Currently, the parser adds a "_2" to the second of two identical headlines to
avoid collisions, but there's still a collision if another headline actually
ends in "_2". This change causes the new headline to also be checked for a
collision, and advances to "_3" or beyond if there is one.
Bug: T26787
Change-Id: Id0a55aa4c1917bac2f8f0d4863fcb85bd3dff1ca
* Sanitizer: dev.w3.org/html5/spec-preview
Follows-up 8e8b15afc6.
Use stable reference to www.w3.org/TR/html5 instead (currently
from October 2014) instead of an old preview branch from 2012.
* parserTests: dev.w3.org/html5
Follows-up 959aa336a1.
Url is now a dead end. Replaced with link to a draft from around
that time. The relevant section no longer exists in the curent
spec as it got split off into a separate spec. Maybe this one:
https://url.spec.whatwg.org/#percent-encoded-bytes
* Parser, HTMLIntField: dev.w3.org/html5
Use stable reference to www.w3.org/TR/html5 instead.
* HTMLFloatField.php: dev.w3.org/html5
Url is now a dead end. Draft from around that time:
http://www.w3.org/TR/2011/WD-html5-20110525/common-microsyntaxes.html#real-numbers
The section "Real numbers" no longer exists in the current spec,
but the Infrastructure chapter has a section on floating point
numbers that describes the same sequence now.
Change-Id: I7dcd49b6cd39785fb1b294e4eeaf39bda52337b2
This demonstrates how we can transition from extensions putting
things into the global scope ($wgTrackingCategories) to instead
storing them in the extension registry. This will increase the
overall performance of the extension registry since it no
longer needs to do an array_merge with $wgTrackingCategories.
For extensions already converted to using the registry
no change is needed as the schema is still the same.
Change-Id: Ie0df4c20b123dac784a1c02eb991edc609a911b6
My previous patch broke this: ApiStashEdit would stash ParserOutput
with no custom DataUpdates, but calling getSecondaryDataUpdates still
failed after unserialization. This patch should fix that.
Bug: T86305
Change-Id: Ic114e521c5dfd0d3c028ea7d16e93eace758deef
Xhprof generates this data now. Custom profiling of various
sub-function units are kept.
Calls to profiler represented about 3% of page execution
time on Special:BlankPage (1.5% in/out); after this change
it's down to about 0.98% of page execution time.
Change-Id: Id9a1dc9d8f80bbd52e42226b724a1e1213d07af7
When a page transcludes itself, such as <noinclude>foo
{{:{{FULLPAGENAME}}}}</noinclude><includeonly>bar</includeonly>, use the
preview content in its own transclusions. This code was basically ripped
straight from Extension:TemplateSandbox.
Bug: T85408
Bug: T7278
Change-Id: I1aa091a395a4f7b7b744e09e0bed59bc2e1176d0
This continues the work started in T67278 to make magic link parsing
more consistent with wiki text parsing in general, and closes two
long-standing bugs.
Bug: T30950
Bug: T31025
Change-Id: I71f8b337543163569c64bbfdec154eb9b69d7264
When #tag is given a tag that it doesn't recognize, re-emit it as a
regular tag instead of giving an error. This allows for it to be used with
transparent tags and HTML tags.
Change-Id: I0ceee8a4fdaf2d3142054a108f445ff06597c31a