Added/removed spaces around logical/arithmetic operators
Reduced multiple empty lines to one empty line
Removed wrong tabs before comments at end of line
Removed too many spaces in assignments
Change-Id: I2bba4e72f9b5f88c53324d7b70e6042f1aad8f6b
Added/removed spaces after opening/before closing parentheses
Added a space after a comma
Removed unneeded parentheses in condition
Change-Id: I306091347ccaaf11dee0cdfda3019cb0c12be51b
Remove $wgUseDynamicDates and everything related to it.
I left DateFormatter::reformat() alone, since it might possibly be
used elsewhere, and to be honest I'm afraid to touch it.
Change-Id: I609db8471c14e5e5946916f085d2ee5b96204d81
To reduce the maintenance burden for changes such as Id7ec4e69. The
project to optimise the preprocessor for hiphop is incomplete and is not
especially useful given the present state of hiphop support.
Change-Id: Iebcfe4d40f74520e29e7feb522251892fab2f652
Parser tests also included, test case and original patch supplied by
Bergi on bugzilla. Tested against the current version.
Change-Id: Id7ec4e694783dd0f682f65f39d8b9e59f82e58aa
Per the PSR-2 PHP standard, files should end with exactly one newline.
Some of our files have two or more, and some others were missing one.
Fix almost all occurrences of the CodeSniffer sniff:
PSR2.Files.EndFileNewline.TooMany
I have not fixed the selenium files; I believe we will drop them.
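The normalisation that the sniff asks for can be sketched as follows. This is a Python illustration only; the helper name is hypothetical and it is not how CodeSniffer itself is implemented:

```python
def normalize_trailing_newline(text: str) -> str:
    """Return text ending with exactly one newline (hypothetical helper)."""
    # Drop every trailing "\n" or "\r", then append a single "\n".
    return text.rstrip("\r\n") + "\n"
```

Note that under this sketch an empty file would also end up containing a single newline.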
Change-Id: I89fca8c1786fee94855b7b77bb0f364001ee84b6
Extensions sometimes need to stash information in the ParserOutput
for later use. This change provides a clean way to do that.
Change-Id: I8bc571d13c9a70bb71430862c2ab679ff1947126
Also moved the retrieval of the revision ID next to that of the
page ID so that the call to ParserOutput::addTemplate() is much
clearer than the current one.
Change-Id: Ie71ee76e90cc131eac25c0f339d5250d5163ce2e
Makes life easier for static analysis tools, since they no longer
need to work out whether the end of a function where wfProfileOut
was not called is reachable.
It is recommended to review this change ignoring whitespace
(especially for includes/parser/Tidy.php).
Also documented the rationale for the elseif chain in UploadBase::detectVirus()
Change-Id: Ic4f65937fa9e6f926d8fcfd670e3b0e99e06eefc
By the way, the check $oldkey != $vardbk is unnecessary because
there's already a $variant != $category check.
Change-Id: I963be065723059073c9cb83c6ef636af8d023faf
- call them pipe tricks (plural) as there's more than one
- mention double-width comma as well
- use one tab character for alignment due to double-width chars
- document reverse pipe trick
Change-Id: I27a1d04362eb3988fc1318fa1f73f69877019439
I've implemented the function Parser::getExternalLinkRel, which
gives the 'rel' attribute for a given link in a given NS. This is per
Tim's suggestion, as it's currently impossible to invoke the logic in
Parser::getExternalLinkAttribs externally.
Change-Id: Id0bfed81e2afd6730d820b6c9a4a09155a557f37
Additionally remove creation of bogus title in transformMsg.
The only place preprocess uses the title is in startParse. And that explicitly allows null.
Change-Id: I33d090bf250092fc541e284eb19dbd4053f40ae5
This patch lets us handle the <data>, <time>, <meta>, and <link> elements.
* Handles one part of bug 32545, which requests support for the <time>
element in WikiText.
* Partially fixes bug 28776 about whitelisting global HTML5 semantic
attributes and the inline meta element.
* <meta> and <link> are only permitted when Microdata is enabled using
the global $wgAllowMicrodataAttributes. For security reasons, the
links are only allowed to be actual elements when they have a
strict set of attributes set.
Change-Id: Ica11be186bd62eb154e1ebc400acb515c10fb65f
* IRIs are more and more widely used these days, so Chinese
characters in the text of external links also need to be protected
from conversion.
* So now all markNoConversion() functions in languages with variants
do the same thing. Merge them into a single function in the
Language class and drop implementations in individual languages.
* Also rephrase the phpdoc of that function, and (bug 24798) fix
the link detection regex to use wfUrlProtocolsWithoutProtRel().
Protocol-relative regex is excluded to avoid false positives.
* Add parser test for it.
Change-Id: I2ec0ac2b9b11221584adb72555168498de209d57
We store various bits of data as "expando" properties on the Parser
object, to pass information from one stage of the parser to another. If
the parser is cloned, however, we can run into trouble because two
different Parser objects are now manipulating the same extension data
structure; this often shows up when ParserClearState is called on one
clone and clears the state of the other as well.
Since a deep clone might be too expensive and still might be wrong in
some cases, it seems most useful to simply provide a ParserCloned hook
so extensions can just do The Right Thing.
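The shared-state problem and the hook-based remedy can be illustrated with a small Python sketch. The class, the hook registry and the extension data below are toy stand-ins invented for this example, not MediaWiki code:

```python
import copy


class Parser:
    """Toy stand-in for a parser carrying extension data."""

    clone_hooks = []  # hypothetical registry playing the role of ParserCloned

    def __init__(self):
        self.ext_data = {"refs": []}  # "expando" state owned by an extension

    def __copy__(self):
        clone = Parser.__new__(Parser)
        clone.__dict__.update(self.__dict__)  # shallow: ext_data is shared
        for hook in Parser.clone_hooks:
            hook(clone)  # let each extension fix up its own state
        return clone


def reset_refs(parser):
    # The extension decides what the right thing is; here, a fresh structure.
    parser.ext_data = {"refs": []}


Parser.clone_hooks.append(reset_refs)

original = Parser()
clone = copy.copy(original)
clone.ext_data["refs"].append("note 1")
```

Without the registered hook, appending to the clone's list would also change the original's list, which is the kind of cross-talk described above.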
Change-Id: Ieec65c908d71e89b9a66f83b9a626f842aadacbb
Before the introduction of the content handler, missing content was
signified by getText() returning null instead of a string. null will
work much like an empty string in most contexts, so in many places,
it was not checked explicitly whether the content was null.
Now, when getContent() returns null, this often causes a fatal error,
because the code would access whatever getContent() returned as an object,
without checking whether it was null (because no such check was performed
previously, when the content was represented as a string).
This change introduces explicit checks for getContent() returning null
in the most essential core classes.
Change-Id: I551a90b0b67b8edc7570ca5d252ecc1de903f097
- Actually mention "pipe trick" so the code is searchable
- Use spaces rather than tabs for vertical alignment
- Clarify comment for double-width brackets and mention the revision in which it was added
Change-Id: Iaf365e313144e378133fb16c64efa5b7e47d4a6a
* Fixes bug 11748 (Parser issue for HTML definition list) and similar
issues for nested unordered / ordered lists
* Stops wrapping HTML-syntax definition lists into paragraphs
for consistency with their wikitext variants
* Enables one previously disabled test and adds another for nested
definition lists with HTML syntax
Change-Id: If75ed54e11452dbcf5e6213cc20923064f811715
(bug 24502) Resolve the various issues with this accidental feature
by removing it. I think it could be done properly, along the lines of
my comment #5, but I don't think just changing the DB schema to make
langlinks non-unique is a good direction to take. A comment on
I4e1e08a3 from Daniel Kinzler indicates that duplicate language links
won't be possible with Wikidata anyway, so there's not much value in
I4e1e08a3 for WMF wikis.
Change-Id: Iba5f3f29e20f5119d4414b1e87ce5eee674701a8
Setting $wgRegisterInternalExternals = false with a protocol-relative
server should not store the http/https links in the externallinks table.
Also fix detection of own links for links with a query, an anchor, or
nothing after the host.
Now also detected:
//localhost
//localhost?query
//localhost#anchor
Already detected:
//localhost/path
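The detection cases listed above can be sketched with a single regular expression. This is a Python illustration with a hypothetical function name; the optional http:/https: prefix is an assumption about how expanded links look, and the actual PHP regex may differ:

```python
import re


def is_own_link(url: str, server: str = "//localhost") -> bool:
    """Hypothetical check: does url point at our protocol-relative server?"""
    # The host must be followed by a path, a query, an anchor, or nothing,
    # so that "//localhost.example.org" is not mistaken for "//localhost".
    pattern = r"^(?:https?:)?" + re.escape(server) + r"(?:[/?#]|$)"
    return re.match(pattern, url) is not None
```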
Change-Id: Idd03d309cc3b71728a8cbea460efa12b10348d64
To prevent large template DOM caches from sending servers into swap,
throw an exception when more than some number of DOM elements are
parsed. Unfortunately, it wasn't possible to return a normal error
message, because it broke PST and extractSections and corrupted the
article text. It's safer to refuse to save the edit, and we don't
have decent ways to do that short of throwing an exception.
Ideally we would like to have an upstream patch that hooks libxml to
allocate memory from PHP's request pool, so that a fatal error would be
raised instead of swapping.
Change-Id: I4cb4f6fd313e1e0940b56cc5e586afd1bea9267a
This patch marks the regex matching url protocol as being case
insensitive. From now on we will render links like [HTTP://ww].
Tests added.
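The effect of the flag can be shown with a short Python sketch. The protocol list here is an assumed, shortened one for illustration; the real list is configurable and longer:

```python
import re

# Assumed, shortened protocol list for illustration only.
PROTOCOLS = ["http://", "https://", "ftp://"]
# re.IGNORECASE makes "HTTP://" match just like "http://".
PROTOCOL_RE = re.compile(
    "|".join(re.escape(p) for p in PROTOCOLS), re.IGNORECASE
)


def starts_with_protocol(target: str) -> bool:
    """Return True if target begins with a known protocol, in any case."""
    return PROTOCOL_RE.match(target) is not None
```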
Change-Id: I706acb7a0ae194b50d2318763beae4e5e83671f3
* Also normalized 0 => false for the rev ID parameter in some places.
* Broke some long lines and shortened a variable name in Skin.php.
Change-Id: I6645315699ec7670ae22aa1dbf787d75d6e6b7ec
Replacing wfMsgExt() with wfMessage() in 4e1ccf0 causes an exception on
parse when the defaults are used for $current and $max. I don't know if
there are other similar fatal errors caused by that set of commits.
Change-Id: I84cfdede844bb2dd3c106721b972ed1cd8bfe480
If PHP's PCRE is not compiled with Unicode property support, the
regexes used by the parser fail to compile, causing the parser to
output gibberish. It's been reported that the default PHP package for
CentOS has PCRE in such a config.
As a result the installer will output total gibberish. The user has
no idea what went wrong because there is no meaningful output.
To counter that, cause Parser to throw an exception in that case.
It seemed easier than figuring out how to convince the installer
not to parse the environment check. For completeness' sake, though,
I fixed the PCRE environment check to adequately check for PCRE
not having Unicode support.
This should be backported to 1.19 since there are quite a few
complaints about the issue on project:Support_desk. /me has
no idea what the procedure for that is in our new git world
Change-Id: Idb1658be4ee6203a55740450e335f570a616671c
* Replaced WikiPage::DATA_FROM_* constants with IDBAccessObject ones.
* Renamed IDBAccessObject constants a bit for visual consistency.
* Removed AVOID_MASTER parameter and replaced calling instances with READ_NORMAL.
Instead of getting page_latest from the master and the revision from a
slave, just get it all from the master in one RTT. Most callers used
AVOID_MASTER (and now READ_NORMAL), so this case is barely hit anymore.
Change-Id: Ifbefdcd4490094b38e49bbb46c95fdb71b5c9e1a
* Made refreshLinksJob2 always spawn smaller jobs. This can reduce
the problem of all runners doing the same refresh jobs by increasing the
granularity of the work to single page parses per job.
* Avoid master queries when fetching the latest revision for refresh links jobs.
Also avoid the master for template fetching on parse. A LoadBalancer waitFor()
call is used instead. The main reason for hitting the master to fetch templates
was this job itself.
* Fixed bug in refreshLinksJob2 where one missing page would cause all the
remaining updates for pages to be aborted.
* Factored out some code duplication between the two refresh links job classes.
Change-Id: Ieca51567a888f50a6f15b6c2606323da80d6584b
$text is constant, which means its length is also constant, so
storing it in a local variable is easy.
Change-Id: I9631b862f40eef7f8b18559ffd474a0037077d18
I am not sure, but this looks wrong, because it adds the type to the
array and not the child.
This method is unused in core and WMF extensions. Maybe
removing/deprecating it is a better idea, but I am not sure whether
that is possible.
I only noticed this possible error while looking through the
preprocessor.
Change-Id: I5b7492d455989a8a3e71b5db6d31091b986c502a
The alignment of image thumbs should follow the page content language instead of the wiki content language.
For this it needs the parser context, and because it makes sense to have it as the first parameter, I renamed makeImageLink2() to makeImageLink(); the 2 seemed redundant anyway.
The old function name keeps the old behaviour, but can be removed quite soon since almost no extension is using it.
Change-Id: I0c35b06a85528dcc43fdd0578dc9b327c495cf4a
Using the same regex as for [[File:|]].
With the height, the width inside the thumb link can be calculated if
the height does not fit the width.
Change-Id: If188d923d6cd25ea6a5118098f3a513ca5135d43
Add a lang parameter to DateFormatter so it can work in any language instead of only the site content language.
(The memcached key is now per language code.)
Use parser->getTargetLanguage() by default so it is parsed in the page content language.
Also add some documentation and remove unneeded whitespace.
If needed, a parameter to {{#dateformatter}} can now be easily added, to specify the language to format in.
Change-Id: If61854920065f7c3b4170ab89e9aa66b299f9dd8
When inserting XML elements inline <such as this one>, doxygen chokes
about it not being known. Simply enclosing the tag in double quotes
prevents doxygen from emitting a warning.
Also enclosed a few invalid function calls such as \. and double quoted
the HTML entities such as &foobar;
Change-Id: I4019637145e683c2bec3d17b2fd98b0c50a932f1
This patch adds the hook 'InternalParseBeforeSanitize' which gets called
during Parser's internalParse method just before the parser removes
unwanted/dangerous HTML tags.
Change-Id: If32053f9304088d7943aa0c9e78716a644c34fe1
In order to correctly output an error message that might contain
wikilinks, Cite.php needs a hook that is called after the page is parsed
but before the call to replaceLinkHolders().
Change-Id: Iaa2755f994edb081eb1d176f632f7add41640dbf
Currently a simple Title::exists() or Title::getArticleId() call on a
non-existent title can cause the title to be marked as a redlink in
LinkCache, even if a title in another variant exists. A visible symptom
is that the function refuses to try other variants of a link if the link
has already been checked by {{#ifexist:}}, which internally calls
$lang->findVariantLink() and then invokes $title->exists().
$titlesToBeConverted is also tweaked to avoid the trailing "\0".
Change-Id: I741e2938eb364ed29f10f058da260848a6774f9f
Please note that on preview of a new page, this magic word will return 0, so
we have to set the vary-revision flag.
Change-Id: I11d42ca773ad84b73cc84f2c7dd2d09f1982d97a
Pre-save transform now accepts full width commas, and a parser test is added,
which passes. Originally done by Conrad Irwin, branched out by Tim in r62689
along with a bunch of other stuff, and then it sat in bugzilla for a few months.
Change-Id: I3302e43bab423835cdaee6bdcfc0252a206490fc
Add $indexOffset parameter to PPFrame::newChild(). This makes it
possible to use newChild() for interpreting named parameters to
invoke in Scribunto -- otherwise I would have had to duplicate its
functionality, which would have been tricky given that I wanted to
make a real frame with an expand() method. Setting $indexOffset allows
newChild() to start counting numbered parameters from somewhere other
than the first pipe character, leaving room for the Scribunto function
name.
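The renumbering can be pictured with a short Python sketch. This is only one reading of the description above: the helper name, the (index, value) representation and the {{#invoke:module|func|a|b}} call shape are assumptions made for illustration, not the PHP implementation:

```python
def child_arguments(parts, index_offset=0):
    """Build numbered arguments for a child frame (hypothetical helper).

    Each part is (index, value), where index was counted from the first
    pipe character. Subtracting index_offset renumbers the remaining
    parts so that they start at 1 again.
    """
    return {index - index_offset: value for index, value in parts}


# Assumed call shape: "func" is part 1, "a" is part 2, "b" is part 3.
parts = [(1, "func"), (2, "a"), (3, "b")]
function_name = parts[0][1]
args = child_arguments(parts[1:], index_offset=1)
```

With an offset of 1 the function name takes the first slot and the remaining arguments are seen by the child frame as parameters 1 and 2.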
Added PPCustomFrame_*::getArguments(), which was missing for no apparent
reason. I didn't end up using it in Scribunto, but there's no harm in
adding it anyway.
Change-Id: I0c761aab8a7f1ae74e8d151a1346febb5c466e18
This supersedes I6d03bf2a, using better names for the new classes and
incorporating the changes requested by Aaron.
This change introduces the base class SecondaryDataUpdate to be used for any
updates that need to be applied when a page is changed or deleted. Until now,
this was done by the LinksUpdate class for updates and WikiPage::doDeletionUpdates
upon deletion. This patch uses a list of SecondaryDataUpdates in both cases.
This allows extensions (e.g. via the ContentHandler facility, once that is in) to
easily specify what needs to be done when a page is updated or deleted in order to
keep any secondary data stores (such as link tables) in sync.
Note that limited transactional logic is also introduced, so SecondaryDataUpdates
can be implemented to only commit their changes if all updates were performed
successfully.
Patch Set 2: fixing some coding style issues mentioned by Nikerabbit.
Patch Set 4: some stuff I kept from the old LinksUpdate class needs cleanup,
but might break extensions when changed. Marking as todo for now.
Patch Set 5: fixed misnamed member in LinksDeletionUpdate (thanks Aaron).
Change-Id: Ibe3e88fadd8c1d4063cf13bb6972f2a23569a73f
This whitespace causes an extra empty paragraph between text and a transcluded special page.
When a heading precedes a transcluded special page, there is no difference and it's fine with or without this whitespace.
See for example http://incubator.wikimedia.org/w/index.php?title=Incubator:Sandbox&oldid=822299
Change-Id: I6b06006d921368619d3969660c244176344e8aff
Rename MWNamespace::isNonincludableNamespace
to MWNamespace::isNonincludable, because "Namespace" is already in the
class name
Change-Id: Ie982835c7dc84cb10c823996e5360cc1b342f704
Explicitly detect circular references in strip tags and break the loop,
similar to how we deal with circular references in templates. This is
necessary to support Scribunto since we imagine we will provide an API
that allows strip markers to be forged.
The recursion depth limit is a consequence of changing the algorithm
from iterative to recursive; it's required to protect the stack against
deeply nested #tag invocations.
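The two safeguards, loop detection and a depth limit, can be sketched in Python as follows. The class, the marker format, the error strings and the default limit are all invented for this illustration and do not mirror the PHP code:

```python
class StripState:
    """Toy strip state: markers map to text that may contain more markers."""

    def __init__(self, max_depth=20):
        self.data = {}
        self.max_depth = max_depth  # assumed limit, chosen for illustration

    def unstrip(self, text, stack=()):
        # Protect the call stack against deeply nested markers.
        if len(stack) >= self.max_depth:
            return "[depth limit exceeded]"
        for marker, value in self.data.items():
            if marker not in text:
                continue
            if marker in stack:
                # The marker is already being expanded: break the loop.
                replacement = "[loop detected]"
            else:
                replacement = self.unstrip(value, stack + (marker,))
            text = text.replace(marker, replacement)
        return text
```

The stack of markers currently being expanded serves both purposes: membership reveals a circular reference, and its length gives the nesting depth.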
Change-Id: Icc8dc4aedbced55ad75b3b5a5429a376d06d9b31
The method is a wrapper around $wgNonincludableNamespaces.
Replaced the one use in the parser and
added it as info to the API's meta=siteinfo.
Change-Id: I501b811137c39f5c2d9ea35c78fef8ae22d21bfe
With 1.20wmf2 we get a tracking category with all the problem pages,
so seeing the limit for a page is helpful information then.
Change-Id: I1916e5fa6de06b923a01cf1f0ca9362287a9fd70
I have only added things and not changed the current error strings to
messages, because bug 21521 is WONTFIX.
The change to Preprocessor_HipHop.php is not tested.
Change-Id: I7a7243b8ba010dbb395bdbbb3e00e3217088038e
The patch adds an optional parameter |link= to the <gallery>
tag. This will allow for images to link to other pages and
external URLs instead of being hardlinked to the image file
that is displayed in the gallery.
Here are a couple of examples.
Link as WikiLink:
<gallery>
File:20120106_001.jpg|link=Main_Page
</gallery>
Link as absolute URI:
<gallery>
File:20120106_001.jpg|my caption|alt=my alt text|link=http://bugzilla.wikimedia.org
</gallery>
This causes the link on the thumbnails rendered by the gallery tag to point
to a custom page/URL instead of the actual media/image.
A link should be an internal wiki link or an absolute URI as shown in the examples.
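How one gallery line breaks down into title, caption and options can be sketched in Python. The helper and its return shape are hypothetical and far simpler than the actual gallery parsing:

```python
def parse_gallery_line(line: str) -> dict:
    """Parse one <gallery> line into its parts (hypothetical helper)."""
    title, *params = line.split("|")
    entry = {"title": title.strip(), "caption": "", "alt": "", "link": None}
    for param in params:
        key, sep, value = param.partition("=")
        if sep and key.strip() in ("alt", "link"):
            # Recognised option such as alt=... or link=...
            entry[key.strip()] = value.strip()
        else:
            # Anything else is treated as the caption.
            entry["caption"] = param.strip()
    return entry
```

For the second example above, this yields the link target http://bugzilla.wikimedia.org alongside the caption and alt text.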
Change-Id: I21b276ad5c7a8df13b3a716957d23fd53c37d29e
This reuses the gender state of a user on a page. This is helpful for
special pages which show group names, because each group name
uses gender, which results in frequent use.
Change-Id: I8e816f54aaa100c3333e84e19299fd194323341d
Also add explicit Title::getPrefixedText() in
CoreParserFunctions::special, so that method does not rely on
Title::toString.
Change-Id: I1d041b11386bff15811e19de47a662e5ed7a2b07
- MWCryptRand: A new API for generating cryptographic randomness for security tokens. Uses whatever cryptographic source is available and if not falls back to using random state and clock drift.
- wfRandomString - A simple non-cryptographic pseudo-random string generation function to replace wfGenerateToken which was written pretending to be secure when it's really not.
- Core updates to use MWCryptRand in various places:
-- user_token generation (to do this we stop generating user_token implicitly and only generate it when needed to avoid depleting the system's entropy pool by reading random data we'll never use)
-- email confirmation token generation
-- password salt generation
-- temporary password generation
-- Generation of the automatic watchlist token
-- login and create user tokens
-- session ids when php's entropy sources are not set
-- the installer when generating wgSecretKey and the upgrade key
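The split between the two kinds of generator can be illustrated with Python's standard library. This is an analogy only; the function names are hypothetical and none of this reflects how MWCryptRand or wfRandomString are written:

```python
import random
import secrets
import string


def security_token(num_bytes: int = 16) -> str:
    """Token for security purposes: drawn from the OS's CSPRNG."""
    return secrets.token_hex(num_bytes)


def random_string(length: int = 32) -> str:
    """Non-cryptographic pseudo-random string; never use it for secrets."""
    hex_chars = string.hexdigits[:16]  # "0123456789abcdef"
    return "".join(random.choice(hex_chars) for _ in range(length))
```

Keeping the two behind differently named functions makes it harder to reach for the cheap generator where a secure one is required.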
* Introduced Parser::killMarkers() based on the concept from StringFunctions. Used it in cases where markerStripCallback() doesn't make sense semantically, namely grammar, padleft, padright and anchorencode. Used markerStripCallback() in other cases.
* Changed headline unstrip order as suggested by P.Copp on bug 18295
* In CPF::lc() and CPF::uc(), removed the is_callable(). This was a temporary testing hack committed by me in r30109, which allowed me to do differential testing against a copy of the parser from before that revision.
* Add @since, fix indentation.
* Change default from 'all' to 'mw' as it's the most used (so default fetchLanguageNames() is equivalent to default getLanguageNames()).
* Add the include parameter also to fetchLanguageName() as it's needed in Parser: interlanguage links should only take into account mediawiki names. (Doesn't make a difference with how the functions are now, but could have been later.)
Introduce a global variable which causes language conversion to not be disabled in interface messages (as before r94279). Use $wgContLang for conversion (as before r97849) since $wgContLang is set to the base language (e.g. zh) on converter wikis, whereas a typical user language (e.g. zh-tw) only has a FakeConverter.
* The language object used for lc() in Parser::braceSubstitution() must match the one used in setFunctionHook() during firstCallInit(). It can't change depending on what message you are parsing. Use $wgContLang like before r97849.
* Reduces the overly long code in r107002, and reduces code for {{#language:}}
* Fixes the language list in Special:Translate which contained languages that gave "invalid code" when selecting
We do this by replacing every <link> and <meta> with a <html-link> or <html-meta> element and adding html-link and html-meta to tidy's new-empty-tags config so that Tidy doesn't strip it, and then restoring the <html-*> elements back to normal.
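The rename-and-restore round trip can be sketched with two regular expressions. This is a Python illustration with hypothetical function names; the tidy step itself is left out:

```python
import re


def protect(html: str) -> str:
    """Rename <link>/<meta> to <html-link>/<html-meta> before tidying."""
    return re.sub(r"<(/?)(link|meta)\b", r"<\1html-\2", html)


def restore(html: str) -> str:
    """Turn the placeholder elements back into <link>/<meta> afterwards."""
    return re.sub(r"<(/?)html-(link|meta)\b", r"<\1\2", html)
```

The word boundary keeps longer tag names that merely start with "link" or "meta" from being renamed.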
* Removed 'pcache_miss_invalid' from stats.php and clear_stats.php, no longer used
* Added missing 'job-insert' and 'job-pop' to clear_stats.php
* Added missing call to wfIncrStats( 'pcache_miss_absent' ) when there's no key in ParserCache::get()
* Removed useless 'pcache_not_possible' stat from OutputPage::addWikiTextTitle() since that function is mostly used for interface messages
Fixes the problems with r102179 and r102179, as there are
valid tags which begin the same, which meant they were not removed from
the TOC (the second regex, intended to remove tag parameters, then converted
<img or <blockquote> into <i> / <b>).
The same problem existed in the original regex, but as there are no valid
tags which begin with sup or sub, it never happened.
Added comment explaining the tocline regex, and added a bunch of parser tests.
Fixed this by "abusing" the $options parameter of Linker::link() to pass the Language object (as we did for wfMsgExt()). This has the following two advantages:
* The tooltip is displayed in the requested language instead of depending on $wgLang
* The usage of the Language object is detected in the ParserOptions, thus the parser cache key will not have "*" for the language
{{NAMESPACE}} relative to the Title of the currently being parsed page.
Basically wfMsgForContent expands messages with the wrong title while doing linksupdate stuff via the job queue. For the broken file tracking category (r86534), Wikipedia folk want to sort the page into different categories based on namespace, and for some namespaces not categorize them at all (after all, a broken file link in a talk namespace is often not a bad thing).
Anyhow, explicitly set the title object for the message using wfMessage. There are probably deeper issues here with regard to why wfMsg et al. are using the wrong title, but this should fix the immediate issue.
So far I've encountered 2 extensions that give fatal errors from calling $wgParser->disableOutput() from hooks that are called at points where parsing is not taking place! Exception with a backtrace is much nicer than "Fatal error: Call to a member function disableCache() on a non-object..."
Just skip the whole replaceInternalLinks2 parser function whenever we hit
js/css pages. Previous patch r103476 only handled Category links, which was
not enough.
This patch skips the [[Category:#]] parsing logic when the Title is in
NS_MEDIAWIKI and ends with .js or .css. This way the code is kept as is
and pages are no longer categorized.
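The condition described above can be written down as a small predicate. This is a Python sketch with a hypothetical function name; only the namespace constant's value is taken from MediaWiki:

```python
NS_MEDIAWIKI = 8  # MediaWiki's namespace number for interface messages


def is_css_or_js_page(namespace: int, title_text: str) -> bool:
    """Hypothetical check mirroring the condition described above."""
    return namespace == NS_MEDIAWIKI and title_text.endswith((".js", ".css"))
```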
How to reproduce the issue:
$ echo 'var foo = "[[Category:bug32450]]"' \
| php maintenance/parse.php --title MediaWiki:Foobar.js
<p>var foo = ""
</p>
$
Note how the text got stripped.
After this patch:
$ echo 'var foo = "[[Category:bug32450]]"' \
| php maintenance/parse.php --title MediaWiki:Foobar.js
<p>var foo = "[[Category:bug32450]]"
</p>
$
TEST PLAN:
==========
$ php parserTests.php --quiet
This is MediaWiki version 1.19alpha (r103473).
Reading tests from "tests/parser/parserTests.txt"...
Reading tests from "tests/parser/extraParserTests.txt"...
Passed 654 of 654 tests (100%)... ALL TESTS PASSED!
$
markers.
Not sure the preg_match() is actually needed. Or it may be
appropriate to use MARKER_SUFFIX for the match.
The error message may also need to be rewritten to be more
user-friendly, but I'm pretty sure *an* error message is friendlier
than UNIQ garbage. And making them visible error messages makes them
easier to be found.
* Missed one call to ParserOptions::getUserLang() in Parser
* Also convert RefreshLinksJob and RefreshLinksJob2 to use ParserOptions::newFromUserAndLang() and pass $wgContLang instead of whatever $wgLang could be
* ParserOptions::getUserLang() will still return a string for compatibility, added ParserOptions::getUserLangObj() to get the object
* Added ParserOptions::newFromUserAndLang() and ParserOptions::newFromContext() to easily get a ParserOptions object when a context is available or when someone wants to force the language
* Updated OutputPage and Preferences to use newFromContext() and WikiPage to use newFromUserAndLang()
* ParserOptions::setUserLang() still accepts either a string or a Language object, but changed the calls to pass an object instead of a string
* Changed Parser::getFunctionLang() to return the Language object from ParserOptions when parsing interface messages rather than $wgLang directly and updated the documentation to say that $wgLang should not be used directly (as with $wgUser, $wgTitle and $wgRequest)