(bug 24502) Resolve the various issues with this accidental feature
by removing it. I think it could be done properly, along the lines of
my comment #5, but I don't think just changing the DB schema to make
langlinks non-unique is a good direction to take. A comment on
I4e1e08a3 from Daniel Kinzler indicates that duplicate language links
won't be possible with Wikidata anyway, so there's not much value in
I4e1e08a3 for WMF wikis.
Change-Id: Iba5f3f29e20f5119d4414b1e87ce5eee674701a8
To prevent large template DOM caches from sending servers into swap,
throw an exception when more than some number of DOM elements are
parsed. Unfortunately, it wasn't possible to return a normal error
message, because it broke PST and extractSections and corrupted the
article text. It's safer to refuse to save the edit, and we don't
have decent ways to do that short of throwing an exception.
Ideally we would like to have an upstream patch that hooks libxml to
allocate memory from PHP's request pool, then a fatal error would be
raised instead of swapping.
Change-Id: I4cb4f6fd313e1e0940b56cc5e586afd1bea9267a
This patch marks the regex matching url protocol as being case
insensitive. We will from now render links like [HTTP://ww].
Tests added.
Change-Id: I706acb7a0ae194b50d2318763beae4e5e83671f3
* Also normalized 0 => false for the rev ID parameter in some places.
* Broke some long lines and shorted a variable name in Skin.php.
Change-Id: I6645315699ec7670ae22aa1dbf787d75d6e6b7ec
Replacing wfMsgExt() with wfMessage() in 4e1ccf0 causes an exception on
parse when the defaults are used for $current and $max. I don't know if
there are other similar fatal errors caused by that set of commits.
Change-Id: I84cfdede844bb2dd3c106721b972ed1cd8bfe480
If PHP's PCRE is not compiled with unicode property support, this causes
the regexes used by the parser to not compile, causing the parser to
output giberish. Its been reported that the default PHP package for
cent os has PCRE in such a config.
As a result the installer will output total giberish. The user has
no idea what went wrong because there is no meaningful output.
To counter that, cause Parser to throw an exception in that case.
It seemed easier than figuring out how to convince the installer
not to parse the environment check. For completeness sake though
I fixed the PCRE environment check to adequetely check for PCRE
not having unicode support.
This should be backported to 1.19 since there are quite a few
complaints about the issue on project:Support_desk. /me has
no idea what the procedure for that is in our new git world
Change-Id: Idb1658be4ee6203a55740450e335f570a616671c
* Replaced WikiPage::DATA_FROM_* constants with IDBAccessObject ones.
* Renamed IDBAccessObject constants a bit for visual consistency.
* Removed AVOID_MASTER parameter and replaced calling instances with READ_NORMAL.
Instead of getting page_latest from the master and the revision from a
slave, just get it all from the master in one RTT. Most callers used
AVOID_MASTER (and now READ_NORMAL), so this case is barely hit anymore.
Change-Id: Ifbefdcd4490094b38e49bbb46c95fdb71b5c9e1a
* Made refreshLinksJob2 always spawn smaller jobs. This can reduce
the problem of all runners doing the same refresh jobs by increasing the
granularity of the work to single pages parses per job.
* Avoid master queries when fetching the latest revision for refresh links jobs.
Also avoid the master for template fetching on parse. A LoadBalancer waitFor()
call is used instead. The main reason for hitting the master to fetch templates
was this job itself.
* Fixed bug in refreshLinksJob2 where one missing page would cause all the
remaining updates for pages to be aborted.
* Factored out some code duplication between the two refresh links job classes.
Change-Id: Ieca51567a888f50a6f15b6c2606323da80d6584b
The alignment of image thumbs should follow the page content language instead of the wiki content language.
For this it needs the parser context, and because it makes sense to have it as first parameter, I renamed makeImageLink2() to makeImageLink(), the 2 seemed to be redundant anyway.
The old function name keeps the old behaviour, but can be removed quite soon since almost no extension is using it.
Change-Id: I0c35b06a85528dcc43fdd0578dc9b327c495cf4a
Using the same regex like [[File:|]]
With heigth, the width inside the thumb link can be calculated, if the
height not fit in the width.
Change-Id: If188d923d6cd25ea6a5118098f3a513ca5135d43
When inserting XML elements inline <such as this one>, doxygen chokes
about it not being known. Simply enclosing the tag in double quotes
prevents doxygen from emitting a warning.
Also enclosed a few invalid functions calls such as \. and double quoted
the HTML entities such as &foobar;
Change-Id: I4019637145e683c2bec3d17b2fd98b0c50a932f1
This patch add the hook 'InternalParseBeforeSanitize' which gets called
during Parser's internalParse method just before the parser removes
unwanted/dangerous HTML tags.
Change-Id: If32053f9304088d7943aa0c9e78716a644c34fe1
In order to correctly output an error message that might contain
wikilinks, Cite.php needs a hook that is called after the page is parsed
but before the call to replaceLinkHolders().
Change-Id: Iaa2755f994edb081eb1d176f632f7add41640dbf
Please note on preview of a new page, this magic word will return 0 so
we have to set the vary-revision flag.
Change-Id: I11d42ca773ad84b73cc84f2c7dd2d09f1982d97a
Pre-save transform now accepts full width commas, and a parser test is added,
which passes. Originally done by Conrad Irwin, branched out by Tim in r62689
along with a bunch of other stuff, and then it sat in bugzilla for a few months.
Change-Id: I3302e43bab423835cdaee6bdcfc0252a206490fc
This whitespaces causes an extra empty paragraph between text and transcluding a special page.
When a heading precedes a transcluded special page, there is no difference and it's fine with or without this whitespace.
See for example http://incubator.wikimedia.org/w/index.php?title=Incubator:Sandbox&oldid=822299
Change-Id: I6b06006d921368619d3969660c244176344e8aff
rename MWNamespace::isNonincludableNamespace
to MWNamespace::isNonincludable, because "Namespace" is already in the
class name
Change-Id: Ie982835c7dc84cb10c823996e5360cc1b342f704
Method is a wrapper around $wgNonincludableNamespaces,
replaced the one place in parser and
add it as info to api's meta=siteinfo
Change-Id: I501b811137c39f5c2d9ea35c78fef8ae22d21bfe
With 1.20wmf2 we get a tracking category with all the problem pages,
seeing the limit for a page is a helpful information than
Change-Id: I1916e5fa6de06b923a01cf1f0ca9362287a9fd70
The patch adds an optional parameter |link= to the <gallery>
tag. This will allow for images to link to other pages and
externals urls instead of being hardlinked to the image file
that is displayed in the gallery.
Here are a couple of examples.
Link as WikiLink:
<gallery>
File:20120106_001.jpg|link=Main_Page
</gallery>
Link as absolute URI:
<gallery>
File:20120106_001.jpg|my caption|alt=my alt
text|link=http://bugzilla.wikimedia.org
</gallery>
this would cause the link on the thumbnails rendered by the gallery tag to link
to a custom page/url instead of the actual media/image.
a link should be an internal wiki link or an absolute uri as shown in the examples.
Change-Id: I21b276ad5c7a8df13b3a716957d23fd53c37d29e
- MWCryptRand: A new api for generating cryptographic randomness for security tokens. Uses whatever cryptographic source is available and if not falls back to using random state and clock drift.
- wfRandomString - A simple non-cryptographic pesudo-random string generation function to replace wfGenerateToken which was written pretending to be secure when it's really not.
- Core updates to use MWCryptRand in various places:
-- user_token generation (to do this we stop generating user_token implicitly and only generate it when needed to avoid depleting the system's entropy pool by reading random data we'll never use)
-- email confirmation token generation
-- password salt generation
-- temporary password generation
-- Generation of the automatic watchlist token
-- login and create user tokens
-- session ids when php's entropy sources are not set
-- the installer when generating wgSecretKey and the upgrade key
* Introduced Parser::killMarkers() based on the concept from StringFunctions. Used it in cases where markerStripCallback() doesn't make sense semantically, namely grammar, padleft, padright and anchorencode. Used markerStripCallback() in other cases.
* Changed headline unstrip order as suggested by P.Copp on bug 18295
* In CPF::lc() and CPF::uc(), removed the is_callable(). This was a temporary testing hack committed by me in r30109, which allowed me to do differential testing against a copy of the parser from before that revision.
* Add @since, fix indentation.
* Change default from 'all' to 'mw' as it's the most used (so default fetchLanguageNames() is equivalent to default getLanguageNames()).
* Add the include parameter also to fetchLanguageName() as it's needed in Parser: interlanguage links should only take into account mediawiki names. (Doesn't make a difference with how the functions are now, but could have been later.)
Introduce a global variable which causes language conversion to not be disabled in interface messages (as before r94279). Use $wgContLang for conversion (as before r97849) since $wgContLang is set to the base language (e.g. zh) on converter wikis, whereas a typical user language (e.g. zh-tw) only has a FakeConverter.
* The language object used for lc() in Parser::braceSubstitution() must match the one used in setFunctionHook() during firstCallInit(). It can't change depending on what message you are parsing. Use $wgContLang like before r97849.
Fixes the problems with r102179 and r102179, as there are
valid tags which begin the same, which meant they were not removed from
the TOC (the second regex, intended to remove tag parameters, then converted
<img or <blockquote> into <i> / <b>).
The same problem existed in the original regex, but as there are no valid
tags which begin with sup or sub, it never happened).
Added comment explaining the tocline regex, and added a bunch of parser tests.
{{NAMESPACE}} relative to the Title of the currently being parsed page.
Basically wfMsgForContent expands messages with wrong title while doing linksupdate stuff via job queue. For the broken file tracking category (r86534),Wikipedia folk want to sort the page into different categories based on namespace, and for some namespaces not categorize them at all (After all, a broken file link in a talk namespace is often not a bad thing).
Anyhow, explicitly set the title object for the message using wfMessage. There's probably deeper issues here in regards to why wfMsg et al is using wrong title, but this should fix the immediate issue.
So far I've encountered 2 extensions that give fatal errors from calling $wgParser->disableOutput() from hooks that are called at points where parsing is not taking place! Exception with a backtrace is much nicer than "Fatal error: Call to a member function disableCache() on a non-object..."
Just skip the whole replaceInternalLinks2 parser function whenever we hit
js/css pages. Previous patch r103476 only handled Category links which was
not enough.
This patch skip the [[Category:#]] parsing logic when the Title is in
NS_MEDIAWIKI and ends with .js or .css. This way the code is kept as is
and pages are no more categorized.
How to reproduce the issue:
$ echo 'var foo = "[[Category:bug32450]]"' \
| php maintenance/parse.php --title MediaWiki:Foobar.js
<p>var foo = ""
</p>
$
Note how the text got stripped.
After this patch:
$ echo 'var foo = "[[Category:bug32450]]"' \
| php maintenance/parse.php --title MediaWiki:Foobar.js
<p>var foo = "[[Category:bug32450]]"
</p>
$
TEST PLAN:
==========
$ php parserTests.php --quiet
This is MediaWiki version 1.19alpha (r103473).
Reading tests from "tests/parser/parserTests.txt"...
Reading tests from "tests/parser/extraParserTests.txt"...
Passed 654 of 654 tests (100%)... ALL TESTS PASSED!
$
* Missed one call to ParserOptions::getUserLang() in Parser
* Also convert RefreshLinksJob and RefreshLinksJob2 to use ParserOptions::newFromUserAndLang() and pass $wgContLang instead of whatever $wgLang could be
* ParserOptions::getUserLang() will still return a string for compatibility, added ParserOptions::getUserLangObj() to get the object
* Added ParserOptions::newFromUserAndLang() and ParserOptions::newFromContext() to easily get a ParserOptions object when a context is available or when someone wants to force the language
* Updated OutputPage and Preferences to use newFromContext() and WikiPage to use newFromUserAndLang()
* ParserOptions::setUserLang() still accepts either a string or a Language object, but changed the calls to pass an object instead of a string
* Changed Parser::getFunctionLang() to return the Language object from ParserOptions when parsing interface messages rather than $wgLang directly and updated the documentation to say that $wgLang should not be used directly (as $wgUser, $wgTitle and $wgRequest)
* Made variant tabs hidden on special pages (has no or sometimes wrong/inconsistent effect there; plus it's mainly in the user language)
* Made redirects be in content language again (remove from Title->getPageLanguage())
* Introduce a boolean parameter to wfUrlProtocols() which, if set to false, will cause '//' to be dropped from the returned regex so it doesn't match protocol-relative URLs
* Introduce wfUrlProtocolsWithoutProtRel() as a wrapper for wfUrlProtocols( false ). The latter should not be used directly because the former is much clearer
* Use this new function in Parser::doMagicLinks() to fix the original bug. Also use it in ApiFormatBase::formatHTML() and CodeCommentLinker::link(), which probably had similar bugs
Simple use case (PHP 5.3+) that will work show the expand text passed to a <preprocess> tag:
$wgHooks['ParserFirstCallInit'][] = function( $parser ) {
$parser->setHook( 'preprocess', function( $text, $attr, $parser, $frame ) {
return $parser->recursivePreprocess( $text, $frame );
} );
return true;
};
* Parser.php: Apparently that was a much bigger bug: do not convert interface text going through the parser either.
* Preferences.php: Do not convert the user signature.
For this bug in action, see e.g. http://sr.wikipedia.org/sr-ec/Посебно:Подешавањ?uselang=en (e.g. "Username" -> "Усернаме")
* Changed SpecialPageFactory::capturePath() to take the same parameters as SpecialPageFactory::executePath() (except the last one) and made it always return a bool instead of bool-or-string. The result HTML can be fetched from the OutputPage object of the context.
* Added module styles, scritps and messages members to ParserOutput in addition to modules. The first one is needed to display Special:RecentChanges correctly when transcluded since EnhancedChangesList::beginRecentChangesList() calls addModuleStyles( 'mediawiki.special.changeslist' )
Yes, this means that you can use {{Special:Recentchanges|enhanced=0}} to use the old changes list. For the ones that wonder, {{Special:Recentchanges|uselang=something}} will not work since the Language object is forced to the one used by the parser.
Breaks unit tests as below, not going to be able to fix them before I disappear for the evening, so might aswell leave trunk clean
ArticleTablesTest testbug14404
Error:
ArticleTablesTest::testbug14404
Undefined offset: 0
/home/ci/cruisecontrol-bin-2.8.3/projects/mw/source/tests/phpunit/includes/ArticleTablesTest.php:31
/home/ci/cruisecontrol-bin-2.8.3/projects/mw/source/tests/phpunit/MediaWikiTestCase.php:60
/home/ci/cruisecontrol-bin-2.8.3/projects/mw/source/tests/phpunit/MediaWikiPHPUnitCommand.php:20
/home/ci/cruisecontrol-bin-2.8.3/projects/mw/source/tests/phpunit/phpunit.php:60
ParserTests testParserTest #552 - testParserTest with data set #551
Failure:
ParserTests::testParserTest with data set #551 ('RAW magic word', '{{RAW:QUERTY}}', '<p><a href="/index.php?title=Template:QUERTY&action=edit&redlink=1" class="new" title="Template:QUERTY (page does not exist)">Template:QUERTY</a>
</p>', '', '')
RAW magic word
Failed asserting that two strings are equal.
--- Expected
+++ Actual
@@ @@
-<p><a href="/index.php?title=Template:QUERTY&action=edit&redlink=1" class="new" title="Template:QUERTY (page does not exist)">Template:QUERTY</a>
+<p><a href="/index.php?title=Template:RAW:QUERTY&action=edit&redlink=1" class="new" title="Template:RAW:QUERTY (page does not exist)">Template:RAW:QUERTY</a>
</p>
/home/ci/cruisecontrol-bin-2.8.3/projects/mw/source/tests/phpunit/includes/parser/NewParserTest.php:545
/home/ci/cruisecontrol-bin-2.8.3/projects/mw/source/tests/phpunit/MediaWikiTestCase.php:60
/home/ci/cruisecontrol-bin-2.8.3/projects/mw/source/tests/phpunit/MediaWikiPHPUnitCommand.php:20
/home/ci/cruisecontrol-bin-2.8.3/projects/mw/source/tests/phpunit/phpunit.php:60
Fix 19052 which was only reporting the issue for U+3000 IDEOGRAPHIC SPACE.
Covers both external links and images links. See parser tests for examples.
Unicode 'Zs' includes all characters from the 'separator, space' category.
Characters part of this category are:
Char Name
U+0020 SPACE
U+00A0 NO-BREAK SPACE
U+1680 OGHAM SPACE MARK
U+180E MONGOLIAN VOWEL SEPARATOR
U+2000 EN QUAD
U+2001 EM QUAD
U+2002 EN SPACE
U+2003 EM SPACE
U+2004 THREE-PER-EM SPACE
U+2005 FOUR-PER-EM SPACE
U+2006 SIX-PER-EM SPACE
U+2007 FIGURE SPACE
U+2008 PUNCTUATION SPACE
U+2009 THIN SPACE
U+200A HAIR SPACE
U+202F NARROW NO-BREAK SPACE
U+205F MEDIUM MATHEMATICAL SPACE
U+3000 IDEOGRAPHIC SPACE
TEST PLAN:
$ php parserTests.php --quiet
This is MediaWiki version 1.19alpha (r93258).
Reading tests from "tests/parser/parserTests.txt"...
Reading tests from "tests/parser/extraParserTests.txt"...
Reading tests from "../mwexts/LabeledSectionTransclusion/lstParserTests.txt"...
Passed 686 of 686 tests (100%)... ALL TESTS PASSED!
Sounds good :-)
Added a hook that's called for each section. This allows sections to be addressed separately at the document level, for example by wrapping each in a div (as in the reverted r50769). This in-turn enables richer section-specific UI, like highlighting or in-line editing.
* Make TOC numberings be in the page content language instead of wiki content language.
* Update getPageLanguage() and add a hook (for bug 9360/28970).
* Show redirects (when viewing a page with &redirect=no) in the user language direction (not essential but nicer imo).
* Fixed parse() arguments in getRevIncludes()
* Changed clearTagHook() to avoid preprocessed-xml cache corruption
* Check current version cache in getRevIncludes()
* Encapsulate index.php in wfIndexMain() (similar to r77873)
* Kill $wgArticle check in Exception, not necessary anymore
* Kill $wgArticle in Setup, also not necessary
* Add angry note about $wgArticle to rebuildFileCache.
* Remove note about $wgArticle in Parser since it's dying anyway
* Parser now only adds the redirect source to imagelinks, like it does in pagelinks
* ImagePage now shows the file redirects in-line with the normal "The following pages link to this file:"
** Added message linkstoimage-redirect
** Removed the separate file redirects section and removed associated message redirectstofile
** ImagePage::imageLinks will first fetch image links to the file, determine which are redirects, and if there are fewer links than the limit, fetch the redirect target links.
Added notes about the 'extended' return values from some of the hooks, which seems to be undocumented but is vaguely similar to what is documented for parser function callbacks. It doesn't look like a very safe interface, and could stomp on internal variables if one isn't careful. Needs to be better defined and doc'd.
setTransparentTagHook() seems to have been largely undocumented and is only used by the 'geoserver' extension currently. Is this considered obsolete? Should it be simply replaced by use of the frame & recursiveTagParse goodies, or is it still needed for something?
* (bug 16129) Transcluded special pages expose strip markers when they output parsed messages
Also adding some related documentation during my travels around the code
Also make a few changes to the functions available. SpecialPageFactory::resolveAlias() now takes an optional subpage and returns array(<name>,<subpage>). Similarly merge getPage() and getPageByAlias(). There were many examples of (extensions particularly) making dubious assumptions about the presence or absence of subpages or canonical-ness.
I didn't deprecate SpecialPage::getTitleFor() as it's got over six hundred calls. I'm rather undecided on the best position of getPage()/executePath(). Although the latter needs cleanup anyway.