This reverts r45470 and fixes the problems it identified. Things should
now work as they always used to, but with less code duplication, and
with $wgEnforceHtmlIds = false working correctly as well.
This means that old links will still work in a lot of cases. However,
if the legacy anchor is invalid XML, we omit it. In particular, on
non-Latin wikis, practically all old section links (from external
sources or using external links) are likely to break, since the first
character of legacy anchors starting with a non-ASCII character will be
".", which is invalid in XML as well as in HTML4.
r45267 commented out the logic prohibiting numbers and so on at the
start of id's. I've uncommented this logic, but passed 'noninitial' as
an option to escapeId() in all necessary circumstances. This shouldn't
change behavior; it is meant to simplify some further work I'm about to do.
Fixes other cases broken by Parser's assumptions failing to hold after the change in Title::isAlwaysKnown()'s behavior:
* Links to invalid Special: pages were being recorded, but shouldn't
* Links to valid MediaWiki: pages were no longer recorded
Instead of the NS_FILE special-case in r45174, I'm just tossing *all* isAlwaysKnown links over to ParserOutput::addLink(), and letting the latter worry about what types of titles it won't record.
Just for good measure, in case any NS_MEDIA titles make it into ParserOutput::addLink() they'll be normalized to NS_FILE.
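As a rough illustration of the filtering described above, here is a minimal Python sketch (not the actual PHP). The class and method names are hypothetical, and it assumes that Special: titles are the type addLink() declines to record. The namespace numbers are MediaWiki's usual values.

```python
# MediaWiki's usual namespace numbers.
NS_MEDIA, NS_SPECIAL, NS_FILE = -2, -1, 6


class ParserOutputSketch:
    """Hypothetical stand-in for ParserOutput; only link recording is modeled."""

    def __init__(self):
        self.links = {}  # namespace -> {dbkey: page id}

    def add_link(self, ns, dbkey, page_id=0):
        if ns == NS_MEDIA:
            # Media: titles are normalized to the File: namespace.
            ns = NS_FILE
        elif ns == NS_SPECIAL:
            # Assumption: special pages are never recorded as links.
            return
        self.links.setdefault(ns, {})[dbkey] = page_id
```

With this shape the caller can hand over every always-known link and leave the decision about what to record to a single place.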
This adds a config option, $wgEnforceHtmlIds, true by default. If this
is set to false, all characters that are allowed in XML ids are let
through in header ids and manually-specified ids. In particular, this
should include all alphabetic and numeric characters.
Some remaining issues to work out:
* This will cause backward-compatibility issues for some types of links
and references: links from non-MediaWiki sources, links from MediaWiki
sources running a different version, external links, and references from
stylesheets/scripts. These could be partially alleviated by having a
second <a name="" id=""> for headers where the two versions differ, but
it would remain an issue for manually-specified id's.
* Any invalid characters are now, effectively, stripped (replaced with
underscores). This might cause problems if some writing systems are
invalid in id's for some reason: we'll want to double-check the list of
prohibited characters carefully.
* Some user agents might not support these links. IE5 appears to, and
so do recent versions of Opera and Firefox, but I didn't do extensive
testing.
* Not tested extensively, so there are probably some bugs.
I think this would be good to enable on testwiki for the moment to see
how it goes.
No parser test regressions. No change to RELEASE-NOTES, we can add that
when the option is enabled by default (ideally, removed entirely).
Calling it with no extra arguments will now assume that you're escaping
a whole id, not an id fragment, which is safer. Also, instead of ugly
bitfield-based options, I've changed the options to use an array of
strings. I fixed all callers in trunk. Out-of-tree callers that were
using Sanitizer::NONE will get correct behavior, while those that were
calling it with no arguments will get slightly changed behavior (an x
will be prepended). I think this is harmless enough that we can skip
back-compat cruft here.
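To make the new calling convention concrete, here is a minimal Python sketch of the idea (not the PHP implementation). The function name escape_id is hypothetical, and the byte-escaping rule (".XX" for anything outside ASCII letters, digits and "-_.:") is an assumption about how the legacy encoding behaves.

```python
def escape_id(raw, options=()):
    """Escape text for use as a legacy-style id.

    options is a sequence of strings. 'noninitial' marks raw as a fragment
    that will not start the final id, so no leading letter is forced.
    """
    pieces = []
    for byte in raw.replace(" ", "_").encode("utf-8"):
        ch = chr(byte)
        if byte < 128 and (ch.isalnum() or ch in "-_.:"):
            pieces.append(ch)
        else:
            # Assumed rule: every other byte becomes ".XX" (hex).
            pieces.append(".%02X" % byte)
    escaped = "".join(pieces)
    # A whole id must start with a letter, so prepend "x" when it does not.
    if "noninitial" not in options and not escaped[:1].isalpha():
        escaped = "x" + escaped
    return escaped
```

Calling escape_id("1st") with no options treats the input as a whole id and prepends an x, while escape_id("1st", ["noninitial"]) leaves it alone. A fragment starting with a non-ASCII character comes out starting with ".", which is the case noted earlier as invalid at the start of an id.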
This should cause no visible changes. No parser test regressions.
This will break any preexisting links to such sections (other than
those generated by the software, of course). There should be no parser
test regressions.
Behavior seems a bit hard to predict, as far as what's going to go in the header and what in the browser window etc. Pulling it back for further testing and discussion.
This is a global search and replace of NS_IMAGE and NS_IMAGE_TALK with NS_FILE and NS_FILE_TALK respectively in all core files, excluding those already updated in step 1 (r44004).
In PPCustomFrame_DOM and PPCustomFrame_Hash, no checking is performed in getArgument() when arguments not contained in frames are requested, causing PHP undefined variable error messages. This happens while expanding templates using a custom frame.
A simple check is needed using isset(), just like those found in PPFrame_* and PPTemplateFrame_*.
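The guard amounts to checking for the key before reading it. A minimal Python sketch of the same idea follows; the class name is hypothetical, and returning False for a missing argument is an assumption modeled on the usual PHP convention.

```python
class CustomFrameSketch:
    """Hypothetical stand-in for a custom preprocessor frame."""

    def __init__(self, args):
        self.args = dict(args)

    def get_argument(self, name):
        # Equivalent of the isset() check: never index a missing key.
        if name not in self.args:
            return False
        return self.args[name]
```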
* Use wfMsg to show the error message in the user's preferred language; wfMsgForContent does not make sense here
* Rename the new message to follow MediaWiki coding standards: use dashes, not underscores
The namespace parsing thing feels very hacky and grabs bits out of an internal implementation function which doesn't feel like a stable interface.
Would recommend thinking about this and coming up with a more serious stable interface for it.
** <poem> handling fixes: use DoubleReplacer class instead of create_function(), move recursiveTagParse above line-break replacements, remove strip items, update parser tests to reflect new output when combined with <nowiki>
Alt text is now set in the following ways, in decreasing priority:
1) Set to the alt= parameter if present.
2) Set to the unnamed (caption) parameter if present, and if the image does not have the thumb or frame option set (i.e., if the unnamed parameter is not actually being used for a caption -- using it as both caption and alt text would just lead to text being repeated).
3) Set to the empty string.
Title text and captions should not be affected in any case. The only backward-compatibility effect (i.e., on images not using the new alt= syntax) should be that if previously the same text was repeated in the alt text and then again in the caption, the alt text will now be empty. Setting the alt parameter should never change the HTML output compared to not setting it, except of course changing the alt text.
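The priority order above can be written down as a small decision function. This is a Python sketch only; the function name and the dictionary keys ("alt", "caption", "thumb", "frame") are hypothetical and do not reflect how the parser actually stores image parameters.

```python
def choose_alt_text(params):
    """Pick alt text from parsed image parameters, highest priority first."""
    # 1) An explicit alt= parameter always wins.
    if "alt" in params:
        return params["alt"]
    # 2) The unnamed parameter is used only when it is not shown as a caption.
    shown_as_caption = "thumb" in params or "frame" in params
    if "caption" in params and not shown_as_caption:
        return params["caption"]
    # 3) Otherwise the alt text is empty.
    return ""
```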
All parser tests pass, except the usual ones.
* Renamed to "link", which seems clearer and less mouse-centric ;)
* Added parser test cases:
3 new PASSING test(s) :)
* Image with link parameter, wiki target [Has never failed]
* Image with link parameter, URL target [Has never failed]
* Image with empty link parameter [Has never failed]
* Don't call quickUserCan('edit') unless section edit is enabled
* In DatabasePostgres and DatabaseSqlite: throw an exception on connection error
* In DatabasePostgres: don't send an invalid connection string whenever one of the fields is empty. Use quoting.
* In Database: make the captured PHP error prettier
* Display a descriptive error message when the user navigates to index.php with PHP 4, not a parse error. Check to see if the *.php5 extension works, using file_get_contents().
* The default port number for PostgreSQL is 5432, not blank.
* Better default for $wgDBname
The caption was originally defined *as* the alt text (defaulting to the image file name if there is no alt text). Note that a separate caption text is only displayed in some display modes ('frame' and 'thumb', iirc), and not by default.
Please run the parser tests and check the effect you have on them. If it's really an appropriate change, then update the test cases. If you're not sure, consider backing out pending further discussion. :)
It might be appropriate to not set the 'alt' attribute for frame/thumb cases, but definitely not for inline images where we already have a way of setting the alt text which you're removing!
* Removed the namespace parameter from Linker::makeExternalLink(), added a generic associative array of attributes instead. Let the Parser decide whether to use rel=nofollow.
* Added an on-wiki external image whitelist. Items in this whitelist are
treated as regular expression fragments to match for when possibly
displaying an external image inline. Controlled by $wgEnableImageWhitelist
(true by default)
* Merged replaceFreeExternalLinks() with doMagicLinks(). Makes a lot of sense, very similar operations, doesn't break any parser tests. Stops free links from interacting with other parser stages, the same way ISBN links don't.
* The pass order change fixes Brion's complaint in r39980. Early link expansion, triggered by having more than 1000 links in the page, was outputting URLs which were destroyed by RFEL. Added parser test.
* Fixed an unrelated bug in LinkHolderArray::replace(): if a link to a redirect appears in two separate RLH calls, the second and subsequent calls do not add the mw-redirect class. Caused by an unmigrated LinkCache fetch.
* Added a parser test for a pass interaction bug that the pass order change fixes.
* The fuzzer told me to tell you that free external links in non-caption image parameters, which are and have always been invisible, are now not registered either.
* Miscellaneous supporting updates to the test infrastructure.
Causes weird regressions on http://meta.wikimedia.org/wiki/Talk:Spam_blacklist
Couldn't isolate to a parser test in a few minutes; some kind of template interaction perhaps.
Sample bad HTML like:
The associated page is used by the Mediawiki <a href="<a href=" class="external free" title="http://www.mediawiki.org/wiki/Extension:SpamBlacklist" rel="nofollow">http://www.mediawiki.org/wiki/Extension:SpamBlacklist</a>" class="extiw" title="mw:Extension:SpamBlacklist">Spam Blacklist extension, and lists strings of text that may not be used in URLs in any page in Wikimedia Foundation projects (as well as many external wikis). Any meta <a href="/wiki/Administrator" title="Administrator">administrator</a> can edit the spam blacklist. There is also a more aggressive way to block spamming through direct use of <a href="/wiki/Anti-spam_features#.24wgSpamRegex" title="Anti-spam features">$wgSpamRegex</a>. Only <a href="/wiki/Developers" title="Developers" class="mw-redirect">developers</a> can make changes to $wgSpamRegex, and its use is to be avoided whenever possible.
* Fixed image link whitespace handling (Brion's complaint, r39662)
* Added fuzz test capability to parserTests.php
* Added __destruct() functions to Parser and Language, and called them explicitly from parserTests.inc, to avoid unconstrained memory usage during fuzz testing.
* Added unified diff to output of Parser_DiffTest
* Fixed whitespace change in Parser::doTableStuff() (found by fuzzing)
* Added feature to RELEASE-NOTES which I'd committed last time but forgotten to note: <gallery> will accept image names with no "Image:" prefix (rediscovered by fuzzing)
* Limit memory usage in Title::getInterwikiLink()
* Fixed chronic fail of all interwiki link parser tests (hid Siebrand's complaint, r39464)
* Fixed chronic fail of one of the LanguageConverter parser tests. Was actually an ignored bug.
This will allow us to develop a new method of parsing links (non-hacky), without breaking current syntax, and also allow us to make sure new methods don't break syntax.
Currently this class merely inherits from the Parser class. Constants and static functions are copied so that use of self:: won't break when we modify things.
* Split link placeholder/replacement handling into a separate object, LinkHolderArray.
* Remove Title objects from LinkCache, they apparently weren't being used at all. Same unconstrained memory usage as the former $parser->mLinkHolders.
* Introduced ExplodeIterator -- a workalike for explode() which doesn't use a significant amount of memory
* Introduced StringUtils::explode() -- select whether to use the simulated or native explode() depending on how many items there are
* Migrated most instances of explode() in Parser.php to StringUtils::explode()
* Renamed some variables in Parser::doBlockLevels()
* In Parser.php: $fname => __METHOD__, Parser => self/__CLASS__, to support Parser_DiffTest more easily
* Doc update in includes/MessageCache.php for r39412
* MW_TITLECACHE_MAX => Title::CACHE_MAX, nicer name, easier to access from another module
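The ExplodeIterator and StringUtils::explode() items above can be pictured with a short Python sketch: a lazy splitter that yields one piece at a time, and a wrapper that picks between it and the native split. The function names and the threshold of 1000 separators are assumptions for illustration, not taken from the commit.

```python
def explode_iter(separator, subject):
    """Yield the pieces of subject lazily instead of building a full list."""
    if separator == "":
        raise ValueError("empty separator")
    start = 0
    while True:
        end = subject.find(separator, start)
        if end == -1:
            yield subject[start:]
            return
        yield subject[start:end]
        start = end + len(separator)


def explode(separator, subject, threshold=1000):
    """Use the lazy splitter only when there are many pieces."""
    if subject.count(separator) > threshold:
        return explode_iter(separator, subject)
    return iter(subject.split(separator))
```

Both paths return an iterator over the same pieces, so callers that loop over the result do not need to know which one was chosen.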
Warning: Parameter 2 to Parser::parse() expected to be a reference, value given in ./includes/StubObject.php on line 58
followed by:
Fatal error: Call to a member function getLanguageLinks() on a non-object in ./includes/OutputPage.php on line 463
allows having the unprefixed page title as the default category sortkey.
Although creating sane defaults should always be preferred over introducing
new config options, we cannot just remove the old behaviour here, as some
people might still rely on it. However, the sortkey {{PAGENAME}} is already
widely used for circumventing the current behaviour.
This broke four test cases:
4 previously failing test(s) now PASSING! :)
* Right-aligned image [Fixed between 08-Aug-2008 21:37:38, 1.14alpha (r38954) and now]
* Centre-aligned image [Fixed between 08-Aug-2008 21:37:38, 1.14alpha (r38954) and now]
* None-aligned image [Fixed between 08-Aug-2008 21:37:38, 1.14alpha (r38954) and now]
* Width + Height sized image (using px) (height is ignored) [Fixed between 08-Aug-2008 21:37:38, 1.14alpha (r38954) and now]
Please recommit with fixes to the existing test cases and some new test cases to cover cases where an empty caption is explicitly requested, see https://bugzilla.wikimedia.org/show_bug.cgi?id=2443#c11
* Support class="extiw"
* Do not double the fragment for external links with fragments. Move the code for this into a new Title::getLinkUrl() instead of a Linker method, because it seems like a useful concept to be able to get a *usable* link to the current Title.
* Fix a few parser tests that expected attributes in the opposite order.
* Don't overwrite actions for broken links.
* Style
If you want me to stop doing this, by the way, please say so before I spend too many more hours of my life on it.
I don't really like this in general; the API isn't meant for the UI and there should be little to no call to link to it from body content.
Additionally, I believe we're trying to move all new parser functions to the convention of using the # prefix to avoid conflict with the template namespace.
Causes regressions in 19 parser test cases; looks like it is messing up the tooltips for section edit links.
19 previously failing test(s) now PASSING! :)
* Bug 6563: Edit link generation for section shown by <includeonly> [Fixed between 29-Jul-2008 22:42:06, 1.14alpha (r38207) and 29-Jul-2008 23:54:51, 1.14alpha (r38207)]
* Bug 6563: Edit link generation for section suppressed by <includeonly> [Fixed between 29-Jul-2008 22:42:06, 1.14alpha (r38207) and 29-Jul-2008 23:54:51, 1.14alpha (r38207)]
* Basic section headings [Fixed between 29-Jul-2008 22:42:06, 1.14alpha (r38207) and 29-Jul-2008 23:54:51, 1.14alpha (r38207)]
* Section headings with TOC [Fixed between 29-Jul-2008 22:42:06, 1.14alpha (r38207) and 29-Jul-2008 23:54:51, 1.14alpha (r38207)]
* Handling of sections up to level 6 and beyond [Fixed between 29-Jul-2008 22:42:06, 1.14alpha (r38207) and 29-Jul-2008 23:54:51, 1.14alpha (r38207)]
* TOC regression (bug 9764) [Fixed between 29-Jul-2008 22:42:06, 1.14alpha (r38207) and 29-Jul-2008 23:54:51, 1.14alpha (r38207)]
* TOC with wgMaxTocLevel=3 (bug 6204) [Fixed between 29-Jul-2008 22:42:06, 1.14alpha (r38207) and 29-Jul-2008 23:54:51, 1.14alpha (r38207)]
* Resolving duplicate section names [Fixed between 29-Jul-2008 22:42:06, 1.14alpha (r38207) and 29-Jul-2008 23:54:51, 1.14alpha (r38207)]
* Resolving duplicate section names with differing case (bug 10721) [Fixed between 29-Jul-2008 22:42:06, 1.14alpha (r38207) and 29-Jul-2008 23:54:51, 1.14alpha (r38207)]
* Template with sections, __NOTOC__ [Fixed between 29-Jul-2008 22:42:06, 1.14alpha (r38207) and 29-Jul-2008 23:54:51, 1.14alpha (r38207)]
* Link inside a section heading [Fixed between 29-Jul-2008 22:42:06, 1.14alpha (r38207) and 29-Jul-2008 23:54:51, 1.14alpha (r38207)]
* TOC regression (bug 12077) [Fixed between 29-Jul-2008 22:42:06, 1.14alpha (r38207) and 29-Jul-2008 23:54:51, 1.14alpha (r38207)]
* Fuzz testing: Parser14 [Fixed between 29-Jul-2008 22:42:06, 1.14alpha (r38207) and 29-Jul-2008 23:54:51, 1.14alpha (r38207)]
* Fuzz testing: Parser14-table [Fixed between 29-Jul-2008 22:42:06, 1.14alpha (r38207) and 29-Jul-2008 23:54:51, 1.14alpha (r38207)]
* Inclusion of !userCanEdit() content [Fixed between 29-Jul-2008 22:42:06, 1.14alpha (r38207) and 29-Jul-2008 23:54:51, 1.14alpha (r38207)]
* Out-of-order TOC heading levels [Fixed between 29-Jul-2008 22:42:06, 1.14alpha (r38207) and 29-Jul-2008 23:54:51, 1.14alpha (r38207)]
* -{}- tags within headlines (within html for parserConvert()) [Fixed between 29-Jul-2008 22:42:06, 1.14alpha (r38207) and 29-Jul-2008 23:54:51, 1.14alpha (r38207)]
* Morwen/13: Unclosed link followed by heading [Fixed between 29-Jul-2008 22:42:06, 1.14alpha (r38207) and 29-Jul-2008 23:54:51, 1.14alpha (r38207)]
* HHP2.2: Heuristics for headings in preprocessor parenthetical structures [Fixed between 29-Jul-2008 22:42:06, 1.14alpha (r38207) and 29-Jul-2008 23:54:51, 1.14alpha (r38207)]
Bug 14965 - https://bugzilla.wikimedia.org/show_bug.cgi?id=14965
PHP Catchable fatal error: Argument 1 passed to Title::equals() must be an instance of Title, null given, called in /usr/local/apache/common-local/php-1.5/includes/Linker.php on line 1323 and defined in /usr/local/apache/common-local/php-1.5/includes/Title.php on line 3003
$wgTitle isn't available in this sort of background rendering.
Make Linker::doEditSectionLink() public, and change its interface to be like that of editSectionLink(). Use that in Parser (which is the only place that uses the old functions that I can find), and mark the old two functions deprecated. Add a hook 'DoEditSectionLink' with a new, clean interface, which is run immediately before the return so it can override the whole function. Advise people in hooks.txt to use the new hook, not the old ones.
* Currently __INDEX__ will override __NOINDEX__ regardless of their relative positions, due to the way things are written. Instead, the last one on the page should win. This should be pretty easy to fix.
* __INDEX__ and __NOINDEX__ override $wgArticleRobotPolicies. This is almost certainly incorrect, but it's not totally obvious how to fix it, because of the way the code is structured. Probably not a big deal, but should probably be fixed at some point.
* Anyone can add and remove the magic words, and there's no config option to disable them. It's not obvious whether this is okay or not. It would be a one-line change to OutputPage.php to have a config option to ignore the magic words, maybe per-namespace or who knows what.
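The "last one on the page should win" rule from the first item could look roughly like the following Python sketch. It describes the proposed behavior, not what the code currently does, and the function name and return values are hypothetical.

```python
def robot_policy_from_magic_words(wikitext, default="index"):
    """Return 'index' or 'noindex' based on whichever magic word appears last."""
    last_index = wikitext.rfind("__INDEX__")
    last_noindex = wikitext.rfind("__NOINDEX__")
    if last_index == -1 and last_noindex == -1:
        # Neither magic word is present, so keep the configured default.
        return default
    return "index" if last_index > last_noindex else "noindex"
```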
This implements an mb_str_pad fallback function, but there is no mb_str_pad in PHP documentation, and the doc comments are really weird -- it says it returns an integer!
If this function is created from whole cloth and doesn't exist in PHP, it should be given a MediaWiki style name and not be done with a function_exists check as though it were a compat function.
* Use an associative array to initialise LoadBalancer objects
* By default, use Preprocessor_DOM if available, otherwise use Preprocessor_Hash. Preprocessor_Hash has worse performance.
* Fix parserTests.php for replicated databases. Use CREATE TABLE instead of CREATE TEMPORARY TABLE if there is more than one server configured.
* Log exceptions even in command-line mode.
This hook allows for modification of the title and text of a template which is being transcluded.
Use of this hook will allow extensions to create features such as TransWiki for an alternative to ScaryTransclusions.
This hook seems a bit oddly placed to me; the template gets fetched locally, and *then* we give the opportunity to fetch it remotely instead? Just seems to be in the wrong order, and pretty unclear.
This hook allows for modification of the title and text of a template which is being transcluded.
Use of this hook will allow extensions to create features such as TransWiki for an alternative to ScaryTransclusions.
$url and $alt parameters in makeExternalImage() are now normalized to be escaped on output instead of before they reach the function. This ensures that any hooks processing them won't accidentally send plaintext which might become an injection vector, or just get confused on pre-escaped input they didn't expect.
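A minimal Python sketch of the escape-on-output idea: the function receives plain text and escapes it only at the point where HTML is built. The function name mirrors makeExternalImage() but is hypothetical, as is the exact markup it returns.

```python
from html import escape


def make_external_image(url, alt=""):
    """Build an <img> tag from unescaped input, escaping at output time."""
    return '<img src="%s" alt="%s" />' % (
        escape(url, quote=True),
        escape(alt, quote=True),
    )
```

Because callers and hooks only ever see the raw values, there is no ambiguity about whether a string has already been escaped.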
* Wrote a tool to check the integrity of the autoloader class list, fixed some issues that came up.
* Start the autoloader before LocalSettings.php, so that when an extension writer thinks an inefficient one-file special page extension is the way to go, they don't have to use explicit includes to make the class inheritance work. Should continue to work with $IP set in LocalSettings.php as long as $IP is set before extensions are included.