Previously things like "192.168.1.1" couldn't be searched very cleanly in the MySQL backend for two reasons:
* First, the periods were stripped out. This resulted in it being broken into multiple short words: "192 168 1 1", leading at best to false positives and general weirdness.
* Second, for IP addresses the resulting pieces were shorter than the default minimum word length of 4 and thus didn't even get indexed!
The addition of padding for short words let them at least get indexed, but they still didn't turn up cleanly due to the word split. Periods are now allowed through to the indexed text, and periods that appear within a compound word are encoded so they get caught more cleanly.
Also made a tweak so highlighting works a bit better on word boundaries -- e.g. "192.168.1.1" no longer hits a highlight match for "192.168.1.100". However, some cases with periods are still not handled 100%. Sigh.
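A rough sketch of the period encoding idea, in Python rather than MediaWiki's PHP. The function name and the "u82e" replacement token are placeholders chosen for illustration, not taken from the commit. The lookarounds are one way to make sure adjacent periods, as in "1.1.1", are all encoded:

```python
import re

# Hypothetical replacement token; the real backend may use a different one.
PERIOD_TOKEN = "u82e"

def encode_inner_periods(text: str) -> str:
    """Replace periods that sit between two word characters.

    Periods at the end of a sentence (followed by a space or nothing)
    are left alone, so only compound words such as IP addresses and
    hostnames are affected.
    """
    return re.sub(r"(?<=\w)\.(?=\w)", PERIOD_TOKEN, text)
```

With this, "192.168.1.1" becomes a single indexable word instead of four short ones, while a sentence-ending period is untouched.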
requiring customization of MySQL server settings
Short words are padded so they now get indexed. Yay!
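The padding can be sketched along these lines, again in Python for illustration only. The 'U00' pad and the ft_min_word_len check come from the notes here; the function name, and passing the minimum length as a parameter instead of reading it from the server, are assumptions:

```python
import re

PAD = "U00"  # pad string mentioned in these notes

def pad_short_words(text: str, min_word_len: int = 4) -> str:
    """Append PAD to every word shorter than min_word_len.

    min_word_len stands in for the server's ft_min_word_len value.
    If it is 1 or less, no word can be too short, so the text is
    returned unchanged and existing indexes stay compatible.
    """
    if min_word_len <= 1:
        return text

    def pad(match: re.Match) -> str:
        word = match.group(0)
        while len(word) < min_word_len:
            word += PAD
        return word

    return re.sub(r"\w+", pad, text)
```

The same padding has to be applied to search terms as well as indexed text, otherwise a query for a short word would never match its padded form.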
Adapted part of Werdna's patch, with some additional cleanup:
* Using 'U00' to pad instead of 'SMALL' to reduce false positives (e.g. a search for "small*" could match "Smallville" and "SMALLc")
* Checking server's ft_min_word_len variable to see if we need to do anything. This preserves index compatibility with existing installations which have customized their index length.
* Some further cleanup on redundant code -- just toss everything through lc() and be done with it :D
* Cleaned out some more evals in zh and yue classes :P
* Fixed yue class to call the parent adjustor properly
* Updated message 'and' for all languages to keep behaviour the same, no change for 'ksh' (wanted behaviour), changed 'en' (trailing comma).
* Added message 'word-separator' as optional message. Just a space for all languages at the moment.
* Two callers of wfArrayMerge() were bugs, both assuming strange and complex behaviour in wfArrayMerge() which has never been present or documented.
* Introduced wfMergeErrorArrays() to remove duplicates from merged error arrays, e.g. from getUserPermissionsErrors().
* Rewrote the remaining callers of wfArrayMerge() to use array plus. It makes the code clearer, assuming the reader knows more about basic PHP operators than GlobalFunctions.php. Considering the two bugs discussed above, this seems like a fair assumption. If you don't know PHP, you shouldn't be writing MediaWiki code.
** Also supports blocking user from editing whole namespace
* Replace ugly ipboptions parsing code in Title.php with a simple message
Requires schema change (I showed it to Tim Starling).
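For the wfMergeErrorArrays() item above, the de-duplication idea looks roughly like this in Python. This is a sketch, not the PHP implementation: the function name and the assumption that each error is a list of strings (message key followed by parameters) are illustrative.

```python
def merge_error_arrays(*error_lists):
    """Merge several lists of errors, dropping exact duplicates.

    Each error is assumed to be a list of strings. Errors are keyed
    on their tab-joined contents, so two errors are duplicates only
    if the key and all parameters match. First-seen order is kept.
    """
    merged = {}
    for errors in error_lists:
        for params in errors:
            merged["\t".join(params)] = params
    return list(merged.values())
```

For example, merging `[["badaccess"], ["protectedpage", "edit"]]` with `[["badaccess"]]` yields the first list unchanged, since the second "badaccess" entry is a duplicate.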
* Merged replaceFreeExternalLinks() with doMagicLinks(). Makes a lot of sense, very similar operations, doesn't break any parser tests. Stops free links from interacting with other parser stages, the same way ISBN links don't.
* The pass order change fixes Brion's complaint in r39980. Early link expansion, triggered by having more than 1000 links in the page, was outputting URLs which were destroyed by RFEL. Added parser test.
* Fixed an unrelated bug in LinkHolderArray::replace(): if a link to a redirect appears in two separate RLH calls, the second and subsequent calls do not add the mw-redirect class. Caused by an unmigrated LinkCache fetch.
* Added a parser test for a pass interaction bug that the pass order change fixes.
* The fuzzer told me to tell you that free external links in non-caption image parameters, which are and have always been invisible, are now not registered either.
* Miscellaneous supporting updates to the test infrastructure.
Causes weird regressions on http://meta.wikimedia.org/wiki/Talk:Spam_blacklist
Couldn't isolate to a parser test in a few minutes; some kind of template interaction perhaps.
Sample bad HTML like:
The associated page is used by the Mediawiki <a href="<a href=" class="external free" title="http://www.mediawiki.org/wiki/Extension:SpamBlacklist" rel="nofollow">http://www.mediawiki.org/wiki/Extension:SpamBlacklist</a>" class="extiw" title="mw:Extension:SpamBlacklist">Spam Blacklist extension, and lists strings of text that may not be used in URLs in any page in Wikimedia Foundation projects (as well as many external wikis). Any meta <a href="/wiki/Administrator" title="Administrator">administrator</a> can edit the spam blacklist. There is also a more aggressive way to block spamming through direct use of <a href="/wiki/Anti-spam_features#.24wgSpamRegex" title="Anti-spam features">$wgSpamRegex</a>. Only <a href="/wiki/Developers" title="Developers" class="mw-redirect">developers</a> can make changes to $wgSpamRegex, and its use is to be avoided whenever possible.
* Fixed image link whitespace handling (Brion's complaint, r39662)
* Added fuzz test capability to parserTests.php
* Added __destruct() functions to Parser and Language, and called them explicitly from parserTests.inc, to avoid unconstrained memory usage during fuzz testing.
* Added unified diff to output of Parser_DiffTest
* Fixed whitespace change in Parser::doTableStuff() (found by fuzzing)
* Added feature to RELEASE-NOTES which I'd committed last time but forgotten to note: <gallery> will accept image names with no "Image:" prefix (rediscovered by fuzzing)
* Limit memory usage in Title::getInterwikiLink()
* Fixed chronic fail of all interwiki link parser tests (hid Siebrand's complaint, r39464)
* Fixed chronic fail of one of the LanguageConverter parser tests. Was actually an ignored bug.
* @ingrouo -> @ingroup (whoops!)
* Fix doxygen warnings
* Remove duplicate definition of $wgMetaNamespaceTalk in DefaultSettings.php, thanks to VasilievVV for pointing this out
Doxygen documentation update:
* Changed all @addtogroup to @ingroup. @addtogroup adds the comment to the group description, but doesn't add the file, class, function, ... to the group like @ingroup does. See for example http://svn.wikimedia.org/doc/group__SpecialPage.html where it's impossible to see related files, classes, ... that should belong to that group.
* Added @file to file descriptions. It seems that it has to be explicitly declared for file descriptions, otherwise doxygen will think that the comment documents the first class, variable, function, ... that is in that file.
* Removed some empty comments
* Removed some ?>
Added following groups:
* ExternalStorage
* JobQueue
* MaintenanceLanguage
One more thing: there are still a lot of warnings when generating the doc.