* Rename wordSegmentation() to segmentByWord().
* Consolidate search index locking and iteration into Maintenance.php.
* Add maintenance/updateDoubleWidthSearch.php to handle the new
format for normalized double-width Roman characters.
* Add error checking to updateSearchIndex.php for creating $posFile.
* Add note to UPGRADE about running updateDoubleWidthSearch.php.
* Don't set Parser::$mTitle to random garbage.
* Remove ParserOutput::$displayTitle, make setDisplayTitle() and
getDisplayTitle() wrappers for their *TitleText() equivalents.
* Remove Parser::$mDo*Convert member variables, move test for
$mDoubleUnderScores[] directives closer to the action.
* Remove bogus "global $wgContLang".
* Use accessor to get at $mConvRuleTitle.
* Fix up showtitle option in parserTests.inc.
* TODO: refactor FakeConverter class away
Recover the -{T| }- rule. Add the ability to test for it to the parserTests, and add a test for it. Add a couple of disabled tests that I think demonstrate bugs in the LanguageTranslator.
* Added $wgFixArchaicUnicode, which, if enabled, converts some deprecated Unicode sequences in Arabic and Malayalam text to their Unicode 5.1 equivalents.
* Added generateNormalizerData.php to generate the relevant data files. Added the generated data files also.
* Made most things call the new wrapper method $wgContLang->normalize() instead of UtfNormal::cleanUp(), so that Unicode normalization can be customised on a per-language basis.
* Added some generic support for conversion tables to Language so that subclasses can easily implement these kinds of transformations.
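
A minimal sketch of the conversion-table idea, not the actual Language code. The class and property names are hypothetical, and the single mapping shown (an older Malayalam sequence to the chillu letter added in Unicode 5.1) is only an illustration; the generated data files may contain different or additional entries.

```php
<?php
// Sketch only: class and property names are hypothetical, not MediaWiki's.
class SketchLanguageBase {
    // Subclasses override this with their own "old sequence => replacement" pairs.
    protected $conversionTable = array();

    // Apply the table; with an empty table the text passes through unchanged.
    public function normalize( $text ) {
        return $this->conversionTable
            ? strtr( $text, $this->conversionTable )
            : $text;
    }
}

class SketchLanguageMl extends SketchLanguageBase {
    protected $conversionTable = array(
        // NA + VIRAMA + ZERO WIDTH JOINER => CHILLU N (Unicode 5.1)
        "\u{0D28}\u{0D4D}\u{200D}" => "\u{0D7B}",
    );
}
```

In a real setup the table would presumably only be applied when $wgFixArchaicUnicode is enabled; that gating is left out here.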
Introduced helpers:
* $lang->getDir() returns 'ltr' or 'rtl' for the HTML 'dir' attribute.
* $lang->alignStart() returns 'left' or 'right' for the HTML 'align' attribute or the CSS 'text-align' property.
* $lang->alignEnd() returns 'right' or 'left'.
Also cleaned up a couple of arrays of icons to just reverse the order of items rather than repeating the items twice for each possibility.
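
The helpers and the icon cleanup can be sketched roughly as follows. Only the three method names come from the notes above; the SketchLanguage class, the sketchIconOrder() function and the icon names are made up for illustration.

```php
<?php
// Hypothetical stand-in for a language object.
class SketchLanguage {
    private $rtl;

    public function __construct( $rtl ) {
        $this->rtl = (bool)$rtl;
    }

    // 'ltr' or 'rtl', for the HTML 'dir' attribute.
    public function getDir() {
        return $this->rtl ? 'rtl' : 'ltr';
    }

    // Side on which a line of text starts.
    public function alignStart() {
        return $this->rtl ? 'right' : 'left';
    }

    // Side on which a line of text ends.
    public function alignEnd() {
        return $this->rtl ? 'left' : 'right';
    }
}

// List the icons once and reverse them for RTL, instead of
// writing out both orders by hand.
function sketchIconOrder( SketchLanguage $lang, array $icons ) {
    return $lang->getDir() === 'rtl' ? array_reverse( $icons ) : $icons;
}
```

For example, sketchIconOrder( new SketchLanguage( true ), array( 'prev', 'next' ) ) gives array( 'next', 'prev' ).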
* The serialized message cache, which would have been redundant, has been removed. Similar performance characteristics can be achieved with $wgLocalisationCacheConf['manualRecache'] = true;
* Added a maintenance script rebuildLocalisationCache.php for offline rebuilding of the localisation cache.
* Extension i18n files can now contain any of the variables which can be set in Messages*.php. It is possible, and recommended, to use this feature instead of the hooks for special page aliases and magic words.
* $wgExtensionAliasesFiles, LanguageGetMagic and LanguageGetSpecialPageAliases are retained for backwards compatibility. $wgMessageCache->addMessages() and related functions have been removed. wfLoadExtensionMessages() is a no-op and can continue to be called for b/c.
* Introduced $wgCacheDirectory as a default location for the various local caches that have accumulated. Suggested $IP/cache as a good place for it in the default LocalSettings.php and created this directory with a deny-all .htaccess.
* Patched Exception.php to avoid using the message cache when an exception is thrown from within LocalisationCache, since this tends to fail horribly.
* Removed Language::getLocalisationArray(), Language::loadLocalisation(), Language::load()
* Fixed FileDependency::__sleep()
* In Cdb.php, fixed newlines in debug messages
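
As a rough illustration, the two settings mentioned above would look like this in LocalSettings.php. This is a configuration fragment rather than a complete file, and it assumes $IP is already defined, as it is in a normal LocalSettings.php.

```php
<?php
// LocalSettings.php fragment (sketch).

// Default location for the local caches; $IP/cache is the suggested place.
$wgCacheDirectory = "$IP/cache";

// Do not rebuild the localisation cache automatically. With this set,
// run maintenance/rebuildLocalisationCache.php to rebuild it offline.
$wgLocalisationCacheConf['manualRecache'] = true;
```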
In MessageCache::get():
* Replaced calls to $wgContLang capitalisation functions with plain PHP functions, reducing the typical case from 99us to 93us. Message cache keys are already documented as being restricted to ASCII.
* Implemented a more efficient way to filter out bogus language codes, reducing the "foo/en" case from 430us to 101us.
* Optimised wfRunHooks() in the typical do-nothing case, from ~30us to ~3us. This reduced MessageCache::get() typical case time from 93us to 38us.
* Removed hook MessageNotInMwNs to save an extra 3us per cache hit. Reimplemented the only user (LocalisationUpdate) using the new hook LocalisationCacheRecache.
Big fixup for Chinese word breaks and variant conversions in the MySQL search backend...
- removed redundant variant terms for Chinese, where all search indexing is forced to canonical zh-hans
- added parens to properly group variants for languages such as Serbian which do need them at search time
- added quotes to properly group multi-word terms coming out of stripForSearch, as for Chinese where we segment the characters. This is based on a Language::hasWordBreaks() check.
- also cleaned up LanguageZh_hans::stripForSearch() to just do segmentation and pass on the Unicode stripping to the base Language implementation, avoiding scary code duplication. Segmentation was already pulled up to LanguageZh, but was being run again at the second level. :P
- made a fix to Chinese word segmentation to handle the case where a Han character is followed by a Latin char or numeral; a space is now added after as well. Spaces are then normalized for prettiness.
** Todo: combine all three list functions (comma, semicolon, pipe) into one function with a parameter?
* Use pipe as backlink separator to be consistent with other navigation elements
* Show the colon for case 'afh_actions' only if parameters exist
** Remove the now useless message
* Localize the usages of comma and semicolon