2. Patch for situations that some wikis like zhwikisource may disabled some language variants. We should treat these disabled variants unacceptable in LanguageConverter.
Big fixup for Chinese word breaks and variant conversions in the MySQL search backend...
- removed redunant variant terms for Chinese, which forces all search indexing to canonical zh-hans
- added parens to properly group variants for languages such as Serbian which do need them at search time
- added quotes to properly group multi-word terms coming out of stripForSearch, as for Chinese where we segment up the characters. This is based on Language::hasWordBreaks() check.
- also cleaned up LanguageZh_hans::stripForSearch() to just do segmentation and pass on the Unicode stripping to the base Language implementation, avoiding scary code duplication. Segmentation was already pulled up to LanguageZh, but was being run again at the second level. :P
- made a fix to Chinese word segmentation to handle the case where a Han character is followed by a Latin char or numeral; a space is now added after as well. Spaces are then normalized for prettiness.
PHP Notice: Undefined property: FakeConverter::$mMainLanguageCode in /var/www/w/languages/Language.php on line 2230
PHP Notice: Undefined property: FakeConverter::$mVariants in /var/www/w/languages/Language.php on line 2233
PHP Warning: in_array() [<a href='function.in-array'>function.in-array</a>]: Wrong datatype for second argument in /var/www/w/languages/Language.php on line 2233
PHP Notice: Undefined property: FakeConverter::$mMainLanguageCode in /var/www/w/languages/Language.php on line 2234
Uses $dir in extension files, and assumes that it remains unchanged in require_once( 'maintenance/commandLine.inc' ).
In fact, it is likely that '$dir' will be set when setting up command-line, as some extensions will use the same var.
Recommended fix: Use $CentralAuth_dir, $EmailPage_dir, etc.
requiring customization of MySQL server settings
Short words are padded so they now get indexed. Yay!
Adapted part of Werdna's patch, with some additional cleanup:
* Using 'U00' to pad instead of 'SMALL' to reduce false positives (eg search for "small*" could match "Smallville" and "SMALLc")
* Checking server's ft_min_word_len variable to see if we need to do anything. This preserves index compatibility with existing installations which have customized their index length.
* Some further cleanup on redundant code -- just toss everything through lc() and be done with it :D
* Cleaned out some more evals in zh and yue classes :P
* Fixed yue class to call the parent adjustor properly
Doxygen documentation update:
* Changed alls @addtogroup to @ingroup. @addtogroup adds the comment to the group description, but doesn't add the file, class, function, ... to the group like @ingroup does. See for example http://svn.wikimedia.org/doc/group__SpecialPage.html where it's impossible to see related files, classes, ... that should belong to that group.
* Added @file to file description, it seems that it should be explicitely decalred for file descriptions, otherwise doxygen will think that the comment document the first class, variabled, function, ... that is in that file.
* Removed some empty comments
* Removed some ?>
Added following groups:
* ExternalStorage
* JobQueue
* MaintenanceLanguage
One more thing: there are still a lot of warnings when generating the doc.
* Fix some scripts that assumed include_path was set with various additional directories
Stuff now seems to mostly work when not overriding include_path.
Taking that out of LocalSettings is the next step... whee!
* Removed some backtracking regexes with an O(N^2) worst case, replaced with StringUtils::delimiterReplace(). There is a beneficial functional difference: /*/ is no longer considered to be a complete CSS comment.
* Changed the parser strip state from an array to an object. This should hopefully avoid the PHP bugs with array references. StripState uses the new ReplacementArray to do the replacements, thereby supporting FSS.
* Removed DatabaseFunctions.php from the default startup sequence. Moved wfGetDB() to GlobalFunctions.php.
* Introduced the SiteStats class, with a collection of cached site stats accessor functions.
* Removed all global functions from Parser.php, they don't belong there.
* Made LanguageConverter use the new ReplacementArray class instead of managing its own FSS objects.