wiki.techinc.nl/languages/classes/LanguageYue.php
Brion Vibber 7ebf0e431b * (bug 5477) Searches for words less than 4 characters now work without
requiring customization of MySQL server settings

Short words are padded so they now get indexed. Yay!

Adapted part of Werdna's patch, with some additional cleanup:
* Using 'U00' to pad instead of 'SMALL' to reduce false positives (e.g. a search for "small*" could match "Smallville" and "SMALLc")
* Checking server's ft_min_word_len variable to see if we need to do anything. This preserves index compatibility with existing installations which have customized their index length.
* Some further cleanup on redundant code -- just toss everything through lc() and be done with it :D
* Cleaned out some more evals in zh and yue classes :P
* Fixed yue class to call the parent adjustor properly
2008-11-25 02:39:06 +00:00

21 lines
502 B
PHP

<?php
/**
 * @ingroup Language
 */
class LanguageYue extends Language {
	function stripForSearch( $string ) {
		wfProfileIn( __METHOD__ );
		// eventually this should be a word segmentation
		// for now just treat each character as a word
		// @fixme only do this for Han characters...
		$t = preg_replace(
			"/([\\xc0-\\xff][\\x80-\\xbf]*)/",
			" $1", $string );
		// Do general case folding and UTF-8 armoring
		$t = parent::stripForSearch( $t );
		wfProfileOut( __METHOD__ );
		return $t;
	}
}