Commit graph

16 commits

Author SHA1 Message Date
Sam Reed
117fb34efc Some language love 2011-05-29 15:21:03 +00:00
Sam Reed
810610a153 Some language love 2011-05-29 15:03:33 +00:00
Siebrand Mazeland
75c6696aa8 Use consistent notation for "@todo FIXME". Should update http://svn.wikimedia.org/doc/todo.html nicely. 2011-05-17 22:03:20 +00:00
Alexandre Emsenhuber
0ef95c5586 Added description to language classes 2010-10-10 12:53:37 +00:00
Sam Reed
6afa6c6ab5 Fix some wrong usages of static method calls (actually belong to class instance) 2010-07-25 18:00:32 +00:00
Tim Starling
410736b38f Fix total breakage of search engine in zh wikis due to incorrect variable name 2010-04-10 13:16:49 +00:00
Mark A. Hershberger
560b72c4ab * Implement normalization of fullwidth latin characters for all Languages, not just Japanese and Chinese.
* Tune Language::convertDoubleWidth() so that it is 8-10x faster.  (See http://xrl.us/bg2mon)
2010-03-23 19:50:59 +00:00
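Note on the commit above: the fullwidth normalization it describes can be illustrated with a minimal Python sketch. This is not the MediaWiki PHP code. The function name `convert_double_width` and the choice to cover only digits and basic Latin letters are assumptions made for illustration. The sketch relies on the Unicode layout in which each fullwidth form in U+FF01..U+FF5E sits exactly 0xFEE0 above its ASCII counterpart.

```python
# Illustrative sketch only; names and scope are assumptions, not MediaWiki's API.

# Distance between a fullwidth form (U+FF01..U+FF5E) and its ASCII counterpart.
FULLWIDTH_OFFSET = 0xFEE0

ASCII_ALNUM = (
    "0123456789"
    "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
    "abcdefghijklmnopqrstuvwxyz"
)

# Translation table mapping each fullwidth code point to its ASCII character.
# Building it once keeps per-call work to a single str.translate pass.
_FULLWIDTH_TO_ASCII = {
    ord(ch) + FULLWIDTH_OFFSET: ch for ch in ASCII_ALNUM
}


def convert_double_width(text: str) -> str:
    """Replace fullwidth Latin letters and digits with their ASCII forms."""
    return text.translate(_FULLWIDTH_TO_ASCII)
```

A precomputed table avoids a per-character regular expression callback, which is one plausible way to get a speed-up of the kind the commit reports; the actual change may have used a different approach.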
Mark A. Hershberger
54badce2d8 Follow-up r61856
* Rename wordSegmentation() to segmentByWord().
* Consolidate search index locking and iteration to Maintenance.php
* Add maintenance/updateDoubleWidthSearch.php to take care of new
  format for normalized double-width roman characters.
* Add error checking to updateSearchIndex.php for creating $posFile.
* Add note to UPGRADE about running updateDoubleWidthSearch.php.
2010-03-10 21:54:23 +00:00
Mark A. Hershberger
92ed21f0ab follow-up r61856 — wordsegmentation should be done for all search engines, not just mysql 2010-03-09 04:19:55 +00:00
Philip Tzou
d6b6766f3a Follow up r60742, r60743, r60764, r60766, r61214, r61390. Split stripForSearch into wordSegmentation and normalizeForSearch. So the wordSegmentation could be called by search engines separately. 2010-02-02 15:09:01 +00:00
Philip Tzou
8bbfbf5628 follow-up r60743.
1. Changed the conditions, not only for LuceneSearch, but also more commonly to others.
2. Reduced code duplication.
2010-01-07 04:50:32 +00:00
Philip Tzou
339f0bb3d9 1. Add conditions to stripForSearch for LuceneSearch / MWSearch.
2. Add double-width roman characters conversion support to zh, gan, and yue.
2010-01-06 19:51:29 +00:00
Alexandre Emsenhuber
c3ec19debc Replaced all @fixme with "@todo Fixme" since doxygen doesn't have a @fixme command 2009-12-15 21:26:58 +00:00
Brion Vibber
ceedb37941 * (bug 8445) Multiple-character search terms are now handled properly for Chinese
Big fixup for Chinese word breaks and variant conversions in the MySQL search backend...
- removed redundant variant terms for Chinese, which forces all search indexing to canonical zh-hans
- added parens to properly group variants for languages such as Serbian which do need them at search time
- added quotes to properly group multi-word terms coming out of stripForSearch, as for Chinese where we segment up the characters. This is based on Language::hasWordBreaks() check.
- also cleaned up LanguageZh_hans::stripForSearch() to just do segmentation and pass on the Unicode stripping to the base Language implementation, avoiding scary code duplication. Segmentation was already pulled up to LanguageZh, but was being run again at the second level. :P
- made a fix to Chinese word segmentation to handle the case where a Han character is followed by a Latin char or numeral; a space is now added after as well. Spaces are then normalized for prettiness.
2009-06-24 02:27:51 +00:00
Alexandre Emsenhuber
087a9f70c5 WARNING: HUGE COMMIT
Doxygen documentation update:
* Changed all @addtogroup to @ingroup. @addtogroup adds the comment to the group description, but doesn't add the file, class, function, ... to the group like @ingroup does. See for example http://svn.wikimedia.org/doc/group__SpecialPage.html where it's impossible to see related files, classes, ... that should belong to that group.
* Added @file to file description, it seems that it should be explicitly declared for file descriptions, otherwise doxygen will think that the comment documents the first class, variable, function, ... that is in that file.
* Removed some empty comments
* Removed some ?>

Added following groups:
* ExternalStorage
* JobQueue
* MaintenanceLanguage

One more thing: there are still a lot of warnings when generating the doc.
2008-05-20 17:13:28 +00:00
Raimond Spekking
0b37df653c * (bug 11284) Update Chinese translations
Patch by Shinjiman
2007-09-27 15:40:35 +00:00
Renamed from languages/classes/LanguageZh_yue.php