Commit graph

202 commits

Author SHA1 Message Date
Robin Pepermans
6837668a40 Fix for r86670: properly convert uppercase Latin to syllabics 2011-04-25 19:00:53 +00:00
Sam Reed
8e80b8c3c1 Tidy up some unused variables and such 2011-04-23 21:40:52 +00:00
Robin Pepermans
cfc9b193a1 Conversion script between Syllabics and Latin for the Inuktitut language 2011-04-21 21:21:32 +00:00
Chad Horohoe
0327a95ea1 (bug 28643) Merge Serbian language variant conversion improvements to trunk (r85224, r85239, r85308) from Nikola's branch 2011-04-21 14:02:38 +00:00
Paul Copperman
efb8c6b899 Fix some Notices:
* LanguageKaa.php: Fix ucfirst and lcfirst for empty strings.
* SkinTemplate.php: Fix undefined array access.
* ProxyTools.php: When running HipHop in CLI mode, apache_request_headers() returns null. Fix wfGetForwardedFor() to account for that.
2011-04-11 16:49:36 +00:00
Antoine Musso
97783870bc Makes LanguageTr uc & lc match parent declaration
The methods were introduced in r84057, which unfortunately was tested with
PHP errors disabled :\

Additionally add tests for the full Turkish alphabet based on an article
from Wikipedia http://en.wikipedia.org/wiki/Turkish_alphabet
2011-03-16 07:38:15 +00:00
Antoine Musso
3e6e06a3bb bug 28040 Turkish: properly handle dotted and dotless i
As mentioned by Bawolff on code review, r83970 only handled the case change
of the first character and lacked full-string support.

This patch overrides the uc and lc methods for the Turkish language (tr)
using preg_replace(), which knows about Unicode. Other possible choices
would have been:
 - strtr() => outputs garbage
 - mbstring => cannot know we are handling Turkish, so it transforms i to I!

I have amended the RELEASE-NOTES to reflect this patch.

Some new tests are added as well, covering both the regular functions and
the Turkish-specific overrides. Result in testdox:

LanguageTr
[x] Change case of first char being dotted and dotless i
[x] Language tr lower casing override
[x] Language tr upper casing override
[x] Upper casing of a string with dotted and dot less i
[x] Lower casing of a string with dotted and dot less i
2011-03-15 21:56:54 +00:00
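The dotted/dotless i handling described in the commit above can be illustrated with a minimal Python sketch. It is not the PHP code from LanguageTr; the function names (`tr_lower`, `tr_upper`, `tr_ucfirst`, `tr_lcfirst`) are hypothetical. The idea shown is to remap the four Turkish i letters before applying the generic, locale-unaware case conversion.

```python
# Illustrative sketch only; all names here are assumptions, not MediaWiki APIs.

# Map the letters whose Turkish case pairs differ from the default ones.
_TR_TO_LOWER = str.maketrans({"I": "ı", "İ": "i"})
_TR_TO_UPPER = str.maketrans({"i": "İ", "ı": "I"})


def tr_lower(text: str) -> str:
    """Lower-case text using Turkish rules (I -> ı, İ -> i)."""
    return text.translate(_TR_TO_LOWER).lower()


def tr_upper(text: str) -> str:
    """Upper-case text using Turkish rules (i -> İ, ı -> I)."""
    return text.translate(_TR_TO_UPPER).upper()


def tr_ucfirst(text: str) -> str:
    """Upper-case only the first character; safe for empty strings."""
    return tr_upper(text[:1]) + text[1:]


def tr_lcfirst(text: str) -> str:
    """Lower-case only the first character; safe for empty strings."""
    return tr_lower(text[:1]) + text[1:]


if __name__ == "__main__":
    print(tr_upper("istanbul"))
    print(tr_lower("ISPARTA"))
```

Remapping first avoids relying on a process-wide locale, which is the same problem the commit message attributes to mbstring.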
Antoine Musso
b8522fac08 bug 28040 Turkish: properly lower case 'I' to 'ı' (dotless i)
Turkish has two different i, one with a dot and another without a dot. They
are totally different letters in this language, so we have to override the
ucfirst and lcfirst methods.
See http://en.wikipedia.org/wiki/Dotted_and_dotless_I

Credits to #wikipedia-tr users berm, []LuCkY[] and Emperyan
2011-03-14 22:14:39 +00:00
Niklas Laxström
158dc1b3cd Fix bug that caused "Non-string key given" exception.
Broken since introduced in r10810 almost six years ago.
This proves that the exception in message caches is valid and covers errors.
Tested that exception is no longer thrown.
2011-02-26 08:19:21 +00:00
Alexandre Emsenhuber
6cf384a749 * (bug 27560) Search queries no longer fail in the Walloon language 2011-02-19 20:06:50 +00:00
Sam Reed
15f4f6f360 Explicitly define some variables
Function documentation
2011-02-18 23:21:48 +00:00
Chad Horohoe
d15b8f962c Merge callback fixes from 1.17wmf1. /e is evil. 2011-02-11 10:27:07 +00:00
Purodha B Blissenbach
0ed585bdaa Typo in a comment. 2011-02-05 22:27:34 +00:00
Sam Reed
e07e24c316 Followup r74519, fixup code style issues 2010-12-31 07:25:51 +00:00
Niklas Laxström
cd6753234d New plural rules 2010-12-18 14:34:43 +00:00
Antoine Musso
d4a07fe0cf revert r71054. Although unused, the variable might be used later. 2010-11-13 13:13:12 +00:00
Sam Reed
6b3b915353 Big attack on unused variables... 2010-10-14 20:53:04 +00:00
Alexandre Emsenhuber
0ef95c5586 Added description to language classes 2010-10-10 12:53:37 +00:00
Purodha B Blissenbach
733d862dfe More documentation. 2010-10-08 17:51:23 +00:00
Purodha B Blissenbach
5f1a69c3cc ConvertGrammar() added to LanguageKsh.php 2010-10-08 14:34:56 +00:00
Sam Reed
22764a53f8 Braces, spaces, and a few unused arrays 2010-09-21 06:55:49 +00:00
Sam Reed
d8d7065a15 Revert commenting out in r72239 (leaving whitespace changes) 2010-09-11 20:54:13 +00:00
Platonides
d3e1536f46 Follow-up r72561. Missed language folder. 2010-09-07 22:39:05 +00:00
Sam Reed
b7b548cedf Comment out some unused array declarations
Move some comments
2010-09-02 22:20:00 +00:00
Alexandre Emsenhuber
954ef59e62 * (bug 24804) Corrected commafying in Polish and Ukrainian 2010-08-21 14:57:08 +00:00
Brion Vibber
fca83f7741 My first commit in weeks and it's got a typo ;) 2010-08-13 23:39:27 +00:00
Brion Vibber
95081de751 More explanatory (and English ;) doc comments for Esperanto surrogate conversion in LanguageEo::iconv() 2010-08-13 23:37:45 +00:00
Sam Reed
96edd5f50c Remove unused $voicedPhonemes 2010-08-13 23:30:53 +00:00
Sam Reed
380b6725d5 Remove some unused variables
Move variable in languages/classes/LanguageKu.php into commented code (used in comment)
2010-08-13 20:58:16 +00:00
Sam Reed
6e6e0ed520 Switch /e preg_replace for callbacks
Swap "and" for &&
2010-08-13 09:46:08 +00:00
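PHP's /e modifier made preg_replace() evaluate the replacement string as code; the commit above replaces that with callbacks. As a rough analogue in Python (not the PHP change itself; `ucfirst_words` and `_upper_match` are hypothetical names), a callable replacement receives the match object, so no string is ever evaluated as code:

```python
import re


def _upper_match(match: re.Match) -> str:
    """Callback: return the captured letter in upper case."""
    return match.group(1).upper()


def ucfirst_words(text: str) -> str:
    """Upper-case the first ASCII letter of each word via a callback."""
    return re.sub(r"\b([a-z])", _upper_match, text)


if __name__ == "__main__":
    print(ucfirst_words("hello world"))
```

Because the callback only ever sees matched text as data, input containing quotes or code-like fragments cannot change what gets executed.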
Sam Reed
e476b51e3a Stylize languages/*, languages/classes/*, but not languages/messages/* 2010-07-29 09:43:18 +00:00
Sam Reed
f4f5d17105 A few more wrong static things
Remove some =& from LanguageKk_cyrl
2010-07-25 21:15:27 +00:00
Sam Reed
8b8500c121 More self:: to $this-> 2010-07-25 18:13:56 +00:00
Sam Reed
6afa6c6ab5 Fix some wrong usages of static method calls (actually belong to class instance) 2010-07-25 18:00:32 +00:00
Philip Tzou
279a29cdc1 1. Fix the underscore bug in title (namespace) conversion, which displayed titles like "User_talk:Example".
2. Improve namespace conversion: allow admins to customize namespace conversion in MediaWiki's messages ([[MediaWiki:conversion-nsX]]).
2010-07-06 05:00:15 +00:00
Alexandre Emsenhuber
8cc340437e Fixed some doxygen warnings 2010-06-13 21:13:07 +00:00
Niklas Laxström
56ad7335db Array indexes start from zero 2010-06-13 20:23:53 +00:00
Alexandre Emsenhuber
3065e28d2d Fixed some doxygen warnings 2010-06-05 19:15:50 +00:00
Platonides
551902b22b Bug 23707: Zero items is plural in Brazilian Portuguese, bug 7309 was wrong. 2010-05-29 18:58:29 +00:00
Siebrand Mazeland
923329f60f (bug 23156) Commafy and search normalization update for Belarusian (Taraškievica). Contributed by Jaska Zedlik. 2010-04-12 21:12:34 +00:00
Tim Starling
410736b38f Fix total breakage of search engine in zh wikis due to incorrect variable name 2010-04-10 13:16:49 +00:00
Mark A. Hershberger
560b72c4ab * Implement normalization of fullwidth Latin characters for all Languages, not just Japanese and Chinese.
* Tune Language::convertDoubleWidth() so that it is 8-10x faster.  (See http://xrl.us/bg2mon)
2010-03-23 19:50:59 +00:00
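A minimal Python sketch of fullwidth Latin normalization follows, for illustration only. It is not the code of Language::convertDoubleWidth(); the name `normalize_fullwidth` and the choice to cover only digits and letters are assumptions. It relies on the Unicode layout in which each fullwidth form sits exactly 0xFEE0 above its ASCII counterpart.

```python
# Illustrative sketch; names and the covered ranges are assumptions.

_FULLWIDTH_OFFSET = 0xFEE0  # distance between U+FF21 (fullwidth A) and U+0041 (A)


def _build_table() -> dict:
    """Precompute a translation table for fullwidth digits and letters."""
    table = {}
    ranges = [
        (0xFF10, 0xFF19),  # fullwidth digits 0-9
        (0xFF21, 0xFF3A),  # fullwidth A-Z
        (0xFF41, 0xFF5A),  # fullwidth a-z
    ]
    for start, end in ranges:
        for code_point in range(start, end + 1):
            table[code_point] = code_point - _FULLWIDTH_OFFSET
    return table


_TABLE = _build_table()


def normalize_fullwidth(text: str) -> str:
    """Replace fullwidth Latin letters and digits with their ASCII forms."""
    return text.translate(_TABLE)


if __name__ == "__main__":
    # Build a fullwidth sample from ASCII so the source stays unambiguous.
    sample = "".join(chr(ord(c) + _FULLWIDTH_OFFSET) for c in "MediaWiki123")
    print(normalize_fullwidth(sample))
```

Building the table once and reusing it is one plausible way to get the kind of speed-up the commit mentions, though the actual optimization in the commit is not shown here.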
Mark A. Hershberger
54badce2d8 Follow-up r61856
* Rename wordSegmentation() to segmentByWord().
* Consolidate search index locking and iteration to Maintenance.php
* Add maintenance/updateDoubleWidthSearch.php to take care of new
  format for normalized double-width roman characters.
* Add error checking to updateSearchIndex.php for creating $posFile.
* Add note to UPGRADE about running updateDoubleWidthSearch.php.
2010-03-10 21:54:23 +00:00
Mark A. Hershberger
92ed21f0ab Follow-up r61856: word segmentation should be done for all search engines, not just MySQL 2010-03-09 04:19:55 +00:00
Philip Tzou
b1efb02e12 Follow up r61856, no need. 2010-02-02 15:26:20 +00:00
Philip Tzou
d6b6766f3a Follow up r60742, r60743, r60764, r60766, r61214, r61390. Split stripForSearch into wordSegmentation and normalizeForSearch, so that wordSegmentation can be called separately by search engines. 2010-02-02 15:09:01 +00:00
Tim Starling
1603ed2f22 Fixes for r60599:
* Split $wgFixArchaicUnicode into two separate variables, one for Malayalam and one for Arabic
* Clarified documentation and switched them both on by default
* Removed accidentally added variable LanguageAr::$normalizeArray
2010-01-20 01:50:16 +00:00
Tim Starling
750b8f7c04 In LanguageConverter:
* Rewrote convertArray() as an RD parser (with inline tokenizer) as suggested on CR r60986. Fixes unclosed rule issue (with parser test). Fixes O(N^2) timing.
* Removed $this->mMarkup abstraction. Life is complicated enough as it is.
* Replaced a couple of instances of explode() with StringUtils::explode(), limited element count in a couple more.

In ConverterRule:
* Removed mConvTable initialisation from the constructor, unnecessary
* Optimised the "-{xxx}-" tight loop by replacing function calls such as count() and in_array() with language constructs such as isset(). Reduced execution time from 356us to 275us.
* Cached $varsep_pattern for further reduction to 243us.
* A couple more parseFlags() hacks bring it down to 230us.
* Split out $this->mVariantFlags from $this->mFlags. Rearranged flag detection into a foreach/switch to avoid unnecessary isset() calls. 189us.
* Added a special-case optimisation to generateConvTable() for the case where there are no tables defined inline in the article. 116us.
* Fixed bug from r37499: "!R || !N" is always true since they are mutually exclusive, "!R && !N" was intended (with parser test).
* Fixed E_NOTICE from "-{N|foo}-"
2010-01-19 02:36:33 +00:00
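The "!R || !N" fix in the commit above is a logic point that a short check makes concrete. The Python sketch below is illustrative only: `buggy_check` and `intended_check` are hypothetical names, and the flags are modelled as plain booleans rather than as ConverterRule's flag arrays. When two flags can never both be set, "not R or not N" holds in every reachable state, while "not R and not N" holds only when neither is set.

```python
from itertools import product


def buggy_check(has_r: bool, has_n: bool) -> bool:
    """The condition as written before the fix: !R || !N."""
    return (not has_r) or (not has_n)


def intended_check(has_r: bool, has_n: bool) -> bool:
    """The condition the commit says was intended: !R && !N."""
    return (not has_r) and (not has_n)


# R and N are mutually exclusive, so the state (True, True) never occurs.
reachable_states = [
    (r, n) for r, n in product([False, True], repeat=2) if not (r and n)
]

buggy_results = [buggy_check(r, n) for r, n in reachable_states]
intended_results = [intended_check(r, n) for r, n in reachable_states]

if __name__ == "__main__":
    for state, buggy, intended in zip(
        reachable_states, buggy_results, intended_results
    ):
        print(state, buggy, intended)
```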
Max Semenik
09951b7e0e Removed a couple of ancient @bug's 2010-01-07 21:54:39 +00:00
Philip Tzou
5c8e60f959 Follow-up r60764. Compatibility fix. 2010-01-07 17:48:52 +00:00