Methods were introduced in r84057 which, unfortunatly was tested with
PHP errors disabled :\
Additionally add tests for the full Turkish alphabet based on an article
from Wikipedia http://en.wikipedia.org/wiki/Turkish_alphabet
As mentioned by Bawolff on code review, r83970 only handled case change
of the first character lacking full strings support.
This patch override the uc and lc methods for the Turkish language (tr)
using preg_replace() which know about unicode. Other possible choices
would have been:
- strtr() => outputs garbage
- mbstring => can not know we handle turkish and transform i to I!
I have amended the RELEASE-NOTES to reflect this patch.
Some new tests are added as well to cover the regular functions as
well as the specific Turkish overriding. Result in testdox:
LanguageTr
[x] Change case of first char being dotted and dotless i
[x] Language tr lower casing override
[x] Language tr upper casing override
[x] Upper casing of a string with dotted and dot less i
[x] Lower casing of a string with dotted and dot less i
Turkish has two different i, one with a dot and another without a dot. They
are totally different letters in this language, so we have to override the
ucfirst and lcfirst methods.
See http://en.wikipedia.org/wiki/Dotted_and_dotless_I
Credits to #wikipedia-tr users berm, []LuCkY[] and Emperyan
Broken since introducsed in r10810 almost six years ago.
This proves that the exception in message caches is valid and covers errors.
Tested that exception is no longer thrown.
* Rename wordSegmentation() to segmentByWord().
* Consolidate search index locking and iteration to Maintenance.php
* Add maintenance/updateDoubleWidthSearch.php to take care of new
format for normalized double-width roman characters.
* Add error checking to updateSearchIndex.php for creating $posFile.
* Add note to UPGRADE about running updateDoubleWidthSearch.php.
* Split $wgFixArchaicUnicode into two separate variables, one for Malayalam and one for Arabic
* Clarified documentation and switched them both on by default
* Removed accidentally added variable LanguageAr::$normalizeArray
* Rewrote convertArray() as an RD parser (with inline tokenizer) as suggested on CR r60986. Fixes unclosed rule issue (with parser test). Fixes O(N^2) timing.
* Removed $this->mMarkup abstraction. Life is complicated enough as it is.
* Replaced a couple of instances of explode() with StringUtils::explode(), limited element count in a couple more.
In ConverterRule:
* Removed mConvTable initialisation from the constructor, unnecessary
* Optimised the "-{xxx}-" tight loop by replacing function calls such as count() and in_array() with language constructs such as isset(). Reduced execution time from 356us to 275us.
* Cached $varsep_pattern for further reduction to 243us.
* A couple more parseFlags() hacks brings it back to 230us.
* Split out $this->mVariantFlags from $this->mFlags. Rearranged flag detection into a foreach/switch to avoid unnecessary isset() calls. 189us.
* Added a special-case optimisation to generateConvTable() for the case where there are no tables defined inline in the article. 116us.
* Fixed bug from r37499: "!R || !N" is always true since they are mutually exclusive, "!R && !N" was intended (with parser test).
* Fixed E_NOTICE from "-{N|foo}-"