Move ZhConversion.php and Names.php to languages/data and make them both
expose their data as static class variables instead of in the local
scope. This means that the autoloader can be used to load the data,
which is efficient and secure. This also makes additional request-local
caching of the arrays unnecessary.
Change-Id: Iafb96ac4165d0965fcb9a69f1d0a91139ea9790c
Previously various language objects would install a hook to update the
shared conversion table cache when the object was constructed. This is
not a good idea since language objects may be constructed even when they
are not the content language, but only the content language is
associated with variant conversion and the conversion cache.
Instead, have WikiPage call a method on $wgContLang directly. I put this
with message cache update since the logic is almost identical.
Change-Id: Ief9c0ef993e39645e74a6e158cb4e6e2139ce91d
Remove require_once for LanguageConverter and base classes. These
are in the autoloader now, so an explicit require is no longer
necessary.
Change-Id: Ie34ffc58fd9ec89fb57cf077dd5ac1746c35c48e
Two rules are ignored for now to allow us to upgrade:
* MediaWiki.ControlStructures.AssignmentInControlStructures.AssignmentInControlStructures
* Generic.ControlStructures.InlineControlStructure.NotAllowed
Also ignore the .git folder.
Change-Id: I1b149c72b27be54e22e369999ad0c41c2d1fc2b4
This method calls out to LanguageConverter which involve the User,
Request, and additional validation.
Change-Id: I3edae1244073767a8d8888708024bb5498c70dc9
Remove empty line comments as found by the
MediaWiki.WhiteSpace.SpaceBeforeSingleLineComment.EmptyComment sniff
Change-Id: I5d694f7a7d3bc97e16300ba03c60ad17f3c912a5
This is a follow-up to
Ib6a0afa5c3736f8b9b2e121cd752c53ee50fad75
The PHP logic for grammatical cases in Russian was growing.
It was too long and not reusable is JavaScript.
This patch moves all the logic to a JSON file,
indexed by the grammatical case name and then
by regular expressions that match the different
word classes, with the values being the replacements
that should be compatible with common regular expression
replacement functions in modern programming languages.
This patch doesn't introduce any functional changes
and doesn't change any tests.
The next steps, not necessarily in this order, are:
* Make it work also with JavaScript.
* Make JSON grammar data files loadable with ResourceLoader.
* Convert most or all grammar rules for all the languages to JSON.
* Make the data processing loop generic for all languages.
* Convert it also in jquery.i18n (Milkshake).
* Convert the test cases data from code to generic data.
* Move the JSON data to a separate reusable repository.
Change-Id: I0e8e1bfb9d3ec9f841f733356af32dad7d130e94
CLDR provides translated language names. They are useful for showing
names by themselves in menus and lists, but it's often problematic to add them
to Russian sentences, because they need to be declined, so a message like
"This page is not available in the $1 language" is hard to localize.
This patch adds new cases for Russian -
"languagegen", "languageprep" and "languageadverb".
(The last one, as its name says, it's not actually a grammatical case,
but a transformation to an adverbial expression.)
This covers most of the needs for language names that MediaWiki supports.
Change-Id: Ib6a0afa5c3736f8b9b2e121cd752c53ee50fad75
* Remove Latin letters - they are used inconsistently in this file
and they aren't used anywhere in the code, because we only use
Cyrillic Tuvan in MediaWiki UI.
* Remove commented-out variables.
Change-Id: I723ba331f27d313647b67c3af11c2e53ccc72961
* Fix the '-ти' rule to match the name of Wikiquote.
* Add tests for '-ти' and '-ник' rules.
* Remove the '-ь' and '-ка' rules, which were copied from Russian
and are not used in Ukrainian, and remove their tests as well.
* Remove non-implemented ("stub") cases.
* Cleanup the code of commafy().
Change-Id: I98647ceb8806d845f3c8150b92a5d9f7fe5866f2
For unknown things like <site> and <nowiki> it defaults to text,
but (like wikitext) it does support certain tags such as <em>.
Change-Id: Ib7bead3cb72fd7c361c8032bfc3069da970226bc
The Chinese conversion table is substantially updated to fix a lot of
bugs reported in recent years, and the script generating conversion
table (LanguageZh.php) is also modified to facilitate the maintenance.
Zh-sg and zh-my is set to fallback to zh-cn to improve reading
experience, since there is only trivial difference among them, just like
zh-hk and zh-mo. Further optimization for zh-sg and zh-my will be
performed in local conversion table of Chinese WikiProjects.
Bug: T91620
Change-Id: I1bb0315d6d7a2c9653905654d933942e362bcc42
Apply the conversion variants from specific zones before zh-hans
and zh-hant, to allow fitting specific linguistic habits before
falling back to the generic ones. The actual rules will be added
in a followup patch.
Previously, the zh-cn table was composed by:
(1) Load zh2Hans as zh-hans table
(2) Load zh2CN + zh2Hans as zh-cn table
(3) Load Conversiontable/zh-hans + zh-hans as zh-hans table
(4) Load Conversiontable/zh-cn + zh-cn as zh-cn table
(5) Load zh-hans + zh-cn as the final zh-cn table
The new loading order is:
(1) Load zh2Hans as zh-hans table
(2) Load zh2CN as zh-cn table
(3) Load Conversiontable/zh-hans + zh-hans as zh-hans table
(4) Load Conversiontable/zh-cn + zh-cn as zh-cn table
(5) Load zh-cn + zh-hans as the final zh-cn table
Change-Id: Ie9d08b85d4911618946fa7efd23eb898412449e5
Xhprof generates this data now. Custom profiling of various
sub-function units are kept.
Calls to profiler represented about 3% of page execution
time on Special:BlankPage (1.5% in/out); after this change
it's down to about 0.98% of page execution time.
Change-Id: Id9a1dc9d8f80bbd52e42226b724a1e1213d07af7
This was causing problems with the setlang= and uselang= parameters in the
URL being ignored on kkwiki.
Bug: 64440
Change-Id: I4e2a20c707fd4b9b8a581a9d14e04b6cfc8b4b5c
$wgTitle was simply used in these classes to prevent conversion
of prefixed names for files. Now that we have $wgNamespaceAliases
that is rendered moot, so this will instead return just the parent
class call for autoconvert().
Bug: 57562
Change-Id: I26f8a8aa0be3822056f286f2e15201cbe4d592bf
Swapped some "$var type" to "type $var" or added missing types
before the $var. Changed some other types to match the more common
spelling. Makes beginning of some text in captial.
Change-Id: I7a4dec6a8de96ee21ef34e52bb755f723aa3b0e6
Also swapped some "$var type" to "type $var" or added missing types
before the $var. Changed some other types to match the more common
spelling. Makes beginning of some text in captial.
Change-Id: Ic36c8c7820a6c2d603f1138130670c6bf6a1ca59
- Added spaces after if/foreach/catch
- Added new line before end of file
- Added or removed spaces before/after parenthesis, comma
- Added spaces around string concat
Change-Id: I0590070f1b3542108e242730e8d9a3ba9831e94f
Russian (ru) plural rules have a major change. The 'few' form is
merged with the 'other' form. The current forms are 'one', 'many', 'other'.
In MW ru plural rules were overridden using convertPlural methdod
in LanguagesRu.php with 3 forms.
Effectively forms[1] and forms[2] are swapped.
Followup: I9930b290d004667a3bb09e5c1663ec2c9c27d8a6
Bug: 56931
Change-Id: Ia5779e42315d3f41f52dce2bfffaee0a4297d23b
Updated plurals.xml with new data from CLDR 24.
This data is according to UTS #35 Rev 33.
Update the CLDRPluralRuleParser.js to version 1.1 from upstream
https://github.com/santhoshtr/CLDRPluralRuleParser
Changes to the plural rules:
* Hebrew override removed since CLDR 24 matches with MW plural rules.
* Updated the syntax of overridden rules to TR35 Rev 33 for
Lower Sorbian (dsb), Upper Sorbian (hsb), Belarusian in Taraskievica
orthography (be_tarask), Old Church Slavonic (cu), Bhojpuri (bho),
Samogitian (sgs).
* Removed Manx (gv) override. See I46ab3dadc7fe08c1e60bbd81a1ee841e166e9608.
* Removed the overriden convertPlural method for Serbian from LanguageSr.php,
since CLDR 24 matches with MW rules. Updated and added more tests.
Tests updated for Serbocroatian (sh), too. Old CLDR versions had 4 plural
rules and MW had only 3. In CLDR 24, the form 'many' was removed and it
became identical to the MW. Same for Bosnian (bs) and Croatian (hr).
Also for variants sr-ec and sr-el
* Macedonian (mk) used to count 11 as 'other' form.
CLDR 24 counts it as 'one'.
Not overriding, using CLDR 24 here.
Updated the tests. MW will not override this.
* Armenian (hy) used to count 0 as 'other'.
Now it is 'one' form.
Updated the tests. MW will not override this.
* Latvian (lv) used to count only 0 as 'zero' form, but CLDR 24, any number
satisifying the following formula is counted as zero:
n % 10 = 0 or n % 100 = 11..19 or v = 2 and f % 100 = 11..19
Examples: 0, 10~20, 30, 40, 50, 60, 100.
Updated the tests accordingly. Not overriding it in MW.
Users will see different plural form for the above numbers.
* Removed Ukranian custom plural rule since it match with MW
* Russian (ru) plural rules have a major change.
The 'few' form is merged with the 'other' form.
The current forms are 'one', 'many', 'other'.
In MW ru plural rules were overridden using convertPlural methdod
in LanguagesRu.php with 3 forms.
Effectively forms[1] and forms[2] are swapped.
This will affect the messages, and such messages
must be reviewed and updated. This change is not included in this patch and
wil be done separately.
Russian is the only remaining language class with convertPlural method overridden.
Notable impact on the exising messages:
* For languages ru, uk, be_tarask, sr, For the special case
of two plural forms and first mapped to 1 and rest to the other form, syntax like
{{plural:$1|1=one|other}} should be used.
For further information regarding each of the above language changes, see
1. http://unicode.org/cldr/trac/ticket/3727
2. http://goo.gl/H2HEz
CLDR 24 can handle fractions. Ideally it should start working
in MW without any code changes, but MW language test suite
does not have enough tests to confirm.
Followup: e571717e06
Bug: 56931
Change-Id: I9930b290d004667a3bb09e5c1663ec2c9c27d8a6
Backported the plural rules from CLDR 24 as an override to CLDR 23 rules
exising in MediaWiki. The syntax for plural rules changed in CLDR 24, so
modified the syntax to fit the CLDR 23 syntax
Once we are ready with the updated parsers for CLDR 24(See bug 56931),
we should remove the override.
Since we remove the custom plural forms in MW in favor of CLDR,
the following changes comes into effect:
1. 'few' form used as 'zero' form in MW. Practially that make 'one' form used
as 'two' form and 'two' used as 'few' form.
This breaks existing gv {{PURAL}} usage as dicussed in the bug report
2. CLDR defines 'few' form as n % 100 = 0,20,40,60 but MW adds 80 also to that
list, ie n % 100 = 0,20,40,60, 80. So with this patch, 80 is no longer considered
as 'few' plural form.
Bug: 47099
Change-Id: I46ab3dadc7fe08c1e60bbd81a1ee841e166e9608
Russian has overridden convertPlural method, that was not
taking care of explicit plural forms.
Follow up: I2a9f93567087babb896999f1214d3c56afc67c96
Bug: 54514
Change-Id: Ia977fa544b1d0e40222c7296b7145dcd6f93ecc2
A new protected method looks for explicitly defined forms.
Every overriden language class is required to use this method.
Includes tests.
Redoing old patch I6dc759e3dfb05d6673209ba00da6592a384d5300
Bug: 46422
Change-Id: I2a9f93567087babb896999f1214d3c56afc67c96
- Place commas correct
- Moved comments
- Add space after if/foreach/catch
- Reformat some conditions
- Removed trailing spaces/tabs
Change-Id: I40ccda72c418c4a33fcd675773cb08d971510cdb
In 2005, the code to perform the conversion from the Esperanto
x-system to Unicode was recognized as problematic and commented
out (bug 1512, r10997). Hence this function has done nothing
useful in the last eight years.
Change-Id: I9f4351c638645024b17a828144bba8575db14195
There are some more language classes with convertPlural methods,
but they need some careful removal.
Change-Id: Idbcb397f750fb463608de4396018dadb6fccc9a7