MediaWiki uses a number of nonstandard codes which do not validate
according to the IANA language subtag registry. Some of them have
the wrong semantics entirely: MediaWiki's `sr-ec` variant maps to
BCP 47 `sr-EC` which is "Serbian as used in Ethiopia" (!).
Extend LanguageCode::bcp47() to map our nonstandard codes to valid
BCP 47 language codes. Export the mapping so that it can be used
in JavaScript's corresponding mw.language.bcp47() implementation
as well.
Thanks to TheDJ (I10b4473c7e53f027812bbccf26bb47aec15fddfd) and
Fomafix (I93efc190714ba76247d30ba49fc21ae872fc3555) for previous
attempts at this!
Also removed a fixme for the name of 'Twi', dating back to 2004
(f59c3be23b) -- checking
tw.wikipedia.org it certainly appears that the autonym of 'Twi'
is correctly 'Twi'.
Tracking bugs for invalid language codes are T125073 and T145535.
Discussion of zh-XX => zh-HanX-XX mapping is at T198419.
Bug: T34483
Bug: T106367
Bug: T120847
Change-Id: I807dd55d49e9bd19443329231326a5b0d3e6c453
Redundant given this is the project-wide license already,
especially in file headers that already include the GPL license
header.
This and other minor fixups based on feedback from Ie0cea0ef5027c7e5.
* Add @file where missing.
* Move @ingroup and @deprecated from file to class doc where needed.
Change-Id: I7067abb7abee1f0c238cb2536e16192e946d8daa
$wgDummyLanguageCodes is a set and mapping of different language codes:
* Renamed language codes: ['als' => 'gsw', 'bat-smg' => 'sgs',
'be-xold' => 'be-tarask', 'fiu-vro' => 'vro',
'roa-rup' => 'rup', 'zh-classical' => 'lzh',
'zh-min-nan' => 'nan', 'zh-yue' => 'yue'].
The old language codes are deprecated because they are invalid but
should be supported for compatibility reasons for a while.
* Language codes of macro languages, which get mapped to the main
language: ['bh' => 'bho', 'no' => 'nb'].
* Language variants which get mapped to main language:
['simple' => 'en'].
* Internal language codes of the private-use-area which get mapped to
itself: ['qqq' => 'qqq', 'qqx' => 'qqx']
This is a very strange conglomeration which should get differentiated,
and were split up in the following ways:
* Renamed language codes are available from
LanguageCode::getDeprecatedCodeMapping().
* Language codes of macro languages and the variants that are mapped to
the main language are available as $wgExtraLanguageCodes and are set
in DefaultSettings.php.
* Internal language codes are set in $wgDummyLanguageCodes in Setup.php.
Change-Id: If73c74ee87d8235381449cab7dcd9f46b0f23590