Commit graph

53 commits

Author SHA1 Message Date
Chiefwei
f968119a28 Update the Chinese conversion table for Chinese WikiProjects
The Chinese conversion table is substantially updated to fix many
bugs reported in recent years, and the script that generates the
conversion table (LanguageZh.php) is also modified to ease maintenance.

Zh-sg and zh-my are set to fall back to zh-cn to improve the reading
experience, since there are only trivial differences among them, just as
with zh-hk and zh-mo. Further optimization for zh-sg and zh-my will be
performed in the local conversion tables of the Chinese WikiProjects.

Bug: T91620
Change-Id: I1bb0315d6d7a2c9653905654d933942e362bcc42
2015-03-06 19:51:13 +00:00
jenkins-bot
31d0a18d3e Merge "Change loading order of Chinese conversion tables" 2015-01-08 15:46:10 +00:00
Liangent
2e0b23d689 Change loading order of Chinese conversion tables
Apply the conversion variants from specific zones before zh-hans
and zh-hant, to allow fitting specific linguistic habits before
falling back to the generic ones. The actual rules will be added
in a followup patch.

Previously, the zh-cn table was composed by:

(1) Load zh2Hans as zh-hans table
(2) Load zh2CN + zh2Hans as zh-cn table
(3) Load Conversiontable/zh-hans + zh-hans as zh-hans table
(4) Load Conversiontable/zh-cn + zh-cn as zh-cn table
(5) Load zh-hans + zh-cn as the final zh-cn table

The new loading order is:

(1) Load zh2Hans as zh-hans table
(2) Load zh2CN as zh-cn table
(3) Load Conversiontable/zh-hans + zh-hans as zh-hans table
(4) Load Conversiontable/zh-cn + zh-cn as zh-cn table
(5) Load zh-cn + zh-hans as the final zh-cn table

Change-Id: Ie9d08b85d4911618946fa7efd23eb898412449e5
2015-01-08 15:35:42 +00:00
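
The reordering described in 2e0b23d689 can be sketched as follows. This is only an illustration, not MediaWiki's code: it assumes that "A + B" in the steps above is a left-biased union (as with PHP's array + operator, where the left operand wins on key collisions), and the sample table entries and the wiki_hans / wiki_cn parameters are hypothetical stand-ins for the real tables.

```python
def union(left, right):
    """Left-biased union: on a key collision the value from `left` wins."""
    merged = dict(right)
    merged.update(left)
    return merged


def old_order(zh2hans, zh2cn, wiki_hans=None, wiki_cn=None):
    zh_hans = dict(zh2hans)                     # (1)
    zh_cn = union(zh2cn, zh_hans)               # (2) zh2CN + zh2Hans
    zh_hans = union(wiki_hans or {}, zh_hans)   # (3)
    zh_cn = union(wiki_cn or {}, zh_cn)         # (4)
    return union(zh_hans, zh_cn)                # (5) zh-hans + zh-cn


def new_order(zh2hans, zh2cn, wiki_hans=None, wiki_cn=None):
    zh_hans = dict(zh2hans)                     # (1)
    zh_cn = dict(zh2cn)                         # (2) zh2CN only
    zh_hans = union(wiki_hans or {}, zh_hans)   # (3)
    zh_cn = union(wiki_cn or {}, zh_cn)         # (4)
    return union(zh_cn, zh_hans)                # (5) zh-cn + zh-hans


# Hypothetical entries: both tables define a rule for the same key "A".
sample_hans = {"A": "generic-A", "B": "generic-B"}
sample_cn = {"A": "cn-A"}

old_table = old_order(sample_hans, sample_cn)
new_table = new_order(sample_hans, sample_cn)
```

Under these assumptions the generic rule for "A" wins in the old order, while the zh-cn rule wins in the new order and the generic rule for "B" is still available as a fallback.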
Chad Horohoe
aa21e125a3 Remove obvious function-level profiling
Xhprof generates this data now. Custom profiling of various
sub-function units is kept.

Calls to profiler represented about 3% of page execution
time on Special:BlankPage (1.5% in/out); after this change
it's down to about 0.98% of page execution time.

Change-Id: Id9a1dc9d8f80bbd52e42226b724a1e1213d07af7
2015-01-07 11:14:24 -08:00
umherirrender
ae3c883150 Cleanup some docs (languages)
- Capitalizes the beginning of @param descriptions
- Removed return void

Change-Id: Ie05436c1ef886cb23c62ccde95384f253f83694c
2014-08-09 22:20:15 +02:00
Siebrand Mazeland
03b2b42084 Make languages/classes pass phpcs-strict
Change-Id: I0985f3c7e4b36338c68a4a63cfba4eaa4af567c0
2014-04-22 14:13:02 +02:00
umherirrender
55e8a9abfd Fixed some @params documentation (languages)
Swapped some "$var type" to "type $var" or added missing types
before the $var. Changed some other types to match the more common
spelling. Capitalizes the beginning of some text.

Change-Id: I7a4dec6a8de96ee21ef34e52bb755f723aa3b0e6
2014-04-17 13:32:54 +00:00
Timo Tijhof
beb1c4a0ec phpcs: More require/include is not a function
Follows-up I1343872de7, Ia533aedf63 and I2df2f80b81.

Also updated usage in text in documentation and the
installer LocalSettingsGenerator.

Most of them were handled by this regex:
- find: (require|include|require_once|include_once)\s*\(\s*(.+?)\s*\)\s*;$
- replace: $1 $2;

Change-Id: I6b38aad9a5149c9c43ce18bd8edbab14b8ce43fa
2013-05-21 23:26:28 +02:00
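
The find/replace pair quoted in beb1c4a0ec can be tried out with a short script. This is a sketch in Python rather than the editor or tool the author actually used (which the message does not name); the "$1 $2;" replacement is written as "\1 \2;" in Python syntax, and multi-line mode is assumed so that "$" matches at the end of each line. The function name is made up for the example.

```python
import re

# Same pattern as in the commit message, applied line by line.
PATTERN = re.compile(
    r"(require|include|require_once|include_once)\s*\(\s*(.+?)\s*\)\s*;$",
    re.MULTILINE,
)


def strip_call_parens(source: str) -> str:
    """Rewrite `require( X );` style statements as `require X;`."""
    return PATTERN.sub(r"\1 \2;", source)


before = (
    "require_once( 'maintenance/commandLine.inc' );\n"
    "require( dirname( __FILE__ ) . '/x.php' );\n"
    "$a = foo( 1 );"
)
after = strip_call_parens(before)
```

Lines that are not require/include statements are left unchanged, and inner parentheses such as the dirname() call survive because only the outermost pair before the trailing semicolon is removed.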
umherirrender
3c7bcf4658 Fixed spacing in languages folder
Added spaces before if
Added some braces for one line statements

Change-Id: I980771894369499646532b13b801db6447381773
2013-04-17 21:10:02 +02:00
Kevin Israel
4c69569db7 Get rid of preg_replace( '/.../e', ... )
This is deprecated as of PHP 5.5, and the remaining uses are quite
silly. Tim said I should remove his easter egg from Special:Version,
as it already was broken, and a new one can be added in a separate
commit.

Change-Id: I0f09f4efc7afe5933c8317462026a475530a5324
2013-04-09 02:31:57 +00:00
Liangent
e01adbfc0b Clean up Language::markNoConversion().
* IRIs are getting more and more widely used these days, so Chinese
  characters in the text of external links also need to be protected
  from conversion.
* So now all markNoConversion() functions in languages with variants
  do the same thing. Merge them into a single function in the
  Language class and drop implementations in individual languages.
* Also rephrase the phpdoc of that function, and (bug 24798) fix
  the link detection regex to use wfUrlProtocolsWithoutProtRel().
  Protocol-relative regex is excluded to avoid false positives.
* Add parser test for it.

Change-Id: I2ec0ac2b9b11221584adb72555168498de209d57
2012-11-18 03:46:53 +08:00
Siebrand Mazeland
4b62b0339c Prefix new ContentHandler hooks in WikiPage with Page instead of Article
Covers 3 hooks:
* ArticleContentInsertComplete -> PageContentInsertComplete
* ArticleContentSave -> PageContentSave
* ArticleContentSaveComplete -> PageContentSaveComplete

Change-Id: I186669a5941d8982725ed364b481215d291b2043
2012-10-11 18:22:52 +02:00
daniel
a138706729 Fix usage of deprecated ArticleSaveComplete hook in core
Change-Id: Ic01fd95d50a909470d6f0ffd93c972322789d49a
2012-10-08 14:13:59 +02:00
daniel
9994968774 merged master
Change-Id: Ib2b879c4daa17401eeeb50767c0e5a54254855c3
2012-08-29 15:20:15 +02:00
Daniel Kinzler
392af46809 Revert "merged master"
This reverts commit 67bfdc7a68
2012-08-29 13:14:49 +00:00
daniel
67bfdc7a68 merged master
Change-Id: Ib2b879c4daa17401eeeb50767c0e5a54254855c3
2012-08-29 12:06:38 +02:00
Alexandre Emsenhuber
a9d92e0960 Added missing GPLv2 headers in some places.
Also made file/class documentation more consistent.

Change-Id: I162f57c994765189681ac3fb30f889e648c6c6a1
2012-06-10 19:40:03 +02:00
Sam Reed
b6b807b2bc More documentation! 2011-05-29 16:32:05 +00:00
Sam Reed
e476b51e3a Stylize languages/*, languages/classes/*, but not languages/messages/* 2010-07-29 09:43:18 +00:00
Philip Tzou
279a29cdc1 1. Fix the underscore bug in title (namespace) conversion, which displayed titles like "User_talk:Example".
2. Improve namespace conversion: allow admins to customize namespace conversion in MediaWiki's messages ([[MediaWiki:conversion-nsX]]).
2010-07-06 05:00:15 +00:00
Philip Tzou
b1efb02e12 Follow up r61856, no need. 2010-02-02 15:26:20 +00:00
Philip Tzou
d6b6766f3a Follow up r60742, r60743, r60764, r60766, r61214, r61390. Split stripForSearch into wordSegmentation and normalizeForSearch, so that wordSegmentation can be called by search engines separately. 2010-02-02 15:09:01 +00:00
Tim Starling
750b8f7c04 In LanguageConverter:
* Rewrote convertArray() as an RD parser (with inline tokenizer) as suggested on CR r60986. Fixes unclosed rule issue (with parser test). Fixes O(N^2) timing.
* Removed $this->mMarkup abstraction. Life is complicated enough as it is.
* Replaced a couple of instances of explode() with StringUtils::explode(), limited element count in a couple more.

In ConverterRule:
* Removed mConvTable initialisation from the constructor, unnecessary
* Optimised the "-{xxx}-" tight loop by replacing function calls such as count() and in_array() with language constructs such as isset(). Reduced execution time from 356us to 275us.
* Cached $varsep_pattern for further reduction to 243us.
* A couple more parseFlags() hacks brings it back to 230us.
* Split out $this->mVariantFlags from $this->mFlags. Rearranged flag detection into a foreach/switch to avoid unnecessary isset() calls. 189us.
* Added a special-case optimisation to generateConvTable() for the case where there are no tables defined inline in the article. 116us.
* Fixed bug from r37499: "!R || !N" is always true since they are mutually exclusive, "!R && !N" was intended (with parser test).
* Fixed E_NOTICE from "-{N|foo}-"
2010-01-19 02:36:33 +00:00
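
The logic fix noted in 750b8f7c04 ("!R || !N" versus "!R && !N") can be checked by enumeration. This is a small sketch, assuming only what the message states, namely that the R and N flags are mutually exclusive; the function names are made up for the example.

```python
def old_condition(has_r: bool, has_n: bool) -> bool:
    """The buggy test: !R || !N."""
    return (not has_r) or (not has_n)


def fixed_condition(has_r: bool, has_n: bool) -> bool:
    """The intended test: !R && !N (neither flag is set)."""
    return (not has_r) and (not has_n)


# R and N never occur together, so (True, True) is not a reachable state.
REACHABLE = [(False, False), (True, False), (False, True)]

old_results = [old_condition(r, n) for r, n in REACHABLE]
fixed_results = [fixed_condition(r, n) for r, n in REACHABLE]
```

For every reachable state the old test is true, so it never filtered anything; the fixed test is true only when neither flag is set.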
Philip Tzou
8bbfbf5628 follow-up r60743.
1. Generalized the conditions so that they apply not only to LuceneSearch but also to other search backends.
2. Reduced code duplication.
2010-01-07 04:50:32 +00:00
Philip Tzou
339f0bb3d9 1. Add conditions to stripForSearch for LuceneSearch / MWSearch.
2. Add conversion support for double-width Roman characters to zh, gan, and yue.
2010-01-06 19:51:29 +00:00
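
One way to picture the double-width conversion mentioned in 339f0bb3d9 is the sketch below. It is not the MediaWiki implementation; it assumes the goal is to fold the Unicode fullwidth forms U+FF01 to U+FF5E onto their ASCII counterparts, which sit exactly 0xFEE0 code points lower. The function name is hypothetical.

```python
FULLWIDTH_START = 0xFF01   # fullwidth exclamation mark
FULLWIDTH_END = 0xFF5E     # fullwidth tilde
OFFSET = 0xFEE0            # distance between a fullwidth form and its ASCII twin


def fullwidth_to_ascii(text: str) -> str:
    """Replace fullwidth ASCII variants with plain ASCII; leave the rest alone."""
    out = []
    for ch in text:
        cp = ord(ch)
        if FULLWIDTH_START <= cp <= FULLWIDTH_END:
            out.append(chr(cp - OFFSET))
        else:
            out.append(ch)
    return "".join(out)


folded = fullwidth_to_ascii("ＡＢＣ１２３")
```

Folding these characters before indexing lets a search for "ABC" also match text typed with a fullwidth input method.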
Andrew Garrett
70464b0cdf Fix E_NOTICE in r55415 breaking Zh variants 2009-09-16 23:53:31 +00:00
Andrew Garrett
6b0be31e02 Partial revert of r55415, calling wfMsg in Language object constructor causes unstub loops 2009-09-16 23:50:29 +00:00
Philip Tzou
e05338c102 1. Revert my revision r55371, since it may override logged-in users' settings.
2. Patch for situations where some wikis, such as zhwikisource, may have disabled some language variants. We should treat these disabled variants as unacceptable in LanguageConverter.
2009-08-21 16:00:01 +00:00
Brion Vibber
ceedb37941 * (bug 8445) Multiple-character search terms are now handled properly for Chinese
Big fixup for Chinese word breaks and variant conversions in the MySQL search backend...
- removed redundant variant terms for Chinese, which forces all search indexing to canonical zh-hans
- added parens to properly group variants for languages such as Serbian which do need them at search time
- added quotes to properly group multi-word terms coming out of stripForSearch, as for Chinese where we segment up the characters. This is based on Language::hasWordBreaks() check.
- also cleaned up LanguageZh_hans::stripForSearch() to just do segmentation and pass on the Unicode stripping to the base Language implementation, avoiding scary code duplication. Segmentation was already pulled up to LanguageZh, but was being run again at the second level. :P
- made a fix to Chinese word segmentation to handle the case where a Han character is followed by a Latin char or numeral; a space is now added after as well. Spaces are then normalized for prettiness.
2009-06-24 02:27:51 +00:00
Philip Tzou
82a2dc03f5 New function to convert content text to specified language (only applies on wiki with
LanguageConverter class)
2009-03-04 06:56:37 +00:00
Philip Tzou
9b9bbbc485 1. Namespace translation for Chinese Language.
2. New function to convert namespace text for display. (only applies on wiki with LanguageConverter class)
2009-02-06 13:21:11 +00:00
Siebrand Mazeland
99733619ef Revert r46523, r46525. Spewing errors. See below. Behaviour observed on http://translatewiki.net.
PHP Notice:  Undefined property:  FakeConverter::$mMainLanguageCode in /var/www/w/languages/Language.php on line 2230
PHP Notice:  Undefined property:  FakeConverter::$mVariants in /var/www/w/languages/Language.php on line 2233
PHP Warning:  in_array() [<a href='function.in-array'>function.in-array</a>]: Wrong datatype for second argument in /var/www/w/languages/Language.php on line 2233
PHP Notice:  Undefined property:  FakeConverter::$mMainLanguageCode in /var/www/w/languages/Language.php on line 2234
2009-01-29 08:59:53 +00:00
Philip Tzou
3bea66b61a Move method 'getPreferredVariant' to Language class, patched by Fdcn. 2009-01-29 06:51:20 +00:00
Andrew Garrett
c06afd56b3 Revert "Follow up on r43982. Reduce dirname(__FILE__) calls in core and extensions."
Uses $dir in extension files, and assumes that it remains unchanged in require_once( 'maintenance/commandLine.inc' ).
In fact, it is likely that '$dir' will be set when setting up command-line, as some extensions will use the same var.

Recommended fix: Use $CentralAuth_dir, $EmailPage_dir, etc.
2008-11-30 03:15:22 +00:00
Siebrand Mazeland
daaa7f37a1 Follow up on r43982. Reduce dirname(__FILE__) calls in core and extensions. 2008-11-26 23:17:15 +00:00
Brion Vibber
7ebf0e431b * (bug 5477) Searches for words less than 4 characters now work without
requiring customization of MySQL server settings

Short words are padded so they now get indexed. Yay!

Adapted part of Werdna's patch, with some additional cleanup:
* Using 'U00' to pad instead of 'SMALL' to reduce false positives (eg search for "small*" could match "Smallville" and "SMALLc")
* Checking server's ft_min_word_len variable to see if we need to do anything. This preserves index compatibility with existing installations which have customized their index length.
* Some further cleanup on redundant code -- just toss everything through lc() and be done with it :D
* Cleaned out some more evals in zh and yue classes :P
* Fixed yue class to call the parent adjustor properly
2008-11-25 02:39:06 +00:00
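
The padding idea in 7ebf0e431b can be illustrated with the sketch below. It is not the MediaWiki code: the message only says that short words are padded with 'U00', so appending the marker as a suffix, the default minimum length of 4, and the function name are all assumptions made for the example.

```python
import re

WORD = re.compile(r"\w+")


def pad_short_words(text: str, min_len: int = 4, marker: str = "U00") -> str:
    """Append `marker` to every word shorter than `min_len` (assumed scheme)."""
    def pad(match: re.Match) -> str:
        word = match.group(0)
        return word + marker if len(word) < min_len else word

    return WORD.sub(pad, text)


indexed = pad_short_words("go to smallville")
```

The same padding would have to be applied to the search terms at query time so that a short term matches its padded form in the index. A marker such as 'U00' is unlikely to occur as a prefix of real words, which is the false-positive concern the message raises about 'SMALL'.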
Shinjiman
90628b9c48 * (bug 14604) Update LanguageConverter for compatibility on -{*|xxx}- usage
patch by fdcn
2008-07-09 08:16:39 +00:00
Robert Stojnić
3e1c8acb67 (bug 14604#c6): Fix regression in variant conversion when semicolon is within -{}- tags. Patch by fdcn, thanks! 2008-07-06 17:23:00 +00:00
Shinjiman
9bfe56aa37 fixing encoding problems on r36664
patch by fdcn
2008-06-27 15:08:07 +00:00
Shinjiman
69dbeb97f1 * (bug 14604) Introduced the following features for the LanguageConverter: Multi-tag support, single conversion flag, remove conversion flag on a single page, description flag, variant name, multi-variant fallbacks.
patch by fdcn
* Added zh-mo and zh-my variants for the zh language
2008-06-26 03:00:34 +00:00
Alexandre Emsenhuber
087a9f70c5 WARNING: HUGE COMMIT
Doxygen documentation update:
* Changed all @addtogroup to @ingroup. @addtogroup adds the comment to the group description, but doesn't add the file, class, function, ... to the group like @ingroup does. See for example http://svn.wikimedia.org/doc/group__SpecialPage.html where it's impossible to see related files, classes, ... that should belong to that group.
* Added @file to file descriptions; it seems that it should be explicitly declared for file descriptions, otherwise doxygen will think that the comment documents the first class, variable, function, ... that is in that file.
* Removed some empty comments
* Removed some ?>

Added following groups:
* ExternalStorage
* JobQueue
* MaintenanceLanguage

One more thing: there are still a lot of warnings when generating the doc.
2008-05-20 17:13:28 +00:00
Siebrand Mazeland
555019a2c6 remove EOL whitespace, and excess empty lines 2008-05-17 17:10:18 +00:00
Shinjiman
3fa8b188cd set the variant fallback from zh-CN/zh-TW to zh 2007-12-03 11:55:44 +00:00
Shinjiman
9c1fea181a fixing zh-TW array to merge with zh-Hant instead of zh-Hans 2007-12-03 03:51:44 +00:00
Shinjiman
c7c389848b * (bug 451) adding a generic Traditional / Simplified Chinese conversion table, with their variants 2007-12-02 09:02:09 +00:00
Shinjiman
b6674ff4d3 * updating the manual tables, fixing some conversion errors, and regenerating the ZhConversion.php using the scripts and manual tables
* minor cleanup for the LanguageZh.php
2007-12-02 03:41:12 +00:00
Raimond Spekking
0b37df653c * (bug 11284) Update Chinese translations
Patch by Shinjiman
2007-09-27 15:40:35 +00:00
Aryeh Gregor
a15c419b3d Remove ?>'s from files. They're pointless, and just asking for people to mess with the files and add trailing whitespace. (Yes, I looked over every one and reverted those that were bogus. Slash-enter a million times in less worked well enough, although it was a bit mind-numbing.) 2007-06-29 01:19:14 +00:00
Brion Vibber
72a4abe588 * Skip additional setting of include_path in commandLine.inc (for non-Wikimedia mode)
* Fix some scripts that assumed include_path was set with various additional directories
Stuff now seems to mostly work when not overriding include_path.
Taking that out of LocalSettings is the next step... whee!
2007-06-06 16:01:14 +00:00
Antoine Musso
c771fc9c96 Use Doxygen @addtogroup instead of phpdoc @package && @subpackage 2007-01-20 15:09:52 +00:00