Commit graph

281 commits

Author SHA1 Message Date
Umherirrender
b5cddfb27b Remove empty lines at begin of function, if, foreach, switch
Organize phpcs.xml a bit

Change-Id: Ifb767729b481b4b686e6d6444cf48b1f580cc478
2017-07-01 11:34:16 +00:00
C. Scott Ananian
5e76bb2657 tests: Use TestingAccessWrapper to reload LanguageConverter tables
Make the LanguageConverter::reloadTables method actually private,
and use the TestingAccessWrapper to call it when running parser tests.

Follow-up to I65736520cd04bfe8949b29ade07338a6e1b88a4d.

Change-Id: I43b81b8fef6441ad50b858ff7757732ecb5eef91
2017-06-27 17:11:09 -04:00
C. Scott Ananian
b80b7020ce tests: Reset LanguageConverter conversion tables between test cases
Conversion rules defined in a previous test case were leaking into
subsequent test cases.  Existing tests had worked around this by defining
non-overlapping rules, but it's better to just fix the problem at the
source.

Change-Id: I65736520cd04bfe8949b29ade07338a6e1b88a4d
2017-06-26 13:56:30 -04:00
Liangent
d8375bee24 New language variant 'en-x-piglatin' for easier variant testing
Guarded by the $wgUsePigLatinVariant variable, off by default.

Pig Latin is a language game where words in English are altered
according to the following rules:

* Words starting with a vowel have a '-way' suffix appended.
* Words starting with a consonant have the initial consonants (or 'qu'
  group) moved to the end and an '-ay' suffix appended.

https://en.wikipedia.org/wiki/Pig_Latin

* Added 'en-x-piglatin' as a language name.
* Added 'en' to LanguageConverter::$languagesWithVariants.
* Added LanguageEn class and its corresponding EnConverter which
  provides one-way translation from English to Pig Latin.
* Some minor internal changes in code that assumed that English
  doesn't have a language class or converter.

Bug: T45547
Depends-On: I1d9691c784032669979f8109c9a5f65cbf4122c9
Change-Id: I7fa2d85d6364958c5138366e8b4504a2697a8731
2017-06-12 16:59:57 -04:00
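
The two Pig Latin rules listed in d8375bee24 can be sketched in a few lines. This is only an illustration of the rules as the commit message states them, not the EnConverter code: the function names are hypothetical, 'y' is assumed to count as a consonant, the suffixes are appended without the hyphen, and capitalization is left untouched.

```python
import re

VOWELS = "aeiou"


def pig_latin_word(word: str) -> str:
    """Apply the two rules from the commit message to a single word."""
    lower = word.lower()
    if not lower or not lower[0].isalpha():
        return word
    # Rule 1: words starting with a vowel get a 'way' suffix.
    if lower[0] in VOWELS:
        return word + "way"
    # Rule 2: move the initial consonants (or a 'qu' group) to the end,
    # then append 'ay'.
    i = 0
    while i < len(lower) and lower[i] not in VOWELS:
        i += 2 if lower.startswith("qu", i) else 1
    return word[i:] + word[:i] + "ay"


def pig_latin(text: str) -> str:
    """Convert every run of ASCII letters in text, leaving the rest as is."""
    return re.sub(r"[A-Za-z]+", lambda m: pig_latin_word(m.group(0)), text)


print(pig_latin("apple string queen square"))
```

Like the converter described in the commit, the sketch is one-way: it does not attempt to translate Pig Latin back to English.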
Kunal Mehta
642ffff845 LanguageConverter: Avoid deprecated wfMemcKey()
Change-Id: I7fe8e3ad6de2eb0a156b046805fa0eca928d0892
2017-05-25 11:41:56 -07:00
Thiemo Mättig
8bbf6cb2eb Use more specific string[] type hint for language variants
This patch only touches PHPDoc documentation, nothing else.

Change-Id: Ia79d06425a3b8629c171cd68ae435c64dac86f46
2017-04-17 22:31:22 +02:00
WMDE-Fisch
caae756f72 Remove deprecated noop functions
Change-Id: Ia821d43e243b1ee146d3bc4ed35f6aff0bf17466
2017-03-17 11:27:04 +01:00
Timo Tijhof
3a2a707546 Clean up remaining get_class() uses
* get_class()        -> __CLASS__ (same as self::class)
* get_called_class() -> static::class
* get_class($this)   -> static::class

Change-Id: I1888a1897ecf4548a2e5a67a942e5c080dd7e3d3
2017-03-07 22:03:47 +00:00
C. Scott Ananian
3e32d21210 Strip U+0000 in wikitext
U+0000 is not allowed in HTML5, there's no reason to allow it in wikitext.

It simplifies our code if we can just strip them at the start.  Strip in
PST as well so they don't sneak into our database either.

Tweaked the EXT_LINK URLs to account for the fact that invalid characters
get transformed into U+FFFD when using Preprocessor_DOM.  See 73649741ed
(r65967) for context on that change.

Bug: T159174
Change-Id: I3f67e92b61aacc87a40c3662085c84d1dac08bfb
2017-03-06 22:23:38 +00:00
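
The idea in 3e32d21210 of stripping U+0000 once, at the start of processing, can be illustrated with a minimal sketch. The function name is hypothetical and this is not the MediaWiki implementation; it only shows that removing the character up front lets later stages assume it is absent.

```python
def strip_nul(wikitext: str) -> str:
    """Remove every U+0000 character before any further processing."""
    return wikitext.replace("\x00", "")


print(repr(strip_nul("[[Main\x00 Page]]")))
```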
jenkins-bot
aa3319c4c0 Merge "Miscellaneous indentation tweaks"
2017-02-28 18:38:36 +00:00
James D. Forrester
3526417586 languages: Replace implicit Bugzilla bug numbers with Phab ones
It's unreasonable to expect newbies to know that "bug 12345" means "Task T14345"
except where it doesn't, so let's just standardise on the real numbers.

Change-Id: Id2f9d229d17b8eee66b2ca4e3927f3f66ac62988
2017-02-28 00:33:38 +00:00
Bartosz Dziewoński
ecdef925bb Miscellaneous indentation tweaks
I was bored. What? Don't look at me that way.

I mostly targeted mixed tabs and spaces, but others were not spared.
Note that some of the whitespace changes are inside HTML output,
extended regexps or SQL snippets.

Change-Id: Ie206cc946459f6befcfc2d520e35ad3ea3c0f1e0
2017-02-27 19:23:54 +01:00
C. Scott Ananian
5b050be643 Allow HTML tags in LanguageConverter output.
A "remove HTML tags to avoid disrupting the layout" block is removed
(previously added in f16d1e4ed7).

This is a follow-up to I9b099273203482ffb570a5654d8ba50c833e526d.

Bug: T54192
Change-Id: I565fac58b3b0da7bfaedf64f5001c364f52e2244
2016-12-22 01:32:24 +00:00
Aaron Schulz
aac4b448cf Make MessageCache::load() require a language code
Also make it protected; no outside callers exist.

Change-Id: I9f35d05a5e031d1c536a44b19b108803db068677
2016-10-18 17:50:12 -07:00
Aaron Schulz
0809631edd Convert LanguageConverter to using getLocalServerObjectCache()
Change-Id: I7bfcc389ef0266299d887a3520ab9581ef9aa9be
2016-10-11 20:24:42 +00:00
Amir Sarabadani
9850c542c6 Clean up array() syntax in docs, part VII
Last part

Change-Id: I38f015e2122ef4fd2d2141718bd889794c29f06c
2016-09-27 06:53:25 +03:30
Brad Jorsch
9b94bd502f Check User::isSafeToLoad() in LanguageConverter
Ideally LanguageConverter shouldn't be relying on global state at all.
But as a first step let's make it not try to use the global state when
that global state isn't even there.

Bug: T127233
Change-Id: I391cef3ec211d648b078fc509e0139daa58eb875
2016-03-09 21:59:04 +00:00
Bartosz Dziewoński
c161c46d26 Improve code suffering from PHP 5.3's lack of support for foo()[]
I searched for /\$(\S+) = (.+?\(.*?\);)\n.*?\$\1\[/, ignored
everything involving isset(), unset() or array assignments, then
skimmed through the remaining results and changed things where they
made sense. These changes were not automated, so please review them.

Change-Id: Ib37b4c66fc57648470f151ad412210b3629c2538
2016-02-28 22:49:20 +01:00
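
The search pattern quoted in c161c46d26 can be tried outside an editor. The sketch below runs the same regular expression through Python's re module against a made-up PHP snippet; the helper name and the sample strings are assumptions for illustration. The pattern finds a variable assigned from a function call on one line and indexed with [ on the next line, which is the shape that foo()[...] can replace.

```python
import re

# Same pattern as in the commit message, without the surrounding slashes.
PATTERN = re.compile(r"\$(\S+) = (.+?\(.*?\);)\n.*?\$\1\[")


def find_candidates(source: str) -> list[str]:
    """Return the names of variables that are assigned, then indexed."""
    return [m.group(1) for m in PATTERN.finditer(source)]


# Hypothetical PHP input, held in a Python string.
sample = "$parts = explode( ',', $input );\n$first = $parts[0];\n"
print(find_candidates(sample))
```

As the commit message notes, matches are only candidates: hits involving isset(), unset() or array assignments still have to be filtered by hand.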
Kunal Mehta
6e9b4f0e9c Convert all array() syntax to []
Per wikitech-l consensus:
 https://lists.wikimedia.org/pipermail/wikitech-l/2016-February/084821.html

Notes:
* Disabled CallTimePassByReference due to false positives (T127163)

Change-Id: I2c8ce713ce6600a0bb7bf67537c87044c7a45c4b
2016-02-17 01:33:00 -08:00
Tim Starling
059fd9a2ae Don't modify $wgHooks on language object construction
Previously various language objects would install a hook to update the
shared conversion table cache when the object was constructed. This is
not a good idea since language objects may be constructed even when they
are not the content language, but only the content language is
associated with variant conversion and the conversion cache.

Instead, have WikiPage call a method on $wgContLang directly. I put this
with message cache update since the logic is almost identical.

Change-Id: Ief9c0ef993e39645e74a6e158cb4e6e2139ce91d
2016-01-29 15:03:56 +11:00
Florian
e0ad37d49a Remove Language::armourMath() and friends
Change-Id: I0ce18bce2d9b5787221e2dabff143de9792abb3a
2016-01-07 09:21:53 -08:00
jenkins-bot
c14fcf8015 Merge "Made convertNamespace() use APC"
2015-09-28 20:44:38 +00:00
Vivek Ghaisas
c54766586a Fix issues identified by SpaceBeforeSingleLineComment sniff
Change-Id: I048ccb1fa260e4b7152ca5f09b053defdd72d8f9
2015-09-26 23:06:52 +00:00
Aaron Schulz
eb5a2fd8ea Made convertNamespace() use APC
* This can avoid MessageCache::load() calls on another
  language due to variants. The convertNamespace() method
  takes up a significant amount of time for 404 pages.

Change-Id: I4551d5b8e5b5a0bc01d02702b80f93591fc19440
2015-09-25 22:57:58 -07:00
Liangent
ca38682dda LanguageConverter fix of empty and numeric strings
Bug: T51072
Bug: T48634
Bug: T53551
Change-Id: I2c88f1cf7c0014bebf5c798916b660b334a0b78b
2015-06-08 14:23:42 +00:00
Ori Livneh
12571bde26 Use a fixed marker prefix string in the Parser and MWTidy
Generating one-time, unique strip markers hurts us in multiple ways:

* The strip marker regexes don't benefit from JIT compilation, so they are
  slower to execute than they could be.
* Although the regexes don't benefit from JIT compilation, they are still
  compiled, because HHVM bets on regexes getting reused. This extra work is
  fairly costly (1-2% of CPU usage on the app servers) and doesn't pay off.
* The size of the PCRE JIT cache is finite, and the caching of one-off regexes
  displaces from the cache regexes which are in fact reused.

Tim's preferred solution (per his review comment on
https://gerrit.wikimedia.org/r/167530/) is to use fixed strip markers.
So:

* Replace usage of $parser->mUniqPrefix with Parser::MARKER_PREFIX, which
  complements the existing Parser::MARKER_SUFFIX.
* Deprecate Parser::mUniqPrefix and its accessor, Parser::uniqPrefix().
* Deprecate Parser::getRandomString(), since it is no longer useful.
* In Preprocessor_*:preprocessToObj() and Parser::fetchTemplateAndTitle,
  replace any occurrences of \x7f with '?', to prevent strip marker forgery.
  \x7f is not valid input anyway.
* Deprecate the $prefix parameter for StripState::__construct, since a custom
  prefix may no longer be specified.

Change-Id: I31d4556bbb07acb72c33fda335fa5a230379a03f
2015-05-31 19:33:36 -07:00
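
The approach in 12571bde26 (a fixed marker prefix, plus replacing \x7f in input so markers cannot be forged) can be sketched as follows. The constant values, the marker layout and the function names are hypothetical and chosen for illustration only; they are not the actual Parser constants. Because the prefix is fixed, the marker regex is the same on every request and can be compiled once and reused.

```python
import re

# Hypothetical fixed delimiters; both contain \x7f, which is never valid input.
MARKER_PREFIX = "\x7fUNIQ-"
MARKER_SUFFIX = "-QINU\x7f"

# Compiled once at import time, since the pattern no longer varies per parse.
MARKER_RE = re.compile(
    re.escape(MARKER_PREFIX) + r"([a-z]+-\d+)" + re.escape(MARKER_SUFFIX)
)


def sanitize_input(text: str) -> str:
    """Replace \\x7f with '?' so user input cannot contain a valid marker."""
    return text.replace("\x7f", "?")


def make_marker(kind: str, index: int) -> str:
    """Build a strip marker for the index-th stripped item of a given kind."""
    return f"{MARKER_PREFIX}{kind}-{index}{MARKER_SUFFIX}"


forged = make_marker("nowiki", 0)  # pretend a user pasted this text
print(MARKER_RE.search(sanitize_input(forged)))  # no match after sanitizing
print(MARKER_RE.findall("a" + make_marker("nowiki", 3) + "b"))
```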
Chad Horohoe
9971834131 Delay language conversion cache construction until needed
Instead of instantiating this on every single request. Removes
wfGetLangConverterCacheStorage() and $wgLangConvMemc which were
otherwise unused.

Change-Id: Ic500944a92c2a94bc649e1b492c33714d81dca00
2015-03-03 21:12:28 -08:00
Chad Horohoe
aa21e125a3 Remove obvious function-level profiling
Xhprof generates this data now. Custom profiling of various
sub-function units is kept.

Calls to profiler represented about 3% of page execution
time on Special:BlankPage (1.5% in/out); after this change
it's down to about 0.98% of page execution time.

Change-Id: Id9a1dc9d8f80bbd52e42226b724a1e1213d07af7
2015-01-07 11:14:24 -08:00
umherirrender
b0cfcd0fcb Add missing @return and @param to doc blocks
Change-Id: I9d99ba1968ed8f97624d957754c8847dfe1b41da
2014-08-27 21:57:45 +02:00
umherirrender
ae3c883150 Cleanup some docs (languages)
- Capitalized the beginning of @param descriptions
- Removed return void

Change-Id: Ie05436c1ef886cb23c62ccde95384f253f83694c
2014-08-09 22:20:15 +02:00
Thiemo Mättig
f6cff5e392 Update documentation of what a "section" is
There are so many slightly different understandings of what a
"section" is or can be. I'm aware the documentation was improved
just a few weeks ago. I still find it incomplete and confusing.

1. I renamed it to $sectionId to make it more clear what it
really is.

2. Sections are usually numbers. 0, 1 and so on. There is no
reason to disallow the use of ints or even floats (this works
because the string representation of 0.0 is "0"). The code never
disallowed numbers.

3. 'T1' never was supported, as far as I can tell. 'T-1' is
supported. See Parser::extractSections().

4. null and false and '' all mean "the whole page" in
WikiPage::replaceSectionAtRev() but for some reason this meaning got
lost in WikitextContent::replaceSection(). I made it the same again.

Change-Id: Icc3997722d2ed742bf7703cd7c06d09199225720
2014-06-12 18:13:23 +02:00
Liangent
c17b0fce9a Do title conversion on &action=edit if &redlink=1 exists
Bug: 33231
Change-Id: I33c3c9df4ff2215710bacb696b64bb4291dda24e
2014-05-09 17:44:02 +00:00
Siebrand Mazeland
835b69e59b Make languages/ pass phpcs-strict
Change-Id: I0c4a68d140fae27857cbc3684fe51d7880d92118
2014-04-22 09:02:27 +00:00
umherirrender
55e8a9abfd Fixed some @params documentation (languages)
Swapped some "$var type" to "type $var" or added missing types
before the $var. Changed some other types to match the more common
spelling. Capitalized the beginning of some text.

Change-Id: I7a4dec6a8de96ee21ef34e52bb755f723aa3b0e6
2014-04-17 13:32:54 +00:00
addshore
6503a529d8 Move ConverterRule class to its own file
Change-Id: I0d743625e32f903ecd13f3c1f5aaeabdaca70f9d
2014-04-08 23:39:55 +01:00
umherirrender
725d9d125d Removed unneeded spaces and colons in @param and friends
Also swapped some "$var type" to "type $var" or added missing types
before the $var. Changed some other types to match the more common
spelling. Capitalized the beginning of some text.

Change-Id: Ic36c8c7820a6c2d603f1138130670c6bf6a1ca59
2014-04-08 16:02:49 +00:00
Liangent
333bf3ae5b Remove user preference "noconvertlink"
This toggle was introduced in 8d06ad6e, but the most useful feature for
human users there (disabling <h1> conversion on a per-user basis) has
been dropped due to cache fragmentation. The only remaining part is not
quite useful and can be covered by the URL parameter &linkconvert=no.

Change-Id: I12f2cdc9b0d44d6e47487b14fa8ef010de5c94a7
2014-02-08 03:10:16 +00:00
Chad Horohoe
423c0682c5 Remove deprecated convertLinkToAllVariants()
Deprecated since 1.17, not used anywhere in core or extensions

Change-Id: Id90ee1765899ea331a65ce372744ed465686c84b
2014-01-02 12:01:42 -08:00
umherirrender
073abe3e12 No variable assignment on return statement
Split the variable assignment and the return statement in two lines for
better readability.

When there were two return statements in one method, the logic was swapped
to have only one return statement.

Change-Id: Id7a01b4a2df96036435f9e1a9be5678dd124b0af
2014-01-02 09:43:35 +00:00
umherirrender
0bc583af2c Move closing parenthesis from multi line if and function to own line
The line continuation coding conventions prefer the closing parenthesis
on the same line as the opening curly brace. This is done for ifs
and functions.
Also moved some boolean operators from the end of a line to the beginning
and changed some indentation to make the conditions more
readable.

Change-Id: Id0437b06bde86eb5a75bc59eefa19e7edb624426
2013-12-01 21:39:00 +01:00
umherirrender
5ca5672aac Fixed spacing
- Placed commas correctly
- Moved comments
- Add space after if/foreach/catch
- Reformat some conditions
- Removed trailing spaces/tabs

Change-Id: I40ccda72c418c4a33fcd675773cb08d971510cdb
2013-12-01 20:58:51 +01:00
Niklas Laxström
b1ab14c26e Increase LanguageConverter cache version
It stores ReplacementArray objects, which had changes in
I1b2e3360468cbfc8.

Bug: 56911
Change-Id: Ief8f4848ab33ae3cb3356e60030f76f20f135dcb
2013-11-11 19:44:36 +00:00
physikerwelt (Moritz Schubotz)
545b712ed4 Mark Math-specific functions in core as deprecated
The math specific functions in core are not needed
anymore and should be removed in future versions.
Math can access these settings in the same way as
all other extensions do.

Since Math 2.0 the rendered element has the property
"markerType" => 'nowiki'

Change-Id: I20d3714bed9da864146f133a08cf4ca90eda42ab
2013-11-06 17:41:31 +01:00
Liangent
325632162c Don't match HTML entities in language conversion syntax
RegEx provided by Gabriel Wicke.

Change-Id: Idca127acc6f4cdc159ee85d5f816a5d120cbe44e
2013-09-15 18:16:54 +00:00
jenkins-bot
b8af5d9485 Merge "Add converted namespace names as aliases to avoid confusion."
2013-09-02 17:47:46 +00:00
Liangent
3a06dd9be9 Allow more than one variant set in user preferences.
Now with the introduction of page language, a site can have pages
in all languages, and different languages have different variants.
This patch allows users to set preferred variants for every page
they may see on the site.

Change-Id: Ie7e82bee0b1f8f902b38bb4a464cf0ebc4df4d89
2013-08-16 05:46:14 +00:00
Liangent
d0e3dc94c3 Add converted namespace names as aliases to avoid confusion.
Currently if the site language is zh and a user is using variant zh-tw,
namespace names from zh-hant are displayed because of the language
converter, but they're not accepted by MediaWiki as valid namespace names
by default because zh falls back to zh-hans.

For core namespaces, all converted namespace names are manually added as
$namespaceAliases in MessagesZh.php but it's not always done in extensions.
With this patch converted namespace names are automatically added as
namespace aliases when namespace aliases are loaded.

In some followup commit it makes sense to remove existing core namespace
aliases which were created for this reason.

Change-Id: I01873d9c64a9943afbb655d6203cec9ebd39fb72
2013-08-13 13:01:40 +00:00
Kevin Israel
876bddf637 Change @since and @deprecated notes to 1.22
Using the following command line, I have found doc comments mentioning
"1.21" when they should mention "1.22" instead, which I have fixed
manually:

git diff REL1_21 | grep --color=always -C 10 -iE \
'^\+.*(since|deprecated).*1\.21(\D|$)' | aha > oldver.html

I also moved the release notes for I1987190f ("Combine JavaScript and
JSON encoding logic") from RELEASE-NOTES-1.21 to RELEASE-NOTES-1.22
because I had reverted the commit on REL1_21 only (see Id3b88102 and
bug 47431 for the rationale).

Change-Id: I11b917a371e07267dfa98b8449776d0c1cb29b15
Follows-Up: I25cf5a94f6e47f85a9d0b80cc1c9c9f957288478
Follows-Up: I3d72e4105f6244b0695116940e62a2ddef66eb66
Follows-Up: I3faa9c3e8107c6e46cdf21f8c18adda1f42890d7
Follows-Up: I6aab19c8d68bf47beddad42632b0360a7b12f251
Follows-Up: I86368821fc2cd0729df5342b8572eb470c0f77a0
Follows-Up: Id3b88102e768318e3605a19e9952121091a40915
Follows-Up: Ie667088010e24eb6cb569f9e8e8e2553005223eb
2013-06-21 05:33:22 +00:00
Timo Tijhof
b7bec085ce Drop redundant attributes in hardcoded html
Follows-up 97caae596d which makes HTML5 the default
and removes support for XHTML 1.0 and HTML < 5.

* <script type>
* <style type>
* <html xmlns>
* Quick-closing slash in non-XML HTML5 documents

Change-Id: I71855fa8d4095a5a448ebdc3dc36506ddab6f70c
2013-05-21 01:05:12 +02:00
Timo Tijhof
4bd5471ca3 docs: Remove odd colons after @todo
Most were this way already:
https://doc.wikimedia.org/mediawiki-core/master/php/html/todo.html

Ran a find/replace on the odd ones. Also made them all
lower case.

Change-Id: I70c6a69344ddebc603e9a1c1d87e3cc4f4f4c560
2013-05-15 06:23:40 +00:00