In some languages it's conventional not to insert a thousands
separator in numbers that are four digits long (1000-9999).
Rather than copy-paste the custom code to do this between 13 files,
introduce another option and have the base Language class handle it.
This also fixes an issue in several languages where this logic
previously would not work for negative or fractional numbers.
To implement this, a new option is added to MessagesXx.php files,
`$minimumGroupingDigits = 2;`, with the meaning as defined in
<http://unicode.org/reports/tr35/tr35-numbers.html>. It is a little
roundabout, but it could allow us to migrate the number formatting
(currently all custom code) to some generic library easily.
Bug: T177846
Change-Id: Iedd8de5648cf2de1c94044918626de2f96365d48
Clean up use of @codingStandardsIgnore
- @codingStandardsIgnoreFile -> phpcs:ignoreFile
- @codingStandardsIgnoreLine -> phpcs:ignore
- @codingStandardsIgnoreStart -> phpcs:disable
- @codingStandardsIgnoreEnd -> phpcs:enable
For phpcs:disable always the necessary sniffs are provided.
Some start/end pairs are changed to line ignore
Change-Id: I92ef235849bcc349c69e53504e664a155dd162c8
- mostly auto fixes
- some too long lines fixed
- ignore amp space in one case passing by reference
Change-Id: I6472f83bc3cbf4bd629d83050cc3319b19ec465c
Change 950cf6016c took care of the most,
but a few remain, either outside of includes/ and maintenance/
directories (which that change was limited to), or in code introduced
afterwards.
Change-Id: I9c363d0219ea7e71cde520faba39406949a36d27
And auto-fix all errors.
The `<exclude-pattern>` stanzas are now included in the default ruleset
and don't need to be repeated.
Change-Id: I928af549dc88ac2c6cb82058f64c7c7f3111598a
Previously, even if $wgUsePigLatinVariant was false, the language
would show up on Special:Preferences (and some other places) as
'en-x-piglatin - Igpay Atinlay'.
Follow-up to d8375bee24.
Change-Id: I08faacabca87c04299c7b535be8df1770e0a37ac
Guarded by the $wgUsePigLatinVariant variable, off by default.
Pig Latin is a language game where words in English are altered
according to the following rules:
* Words starting with a vowel have a '-way' suffix appended.
* Words starting with a consonant have the initial consonants (or 'qu'
group) moved to the end and an '-ay' suffix appended.
https://en.wikipedia.org/wiki/Pig_Latin
* Added 'en-x-piglatin' as a language name.
* Added 'en' to LanguageConverter::$languagesWithVariants.
* Added LanguageEn class and its corresponding EnConverter which
provides one-way translation from English to Pig Latin.
* Some minor internal changes in code that assumed that English
doesn't have a language class or converter.
Bug: T45547
Depends-On: I1d9691c784032669979f8109c9a5f65cbf4122c9
Change-Id: I7fa2d85d6364958c5138366e8b4504a2697a8731
Look at languages/messages/MessagesEo.php for one of about a dozen real
world examples where this is set to false. All code calling
getDatePreferences checks if it got a truthy value first before using
it.
Change-Id: I4ef5c8be618d41039297325c9dd4cf554ea14559
It's unreasonable to expect newbies to know that "bug 12345" means "Task T14345"
except where it doesn't, so let's just standardise on the real numbers.
Change-Id: Id2f9d229d17b8eee66b2ca4e3927f3f66ac62988
I was bored. What? Don't look at me that way.
I mostly targetted mixed tabs and spaces, but others were not spared.
Note that some of the whitespace changes are inside HTML output,
extended regexps or SQL snippets.
Change-Id: Ie206cc946459f6befcfc2d520e35ad3ea3c0f1e0
Use of &$this doesn't work in PHP 7.1. For callbacks to methods like
array_map() it's completely unnecessary, while for hooks we still need
to pass a reference and so we need to copy $this into a local variable.
Bug: T153505
Change-Id: I8bbb26e248cd6f213fd0e7460d6d6935a3f9e468
For relative timestamps in $str, strtotime( $str, $now ) returns an
absolute Unix timestamp $str since $now, and this timestamp is given
to $time. However, Language::formatDuration expects a time duration,
not an absolute timestamp. We obtain this duration from the difference
between $time, the absolute timestamp of block expiry, and $now, the
absolute timestamp of the time in which the block action happened.
Tests have been added to test both this patch and 01936fa, the patch
that caused this regression.
Bug: T156453
Change-Id: I6fd8c02dc3c6456067fe25cb9f33f5b4c78332aa
Currently, a user who has an invalid time zone stored in the database is
effectively locked out of their account on HHVM sites. This patch addresses
this by (1) preventing users from setting invalid time zones, and (2) not
throwing an unhandled exception if a user's TZ is unknown.
When the user saves their preferences, the code silently rewrites invalid
time zones to UTC. I think this is OK, since to cause this to happen you
have to manually muck around with the Preferences page DOM or submit the
form from a script.
Bug: T137182
Change-Id: I28c5e2ac9f2e681718c6080fb49b3b01e4af46dd
This makes the code for processing JSON files with
grammar transformations reusable by different languages
and applies the same logic to Russian and Hebrew.
It will be done to other languages in further patches.
This patch is not supposed to change any functionality,
and the tests are intact (except a comment in the test
for Hebrew - the class doesn't exist any longer).
PHP:
* Move the JSON grammar transformation data processing logic
from LanguageRu.php to convertGrammar() in Language.php.
By default all these data files are supposed to be
processed identically, so the code should be common.
If there is no JSON data file, nothing new happens.
* LanguageRu's own convertGrammar() method is removed.
* The LanguageHe class is removed, now that all its functionality
is handled by generic JSON data processing in the Language class.
LanguageHe.php file is removed from the repo and from autoloading.
JavaScript:
* Move the JSON grammar transformation data processing logic
from ru.js to mediawiki.language.js.
* JavaScript grammar code files he.js and ru.js are removed
from the repo and from Resources.php, because all the data
is in JSON, and the default logic in mediawiki.language.js
works for both languages.
Bug: T115217
Change-Id: I5e75467121c3d791bb84f9e6fdfcf07c1840f81a
Not entirely sure what's going on here. Best guess is phan isn't able
to figure out that array + mixed will result in an array, and then
adding validNamespaces (another array) is ok. Could make things a little
more explicit with array_merge, but this seems to work to remove the
issue without changing the meaning of the code.
Change-Id: I7031ae4e68878ec3198e47c55ab5de4d52a6d922
Adds the functionality of per-line @suppress annotations. Phan already
supports per-class or per-method annotations, but does not have any
per-line support due to the PHP AST only returning comments that are
class/property/method level docblocks.
This is a bit of a hack, but get's the job done. Removes the
PhanTypeInvalidLeftOperand issue from blacklist and suppresses it
to demonstrate the supression works as expected.
Change-Id: I5066b3b431fb69175a711ee366e95f31c7c47639
Most of these are simply changing annotations to reflect
reality. If a function can return false to indicate failure
the @return should indicate it.
Some are fixing preg_match calls, preg match returns 1, 0 or false,
but the functions all claim to return booleans.
This is far from all the incorrect return types in mediawiki, there
are around 250 detected by phan, but have to start somewhere.
Change-Id: I1bbdfee6190747bde460f8a7084212ccafe169ef
* Load the data of this variable from a JSON file to the same
data structure that ResourceLoader uses for digitTransformTable,
pluralRules, etc.
* Change the JSON structure to ensure the order of the rules.
Otherwise JavaScript processes the keys in a random order.
* Delete the grammar code from JS and replace it with
the same logic that is used in PHP for processing the data.
For now this is done only for Russian.
The next step will be to make the PHP and JS
data processing logic reusable.
Bug: T115217
Change-Id: I6b9b29b7017f958d62611671be017f97cee73415
Currently, the function formatTimePeriod does not have a very descriptive
documentation and has a TODO tag on it to change the documentation
Change the documentation to be more descriptive and remove the TODO
Change-Id: Idde711f1d3d6cbe0ecab0c3b49c68a4876d9e8b2
This class isn't any more special than others, it can be autoloaded
like all the others and there's nothing to execute in the file scope
Change-Id: I7c415025c9c15cf19110f39452df3a14e44bf6f9
Deletes LanguageEo.php class which only had remains of the server-side
character conversion (sx <-> ŝ, etc). This is being obsoleted in favor
of client-side IMEs provided by UniversalLanguageSelector extension.
Removes deprecated $wgEditEncoding, which was only used for this.
Turns Language::recodeInput() and Language::recordForEdit() into no-ops
for any old or extension code that happened to still use them.
Bug: T62677
Change-Id: Ib647353538d258dee941f2f7c571191060bc9c7d
All of them are already being used outside the class:
* getMonthAbbreviation
* getMonthAbbreviationsArray
* getWeekdayName
* sprintfDate
* userAdjust
* date
* time
* timeanddate
* getMessage
* iconv
* ucfirst
* uc
Change-Id: I63ec93858cebc02cdf3b9b042eddf4ef620cc110
For this add an user parameter to Language::translateBlockExpiry.
This allows the function to display the absolute block expiry in the
user's time zone. Use this when formatting block log entries.
This also avoids the use of $wgUser
Bug: T131241
Change-Id: If0a1d3c88bb4242a016eb9b2df115413de786149
* Removed fallback code from Language, the associated data file
(Utf8Case.ser), and the code to generate that data file.
* Removed comment in LanguageFi that "mb_substr has a compatibility
function in GlobalFunctions.php".
* Removed check for mbstring in bench_utf8_title_check.php.
* In the tests for StringUtils::isUtf8():
* Removed separate test for the non-mbstring code path.
* Removed mentions of mbstring from function names and assertion
messages, since mb_check_encoding() is now always used.
* Also updated the comment in StringUtils::isUtf8() referring to
PHP 5.3, which is no longer supported in MediaWiki, to indicate
that the same issue also exists in old versions of HHVM. (If
we don't have to support 3.4 or older, then the function could
be deprecated and removed if desired.)
Follows-up 943563062f.
Change-Id: I55e5cd534b849c6ea06a7fadacbbf34a12d87ebe
I searched for /\$(\S+) = (.+?\(.*?\);)\n.*?\$\1\[/, ignored
everything involving isset(), unset() or array assigments, then
skimmed through the remaining results and changed things where they
made sense. These changes were not automated, so please review them.
Change-Id: Ib37b4c66fc57648470f151ad412210b3629c2538
Move ZhConversion.php and Names.php to languages/data and make them both
expose their data as static class variables instead of in the local
scope. This means that the autoloader can be used to load the data,
which is efficient and secure. This also makes additional request-local
caching of the arrays unnecessary.
Change-Id: Iafb96ac4165d0965fcb9a69f1d0a91139ea9790c
Previously various language objects would install a hook to update the
shared conversion table cache when the object was constructed. This is
not a good idea since language objects may be constructed even when they
are not the content language, but only the content language is
associated with variant conversion and the conversion cache.
Instead, have WikiPage call a method on $wgContLang directly. I put this
with message cache update since the logic is almost identical.
Change-Id: Ief9c0ef993e39645e74a6e158cb4e6e2139ce91d
* Remove strcspn() check in newFromCode already in isValidCode().
* Leverage the autoloader instead via class_exists instead of
including files based on user input.
* Create fallback instance directly instead of recursing back
into newFromCode().
* Remove method preloadLanguageClass (unused).
* Remove method getClassFileName (unused)
Change-Id: I90035ca4b07facae051b1a584e92df72b42c4a49
To detect whether the truncation had chopped up a multibyte
character after the first byte, a regex was used. But in this
regex, the dot (.) didn't match newlines, so it failed to
detect chopped multibyte characters (after the first byte)
if there was a newline preceding the chopped character.
Bug: T116693
Change-Id: I66e4fd451acac0a1019da7060d5a37d70963a15a
Changed some old bugzilla links to new phabricator links in comments,
test data and error message. This reduces the need for redirects from
old bugzilla to new phabricator from our source code.
Change-Id: Id98278e26ce31656295a23f3cadb536859c4caa5
* If using 'default', still fallback to 'date' if 'pretty' is
unavailable.
* Fix instance caching of 'default', which never worked since $pref
would be changed inside the !isset() block.
Bug: T110945
Change-Id: Ic53b279f8741371fa1cb642c53e6d487cb1c6b81
The xenon logs on performance.wikimedia.org show this function as being on-CPU
about 1% of the time, making it a candidate for optimization. The version in
this patch is about 30-50% faster in my benchmarks.
Change-Id: I26385ade7600fc11965d94468b57e41ec257de51
Any page on translatewiki with param setlang=qqq times out. All messages
get parsed recursively until parser-template-loop-warning is reached.
uselang=qqq is already ignored, see RequestContext::sanitizeLangCode.
There is a counterpart to this patch in ULS, where it is changed
to use Language::isSupportedLanguage.
Bug: T104987
Change-Id: Ie77fe18681dfd5f9089fbaa8090dd9cc1c206da4
strtr() is marginally faster as it runs through the string only
once. A better fit for one-for-one character translation.
The strtr() function also supports an associative array as second
parameter for entire string replacements. This, too, has the same
performance and predictable behaviour (starts with the longest key).
Whereas str_replace is for more aggressive needs where you want
multiple passes until there are no further matches.
The associative array form is arguably also easier to understand
and harder to mess up since the needle/replacement pairs are
explicitly connected instead of two separate arrays.
Also:
* Use getFormattedNsText instead of strtr( getNsText, .. ) which
reduces duplication of this fact through a more semantic intent.
Change-Id: Ie23e4210a5b6908dd79eebc8a2b931d12fe31af6
wfSuppressWarnings() and wfRestoreWarnings() were split out into a
separate library. All usages in core were replaced with the new
functions, and the wf* global functions are marked as deprecated.
Additionally, some uses of @ were replaced due to composer's autoloader
being loaded even earlier.
Ie1234f8c12693408de9b94bf6f84480a90bd4f8e adds the library to
mediawiki/vendor.
Bug: T100923
Change-Id: I5c35079a0a656180852be0ae6b1262d40f6534c4
The 'fallbackSequence' is exactly generated for this purpose. It
is equal to the value of 'fallback' after splitting, trimming
and ensuring 'en' is present. See LocalisationCache::recache().
Also simplify returning of the first array index by returning it
directly instead of modifying the array first.
Due to an inconsistency between how LocalisationCache and Language classes
treat the fallback sequence differently, we have to manually fallback
to 'en' in case of unknown language codes. This is because otherwise
Language::factory() will throw an exception causing tests to fail.
We should investigate whether this is desirable or not, but keeping
existing behaviour for now and documenting it.
Change-Id: I9c1d51b59aabebf5a31f38205304bb8cc22dcd8c
Use arrays instead of strings, to avoid using
string functions with Unicode.
Handle thousands according to how years like 1000, 2000, etc.
are named in the Hebrew Wikipedia.
Bug: T97444
Change-Id: I5334e86793d28dfcf8939a249b03a5ea85fa4e69
We're trying to remove MediaWiki dependencies from MWTimestamp so it can
be split out into a separate library. In addition to getting rid of a
dependency on RequestContext, Language, and User, it makes more logical
sense there anyways.
Bug: T100924
Change-Id: If46eaea42d8a5a808c10f0dc353e181714295a44
This drops support for the custom utf8 normal PHP extension in favor
of the intl extension.
Bug: T90825
Change-Id: Ifbaeb2ef684217cf6187ccc4fb4d303f89608300
There's a bunch of stuff that probably only works because the database
representation of infinity is actually 'infinity' on all databases
besides Oracle, and Oracle in general isn't maintained.
Generally, we should probably use 'infinity' everywhere except where
directly dealing with the database.
* Many extension callers of Language::formatExpiry() with $format !==
true are assuming it'll return 'infinity', none are checking for
$db->getInfinity().
* And Language::formatExpiry() would choke if passed 'infinity', despite
callers doing this.
* And Language::formatExpiry() could be more useful for the API if we
can override the string returned for infinity.
* As for core, Title is using Language::formatExpiry() with TS_MW which
is going to be changing anyway. Extension callers mostly don't exist.
* Block already normalizes its mExpiry field (and ->getExpiry()),
but some stuff is comparing it with $db->getInfinity() anyway. A few
external users set mExpiry to $db->getInfinity(), but this is mostly
because SpecialBlock::parseExpiryInput() returns $db->getInfinity()
while most callers (including all extensions) are assuming 'infinity'.
* And for that matter, Block should use $db->decodeExpiry() instead of
manually doing it, once we make that safe to call with 'infinity' for
all the extensions passing $db->getInfinity() to Block's contructor.
* WikiPage::doUpdateRestrictions() and some of its callers are using
$db->getInfinity(), when all the inserts using that value are using
$db->encodeExpiry() which will convert 'infinity'.
This also cleans up a slave-lag issue I noticed in ApiBlock while
testing.
Bug: T92550
Change-Id: I5eb68c1fb6029da8289276ecf7c81330575029ef
Refactor out 'infinity' vartiant values which used in blocking and
protecting actions. This patchset adds GlobalFunction wfIsInfinity.
Bug: T68646
Change-Id: I60cc55a5bbd43c72916a1c2ea3807457d4e33765