1737 commits

Author SHA1 Message Date
daniel
1ae400bb93 Remove some deprecated methods from the Language class
The following methods and fields in the Language class, deprecated since
1.35, have been removed:
- findVariantLink()
- convertTitle()
- updateConversionTable()
- classFromCode()
- clearCaches()
- mConverter

Change-Id: I9281f37be3a374e072d6afde8f352138af13adbe
2021-04-13 13:26:27 +00:00
Peter Ovchyn
45140daa29 Avoid using User ::getDefaultOption, ::getDefaultOptions
This patch hard-deprecates the methods above

Bug: T276035
Change-Id: Ic36b0702f7547acce0d162d6e0b54bbd4ecf4d81
2021-03-16 17:24:17 +02:00
jenkins-bot
d7af1cdbe3 Merge "Convert Language to UserIdentity" 2021-03-11 19:53:34 +00:00
daniel
7e2f7efa27 Convert Language to UserIdentity
This also introduces minimal instance caching into UserFactory

Change-Id: I594c5668c537477516dda4beecd11b8aa840ae62
2021-03-11 20:23:23 +01:00
DannyS712
35bd84e9ee Comments: use only // instead of more
No need for three or more slashes

Except in some places where a bunch more are
used for drawing attention to something

Change-Id: Ic90358eb89a14a04d2b66c48e52e8fb20de0eb04
2021-03-10 15:05:57 +00:00
Umherirrender
9284893744 Avoid unstubbing user in BlockErrorFormatter::getFormattedBlockErrorInfo
Allow passing a User object to Language::formatExpiry, which then does
not depend on global state

Bug: T267445
Change-Id: Ibd2991b7f051f2a7635c5f4844c8cbfab473557e
2021-02-24 17:18:50 +01:00
Reedy
f0a950703c Language.php: Mark some closures as static
Bug: T274036
Change-Id: I01d986ba32d53c29a585dcdbea9e510832dfba32
2021-02-07 02:20:35 +00:00
Umherirrender
49e7981cc2 Fixed mixed escaping in Language::translateBlockExpiry
Bug: T268938
Change-Id: I44c12b9676610e596254b68b829ac62bc109cdde
2020-12-07 15:40:26 +01:00
jenkins-bot
4c9899ea99 Merge "languages: Language::formatNum() should accept any valid number" 2020-11-23 20:54:40 +00:00
C. Scott Ananian
e099c38ef4 languages: Language::formatNum() should accept any valid number
The PHP function is_numeric() returns true for numbers like '123.456'
and even '1.23e45'. However, it returns false for (string)NAN,
(string)INF, and (string)-INF (which are "NAN", "INF" and "-INF"
respectively).  We can return the appropriate unicode characters for
the infinities to localize these/make them universal, and allow a
localization of the "Not a Number" message.

Make the corresponding change to Language::parseFormattedNumber() so
that it remains the inverse operation to ::formatNum().

Accept "NAN"/"INF"/"-INF" only when they stand alone in the string;
in the legacy case where text and numbers are intermingled, split
only on "traditional" numbers; I think we're more likely to find
INF/NAN "innocently" in the middle of text than we are to find it
as a "real" number.

Change-Id: I3ff227a4aac66fc938182dc9fb8a7b743e94faca
2020-11-23 15:20:43 -05:00
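
Note on e099c38ef4: the rule described above (special values are honored only when they stand alone, while mixed text is split only on "traditional" numbers) can be sketched as follows. This is a minimal illustration in Python, not the MediaWiki implementation; the names format_num, group_digits, SPECIAL_VALUES and TRADITIONAL_NUMBER are hypothetical, and the "Not a Number" string stands in for a localizable message.

```python
import re

# Hypothetical stand-ins for the localized output strings.
SPECIAL_VALUES = {
    "INF": "\u221e",          # infinity sign
    "-INF": "\u2212\u221e",   # Unicode minus followed by infinity sign
    "NAN": "Not a Number",    # placeholder for a localizable message
}

# A "traditional" number: optional sign, digits, optional fraction and exponent.
# The single capturing group makes re.split() keep the matched numbers.
TRADITIONAL_NUMBER = re.compile(r"(-?\d+(?:\.\d+)?(?:[eE][+-]?\d+)?)")


def group_digits(number: str) -> str:
    """Add thousands separators to the integer part of one numeric chunk."""
    sign = "-" if number.startswith("-") else ""
    digits = re.match(r"\d+", number[len(sign):]).group(0)
    tail = number[len(sign) + len(digits):]
    return sign + f"{int(digits):,}" + tail


def format_num(text: str) -> str:
    # Special values count only when they are the entire input.
    if text in SPECIAL_VALUES:
        return SPECIAL_VALUES[text]
    # Legacy mixed input: split on traditional numbers only, so an "INF"
    # that appears inside ordinary text is left untouched.
    parts = TRADITIONAL_NUMBER.split(text)
    # Odd indices hold the captured numbers, even indices the text between.
    return "".join(group_digits(p) if i % 2 else p for i, p in enumerate(parts))
```

With this sketch, format_num("INF") yields the infinity sign, while format_num("INFO: 12345 items") leaves "INFO" alone and only groups the digits.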
Umherirrender
201980999a build: Updating mediawiki/mediawiki-phan-config to 0.10.4
Change-Id: I56538eaa498ab6d312240f9a534c2d2da11c34cb
2020-11-20 17:33:22 +01:00
jenkins-bot
7f61804bf5 Merge "Use Unicode minus in output of {{formatnum}}" 2020-11-19 23:06:54 +00:00
C. Scott Ananian
5d145154ca Check validity of language code before constructing NumberFormatter
The underlying libICU only allows language codes of length 157 or less
(ULOC_FULLNAME_CAPACITY from
https://github.com/unicode-org/icu/blob/master/icu4c/source/common/unicode/uloc.h).

Bug: T267589
Change-Id: I1e182053dec6c6f8ad379cde544b829f410664d3
2020-11-19 16:26:54 -05:00
C. Scott Ananian
5553106baf Use Unicode minus in output of {{formatnum}}
Bug: T10327
Change-Id: I4b315d439fef7d7cdf2fc5ae1904e0460a2a60e0
2020-11-16 18:08:31 +00:00
C. Scott Ananian
95db8114be language: Don't add formatNum tracking category for #s in exponential notation
NumberFormatter handles exponential notation fine, and is_numeric
recognizes it, but some of our checks on the {{formatnum}} parser
function were a bit too strict.

Bug: T237467
Change-Id: I20c51da1e58bffeefba18237815541c1b6ccb415
2020-11-10 22:22:50 -05:00
jenkins-bot
08deaf12d6 Merge "Downgrade the severity of the non-numeric argument to formatNum warnings" 2020-11-10 19:37:46 +00:00
C. Scott Ananian
c79d3289f8 Downgrade the severity of the non-numeric argument to formatNum warnings
The core Language::formatNum() method w/ a non-numeric argument is
deprecated, but it may take us a while to chase down more of the long
tail of callers so temporarily downgrade the severity of the
hard-deprecation warning.

The uses in media (FormatMetadata) are due to unknown EXIF tags (which
we should eventually teach our EXIF parser about so they get localized
properly) and bad metadata in uploaded images (which we will probably
never fix).  Downgrade the severity of these logs permanently so we
can track down unknown EXIF tags w/o flooding our logs w/ every bad
image uploaded to commons.

Bug: T267370
Bug: T267587
Change-Id: I778daed5c2d23ff880ada9e226902ad97b6d00c4
2020-11-10 12:44:22 -06:00
C. Scott Ananian
73c29dbe23 language: Honor $wgTranslateNumerals, even if PHP does digit translation
The PHP NumberFormatter class usually does digit translation itself,
which can be a problem if a wiki has explicitly elected *not* to
localize numerals.  Use the 'C' locale to bypass this feature of PHP
NumberFormatter in the case where a wiki has explicitly set
$wgTranslateNumerals to false.

Bug: T267614
Change-Id: I7a21577a7dfb5274a125515068da9e3418f8a472
2020-11-10 13:29:58 -05:00
C. Scott Ananian
d7b2fe4d46 language: Clean up $separatorTransformTable in km/la/my
LanguageKm and LanguageMy defined overrides for ::formatNum()
which did nothing -- but what they really wanted to do was just to
suppress separators.  Do that in the 'modern' way by invoking
::formatNumNoSeparators().

MessagesLa and MessagesKm defined $separatorTransformTable for one but
not both of the keys `.` and `,`.  Add defaults to
Language::formatNum() to handle this case without burping out a PHP
notice.  Use belt and suspenders by also defining an identity mapping
for '.' in MessagesLa::$separatorTransformTable.

Bug: T267091
Change-Id: I0169606ca1e211d241fa71f23ee0a16edc64b7ae
2020-11-05 03:30:05 +00:00
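
Note on d7b2fe4d46: the "add defaults" fix for a table that defines only one of the `.` and `,` keys can be sketched as below. This is an illustrative Python sketch under stated assumptions, not the PHP code; apply_separator_transform is a hypothetical name, and the input is assumed to be already formatted with `,` for grouping and `.` for the decimal point.

```python
from typing import Mapping


def apply_separator_transform(formatted: str, table: Mapping[str, str]) -> str:
    """Map '.' and ',' through a possibly partial separator table."""
    # Start from identity mappings so a missing key never causes a lookup error.
    full = {".": ".", ",": ","}
    full.update(table)
    # Translate in one pass so that swapping '.' and ',' does not clobber itself.
    return formatted.translate(str.maketrans(full))
```

A table that defines only `,` still works: apply_separator_transform("1,234.5", {",": "\u00a0"}) replaces the grouping separator and leaves the decimal point as it is.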
jenkins-bot
dbf24bff58 Merge "Hard deprecate Language::commafy; deprecate mediawiki.language.commafy" 2020-10-29 00:45:40 +00:00
jenkins-bot
563a4d430d Merge "Correct misinterpretation of $minimumGroupingDigits" 2020-10-29 00:45:33 +00:00
C. Scott Ananian
4bc5c76129 Hard deprecate Language::commafy; deprecate mediawiki.language.commafy
Language::commafy was deprecated; it is poorly named, and its functionality
is rolled into ::formatNum/::formatNumNoSeparators.

Deprecate mediawiki.language.commafy for similar reasons, and because
it parallels a deprecated core PHP method.  Perhaps it would be
worthwhile to add a new JS method in the future more closely matching
the PHP ::formatNum/::formatNumNoSeparators pair.

Code search: https://codesearch.wmcloud.org/search/?q=commafy%5C%28&i=nope&files=&repos=

Change-Id: Id3fc5dc2c7e62495a532db93d85a6f1cb8e8cbeb
2020-10-28 19:39:17 +00:00
C. Scott Ananian
af231558d6 Correct misinterpretation of $minimumGroupingDigits
In a previous patch (ce8d0e9599) some
comments were left based on a misinterpretation of the differences in
$minimumGroupingDigits for the hy/ru/uk languages.  Correct the
misunderstanding in the comments and invert the sense of the test to
match a more natural interpretation.

Bug: T262500
Change-Id: I22954cb989ade9e2bcfa2acdda714389696251f2
2020-10-28 10:21:24 -04:00
Thiemo Kreuz
1fc8d79ac6 Remove documentation that literally repeats the code
For example, documenting the method getUser() with "get the User
object" does not add any information that's not already there.
But I have to read the text first to understand that it doesn't
document anything that's not already obvious from the code.

Some of this is from a time when we had a PHPCS sniff that was
complaining when a line like `@param User $user` doesn't end
with some descriptive text. Some users started adding text like
`@param User $user The User` back then. Let's please remove
this.

Change-Id: I0ea8d051bc732466c73940de9259f87ffb86ce7a
2020-10-27 19:20:26 +00:00
Reedy
77c3b90307 Language: Fix 'uesd' typo to 'used'
Change-Id: I20c615b6b8c7f9420b29f8e8486d8979fefb6a49
2020-10-23 16:39:20 +00:00
C. Scott Ananian
1c9bbfc14e Language: hard deprecate the noSeparators parameter to ::formatNum
Code should use Language::formatNumNoSeparators() instead, which has
existed since MW 1.21.

Code search:
https://codesearch.wmcloud.org/search/?q=formatNum%5C%28%5B%5E%29%5D*%2C&i=nope&files=&repos=

Depends-On: I95c365e2535bb3c47bed69a9b702c8f13d9fab87
Depends-On: I012434d5f6c749fec45a6c160e8d5d03686192e9
Depends-On: If3de5645a92514f605d4117fea3a820ed6c86624
Change-Id: I58a66975e505f16d8db5d663a9ca225535277983
2020-10-21 10:08:04 -04:00
Santhosh Thottingal
ce8d0e9599 Update formatNum implementation to match tr35 and latest CLDR
* Update digitGroupingPattern to match CLDR 31: new versions of CLDR have
  a digit grouping pattern with a decimal part. Update the digitGroupingPattern
  values in Message classes with this improved pattern.
  Refer: http://unicode.org/reports/tr35/tr35-numbers.html

* Refer the following chart for the decimal patterns.
  http://www.unicode.org/cldr/charts/31/by_type/numbers.number_formatting_patterns.html

* Uses PHP NumberFormatter class for the commafy implementation, which
  is available in PHP 7.

* Some tests needed updating to match the TR 35 spec

* The formatNum public method in Language.php is the preferred way to
  use this feature. It does separator transformation and digit transformation
  wherever applicable.

* Renamed the second param name for formatNum from noCommafy to noSeparators

* The commafy method is deprecated and formatNum is preferred. In practice,
  we are not just adding commas, but separators appropriate to the language.
  Replaced some tests based on commafy methods with tests based on formatNum.

Note: The corresponding js implementation is not changed in this commit.
It would probably be a good idea to use globalize.js, which is also based
on the CLDR patterns.

Note: This patch preserves the existing off-by-one error in
$minimumGroupingDigits; T262500 will eventually fix this.

Bug: T167088
Co-Authored-By: C. Scott Ananian <cscott@cscott.net>
Change-Id: Ic721b9a91e78e4ef07040339d1006b7a90a910c0
2020-10-21 10:08:04 -04:00
jenkins-bot
9470446c30 Merge "Hard deprecation of Language::convertTitle(), ::findVariantLink(), ::updateConversionTable()" 2020-10-20 21:57:32 +00:00
ArtBaltai
e7dbd69de0 Hard deprecation of Language::convertTitle(), ::findVariantLink(), ::updateConversionTable()

Co-authored-by: C. Scott Ananian <cananian@wikimedia.org>
Bug: T226832
Change-Id: I41a3b67490fc6b9d4c484f566d346a0d10c670e9
2020-10-20 16:49:17 -04:00
C. Scott Ananian
448a51ae08 Re-enable deprecation warning in Language::commafy() for non-numeric string
This reverts commit 10b0e1468e.

The underlying issue was fixed by 45183f0bf2.

Bug: T263592
Bug: T237467
Change-Id: I05fdc99b2326f36e89cbd39672eca404af49a18a
2020-10-20 15:46:17 -04:00
Petr Pchelko
da2f8ac8e6 Remove unused methods in language hard-deprecated in 1.34
Change-Id: I21f14b290b0c8fb22a687598836749e9f65528ba
2020-10-18 23:55:45 +00:00
Reedy
10b0e1468e Disable deprecated warning in Language::commafy() for non numeric string
Bug: T263592
Bug: T237467
Change-Id: I7a24659dd0931b59bdb01a283b6f8f81e933406a
2020-09-22 22:03:22 +01:00
C. Scott Ananian
b9b8b53682 Language: ensure commafy does not corrupt UTF-8 strings
The commafy method "should" be given valid numeric strings, but this
wasn't enforced, and if it were provided input which started with a UTF-8
multibyte character and then had a single ASCII digit somewhere after
that, it would return the first byte of the input string, resulting in
an invalid UTF-8 sequence.

Fix this bug with belt and suspenders: first, enforce the expected
input structure at the top of the function.  Since there is existing
code which expects us to "do our best" with invalid input, split the
input string into valid numeric chunks before processing it.  This
split code triggers a hard deprecation warning, so we can eventually
remove it.

Second, make the sign test more robust and anchor the $integerPart
regexp to match assumptions made in the algorithm, so that even if
bogus input *did* creep through (a sloppy future maintainer, say) it
wouldn't lead to corrupt UTF-8 in the output.

Add test cases covering these conditions, borrowing liberally from
I741b70757e43b1312c86719920e29885566e916c, which points out that while
commafy expects numeric strings, formatNum replaces – character by
character – digits and separator characters with language specific
ones. Optionally thousand separators are added (a.k.a. "commafy").
Eventually we should tighten the spec for formatNum as well; some of
this has already been done in
I03ffa99f7de1dcc48535ba1e1251567dbf3db116 and
I89b17a9e11b3afc6c653ba7ccc6ff84c37863b66.

Some additional test case fixes borrowed from
If45ef33a50b2623322f17306d123f0d8cb468618 which updated a few test
cases to be more specific, i.e. actually test stuff (for example,
commafy doesn't happen on 3-digit numbers, and numerals are not
translated in English).

Bug: T237467
Depends-On: I89b17a9e11b3afc6c653ba7ccc6ff84c37863b66
Depends-On: I9dcbe91fa926dba1cfd24d9bf075ee1ebef36b9e
Depends-On: I03ffa99f7de1dcc48535ba1e1251567dbf3db116
Change-Id: If3dcfd71acd8ebf3eea6a49408260f2aaa07e469
2020-09-18 08:37:56 +00:00
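
Note on b9b8b53682: the "validate first, then split into numeric chunks and warn" structure described above can be sketched as follows. This is a hedged Python illustration, not the PHP fix itself; commafy, _group and NUMERIC are hypothetical names. Python strings are sequences of code points, so the byte-level corruption cannot occur here; the sketch only shows the control flow.

```python
import re
import warnings

# One valid numeric chunk: optional sign, digits, optional fraction.
NUMERIC = re.compile(r"-?\d+(?:\.\d+)?")


def _group(match) -> str:
    """Add thousands separators to the integer part of one matched chunk."""
    number = match.group(0)
    sign = "-" if number.startswith("-") else ""
    integer, dot, fraction = number[len(sign):].partition(".")
    return sign + f"{int(integer):,}" + dot + fraction


def commafy(text: str) -> str:
    # Enforce the expected input structure at the top of the function.
    if NUMERIC.fullmatch(text) is None:
        # Legacy callers pass mixed text; warn so the fallback can be removed later.
        warnings.warn("commafy() expects a numeric string",
                      DeprecationWarning, stacklevel=2)
    # Only the numeric chunks are rewritten; everything else passes through intact.
    return NUMERIC.sub(_group, text)
```

Non-numeric input such as "é1234" still produces a best-effort result ("é1,234") but triggers the deprecation warning, and three-digit numbers come back unchanged.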
DannyS712
dcb3e9cfd6 Language::translateBlockExpiry update $user docs
Also update for LanguageFi

Bug: T243708
Change-Id: Icb42980855d8b5db16cfef4a7dade0b341cb95bd
2020-09-15 21:47:08 +00:00
DannyS712
aedb7dce16 Remove use of global $wgUser in Language::dateFormat and ::userAdjust
Bug: T246862
Change-Id: I321e7f81347a96170c6dad12deeb333922016951
2020-09-15 00:51:38 +00:00
Umherirrender
ba216e52e7 includes: Use expression assignment operator += or |= where possible
It is easier to read.

Change-Id: Ia3965b80153d64f95b415c6c30f526efa252f554
2020-07-31 22:26:42 +00:00
Ed Sanders
0cf40a4f7a Flip Yoda conditionals
Change-Id: Id3495b6f15c267123c89f3a0ace496e6ecbeb58e
2020-07-22 17:49:12 +01:00
Thiemo Kreuz
554872a201 language: Improve documentation of bool return values and such
Change-Id: I9e2680b10473115a4d9af1083fbd2484af1c23f4
2020-07-14 18:51:19 +00:00
jenkins-bot
8b2f44b6e7 Merge "phan: Enable redundant_condition_detection" 2020-07-02 00:28:10 +00:00
Umherirrender
bc5cb7ae64 phan: Enable redundant_condition_detection
Remove duplicate casts
Suppress false positives

Bug: T248438
Change-Id: I2f89664a4bcd3b39b15e7cf850adda2f0c90ae6f
2020-07-01 20:13:07 +00:00
DannyS712
94169ee873 Whitespace cleanup: Use tabs for indentation, avoid double spaces
Change-Id: I346073b59d283029bd6666356c62c81e687ea5e6
2020-06-27 07:53:07 +00:00
Tim Starling
d459add63d Introduce wfDeprecatedMsg()
Deprecating something means to say something nasty about it, or to draw
its character into question. For example, "this function is lazy and good
for nothing". Deprecatory remarks by a developer are generally taken as a
warning that violence will soon be done against the function in question.
Other developers are thus warned to avoid associating with the deprecated
function.

However, since wfDeprecated() was introduced, it has become obvious that
the targets of deprecation are not limited to functions. Developers can
deprecate literally anything: a parameter, a return value, a file
format, Mondays, the concept of being, etc. wfDeprecated() requires
every deprecatory statement to begin with "use of", leading to some
awkward sentences. For example, one might say: "Use of your mouth to
cough without it being covered by your arm is deprecated since 2020."

So, introduce wfDeprecatedMsg(), which allows deprecation messages to be
specified in plain text, with the caller description being optionally
appended. Migrate incorrect or grammatically awkward uses of wfDeprecated()
to wfDeprecatedMsg().

Change-Id: Ib3dd2fe37677d98425d0f3692db5c9e988943ae8
2020-06-22 14:34:39 +10:00
Tim Starling
68c433bd23 Hooks::run() call site migration
Migrate all callers of Hooks::run() to use the new
HookContainer/HookRunner system.

General principles:
* Use DI if it is already used. We're not changing the way state is
  managed in this patch.
* HookContainer is always injected, not HookRunner. HookContainer
  is a service, it's a more generic interface, it is the only
  thing that provides isRegistered() which is needed in some cases,
  and a HookRunner can be efficiently constructed from it
  (confirmed by benchmark). Because HookContainer is needed
  for object construction, it is also needed by all factories.
* "Ask your friendly local base class". Big hierarchies like
  SpecialPage and ApiBase have getHookContainer() and getHookRunner()
  methods in the base class, and classes that extend that base class
  are not expected to know or care where the base class gets its
  HookContainer from.
* ProtectedHookAccessorTrait provides protected getHookContainer() and
  getHookRunner() methods, getting them from the global service
  container. The point of this is to ease migration to DI by ensuring
  that call sites ask their local friendly base class rather than
  getting a HookRunner from the service container directly.
* Private $this->hookRunner. In some smaller classes where accessor
  methods did not seem warranted, there is a private HookRunner property
  which is accessed directly. Very rarely (two cases), there is a
  protected property, for consistency with code that conventionally
  assumes protected=private, but in cases where the class might actually
  be overridden, a protected accessor is preferred over a protected
  property.
* The last resort: Hooks::runner(). Mostly for static, file-scope and
  global code. In a few cases it was used for objects with broken
  construction schemes, out of horror or laziness.

Constructors with new required arguments:
* AuthManager
* BadFileLookup
* BlockManager
* ClassicInterwikiLookup
* ContentHandlerFactory
* ContentSecurityPolicy
* DefaultOptionsManager
* DerivedPageDataUpdater
* FullSearchResultWidget
* HtmlCacheUpdater
* LanguageFactory
* LanguageNameUtils
* LinkRenderer
* LinkRendererFactory
* LocalisationCache
* MagicWordFactory
* MessageCache
* NamespaceInfo
* PageEditStash
* PageHandlerFactory
* PageUpdater
* ParserFactory
* PermissionManager
* RevisionStore
* RevisionStoreFactory
* SearchEngineConfig
* SearchEngineFactory
* SearchFormWidget
* SearchNearMatcher
* SessionBackend
* SpecialPageFactory
* UserNameUtils
* UserOptionsManager
* WatchedItemQueryService
* WatchedItemStore

Constructors with new optional arguments:
* DefaultPreferencesFactory
* Language
* LinkHolderArray
* MovePage
* Parser
* ParserCache
* PasswordReset
* Router

setHookContainer() now required after construction:
* AuthenticationProvider
* ResourceLoaderModule
* SearchEngine

Change-Id: Id442b0dbe43aba84bd5cf801d86dedc768b082c7
2020-05-30 14:23:28 +00:00
Reedy
5f1000f7bd Fix languages/ PSR12.Properties.ConstantVisibility.NotFound
Change-Id: I6f88f2eaf2fc69016b99124eeb9f6e2616c148d2
2020-05-16 21:49:02 +01:00
Reedy
12a3883a7b Fix SingleSpaceBeforeSingleLineComment
Change-Id: I285af438ce484af40741489797f20455726ec110
2020-05-11 00:57:11 +00:00
Ori Livneh
b5ccf229c3 languages: Apply a small optimization to Language::ucfirst()
We don't need to scan the whole string for multi-byte characters here,
since we only care about the first character in the string. By only
checking the byte-width of the first character we avoid a potentially
expensive call to mb_strlen.

Change-Id: I1e8dd6e3f3c5a8d795085edd1b128bc38ec67d78
2020-05-04 23:41:25 +00:00
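
Note on b5ccf229c3: the idea of inspecting only the first character's byte width, rather than scanning the whole string, can be sketched as below. This is an illustrative Python sketch operating on UTF-8 bytes, not the PHP code; ucfirst here is a hypothetical stand-in and assumes the input is valid UTF-8.

```python
def ucfirst(data: bytes) -> bytes:
    """Uppercase the first character of a UTF-8 byte string."""
    if not data:
        return data
    first = data[0]
    if first < 0x80:
        # ASCII lead byte: the first character is exactly one byte wide.
        return bytes([first]).upper() + data[1:]
    # Multi-byte lead byte: its high bits give the character's width,
    # so only that many bytes need to be decoded (valid UTF-8 assumed).
    width = 2 if first < 0xE0 else 3 if first < 0xF0 else 4
    head = data[:width].decode("utf-8")
    return head.upper().encode("utf-8") + data[width:]
```

The rest of the string is never decoded or measured, which is the point of the optimization: the cost depends on the first character only, not on the string length.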
Timo Tijhof
75ccdc6147 languages: Move default $wgNamespaceAliases to MessagesEn.php
These are not configuration but business logic. Like the canonical
names in NamespaceInfo.php, they must always exist and cannot be
altered or unset.

They were previously unconditionally assigned during all requests
in Setup.php and passed down as "site configuration".

Changes:

* Move them to MessagesEn.php where they can be cached and
  processed the same way as other core-provided aliases.

  Document and confirm with tests that this is a mergeable
  attribute that follows the language chain.

* Remove the duplicated code in a few places that was reading
  this variable + Language::getNamespaceAliases(), to instead
  just call the latter and move the logic there, centralised,
  and tested.

  In doing so I noticed that these were applied in an
  inconsistent order. Sometimes the config won, sometimes not.
  There's no obvious right or wrong way here, but I've chosen
  to standardise on the way that Language::getNamespaceIds() did
  it, which is that config wins. This is because that method seems
  to be most widely used of the three (it decides how URLs and
  titles are parsed), and thus the one I least want to change
  the behaviour of.

* Document that $wgNamespaceAliases may only be used to
  define (extra) aliases; it is not, and never was, a way to access
  the complete list of aliases.

Bug: T189966
Change-Id: Ibb14181aba8c1b509264ed40523e9ab4000fd71a
2020-03-14 19:27:40 +00:00
ArtBaltai
31283f34bf Reduce usage of the Language class
reduce/deprecate visibility of some members of the Language class

Bug: T243913
Change-Id: I6bad608455ceaa46f895f00dcc6380cec6d32680
2020-03-03 01:38:27 +03:00
jenkins-bot
0f357294ca Merge "Remove LanguageConverter dependencies on Title and use LinkTarget" 2020-02-12 22:44:12 +00:00
ArtBaltai
3bf4b42490 Remove LanguageConverter dependencies on Title and use LinkTarget
Replace usage of Title in LanguageConverter with LinkTarget, which
is more lightweight and provides just the props needed in Language.

Bug: T226834
Change-Id: I02a386bd9898e83c773cbd3d738d347d08f52c11
2020-02-12 18:37:11 +03:00