Commit graph

323 commits

Author SHA1 Message Date
daniel
4880a82555 Parser: remove Title from method signatures
Bug: T281068
Change-Id: I3280e38dd82d71845c343eeb911e71dd33bb380b
2021-04-29 18:11:46 +02:00
Daimona Eaytoy
535d7abf59 phpunit: Mass-replace setMethods with onlyMethods and adjust
Ended up using
  grep -Prl '\->setMethods\(' . | xargs sed -r -i 's/setMethods\(/onlyMethods\(/g'

special-casing setMethods( null ) -> onlyMethods( [] )

and then manual fix of failing test (from PS2 onwards).

Bug: T278010
Change-Id: I012dca7ae774bb430c1c44d50991ba0b633353f1
2021-04-16 20:15:00 +02:00
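The two-step replacement described in the commit above (the bulk rename plus the `setMethods( null )` special case) can be sketched in Python. This is an illustration only, not the command the author ran; the function name `migrate` is hypothetical, and unlike the quoted `sed` expression the sketch anchors both patterns on `->`.

```python
import re

def migrate(source: str) -> str:
    """Rewrite PHPUnit setMethods() calls to onlyMethods() in a PHP source string."""
    # Special case first: setMethods( null ) becomes onlyMethods( [] ).
    source = re.sub(r"->setMethods\(\s*null\s*\)", "->onlyMethods( [] )", source)
    # Then the plain rename for every remaining call.
    return re.sub(r"->setMethods\(", "->onlyMethods(", source)
```

The order matters: running the plain rename first would leave `onlyMethods( null )` behind, which the special-case pattern would no longer match.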
daniel
1ae400bb93 Remove some deprecated methods from the Language class
The following methods and fields in the Language class, deprecated since
1.35, have been removed:
- findVariantLink()
- convertTitle()
- updateConversionTable()
- classFromCode()
- clearCaches()
- mConverter

Change-Id: I9281f37be3a374e072d6afde8f352138af13adbe
2021-04-13 13:26:27 +00:00
Umherirrender
cfcb3e4785 Use ::class for class name
This also works for non-existing classes,
because it is resolved at compile time

Change-Id: Id3132341856fb1eb20e8b494bb4acdfe3a394db6
2021-04-08 21:17:42 +02:00
jenkins-bot
04626a940f Merge "Add converter for the Talysh language (tly)"
2021-02-23 12:45:31 +00:00
James D. Forrester
5a622b6a2e build: Upgrade eslint-config-wikimedia from 0.17.0 to 0.18.1
Change-Id: I5e3687be2b197134578126e1b890ee37dbc1bc1b
2021-02-18 08:39:09 -08:00
Amir Aharoni
14d363c29f Add converter for the Talysh language (tly)
Mostly copied from UzConverter.

This is a very simple converter, with bidirectional one to one
correspondence: for every Latin letter there is a corresponding
Cyrillic letter and vice versa. There are no digraphs or punctuation
to convert.

The Latin alphabet is the primary one used for this language today,
and will probably remain so for the foreseeable future, so "tly" remains
the usual code, and "tly-cyrl" is added for Cyrillic.

Language name is changed:
* The main language name is now Latin.
* The word "language" ("зывон") is removed.
* The spelling of the word "Talysh" is based on the Pireyko dictionary.

Bug: T258975
Change-Id: I552e07967ea82e03c413a0b10b129a846aa007c7
2021-02-17 13:49:36 +00:00
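The bidirectional one-to-one correspondence described in the commit above can be sketched as a pair of inverse lookup tables. This is a minimal Python sketch, not the converter's actual code: the three letter pairs are placeholders chosen for illustration and are not taken from the real Talysh tables, and the function names are hypothetical.

```python
# Placeholder Latin -> Cyrillic pairs, for illustration only.
LATIN_TO_CYRILLIC = {"a": "а", "b": "б", "d": "д"}

# A one-to-one table can be inverted without losing any entry.
CYRILLIC_TO_LATIN = {cyr: lat for lat, cyr in LATIN_TO_CYRILLIC.items()}
assert len(CYRILLIC_TO_LATIN) == len(LATIN_TO_CYRILLIC)

def convert(text, table):
    """Replace each character found in the table; leave all others untouched."""
    return text.translate(str.maketrans(table))

def to_cyrillic(text):
    return convert(text, LATIN_TO_CYRILLIC)

def to_latin(text):
    return convert(text, CYRILLIC_TO_LATIN)
```

Because there are no digraphs, a per-character lookup is enough, and converting in one direction and then back returns the original text.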
Umherirrender
a1de8b8700 Tests: Mark more closures as static
Result of a new sniff I25a17fb22b6b669e817317a0f45051ae9c608208

Bug: T274036
Change-Id: I695873737167a75f0d94901fa40383a33984ca55
2021-02-09 02:55:57 +00:00
jenkins-bot
9fa1d86728 Merge "Improve some class properties documentation in tests"
2021-02-05 02:52:22 +00:00
Peter Ovchyn
d8f54cae8b Language: Turn public properties into Getters in LanguageConverter based hierarchy
The main goal is to simplify the construction of the LanguageConverter and
avoid using constructors for derived classes.

In order to hard-deprecate the removed property, DeprecationHelper::deprecatePublicPropertyFallback
was introduced.

Bug: T253834
Change-Id: Ib167982e4e872cfdf0fbcb78b7ca597f5ac8d60a
2021-02-03 15:17:42 +02:00
DannyS712
210bbe84d4 LanguageConverterTest: fix assertStringNotContainsString call
Autofix in codesniffer was wrong, see T273624

Follow-up: Idb413be4b8cba8611afdc022af59810ce1a4531e
Change-Id: I97c174661ba9ed3e8da7f6b164666d53f36e9fa3
2021-02-03 04:00:52 +00:00
Umherirrender
205f141bb8 Improve some class properties documentation in tests
Change-Id: Id9c9e56865cf9a6bb112be37a5674ec753604fb1
2021-02-02 16:48:15 +00:00
Umherirrender
62002cdcf1 build: Update mediawiki/mediawiki-codesniffer to 35.0.0
Change-Id: Idb413be4b8cba8611afdc022af59810ce1a4531e
2021-01-31 13:34:38 +00:00
jenkins-bot
eb4b304ed8 Merge "Add missing @param and @return to documentation in tests"
2021-01-30 15:22:33 +00:00
vladshapik
3669d658f2 LanguageConverter: add test for deeper checking of each converter's work
LanguageConverterIntegrationTest was added. It checks each converter by giving it long texts to convert, and should cover each converter (includes/language/converters) as thoroughly as possible.

Bug: T272164
Change-Id: I4504b8071c4ae0427375a6c1acffd62d4e6121b5
2021-01-29 14:12:31 +02:00
Amir Aharoni
0e68b408c4 Update the converter for the Tashelhit language (shi)
The alphabets are based on what is used in
"Dictionnaire Général de la Langue Amazighe Informatisé"
by IRCAM (https://tal.ircam.ma/dglai/lexieam.php, DGLAi).

This was also requested by the community in the Tashelhit
Wikipedia Incubator.

Changes:
* Tests are enhanced to cover the whole alphabet.
* The Latin letter š is replaced with the letter c as
  the equivalent of Tifinagh letter ⵛ.
* The Tifinagh letters ⵠ and ⵒ are eliminated
  because they are not used in the Moroccan version
  of Tifinagh, as presented in the DGLAi.
* The Latin letters O, P, V are converted to the
  Tifinagh letters ⵓ, ⴱ, ⴼ, which are the same letters
  that correspond to the Latin letters U, B, and F.
* Greek Gamma and Epsilon characters are replaced with
  the corresponding characters from the Latin range.
* All the non-ASCII Latin letters are added to
  the uppercase-lowercase Latin conversion.

This is the first patch in a series to fix the most important
issues listed above. It's kept minimal to make reviewing easier.
There will be more patches to fix readability and public/private
members, and to add more tests.

Change-Id: I7134216457b12018fd187ca7200e45c1b5a67471
2021-01-25 14:40:55 +00:00
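The Latin-to-Tifinagh change in the commit above makes the mapping many-to-one: O, P and V now share targets with U, B and F. A minimal Python sketch of that consequence follows. It uses only the six uppercase pairs named in the commit message; the function names are hypothetical, and how the real converter handles the reverse direction is not stated in the message.

```python
# Pairs named in the commit message (uppercase forms only).
LATIN_TO_TIFINAGH = {
    "U": "ⵓ", "O": "ⵓ",
    "B": "ⴱ", "P": "ⴱ",
    "F": "ⴼ", "V": "ⴼ",
}

def to_tifinagh(text):
    """Convert the mapped Latin letters; leave everything else as-is."""
    return text.translate(str.maketrans(LATIN_TO_TIFINAGH))

def latin_sources(tifinagh_letter):
    """List every Latin letter that maps onto the given Tifinagh letter."""
    return sorted(
        latin for latin, tif in LATIN_TO_TIFINAGH.items() if tif == tifinagh_letter
    )
```

Since two Latin letters land on each Tifinagh letter, a reverse table has to pick one of them, so a Latin round trip is not guaranteed to restore O, P or V.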
Umherirrender
7691dbeca9 Add missing @param and @return to documentation in tests
Change-Id: Ic663e81cca0bf007804a70772250914a85f1fef4
2021-01-22 19:57:25 +01:00
Umherirrender
0347fd0631 Improve some function documentation in tests
Also fix some whitespace

Change-Id: Ibed50a4f07442d3f299cf545c16f5dbb5f27a411
2021-01-14 22:13:55 +01:00
jenkins-bot
64fd98b19a Merge "LanguageConverterTest: use some data providers"
2021-01-14 15:34:15 +00:00
DannyS712
8dbcddb333 LanguageConverterFactoryTest::codeProvider simplify using a loop
Change-Id: I06adfcd43ef846124742d901e15c98e92bfc9bd1
2021-01-14 08:43:02 +00:00
DannyS712
f3e79c753b LanguageConverterTest: use some data providers
Change-Id: If59a636af0cf410a25cc36956d62d3f5a2deb1df
2021-01-12 12:56:03 +00:00
DannyS712
6a93b0ca93 More misc test cleanup
* parent::setUp() should be first, and ::tearDown()
  should be last
* Move tests that directly extend PHPUnit\Framework\TestCase
  to /unit

Change-Id: I1172855c58f4f52a8f624e6d596ec43beb8c93ff
2020-12-24 00:52:06 +00:00
David Kamholz
9cb5187944 Implement Balinese language converter
This patch implements the BanConverter class for Balinese. Its purpose is to transliterate Balinese in Balinese script to Latin script. Latin to Balinese is not currently supported, because (1) the Latin transliteration is not fully one-to-one, (2) I'm not aware of any users who currently need Latin to Balinese.

The converter supports three distinct Latin transliteration variants: ban-dharma, ban-palmleaf, and ban-puri-kauhan-ubud. All three variants have been requested by different Balinese community members working with Balinese palm-leaf manuscripts. ban-puri-kauhan-ubud is the default, as it is the most familiar to lontar scholars, but Balinese Wikisource users will be able to select their preferred variant via a user script.

Conversion is accomplished via ICU Rule-Based Transliterators, bindings for which are available through the Intl extension.

This patchset adds the abstract class LanguageConverterIcu and has BanConverter inherit from it (makes future ICU-based LanguageConverters easier).

Bug: T263082
Change-Id: Ic3a46a215fbf020a022726e6b130b1d25496e284
2020-12-21 12:45:41 -08:00
jenkins-bot
02342b9065 Merge "Don't access $wgRequest from User"
2020-12-16 05:06:16 +00:00
Tim Starling
6b2a52181f Don't access $wgRequest from User
Some User methods fail if they are called before $wgRequest is
set. But according to the Setup.php comment, it is only set for b/c.
The global request object can be lazy-initialised at any time.

This is sufficient to avoid T263911 (loss/obfuscation of the $wgServer
error message).

In tests, try to keep $wgRequest and RequestContext::$request in sync.
Introduce MediaWikiIntegrationTestCase::setRequest() which sets both at
once, and use that instead of setMwGlobals() or direct assignment.

BlockManagerTest was accidentally exploiting the fact that the global
context request and $wgRequest were separate objects. Making them the
same causes session cookies to appear in the response, breaking the
cookie counts. Use a new response for the test.

Bug: T263911
Bug: T245940
Change-Id: I2be99f7251a837bc6b62be0b152038157dec10f2
2020-12-16 12:21:00 +11:00
DannyS712
9c47a99639 LanguageConverterTest: reduce direct references to $wgUser
Should be a no-op, doesn't actually reduce the places where $wgUser
is set, just reduces the number of hits in codesearch and makes the
future migration of LanguageConverter to not use $wgUser a bit
easier

Bug: T243708
Change-Id: Ieb04b0e760dd37e037a95408ef429ac5c510f1d9
2020-12-14 21:24:17 +00:00
Umherirrender
12e15e410b LanguageClassesTestCase::setup: Validate language code
This trait is used in class LanguageIntegrationTest, resulting in the
language code "integration" being used, which is an invalid language.

Change-Id: I9398015bb0e85eb26116bb8608c8f39216ce8204
2020-12-13 17:19:57 +00:00
C. Scott Ananian
c64e71615e Replace $wgDisable{Lang,Title}Conversion with LanguageConverterFactory methods
Replace direct access to $wgDisableLangConversion with
LanguageConverterFactory::isConversionDisabled(), and replace direct
access to $wgDisableTitleConversion with
LanguageConverterFactory::isTitleConversionDisabled().  However, most
places that check ::isTitleConversionDisabled() actually want
::isLinkConversionDisabled(), so add that too (and deprecate
isTitleConversionDisabled()).

Code search:
https://codesearch.wmcloud.org/search/?q=Disable%28Lang|Title%29Conversion&i=nope&files=&repos=

This change removes a number of spurious dependencies on the global
configuration and reduces code duplication (for example, if the logic
for disabling language conversion were ever to change).

Depends-On: I6fa8230ae97b0e34c381003548e61f9b7387d363
Change-Id: Icc4687638ff1815003dd903854efdbd904854f1e
2020-11-25 12:47:26 -05:00
jenkins-bot
4c9899ea99 Merge "languages: Language::formatNum() should accept any valid number"
2020-11-23 20:54:40 +00:00
C. Scott Ananian
e099c38ef4 languages: Language::formatNum() should accept any valid number
The PHP function is_numeric() returns true for numbers like '123.456'
and even '1.23e45'. However, it returns false for (string)NAN,
(string)INF, and (string)-INF (which are "NAN", "INF" and "-INF"
respectively).  We can return the appropriate unicode characters for
the infinities to localize these/make them universal, and allow a
localization of the "Not a Number" message.

Make the corresponding change to Language::parseFormattedNumber() so
that it remains the inverse operation to ::formatNum().

Accept "NAN"/"INF"/"-INF" only when they stand alone in the string;
in the legacy case where text and numbers are intermingled, split
only on "traditional" numbers; I think we're more likely to find
INF/NAN "innocently" in the middle of text than we are to find it
as a "real" number.

Change-Id: I3ff227a4aac66fc938182dc9fb8a7b743e94faca
2020-11-23 15:20:43 -05:00
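The handling of the special values described in the commit above can be sketched in Python. This is a hedged illustration, not the Language class code: the function names, the default "Not a Number" text, and the choice of U+221E with a U+2212 minus for the infinities are assumptions, and ordinary digit and separator formatting is left out.

```python
INFINITY = "\u221e"  # infinity sign
MINUS = "\u2212"     # Unicode minus sign

def format_num(number, nan_text="Not a Number"):
    """Replace a stand-alone NAN/INF/-INF string; pass anything else through."""
    specials = {"INF": INFINITY, "-INF": MINUS + INFINITY, "NAN": nan_text}
    # dict.get() matches only when the whole string is a special value,
    # so "INF" embedded in other text is left alone.
    return specials.get(number, number)

def parse_formatted_number(text, nan_text="Not a Number"):
    """Inverse of format_num for the special values."""
    specials = {INFINITY: "INF", MINUS + INFINITY: "-INF", nan_text: "NAN"}
    return specials.get(text, text)
```

Matching only whole strings mirrors the rule in the message that the special values are accepted only when they stand alone, and keeping the two tables as mirror images keeps parsing the inverse of formatting.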
jenkins-bot
7f61804bf5 Merge "Use Unicode minus in output of {{formatnum}}"
2020-11-19 23:06:54 +00:00
jenkins-bot
e6a6592ecf Merge "Fix some unit tests accessing MediaWikiServices"
2020-11-17 18:36:37 +00:00
C. Scott Ananian
5553106baf Use Unicode minus in output of {{formatnum}}
Bug: T10327
Change-Id: I4b315d439fef7d7cdf2fc5ae1904e0460a2a60e0
2020-11-16 18:08:31 +00:00
Daimona Eaytoy
95e17ee645 Fix some unit tests accessing MediaWikiServices
These are mostly easy fixes. Tests were fixed when that didn't require
any change to the tested code, and moved to /integration otherwise.

MediaWikiUnitTestCase::setTemporaryHook was removed: the
caller should provide a HookContainer, at which point it would just
become a useless wrapper around HookContainer::register. (We don't
really need it to be temporary, if proper DI is used).
The method was only used in the tests touched by this commit.

Change-Id: I2aba02560c41b77eea9dd4bff0e4d1c4bb0da9a2
2020-11-12 19:13:47 +00:00
C. Scott Ananian
95db8114be language: Don't add formatNum tracking category for #s in exponential notation
NumberFormatter handles exponential notation fine, and is_numeric
recognizes it, but some of our checks on the {{formatnum}} parser
function were a bit too strict.

Bug: T237467
Change-Id: I20c51da1e58bffeefba18237815541c1b6ccb415
2020-11-10 22:22:50 -05:00
C. Scott Ananian
73c29dbe23 language: Honor $wgTranslateNumerals, even if PHP does digit translation
The PHP NumberFormatter class usually does digit translation itself,
which can be a problem if a wiki has explicitly elected *not* to
localize numerals.  Use the 'C' locale to bypass this feature of PHP
NumberFormatter in the case where a wiki has explicitly set
$wgTranslateNumerals to false.

Bug: T267614
Change-Id: I7a21577a7dfb5274a125515068da9e3418f8a472
2020-11-10 13:29:58 -05:00
C. Scott Ananian
4bc5c76129 Hard deprecate Language::commafy; deprecate mediawiki.language.commafy
Language::commafy was deprecated; it is poorly named, and its functionality
is rolled into ::formatNum/::formatNumNoSeparators.

Deprecate mediawiki.language.commafy for similar reasons, and because
it parallels a deprecated core PHP method.  Perhaps it would be
worthwhile to add a new JS method in the future more closely matching
the PHP ::formatNum/::formatNumNoSeparators pair.

Code search: https://codesearch.wmcloud.org/search/?q=commafy%5C%28&i=nope&files=&repos=

Change-Id: Id3fc5dc2c7e62495a532db93d85a6f1cb8e8cbeb
2020-10-28 19:39:17 +00:00
Santhosh Thottingal
ce8d0e9599 Update formatNum implementation to match tr35 and latest CLDR
* Update digitGroupingPattern to match CLDR 31: New versions of CLDR have
  digit grouping patterns with a decimal part. Update digitGroupingPattern
  values in Message classes with this improved pattern.
  Refer: http://unicode.org/reports/tr35/tr35-numbers.html

* Refer to the following chart for the decimal patterns.
  http://www.unicode.org/cldr/charts/31/by_type/numbers.number_formatting_patterns.html

* Uses PHP NumberFormatter class for the commafy implementation, which
  is available in PHP 7.

* Some tests needed updating to match the TR 35 spec

* The formatNum public method in Language.php is the preferred way to
  use this feature. It does separator transformation and digit transformation
  wherever applicable.

* Renamed the second param name for formatNum from noCommafy to noSeparators

* commafy method is deprecated and formatNum is preferred. Practically,
  we are not just adding commas, but separators according to the language.
  Replaced some tests based on commafy methods with tests based on formatNum.

Note: The corresponding js implementation is not changed in this commit.
It would probably be a good idea to use globalize.js, which is also based
on the CLDR patterns.

Note: This patch preserves the existing off-by-one error in
$minimumGroupingDigits; T262500 will eventually fix this.

Bug: T167088
Co-Authored-By: C. Scott Ananian <cscott@cscott.net>
Change-Id: Ic721b9a91e78e4ef07040339d1006b7a90a910c0
2020-10-21 10:08:04 -04:00
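How a CLDR-style digit grouping pattern drives separator placement can be sketched in Python. This is a simplified illustration, not the NumberFormatter-based implementation from the commit above: it handles an unsigned integer digit string only, ignores minimum grouping digits, and the function names are hypothetical. The patterns `#,##0.###` and `#,##,##0.###` are standard CLDR decimal patterns.

```python
def grouping_sizes(pattern):
    """Return (primary, secondary) group sizes read from the integer part."""
    groups = pattern.split(".")[0].split(",")
    if len(groups) == 1:
        return 0, 0  # no separator in the pattern: no grouping
    primary = len(groups[-1])
    # The secondary size is the gap between the last two separators;
    # with a single separator it equals the primary size.
    secondary = len(groups[-2]) if len(groups) > 2 else primary
    return primary, secondary

def group_digits(digits, pattern, separator=","):
    """Insert separators into a string of digits according to the pattern."""
    primary, secondary = grouping_sizes(pattern)
    if primary == 0 or len(digits) <= primary:
        return digits
    head, parts = digits[:-primary], [digits[-primary:]]
    while head:
        parts.insert(0, head[-secondary:])
        head = head[:-secondary]
    return separator.join(parts)
```

With the first pattern `1234567` groups in threes, while the second yields a group of three followed by groups of two, which is why a single hard-coded "every three digits" rule is not enough.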
Amir Aharoni
605ff51bf1 Add accusative case to Russian language GRAMMAR
Bug: T257500
Change-Id: I30a892a936c0ed9247bc6b63be747697cb9f3e26
2020-10-10 18:03:52 +03:00
C. Scott Ananian
b9b8b53682 Language: ensure commafy does not corrupt UTF-8 strings
The commafy method "should" be given valid numeric strings, but this
wasn't enforced, and if it were provided input which started with a UTF-8
multibyte character and then had a single ASCII digit somewhere after
that, it would return the first byte of the input string, resulting in
an invalid UTF-8 sequence.

Fix this bug with belt and suspenders: first, enforce the expected
input structure at the top of the function.  Since there is existing
code which expects us to "do our best" with invalid input, split the
input string into valid numeric chunks before processing it.  This
split code triggers a hard deprecation warning, so we can eventually
remove it.

Second, make the sign test more robust and anchor the $integerPart
regexp to match assumptions made in the algorithm, so that even if
bogus input *did* creep through (a sloppy future maintainer, say) it
wouldn't lead to corrupt UTF-8 in the output.

Add test cases covering these conditions, borrowing liberally from
I741b70757e43b1312c86719920e29885566e916c, which points out that while
commafy expects numeric strings, formatNum replaces – character by
character – digits and separator characters with language specific
ones. Optionally thousand separators are added (a.k.a. "commafy").
Eventually we should tighten the spec for formatNum as well; some of
this has already been done in
I03ffa99f7de1dcc48535ba1e1251567dbf3db116 and
I89b17a9e11b3afc6c653ba7ccc6ff84c37863b66.

Some additional test case fixes borrowed from
If45ef33a50b2623322f17306d123f0d8cb468618 which updated a few test
cases to be more specific, i.e. actually test stuff (for example,
commafy doesn't happen on 3-digit numbers, and numerals are not
translated in English).

Bug: T237467
Depends-On: I89b17a9e11b3afc6c653ba7ccc6ff84c37863b66
Depends-On: I9dcbe91fa926dba1cfd24d9bf075ee1ebef36b9e
Depends-On: I03ffa99f7de1dcc48535ba1e1251567dbf3db116
Change-Id: If3dcfd71acd8ebf3eea6a49408260f2aaa07e469
2020-09-18 08:37:56 +00:00
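The "split the input into valid numeric chunks" approach from the commit above can be sketched in Python. This is an assumption-laden illustration rather than the actual fix: the regular expression, the function names, and the fixed comma separator are all hypothetical, and Python strings are sequences of code points, so the byte-level corruption described in the message cannot be reproduced here directly.

```python
import re

# A "valid numeric chunk" for this sketch: optional sign, digits, optional fraction.
NUMBER = re.compile(r"(-?\d+(?:\.\d+)?)")

def commafy_number(number):
    """Add thousands separators to the integer part of one numeric string."""
    sign = "-" if number.startswith("-") else ""
    integer, dot, fraction = number.lstrip("-").partition(".")
    return sign + f"{int(integer):,}" + dot + fraction

def commafy_text(text):
    """Format only the numeric chunks; copy all other text through unchanged."""
    # With a capturing group, re.split() alternates non-matches and matches,
    # so the odd indices hold the numbers.
    parts = NUMBER.split(text)
    return "".join(
        commafy_number(part) if index % 2 else part
        for index, part in enumerate(parts)
    )
```

Keeping non-numeric text out of the formatting routine entirely is what prevents it from ever slicing into a multibyte character.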
addshore
959bc315f2 MediaWikiTestCase to MediaWikiIntegrationTestCase
The name change happened some time ago, and I think it's
about time to start using the new name!
(Done with a find and replace)

My personal motivation for doing this is that I have started
trying out vscode as an IDE for mediawiki development, and
right now it doesn't appear to handle php aliases very well
or at all.

Change-Id: I412235d91ae26e4c1c6a62e0dbb7e7cf3c5ed4a6
2020-06-30 17:02:22 +01:00
Thiemo Kreuz
5f3a92385b Fix visibility of setUp/tearDown
Change-Id: I636be48eb9f713680abac35d46091f7b49374696
2020-06-16 21:02:05 +02:00
Base
1f44fd171a Adding default locative rule for Ukrainian
Prepending a default preposition to locative GRAMMAR forms
in Ukrainian language

Bug: T149550
Change-Id: I4649549dc3c722e53c7ea3accb6747df420e56f7
2020-06-12 09:49:03 +03:00
DannyS712
25a90e8009 Remove unneeded ::setUp and ::tearDown methods that only call parent
Change-Id: I2a12859cfc56059bb71afa2aac08a8f86e746612
2020-06-07 10:09:32 +00:00
Tim Starling
47a1619027 Remove terminating line breaks from debug messages
A terminating line break has not been required in wfDebug() since 2014,
however no migration was done. Some of these line breaks found their way
into LoggerInterface::debug() calls, where they mess up the formatting
of the debug log.

So, remove terminating line breaks from wfDebug() and
LoggerInterface::debug() calls.

Also:
* Fix the stripping of leading line breaks from the log header emitted
  by Setup.php. This feature, accidentally broken in 2014, allows
  requests to be distinguished in the log file.
* Avoid using the global variable $self.
* Move the logging of the client IP back to Setup.php. It was moved to
  WebRequest in the hopes that it would not always be needed, however
  $wgRequest->getIP() is now called unconditionally a few lines up in
  Setup.php. This means that it is put in its proper place after the
  "start request" message.
* Wrap the log header code in a closure so that variables like $name do
  not leak into global scope.
* In Linker.php, remove a few instances of an unnecessary second
  parameter to wfDebug().

Change-Id: I96651d3044a95b9d210b51cb8368edc76bebbb9e
2020-06-03 12:01:16 +10:00
Daimona Eaytoy
2b37cfaf18 build: Bump mediawiki-codesniffer to 31.0.0
Done with `composer fix` and suppressing the rest (i.e. sniffs for
global variables, which for core should be suppressed anyway).

Additionally, add `-p` to `phpcbf`, as otherwise it just seems stuck.

Change-Id: Ide8d6cdd083655891b6d654e78440fbda81ab2bc
2020-05-30 14:56:28 +00:00
Tim Starling
68c433bd23 Hooks::run() call site migration
Migrate all callers of Hooks::run() to use the new
HookContainer/HookRunner system.

General principles:
* Use DI if it is already used. We're not changing the way state is
  managed in this patch.
* HookContainer is always injected, not HookRunner. HookContainer
  is a service, it's a more generic interface, it is the only
  thing that provides isRegistered() which is needed in some cases,
  and a HookRunner can be efficiently constructed from it
  (confirmed by benchmark). Because HookContainer is needed
  for object construction, it is also needed by all factories.
* "Ask your friendly local base class". Big hierarchies like
  SpecialPage and ApiBase have getHookContainer() and getHookRunner()
  methods in the base class, and classes that extend that base class
  are not expected to know or care where the base class gets its
  HookContainer from.
* ProtectedHookAccessorTrait provides protected getHookContainer() and
  getHookRunner() methods, getting them from the global service
  container. The point of this is to ease migration to DI by ensuring
  that call sites ask their local friendly base class rather than
  getting a HookRunner from the service container directly.
* Private $this->hookRunner. In some smaller classes where accessor
  methods did not seem warranted, there is a private HookRunner property
  which is accessed directly. Very rarely (two cases), there is a
  protected property, for consistency with code that conventionally
  assumes protected=private, but in cases where the class might actually
  be overridden, a protected accessor is preferred over a protected
  property.
* The last resort: Hooks::runner(). Mostly for static, file-scope and
  global code. In a few cases it was used for objects with broken
  construction schemes, out of horror or laziness.

Constructors with new required arguments:
* AuthManager
* BadFileLookup
* BlockManager
* ClassicInterwikiLookup
* ContentHandlerFactory
* ContentSecurityPolicy
* DefaultOptionsManager
* DerivedPageDataUpdater
* FullSearchResultWidget
* HtmlCacheUpdater
* LanguageFactory
* LanguageNameUtils
* LinkRenderer
* LinkRendererFactory
* LocalisationCache
* MagicWordFactory
* MessageCache
* NamespaceInfo
* PageEditStash
* PageHandlerFactory
* PageUpdater
* ParserFactory
* PermissionManager
* RevisionStore
* RevisionStoreFactory
* SearchEngineConfig
* SearchEngineFactory
* SearchFormWidget
* SearchNearMatcher
* SessionBackend
* SpecialPageFactory
* UserNameUtils
* UserOptionsManager
* WatchedItemQueryService
* WatchedItemStore

Constructors with new optional arguments:
* DefaultPreferencesFactory
* Language
* LinkHolderArray
* MovePage
* Parser
* ParserCache
* PasswordReset
* Router

setHookContainer() now required after construction:
* AuthenticationProvider
* ResourceLoaderModule
* SearchEngine

Change-Id: Id442b0dbe43aba84bd5cf801d86dedc768b082c7
2020-05-30 14:23:28 +00:00
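The "inject HookContainer, build HookRunner from it" principle in the commit above can be sketched in Python. This is a toy model, not MediaWiki code: the classes are simplified stand-ins, and the hook name `PageSaved`, the `on_page_saved` method, and the `save` method are hypothetical.

```python
class HookContainer:
    """Generic registry: maps hook names to lists of handlers."""

    def __init__(self):
        self._handlers = {}

    def register(self, name, handler):
        self._handlers.setdefault(name, []).append(handler)

    def is_registered(self, name):
        return bool(self._handlers.get(name))

    def run(self, name, *args):
        for handler in self._handlers.get(name, []):
            handler(*args)


class HookRunner:
    """Thin facade with one named method per hook; cheap to build from a container."""

    def __init__(self, container):
        self._container = container

    def on_page_saved(self, title):  # hypothetical hook
        self._container.run("PageSaved", title)


class PageUpdater:
    """The service receives the container and derives its own runner."""

    def __init__(self, hook_container):
        self._hook_container = hook_container
        self._hook_runner = HookRunner(hook_container)

    def save(self, title):
        self._hook_runner.on_page_saved(title)
```

Injecting the container keeps `is_registered()` available to the service, while call sites still use the runner's named methods instead of passing hook names as strings.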
Reedy
5f1000f7bd Fix languages/ PSR12.Properties.ConstantVisibility.NotFound
Change-Id: I6f88f2eaf2fc69016b99124eeb9f6e2616c148d2
2020-05-16 21:49:02 +01:00
Reedy
12a3883a7b Fix SingleSpaceBeforeSingleLineComment
Change-Id: I285af438ce484af40741489797f20455726ec110
2020-05-11 00:57:11 +00:00
jenkins-bot
3cfaa194ed Merge "Introduce UserOptionsManager and DefaultOptionsManager"
2020-05-01 20:22:56 +00:00