664 commits

Author SHA1 Message Date
Peter Ovchyn
61e0908fa2 languages: Introduce LanguageConverterFactory
Done:
* Replace LanguageConverter::newConverter by LanguageConverterFactory::getLanguageConverter
* Remove LanguageConverter::newConverter from all subclasses
* Add LanguageConverterFactory integration tests which cover all languages by their code.
* Caching of LanguageConverters in factory
* Make all tests run (hopefully that will be enough)
* Uncomment the deprecated functions.
* Rename FakeConverter to TrivialLanguageConverter
* Create ILanguageConverter to have a shared ancestor
* Make the LanguageConverter class abstract.
* Create table with mapping between lang code and converter instead of using name convention
* ILanguageConverter @internal
* Clean up code
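
A rough sketch of the table-based lookup and instance caching listed above, in plain PHP. The names `SketchConverterFactory`, `SketchTrivialConverter` and `SketchSerbianConverter`, and the single table entry, are assumptions made for illustration. This is not the code from this change:

```php
<?php
// Hypothetical stand-ins for converter classes; not MediaWiki's real classes.
interface SketchConverter {
	public function getCode(): string;
}

class SketchTrivialConverter implements SketchConverter {
	private $code;

	public function __construct( string $code ) {
		$this->code = $code;
	}

	public function getCode(): string {
		return $this->code;
	}
}

class SketchSerbianConverter extends SketchTrivialConverter {
}

class SketchConverterFactory {
	/** Explicit table from language code to converter class (assumed entry). */
	private const CONVERTER_CLASSES = [
		'sr' => SketchSerbianConverter::class,
	];

	/** @var SketchConverter[] Instances already created, keyed by language code */
	private $cache = [];

	public function getLanguageConverter( string $code ): SketchConverter {
		if ( !isset( $this->cache[$code] ) ) {
			// Codes without a table entry fall back to the trivial converter.
			$class = self::CONVERTER_CLASSES[$code] ?? SketchTrivialConverter::class;
			$this->cache[$code] = new $class( $code );
		}
		return $this->cache[$code];
	}
}
```

With a table like this, adding a converter for a language means adding one entry rather than relying on a class naming convention, and repeated calls for the same code return the cached instance.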

Change-Id: I0e4d77de0f44e18c19956a1ffd69d30e63cf51bf
Bug: T226833, T243332
2020-02-03 11:38:03 +02:00
Umherirrender
9187f97aef Pass Title to RevisionStore::newRevisionFromRow in MessageCache
The query for this result already contains a join to page table,
reuse that information and avoid a Title::newFromId in the RevisionStore

Change-Id: Ie02c238292778c3048cc950e37f51f04fc238fea
2020-01-17 16:54:30 +00:00
James D. Forrester
0958a0bce4 Coding style: Auto-fix MediaWiki.Usage.IsNull.IsNull
Change-Id: I90cfe8366c0245c9c67e598d17800684897a4e27
2020-01-10 14:17:13 -08:00
James D. Forrester
4f2d1efdda Coding style: Auto-fix MediaWiki.Classes.UnsortedUseStatements.UnsortedUse
Change-Id: I94a0ae83c65e8ee419bbd1ae1e86ab21ed4d8210
2020-01-10 09:32:25 -08:00
Umherirrender
77cc34e734 Allow GenderCache to accept UserIdentity and LinkTarget
Change-Id: I0bb6c13ed2fbc3b247058c3b6e528e2b7015b757
2020-01-08 08:44:23 +00:00
jenkins-bot
b874e6ab98 Merge "Provide a full trace to GlobalTitleFail debug entries" 2020-01-06 19:24:24 +00:00
mainframe98
297a89069a Add LinkBatchFactory to inject services into LinkBatch
All services required by LinkBatch are now injected by the
LinkBatchFactory. The constructor for LinkBatch has been
soft-deprecated, but the required services are still optional.

Bug: T239855
Depends-On: If49cbb730d4ac48586b891908cf24601efbc5d6a
Change-Id: I93d931ab60305ad49a6e419f8269c77791a3938d
2020-01-06 17:02:31 +01:00
Kunal Mehta
99007e96c7 Use namespaced IPUtils class
Change-Id: I047e099a93203a59093946d336a143d899d0271f
2020-01-01 02:36:49 -08:00
Daimona Eaytoy
f16d87a8cc Provide a full trace to GlobalTitleFail debug entries
None of these is currently useful: see for instance [1]. Having a whole
trace can help a lot in finding where the faulty call is. Also avoid the
overhead of calling wfDebugLog.

[1] - https://logstash.wikimedia.org/app/kibana#/doc/logstash-*/logstash-mediawiki-2019.12.19/mediawiki?id=AW8gPL5bKWrIH1QRtQa3&_g=h@1251ff0

Bug: T159284
Change-Id: I25e0f0e2ef4899b4eaa644c74fdeaba21d79aefe
2019-12-20 14:12:59 +01:00
Adam Wight
3746d3dacb Cleanup: Explicit public access
Change-Id: If004457382105dc67b9f5471ddf74261a2374d04
2019-12-13 12:03:05 +01:00
Umherirrender
eb2373dcd1 Set visibility on php magic functions __destruct/sleep/wakeup/get/call
All the magic functions need public visibility to be callable from php
internals like garbage collector

Change-Id: I1baf04bf8ff787da880d46e4a6daa77f5a6de73f
2019-12-05 18:52:55 +01:00
jenkins-bot
2b04ef6657 Merge "Set method visibility for various constructors" 2019-12-05 10:23:34 +00:00
Umherirrender
0688dd7c6d Set method visibility for various constructors
Change-Id: Id3c88257e866923b06e878ccdeddded7f08f2c98
2019-12-03 20:17:30 +01:00
Thiemo Kreuz
78ca9eff4a Remove duplicate variable name from class property PHPDocs
Repeating the variable name doesn't do anything. Documentation
generators don't need it. It's more stuff to read that doesn't add new
information. And it can become outdated.

Note there are two types of @var docs. When used inline (and not on a
class property) the variable name is needed.

Change-Id: If5a520405efacd8cefd90b878c999b842b91ac61
2019-12-02 12:58:29 +00:00
jenkins-bot
af4b49e67c Merge "Improve param docs" 2019-11-28 19:36:24 +00:00
Umherirrender
c7ad21c25f Improve param docs
Change-Id: I746a69f6ed01c3ff000da125457df62b02d13b34
2019-11-28 19:08:59 +01:00
jenkins-bot
5e9c40b66d Merge "Remove IE 6 security features from server-side code" 2019-11-28 04:43:35 +00:00
Tim Starling
164a3ac1f0 Remove IE 6 security features from server-side code
* Deprecate WebRequest::checkUrlExtension() and have it always return
  true. This reverts the security fixes made for T30235.
* Remove IEUrlExtension. This is a helper for checkUrlExtension() which
  is not used in any extensions.
* Remove CSS sanitization code which is specific to IE6. This reverts
  the changes made to fix T57332, and related followups. I confirmed
  that the relevant test cases do not result in XSS on IE8.
* Remove related tests.

Bug: T232563
Change-Id: I7318ea4a63210252ebc64968691d4f62d79a63e9
2019-11-28 15:11:56 +11:00
daniel
e98094956a Don't fail hard on bad titles in the database.
This updates some code that has been constructing TitleValue directly
to use TitleValue::tryNew or TitleParser::makeTitleValueSafe.
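
The general "try" pattern can be sketched as follows. `SketchTitle` and its validation rule (non-empty, no spaces) are assumptions made up for this example. It does not reproduce TitleValue or TitleParser:

```php
<?php
// Hypothetical value object; it only mirrors the "try" pattern, not TitleValue itself.
class SketchTitle {
	private $namespace;
	private $dbKey;

	public function __construct( int $namespace, string $dbKey ) {
		// Strict path: bad input raises an exception (assumed validation rule).
		if ( $dbKey === '' || strpos( $dbKey, ' ' ) !== false ) {
			throw new InvalidArgumentException( "Bad title: '$dbKey'" );
		}
		$this->namespace = $namespace;
		$this->dbKey = $dbKey;
	}

	/** Lenient path: returns null for bad input instead of throwing. */
	public static function tryNew( int $namespace, string $dbKey ): ?self {
		try {
			return new self( $namespace, $dbKey );
		} catch ( InvalidArgumentException $e ) {
			return null;
		}
	}

	public function getDbKey(): string {
		return $this->dbKey;
	}
}

// Rows as they might come back from a database query (made-up data).
$rows = [ [ 0, 'Main_Page' ], [ 0, '' ], [ 4, 'Bad title' ] ];
$titles = [];
foreach ( $rows as [ $ns, $key ] ) {
	$title = SketchTitle::tryNew( $ns, $key );
	if ( $title !== null ) {
		// Bad rows are skipped rather than aborting the whole request.
		$titles[] = $title->getDbKey();
	}
}
```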

Bug: T200055
Change-Id: If781fe62213413c8fb847fd9e90f079e2f9ffc9d
2019-11-25 22:15:38 +01:00
Brad Jorsch
aa0720d37c ParamValidator: Use MessageValue!
Trying to get away with returning a single code and parameter-list that
was supposed to represent both human-readable and machine-readable data
was a mistake.

This patch converts it to use DataMessageValue, which represents the two
separately and also provides guidance for supplying translations of all
the error codes.

This also eliminates the "describeSettings()" method that was trying to
serve multiple use cases (in terms of the Action API, action=paraminfo
and action=help). It's replaced by two methods that each serve one of
the use cases. Also some of the functionality was moved out of the
TypeDef base class into ParamValidator, to better match where the
constants themselves live.

Also I wound up creating a NumericDef base class so FloatDef can share
the same range-checking logic that IntegerDef has. I probably should
have done that as a separate patch, but untangling it now would be too
much work.
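
As an illustration of sharing range checks through a base class, here is a minimal sketch. The `Sketch*` classes, their method names and their exceptions are hypothetical and much simpler than the real ParamValidator type definitions:

```php
<?php
// Hypothetical classes; they only illustrate sharing range checks via a base class.
abstract class SketchNumericDef {
	/** Convert the raw string to a number, or return null if it is malformed. */
	abstract protected function parse( string $raw );

	public function validate( string $raw, $min = null, $max = null ) {
		$value = $this->parse( $raw );
		if ( $value === null ) {
			throw new InvalidArgumentException( "Not a valid number: '$raw'" );
		}
		// Range checking lives here once, for every numeric type.
		if ( $min !== null && $value < $min ) {
			throw new RangeException( "$value is below the minimum $min" );
		}
		if ( $max !== null && $value > $max ) {
			throw new RangeException( "$value is above the maximum $max" );
		}
		return $value;
	}
}

class SketchIntegerDef extends SketchNumericDef {
	protected function parse( string $raw ) {
		return preg_match( '/^[+-]?\d+$/', $raw ) ? (int)$raw : null;
	}
}

class SketchFloatDef extends SketchNumericDef {
	protected function parse( string $raw ) {
		return is_numeric( $raw ) ? (float)$raw : null;
	}
}
```

Each subclass only decides how to parse its input, so the integer and float types cannot drift apart in how they enforce minimum and maximum values.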

Bug: T235801
Change-Id: Iea6d4a1d05bb4b92d60415b0f03ff9d3dc99a80b
2019-11-01 15:49:31 -04:00
Aryeh Gregor
537bdc2d7d Deprecate Language::getMessage*For()
These call the deprecated Language::getLocalisationCache() method and
thereby indirectly access MediaWikiServices. Callers should be changed
to inject a LocalisationCache and use that directly.

There are only a few callers in code search, so it didn't seem worth
adding convenience methods to LocalisationCache. The one caller that was
already using DI was MessageCache, and I injected LocalisationCache
there.

Bug: T201405
Change-Id: I01919fba5685fc5e0a31f739714f125a22de8939
2019-10-29 11:52:07 +02:00
Aryeh Gregor
0de9c47b50 Remove Language::factory and getParentLanguage use
Change-Id: I11f8801ef47ec1a1f63d840116e69667e6f3ae3c
2019-10-27 12:34:28 +02:00
Amir Aharoni
64e2d73f5c Split rest messages from the main en.json
Bug: T233192
Change-Id: I3990ae4e34a51e7648f74a05a4b7ac744fa9b9c4
2019-10-22 03:07:42 +00:00
Timo Tijhof
156e0aed63 localisation: Convert MessageCache to PSR-3 logging
Change-Id: I9eaf8e419cf2895733fce1bff83aa81a3d21c39c
2019-10-12 17:38:59 +01:00
Daimona Eaytoy
19cd15f7cd Fix some phan warnings for too many params (part 1)
Bug: T231636
Change-Id: Ib0ca6bf2c426c21c4d42944c53a219e5940a5f11
2019-10-10 04:44:53 +00:00
jenkins-bot
52b44696ba Merge "Split some Language methods to LanguageNameUtils" 2019-10-08 21:10:07 +00:00
James D. Forrester
ebac0247cf Services: Convert LocalisationCache's static to a const now HHVM is gone
Change-Id: If5c015debed7efc034613b976bc5292ac30036d7
2019-10-08 11:25:30 -07:00
Aryeh Gregor
6d80b6c082 Split some Language methods to LanguageNameUtils
These are static methods that have to do with processing language names
and codes. I didn't include fallback behavior, because that would mean a
circular dependency with LocalisationCache.

In the new class, I renamed AS_AUTONYMS to AUTONYMS, and added a class
constant DEFINED for 'mw' to match the existing SUPPORTED and ALL. I
also renamed fetchLanguageName(s) to getLanguageName(s).

There is 100% test coverage for the code in the new class.

This was previously committed as 2e52f48c2e and reverted because it
depended on e4468a1d6b, which had to be reverted for performance
issues. There should be no changes other than rebasing.

Bug: T201405
Change-Id: Ifa346c8a92bf1eb57dc5e79458b32b7b26f1ee8a
2019-10-07 15:20:52 -07:00
Aryeh Gregor
043d88f680 Make LocalisationCache a service
This removes Language::$dataCache without deprecation, because 1) I
don't know of a way to properly simulate it in the new paradigm, and 2)
I found no direct access to the member outside of the Language and
LanguageTest classes.

An earlier version of this patch (e4468a1d6b) had to be reverted
because of a massive slowdown on test runs. Based on some local testing,
this should fix the problem. Running all tests in languages is slowed
down by only around 20% instead of a factor of five, and memory usage is
actually reduced greatly (~350 MB -> ~200 MB). The slowdown is still not
great, but I assume it's par for the course for converting things to
services and is acceptable. If not, I can try to optimize further.

Bug: T231220
Bug: T231198
Bug: T231200
Bug: T201405
Change-Id: Ieadbd820379a006d8ad2d2e4a1e96241e172ec5a
2019-10-07 13:18:47 -07:00
Timo Tijhof
67f3df57f9 MessageCache: Replace internal loadedLanguages array with special cache key
Before c962b48056, the 'loadedLanguages' array was used to track
which languages were loaded and in the cache, with 'cache' being a
simple array. In that commit, the 'cache' array also started being used
for incomplete datasets, which didn't affect 'loadedLanguages'.

Then in 97e86d934b, the 'loadedLanguages' array was removed in favour
of checking keys on 'cache' directly, and 'cache' was converted to
MapCacheLRU.

This led to a problem where partially loaded data was mistaken for being
full datasets (fatal error, T208897). This was fixed in a5c984cc59,
by bringing back the 'loadedLanguages' array, which fixed the issue from
the POV of partially loaded data.

However, this then exposed a new problem. The 'cache' data can be evicted
by MapCacheLRU, whereas 'loadedLanguages' is not aware of that. Thus it
claims languages are loaded that sometimes aren't. (This only affects web
requests where more than 5 language codes are involved, per MapCacheLRU.)

Fix this by re-removing the 'loadedLanguages' array, this time
strengthening the 'cache' key check to not just check that the root key
exists, but that it is in fact holding the full dataset as generated by
MessageCache::load(). The 'VERSION' key appears to be a good proxy for
that.
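
The idea can be sketched roughly as below. `SketchLru` is a toy stand-in for MapCacheLRU, and `isLanguageLoaded()` and the sample data are assumptions for illustration rather than the actual MessageCache code:

```php
<?php
// Tiny LRU map used only for this sketch; it is not MapCacheLRU.
class SketchLru {
	private $items = [];
	private $max;

	public function __construct( int $max ) {
		$this->max = $max;
	}

	public function set( string $key, array $value ): void {
		unset( $this->items[$key] );
		$this->items[$key] = $value;
		if ( count( $this->items ) > $this->max ) {
			// Evict the least recently used entry (first in insertion order).
			reset( $this->items );
			unset( $this->items[ key( $this->items ) ] );
		}
	}

	public function get( string $key ): ?array {
		if ( !array_key_exists( $key, $this->items ) ) {
			return null;
		}
		// Re-insert to mark the entry as most recently used.
		$value = $this->items[$key];
		unset( $this->items[$key] );
		$this->items[$key] = $value;
		return $value;
	}
}

function isLanguageLoaded( SketchLru $cache, string $code ): bool {
	$data = $cache->get( $code );
	// An evicted entry is null; a partial dataset has no VERSION key.
	return $data !== null && isset( $data['VERSION'] );
}

$cache = new SketchLru( 2 );
$cache->set( 'en', [ 'VERSION' => 1, 'mainpage' => 'Main Page' ] ); // full dataset
$cache->set( 'de', [ 'mainpage' => 'Hauptseite' ] ); // partial dataset
```

Because the check reads the cache itself, an entry that has been evicted or only partially filled reports as not loaded, and there is no separate list that can go stale.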

Bug: T230690
Change-Id: I1162a3857376aa37e5894ae3c8be84a2295782a3
2019-10-02 22:47:00 +00:00
Timo Tijhof
7d82ce8bfd localisation: Remove PHP5-specific perf optimisation
The `apc.cache_by_default` setting is a PHP5-era setting relating
to the part of php5-apc now known as opcache (as opposed to the
part now known as apcu).

This setting doesn't exist in PHP 7, and trying to set it doesn't
do anything useful.

Bug: T206986
Change-Id: I46a91897b2b33b5ce6505beb74d404982cb0641c
2019-09-21 02:31:04 +01:00
jenkins-bot
f6059f9fab Merge "Cleanup and document some LCStoreDB fields" 2019-09-10 03:32:45 +00:00
Aaron Schulz
6a68b89a57 Cleanup and document some LCStoreDB fields
Change-Id: I1edcfbaa0889a84803a9d66d2bc6962664867650
2019-09-09 17:09:55 -07:00
Aaron Schulz
a5c7fd0db2 Move callers away from Title::GAID_FOR_UPDATE
These callers just need to load some data from DB_MASTER.
Subsequent code needing that latest title data should also use the
required flags, rather than relying on flakey global cache state.

Change-Id: I53248ea4b5bf1cd953f956c41b8244831ec5ef04
2019-09-09 13:19:08 -07:00
Brad Jorsch
c29909e59f Mostly drop old pre-actor user schemas
This removes most of the pre-actor user and user_text columns, and the
$wgActorTableSchemaMigrationStage setting that used to determine
whether the columns were used.

rev_user and rev_user_text remain in the code, as on Wikimedia wikis the
revision table is too large to alter at this time. A future change will
combine that with the removal of rev_comment, rev_content_model, and
rev_content_format (and the addition of rev_comment_id and rev_actor).

ActorMigration's constructor continues to take a $stage parameter, and
continues to have the logic for handling it, for the benefit of
extensions that might need their own migration process. Code using
ActorMigration for accessing the core fields should be updated to use
the new actor fields directly. That will be done in a followup.

Bug: T188327
Change-Id: Id35544b879af1cd708f3efd303fce8d9a1b9eb02
2019-09-09 11:38:36 -04:00
jenkins-bot
b328ae4a4e Merge "Setup: Move MWDebug logic to MWDebug.php" 2019-09-05 16:58:08 +00:00
Timo Tijhof
55db848b77 localisation: Release data from memory in LCStoreStaticArray::finishWrite
With this change, the memory behaviour of LCStoreStaticArray
matches the other LCStore implementations. Specifically, that when
mass-rebuilding LocalisationCache entries for all language codes,
the computed data should be released from memory after
calling LCStore::finishWrite().

This doesn't affect user-facing web requests. Even in the case
of stock MW, where every once in a while a user request can
lazy-regenerate the LCStore, there is a process-cache in front of
LCStore in the LocalisationCache class.

The rebuildLocalisationCache.php script clears that via
LocalisationCacheBulkLoad::unload(), but due to LCStoreStaticArray
internally holding on to the data, it was still leaking.

The leak was found by @Nikerabbit as part of testing for T218207.

To test this, amend rebuildLocalisationCache.php and add the
following on line 161, as the first line of the doRebuild/foreach/if
block:

  echo "[$code-start-mem] " . round(memory_get_usage(true)/1024/1024, 2) . " MB\n";

If you then have LocalSettings.php configured like so:

  $wgCacheDirectory = $wgTmpDirectory;
  $wgLocalisationCacheConf['store'] = 'array';

Then before this patch, running rebuildLocalisationCache.php
shows memory starting at 12 MB and growing 2-3 MB for every language
until the very end, closing with 970 MB memory use.

After this patch, it starts at 12 MB and stops growing at 32 MB.

When configuring as `['store'] = 'files'`, which uses LCStoreCDB,
the memory starts at 12 MB and stops growing at 44 MB, both before
and after this patch.
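
A minimal sketch of the release-on-finish behaviour described above. `SketchArrayStore` and its `written` bookkeeping are hypothetical, and whether the real fix uses `unset()` in exactly this way is an assumption:

```php
<?php
// Hypothetical store that buffers the data of the language being written.
class SketchArrayStore {
	private $currentLang = null;
	private $data = [];
	/** @var int[] Number of keys "written" per language, standing in for a file on disk */
	public $written = [];

	public function startWrite( string $code ): void {
		$this->currentLang = $code;
		$this->data[$code] = [];
	}

	public function set( string $key, $value ): void {
		$this->data[$this->currentLang][$key] = $value;
	}

	public function finishWrite(): void {
		// Stand-in for writing the computed array to a file.
		$this->written[$this->currentLang] = count( $this->data[$this->currentLang] );
		// Release the buffered data so memory does not grow with every language.
		unset( $this->data[$this->currentLang] );
		$this->currentLang = null;
	}

	public function bufferedLanguages(): int {
		return count( $this->data );
	}
}

$store = new SketchArrayStore();
foreach ( [ 'en', 'de', 'fr' ] as $code ) {
	$store->startWrite( $code );
	$store->set( 'mainpage', "Main page ($code)" );
	$store->finishWrite();
}
```

After the loop nothing remains buffered, which is the behaviour the memory figures above are measuring.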

Bug: T218207
Change-Id: I0d215efee5b31766776a068b16811d52f9879312
2019-09-04 21:35:10 +01:00
Timo Tijhof
d18e76dbef Setup: Move MWDebug logic to MWDebug.php
* Remove checks in HTMLFileCache.php and Article.php.

  These haven't been needed since the same check was added to Setup.php,
  many years ago. When FileCache is enabled, the Setup.php code disables
  MWDebug. There is no reason for FileCache to then also disable itself
  based on unused config. That means both of them lose.
  We now handle this logic in one place: MWDebug::setup().

* In rebuildFileCache.php, turn it off explicitly, just in case.
  The previous code there didn't work because finalSetup()
  is called after doMaintenance.php includes Setup.php, which
  is what checked this config var to decide on MWDebug::init.
  On the other hand, it's also always off in CLI mode.
  But, let's not depend on that, maybe we decide to enable it on
  CLI one day! Just keep it off explicitly here.

Bug: T189966
Change-Id: I45a8f77092249751dc6f276aa5bb67ebf5b4f64c
2019-09-04 16:33:25 +00:00
Daimona Eaytoy
e70b5b3309 Unsuppress other phan issues (part 4)
Bug: T231636
Depends-On: I58e67c2b38389df874438deada4239510d21654f
Change-Id: I6e5fba7bd273219b1206559420b5bdb78734aa84
2019-08-31 17:13:39 +00:00
Daimona Eaytoy
5eac6d131c Unsuppress more phan issues (part 3)
Bug: T231636
Depends-On: I78354bf5f0c831108c8f606e50c87cf6bc00d8bd
Change-Id: I58e67c2b38389df874438deada4239510d21654f
2019-08-31 16:38:55 +00:00
Daimona Eaytoy
fb3428eb8f Unsuppress other phan issues with low count
And also update approximated counts, which for the most part are lower
than reported (hooray!)

Bug: T231636
Depends-On: Ica50297ec7c71a81ba2204f9763499da925067bd
Change-Id: I78354bf5f0c831108c8f606e50c87cf6bc00d8bd
2019-08-30 09:42:15 +00:00
daniel
b860ef0d13 Avoid fatal errors when reporting exceptions.
When reporting exceptions that occur during initialization, wgUser may
be null. Don't die when that happens.

Change-Id: I65d5a17d80f9021e28a218c7a5a17e399bc7ce98
2019-08-29 13:07:46 +02:00
jenkins-bot
bf7284d975 Merge "MessageCache: Add STRAIGHT_JOIN to avoid planner oddness" 2019-08-28 04:29:31 +00:00
Timo Tijhof
1d7f793108 MessageCache: Increase APC 'messages-big' expiry from 1min to 1h
Bug: T218207
Change-Id: Ic5d2a556912e2a16ee899eec3a0670f00dec9a8c
2019-08-27 22:58:59 +00:00
jenkins-bot
da5cb17341 Merge "MessageCache: Remove $wgMsgCacheExpiry configuration var" 2019-08-27 18:33:05 +00:00
jenkins-bot
7e675fbb16 Merge "MessageCache: Minor wgMsgCacheExpiry doc fix, and clear constant access" 2019-08-27 18:31:18 +00:00
Timo Tijhof
178d312eb8 MessageCache: Remove $wgMsgCacheExpiry configuration var
This variable has never been set to anything other than the default value of
24 hours as introduced in 2003 (r2203, r2204; or 036ff960ce, edf6b38626).

The variable has never changed in core, it's not overridden at WMF,
and MessageCache is not constructed anywhere other than ServiceWiring.php
in any repo on Wikimedia Gerrit, any repo indexed by MediaWiki Codesearch,
or any GitHub-hosted repository (incl. Wikia repos and WikiHow mirrors).

I've also checked all GitHub-hosted repos for boilerplates and/or public
settings files from devs or prod, and couldn't find any example of
this being overridden (after filtering out copies of the core files
themselves). Rather than having to support potentially hard-to-predict
interactions between caching layers by checking its state, make it
a constant so we can reason about the code more easily.

Change-Id: Ie2e139001aae3ac54b509d94a3d917bb408eaca0
2019-08-27 17:33:11 +00:00
Timo Tijhof
f084d0f194 MessageCache: Minor wgMsgCacheExpiry doc fix, and clear constant access
The class used is typed against BagOStuff so access the constant
from there instead.

Bug: T218207
Change-Id: Ie22d6aa5877fb5e8e2ae0b3be87f4b28f45ad763
2019-08-27 16:23:44 +00:00
Brad Jorsch
9e871e05b7 MessageCache: Add STRAIGHT_JOIN to avoid planner oddness
For some unknown reason, when the `actor` table has few enough NS8 rows
compared to `page`, MariaDB 10.1.37 decides it makes more sense to fetch
everything from `actor` then join `revision` then `page` rather than
fetching the rows from `page` in the first place.

We can work around it by telling it to not reorder the query, but then
we also have to reorder it ourselves to put `page` first instead of
`revision`.

Bug: T231196
Change-Id: I2b2fb209e648d1e407c5c2d32d3ac9e574e361d5
2019-08-26 15:12:30 -04:00
Amir Sarabadani
308e6427ae Revert "Make LocalisationCache a service"
This reverts commits:
 - 76a940350d
 - b78b8804d0
 - 2e52f48c2e
 - e4468a1d6b

Bug: T231200
Bug: T231198
Change-Id: I1a7e46a979ae5c9c8130dd3927f6663a216ba753
2019-08-26 18:28:26 +02:00