333 commits

Author SHA1 Message Date
Aaron Schulz
be8eabd684 externalstore: cleanup ExternalStoreDB::getTable() and fix callers
Remove unused $db parameter and make the $cluster required. Update
the checkStorage.php callers to provide the cluster.

Use the getTable() method in trackBlobs.php to avoid duplication.

Mark this method as @internal to the class and /storage scripts.

Change-Id: I888040a536a60e22e780900a59e4c34b6c468cdf
2024-09-18 15:21:35 -07:00
Umherirrender
c8ec25a961 maintenance: Add missing documentation to class properties
Add doc-typehints to class properties found by the PropertyDocumentation
sniff to improve the documentation.

Once the sniff is enabled, it prevents new code from missing type
declarations. This is focused on documentation and does not change code.

Change-Id: I7dec01892a987a87b1b79374a1c28f97d055e8fa
2024-09-13 19:29:24 +02:00
Dreamy Jazz
e7393b3cc7 Exclude boilerplate maintenance code from code coverage reports
Why:
* Maintenance scripts in core have boilerplate code that is
  added before and after the class to allow directly running
  the maintenance script.
* Running the maintenance script directly has been deprecated
  since 1.40, so this boilerplate code is only to support a now
  deprecated method of running maintenance scripts.
* This code also cannot be marked as covered, because PHPUnit
  does not recognise code coverage for files.
* Therefore, it is best to ignore this boilerplate code in code
  coverage reports as it cannot be marked as covered and also
  is for deprecated code.

What:
* Wrap the boilerplate code (requiring Maintenance.php and then
  later defining the maintenance script class and running if the
  maintenance script was called directly) with @codeCoverageIgnore
  comments.
* Some files use different boilerplate code; these should
  also be marked as ignored for coverage, for the same reason:
  coverage is not properly reported for files.

Bug: T371167
Change-Id: I32f5c6362dfb354149a48ce9c28da9a7fc494f7c
2024-08-27 13:22:29 +01:00
thiemowmde
dca4931b42 Make use of the ??= and ?? operators where it makes sense
This touches various production classes and maintenance scripts.
The code should do the exact same as before. The main benefit is that
the syntax avoids any repetition.

Change-Id: I5c552125469f4d7fb5b0fe494d198951b05eb35f
2024-08-26 09:26:36 +02:00
Aaron Schulz
70fd84d8cb maintenance: remove Database::clearFlag() call in recompressTracked.php
Maintenance scripts do not use DBO_TRX mode so this was redundant.

Bug: T311090
Change-Id: I6dc6d2a8e7daf8a2a1dac7c3f8968b4a575b3dd6
2024-08-08 02:29:45 +00:00
Umherirrender
81c6df6a46 maintenance: Use expression builder instead of raw sql
Bug: T361023
Change-Id: Ieb229d8088cb1ff3f03e44f7ac99eb612f48bc7b
2024-07-22 22:29:20 +02:00
Umherirrender
fc9e42823b rdbms: Create IReadableDatabase::andExpr() / ::orExpr()
Avoid calling the internal constructors of AndExpressionGroup and
OrExpressionGroup by creating factory functions similar to the
IReadableDatabase::expr function for Expression objects.

This is also a replacement for calls to ISQLPlatform::makeList with
LIST_AND or LIST_OR argument to reduce passing sql as string to the
query builders.

Created two functions so that the return type can be set for each
expression group, which allows further calls of ->and() or ->or() on
the returned object.
Depending on the length of the array argument to makeList(), it is
sometimes hard to see whether the list gets converted to AND or OR.
Having the operator in the function name makes it easier to read, so
two functions are helpful in this case as well.

Bug: T358961
Change-Id: Ica29689cbd0b111b099bb09b20845f85ae4c3376
2024-07-11 15:29:20 +00:00
Umherirrender
9879723ef3 Use namespaced classes (1)
Changes to the use statements done automatically via script
Addition of missing use statement done manually

Change-Id: Ic4d4dd61de5ab896fb6173eb579c81f164a1e4a3
2024-06-16 20:18:23 +02:00
Amir Sarabadani
7f0458b472 rdbms: Remove IReadableDatabase::getReplicaPos()
Completely unused.

Bug: T363839
Change-Id: I041ab5ce57ef116076dcc07b2035b5336ceff032
2024-04-30 18:30:56 +02:00
Amir Sarabadani
8e183495e1 Stop using LoadBalancer::getConnectionRef() so it can be hard-deprecated
Bug: T326274
Change-Id: I90493d7cd4c21fdc022bcc19765fc04d986a9c8f
2024-04-30 13:31:08 +01:00
jenkins-bot
2d116a3355 Merge "Use expression builder to avoid raw sql via BETWEEN operator"
2024-04-23 07:22:55 +00:00
Taavi Väänänen
d66bd146e8 maintenance: storage: Fix multiple property declaration phpcs errors
Change-Id: I74302c9b2818e02d7d5e67a728558ced5e8f0181
2024-04-21 23:06:00 +03:00
Umherirrender
fea5c2f687 Use expression builder to avoid raw sql via BETWEEN operator
Replace BETWEEN with >= and <= operator

Change-Id: Ic21b6f4cc11c773c967d9d4c5f20e762c2ff9629
2024-04-21 14:24:21 +02:00
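
As a rough illustration of the rewrite described in fea5c2f687 (this is not the MediaWiki code), the sketch below uses Python's built-in sqlite3 module to check that an inclusive BETWEEN matches the same rows as a pair of >= and <= comparisons. The table `demo`, its `id` column, and all function names are assumptions made up for the example.

```python
import sqlite3


def make_demo_db():
    """Create an in-memory database with ids 1..10 (hypothetical data)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE demo (id INTEGER PRIMARY KEY)")
    conn.executemany("INSERT INTO demo (id) VALUES (?)", [(i,) for i in range(1, 11)])
    return conn


def ids_between(conn, low, high):
    """Select ids with BETWEEN, which is inclusive on both ends."""
    rows = conn.execute(
        "SELECT id FROM demo WHERE id BETWEEN ? AND ? ORDER BY id", (low, high)
    )
    return [row[0] for row in rows]


def ids_ge_le(conn, low, high):
    """Select the same range with explicit >= and <= comparisons."""
    rows = conn.execute(
        "SELECT id FROM demo WHERE id >= ? AND id <= ? ORDER BY id", (low, high)
    )
    return [row[0] for row in rows]
```

Both helpers should return the same list for any bounds, which is why the two forms can be swapped without changing query results.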
Umherirrender
8018e157e8 maintenance: Migrate to IDatabase::newUpdateQueryBuilder
Bug: T353219
Change-Id: Ic278c8534dad40a3f34674db2d5fbfbca5984da8
2024-04-14 18:47:55 +00:00
James D. Forrester
060a1b1668 Replace last remaining wfGetDB() calls in core, except ResourceLoader
Bug: T330641
Change-Id: I6d30af6ff9f667e367d39befb80c2bb0bf5fb29e
2024-02-14 11:02:01 -05:00
Bartosz Dziewoński
166748e3ac maintenance: Replace unnecessary uses of LBFactory and LoadBalancer
* Change `$services->getDBLoadBalancerFactory()->waitForReplication()`
  to `$this->waitForReplication()`
* Change various complicated expressions to `$this->getReplicaDB()`
  and `$this->getPrimaryDB()`
* Remove unused variables

Change-Id: Ia857be54938a32bb6288dcdf695a35cd38761c3c
2024-01-23 16:48:36 +00:00
Amir Sarabadani
d9370003fb maintenance: Introduce getReplicaDB() and getPrimaryDB()
And start using them instead of wfGetDB(), LB/LBF connection methods or
worse, $this->getDB().

$this->getDB() reuses the database object regardless of whether you're
asking for a replica or a primary, so it can return a replica when a
primary was requested, and the other way around.

Bug: T330641
Change-Id: I9e2cf85ca277022284fc26b9f37db57bd12aaa81
2024-01-18 15:12:04 +01:00
Amir Sarabadani
69cabb628c maintenance: Migrate to expression builders
This was somehow left out

Bug: T210206
Change-Id: I70851b5b99fa865dbfd629caf2c1866c85418350
2024-01-17 20:27:08 +01:00
Amir Sarabadani
72a7b74ea9 Migrate remaining Database::insert calls to InsertQueryBuilder
Tests are not checked.

There is nothing left as far as I can check.

Bug: T353219
Change-Id: I1d58397118c7ab1110b9d7cf400c59c4bff7378c
2023-12-22 14:53:17 +01:00
Tim Starling
9c02258a04 Use thousands separators in selected integer literals
For readability. Allowed since PHP 7.4.

I searched for integer literals of 6 or more digits, and also changed
some nearby smaller numbers for consistency.

Bug: T353205
Change-Id: I8518e04889ba8fd52e0f9476a74f8e3e1454b678
2023-12-12 09:22:45 +11:00
Bartosz Dziewoński
64001f0ecd WikiImporter: Pass Authority for permissions instead of global context
Pass Authority to WikiImporter constructor, instead of looking at the
user from RequestContext::getMain(), and skipping this check if
$wgCommandLineMode is true.

Maintenance scripts now use UltimateAuthority, to match the original
intent of skipping permission checks, see 2ed55f42 / r96311.

The Authority parameter to WikiImporterFactory::getWikiImporter() is
optional for now for backwards-compatibility. It should become
required later after deprecation.

Change-Id: Iea1d03dcdcbda2f9a9adbff1b0d319efd22c4d86
2023-12-11 19:15:11 +01:00
Amir Sarabadani
ad118dbb75 maintenance: Migrate $db->buildLike() to expression builder
Bug: T210206
Change-Id: Ie7bf3701fa9d51a43167ce7ec0c1f30bc090296b
2023-11-06 14:27:03 +01:00
James D. Forrester
fcf2dd1a98 maintenance: Return false rather than silently continue on corrupted legacy blobs
Bug: T340174
Change-Id: I5d6385b5c924985f47e199dee3ecef13905d6388
2023-10-13 19:11:37 -04:00
Reedy
b98f33cdac Convert numerous DB queries to use QueryBuilders
Bug: T344971
Change-Id: Ia727b513a6bfcaa5a0b13977a6789aa879ad2f0b
2023-10-09 19:06:53 +02:00
Amir Sarabadani
eaedb7da16 maintenance: Migrate another batch to SelectQueryBuilder
Around fifty-ish. Found because of the fixed MigrateSelect.

Bug: T344971
Change-Id: If85428d5a033822bfd8ee1f6ab730863bfad55bd
2023-09-21 14:15:42 +02:00
Amir Sarabadani
f90d18fe80 maintenance: Migrate some Database::select() calls to SQB
Done semi-automatically via migrateselect[1]. The script only accepted
ASCII characters until I found out and fixed it, so now I can run it in
more places.

[1] https://gitlab.wikimedia.org/ladsgroup/migrateselect

Bug: T344971
Change-Id: I83b6c424c62a517a0ab3635b64488ea53fd88bab
2023-09-15 18:18:15 +02:00
Amir Sarabadani
049b34b41c Introduce RevisionSelectQueryBuilder
Deprecating RevisionStore::getQueryInfo() and cleaning up a lot of code

Also removing a brittle test that wasn't really testing anything.

Bug: T344971
Change-Id: Ifd690dc8f030f86e3567a717eaeb830cb6dc703b
2023-09-06 12:30:38 +02:00
Derick Alangi
74033c50cd maintenance: Begin using Maintenance::getServiceContainer()
Maintenance class provides a method for getting a fresh reference
of the MW services container instance. Let's make use of these in
maintenance scripts now that we have it.

NOTE: There are still some static methods, like those in refreshLinks.php,
that make use of services and for which we can't use this method for now.

Change-Id: Idba744057577896fc97c9ecf4724db27542bf01c
2023-09-04 10:39:58 +00:00
Func
58a42dc24e compressOld: Do not assume the latest revision has the greatest ID
Previous revisions can have bigger revision IDs than the last revision
in a few situations, including imported or history-merged revisions.

Change-Id: I4e23d10b0763de4b016460e789baec4f560f1674
2023-08-05 13:44:07 +08:00
jenkins-bot
dc3578a910 Merge "Migrate more calls of Database::select* to SelectQueryBuilder"
2023-07-26 11:21:08 +00:00
Amir Sarabadani
7432b21816 Migrate more calls of Database::select* to SelectQueryBuilder
Using a php parser written on top of ANTLR4, done semi-automatically.

I checked everything and made adjustments.

Bug: T311866
Change-Id: I6150c6909bce8f3dbd745a26380cc0af9d9c547f
2023-07-26 13:01:28 +02:00
Umherirrender
6e0065ad20 Simplify WHERE conditions with field IS NULL
Reduce raw sql fragments on simple compares

Change-Id: I3f2340dfdbf5197cc22546911e6c5653dc5a6269
2023-07-24 19:22:36 +02:00
Amir Sarabadani
310333906a maintenance: Switch simple calls of Database::select to SQB
Done semi-automatically via a php parser written on top of ANTLR4.

Bug: T311866
Change-Id: I33f5b6703c0aa9c80c907a21c2a770e30642edd3
2023-07-19 17:42:23 +02:00
Amir Sarabadani
bad7b08883 Add maintenance/storage/fixLegacyEncoding.php
To fix legacy-encoding entries that are already in external storage, which
means they can't be fixed by calling moveToExternal.php.

The script was originally a copy-paste and clean-up of moveToExternal.php,
but that created so much duplication that I went with subclassing.

Bug: T282734
Change-Id: Ic52e843f3dbe7d14cc8df5e8f3fe7aada7681bc9
2023-06-22 09:59:01 +02:00
jenkins-bot
c1c4fe771a Merge "Make some storage scripts use Maintenance class"
2023-06-16 02:52:40 +00:00
Amir Sarabadani
5836bf2ce5 moveToExternal: First decompress gzipped entries before iconv
This bug has messed up the content of 0.5M revisions on English Wikipedia.

Bug: T128150
Change-Id: I675287a07a58df0f19a35011c012462400e90be8
2023-06-15 23:45:42 +02:00
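
The ordering that 5836bf2ce5 describes (decompress first, convert the encoding second) can be sketched as follows. This is a minimal Python illustration, not the moveToExternal.php code: the function names and the `windows-1252` default are assumptions, and raw DEFLATE is assumed here to correspond to what the 'gzip' flag stores.

```python
import zlib


def raw_deflate(data: bytes) -> bytes:
    """Compress with raw DEFLATE (no zlib or gzip header); helper for the demo."""
    compressor = zlib.compressobj(wbits=-15)
    return compressor.compress(data) + compressor.flush()


def to_utf8(blob: bytes, compressed: bool, legacy_encoding: str = "windows-1252") -> bytes:
    """Return the blob's text re-encoded as UTF-8.

    Compressed bytes are not text, so they must be inflated before any
    character-set conversion is attempted.
    """
    if compressed:
        blob = zlib.decompress(blob, -15)  # step 1: undo raw DEFLATE
    # Step 2: interpret the bytes in the legacy encoding, then emit UTF-8.
    return blob.decode(legacy_encoding).encode("utf-8")
```

Running the conversion on the still-compressed bytes would treat binary data as legacy-encoded text, which is the kind of corruption the commit message refers to.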
daniel
34d73531cb Make some storage scripts use Maintenance class
CommandLineInc is deprecated. This allows the scripts to be executed
from MaintenanceRunner.

Change-Id: I180605ea5cb47783670b28a6f01d98f0398c705d
2023-06-14 22:36:35 +02:00
Amir Sarabadani
4dd3850beb moveToExternal: Also check for utf8 encoding before trying to convert
While most rows in production use 'utf-8' to flag content being UTF-8,
we have lots of rows flagged with 'utf8':
mysql:research@s3-analytics-replica.eqiad.wmnet [dawiki]> select old_flags, count(*) from text group by old_flags limit 50;
+---------------------+----------+
| old_flags           | count(*) |
+---------------------+----------+
| error               |        2 |
| external,gzip       |       49 |
| external,object     |       36 |
| external,utf-8      |  1614469 |
| external,utf8       |   336780 |
| gzip,utf-8,external |     1094 |
| utf-8,gzip,external |  9458083 |
+---------------------+----------+
7 rows in set (26.038 sec)

This would confuse the script into trying to re-encode the content again,
which could lead to all sorts of errors.

Change-Id: I9b4a38538199c9954cfed51cdd2bba8b0f6cb953
2023-06-08 14:20:51 +02:00
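
The check described in 4dd3850beb can be sketched roughly like this, assuming old_flags is a comma-separated string as in the query output above. This is a Python illustration with hypothetical names, not the script's actual code, and it deliberately ignores every other flag (such as 'object' or 'error').

```python
# Both spellings are treated as "already UTF-8".
UTF8_FLAGS = {"utf-8", "utf8"}


def is_flagged_utf8(old_flags: str) -> bool:
    """Return True if the comma-separated flags mark the row as UTF-8."""
    flags = {flag.strip() for flag in old_flags.split(",") if flag.strip()}
    return not UTF8_FLAGS.isdisjoint(flags)
```

With this check, a row flagged `external,utf8` is recognised as UTF-8 just like one flagged `external,utf-8`, so it would not be picked up for re-encoding.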
Kevin Israel
18bcb86ac1 moveToExternal: Actually convert encoding of cur_text
HistoryBlobCurStub objects point to rows in the cur table from 1.4,
which would have used the legacy encoding. It is incorrect to add the
"utf-8" flag without also converting the encoding.

Bug: T337700
Change-Id: Ie884512c1489358cabdf52660a7cb9d0797b8e78
2023-06-03 06:24:29 -04:00
Umherirrender
a01256c5b8 build: Cleanup of .phpcs.xml
Use inline suppression for known exception from eval/passthru/query call

Change-Id: Ie85ea5698a615adf07e4e391bf06d102149effd5
2023-04-13 12:57:51 +02:00
Tim Starling
d36ea70309 Fix some PHPStorm inspections (#1)
* Triple backslash in regex should really be quadruple backslash
* Using the returned value of a void method
* Immediately overwritten array keys
* Duplicate array keys
* Foreach variable reuse
* sprintf() with too many params
* Incorrect reference usage

Change-Id: I3c649b543c9561a1614058c50f3847f663ff04df
2023-03-25 00:19:33 +00:00
James D. Forrester
ad06527fb4 Reorg: Namespace the Title class
This is moderately messy.

Process was principally:

* xargs rg --files-with-matches '^use Title;' | grep 'php$' | \
  xargs -P 1 -n 1 sed -i -z 's/use Title;/use MediaWiki\\Title\\Title;/1'
* rg --files-without-match 'MediaWiki\\Title\\Title;' . | grep 'php$' | \
  xargs rg --files-with-matches 'Title\b' | \
  xargs -P 1 -n 1 sed -i -z 's/\nuse /\nuse MediaWiki\\Title\\Title;\nuse /1'
* composer fix

Then manual fix-ups for a few files that don't have any use statements.

Bug: T166010
Follows-Up: Ia5d8cb759dc3bc9e9bbe217d0fb109e2f8c4101a
Change-Id: If8fc9d0d95fc1a114021e282a706fc3e7da3524b
2023-03-02 08:46:53 -05:00
Amir Sarabadani
4bb2886562 Reorg: Migrate WikiMap to WikiMap/ out of includes
And WikiReference

Bug: T321882
Change-Id: I60cf4b9ef02b9d58118caa39172677ddfe03d787
2023-02-27 05:19:46 +01:00
Umherirrender
ee73e6ac1b Remove unused local variable assignment
Dead code found by phan

Change-Id: I9fc404d546a4fb1c61394cb6359eb774fd94383a
2023-02-04 22:16:31 +01:00
jenkins-bot
34f4280b2d Merge "Update moveToExternal and resolveStubs"
2022-12-20 04:00:34 +00:00
Tim Starling
096ea23208 Update moveToExternal and resolveStubs
Convert these two old scripts to Maintenance subclasses.

* Uncomment the resolveStub() call in moveToExternal and fix one obvious
  bug with it, i.e. the fact that stubs need to be resolved after CGZ
  blobs are moved.
* Replace get_class() with instanceof.
* Make the "tiny text" threshold configurable. Normally this is not
  wanted in WMF production since new revisions are written to ES
  unconditionally.
* Add a dry run mode.
* Add an undo log.
* Add --skip-resolve option.
* Make resolveStub() be much more defensive about what it resolves.
* In moveToExternal, make compression optional and do it also for plain
  text.
* Optionally convert the legacy encoding to UTF-8.

Bug: T299387
Change-Id: I52d54e3b6b785ac072796031be06499221340f51
2022-12-20 13:43:44 +11:00
thiemowmde
70aa9c8e35 Make use of ?:, ?? and ??= operators in mostly trivial cases
The motivation is to make the code less confusing. I hope this is the
case.

?? is an older PHP 7.0 feature.
??= was added in PHP 7.4, which we can finally use.

Change-Id: Id807affa52bd1151a74c064623b41d950a389560
2022-12-05 21:37:13 +01:00
Amir Sarabadani
b525884e11 maintenance: Use $this->waitForReplication()
This adds reconfiguring db pools in case a replica gets depooled

Bug: T298485
Change-Id: Id052ce8ed45c51e51b071778858d27b48605bf93
2022-10-24 21:11:53 +02:00
Tim Starling
0077c5da15 Use short array destructuring instead of list()
Introduced in PHP 7.1. Because it's shorter and looks nice.

I used regex replacement.

Change-Id: I0555e199d126cd44501f859cb4589f8bd49694da
2022-10-21 15:33:37 +11:00
Umherirrender
b15e689d49 Remove unused local variables
Various variables left over from an earlier refactor are now unused
and can be removed to make the code easier to read.

Change-Id: Id51770af1f08e85c7e7a02234a2cd2ab5b47ee7a
2022-09-19 23:07:07 +02:00