Commit graph

49 commits

Author SHA1 Message Date
Dreamy Jazz
e7393b3cc7 Exclude boilerplate maintenance code from code coverage reports
Why:
* Maintenance scripts in core have bolierplate code that is
  added before and after the class to allow directly running
  the maintenance script.
* Running the maintenance script directly has been deprecated
  since 1.40, so this boilerplate code is only to support a now
  deprecated method of running maintenance scripts.
* This code cannot also be marked as covered, due to PHPUnit
  not recognising code coverage for files.
* Therefore, it is best to ignore this boilerplate code in code
  coverage reports as it cannot be marked as covered and also
  is for deprecated code.

What:
* Wrap the boilerplate code (requiring Maintenance.php and then
  later defining the maintenance script class and running if the
  maintenance script was called directly) with @codeCoverageIgnore
  comments.
* Some files use a different boilerplate code, however, these
  should also be marked as ignored for coverage for the same
  reason that coverage is not properly reported for files.

Bug: T371167
Change-Id: I32f5c6362dfb354149a48ce9c28da9a7fc494f7c
2024-08-27 13:22:29 +01:00
Umherirrender
be157850ad Replace IDatabase::buildLike with expression builder
Bug: T361023
Change-Id: I0fcb61ff1b15931477bce3c8f850a8ce97405a36
2024-05-18 12:29:17 +00:00
Amir Sarabadani
d9370003fb maintenance: Introduce getReplicaDB() and getPrimaryDB()
And start using them instead of wfGetDB(), LB/LBF connection methods or
worse, $this->getDB().

$this->getDB() reuses the database object regardless of whether you're
calling a replica or primary, leading to returning a replica on a
primary and other way around.

Bug: T330641
Change-Id: I9e2cf85ca277022284fc26b9f37db57bd12aaa81
2024-01-18 15:12:04 +01:00
Amir Sarabadani
69cabb628c maintenance: Migrate to expression builders
This was somehow left out

Bug: T210206
Change-Id: I70851b5b99fa865dbfd629caf2c1866c85418350
2024-01-17 20:27:08 +01:00
Derick Alangi
74033c50cd maintenance: Begin using Maintenance::getServiceContainer()
Maintenance class provides a method for getting a fresh reference
of the MW services container instance. Let's make use of these in
maintenance scripts now that we have it.

NOTE: There are still some static methods like in refreshLinks.php
that makes use of services that we can't use this method for now.

Change-Id: Idba744057577896fc97c9ecf4724db27542bf01c
2023-09-04 10:39:58 +00:00
Amir Sarabadani
e569aedde5 Introduce FileSelectQueryBuilder
So much can be cleaned up with this

Bug: T311866
Change-Id: Ia4d46679c540c731b2ae8da2f8022fd6f5b931a4
2023-08-07 19:05:34 +02:00
Amir Sarabadani
b525884e11 maintenance: Use $this->waitForReplication()
This adds reconfiguring db pools in case a replica gets depooled

Bug: T298485
Change-Id: Id052ce8ed45c51e51b071778858d27b48605bf93
2022-10-24 21:11:53 +02:00
Tim Starling
0077c5da15 Use short array destructuring instead of list()
Introduced in PHP 7.1. Because it's shorter and looks nice.

I used regex replacement.

Change-Id: I0555e199d126cd44501f859cb4589f8bd49694da
2022-10-21 15:33:37 +11:00
Umherirrender
6caf78c2c8 phan: Remove PhanPossiblyUndeclaredVariable suppression
Make phan stricter about conditional variable declaration
Remaining false positive issues are suppressed.
The suppression and the setting change can only be done together

Bug: T259172
Change-Id: I1f200ac37df7448453688bf464a8250c97313e5d
2022-03-30 19:47:15 +00:00
Amir Sarabadani
28ad6f27fb maintenance: Add support for oldimage table metadata refresh
This doesn't handle deleted files yet but that can be done in later
patches. I have been struggling to find a solution for it.

Bug: T298417
Change-Id: I4f33afa92be8b55d6b8f9797cb4d3aa03a37ad79
2022-01-03 12:14:09 +01:00
Amir Sarabadani
afe4bbdbe3 media: Drop XML metadata support from DjvuHandler
All of production has been migrated to json and this patch also adds an
update entry for third parties.

Bug: T275268
Change-Id: I916127896bdce95472823ae7be12fc5e6e16691a
2021-11-21 15:47:56 +01:00
Amir Sarabadani
128e55a5db Add --sleep option to refreshImageMetadata.php
For running long-term maintenance scripts, having waitForReplication()
is not enough as that method doesn't count for cross DC-replication and
replication the cloud making issues if kept running more than a couple
of hours. This mitigate the aforementioned problem.

Change-Id: Ifa0b31dde4a57a4d3f2b00c29410be568cfa6b91
2021-06-29 21:10:41 +02:00
Tim Starling
9251f3c9e3 Manual and automatic image metadata reserialization
* Add automatic splitting of large metadata on upload or refresh. If
  the reserializeMetadata option is enabled, metadata stored with PHP
  serialization will be automatically reserialized to JSON.
* Inject configuration variable $wgUpdateCompatibleMetadata via
  LocalRepo instead of accessing it directly.
* In refreshImageMetadata.php and rebuildImages.php, construct a new
  LocalRepo with config overrides, instead of overwriting config
  globals. Add a helper to RepoGroup to help with this.
* In refreshImageMetadata.php, add new options --convert-to-json and
  --split which reserialize metadata and optionally split out large
  items to blob storage.

Also, refreshImageMetadata.php was totally broken in the non-force mode
since metadata refresh on page view was disabled in b814245d9f. The
maintenance script was relying on newFileFromRow() magically upgrading
the row, which doesn't happen anymore. So, call maybeUpgradeRow()
directly.

Bug: T275268
Change-Id: I7bf7d9cef71641e287ca4346b568b381f4ada50e
2021-06-26 22:50:17 +00:00
Tim Starling
b4849e03b7 Use the unserialized form of image metadata internally
Image metadata is usually a serialized string representing an array.
Passing the string around internally and having everything unserialize
it is an awkward convention.

Also, many image handlers were reading the file twice: once for
getMetadata() and again for getImageSize(). Often getMetadata()
would actually read the width and height and then throw it away.

So, in filerepo:

* Add File::getMetadataItem(), which promises to allow partial
  loading of metadata per my proposal on T275268 in a future commit.
* Add File::getMetadataArray(), which returns the unserialized array.
  Some file handlers were returning non-serializable strings from
  getMetadata(), so I gave them a legacy array form ['_error' => ...]
* Changed MWFileProps to return the array form of metadata.
* Deprecate the weird File::getImageSize(). It was apparently not
  called by anything, but was overridden by UnregisteredLocalFile.
* Wrap serialize/unserialize with File::getMetadataForDb() and
  File::loadMetadataFromDb() in preparation for T275268.

In MediaHandler:

* Merged MediaHandler::getImageSize() and MediaHandler::getMetadata()
  into getSizeAndMetadata(). Deprecated the old methods.
* Instead of isMetadataValid() we now have isFileMetadataValid(), which
  only gets a File object, so it can decide what data it needs to load.
* Simplified getPageDimensions() by having it return false for non-paged
  media. It was not called in that case, but was implemented anyway.

In specific handlers:

* Rename DjVuHandler::getUnserializedMetadata() and
  extractTreesFromMetadata() for clarity. "Metadata" in these function
  names meant an XML string.
* Updated DjVuImage::getImageSize() to provide image sizes in the new
  style.
* In ExifBitmapHandler, getRotationForExif() now takes just the
  Orientation tag, rather than a serialized string. Also renamed for
  clarity.
* In GIFMetadataExtractor, return the width, height and bits per channel
  instead of throwing them away. There was some conflation in
  decodeBPP() which I picked apart. Refer to GIF89a section 18.
* In JpegMetadataExtractor, process the SOF0/SOF2 segment to extract
  bits per channel, width, height and components (channel count). This
  is essentially a port of PHP's getimagesize(), so should be bugwards
  compatible.
* In PNGMetadataExtractor, return the width and height, which were
  previously assigned to unused local variables. I verified the
  implementation by referring to the specification.
* In SvgHandler, retain the version validation from unpackMetadata(),
  but rename the function since it now takes an array as input.

In tests:

* In ExifBitmapTest, refactored some tests by using a provider.
* In GIFHandlerTest and PNGHandlerTest, I removed the tests in which
  getMetadata() returns null, since it doesn't make sense when ported to
  getMetadataArray(). I added tests for empty arrays instead.
* In tests, I retained serialization of input data since I figure it's
  useful to confirm that existing database rows will continue to be read
  correctly. I removed serialization of expected values, replacing them
  with plain data.
* In tests, I replaced access to private class constants like
  BROKEN_FILE with string literals, since stability is essential. If
  the class constant changes, the test should fail.

Elsewhere:

* In maintenance/refreshImageMetadata.php, I removed the check for
  shrinking image metadata, since it's not easy to implement and is
  not future compatible. Image metadata is expected to shrink in
  future.

Bug: T275268
Change-Id: I039785d5b6439d71dcc21dcb972177dba5c3a67d
2021-06-08 17:04:01 +10:00
James D. Forrester
df5eb22f83 Replace uses of DB_MASTER with DB_PRIMARY
Just an auto-replace from codesniffer for now.

Change-Id: I5240dc9ac5929d291b0ef1c743ea2bfd3f428266
2021-04-29 09:24:31 -07:00
Reedy
8ba1c75559 Replace wfWaitForSlaves() with LBFactory::waitForReplication()
Change-Id: I337147d61e2ec686a8672d0340dff4b6783f78cd
2020-05-02 02:00:01 +00:00
Derick A
e7493e35bd maintenance: Avoid deprecated usage of RepoGroup::singleton()
Change-Id: I694cf124a5f90f368309b04f8ae0b843a98816eb
2020-02-18 21:28:17 +00:00
Umherirrender
b4fe9c4bcc Set method visibility on maintenance scripts
Change-Id: I44c82fbe65e1d002803ce065df6563f06dd39cd4
2019-11-16 22:54:17 +00:00
Umherirrender
7363e38ddf Set public for override of Maintenance functions
The following function are set to public in the parent class and cannot
have another visibility in subclasses
Maintenance::__construct
Maintenance::execute
Maintenance::getDbType
Maintenance::validateParamsAndArgs
Maintenance::setDB

Change-Id: I0cd6514642d479aca20f1221bf673b0713c21631
2019-10-09 20:41:33 +02:00
Umherirrender
ad776c7d5f Use ::class to resolve class names in maintenance scripts
This helps to find renamed or misspelled classes earlier.
Phan will check the class names

Change-Id: I1d4567f47f93eb1436cb98558388e48d35258666
2018-01-23 17:40:16 +00:00
Bryan Davis
9e34eeff23 Maintenance: add fatalError() method
Deprecate the second argument to Maintenance::error() in favor of a new
Maintenance::fatalError() method. This is intended to make it easier to
review flow control in maintenance scripts.

Change-Id: I75699008638f7e99b11210c7bb9e2e131fca7c9e
2017-11-21 21:34:16 -07:00
Max Semenik
55a9b2f8b6 Finish migration to Maintenance::getBatchSize()
Change-Id: I02d89f71d820e4d00a39e86a30397b614bbdb432
2017-11-07 19:35:11 -08:00
Brad Jorsch
3488f49532 Replace selectFields() methods with getQueryInfo()
Several classes have a "selectFields()" static method to tell callers
which fields to select from the database. With the recent comment table
change and the upcoming actor table change, this pattern has become too
simplistic as a SELECT will need to join several tables to be able to
retrieve all the needed fields.

Thus, we deprecate the selectFields() methods in favor of getQueryInfo()
methods that return tables and join conditions in addition to the
fields.

Change-Id: Idcfd15568489d9f03a7ba4460e96610d33bc4089
2017-10-30 22:57:33 +00:00
Brad Jorsch
fa4a909def Replace more uses of "SELECT *"
With the introduction of CommentStore, selects from various table
require certain joins or column aliases for proper operation. The
upcoming actor table change, and the suggested title table change, will
add more such requirements.

Change-Id: Ic8213bff74b8350b15cd271d0ef252e63e7e79bd
2017-10-13 19:02:56 +00:00
Gilles Dubuc
dfb6bb0ab5 Improve output of refreshImageMetadata and refreshFileHeaders
Bug: T150741
Change-Id: Ie5f787fd77ecd31b8852d0f66de912baced4ca46
2017-05-16 09:13:36 +02:00
Aaron Schulz
21e71e0235 Use IDatabase type hints in /maintenance
Relatedly, move lockTables()/unlockTables() to IMaintainableDatabase

Change-Id: Ib53e9fa948deb2f9a70f0ce16c002613d0060bf9
2017-04-07 23:37:41 +00:00
Reedy
0200a1c113 Make refreshImageMetadata not fail completely if it doesn't like a single file
Change-Id: I1170cf91ba6dea08d35457c9859b8c18482d4aa6
2016-12-09 21:01:15 +00:00
Aaron Schulz
30f4b3c103 Replace DatabaseBase => Database in more places
Change-Id: If37a7909056bf2c31a8228cbc84f0fbbf5f1c517
2016-09-28 15:53:02 -07:00
Aaron Schulz
c95583a560 Defer maybeUpgradeRow() post-send since they can trigger on non-POST
Change-Id: I791e7133d49ed9cd6dd40bb3fa35ea38d1ceba10
2016-08-11 09:19:09 -07:00
Bartosz Dziewoński
ed3b46d259 refreshImageMetadata: Allow filtering by 'img_media_type' too
Unlike 'img_major_mime' and 'img_minor_mime', this shouldn't be
"inefficient", since there's an index on it.

Bug: T131157
Change-Id: I4985cade41c23ef68f5caf276d4934cf24de2bb6
2016-03-29 18:11:42 +02:00
Kunal Mehta
6e9b4f0e9c Convert all array() syntax to []
Per wikitech-l consensus:
 https://lists.wikimedia.org/pipermail/wikitech-l/2016-February/084821.html

Notes:
* Disabled CallTimePassByReference due to false positives (T127163)

Change-Id: I2c8ce713ce6600a0bb7bf67537c87044c7a45c4b
2016-02-17 01:33:00 -08:00
Max Semenik
59db24e90b Use addDescription() instead of accessing mDescription directly
Change-Id: I0e2aa83024b8abf5298cfea4b21bf45722ad3103
2016-01-30 01:28:32 -08:00
Reedy
44cebea941 Update wfGetDB calls in Maintenance scripts to use getDB()
Change-Id: I9ad6745d84506b736dae94747256caac89715899
2016-01-02 16:58:23 +00:00
rillke
c31fbf073e Unify the spelling of MIME in documentation
Writing MIME as written in Wikipedia and some documentation clean up.

Change-Id: I9dfc36d2bf55d72d9374c4075bd6d45eef0415a4
2014-08-07 23:38:45 +02:00
Siebrand Mazeland
606c680b21 Update formatting in maintenance/ (4/4)
Change-Id: I6b58d014a4bfd6600e4e6f80188fdcfce18482ca
2014-04-23 20:09:26 +02:00
Siebrand Mazeland
89d8c583d7 Pass phpcs-strict on maintenance/ (2/8)
Change-Id: I69e2bca3c98fe9d3713c852699f49b7b4c868338
2014-04-22 21:25:47 +00:00
umherirrender
e78776373e Fixed some @params documentation (maintenance)
Swapped some "$var type" to "type $var" or added missing types
before the $var. Changed some other types to match the more common
spelling. Makes beginning of some text in captial.
Also added some missing @param.

Change-Id: I727deec35a712de0f0c676cc87dfa661f1ee965b
2014-04-17 22:48:32 +02:00
Ladsgroup
16a5102765 Change URLs to mediawiki.org in comments to HTTPS
These are only documentation fixes
http://www.mediawiki.org --> https://www.mediawiki.org

Change-Id: I62ad42be1a3aac410cc53e98ce79389ceddd8988
2014-03-20 16:59:46 +00:00
umherirrender
5dbfd5bf80 Fixed spacing
- Removed trailing spaces in comments
- Removed multiple empty lines
- Removed space after object operator

Change-Id: I9fd3256ab490c7cd2034de3fd94e6be6e6d6d8f2
2013-11-21 18:52:25 +00:00
Timo Tijhof
beb1c4a0ec phpcs: More require/include is not a function
Follows-up I1343872de7, Ia533aedf63 and I2df2f80b81.

Also updated usage in text in documentation and the
installer LocalSettingsGenerator.

Most of them were handled by this regex:
- find: (require|include|require_once|include_once)\s*\(\s*(.+?)\s*\)\s*;$
- replace: $1 $2;

Change-Id: I6b38aad9a5149c9c43ce18bd8edbab14b8ce43fa
2013-05-21 23:26:28 +02:00
Timo Tijhof
50e7985d4d phpcs: Fix WhiteSpace.LanguageConstructSpacing warnings
Squiz.WhiteSpace.LanguageConstructSpacing:
   Language constructs must be followed by a single space;
   expected "require_once expression" but found
   "require_once(expression)"

It is a keyword (e.g. like `new`, `return` and `print`). As
such the parentheses don't make sense.

Per our code conventions, we use a space after keywords like
these. We appeared to have an unwritten exception for `require`
that doesn't make sense. About 60% of require/include usage
was missing the space and/or had superfluous parentheses.

It is as silly as print("foo") or return("foo"), it works
because keywords have no significance for whitespace between
it and the expression that follows, and since experessions can
be wrapped in parentheses for clarity (e.g. when doing string
concatenation or mathematical operations) the parenthesis
before and after basiclaly just ignored.

Change-Id: I2df2f80b8123714bea7e0771bf94b51ad5bb4b87
2013-05-09 05:56:26 +02:00
umherirrender
bfb75bc8e2 Fixed spacing around parenthesis in languages/tests/maintenance
Change-Id: Idd4299d17f1fcf98ab1d635484cb4e880f35ee24
2013-04-28 15:57:34 +00:00
umherirrender
b114f5e1c1 Fixed some spacing in maintenance folder
Added spaces before if, foreach
Added some braces for one line statements

Change-Id: I9657f72996358f8c1c154cea1ea97970d973723c
2013-04-18 20:48:44 +02:00
Antoine Musso
7006e1df93 style: fix up commas in function arguments
Fix up spaces in our function calls, we do not want spaces before a
comma and try to avoid multiple commas whenever possible.

Errors:

* No space found after comma in function call
* Space found before comma in function call

Change-Id: I51aec02016f742422fa60b92ad35ba3f0ef59ba3
2013-02-06 19:30:39 +01:00
jeroendedauw
38c7f444e1 Use __DIR__ instead of dirname( __FILE__ )
We can now do this since we finally switched to PHP 5.3 for MW 1.20 and get rid of the silly dirname(__FILE__) stuff :)

Change-Id: Id9b2c9cd2e678197aa81c78adced5d1d31ff57b1
2012-08-27 21:45:00 +02:00
Alexandre Emsenhuber
6015071beb Improve documentation of maintenance scripts.
Change-Id: I21b4fb873e88026108754eb7206e62c82648df0e
2012-08-09 19:28:14 +02:00
Antoine Musso
46546f0519 Fix doc for maintenance/ 2012-02-08 16:55:54 +00:00
Sam Reed
579b0c5dbc Followup r95458
Add a couple of bits of documentation

Removed an unused global

No need to 1.18 this
2011-09-06 15:45:43 +00:00
Brian Wolff
ddc1d23149 New maintenance script for refreshing image metadata (refreshImageMetadata.php)
This is very similar to rebuildImages.php, except more specific to img_metadata field,
and does the images in batches instead of all at once.

Also, while I'm here, I added a line to Maintenance.php to make sure it casted
$this->mBatchSize to an integer when gotten from command line (thought it was weird
that it didn't do that)

(I'm going to tag this revision 1.18 because I think it'd be nice to have this script
in 1.18 given new image metadata stuff added in 1.18, but not super-important
because rebuildImages.php does already work to refresh image metadata)
2011-08-25 05:33:32 +00:00