Why:
* Maintenance scripts in core have bolierplate code that is
added before and after the class to allow directly running
the maintenance script.
* Running the maintenance script directly has been deprecated
since 1.40, so this boilerplate code is only to support a now
deprecated method of running maintenance scripts.
* This code cannot also be marked as covered, due to PHPUnit
not recognising code coverage for files.
* Therefore, it is best to ignore this boilerplate code in code
coverage reports as it cannot be marked as covered and also
is for deprecated code.
What:
* Wrap the boilerplate code (requiring Maintenance.php and then
later defining the maintenance script class and running if the
maintenance script was called directly) with @codeCoverageIgnore
comments.
* Some files use a different boilerplate code, however, these
should also be marked as ignored for coverage for the same
reason that coverage is not properly reported for files.
Bug: T371167
Change-Id: I32f5c6362dfb354149a48ce9c28da9a7fc494f7c
And start using them instead of wfGetDB(), LB/LBF connection methods or
worse, $this->getDB().
$this->getDB() reuses the database object regardless of whether you're
calling a replica or primary, leading to returning a replica on a
primary and other way around.
Bug: T330641
Change-Id: I9e2cf85ca277022284fc26b9f37db57bd12aaa81
Maintenance class provides a method for getting a fresh reference
of the MW services container instance. Let's make use of these in
maintenance scripts now that we have it.
NOTE: There are still some static methods like in refreshLinks.php
that makes use of services that we can't use this method for now.
Change-Id: Idba744057577896fc97c9ecf4724db27542bf01c
Make phan stricter about conditional variable declaration
Remaining false positive issues are suppressed.
The suppression and the setting change can only be done together
Bug: T259172
Change-Id: I1f200ac37df7448453688bf464a8250c97313e5d
This doesn't handle deleted files yet but that can be done in later
patches. I have been struggling to find a solution for it.
Bug: T298417
Change-Id: I4f33afa92be8b55d6b8f9797cb4d3aa03a37ad79
All of production has been migrated to json and this patch also adds an
update entry for third parties.
Bug: T275268
Change-Id: I916127896bdce95472823ae7be12fc5e6e16691a
For running long-term maintenance scripts, having waitForReplication()
is not enough as that method doesn't count for cross DC-replication and
replication the cloud making issues if kept running more than a couple
of hours. This mitigate the aforementioned problem.
Change-Id: Ifa0b31dde4a57a4d3f2b00c29410be568cfa6b91
* Add automatic splitting of large metadata on upload or refresh. If
the reserializeMetadata option is enabled, metadata stored with PHP
serialization will be automatically reserialized to JSON.
* Inject configuration variable $wgUpdateCompatibleMetadata via
LocalRepo instead of accessing it directly.
* In refreshImageMetadata.php and rebuildImages.php, construct a new
LocalRepo with config overrides, instead of overwriting config
globals. Add a helper to RepoGroup to help with this.
* In refreshImageMetadata.php, add new options --convert-to-json and
--split which reserialize metadata and optionally split out large
items to blob storage.
Also, refreshImageMetadata.php was totally broken in the non-force mode
since metadata refresh on page view was disabled in b814245d9f. The
maintenance script was relying on newFileFromRow() magically upgrading
the row, which doesn't happen anymore. So, call maybeUpgradeRow()
directly.
Bug: T275268
Change-Id: I7bf7d9cef71641e287ca4346b568b381f4ada50e
Image metadata is usually a serialized string representing an array.
Passing the string around internally and having everything unserialize
it is an awkward convention.
Also, many image handlers were reading the file twice: once for
getMetadata() and again for getImageSize(). Often getMetadata()
would actually read the width and height and then throw it away.
So, in filerepo:
* Add File::getMetadataItem(), which promises to allow partial
loading of metadata per my proposal on T275268 in a future commit.
* Add File::getMetadataArray(), which returns the unserialized array.
Some file handlers were returning non-serializable strings from
getMetadata(), so I gave them a legacy array form ['_error' => ...]
* Changed MWFileProps to return the array form of metadata.
* Deprecate the weird File::getImageSize(). It was apparently not
called by anything, but was overridden by UnregisteredLocalFile.
* Wrap serialize/unserialize with File::getMetadataForDb() and
File::loadMetadataFromDb() in preparation for T275268.
In MediaHandler:
* Merged MediaHandler::getImageSize() and MediaHandler::getMetadata()
into getSizeAndMetadata(). Deprecated the old methods.
* Instead of isMetadataValid() we now have isFileMetadataValid(), which
only gets a File object, so it can decide what data it needs to load.
* Simplified getPageDimensions() by having it return false for non-paged
media. It was not called in that case, but was implemented anyway.
In specific handlers:
* Rename DjVuHandler::getUnserializedMetadata() and
extractTreesFromMetadata() for clarity. "Metadata" in these function
names meant an XML string.
* Updated DjVuImage::getImageSize() to provide image sizes in the new
style.
* In ExifBitmapHandler, getRotationForExif() now takes just the
Orientation tag, rather than a serialized string. Also renamed for
clarity.
* In GIFMetadataExtractor, return the width, height and bits per channel
instead of throwing them away. There was some conflation in
decodeBPP() which I picked apart. Refer to GIF89a section 18.
* In JpegMetadataExtractor, process the SOF0/SOF2 segment to extract
bits per channel, width, height and components (channel count). This
is essentially a port of PHP's getimagesize(), so should be bugwards
compatible.
* In PNGMetadataExtractor, return the width and height, which were
previously assigned to unused local variables. I verified the
implementation by referring to the specification.
* In SvgHandler, retain the version validation from unpackMetadata(),
but rename the function since it now takes an array as input.
In tests:
* In ExifBitmapTest, refactored some tests by using a provider.
* In GIFHandlerTest and PNGHandlerTest, I removed the tests in which
getMetadata() returns null, since it doesn't make sense when ported to
getMetadataArray(). I added tests for empty arrays instead.
* In tests, I retained serialization of input data since I figure it's
useful to confirm that existing database rows will continue to be read
correctly. I removed serialization of expected values, replacing them
with plain data.
* In tests, I replaced access to private class constants like
BROKEN_FILE with string literals, since stability is essential. If
the class constant changes, the test should fail.
Elsewhere:
* In maintenance/refreshImageMetadata.php, I removed the check for
shrinking image metadata, since it's not easy to implement and is
not future compatible. Image metadata is expected to shrink in
future.
Bug: T275268
Change-Id: I039785d5b6439d71dcc21dcb972177dba5c3a67d
The following function are set to public in the parent class and cannot
have another visibility in subclasses
Maintenance::__construct
Maintenance::execute
Maintenance::getDbType
Maintenance::validateParamsAndArgs
Maintenance::setDB
Change-Id: I0cd6514642d479aca20f1221bf673b0713c21631
Deprecate the second argument to Maintenance::error() in favor of a new
Maintenance::fatalError() method. This is intended to make it easier to
review flow control in maintenance scripts.
Change-Id: I75699008638f7e99b11210c7bb9e2e131fca7c9e
Several classes have a "selectFields()" static method to tell callers
which fields to select from the database. With the recent comment table
change and the upcoming actor table change, this pattern has become too
simplistic as a SELECT will need to join several tables to be able to
retrieve all the needed fields.
Thus, we deprecate the selectFields() methods in favor of getQueryInfo()
methods that return tables and join conditions in addition to the
fields.
Change-Id: Idcfd15568489d9f03a7ba4460e96610d33bc4089
With the introduction of CommentStore, selects from various table
require certain joins or column aliases for proper operation. The
upcoming actor table change, and the suggested title table change, will
add more such requirements.
Change-Id: Ic8213bff74b8350b15cd271d0ef252e63e7e79bd
Unlike 'img_major_mime' and 'img_minor_mime', this shouldn't be
"inefficient", since there's an index on it.
Bug: T131157
Change-Id: I4985cade41c23ef68f5caf276d4934cf24de2bb6
Swapped some "$var type" to "type $var" or added missing types
before the $var. Changed some other types to match the more common
spelling. Makes beginning of some text in captial.
Also added some missing @param.
Change-Id: I727deec35a712de0f0c676cc87dfa661f1ee965b
Follows-up I1343872de7, Ia533aedf63 and I2df2f80b81.
Also updated usage in text in documentation and the
installer LocalSettingsGenerator.
Most of them were handled by this regex:
- find: (require|include|require_once|include_once)\s*\(\s*(.+?)\s*\)\s*;$
- replace: $1 $2;
Change-Id: I6b38aad9a5149c9c43ce18bd8edbab14b8ce43fa
Squiz.WhiteSpace.LanguageConstructSpacing:
Language constructs must be followed by a single space;
expected "require_once expression" but found
"require_once(expression)"
It is a keyword (e.g. like `new`, `return` and `print`). As
such the parentheses don't make sense.
Per our code conventions, we use a space after keywords like
these. We appeared to have an unwritten exception for `require`
that doesn't make sense. About 60% of require/include usage
was missing the space and/or had superfluous parentheses.
It is as silly as print("foo") or return("foo"), it works
because keywords have no significance for whitespace between
it and the expression that follows, and since experessions can
be wrapped in parentheses for clarity (e.g. when doing string
concatenation or mathematical operations) the parenthesis
before and after basiclaly just ignored.
Change-Id: I2df2f80b8123714bea7e0771bf94b51ad5bb4b87
Fix up spaces in our function calls, we do not want spaces before a
comma and try to avoid multiple commas whenever possible.
Errors:
* No space found after comma in function call
* Space found before comma in function call
Change-Id: I51aec02016f742422fa60b92ad35ba3f0ef59ba3
We can now do this since we finally switched to PHP 5.3 for MW 1.20 and get rid of the silly dirname(__FILE__) stuff :)
Change-Id: Id9b2c9cd2e678197aa81c78adced5d1d31ff57b1
This is very similar to rebuildImages.php, except more specific to img_metadata field,
and does the images in batches instead of all at once.
Also, while I'm here, I added a line to Maintenance.php to make sure it casted
$this->mBatchSize to an integer when gotten from command line (thought it was weird
that it didn't do that)
(I'm going to tag this revision 1.18 because I think it'd be nice to have this script
in 1.18 given new image metadata stuff added in 1.18, but not super-important
because rebuildImages.php does already work to refresh image metadata)