* Add automatic splitting of large metadata on upload or refresh. If
the reserializeMetadata option is enabled, metadata stored with PHP
serialization will be automatically reserialized to JSON.
* Inject configuration variable $wgUpdateCompatibleMetadata via
LocalRepo instead of accessing it directly.
* In refreshImageMetadata.php and rebuildImages.php, construct a new
LocalRepo with config overrides, instead of overwriting config
globals. Add a helper to RepoGroup to support this.
* In refreshImageMetadata.php, add new options --convert-to-json and
--split which reserialize metadata and optionally split out large
items to blob storage.
Also, refreshImageMetadata.php was totally broken in the non-force mode
since metadata refresh on page view was disabled in b814245d9f. The
maintenance script was relying on newFileFromRow() magically upgrading
the row, which doesn't happen anymore. So, call maybeUpgradeRow()
directly.
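The config-override approach from the third bullet can be illustrated with a small language-neutral sketch, written in Python rather than MediaWiki's PHP. Every name here (`Repo`, `new_repo_with_overrides`, the config keys) is a hypothetical stand-in, not the actual LocalRepo or RepoGroup API.

```python
from dataclasses import dataclass, field

# Hypothetical stand-in for site-wide configuration.
GLOBAL_CONFIG = {"updateCompatibleMetadata": False, "reserializeMetadata": False}


@dataclass
class Repo:
    """Hypothetical repo object that receives its settings at construction time."""
    config: dict = field(default_factory=dict)

    def should_reserialize(self) -> bool:
        return bool(self.config.get("reserializeMetadata", False))


def new_repo_with_overrides(base: dict, overrides: dict) -> Repo:
    """Build a repo from a copy of the base config plus per-script overrides."""
    merged = {**base, **overrides}  # a new dict, so the base config is never mutated
    return Repo(config=merged)


# A maintenance script asks for different behaviour without touching globals.
script_repo = new_repo_with_overrides(GLOBAL_CONFIG, {"reserializeMetadata": True})
```

Because the overrides live only in the new object, other code that reads the shared configuration during the same run is unaffected.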
Bug: T275268
Change-Id: I7bf7d9cef71641e287ca4346b568b381f4ada50e
* Optionally store metadata in the database in JSON format instead of
PHP serialization. The new JSON format has a top-level "envelope"
array which gives us a place to store things that are not part of the
handler metadata.
* Optionally split metadata items, putting items above a threshold into
the text table. The FileRepo and MediaHandler must both opt in.
* For staged deployment, the read side of these changes is always
active. Only the write side is configurable.
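As a rough illustration of the envelope and splitting ideas above, here is a minimal Python sketch. It is not the MediaWiki implementation: the envelope keys (`data`, `blobs`), the threshold value, the address scheme and the in-memory dict standing in for the text table are all assumptions made for the example.

```python
import json

SPLIT_THRESHOLD = 32  # hypothetical size limit in bytes, for illustration only


def encode_metadata(items: dict, blob_store: dict, split: bool = True) -> str:
    """Serialize items into a JSON envelope, moving large items to blob_store."""
    envelope = {"data": {}, "blobs": {}}
    for name, value in items.items():
        encoded = json.dumps(value)
        if split and len(encoded.encode("utf-8")) > SPLIT_THRESHOLD:
            address = f"blob:{len(blob_store)}"  # made-up address scheme
            blob_store[address] = encoded
            envelope["blobs"][name] = address
        else:
            envelope["data"][name] = value
    return json.dumps(envelope)


def decode_metadata(serialized: str, blob_store: dict) -> dict:
    """Read side: always understands both inline and split-out items."""
    envelope = json.loads(serialized)
    items = dict(envelope.get("data", {}))
    for name, address in envelope.get("blobs", {}).items():
        items[name] = json.loads(blob_store[address])
    return items


store = {}
original = {"width": 100, "comment": "x" * 100}
packed = encode_metadata(original, store)
restored = decode_metadata(packed, store)
```

The reader handles both layouts unconditionally while only the writer takes a `split` switch, mirroring the staged-deployment point in the last bullet.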
Bug: T275268
Change-Id: I876ea5c9d3a1881e278f689d2f8a3ae20240c703
Image metadata is usually a serialized string representing an array.
Passing the string around internally and having everything unserialize
it is an awkward convention.
Also, many image handlers were reading the file twice: once for
getMetadata() and again for getImageSize(). Often getMetadata()
would actually read the width and height and then throw it away.
So, in filerepo:
* Add File::getMetadataItem(), which will allow partial loading of
metadata in a future commit, per my proposal on T275268.
* Add File::getMetadataArray(), which returns the unserialized array.
Some file handlers were returning non-serializable strings from
getMetadata(), so I gave them a legacy array form ['_error' => ...].
* Changed MWFileProps to return the array form of metadata.
* Deprecate the weird File::getImageSize(). It was apparently not
called by anything, but was overridden by UnregisteredLocalFile.
* Wrap serialize/unserialize with File::getMetadataForDb() and
File::loadMetadataFromDb() in preparation for T275268.
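The array-based access pattern in the filerepo bullets can be sketched as follows. This is an analogy in Python, not the File class itself: JSON stands in for the stored format, and `load_metadata_from_db` and `get_metadata_item` are hypothetical names chosen to echo the methods above.

```python
import json


def load_metadata_from_db(raw: str) -> dict:
    """Decode a stored metadata string into a dict.

    JSON stands in for the real storage format here. Strings that do not
    decode to a dict are wrapped in a legacy {'_error': ...} form.
    """
    try:
        decoded = json.loads(raw)
    except ValueError:
        return {"_error": raw}
    return decoded if isinstance(decoded, dict) else {"_error": raw}


def get_metadata_item(metadata: dict, name: str, default=None):
    """Return one item; callers never touch the serialized string."""
    return metadata.get(name, default)
```

Decoding happens in one place, so callers work with plain data and an undecodable legacy string still yields an array they can inspect.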
In MediaHandler:
* Merged MediaHandler::getImageSize() and MediaHandler::getMetadata()
into getSizeAndMetadata(). Deprecated the old methods.
* Instead of isMetadataValid() we now have isFileMetadataValid(), which
only gets a File object, so it can decide what data it needs to load.
* Simplified getPageDimensions() by having it return false for non-paged
media. It was not called in that case, but was implemented anyway.
In specific handlers:
* Rename DjVuHandler::getUnserializedMetadata() and
extractTreesFromMetadata() for clarity. "Metadata" in these function
names meant an XML string.
* Updated DjVuImage::getImageSize() to provide image sizes in the new
style.
* In ExifBitmapHandler, getRotationForExif() now takes just the
Orientation tag, rather than a serialized string. Also renamed for
clarity.
* In GIFMetadataExtractor, return the width, height and bits per channel
instead of throwing them away. There was some conflation in
decodeBPP() which I picked apart. Refer to GIF89a section 18.
* In JpegMetadataExtractor, process the SOF0/SOF2 segment to extract
bits per channel, width, height and components (channel count). This
is essentially a port of PHP's getimagesize(), so should be bugwards
compatible.
* In PNGMetadataExtractor, return the width and height, which were
previously assigned to unused local variables. I verified the
implementation by referring to the specification.
* In SvgHandler, retain the version validation from unpackMetadata(),
but rename the function since it now takes an array as input.
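To make the "read the header once, keep the dimensions" idea concrete, here is a hedged Python sketch of header parsing for PNG and GIF. It follows the published file layouts (the PNG IHDR chunk and the GIF logical screen descriptor) but is not a port of the MediaWiki extractors. The function names and returned keys are hypothetical, and mapping "bits per channel" to the GIF color resolution field is an assumption.

```python
import struct

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def read_png_header(data: bytes) -> dict:
    """Return width, height and bit depth from the IHDR chunk of a PNG."""
    # Layout: 8-byte signature, 4-byte chunk length, 4-byte chunk type, then data.
    if data[:8] != PNG_SIGNATURE or data[12:16] != b"IHDR":
        raise ValueError("not a PNG with a leading IHDR chunk")
    width, height, bit_depth, color_type = struct.unpack(">IIBB", data[16:26])
    return {"width": width, "height": height, "bitDepth": bit_depth,
            "colorType": color_type}


def read_gif_header(data: bytes) -> dict:
    """Return width, height and color info from a GIF logical screen descriptor."""
    if data[:6] not in (b"GIF87a", b"GIF89a"):
        raise ValueError("not a GIF")
    width, height, packed = struct.unpack("<HHB", data[6:11])
    return {
        "width": width,
        "height": height,
        # Bits 4-6: color resolution, stored as bits per primary color minus 1.
        "bitsPerChannel": ((packed >> 4) & 0x07) + 1,
        # Bits 0-2: global color table size, stored as log2(entries) minus 1.
        "globalColorTableEntries": 2 ** ((packed & 0x07) + 1),
    }
```

The two packed GIF fields are decoded separately, which is one way to avoid conflating color resolution with color table size.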
In tests:
* In ExifBitmapTest, refactored some tests by using a provider.
* In GIFHandlerTest and PNGHandlerTest, I removed the tests in which
getMetadata() returns null, since it doesn't make sense when ported to
getMetadataArray(). I added tests for empty arrays instead.
* In tests, I retained serialization of input data since I figure it's
useful to confirm that existing database rows will continue to be read
correctly. I removed serialization of expected values, replacing them
with plain data.
* In tests, I replaced access to private class constants like
BROKEN_FILE with string literals, since stability is essential. If
the class constant changes, the test should fail.
Elsewhere:
* In maintenance/refreshImageMetadata.php, I removed the check for
shrinking image metadata, since it's not easy to implement and is
not future compatible. Image metadata is expected to shrink in the
future.
Bug: T275268
Change-Id: I039785d5b6439d71dcc21dcb972177dba5c3a67d
The global function wfLocalFile() has been deprecated since 1.34 and is unused.
This patch now makes it emit deprecation warnings.
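The general pattern of a deprecated wrapper that warns on every call can be sketched like this. The example is in Python and uses its standard `warnings` module rather than MediaWiki's own deprecation helpers. Both function names are hypothetical.

```python
import warnings


def find_local_file(title: str) -> str:
    """Hypothetical replacement API."""
    return f"File:{title}"


def legacy_local_file(title: str) -> str:
    """Hypothetical deprecated wrapper that now warns on every call."""
    warnings.warn(
        "legacy_local_file() is deprecated; use find_local_file() instead",
        DeprecationWarning,
        stacklevel=2,  # attribute the warning to the caller, not this wrapper
    )
    return find_local_file(title)
```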
Change-Id: Ib9f94b4f49e7720bd4455d019995037eaa4e3980
The name change happened some time ago, and I think it's
about time to start using the new name!
(Done with a find and replace)
My personal motivation for doing this is that I have started
trying out VS Code as an IDE for MediaWiki development, and
right now it doesn't appear to handle PHP aliases very well,
if at all.
Change-Id: I412235d91ae26e4c1c6a62e0dbb7e7cf3c5ed4a6
The only use-case for this was if the local wiki's File namespace didn't
allow initial-lowercase names but a remote repo did. This case is
unlikely to be useful and was broken anyway, so it's now prohibited, and
getUserCaseDBKey is no longer needed. It's now an alias for getDBkey
until we remove it.
This includes a breaking change to
MediaWikiTitleCodec::splitTitleString, but that was always intended as
an internal method and I have now marked it officially as such. There is
one caller in code search outside core (JsonConfig), but it looks like
it will be unaffected.
Bug: T202094
Depends-On: I4b8ceb8a7f4624d6a3763aca6df41bf1a0d7293f
Depends-On: I724be15e93421f874fb202f0ec2ca6940e56a20a
Change-Id: I4fd64d4b0036b6dabdcfeee18766df18bf538542
assertSame() is guaranteed to never do any magic type conversion.
This can be critical when accidentally comparing empty strings (a
value PHP considers to be "falsy") to false, 0, 0.0, null, and such.
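The difference between a loose and a strict assertion can be sketched outside PHPUnit. This is only an analogy: Python's `==` follows different rules from PHP's `==` (Python does not treat an empty string as equal to False, for instance). `assert_same` and `is_same` are hypothetical helpers that additionally require identical types, which is the property this change relies on.

```python
def assert_same(expected, actual):
    """Hypothetical strict check: same type and equal value, no coercion."""
    if type(expected) is not type(actual) or expected != actual:
        raise AssertionError(f"{expected!r} is not the same as {actual!r}")


def is_same(expected, actual) -> bool:
    """Boolean wrapper around assert_same, for demonstration."""
    try:
        assert_same(expected, actual)
    except AssertionError:
        return False
    return True


# Pairs that compare equal under Python's loose equality...
loose_pairs = [(0, False), (0, 0.0), (1, True)]
loosely_equal = [a == b for a, b in loose_pairs]

# ...but are rejected once the types must match as well.
strictly_same = [is_same(a, b) for a, b in loose_pairs]
```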
Change-Id: I2e2685c5992cae252f629a68ffe1a049f2e5ed1b
Some methods on LocalFile will fatal if called on a non-existing file.
ApiQueryImageInfo did not take that into account.
This patch changes LocalFile to avoid fatal errors, and ApiQueryImageInfo
so that it does not try to report information on non-existing files.
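The two halves of this fix, a file object that degrades gracefully and a caller that checks existence first, can be sketched as below. This is a Python illustration with hypothetical names (`LocalFileSketch`, `image_info`, the `missing` key), not the LocalFile or ApiQueryImageInfo code.

```python
from typing import Optional


class LocalFileSketch:
    """Hypothetical file object; `row` is None when the file does not exist."""

    def __init__(self, name: str, row: Optional[dict] = None):
        self.name = name
        self._row = row

    def exists(self) -> bool:
        return self._row is not None

    def get_uploader(self) -> Optional[str]:
        # Return None instead of failing when there is no backing row.
        return self._row["uploader"] if self._row is not None else None


def image_info(file: LocalFileSketch) -> dict:
    """Report details only for files that exist; otherwise flag them missing."""
    if not file.exists():
        return {"name": file.name, "missing": True}
    return {"name": file.name, "uploader": file.get_uploader()}
```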
NOTE: the modified code has NO test coverage! This should be fixed
before this patch is applied, or the patch needs to be thoroughly tested
manually.
Bug: T221812
Change-Id: I9b74545a393d1b7a25c8262d4fe37a6492bbc11e