Commit graph

19 commits

Author SHA1 Message Date
Tim Starling
9251f3c9e3 Manual and automatic image metadata reserialization
* Add automatic splitting of large metadata on upload or refresh. If
  the reserializeMetadata option is enabled, metadata stored with PHP
  serialization will be automatically reserialized to JSON.
* Inject configuration variable $wgUpdateCompatibleMetadata via
  LocalRepo instead of accessing it directly.
* In refreshImageMetadata.php and rebuildImages.php, construct a new
  LocalRepo with config overrides, instead of overwriting config
  globals. Add a helper to RepoGroup to help with this.
* In refreshImageMetadata.php, add new options --convert-to-json and
  --split which reserialize metadata and optionally split out large
  items to blob storage.

Also, refreshImageMetadata.php was totally broken in the non-force mode
since metadata refresh on page view was disabled in b814245d9f. The
maintenance script was relying on newFileFromRow() magically upgrading
the row, which doesn't happen anymore. So, call maybeUpgradeRow()
directly.

Bug: T275268
Change-Id: I7bf7d9cef71641e287ca4346b568b381f4ada50e
2021-06-26 22:50:17 +00:00
Tim Starling
68ebdfc77f Optionally split out parts of file metadata to BlobStore
* Optionally store metadata in the database in JSON format instead of
  PHP serialization. The new JSON format has a top-level "envelope"
  array which gives us a place to store things that are not part of the
  handler metadata.
* Optionally split metadata items, putting items above a threshold into
  the text table. The FileRepo and MediaHandler must both opt in.
* For staged deployment, the read side of these changes is always
  active. Only the write side is configurable.

Bug: T275268
Change-Id: I876ea5c9d3a1881e278f689d2f8a3ae20240c703
2021-06-11 08:01:26 +10:00
Tim Starling
b4849e03b7 Use the unserialized form of image metadata internally
Image metadata is usually a serialized string representing an array.
Passing the string around internally and having everything unserialize
it is an awkward convention.

Also, many image handlers were reading the file twice: once for
getMetadata() and again for getImageSize(). Often getMetadata()
would actually read the width and height and then throw it away.

So, in filerepo:

* Add File::getMetadataItem(), which promises to allow partial
  loading of metadata per my proposal on T275268 in a future commit.
* Add File::getMetadataArray(), which returns the unserialized array.
  Some file handlers were returning non-serializable strings from
  getMetadata(), so I gave them a legacy array form ['_error' => ...]
* Changed MWFileProps to return the array form of metadata.
* Deprecate the weird File::getImageSize(). It was apparently not
  called by anything, but was overridden by UnregisteredLocalFile.
* Wrap serialize/unserialize with File::getMetadataForDb() and
  File::loadMetadataFromDb() in preparation for T275268.

In MediaHandler:

* Merged MediaHandler::getImageSize() and MediaHandler::getMetadata()
  into getSizeAndMetadata(). Deprecated the old methods.
* Instead of isMetadataValid() we now have isFileMetadataValid(), which
  only gets a File object, so it can decide what data it needs to load.
* Simplified getPageDimensions() by having it return false for non-paged
  media. It was not called in that case, but was implemented anyway.

In specific handlers:

* Rename DjVuHandler::getUnserializedMetadata() and
  extractTreesFromMetadata() for clarity. "Metadata" in these function
  names meant an XML string.
* Updated DjVuImage::getImageSize() to provide image sizes in the new
  style.
* In ExifBitmapHandler, getRotationForExif() now takes just the
  Orientation tag, rather than a serialized string. Also renamed for
  clarity.
* In GIFMetadataExtractor, return the width, height and bits per channel
  instead of throwing them away. There was some conflation in
  decodeBPP() which I picked apart. Refer to GIF89a section 18.
* In JpegMetadataExtractor, process the SOF0/SOF2 segment to extract
  bits per channel, width, height and components (channel count). This
  is essentially a port of PHP's getimagesize(), so should be bugwards
  compatible.
* In PNGMetadataExtractor, return the width and height, which were
  previously assigned to unused local variables. I verified the
  implementation by referring to the specification.
* In SvgHandler, retain the version validation from unpackMetadata(),
  but rename the function since it now takes an array as input.

In tests:

* In ExifBitmapTest, refactored some tests by using a provider.
* In GIFHandlerTest and PNGHandlerTest, I removed the tests in which
  getMetadata() returns null, since it doesn't make sense when ported to
  getMetadataArray(). I added tests for empty arrays instead.
* In tests, I retained serialization of input data since I figure it's
  useful to confirm that existing database rows will continue to be read
  correctly. I removed serialization of expected values, replacing them
  with plain data.
* In tests, I replaced access to private class constants like
  BROKEN_FILE with string literals, since stability is essential. If
  the class constant changes, the test should fail.

Elsewhere:

* In maintenance/refreshImageMetadata.php, I removed the check for
  shrinking image metadata, since it's not easy to implement and is
  not future compatible. Image metadata is expected to shrink in
  future.

Bug: T275268
Change-Id: I039785d5b6439d71dcc21dcb972177dba5c3a67d
2021-06-08 17:04:01 +10:00
Petr Pchelko
3e2259be93 ArchivedFile: replace ::getUser methods with ::getUploader
Change-Id: I43ea7be80e8546e217d6aa371bb48f0699129cc4
2021-06-04 12:07:07 -07:00
Petr Pchelko
eed864e311 Hard-deprecate File::getUser
Depends-On: Ia944ba04880b20d6da71ebccfc0cb7ecd9b37904
Depends-On: Ie9978d2ee692952b73832a859ea30e3f5457a743
Change-Id: I6885be24237cf222a2bd87156b64c784590b6efb
2021-06-02 13:49:00 -07:00
Petr Pchelko
5455e58967 Deprecate File::getUser in favor of File::getUploader
Change-Id: I8a45a8fdfa827f203e6bc123cb685d02c3612bb0
2021-06-02 09:06:09 -07:00
Tim Starling
f5afdc1e52 Add an integration test for some LocalFile methods
Change-Id: I9b9aae442d966a9adc850b320645571a57dcaf8c
2021-05-26 15:26:47 +00:00
Alexander Vorwerk
65015ad2d8 Hard deprecate global function wfLocalFile()
The global function wfLocalFile() is deprecated since 1.34 and unused.
This patch now makes it emit deprecation warnings.

Change-Id: Ib9f94b4f49e7720bd4455d019995037eaa4e3980
2021-05-18 14:59:25 +00:00
DannyS712
0483225872 LocalFileTest: use assertInstanceOf
Instead of reinventing the wheel

Change-Id: I5c596c0355c6a0f19abeb1927587bceec749f91c
2021-03-12 16:55:59 +00:00
addshore
959bc315f2 MediaWikiTestCase to MediaWikiIntegrationTestCase
The name change happened some time ago, and I think its
about time to start using the name name!
(Done with a find and replace)

My personal motivation for doing this is that I have started
trying out vscode as an IDE for mediawiki development, and
right now it doesn't appear to handle php aliases very well
or at all.

Change-Id: I412235d91ae26e4c1c6a62e0dbb7e7cf3c5ed4a6
2020-06-30 17:02:22 +01:00
Aryeh Gregor
b4a7620b9e Hard-deprecate Title::getUserCaseDBKey
The only use-case for this was if the local wiki's File namespace didn't
allow initial-lowercase names but a remote repo did. This case is
unlikely to be useful and was broken anyway, so it's now prohibited, and
getUserCaseDBKey is no longer needed. It's now an alias for getDBkey
until we remove it.

This includes a breaking change to
MediaWikiTitleCodec::splitTitleString, but that was always intended as
an internal method and I have now marked it officially as such. There is
one caller in code search outside core (JsonConfig), but it looks like
it will be unaffected.

Bug: T202094
Depends-On: I4b8ceb8a7f4624d6a3763aca6df41bf1a0d7293f
Depends-On: I724be15e93421f874fb202f0ec2ca6940e56a20a
Change-Id: I4fd64d4b0036b6dabdcfeee18766df18bf538542
2019-10-24 19:26:37 +03:00
Thiemo Kreuz
e06ce9f467 tests: Prefer PHPUnit's assertSame() when comparing empty strings
assertSame() is guaranteed to never do any magic type conversion.
This can be critical when accidentially comparing empty strings (a
value PHP considers to be "falsy") to false, 0, 0.0, null, and such.

Change-Id: I2e2685c5992cae252f629a68ffe1a049f2e5ed1b
2019-09-20 15:27:58 +00:00
daniel
bdc6b4e378 LocalFile: avoid hard failures on non-existing files.
Some methods on LocalFile will fatal if called on a non-existing file.
ApiQueryImageInfo did not take that into account.

This patch changes LocalFile to avoid fatal errors, and ApiQueryImageInfo
to not try and report information on non-existing files.

NOTE: the modified code has NO test coverage! This should be fixed
before this patch is applied, or the patch needs to be thoroughly tested
manually.

Bug: T221812
Change-Id: I9b74545a393d1b7a25c8262d4fe37a6492bbc11e
2019-09-18 09:18:44 +00:00
Thiemo Kreuz
c0e27fba86 Add missing property declarations to LocalFileTest
Change-Id: I518a7a939042110f7ca67b0697e35e2e35105a08
2019-01-02 15:25:44 +01:00
Umherirrender
45da581551 Use ::class to resolve class names in tests
This helps to find renamed or misspelled classes earlier.
Phan will check the class names

Change-Id: Ie541a7baae10ab6f5c13f95ac2ff6598b8f8950c
2018-01-26 22:49:13 +01:00
Reedy
1834ee3d8e Fix numerous class/function casing
Change-Id: I23982bfa0548c9ea3bdb432be7982f1563930715
2016-03-18 23:14:49 +00:00
Kunal Mehta
6e9b4f0e9c Convert all array() syntax to []
Per wikitech-l consensus:
 https://lists.wikimedia.org/pipermail/wikitech-l/2016-February/084821.html

Notes:
* Disabled CallTimePassByReference due to false positives (T127163)

Change-Id: I2c8ce713ce6600a0bb7bf67537c87044c7a45c4b
2016-02-17 01:33:00 -08:00
Kunal Mehta
790eb5810f LocalFileTest does not require database access
Change-Id: Iea4a2758a1169b212c15ceb4445b2e2a0458df1a
2015-03-26 17:36:14 -07:00
umherirrender
1d8b52fbfd Move Test files under same folder structure where class is (/includes/)
Change-Id: I95f1aa6f0ed2cc3306aa6e588a11f359854315c1
2014-12-16 21:56:44 +01:00
Renamed from tests/phpunit/includes/LocalFileTest.php (Browse further)