<?php

use MediaWiki\MainConfigNames;

/**
 * @group Media
Use the unserialized form of image metadata internally
Image metadata is usually a serialized string representing an array.
Passing the string around internally and having everything unserialize
it is an awkward convention.
Also, many image handlers were reading the file twice: once for
getMetadata() and again for getImageSize(). Often getMetadata()
would actually read the width and height and then throw it away.
So, in filerepo:
* Add File::getMetadataItem(), which promises to allow partial
loading of metadata per my proposal on T275268 in a future commit.
* Add File::getMetadataArray(), which returns the unserialized array.
Some file handlers were returning non-serializable strings from
getMetadata(), so I gave them a legacy array form ['_error' => ...]
* Changed MWFileProps to return the array form of metadata.
* Deprecate the weird File::getImageSize(). It was apparently not
called by anything, but was overridden by UnregisteredLocalFile.
* Wrap serialize/unserialize with File::getMetadataForDb() and
File::loadMetadataFromDb() in preparation for T275268.
In MediaHandler:
* Merged MediaHandler::getImageSize() and MediaHandler::getMetadata()
into getSizeAndMetadata(). Deprecated the old methods.
* Instead of isMetadataValid() we now have isFileMetadataValid(), which
only gets a File object, so it can decide what data it needs to load.
* Simplified getPageDimensions() by having it return false for non-paged
media. It was not called in that case, but was implemented anyway.
In specific handlers:
* Rename DjVuHandler::getUnserializedMetadata() and
extractTreesFromMetadata() for clarity. "Metadata" in these function
names meant an XML string.
* Updated DjVuImage::getImageSize() to provide image sizes in the new
style.
* In ExifBitmapHandler, getRotationForExif() now takes just the
Orientation tag, rather than a serialized string. Also renamed for
clarity.
* In GIFMetadataExtractor, return the width, height and bits per channel
instead of throwing them away. There was some conflation in
decodeBPP() which I picked apart. Refer to GIF89a section 18.
* In JpegMetadataExtractor, process the SOF0/SOF2 segment to extract
bits per channel, width, height and components (channel count). This
is essentially a port of PHP's getimagesize(), so should be bugwards
compatible.
* In PNGMetadataExtractor, return the width and height, which were
previously assigned to unused local variables. I verified the
implementation by referring to the specification.
* In SvgHandler, retain the version validation from unpackMetadata(),
but rename the function since it now takes an array as input.
In tests:
* In ExifBitmapTest, refactored some tests by using a provider.
* In GIFHandlerTest and PNGHandlerTest, I removed the tests in which
getMetadata() returns null, since it doesn't make sense when ported to
getMetadataArray(). I added tests for empty arrays instead.
* In tests, I retained serialization of input data since I figure it's
useful to confirm that existing database rows will continue to be read
correctly. I removed serialization of expected values, replacing them
with plain data.
* In tests, I replaced access to private class constants like
BROKEN_FILE with string literals, since stability is essential. If
the class constant changes, the test should fail.
Elsewhere:
* In maintenance/refreshImageMetadata.php, I removed the check for
shrinking image metadata, since it's not easy to implement and is
not future compatible. Image metadata is expected to shrink in
future.
Bug: T275268
Change-Id: I039785d5b6439d71dcc21dcb972177dba5c3a67d
2021-05-19 00:24:32 +00:00
 * @covers ExifBitmapHandler
 * @requires extension exif
 */
class ExifBitmapTest extends MediaWikiMediaTestCase {

	/**
	 * @var ExifBitmapHandler
	 */
	protected $handler;

	protected function setUp(): void {
		parent::setUp();
		$this->overrideConfigValue( MainConfigNames::ShowEXIF, true );

		$this->handler = new ExifBitmapHandler;
	}

	public static function provideIsFileMetadataValid() {
		return [
			'old broken' => [
				ExifBitmapHandler::OLD_BROKEN_FILE,
				ExifBitmapHandler::METADATA_COMPATIBLE
			],
			'broken' => [
				ExifBitmapHandler::BROKEN_FILE,
				ExifBitmapHandler::METADATA_GOOD
			],
			'invalid' => [
				'Something Invalid Here.',
				ExifBitmapHandler::METADATA_BAD
			],
			'good' => [
'a:16:{s:10:"ImageWidth";i:20;s:11:"ImageLength";i:20;s:13:"BitsPerSample";a:3:{i:0;i:8;i:1;i:8;i:2;i:8;}s:11:"Compression";i:5;s:25:"PhotometricInterpretation";i:2;s:16:"ImageDescription";s:17:"Created with GIMP";s:12:"StripOffsets";i:8;s:11:"Orientation";i:1;s:15:"SamplesPerPixel";i:3;s:12:"RowsPerStrip";i:64;s:15:"StripByteCounts";i:238;s:11:"XResolution";s:19:"1207959552/16777216";s:11:"YResolution";s:19:"1207959552/16777216";s:19:"PlanarConfiguration";i:1;s:14:"ResolutionUnit";i:2;s:22:"MEDIAWIKI_EXIF_VERSION";i:2;}' ,
				ExifBitmapHandler::METADATA_GOOD
			],
			'old good' => [
'a:16:{s:10:"ImageWidth";i:20;s:11:"ImageLength";i:20;s:13:"BitsPerSample";a:3:{i:0;i:8;i:1;i:8;i:2;i:8;}s:11:"Compression";i:5;s:25:"PhotometricInterpretation";i:2;s:16:"ImageDescription";s:17:"Created with GIMP";s:12:"StripOffsets";i:8;s:11:"Orientation";i:1;s:15:"SamplesPerPixel";i:3;s:12:"RowsPerStrip";i:64;s:15:"StripByteCounts";i:238;s:11:"XResolution";s:19:"1207959552/16777216";s:11:"YResolution";s:19:"1207959552/16777216";s:19:"PlanarConfiguration";i:1;s:14:"ResolutionUnit";i:2;s:22:"MEDIAWIKI_EXIF_VERSION";i:1;}' ,
				ExifBitmapHandler::METADATA_COMPATIBLE
			],
			// Handle metadata from paged tiff handler (gotten via instant commons) gracefully.
			'paged tiff' => [
'a:6:{s:9:"page_data";a:1:{i:1;a:5:{s:5:"width";i:643;s:6:"height";i:448;s:5:"alpha";s:4:"true";s:4:"page";i:1;s:6:"pixels";i:288064;}}s:10:"page_count";i:1;s:10:"first_page";i:1;s:9:"last_page";i:1;s:4:"exif";a:9:{s:10:"ImageWidth";i:643;s:11:"ImageLength";i:448;s:11:"Compression";i:5;s:25:"PhotometricInterpretation";i:2;s:11:"Orientation";i:1;s:15:"SamplesPerPixel";i:4;s:12:"RowsPerStrip";i:50;s:19:"PlanarConfiguration";i:1;s:22:"MEDIAWIKI_EXIF_VERSION";i:1;}s:21:"TIFF_METADATA_VERSION";s:3:"1.4";}' ,
				ExifBitmapHandler::METADATA_BAD
			],
		];
	}

	/** @dataProvider provideIsFileMetadataValid */
	public function testIsFileMetadataValid( $serializedMetadata, $expected ) {
		$file = $this->getMockFileWithMetadata( $serializedMetadata );
		$res = $this->handler->isFileMetadataValid( $file );
		$this->assertEquals( $expected, $res );
	}

	public function testConvertMetadataLatest() {
		$metadata = [
			'foo' => [ 'First', 'Second', '_type' => 'ol' ],
			'MEDIAWIKI_EXIF_VERSION' => 2
		];
		$res = $this->handler->convertMetadataVersion( $metadata, 2 );
		$this->assertEquals( $metadata, $res );
	}

	public function testConvertMetadataToOld() {
		$metadata = [
			'foo' => [ 'First', 'Second', '_type' => 'ol' ],
			'bar' => [ 'First', 'Second', '_type' => 'ul' ],
			'baz' => [ 'First', 'Second' ],
			'fred' => 'Single',
			'MEDIAWIKI_EXIF_VERSION' => 2,
		];
		$expected = [
			'foo' => "\n#First\n#Second",
			'bar' => "\n*First\n*Second",
			'baz' => "\n*First\n*Second",
			'fred' => 'Single',
			'MEDIAWIKI_EXIF_VERSION' => 1,
		];
		$res = $this->handler->convertMetadataVersion( $metadata, 1 );
		$this->assertEquals( $expected, $res );
	}

	public function testConvertMetadataSoftware() {
		$metadata = [
			'Software' => [ [ 'GIMP', '1.1' ] ],
			'MEDIAWIKI_EXIF_VERSION' => 2,
		];
		$expected = [
			'Software' => 'GIMP (Version 1.1)',
			'MEDIAWIKI_EXIF_VERSION' => 1,
		];
		$res = $this->handler->convertMetadataVersion( $metadata, 1 );
		$this->assertEquals( $expected, $res );
	}

	public function testConvertMetadataSoftwareNormal() {
		$metadata = [
			'Software' => [ "GIMP 1.2", "vim" ],
			'MEDIAWIKI_EXIF_VERSION' => 2,
		];
		$expected = [
			'Software' => "\n*GIMP 1.2\n*vim",
			'MEDIAWIKI_EXIF_VERSION' => 1,
		];
		$res = $this->handler->convertMetadataVersion( $metadata, 1 );
		$this->assertEquals( $expected, $res );
	}
}