2007-05-30 21:02:32 +00:00
|
|
|
<?php
|
|
|
|
|
/**
|
2012-05-07 07:11:33 +00:00
|
|
|
* Local file in the wiki's own database.
|
|
|
|
|
*
|
|
|
|
|
* This program is free software; you can redistribute it and/or modify
|
|
|
|
|
* it under the terms of the GNU General Public License as published by
|
|
|
|
|
* the Free Software Foundation; either version 2 of the License, or
|
|
|
|
|
* (at your option) any later version.
|
|
|
|
|
*
|
|
|
|
|
* This program is distributed in the hope that it will be useful,
|
|
|
|
|
* but WITHOUT ANY WARRANTY; without even the implied warranty of
|
|
|
|
|
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
|
|
|
|
|
* GNU General Public License for more details.
|
|
|
|
|
*
|
|
|
|
|
* You should have received a copy of the GNU General Public License along
|
|
|
|
|
* with this program; if not, write to the Free Software Foundation, Inc.,
|
|
|
|
|
* 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301, USA.
|
|
|
|
|
* http://www.gnu.org/copyleft/gpl.html
|
2010-09-04 18:13:18 +00:00
|
|
|
*
|
|
|
|
|
* @file
|
2012-02-08 15:51:16 +00:00
|
|
|
* @ingroup FileAbstraction
|
2007-05-30 21:02:32 +00:00
|
|
|
*/
|
|
|
|
|
|
2017-02-07 04:49:57 +00:00
|
|
|
use MediaWiki\Logger\LoggerFactory;
|
2020-01-10 00:00:51 +00:00
|
|
|
use MediaWiki\MediaWikiServices;
|
2021-04-05 19:43:12 +00:00
|
|
|
use MediaWiki\Permissions\Authority;
|
2020-06-09 23:20:52 +00:00
|
|
|
use MediaWiki\Revision\RevisionRecord;
|
2020-07-03 00:20:38 +00:00
|
|
|
use MediaWiki\Revision\RevisionStore;
|
2021-06-02 04:34:38 +00:00
|
|
|
use MediaWiki\Storage\BlobStore;
|
2021-04-05 19:43:12 +00:00
|
|
|
use MediaWiki\User\UserIdentity;
|
2021-05-27 19:56:40 +00:00
|
|
|
use MediaWiki\User\UserIdentityValue;
|
Use the unserialized form of image metadata internally
Image metadata is usually a serialized string representing an array.
Passing the string around internally and having everything unserialize
it is an awkward convention.
Also, many image handlers were reading the file twice: once for
getMetadata() and again for getImageSize(). Often getMetadata()
would actually read the width and height and then throw it away.
So, in filerepo:
* Add File::getMetadataItem(), which promises to allow partial
loading of metadata per my proposal on T275268 in a future commit.
* Add File::getMetadataArray(), which returns the unserialized array.
Some file handlers were returning non-serializable strings from
getMetadata(), so I gave them a legacy array form ['_error' => ...]
* Changed MWFileProps to return the array form of metadata.
* Deprecate the weird File::getImageSize(). It was apparently not
called by anything, but was overridden by UnregisteredLocalFile.
* Wrap serialize/unserialize with File::getMetadataForDb() and
File::loadMetadataFromDb() in preparation for T275268.
In MediaHandler:
* Merged MediaHandler::getImageSize() and MediaHandler::getMetadata()
into getSizeAndMetadata(). Deprecated the old methods.
* Instead of isMetadataValid() we now have isFileMetadataValid(), which
only gets a File object, so it can decide what data it needs to load.
* Simplified getPageDimensions() by having it return false for non-paged
media. It was not called in that case, but was implemented anyway.
In specific handlers:
* Rename DjVuHandler::getUnserializedMetadata() and
extractTreesFromMetadata() for clarity. "Metadata" in these function
names meant an XML string.
* Updated DjVuImage::getImageSize() to provide image sizes in the new
style.
* In ExifBitmapHandler, getRotationForExif() now takes just the
Orientation tag, rather than a serialized string. Also renamed for
clarity.
* In GIFMetadataExtractor, return the width, height and bits per channel
instead of throwing them away. There was some conflation in
decodeBPP() which I picked apart. Refer to GIF89a section 18.
* In JpegMetadataExtractor, process the SOF0/SOF2 segment to extract
bits per channel, width, height and components (channel count). This
is essentially a port of PHP's getimagesize(), so should be bugwards
compatible.
* In PNGMetadataExtractor, return the width and height, which were
previously assigned to unused local variables. I verified the
implementation by referring to the specification.
* In SvgHandler, retain the version validation from unpackMetadata(),
but rename the function since it now takes an array as input.
In tests:
* In ExifBitmapTest, refactored some tests by using a provider.
* In GIFHandlerTest and PNGHandlerTest, I removed the tests in which
getMetadata() returns null, since it doesn't make sense when ported to
getMetadataArray(). I added tests for empty arrays instead.
* In tests, I retained serialization of input data since I figure it's
useful to confirm that existing database rows will continue to be read
correctly. I removed serialization of expected values, replacing them
with plain data.
* In tests, I replaced access to private class constants like
BROKEN_FILE with string literals, since stability is essential. If
the class constant changes, the test should fail.
Elsewhere:
* In maintenance/refreshImageMetadata.php, I removed the check for
shrinking image metadata, since it's not easy to implement and is
not future compatible. Image metadata is expected to shrink in
future.
Bug: T275268
Change-Id: I039785d5b6439d71dcc21dcb972177dba5c3a67d
2021-05-19 00:24:32 +00:00
|
|
|
use Wikimedia\Rdbms\Blob;
|
2017-02-07 04:49:57 +00:00
|
|
|
use Wikimedia\Rdbms\Database;
|
2017-02-10 18:09:05 +00:00
|
|
|
use Wikimedia\Rdbms\IDatabase;
|
2019-07-01 20:30:14 +00:00
|
|
|
use Wikimedia\Rdbms\IResultWrapper;
|
2016-06-30 00:22:03 +00:00
|
|
|
|
2007-05-30 21:02:32 +00:00
|
|
|
/**
|
|
|
|
|
* Class to represent a local file in the wiki's own database
|
|
|
|
|
*
|
|
|
|
|
* Provides methods to retrieve paths (physical, logical, URL),
|
|
|
|
|
* to generate image thumbnails or for uploading.
|
|
|
|
|
*
|
2007-05-31 01:43:41 +00:00
|
|
|
* Note that only the repo object knows what its file class is called. You should
|
2008-04-14 07:45:50 +00:00
|
|
|
* never name a file class explicitly outside of the repo class. Instead use the
|
2007-05-31 01:43:41 +00:00
|
|
|
* repo's factory functions to generate file objects, for example:
|
|
|
|
|
*
|
2013-02-03 19:42:08 +00:00
|
|
|
* RepoGroup::singleton()->getLocalRepo()->newFile( $title );
|
2007-05-31 01:43:41 +00:00
|
|
|
*
|
2019-05-14 17:00:34 +00:00
|
|
|
* Consider the services container below:
|
|
|
|
|
*
|
|
|
|
|
* $services = MediaWikiServices::getInstance();
|
|
|
|
|
*
|
|
|
|
|
* The convenience services $services->getRepoGroup()->getLocalRepo()->newFile()
|
|
|
|
|
* and $services->getRepoGroup()->findFile() should be sufficient in most cases.
|
|
|
|
|
*
|
|
|
|
|
* @TODO: DI - Instead of using MediaWikiServices::getInstance(), a service should
|
|
|
|
|
* ideally accept a RepoGroup in its constructor and then use $this->repoGroup->findFile()
|
|
|
|
|
* and $this->repoGroup->getLocalRepo()->newFile().
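*
* A minimal sketch of that injection pattern; the class name MyService is
* hypothetical and only illustrates the idea:
*
*     class MyService {
*         private $repoGroup;
*
*         public function __construct( RepoGroup $repoGroup ) {
*             $this->repoGroup = $repoGroup;
*         }
*
*         public function lookup( $title ) {
*             return $this->repoGroup->findFile( $title );
*         }
*     }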
|
2007-05-31 01:43:41 +00:00
|
|
|
*
|
2020-07-13 09:00:30 +00:00
|
|
|
* @stable to extend
|
2012-02-08 15:51:16 +00:00
|
|
|
* @ingroup FileAbstraction
|
2007-05-30 21:02:32 +00:00
|
|
|
*/
|
2010-02-07 19:31:24 +00:00
|
|
|
class LocalFile extends File {
|
2021-05-19 00:24:32 +00:00
|
|
|
private const VERSION = 13; // cache version
|
2016-08-30 06:22:22 +00:00
|
|
|
|
2020-05-15 22:16:46 +00:00
|
|
|
private const CACHE_FIELD_MAX_LEN = 1000;
|
2012-04-06 17:38:38 +00:00
|
|
|
|
2021-06-04 06:46:47 +00:00
|
|
|
/** @var string Metadata serialization: empty string. This is a compact non-legacy format. */
|
|
|
|
|
private const MDS_EMPTY = 'empty';
|
|
|
|
|
|
|
|
|
|
/** @var string Metadata serialization: some other string */
|
|
|
|
|
private const MDS_LEGACY = 'legacy';
|
|
|
|
|
|
|
|
|
|
/** @var string Metadata serialization: PHP serialize() */
|
|
|
|
|
private const MDS_PHP = 'php';
|
|
|
|
|
|
|
|
|
|
/** @var string Metadata serialization: JSON */
|
|
|
|
|
private const MDS_JSON = 'json';
|
|
|
|
|
|
2013-11-23 22:21:10 +00:00
|
|
|
/** @var bool Does the file exist on disk? (loadFromXxx) */
|
|
|
|
|
protected $fileExists;
|
2012-04-07 19:10:02 +00:00
|
|
|
|
2014-07-24 17:43:03 +00:00
|
|
|
/** @var int Image width */
|
2013-11-23 22:21:10 +00:00
|
|
|
protected $width;
|
|
|
|
|
|
2014-07-24 17:43:03 +00:00
|
|
|
/** @var int Image height */
|
2013-11-23 22:21:10 +00:00
|
|
|
protected $height;
|
|
|
|
|
|
|
|
|
|
/** @var int Returned by getimagesize (loadFromXxx) */
|
|
|
|
|
protected $bits;
|
|
|
|
|
|
|
|
|
|
/** @var string MEDIATYPE_xxx (bitmap, drawing, audio...) */
|
|
|
|
|
protected $media_type;
|
|
|
|
|
|
2018-09-21 01:40:59 +00:00
|
|
|
/** @var string MIME type, determined by MimeAnalyzer::guessMimeType */
|
2013-11-23 22:21:10 +00:00
|
|
|
protected $mime;
|
|
|
|
|
|
|
|
|
|
/** @var int Size in bytes (loadFromXxx) */
|
|
|
|
|
protected $size;
|
|
|
|
|
|
2021-05-19 00:24:32 +00:00
|
|
|
/** @var array Unserialized metadata */
|
|
|
|
|
protected $metadataArray = [];
|
2013-11-23 22:21:10 +00:00
|
|
|
|
2021-06-04 06:46:47 +00:00
|
|
|
/**
|
|
|
|
|
* One of the MDS_* constants, giving the format of the metadata as stored
|
|
|
|
|
* in the DB, or null if the data was not loaded from the DB.
|
|
|
|
|
*
|
|
|
|
|
* @var string|null
|
|
|
|
|
*/
|
|
|
|
|
protected $metadataSerializationFormat;
|
|
|
|
|
|
2021-06-02 04:34:38 +00:00
|
|
|
/** @var string[] Map of metadata item name to blob address */
|
|
|
|
|
protected $metadataBlobs = [];
|
|
|
|
|
|
|
|
|
|
/**
|
|
|
|
|
* Map of metadata item name to blob address for items that exist but
|
|
|
|
|
* have not yet been loaded into $this->metadataArray
|
|
|
|
|
*
|
|
|
|
|
* @var string[]
|
|
|
|
|
*/
|
|
|
|
|
protected $unloadedMetadataBlobs = [];
|
|
|
|
|
|
2013-11-23 22:21:10 +00:00
|
|
|
/** @var string SHA-1 base 36 content hash */
|
|
|
|
|
protected $sha1;
|
|
|
|
|
|
|
|
|
|
/** @var bool Whether or not core data has been loaded from the database (loadFromXxx) */
|
|
|
|
|
protected $dataLoaded;
|
|
|
|
|
|
|
|
|
|
/** @var bool Whether or not lazy-loaded data has been loaded from the database */
|
|
|
|
|
protected $extraDataLoaded;
|
|
|
|
|
|
|
|
|
|
/** @var int Bitfield akin to rev_deleted */
|
|
|
|
|
protected $deleted;
|
|
|
|
|
|
|
|
|
|
/** @var string */
|
2018-01-13 00:02:09 +00:00
|
|
|
protected $repoClass = LocalRepo::class;
|
2011-11-07 21:54:19 +00:00
|
|
|
|
2013-11-23 22:21:10 +00:00
|
|
|
/** @var int Number of the line to be returned by nextHistoryLine() (constructor) */
|
|
|
|
|
private $historyLine;
|
|
|
|
|
|
2019-07-01 20:30:14 +00:00
|
|
|
/** @var IResultWrapper|null Result of the query for the file's history (nextHistoryLine) */
|
2013-11-23 22:21:10 +00:00
|
|
|
private $historyRes;
|
|
|
|
|
|
2014-07-24 14:04:48 +00:00
|
|
|
/** @var string Major MIME type */
|
2013-11-23 22:21:10 +00:00
|
|
|
private $major_mime;
|
|
|
|
|
|
2014-07-24 14:04:48 +00:00
|
|
|
/** @var string Minor MIME type */
|
2013-11-23 22:21:10 +00:00
|
|
|
private $minor_mime;
|
|
|
|
|
|
|
|
|
|
/** @var string Upload timestamp */
|
|
|
|
|
private $timestamp;
|
|
|
|
|
|
2021-05-27 19:56:40 +00:00
|
|
|
/** @var UserIdentity|null Uploader */
|
2013-11-23 22:21:10 +00:00
|
|
|
private $user;
|
|
|
|
|
|
|
|
|
|
/** @var string Description of current revision of the file */
|
|
|
|
|
private $description;
|
|
|
|
|
|
2015-02-05 03:43:22 +00:00
|
|
|
/** @var string TS_MW timestamp of the last change of the file description */
|
|
|
|
|
private $descriptionTouched;
|
|
|
|
|
|
2013-11-23 22:21:10 +00:00
|
|
|
/** @var bool Whether the row was upgraded on load */
|
|
|
|
|
private $upgraded;
|
|
|
|
|
|
2016-08-11 16:19:09 +00:00
|
|
|
/** @var bool Whether the row was scheduled to upgrade on load */
|
|
|
|
|
private $upgrading;
|
|
|
|
|
|
2021-09-20 21:26:42 +00:00
|
|
|
/** @var int If >= 1 the image row is locked */
|
2013-11-23 22:21:10 +00:00
|
|
|
private $locked;
|
|
|
|
|
|
|
|
|
|
/** @var bool True if the image row is locked with a lock-initiated transaction */
|
|
|
|
|
private $lockedOwnTrx;
|
|
|
|
|
|
|
|
|
|
/** @var bool True if file is not present in file system. Not to be cached in memcached */
|
|
|
|
|
private $missing;
|
|
|
|
|
|
2015-03-05 01:02:05 +00:00
|
|
|
// @note: higher than IDBAccessObject constants
|
2020-05-15 22:16:46 +00:00
|
|
|
private const LOAD_ALL = 16; // integer; load all the lazy fields too (like metadata)
|
2014-04-23 00:04:10 +00:00
|
|
|
|
2020-05-15 22:16:46 +00:00
|
|
|
private const ATOMIC_SECTION_LOCK = 'LocalFile::lockingTransaction';
|
2016-07-14 21:51:25 +00:00
|
|
|
|
2007-05-31 01:43:41 +00:00
|
|
|
/**
|
|
|
|
|
* Create a LocalFile from a title
|
|
|
|
|
* Do not call this except from inside a repo class.
|
2008-04-05 12:26:10 +00:00
|
|
|
*
|
|
|
|
|
* Note: $unused param is only here to avoid an E_STRICT
|
2011-05-28 18:59:42 +00:00
|
|
|
*
|
2020-07-13 08:57:12 +00:00
|
|
|
* @stable to override
|
2020-07-09 09:03:42 +00:00
|
|
|
*
|
2013-12-04 16:18:05 +00:00
|
|
|
* @param Title $title
|
|
|
|
|
* @param FileRepo $repo
|
2014-04-19 15:19:17 +00:00
|
|
|
* @param null $unused
|
2011-05-28 18:59:42 +00:00
|
|
|
*
|
2019-05-22 19:26:35 +00:00
|
|
|
* @return static
|
2007-05-31 01:43:41 +00:00
|
|
|
*/
|
2020-05-18 01:39:53 +00:00
|
|
|
public static function newFromTitle( $title, $repo, $unused = null ) {
|
2019-05-22 19:26:35 +00:00
|
|
|
return new static( $title, $repo );
|
2007-05-30 21:02:32 +00:00
|
|
|
}
|
|
|
|
|
|
2007-05-31 01:43:41 +00:00
|
|
|
/**
|
|
|
|
|
* Create a LocalFile from a title
|
|
|
|
|
* Do not call this except from inside a repo class.
|
2011-05-28 18:59:42 +00:00
|
|
|
*
|
2020-07-13 08:57:12 +00:00
|
|
|
* @stable to override
|
2020-07-09 09:03:42 +00:00
|
|
|
*
|
2013-12-04 16:18:05 +00:00
|
|
|
* @param stdClass $row
|
|
|
|
|
* @param FileRepo $repo
|
2011-05-28 18:59:42 +00:00
|
|
|
*
|
2019-05-22 19:26:35 +00:00
|
|
|
* @return static
|
2007-05-31 01:43:41 +00:00
|
|
|
*/
|
2020-05-18 01:39:53 +00:00
|
|
|
public static function newFromRow( $row, $repo ) {
|
2008-12-01 17:14:30 +00:00
|
|
|
$title = Title::makeTitle( NS_FILE, $row->img_name );
|
2019-05-22 19:26:35 +00:00
|
|
|
$file = new static( $title, $repo );
|
2007-05-30 21:02:32 +00:00
|
|
|
$file->loadFromRow( $row );
|
2010-09-04 13:48:16 +00:00
|
|
|
|
2007-05-30 21:02:32 +00:00
|
|
|
return $file;
|
|
|
|
|
}
|
2010-09-04 01:06:34 +00:00
|
|
|
|
2008-05-20 02:58:40 +00:00
|
|
|
/**
|
|
|
|
|
* Create a LocalFile from a SHA-1 key
|
|
|
|
|
* Do not call this except from inside a repo class.
|
2011-05-28 18:59:42 +00:00
|
|
|
*
|
2020-07-13 08:57:12 +00:00
|
|
|
* @stable to override
|
2020-07-09 09:03:42 +00:00
|
|
|
*
|
2014-07-24 17:43:03 +00:00
|
|
|
* @param string $sha1 Base-36 SHA-1
|
2013-12-04 16:18:05 +00:00
|
|
|
* @param LocalRepo $repo
|
2011-10-19 04:06:16 +00:00
|
|
|
* @param string|bool $timestamp MW_timestamp (optional)
|
|
|
|
|
* @return bool|LocalFile
|
2008-05-20 02:58:40 +00:00
|
|
|
*/
|
2020-05-18 01:39:53 +00:00
|
|
|
public static function newFromKey( $sha1, $repo, $timestamp = false ) {
|
2016-11-18 15:42:39 +00:00
|
|
|
$dbr = $repo->getReplicaDB();
|
2010-09-04 13:48:16 +00:00
|
|
|
|
2016-02-17 09:09:32 +00:00
|
|
|
$conds = [ 'img_sha1' => $sha1 ];
|
2010-09-04 13:48:16 +00:00
|
|
|
if ( $timestamp ) {
|
2011-10-19 04:06:16 +00:00
|
|
|
$conds['img_timestamp'] = $dbr->timestamp( $timestamp );
|
2008-05-20 02:58:40 +00:00
|
|
|
}
|
2010-09-04 13:48:16 +00:00
|
|
|
|
2019-05-22 19:26:35 +00:00
|
|
|
$fileQuery = static::getQueryInfo();
|
2017-10-06 17:03:55 +00:00
|
|
|
$row = $dbr->selectRow(
|
|
|
|
|
$fileQuery['tables'], $fileQuery['fields'], $conds, __METHOD__, [], $fileQuery['joins']
|
|
|
|
|
);
|
2010-09-04 13:48:16 +00:00
|
|
|
if ( $row ) {
|
2019-05-22 19:26:35 +00:00
|
|
|
return static::newFromRow( $row, $repo );
|
2008-05-20 02:58:40 +00:00
|
|
|
} else {
|
|
|
|
|
return false;
|
|
|
|
|
}
|
|
|
|
|
}
|
2010-09-04 01:06:34 +00:00
|
|
|
|
2017-10-06 17:03:55 +00:00
|
|
|
/**
|
|
|
|
|
* Return the tables, fields, and join conditions to be selected to create
|
|
|
|
|
* a new LocalFile object.
|
2021-04-19 00:55:24 +00:00
|
|
|
*
|
|
|
|
|
* Since 1.34, img_user and img_user_text have not been present in the
|
|
|
|
|
* database, but they continue to be available in query results as
|
|
|
|
|
* aliases.
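*
* A minimal usage sketch, mirroring self::newFromKey(). It assumes $dbr is a
* replica IDatabase connection and $conds is a caller-supplied conditions array:
*
*     $fileQuery = static::getQueryInfo();
*     $row = $dbr->selectRow(
*         $fileQuery['tables'], $fileQuery['fields'], $conds, __METHOD__, [], $fileQuery['joins']
*     );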
|
|
|
|
|
*
|
2017-10-06 17:03:55 +00:00
|
|
|
* @since 1.31
|
2020-07-13 08:57:12 +00:00
|
|
|
* @stable to override
|
2020-07-09 09:03:42 +00:00
|
|
|
*
|
2017-10-06 17:03:55 +00:00
|
|
|
* @param string[] $options
|
|
|
|
|
* - omit-lazy: Omit fields that are lazily cached.
|
2018-04-19 08:30:33 +00:00
|
|
|
* @return array[] With three keys:
|
2017-10-06 17:03:55 +00:00
|
|
|
* - tables: (string[]) to include in the `$table` to `IDatabase->select()`
|
|
|
|
|
* - fields: (string[]) to include in the `$vars` to `IDatabase->select()`
|
|
|
|
|
* - joins: (array) to include in the `$join_conds` to `IDatabase->select()`
|
|
|
|
|
*/
|
|
|
|
|
public static function getQueryInfo( array $options = [] ) {
|
2018-05-04 13:39:33 +00:00
|
|
|
$commentQuery = MediaWikiServices::getInstance()->getCommentStore()->getJoin( 'img_description' );
|
2017-10-06 17:03:55 +00:00
|
|
|
$ret = [
|
2021-04-19 00:55:24 +00:00
|
|
|
'tables' => [
|
|
|
|
|
'image',
|
|
|
|
|
'image_actor' => 'actor'
|
|
|
|
|
] + $commentQuery['tables'],
|
2017-10-06 17:03:55 +00:00
|
|
|
'fields' => [
|
|
|
|
|
'img_name',
|
|
|
|
|
'img_size',
|
|
|
|
|
'img_width',
|
|
|
|
|
'img_height',
|
|
|
|
|
'img_metadata',
|
|
|
|
|
'img_bits',
|
|
|
|
|
'img_media_type',
|
|
|
|
|
'img_major_mime',
|
|
|
|
|
'img_minor_mime',
|
|
|
|
|
'img_timestamp',
|
|
|
|
|
'img_sha1',
|
2021-04-19 00:55:24 +00:00
|
|
|
'img_actor',
|
|
|
|
|
'img_user' => 'image_actor.actor_user',
|
|
|
|
|
'img_user_text' => 'image_actor.actor_name',
|
|
|
|
|
] + $commentQuery['fields'],
|
|
|
|
|
'joins' => [
|
|
|
|
|
'image_actor' => [ 'JOIN', 'actor_id=img_actor' ]
|
|
|
|
|
] + $commentQuery['joins'],
|
2017-10-06 17:03:55 +00:00
|
|
|
];
|
|
|
|
|
|
|
|
|
|
if ( in_array( 'omit-nonlazy', $options, true ) ) {
|
|
|
|
|
// Internal use only for getting only the lazy fields
|
|
|
|
|
$ret['fields'] = [];
|
|
|
|
|
}
|
|
|
|
|
if ( !in_array( 'omit-lazy', $options, true ) ) {
|
2021-06-03 01:34:31 +00:00
|
|
|
// Note: Keep this in sync with self::getLazyCacheFields() and
|
|
|
|
|
// self::loadExtraFromDB()
|
2017-10-06 17:03:55 +00:00
|
|
|
$ret['fields'][] = 'img_metadata';
|
|
|
|
|
}
|
|
|
|
|
|
|
|
|
|
return $ret;
|
|
|
|
|
}
|
|
|
|
|
|
2007-05-31 01:43:41 +00:00
|
|
|
/**
|
|
|
|
|
* Do not call this except from inside a repo class.
|
2020-07-13 08:53:06 +00:00
|
|
|
* @stable to call
|
2020-07-09 09:03:42 +00:00
|
|
|
*
|
2014-04-19 15:19:17 +00:00
|
|
|
* @param Title $title
|
|
|
|
|
* @param FileRepo $repo
|
2007-05-31 01:43:41 +00:00
|
|
|
*/
|
2020-05-18 01:39:53 +00:00
|
|
|
public function __construct( $title, $repo ) {
|
2007-05-30 21:02:32 +00:00
|
|
|
parent::__construct( $title, $repo );
|
2010-09-04 13:48:16 +00:00
|
|
|
|
2007-05-30 21:02:32 +00:00
|
|
|
$this->historyLine = 0;
|
2007-07-07 03:04:20 +00:00
|
|
|
$this->historyRes = null;
|
2007-05-30 21:02:32 +00:00
|
|
|
$this->dataLoaded = false;
|
2013-01-24 03:20:57 +00:00
|
|
|
$this->extraDataLoaded = false;
|
2011-11-07 21:54:19 +00:00
|
|
|
|
|
|
|
|
$this->assertRepoDefined();
|
|
|
|
|
$this->assertTitleDefined();
|
2007-05-30 21:02:32 +00:00
|
|
|
}
|
|
|
|
|
|
2020-03-18 03:38:50 +00:00
|
|
|
/**
|
|
|
|
|
* @return LocalRepo|bool
|
|
|
|
|
*/
|
|
|
|
|
public function getRepo() {
|
|
|
|
|
return $this->repo;
|
|
|
|
|
}
|
|
|
|
|
|
2007-05-30 21:02:32 +00:00
|
|
|
/**
|
2010-09-04 01:06:34 +00:00
|
|
|
* Get the memcached key for the main data for this file, or false if
|
2009-06-17 07:31:00 +00:00
|
|
|
* there is no access to the shared cache.
|
2020-07-13 08:57:12 +00:00
|
|
|
* @stable to override
|
2014-04-23 00:04:10 +00:00
|
|
|
* @return string|bool
|
2007-05-30 21:02:32 +00:00
|
|
|
*/
|
2020-05-18 01:39:53 +00:00
|
|
|
protected function getCacheKey() {
|
2016-08-30 06:22:22 +00:00
|
|
|
return $this->repo->getSharedCacheKey( 'file', sha1( $this->getName() ) );
|
2007-05-30 21:02:32 +00:00
|
|
|
}
|
|
|
|
|
|
2016-09-03 04:43:16 +00:00
|
|
|
/**
|
|
|
|
|
* @param WANObjectCache $cache
|
|
|
|
|
* @return string[]
|
|
|
|
|
* @since 1.28
|
|
|
|
|
*/
|
|
|
|
|
public function getMutableCacheKeys( WANObjectCache $cache ) {
|
|
|
|
|
return [ $this->getCacheKey() ];
|
|
|
|
|
}
|
|
|
|
|
|
2007-05-30 21:02:32 +00:00
|
|
|
/**
|
2016-08-30 06:22:22 +00:00
|
|
|
* Try to load file metadata from memcached, falling back to the database
|
2007-05-30 21:02:32 +00:00
|
|
|
*/
|
2016-03-24 19:09:24 +00:00
|
|
|
private function loadFromCache() {
|
2007-05-30 21:02:32 +00:00
|
|
|
$this->dataLoaded = false;
|
2013-01-24 03:20:57 +00:00
|
|
|
$this->extraDataLoaded = false;
|
2010-09-04 13:48:16 +00:00
|
|
|
|
2016-08-30 06:22:22 +00:00
|
|
|
$key = $this->getCacheKey();
|
2007-05-30 21:02:32 +00:00
|
|
|
if ( !$key ) {
|
2016-08-30 06:22:22 +00:00
|
|
|
$this->loadFromDB( self::READ_NORMAL );
|
2007-05-30 21:02:32 +00:00
|
|
|
|
2016-08-30 06:22:22 +00:00
|
|
|
return;
|
2007-05-30 21:02:32 +00:00
|
|
|
}
|
2010-09-04 13:48:16 +00:00
|
|
|
|
2018-05-04 13:39:33 +00:00
|
|
|
$cache = MediaWikiServices::getInstance()->getMainWANObjectCache();
|
2016-08-30 06:22:22 +00:00
|
|
|
$cachedValues = $cache->getWithSetCallback(
|
|
|
|
|
$key,
|
|
|
|
|
$cache::TTL_WEEK,
|
|
|
|
|
function ( $oldValue, &$ttl, array &$setOpts ) use ( $cache ) {
|
2016-11-18 15:42:39 +00:00
|
|
|
$setOpts += Database::getCacheSetOptions( $this->repo->getReplicaDB() );
|
2016-08-30 06:22:22 +00:00
|
|
|
|
|
|
|
|
$this->loadFromDB( self::READ_NORMAL );
|
|
|
|
|
|
|
|
|
|
$fields = $this->getCacheFields( '' );
|
2019-08-30 17:56:27 +00:00
|
|
|
$cacheVal = [];
|
2016-08-30 06:22:22 +00:00
|
|
|
$cacheVal['fileExists'] = $this->fileExists;
|
|
|
|
|
if ( $this->fileExists ) {
|
|
|
|
|
foreach ( $fields as $field ) {
|
|
|
|
|
$cacheVal[$field] = $this->$field;
|
|
|
|
|
}
|
|
|
|
|
}
|
2021-05-27 19:56:40 +00:00
|
|
|
if ( $this->user ) {
|
|
|
|
|
$cacheVal['user'] = $this->user->getId();
|
|
|
|
|
$cacheVal['user_text'] = $this->user->getName();
|
|
|
|
|
}
|
2021-06-02 04:34:38 +00:00
|
|
|
|
|
|
|
|
// Don't cache metadata items stored as blobs, since they tend to be large
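// Illustration with hypothetical values: if metadataArray is
// [ 'a' => 1, 'b' => 2 ] and metadataBlobs is [ 'b' => '<blob address>' ],
// array_diff_key() leaves [ 'a' => 1 ] to be cached inline.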
|
|
|
|
|
if ( $this->metadataBlobs ) {
|
|
|
|
|
$cacheVal['metadata'] = array_diff_key(
|
|
|
|
|
$this->metadataArray, $this->metadataBlobs );
|
|
|
|
|
// Save the blob addresses
|
|
|
|
|
$cacheVal['metadataBlobs'] = $this->metadataBlobs;
|
|
|
|
|
} else {
|
|
|
|
|
$cacheVal['metadata'] = $this->metadataArray;
|
|
|
|
|
}
|
2017-09-12 17:12:29 +00:00
|
|
|
|
2016-08-30 06:22:22 +00:00
|
|
|
// Strip off excessive entries from the subset of fields that can become large.
|
2021-06-09 19:15:43 +00:00
|
|
|
// If the cache value gets too large, it might not fit in the cache,
|
|
|
|
|
// causing repeat database queries for each access to the file.
|
2016-08-30 06:22:22 +00:00
|
|
|
foreach ( $this->getLazyCacheFields( '' ) as $field ) {
|
|
|
|
|
if ( isset( $cacheVal[$field] )
|
2021-05-19 00:24:32 +00:00
|
|
|
&& strlen( serialize( $cacheVal[$field] ) ) > 100 * 1024
|
2016-08-30 06:22:22 +00:00
|
|
|
) {
|
|
|
|
|
unset( $cacheVal[$field] ); // don't let the value get too big
|
2021-06-02 04:34:38 +00:00
|
|
|
if ( $field === 'metadata' ) {
|
|
|
|
|
unset( $cacheVal['metadataBlobs'] );
|
|
|
|
|
}
|
2016-08-30 06:22:22 +00:00
|
|
|
}
|
|
|
|
|
}
|
2010-09-04 13:48:16 +00:00
|
|
|
|
2016-08-30 06:22:22 +00:00
|
|
|
if ( $this->fileExists ) {
|
|
|
|
|
$ttl = $cache->adaptiveTTL( wfTimestamp( TS_UNIX, $this->timestamp ), $ttl );
|
|
|
|
|
} else {
|
|
|
|
|
$ttl = $cache::TTL_DAY;
|
|
|
|
|
}
|
2010-09-04 13:48:16 +00:00
|
|
|
|
2016-08-30 06:22:22 +00:00
|
|
|
return $cacheVal;
|
|
|
|
|
},
|
|
|
|
|
[ 'version' => self::VERSION ]
|
|
|
|
|
);
|
2010-09-04 13:48:16 +00:00
|
|
|
|
2016-08-30 06:22:22 +00:00
|
|
|
$this->fileExists = $cachedValues['fileExists'];
|
2007-05-30 21:02:32 +00:00
|
|
|
if ( $this->fileExists ) {
|
2016-08-30 06:22:22 +00:00
|
|
|
$this->setProps( $cachedValues );
|
2007-05-30 21:02:32 +00:00
|
|
|
}
|
|
|
|
|
|
2016-08-30 06:22:22 +00:00
|
|
|
$this->dataLoaded = true;
|
|
|
|
|
$this->extraDataLoaded = true;
|
2013-01-24 03:20:57 +00:00
|
|
|
foreach ( $this->getLazyCacheFields( '' ) as $field ) {
|
2016-08-30 06:22:22 +00:00
|
|
|
$this->extraDataLoaded = $this->extraDataLoaded && isset( $cachedValues[$field] );
|
2013-01-24 03:20:57 +00:00
|
|
|
}
|
2015-04-28 00:26:58 +00:00
|
|
|
}
|
|
|
|
|
|
|
|
|
|
/**
|
|
|
|
|
* Purge the file object/metadata cache
|
|
|
|
|
*/
|
2015-11-01 22:29:05 +00:00
|
|
|
public function invalidateCache() {
|
2015-04-28 00:26:58 +00:00
|
|
|
$key = $this->getCacheKey();
|
|
|
|
|
if ( !$key ) {
|
|
|
|
|
return;
|
|
|
|
|
}
|
|
|
|
|
|
2021-04-19 01:32:42 +00:00
|
|
|
$this->repo->getPrimaryDB()->onTransactionPreCommitOrIdle(
|
2021-02-10 22:31:02 +00:00
|
|
|
static function () use ( $key ) {
|
2018-05-04 13:39:33 +00:00
|
|
|
MediaWikiServices::getInstance()->getMainWANObjectCache()->delete( $key );
|
2016-09-15 21:40:00 +00:00
|
|
|
},
|
|
|
|
|
__METHOD__
|
|
|
|
|
);
|
2007-05-30 21:02:32 +00:00
|
|
|
}
|
|
|
|
|
|
|
|
|
|
/**
|
|
|
|
|
* Load metadata from the file itself
|
2021-05-19 00:24:32 +00:00
|
|
|
*
|
|
|
|
|
* @internal
|
|
|
|
|
* @param string|null $path The path or virtual URL to load from, or null
|
|
|
|
|
* to use the previously stored file.
|
2007-05-30 21:02:32 +00:00
|
|
|
*/
|
2021-05-19 00:24:32 +00:00
|
|
|
public function loadFromFile( $path = null ) {
|
|
|
|
|
$props = $this->repo->getFileProps( $path ?? $this->getVirtualUrl() );
|
2011-12-20 03:52:06 +00:00
|
|
|
$this->setProps( $props );
|
2007-05-30 21:02:32 +00:00
|
|
|
}
|
|
|
|
|
|
2012-05-10 07:55:33 +00:00
|
|
|
/**
|
2017-10-06 17:03:55 +00:00
|
|
|
* Returns the list of object properties that are included as-is in the cache.
|
2020-07-13 08:57:12 +00:00
|
|
|
* @stable to override
|
2017-10-06 17:03:55 +00:00
|
|
|
* @param string $prefix Must be the empty string
|
2018-04-19 08:30:33 +00:00
|
|
|
* @return string[]
|
2017-10-06 17:03:55 +00:00
|
|
|
* @since 1.31 No longer accepts a non-empty $prefix
|
2012-05-10 07:55:33 +00:00
|
|
|
*/
|
2017-10-06 17:03:55 +00:00
|
|
|
protected function getCacheFields( $prefix = 'img_' ) {
|
|
|
|
|
if ( $prefix !== '' ) {
|
|
|
|
|
throw new InvalidArgumentException(
|
|
|
|
|
__METHOD__ . ' with a non-empty prefix is no longer supported.'
|
|
|
|
|
);
|
2007-05-30 21:02:32 +00:00
|
|
|
}
|
2010-09-04 13:48:16 +00:00
|
|
|
|
2017-10-06 17:03:55 +00:00
|
|
|
// See self::getQueryInfo() for the fetching of the data from the DB,
|
|
|
|
|
// self::loadFromRow() for the loading of the object from the DB row,
|
|
|
|
|
// self::loadFromCache() for the caching, and self::setProps() for
|
|
|
|
|
// populating the object from an array of data.
|
|
|
|
|
return [ 'size', 'width', 'height', 'bits', 'media_type',
|
2021-05-19 00:24:32 +00:00
|
|
|
'major_mime', 'minor_mime', 'timestamp', 'sha1', 'description' ];
|
2007-05-30 21:02:32 +00:00
|
|
|
}
|
|
|
|
|
|
2013-01-24 03:20:57 +00:00
|
|
|
/**
|
2017-10-06 17:03:55 +00:00
|
|
|
* Returns the list of object properties that are included as-is in the
|
|
|
|
|
* cache, only when they're not too big, and are lazily loaded by self::loadExtraFromDB().
|
|
|
|
|
* @param string $prefix Must be the empty string
|
2018-04-19 08:30:33 +00:00
|
|
|
* @return string[]
|
2017-10-06 17:03:55 +00:00
|
|
|
* @since 1.31 No longer accepts a non-empty $prefix
|
2013-01-24 03:20:57 +00:00
|
|
|
*/
|
2017-10-06 17:03:55 +00:00
|
|
|
protected function getLazyCacheFields( $prefix = 'img_' ) {
|
|
|
|
|
if ( $prefix !== '' ) {
|
|
|
|
|
throw new InvalidArgumentException(
|
|
|
|
|
__METHOD__ . ' with a non-empty prefix is no longer supported.'
|
|
|
|
|
);
|
2013-01-24 03:20:57 +00:00
|
|
|
}
|
|
|
|
|
|
2017-10-06 17:03:55 +00:00
|
|
|
// Keep this in sync with the omit-lazy option in self::getQueryInfo().
|
|
|
|
|
return [ 'metadata' ];
|
2013-01-24 03:20:57 +00:00
|
|
|
}
|
|
|
|
|
|
2007-05-30 21:02:32 +00:00
|
|
|
/**
|
|
|
|
|
* Load file metadata from the DB
|
2020-07-13 08:57:12 +00:00
|
|
|
* @stable to override
|
2014-08-14 18:22:52 +00:00
|
|
|
* @param int $flags Bitfield; self::READ_LATEST reads from the primary DB
|
2007-05-30 21:02:32 +00:00
|
|
|
*/
|
2020-05-18 01:39:53 +00:00
|
|
|
protected function loadFromDB( $flags = 0 ) {
|
2017-03-07 02:14:14 +00:00
|
|
|
$fname = static::class . '::' . __FUNCTION__;
|
2007-05-30 21:02:32 +00:00
|
|
|
|
|
|
|
|
# Unconditionally set loaded=true; we don't want the accessors constantly rechecking
|
|
|
|
|
$this->dataLoaded = true;
|
2013-01-24 03:20:57 +00:00
|
|
|
$this->extraDataLoaded = true;
|
2007-05-30 21:02:32 +00:00
|
|
|
|
2015-03-05 01:02:05 +00:00
|
|
|
$dbr = ( $flags & self::READ_LATEST )
|
2021-04-19 01:32:42 +00:00
|
|
|
? $this->repo->getPrimaryDB()
|
2016-11-18 15:42:39 +00:00
|
|
|
: $this->repo->getReplicaDB();
|
2014-04-23 00:04:10 +00:00
|
|
|
|
2017-10-06 17:03:55 +00:00
|
|
|
$fileQuery = static::getQueryInfo();
|
|
|
|
|
$row = $dbr->selectRow(
|
|
|
|
|
$fileQuery['tables'],
|
|
|
|
|
$fileQuery['fields'],
|
|
|
|
|
[ 'img_name' => $this->getName() ],
|
|
|
|
|
$fname,
|
|
|
|
|
[],
|
|
|
|
|
$fileQuery['joins']
|
|
|
|
|
);
|
2010-09-04 13:48:16 +00:00
|
|
|
|
2007-05-30 21:02:32 +00:00
|
|
|
if ( $row ) {
|
|
|
|
|
$this->loadFromRow( $row );
|
|
|
|
|
} else {
|
|
|
|
|
$this->fileExists = false;
|
|
|
|
|
}
|
|
|
|
|
}
|
|
|
|
|
|
|
|
|
|
/**
|
2013-01-24 03:20:57 +00:00
|
|
|
* Load lazy file metadata from the DB.
|
|
|
|
|
* This covers fields that are sometimes not cached.
|
2020-07-13 08:57:12 +00:00
|
|
|
* @stable to override
|
2013-01-24 03:20:57 +00:00
|
|
|
*/
|
|
|
|
|
protected function loadExtraFromDB() {
|
2019-08-30 08:55:01 +00:00
|
|
|
if ( !$this->title ) {
|
|
|
|
|
return; // Avoid hard failure when the file does not exist. T221812
|
|
|
|
|
}
|
|
|
|
|
|
2017-03-07 02:14:14 +00:00
|
|
|
$fname = static::class . '::' . __FUNCTION__;
|
2013-01-24 03:20:57 +00:00
|
|
|
|
|
|
|
|
# Unconditionally set loaded=true; we don't want the accessors constantly rechecking
|
|
|
|
|
$this->extraDataLoaded = true;
|
|
|
|
|
|
2021-05-19 00:24:32 +00:00
|
|
|
$db = $this->repo->getReplicaDB();
|
|
|
|
|
$fieldMap = $this->loadExtraFieldsWithTimestamp( $db, $fname );
|
2014-04-22 16:38:51 +00:00
|
|
|
if ( !$fieldMap ) {
|
2021-04-19 01:32:42 +00:00
|
|
|
$db = $this->repo->getPrimaryDB();
|
2021-05-19 00:24:32 +00:00
|
|
|
$fieldMap = $this->loadExtraFieldsWithTimestamp( $db, $fname );
|
2013-01-24 03:20:57 +00:00
|
|
|
}
|
|
|
|
|
|
2014-04-22 16:38:51 +00:00
|
|
|
if ( $fieldMap ) {
|
2021-06-03 01:34:31 +00:00
|
|
|
if ( isset( $fieldMap['metadata'] ) ) {
|
2021-05-19 00:24:32 +00:00
|
|
|
$this->loadMetadataFromDbFieldValue( $db, $fieldMap['metadata'] );
|
2013-01-24 03:20:57 +00:00
|
|
|
}
|
|
|
|
|
} else {
|
|
|
|
|
throw new MWException( "Could not find data for image '{$this->getName()}'." );
|
|
|
|
|
}
|
|
|
|
|
}
|
|
|
|
|
|
2014-04-22 16:38:51 +00:00
|
|
|
/**
|
2015-10-04 09:07:25 +00:00
|
|
|
* @param IDatabase $dbr
|
2014-04-22 16:38:51 +00:00
|
|
|
* @param string $fname
|
2018-04-19 08:30:33 +00:00
|
|
|
* @return string[]|bool Unprefixed field map, or false if no matching row was found
|
2014-04-22 16:38:51 +00:00
|
|
|
*/
|
2017-10-06 17:03:55 +00:00
|
|
|
private function loadExtraFieldsWithTimestamp( $dbr, $fname ) {
|
2014-04-22 16:38:51 +00:00
|
|
|
$fieldMap = false;
|
|
|
|
|
|
2017-10-06 17:03:55 +00:00
|
|
|
$fileQuery = self::getQueryInfo( [ 'omit-nonlazy' ] );
|
|
|
|
|
$row = $dbr->selectRow(
|
|
|
|
|
$fileQuery['tables'],
|
|
|
|
|
$fileQuery['fields'],
|
|
|
|
|
[
|
2016-09-10 20:36:50 +00:00
|
|
|
'img_name' => $this->getName(),
|
2017-10-06 17:03:55 +00:00
|
|
|
'img_timestamp' => $dbr->timestamp( $this->getTimestamp() ),
|
|
|
|
|
],
|
|
|
|
|
$fname,
|
|
|
|
|
[],
|
|
|
|
|
$fileQuery['joins']
|
|
|
|
|
);
|
2014-04-22 16:38:51 +00:00
|
|
|
if ( $row ) {
|
|
|
|
|
$fieldMap = $this->unprefixRow( $row, 'img_' );
|
|
|
|
|
} else {
|
|
|
|
|
# File may have been uploaded over in the meantime; check the old versions
|
2017-10-06 17:03:55 +00:00
|
|
|
$fileQuery = OldLocalFile::getQueryInfo( [ 'omit-nonlazy' ] );
|
|
|
|
|
$row = $dbr->selectRow(
|
|
|
|
|
$fileQuery['tables'],
|
|
|
|
|
$fileQuery['fields'],
|
|
|
|
|
[
|
2016-09-10 20:36:50 +00:00
|
|
|
'oi_name' => $this->getName(),
|
2017-10-06 17:03:55 +00:00
|
|
|
'oi_timestamp' => $dbr->timestamp( $this->getTimestamp() ),
|
|
|
|
|
],
|
|
|
|
|
$fname,
|
|
|
|
|
[],
|
|
|
|
|
$fileQuery['joins']
|
|
|
|
|
);
|
2014-04-22 16:38:51 +00:00
|
|
|
if ( $row ) {
|
|
|
|
|
$fieldMap = $this->unprefixRow( $row, 'oi_' );
|
|
|
|
|
}
|
|
|
|
|
}
|
|
|
|
|
|
|
|
|
|
return $fieldMap;
|
|
|
|
|
}
|
|
|
|
|
|
2013-01-24 03:20:57 +00:00
|
|
|
/**
* Strip the given prefix from the keys of a database row.
*
|
2020-11-12 22:22:22 +00:00
|
|
|
* @param array|stdClass $row
|
2013-12-04 16:18:05 +00:00
|
|
|
* @param string $prefix
|
|
|
|
|
* @throws MWException
|
|
|
|
|
* @return array
|
2007-05-30 21:02:32 +00:00
|
|
|
*/
|
2013-01-24 03:20:57 +00:00
|
|
|
protected function unprefixRow( $row, $prefix = 'img_' ) {
|
2007-05-30 21:02:32 +00:00
|
|
|
$array = (array)$row;
|
|
|
|
|
$prefixLength = strlen( $prefix );
|
2010-09-04 13:48:16 +00:00
|
|
|
|
2007-05-30 21:02:32 +00:00
|
|
|
// Sanity check prefix once
|
|
|
|
|
if ( substr( key( $array ), 0, $prefixLength ) !== $prefix ) {
|
2013-02-03 19:42:08 +00:00
|
|
|
throw new MWException( __METHOD__ . ': incorrect $prefix parameter' );
|
2007-05-30 21:02:32 +00:00
|
|
|
}
|
2010-09-04 13:48:16 +00:00
|
|
|
|
2016-02-17 09:09:32 +00:00
|
|
|
$decoded = [];
|
2007-05-30 21:02:32 +00:00
|
|
|
foreach ( $array as $name => $value ) {
|
|
|
|
|
$decoded[substr( $name, $prefixLength )] = $value;
|
|
|
|
|
}
|
2013-11-23 20:00:11 +00:00
|
|
|
|
2013-01-24 03:20:57 +00:00
|
|
|
return $decoded;
|
|
|
|
|
}
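// Illustrative sketch, not part of the original source: for a hypothetical row
//   [ 'img_name' => 'Example.png', 'img_size' => '1024' ]
// unprefixRow( $row, 'img_' ) would return
//   [ 'name' => 'Example.png', 'size' => '1024' ]
// Only the first key is checked against $prefix, so a row whose first key
// does not start with 'img_' raises the MWException above.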
|
|
|
|
|
|
|
|
|
|
/**
|
2021-06-03 01:34:31 +00:00
|
|
|
* Load file metadata from a DB result row
|
|
|
|
|
* @stable to override
|
|
|
|
|
*
|
|
|
|
|
* Passing arbitrary fields in the row and expecting them to be translated
|
|
|
|
|
* to property names on $this is deprecated since 1.37. Instead, override
|
|
|
|
|
* loadFromRow(), and clone and unset the extra fields before passing them
|
|
|
|
|
* to the parent.
|
|
|
|
|
*
|
|
|
|
|
* After the deprecation period has passed, extra fields will be ignored,
|
|
|
|
|
* and the deprecation warning will be removed.
|
|
|
|
|
*
|
2020-11-12 22:22:22 +00:00
|
|
|
* @param stdClass $row
|
2014-04-19 15:19:17 +00:00
|
|
|
* @param string $prefix
|
2013-01-24 03:20:57 +00:00
|
|
|
*/
|
2021-06-03 01:34:31 +00:00
|
|
|
public function loadFromRow( $row, $prefix = 'img_' ) {
|
|
|
|
|
$this->dataLoaded = true;
|
|
|
|
|
|
|
|
|
|
$unprefixed = $this->unprefixRow( $row, $prefix );
|
2010-09-04 13:48:16 +00:00
|
|
|
|
2021-06-03 01:34:31 +00:00
|
|
|
$this->name = $unprefixed['name'];
|
|
|
|
|
$this->media_type = $unprefixed['media_type'];
|
2017-10-06 17:03:55 +00:00
|
|
|
|
2021-06-03 01:34:31 +00:00
|
|
|
$this->description = MediaWikiServices::getInstance()->getCommentStore()
|
|
|
|
|
->getComment( "{$prefix}description", $row )->text;
|
|
|
|
|
|
|
|
|
|
$this->user = User::newFromAnyId(
|
|
|
|
|
$unprefixed['user'] ?? null,
|
|
|
|
|
$unprefixed['user_text'] ?? null,
|
|
|
|
|
$unprefixed['actor'] ?? null
|
2017-09-12 17:12:29 +00:00
|
|
|
);
|
|
|
|
|
|
2021-06-03 01:34:31 +00:00
|
|
|
$this->timestamp = wfTimestamp( TS_MW, $unprefixed['timestamp'] );
|
2010-09-04 13:48:16 +00:00
|
|
|
|
2021-05-19 00:24:32 +00:00
|
|
|
$this->loadMetadataFromDbFieldValue(
|
|
|
|
|
$this->repo->getReplicaDB(), $unprefixed['metadata'] );
|
2014-04-23 06:56:23 +00:00
|
|
|
|
2021-06-03 01:34:31 +00:00
|
|
|
if ( empty( $unprefixed['major_mime'] ) ) {
|
|
|
|
|
$this->major_mime = 'unknown';
|
|
|
|
|
$this->minor_mime = 'unknown';
|
|
|
|
|
$this->mime = 'unknown/unknown';
|
2007-05-30 21:02:32 +00:00
|
|
|
} else {
|
2021-06-03 01:34:31 +00:00
|
|
|
if ( !$unprefixed['minor_mime'] ) {
|
|
|
|
|
$unprefixed['minor_mime'] = 'unknown';
|
2007-05-30 21:02:32 +00:00
|
|
|
}
|
2021-06-03 01:34:31 +00:00
|
|
|
$this->major_mime = $unprefixed['major_mime'];
|
|
|
|
|
$this->minor_mime = $unprefixed['minor_mime'];
|
|
|
|
|
$this->mime = $unprefixed['major_mime'] . '/' . $unprefixed['minor_mime'];
|
2007-05-30 21:02:32 +00:00
|
|
|
}
|
2010-09-04 13:48:16 +00:00
|
|
|
|
2015-09-02 06:07:38 +00:00
|
|
|
// Trim zero padding from char/binary field
|
2021-06-03 01:34:31 +00:00
|
|
|
$this->sha1 = rtrim( $unprefixed['sha1'], "\0" );
|
2010-09-04 13:48:16 +00:00
|
|
|
|
2015-09-02 06:07:38 +00:00
|
|
|
// Normalize some fields to integer type, per their database definition.
|
|
|
|
|
// Use unary + so that overflows will be upgraded to double instead of
|
2019-09-09 08:49:23 +00:00
|
|
|
// being truncated as with intval(). This is important to allow > 2 GiB
|
2015-09-02 06:07:38 +00:00
|
|
|
// files on 32-bit systems.
|
2021-06-03 01:34:31 +00:00
|
|
|
$this->size = +$unprefixed['size'];
|
|
|
|
|
$this->width = +$unprefixed['width'];
|
|
|
|
|
$this->height = +$unprefixed['height'];
|
|
|
|
|
$this->bits = +$unprefixed['bits'];
|
|
|
|
|
|
|
|
|
|
// Check for extra fields (deprecated since MW 1.37)
|
|
|
|
|
$extraFields = array_diff(
|
|
|
|
|
array_keys( $unprefixed ),
|
|
|
|
|
[
|
|
|
|
|
'name', 'media_type', 'description_text', 'description_data',
|
|
|
|
|
'description_cid', 'user', 'user_text', 'actor', 'timestamp',
|
|
|
|
|
'metadata', 'major_mime', 'minor_mime', 'sha1', 'size', 'width',
|
|
|
|
|
'height', 'bits'
|
|
|
|
|
]
|
|
|
|
|
);
|
|
|
|
|
if ( $extraFields ) {
|
|
|
|
|
wfDeprecatedMsg(
|
|
|
|
|
'Passing extra fields (' .
|
|
|
|
|
implode( ', ', $extraFields )
|
|
|
|
|
. ') to ' . __METHOD__ . ' was deprecated in MediaWiki 1.37. ' .
|
|
|
|
|
'Property assignment will be removed in a later version.',
|
|
|
|
|
'1.37' );
|
|
|
|
|
foreach ( $extraFields as $field ) {
|
|
|
|
|
$this->$field = $unprefixed[$field];
|
|
|
|
|
}
|
2021-06-02 14:35:19 +00:00
|
|
|
}
|
2010-09-04 13:48:16 +00:00
|
|
|
|
2007-05-30 21:02:32 +00:00
|
|
|
$this->fileExists = true;
|
|
|
|
|
}
|
|
|
|
|
|
|
|
|
|
/**
|
|
|
|
|
* Load file metadata from cache or DB, unless already loaded
|
2020-07-13 08:57:12 +00:00
|
|
|
* @stable to override
|
2014-04-19 15:19:17 +00:00
|
|
|
* @param int $flags
|
2007-05-30 21:02:32 +00:00
|
|
|
*/
|
2020-05-19 03:08:56 +00:00
|
|
|
public function load( $flags = 0 ) {
|
2007-05-30 21:02:32 +00:00
|
|
|
if ( !$this->dataLoaded ) {
|
2016-08-30 06:22:22 +00:00
|
|
|
if ( $flags & self::READ_LATEST ) {
|
2015-03-05 01:02:05 +00:00
|
|
|
$this->loadFromDB( $flags );
|
2016-08-30 06:22:22 +00:00
|
|
|
} else {
|
|
|
|
|
$this->loadFromCache();
|
2007-05-30 21:02:32 +00:00
|
|
|
}
|
|
|
|
|
}
|
2016-08-30 06:22:22 +00:00
|
|
|
|
2013-01-24 03:20:57 +00:00
|
|
|
if ( ( $flags & self::LOAD_ALL ) && !$this->extraDataLoaded ) {
|
2015-03-05 01:02:05 +00:00
|
|
|
// @note: loads on name/timestamp to reduce race condition problems
|
2013-01-24 03:20:57 +00:00
|
|
|
$this->loadExtraFromDB();
|
|
|
|
|
}
|
2007-05-30 21:02:32 +00:00
|
|
|
}
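// Usage sketch, assuming a caller inside this class (illustrative only):
//   $this->load();                       // try the cache via loadFromCache()
//   $this->load( self::READ_LATEST );    // skip the cache and call loadFromDB()
//   $this->load( self::LOAD_ALL );       // additionally call loadExtraFromDB()
// Each step is skipped when the corresponding data is already loaded.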
|
|
|
|
|
|
|
|
|
|
/**
|
|
|
|
|
* Upgrade a row if it needs it
|
2021-06-04 06:46:47 +00:00
|
|
|
* @internal
|
2007-05-30 21:02:32 +00:00
|
|
|
*/
|
2021-06-04 06:46:47 +00:00
|
|
|
public function maybeUpgradeRow() {
|
2016-08-11 16:19:09 +00:00
|
|
|
if ( wfReadOnly() || $this->upgrading ) {
|
2007-05-30 21:02:32 +00:00
|
|
|
return;
|
|
|
|
|
}
|
2010-09-04 13:48:16 +00:00
|
|
|
|
2016-04-19 15:58:49 +00:00
|
|
|
$upgrade = false;
|
2021-06-04 06:46:47 +00:00
|
|
|
$reserialize = false;
|
2020-01-09 23:48:34 +00:00
|
|
|
if ( $this->media_type === null || $this->mime == 'image/svg' ) {
|
2016-04-19 15:58:49 +00:00
|
|
|
$upgrade = true;
|
2007-05-30 21:02:32 +00:00
|
|
|
} else {
|
|
|
|
|
$handler = $this->getHandler();
|
2011-04-16 01:23:15 +00:00
|
|
|
if ( $handler ) {
|
2021-05-19 00:24:32 +00:00
|
|
|
$validity = $handler->isFileMetadataValid( $this );
|
2016-08-11 16:19:09 +00:00
|
|
|
if ( $validity === MediaHandler::METADATA_BAD ) {
|
2016-04-19 15:58:49 +00:00
|
|
|
$upgrade = true;
|
2021-06-04 06:46:47 +00:00
|
|
|
} elseif ( $validity === MediaHandler::METADATA_COMPATIBLE
|
|
|
|
|
&& $this->repo->isMetadataUpdateEnabled()
|
|
|
|
|
) {
|
|
|
|
|
$upgrade = true;
|
|
|
|
|
} elseif ( $this->repo->isJsonMetadataEnabled()
|
|
|
|
|
&& $this->repo->isMetadataReserializeEnabled()
|
|
|
|
|
) {
|
|
|
|
|
if ( $this->repo->isSplitMetadataEnabled() && $this->isMetadataOversize() ) {
|
|
|
|
|
$reserialize = true;
|
|
|
|
|
} elseif ( $this->metadataSerializationFormat !== self::MDS_EMPTY &&
|
|
|
|
|
$this->metadataSerializationFormat !== self::MDS_JSON ) {
|
|
|
|
|
$reserialize = true;
|
|
|
|
|
}
|
2011-04-16 01:23:15 +00:00
|
|
|
}
|
2007-05-30 21:02:32 +00:00
|
|
|
}
|
|
|
|
|
}
|
2016-04-19 15:58:49 +00:00
|
|
|
|
2021-06-04 06:46:47 +00:00
|
|
|
if ( $upgrade || $reserialize ) {
|
2016-08-11 16:19:09 +00:00
|
|
|
$this->upgrading = true;
|
|
|
|
|
// Defer updates unless in auto-commit CLI mode
|
2021-06-04 06:46:47 +00:00
|
|
|
DeferredUpdates::addCallableUpdate( function () use ( $upgrade ) {
|
2016-08-11 16:19:09 +00:00
|
|
|
$this->upgrading = false; // avoid duplicate updates
|
|
|
|
|
try {
|
2021-06-04 06:46:47 +00:00
|
|
|
if ( $upgrade ) {
|
|
|
|
|
$this->upgradeRow();
|
|
|
|
|
} else {
|
|
|
|
|
$this->reserializeMetadata();
|
|
|
|
|
}
|
2016-08-11 16:19:09 +00:00
|
|
|
} catch ( LocalFileLockError $e ) {
|
|
|
|
|
// let the other process handle it (or do it next time)
|
|
|
|
|
}
|
|
|
|
|
} );
|
2016-04-19 15:58:49 +00:00
|
|
|
}
|
2007-05-30 21:02:32 +00:00
|
|
|
}
|
|
|
|
|
|
2016-08-11 16:19:09 +00:00
|
|
|
/**
|
|
|
|
|
* @return bool Whether upgradeRow() ran for this object
|
|
|
|
|
*/
|
2020-05-19 03:08:56 +00:00
|
|
|
public function getUpgraded() {
|
2007-05-30 21:02:32 +00:00
|
|
|
return $this->upgraded;
|
|
|
|
|
}
|
|
|
|
|
|
|
|
|
|
/**
|
|
|
|
|
* Fix assorted version-related problems with the image row by reloading it from the file
|
2020-07-13 08:57:12 +00:00
|
|
|
* @stable to override
|
2007-05-30 21:02:32 +00:00
|
|
|
*/
|
2020-05-18 01:39:53 +00:00
|
|
|
public function upgradeRow() {
|
2019-01-09 16:01:09 +00:00
|
|
|
$this->lock();
|
2011-12-26 23:35:40 +00:00
|
|
|
|
2007-05-30 21:02:32 +00:00
|
|
|
$this->loadFromFile();
|
|
|
|
|
|
Basic integrated audio/video support, with Ogg implementation.
* JavaScript video player based loosely on Greg Maxwell's player
* Image page text snippet customisation
* Abstraction of transform parameters in the parser. Introduced Linker::makeImageLink2().
* Made canRender(), mustRender() depend on file, not just on handler. Moved width=0, height=0 checking to ImageHandler::canRender(), since audio streams have width=height=0 but should be rendered.
Also:
* Automatic upgrade for oldimage rows on image page view, allows media handler selection based on oi_*_mime
* oi_*_mime unconditionally referenced, REQUIRES SCHEMA UPGRADE
* Don't destroy file info for missing files on upgrade
* Simple, centralised extension message file handling
* Made MessageCache::loadAllMessages non-static, optimised for repeated-call case due to abuse in User.php
* Support for lightweight parser output hooks, with callback whitelist for security
* Moved Linker::formatSize() to Language, to join the new formatTimePeriod() and formatBitrate()
* Introduced MagicWordArray, regex capture trick requires that magic word IDs DO NOT CONTAIN HYPHENS.
2007-08-15 10:50:09 +00:00
|
|
|
# Don't destroy file info of missing files
|
|
|
|
|
if ( !$this->fileExists ) {
|
2014-07-17 00:14:19 +00:00
|
|
|
$this->unlock();
|
2020-06-01 05:00:39 +00:00
|
|
|
wfDebug( __METHOD__ . ": file does not exist, aborting" );
|
2013-11-23 20:00:11 +00:00
|
|
|
|
2007-08-15 10:50:09 +00:00
|
|
|
return;
|
|
|
|
|
}
|
2010-09-04 13:48:16 +00:00
|
|
|
|
2021-04-19 01:32:42 +00:00
|
|
|
$dbw = $this->repo->getPrimaryDB();
|
2007-05-30 21:02:32 +00:00
|
|
|
list( $major, $minor ) = self::splitMime( $this->mime );
|
|
|
|
|
|
2008-01-29 01:14:50 +00:00
|
|
|
if ( wfReadOnly() ) {
|
2014-07-17 00:14:19 +00:00
|
|
|
$this->unlock();
|
2013-11-23 20:00:11 +00:00
|
|
|
|
2008-01-29 01:14:50 +00:00
|
|
|
return;
|
|
|
|
|
}
|
2020-06-01 05:00:39 +00:00
|
|
|
wfDebug( __METHOD__ . ': upgrading ' . $this->getName() . " to the current schema" );
|
2007-05-30 21:02:32 +00:00
|
|
|
|
|
|
|
|
$dbw->update( 'image',
|
2016-02-17 09:09:32 +00:00
|
|
|
[
|
2013-04-20 17:18:13 +00:00
|
|
|
'img_size' => $this->size, // sanity
|
|
|
|
|
'img_width' => $this->width,
|
|
|
|
|
'img_height' => $this->height,
|
|
|
|
|
'img_bits' => $this->bits,
|
2007-05-30 21:02:32 +00:00
|
|
|
'img_media_type' => $this->media_type,
|
|
|
|
|
'img_major_mime' => $major,
|
|
|
|
|
'img_minor_mime' => $minor,
|
2021-05-19 00:24:32 +00:00
|
|
|
'img_metadata' => $this->getMetadataForDb( $dbw ),
|
2013-04-20 17:18:13 +00:00
|
|
|
'img_sha1' => $this->sha1,
|
2016-02-17 09:09:32 +00:00
|
|
|
],
|
|
|
|
|
[ 'img_name' => $this->getName() ],
|
2007-05-30 21:02:32 +00:00
|
|
|
__METHOD__
|
|
|
|
|
);
|
2010-09-04 13:48:16 +00:00
|
|
|
|
2015-04-28 00:26:58 +00:00
|
|
|
$this->invalidateCache();
|
2011-12-26 23:35:40 +00:00
|
|
|
|
2019-01-09 16:01:09 +00:00
|
|
|
$this->unlock();
|
2016-08-11 16:19:09 +00:00
|
|
|
$this->upgraded = true; // avoid rework/retries
|
2007-05-30 21:02:32 +00:00
|
|
|
}
|
|
|
|
|
|
2021-06-04 06:46:47 +00:00
|
|
|
/**
|
|
|
|
|
* Write the metadata back to the database with the current serialization
|
|
|
|
|
* format.
|
|
|
|
|
*/
|
|
|
|
|
protected function reserializeMetadata() {
|
|
|
|
|
if ( wfReadOnly() ) {
|
|
|
|
|
return;
|
|
|
|
|
}
|
2021-05-15 00:01:51 +00:00
|
|
|
$dbw = $this->repo->getPrimaryDB();
|
2021-06-04 06:46:47 +00:00
|
|
|
$dbw->update(
|
|
|
|
|
'image',
|
|
|
|
|
[ 'img_metadata' => $this->getMetadataForDb( $dbw ) ],
|
|
|
|
|
[
|
|
|
|
|
'img_name' => $this->name,
|
|
|
|
|
'img_timestamp' => $dbw->timestamp( $this->timestamp ),
|
|
|
|
|
],
|
|
|
|
|
__METHOD__
|
|
|
|
|
);
|
|
|
|
|
$this->upgraded = true;
|
|
|
|
|
}
|
|
|
|
|
|
2007-11-19 10:04:18 +00:00
|
|
|
/**
|
2008-04-14 07:45:50 +00:00
|
|
|
* Set properties in this object to be equal to those given in the
|
2007-11-19 10:04:18 +00:00
|
|
|
* associative array $info. Only cacheable fields can be set.
|
2013-01-24 03:20:57 +00:00
|
|
|
* All fields *must* be set in $info except for getLazyCacheFields().
|
2008-04-14 07:45:50 +00:00
|
|
|
*
|
|
|
|
|
* If 'mime' is given, it will be split into major_mime/minor_mime.
|
2007-11-19 10:04:18 +00:00
|
|
|
* If major_mime/minor_mime are given, $this->mime will also be set.
|
2014-04-19 15:19:17 +00:00
|
|
|
*
|
2020-07-13 08:57:12 +00:00
|
|
|
* @stable to override
|
2014-04-19 15:19:17 +00:00
|
|
|
* @param array $info
|
2007-11-19 10:04:18 +00:00
|
|
|
*/
|
2020-05-19 03:08:56 +00:00
|
|
|
protected function setProps( $info ) {
|
2007-06-16 02:55:25 +00:00
|
|
|
$this->dataLoaded = true;
|
|
|
|
|
$fields = $this->getCacheFields( '' );
|
|
|
|
|
$fields[] = 'fileExists';
|
2010-09-04 13:48:16 +00:00
|
|
|
|
2007-06-16 02:55:25 +00:00
|
|
|
foreach ( $fields as $field ) {
|
|
|
|
|
if ( isset( $info[$field] ) ) {
|
|
|
|
|
$this->$field = $info[$field];
|
|
|
|
|
}
|
|
|
|
|
}
|
2010-08-11 18:56:38 +00:00
|
|
|
|
2021-05-27 19:56:40 +00:00
|
|
|
// Only our own cache sets these properties, so they both should be present.
|
|
|
|
|
if ( isset( $info['user'] ) &&
|
|
|
|
|
isset( $info['user_text'] ) &&
|
|
|
|
|
$info['user_text'] !== ''
|
|
|
|
|
) {
|
|
|
|
|
$this->user = new UserIdentityValue( $info['user'], $info['user_text'] );
|
2017-09-12 17:12:29 +00:00
|
|
|
}
|
|
|
|
|
|
2007-08-15 10:50:09 +00:00
|
|
|
// Fix up mime fields
|
|
|
|
|
if ( isset( $info['major_mime'] ) ) {
|
|
|
|
|
$this->mime = "{$info['major_mime']}/{$info['minor_mime']}";
|
|
|
|
|
} elseif ( isset( $info['mime'] ) ) {
|
2010-08-11 18:56:38 +00:00
|
|
|
$this->mime = $info['mime'];
|
2007-08-15 10:50:09 +00:00
|
|
|
list( $this->major_mime, $this->minor_mime ) = self::splitMime( $this->mime );
|
|
|
|
|
}
|
future.
Bug: T275268
Change-Id: I039785d5b6439d71dcc21dcb972177dba5c3a67d
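The legacy array form mentioned in the message above can be illustrated with a small standalone sketch. It assumes only what the message states: an empty array maps to an empty string, an array whose sole key is '_error' maps back to the original non-serializable string, and anything else is serialized. The function name legacyMetadataString() is hypothetical and is not part of MediaWiki.

```php
<?php
/**
 * Hypothetical helper (not a MediaWiki API): convert an unserialized
 * metadata array back to the legacy string form.
 *
 * @param array $data Unserialized metadata
 * @return string Legacy string form
 */
function legacyMetadataString( array $data ): string {
	if ( !$data ) {
		// No metadata at all: the legacy form is an empty string.
		return '';
	}
	if ( array_keys( $data ) === [ '_error' ] ) {
		// Legacy error encoding: the handler originally returned a
		// plain string that was never a serialized array.
		return (string)$data['_error'];
	}
	// Ordinary case: a serialized PHP array.
	return serialize( $data );
}
```

Under these assumptions, existing callers that still expect a string keep receiving the same values they did before the array form was introduced.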
2021-05-19 00:24:32 +00:00
|
|
|
|
|
|
|
|
if ( isset( $info['metadata'] ) ) {
|
|
|
|
|
if ( is_string( $info['metadata'] ) ) {
|
|
|
|
|
$this->loadMetadataFromString( $info['metadata'] );
|
|
|
|
|
} elseif ( is_array( $info['metadata'] ) ) {
|
|
|
|
|
$this->metadataArray = $info['metadata'];
|
2021-06-02 04:34:38 +00:00
|
|
|
if ( isset( $info['metadataBlobs'] ) ) {
|
|
|
|
|
$this->metadataBlobs = $info['metadataBlobs'];
|
|
|
|
|
$this->unloadedMetadataBlobs = array_diff_key(
|
|
|
|
|
$this->metadataBlobs,
|
|
|
|
|
$this->metadataArray
|
|
|
|
|
);
|
|
|
|
|
} else {
|
|
|
|
|
$this->metadataBlobs = [];
|
|
|
|
|
$this->unloadedMetadataBlobs = [];
|
|
|
|
|
}
|
2021-05-19 00:24:32 +00:00
|
|
|
} else {
|
|
|
|
|
$logger = LoggerFactory::getInstance( 'LocalFile' );
|
|
|
|
|
$logger->warning( __METHOD__ . ' given invalid metadata of type ' .
|
|
|
|
|
gettype( $info['metadata'] ) );
|
|
|
|
|
$this->metadataArray = [];
|
|
|
|
|
}
|
|
|
|
|
$this->extraDataLoaded = true;
|
|
|
|
|
}
|
2007-06-16 02:55:25 +00:00
|
|
|
}
|
|
|
|
|
|
2007-05-30 21:02:32 +00:00
|
|
|
/** splitMime inherited */
|
|
|
|
|
/** getName inherited */
|
|
|
|
|
/** getTitle inherited */
|
|
|
|
|
/** getURL inherited */
|
|
|
|
|
/** getViewURL inherited */
|
|
|
|
|
/** getPath inherited */
|
2014-12-12 08:41:27 +00:00
|
|
|
/** isVisible inherited */
|
2007-05-30 21:02:32 +00:00
|
|
|
|
2012-05-10 07:55:33 +00:00
|
|
|
/**
|
2019-04-05 21:47:38 +00:00
|
|
|
* Checks if this file exists in its parent repo, as referenced by its
|
|
|
|
|
* virtual URL.
|
2020-07-13 08:57:12 +00:00
|
|
|
* @stable to override
|
2019-04-05 21:47:38 +00:00
|
|
|
*
|
2012-05-10 07:55:33 +00:00
|
|
|
* @return bool
|
|
|
|
|
*/
|
2020-05-19 03:08:56 +00:00
|
|
|
public function isMissing() {
|
2010-09-04 13:48:16 +00:00
|
|
|
if ( $this->missing === null ) {
|
2019-04-05 21:47:38 +00:00
|
|
|
$fileExists = $this->repo->fileExists( $this->getVirtualUrl() );
|
2009-04-26 11:12:39 +00:00
|
|
|
$this->missing = !$fileExists;
|
|
|
|
|
}
|
2013-11-23 20:00:11 +00:00
|
|
|
|
2009-04-26 11:12:39 +00:00
|
|
|
return $this->missing;
|
|
|
|
|
}
|
|
|
|
|
|
2007-05-30 21:02:32 +00:00
|
|
|
/**
|
|
|
|
|
* Return the width of the image
|
2020-07-13 08:57:12 +00:00
|
|
|
* @stable to override
|
2007-05-30 21:02:32 +00:00
|
|
|
*
|
2013-12-04 16:18:05 +00:00
|
|
|
* @param int $page
|
2013-04-16 21:34:43 +00:00
|
|
|
* @return int
|
2007-05-30 21:02:32 +00:00
|
|
|
*/
|
2010-02-07 19:31:24 +00:00
|
|
|
public function getWidth( $page = 1 ) {
|
2017-04-28 12:57:04 +00:00
|
|
|
$page = (int)$page;
|
|
|
|
|
if ( $page < 1 ) {
|
|
|
|
|
$page = 1;
|
|
|
|
|
}
|
|
|
|
|
|
2007-05-30 21:02:32 +00:00
|
|
|
$this->load();
|
2010-09-04 13:48:16 +00:00
|
|
|
|
2007-05-30 21:02:32 +00:00
|
|
|
if ( $this->isMultipage() ) {
|
2013-09-05 20:46:38 +00:00
|
|
|
$handler = $this->getHandler();
|
|
|
|
|
if ( !$handler ) {
|
|
|
|
|
return 0;
|
|
|
|
|
}
|
|
|
|
|
$dim = $handler->getPageDimensions( $this, $page );
|
2007-05-30 21:02:32 +00:00
|
|
|
if ( $dim ) {
|
|
|
|
|
return $dim['width'];
|
|
|
|
|
} else {
|
2013-04-16 21:34:43 +00:00
|
|
|
// For non-paged media, the false goes through an
|
|
|
|
|
// intval, turning failure into 0, so do the same here.
|
|
|
|
|
return 0;
|
2007-05-30 21:02:32 +00:00
|
|
|
}
|
|
|
|
|
} else {
|
|
|
|
|
return $this->width;
|
|
|
|
|
}
|
|
|
|
|
}
|
|
|
|
|
|
|
|
|
|
/**
|
|
|
|
|
* Return the height of the image
|
2020-07-13 08:57:12 +00:00
|
|
|
* @stable to override
|
2007-05-30 21:02:32 +00:00
|
|
|
*
|
2013-12-04 16:18:05 +00:00
|
|
|
* @param int $page
|
2013-04-16 21:34:43 +00:00
|
|
|
* @return int
|
2007-05-30 21:02:32 +00:00
|
|
|
*/
|
2010-02-07 19:31:24 +00:00
|
|
|
public function getHeight( $page = 1 ) {
|
2017-04-28 12:57:04 +00:00
|
|
|
$page = (int)$page;
|
|
|
|
|
if ( $page < 1 ) {
|
|
|
|
|
$page = 1;
|
|
|
|
|
}
|
|
|
|
|
|
2007-05-30 21:02:32 +00:00
|
|
|
$this->load();
|
2010-09-04 13:48:16 +00:00
|
|
|
|
2007-05-30 21:02:32 +00:00
|
|
|
if ( $this->isMultipage() ) {
|
2013-09-05 20:46:38 +00:00
|
|
|
$handler = $this->getHandler();
|
|
|
|
|
if ( !$handler ) {
|
|
|
|
|
return 0;
|
|
|
|
|
}
|
|
|
|
|
$dim = $handler->getPageDimensions( $this, $page );
|
2007-05-30 21:02:32 +00:00
|
|
|
if ( $dim ) {
|
|
|
|
|
return $dim['height'];
|
|
|
|
|
} else {
|
2013-04-16 21:34:43 +00:00
|
|
|
// For non-paged media, the false goes through an
|
|
|
|
|
// intval, turning failure into 0, so do the same here.
|
|
|
|
|
return 0;
|
2007-05-30 21:02:32 +00:00
|
|
|
}
|
|
|
|
|
} else {
|
|
|
|
|
return $this->height;
|
|
|
|
|
}
|
|
|
|
|
}
|
|
|
|
|
|
2016-01-04 23:40:01 +00:00
|
|
|
/**
|
|
|
|
|
* Get short description URL for a file based on the page ID.
|
2020-07-13 08:57:12 +00:00
|
|
|
* @stable to override
|
2016-01-04 23:40:01 +00:00
|
|
|
*
|
|
|
|
|
* @return string|null
|
|
|
|
|
* @since 1.27
|
|
|
|
|
*/
|
|
|
|
|
public function getDescriptionShortUrl() {
|
2019-08-30 08:55:01 +00:00
|
|
|
if ( !$this->title ) {
|
|
|
|
|
return null; // Avoid hard failure when the file does not exist. T221812
|
|
|
|
|
}
|
|
|
|
|
|
2016-01-04 23:40:01 +00:00
|
|
|
$pageId = $this->title->getArticleID();
|
|
|
|
|
|
2019-08-30 08:55:01 +00:00
|
|
|
if ( $pageId ) {
|
2016-02-17 09:09:32 +00:00
|
|
|
$url = $this->repo->makeUrl( [ 'curid' => $pageId ] );
|
2016-01-04 23:40:01 +00:00
|
|
|
if ( $url !== false ) {
|
|
|
|
|
return $url;
|
|
|
|
|
}
|
|
|
|
|
}
|
|
|
|
|
return null;
|
|
|
|
|
}
|
|
|
|
|
|
2007-05-30 21:02:32 +00:00
|
|
|
/**
|
2021-05-19 00:24:32 +00:00
|
|
|
* Get handler-specific metadata as a serialized string
|
|
|
|
|
*
|
|
|
|
|
* @deprecated since 1.37 use getMetadataArray() or getMetadataItem()
|
2012-02-09 21:33:27 +00:00
|
|
|
* @return string
|
2007-05-30 21:02:32 +00:00
|
|
|
*/
|
2020-05-19 03:08:56 +00:00
|
|
|
public function getMetadata() {
|
2021-05-19 00:24:32 +00:00
|
|
|
$data = $this->getMetadataArray();
|
|
|
|
|
if ( !$data ) {
|
|
|
|
|
return '';
|
|
|
|
|
} elseif ( array_keys( $data ) === [ '_error' ] ) {
|
|
|
|
|
// Legacy error encoding
|
|
|
|
|
return $data['_error'];
|
|
|
|
|
} else {
|
|
|
|
|
return serialize( $data );
|
|
|
|
|
}
|
|
|
|
|
}
|
|
|
|
|
|
|
|
|
|
/**
|
|
|
|
|
* Get unserialized handler-specific metadata
|
|
|
|
|
*
|
|
|
|
|
* @since 1.37
|
|
|
|
|
* @return array
|
|
|
|
|
*/
|
|
|
|
|
public function getMetadataArray(): array {
|
|
|
|
|
$this->load( self::LOAD_ALL );
|
2021-06-02 04:34:38 +00:00
|
|
|
if ( $this->unloadedMetadataBlobs ) {
|
|
|
|
|
return $this->getMetadataItems(
|
|
|
|
|
array_unique( array_merge(
|
|
|
|
|
array_keys( $this->metadataArray ),
|
|
|
|
|
array_keys( $this->unloadedMetadataBlobs )
|
|
|
|
|
) )
|
|
|
|
|
);
|
|
|
|
|
}
|
2021-05-19 00:24:32 +00:00
|
|
|
return $this->metadataArray;
|
|
|
|
|
}
|
|
|
|
|
|
2021-06-02 04:34:38 +00:00
|
|
|
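/**
 * Get the named items of the unserialized handler-specific metadata.
 *
 * Items already present in the metadata array are returned directly.
 * Items that are still unloaded blobs are fetched from the repo's blob
 * store, decoded, and cached. Items that cannot be found or loaded are
 * omitted from the result.
 *
 * @param string[] $itemNames
 * @return array Map of item name to value
 */
public function getMetadataItems( array $itemNames ): array {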
|
|
|
|
|
$this->load( self::LOAD_ALL );
|
|
|
|
|
$result = [];
|
|
|
|
|
$addresses = [];
|
|
|
|
|
foreach ( $itemNames as $itemName ) {
|
|
|
|
|
if ( array_key_exists( $itemName, $this->metadataArray ) ) {
|
|
|
|
|
$result[$itemName] = $this->metadataArray[$itemName];
|
|
|
|
|
} elseif ( isset( $this->unloadedMetadataBlobs[$itemName] ) ) {
|
|
|
|
|
$addresses[$itemName] = $this->unloadedMetadataBlobs[$itemName];
|
|
|
|
|
}
|
|
|
|
|
}
|
|
|
|
|
if ( $addresses ) {
|
|
|
|
|
$blobStore = $this->repo->getBlobStore();
|
|
|
|
|
if ( !$blobStore ) {
|
|
|
|
|
LoggerFactory::getInstance( 'LocalFile' )->warning(
|
|
|
|
|
"Unable to load metadata: repo has no blob store" );
|
|
|
|
|
return $result;
|
|
|
|
|
}
|
|
|
|
|
$status = $blobStore->getBlobBatch( $addresses );
|
|
|
|
|
if ( !$status->isGood() ) {
|
|
|
|
|
$msg = Status::wrap( $status )->getWikiText(
|
|
|
|
|
false, false, 'en' );
|
|
|
|
|
LoggerFactory::getInstance( 'LocalFile' )->warning(
|
|
|
|
|
"Error loading metadata from BlobStore: $msg" );
|
|
|
|
|
}
|
|
|
|
|
foreach ( $addresses as $itemName => $address ) {
|
|
|
|
|
unset( $this->unloadedMetadataBlobs[$itemName] );
|
|
|
|
|
$json = $status->getValue()[$address] ?? null;
|
|
|
|
|
if ( $json !== null ) {
|
|
|
|
|
$value = $this->jsonDecode( $json );
|
|
|
|
|
$result[$itemName] = $value;
|
|
|
|
|
$this->metadataArray[$itemName] = $value;
|
|
|
|
|
}
|
|
|
|
|
}
|
|
|
|
|
}
|
|
|
|
|
return $result;
|
|
|
|
|
}
|
|
|
|
|
|
|
|
|
|
/**
|
|
|
|
|
* Do JSON encoding with local flags. Throw an exception if the data cannot be
|
|
|
|
|
* serialized.
|
|
|
|
|
*
|
|
|
|
|
* @throws MWException
|
|
|
|
|
* @param mixed $data
|
|
|
|
|
* @return string
|
|
|
|
|
*/
|
|
|
|
|
private function jsonEncode( $data ): string {
|
|
|
|
|
$s = json_encode( $data,
|
|
|
|
|
JSON_INVALID_UTF8_IGNORE |
|
|
|
|
|
JSON_UNESCAPED_SLASHES |
|
|
|
|
|
JSON_UNESCAPED_UNICODE );
|
|
|
|
|
if ( $s === false ) {
|
|
|
|
|
throw new MWException( __METHOD__ . ': metadata is not JSON-serializable ' .
|
|
|
|
|
'(type = ' . $this->getMimeType() . ')' );
|
|
|
|
|
}
|
|
|
|
|
return $s;
|
|
|
|
|
}
|
|
|
|
|
|
|
|
|
|
/**
|
|
|
|
|
* Do JSON decoding with local flags.
|
|
|
|
|
*
|
|
|
|
|
* This doesn't use JsonCodec because JsonCodec can construct objects,
|
|
|
|
|
* which we don't want.
|
|
|
|
|
*
|
|
|
|
|
* Does not throw. Returns null on failure.
|
|
|
|
|
*
|
|
|
|
|
* @param string $s
|
|
|
|
|
* @return mixed The decoded value, or null on failure
|
|
|
|
|
*/
|
|
|
|
|
private function jsonDecode( string $s ) {
|
|
|
|
|
// phpcs:ignore Generic.PHP.NoSilencedErrors.Discouraged
|
|
|
|
|
return @json_decode( $s, true, 512, JSON_INVALID_UTF8_IGNORE );
|
|
|
|
|
}
|
|
|
|
|
|
2021-05-19 00:24:32 +00:00
|
|
|
/**
|
|
|
|
|
* Serialize the metadata array for insertion into img_metadata, oi_metadata
|
2021-06-02 04:34:38 +00:00
|
|
|
* or fa_metadata.
|
|
|
|
|
*
|
|
|
|
|
* If metadata splitting is enabled, this may write blobs to the database,
|
|
|
|
|
* returning their addresses.
|
2021-05-19 00:24:32 +00:00
|
|
|
*
|
|
|
|
|
* @internal
|
|
|
|
|
* @param IDatabase $db
|
|
|
|
|
* @return string|Blob
|
|
|
|
|
*/
|
|
|
|
|
public function getMetadataForDb( IDatabase $db ) {
|
|
|
|
|
$this->load( self::LOAD_ALL );
|
2021-06-02 04:34:38 +00:00
|
|
|
if ( !$this->metadataArray && !$this->metadataBlobs ) {
|
2021-05-19 00:24:32 +00:00
|
|
|
$s = '';
|
2021-06-02 04:34:38 +00:00
|
|
|
} elseif ( $this->repo->isJsonMetadataEnabled() ) {
|
|
|
|
|
$s = $this->getJsonMetadata();
|
2021-05-19 00:24:32 +00:00
|
|
|
} else {
|
2021-06-02 04:34:38 +00:00
|
|
|
$s = serialize( $this->getMetadataArray() );
|
2021-05-19 00:24:32 +00:00
|
|
|
}
|
|
|
|
|
if ( !is_string( $s ) ) {
|
|
|
|
|
throw new MWException( 'Could not serialize image metadata value for DB' );
|
|
|
|
|
}
|
|
|
|
|
return $db->encodeBlob( $s );
|
|
|
|
|
}
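// Illustrative note, not part of the original source: the three branches
// above produce three stored shapes. The sample values are hypothetical and
// the JSON form is only approximate.
//   no metadata loaded     -> ''
//   JSON metadata enabled  -> roughly '{"data":{"width":100}}', built by
//                             getJsonMetadata()
//   otherwise              -> serialize( [ 'width' => 100 ] ), that is
//                             'a:1:{s:5:"width";i:100;}'
// A PHP-serialized array starts with "a:", never "{", which is what lets
// loadMetadataFromString() tell the two formats apart by the first character.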
|
|
|
|
|
|
2021-06-02 04:34:38 +00:00
|
|
|
/**
|
|
|
|
|
* Get metadata in JSON format ready for DB insertion, optionally splitting
|
|
|
|
|
* items out to BlobStore.
|
|
|
|
|
*
|
|
|
|
|
* @return string
|
|
|
|
|
*/
|
|
|
|
|
private function getJsonMetadata() {
|
|
|
|
|
// Directly store data that is not already in BlobStore
|
|
|
|
|
$envelope = [
|
|
|
|
|
'data' => array_diff_key( $this->metadataArray, $this->metadataBlobs )
|
|
|
|
|
];
|
|
|
|
|
|
|
|
|
|
// Also store the blob addresses
|
|
|
|
|
if ( $this->metadataBlobs ) {
|
|
|
|
|
$envelope['blobs'] = $this->metadataBlobs;
|
|
|
|
|
}
|
|
|
|
|
|
|
|
|
|
// Try encoding
|
|
|
|
|
$s = $this->jsonEncode( $envelope );
|
|
|
|
|
|
|
|
|
|
// Decide whether to try splitting the metadata.
|
|
|
|
|
// Return early if it's not going to happen.
|
|
|
|
|
if ( !$this->repo->isSplitMetadataEnabled()
|
|
|
|
|
|| !$this->getHandler()
|
|
|
|
|
|| !$this->getHandler()->useSplitMetadata()
|
|
|
|
|
) {
|
|
|
|
|
return $s;
|
|
|
|
|
}
|
|
|
|
|
$threshold = $this->repo->getSplitMetadataThreshold();
|
|
|
|
|
if ( !$threshold || strlen( $s ) <= $threshold ) {
|
|
|
|
|
return $s;
|
|
|
|
|
}
|
|
|
|
|
$blobStore = $this->repo->getBlobStore();
|
|
|
|
|
if ( !$blobStore ) {
|
|
|
|
|
return $s;
|
|
|
|
|
}
|
|
|
|
|
|
|
|
|
|
// The data as a whole is above the item threshold. Look for
|
|
|
|
|
// large items that can be split out.
|
|
|
|
|
$blobAddresses = [];
|
|
|
|
|
foreach ( $envelope['data'] as $name => $value ) {
|
|
|
|
|
$encoded = $this->jsonEncode( $value );
|
|
|
|
|
if ( strlen( $encoded ) > $threshold ) {
|
|
|
|
|
$blobAddresses[$name] = $blobStore->storeBlob(
|
|
|
|
|
$encoded,
|
|
|
|
|
[ BlobStore::IMAGE_HINT => $this->getName() ]
|
|
|
|
|
);
|
|
|
|
|
}
|
|
|
|
|
}
|
|
|
|
|
// Remove any items that were split out
|
|
|
|
|
$envelope['data'] = array_diff_key( $envelope['data'], $blobAddresses );
|
|
|
|
|
$envelope['blobs'] = $blobAddresses;
|
|
|
|
|
$s = $this->jsonEncode( $envelope );
|
|
|
|
|
|
|
|
|
|
// Repeated calls to this function should not keep inserting more blobs
|
|
|
|
|
$this->metadataBlobs += $blobAddresses;
|
|
|
|
|
|
|
|
|
|
return $s;
|
|
|
|
|
}
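// Illustrative sketch, not part of the original source. Assume a hypothetical
// item named 'text' whose encoded form exceeds the threshold, and a
// hypothetical blob address 'tt:123' returned by storeBlob(). The envelope
// then changes roughly like this:
//   before: {"data":{"width":100,"text":"<long string>"}}
//   after:  {"data":{"width":100},"blobs":{"text":"tt:123"}}
// Small items stay inline under 'data'; for a split-out item only its blob
// address is kept in the encoded string.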
|
|
|
|
|
|
2021-06-04 06:46:47 +00:00
|
|
|
/**
|
|
|
|
|
* Determine whether the loaded metadata may be a candidate for splitting, by measuring its
|
|
|
|
|
* serialized size. Helper for maybeUpgradeRow().
|
|
|
|
|
*
|
|
|
|
|
* @return bool
|
|
|
|
|
*/
|
|
|
|
|
private function isMetadataOversize() {
|
|
|
|
|
if ( !$this->repo->isSplitMetadataEnabled() ) {
|
|
|
|
|
return false;
|
|
|
|
|
}
|
|
|
|
|
$threshold = $this->repo->getSplitMetadataThreshold();
|
|
|
|
|
$directItems = array_diff_key( $this->metadataArray, $this->metadataBlobs );
|
|
|
|
|
foreach ( $directItems as $value ) {
|
|
|
|
|
if ( strlen( $this->jsonEncode( $value ) ) > $threshold ) {
|
|
|
|
|
return true;
|
|
|
|
|
}
|
|
|
|
|
}
|
|
|
|
|
return false;
|
|
|
|
|
}
|
|
|
|
|
|
2021-05-19 00:24:32 +00:00
|
|
|
/**
|
|
|
|
|
* Unserialize a metadata blob which came from the database and store it
|
|
|
|
|
* in $this.
|
|
|
|
|
*
|
|
|
|
|
* @since 1.37
|
|
|
|
|
* @param IDatabase $db
|
|
|
|
|
* @param string|Blob $metadataBlob
|
|
|
|
|
*/
|
|
|
|
|
protected function loadMetadataFromDbFieldValue( IDatabase $db, $metadataBlob ) {
|
|
|
|
|
$this->loadMetadataFromString( $db->decodeBlob( $metadataBlob ) );
|
|
|
|
|
}
|
|
|
|
|
|
|
|
|
|
/**
|
|
|
|
|
* Unserialize a metadata string which came from some non-DB source, or is
|
|
|
|
|
* the return value of IDatabase::decodeBlob().
|
|
|
|
|
*
|
|
|
|
|
* @since 1.37
|
|
|
|
|
* @param string $metadataString
|
|
|
|
|
*/
|
|
|
|
|
protected function loadMetadataFromString( $metadataString ) {
|
|
|
|
|
$this->extraDataLoaded = true;
|
2021-06-02 04:34:38 +00:00
|
|
|
$this->metadataArray = [];
|
|
|
|
|
$this->metadataBlobs = [];
|
|
|
|
|
$this->unloadedMetadataBlobs = [];
|
2021-05-19 00:24:32 +00:00
|
|
|
$metadataString = (string)$metadataString;
|
|
|
|
|
if ( $metadataString === '' ) {
|
2021-06-04 06:46:47 +00:00
|
|
|
$this->metadataSerializationFormat = self::MDS_EMPTY;
|
2021-06-02 04:34:38 +00:00
|
|
|
return;
|
|
|
|
|
}
|
|
|
|
|
if ( $metadataString[0] === '{' ) {
|
|
|
|
|
$envelope = $this->jsonDecode( $metadataString );
|
|
|
|
|
if ( !$envelope ) {
|
|
|
|
|
// Legacy error encoding
|
|
|
|
|
$this->metadataArray = [ '_error' => $metadataString ];
|
2021-06-04 06:46:47 +00:00
|
|
|
$this->metadataSerializationFormat = self::MDS_LEGACY;
|
2021-06-02 04:34:38 +00:00
|
|
|
} else {
|
2021-06-04 06:46:47 +00:00
|
|
|
$this->metadataSerializationFormat = self::MDS_JSON;
|
2021-06-02 04:34:38 +00:00
|
|
|
if ( isset( $envelope['data'] ) ) {
|
|
|
|
|
$this->metadataArray = $envelope['data'];
|
|
|
|
|
}
|
|
|
|
|
if ( isset( $envelope['blobs'] ) ) {
|
|
|
|
|
$this->metadataBlobs = $this->unloadedMetadataBlobs = $envelope['blobs'];
|
|
|
|
|
}
|
|
|
|
|
}
|
2021-05-19 00:24:32 +00:00
|
|
|
} else {
|
|
|
|
|
// phpcs:ignore Generic.PHP.NoSilencedErrors.Discouraged
|
|
|
|
|
$data = @unserialize( $metadataString );
|
|
|
|
|
if ( !is_array( $data ) ) {
|
|
|
|
|
// Legacy error encoding
|
2021-06-02 04:34:38 +00:00
|
|
|
$data = [ '_error' => $metadataString ];
|
2021-06-04 06:46:47 +00:00
|
|
|
$this->metadataSerializationFormat = self::MDS_LEGACY;
|
|
|
|
|
} else {
|
|
|
|
|
$this->metadataSerializationFormat = self::MDS_PHP;
|
2021-05-19 00:24:32 +00:00
|
|
|
}
$this->metadataArray = $data;
}
}

	/**
	 * @stable to override
	 * @return int
	 */
	public function getBitDepth() {
		$this->load();

		return (int)$this->bits;
	}

	/**
	 * Returns the size of the image file, in bytes
	 * @stable to override
	 * @return int
	 */
	public function getSize() {
		$this->load();

		return $this->size;
	}

	/**
	 * Returns the MIME type of the file.
	 * @stable to override
	 * @return string
	 */
	public function getMimeType() {
		$this->load();

		return $this->mime;
	}

	/**
	 * Returns the type of the media in the file.
	 * Use the value returned by this function with the MEDIATYPE_xxx constants.
	 * @stable to override
	 * @return string
	 */
	public function getMediaType() {
		$this->load();

		return $this->media_type;
	}

	/** canRender inherited */
	/** mustRender inherited */
	/** allowInlineDisplay inherited */
	/** isSafeFile inherited */
	/** isTrustedFile inherited */

	/**
	 * Returns true if the file exists on disk.
	 * @stable to override
	 * @return bool Whether the file exists on disk.
	 */
	public function exists() {
		$this->load();

		return $this->fileExists;
	}

	/** getTransformScript inherited */
	/** getUnscaledThumb inherited */
	/** thumbName inherited */
	/** createThumb inherited */
	/** transform inherited */

	/** getHandler inherited */
	/** iconThumb inherited */
	/** getLastError inherited */

	/**
	 * Get all thumbnail names previously generated for this file
	 * @stable to override
	 * @param string|bool $archiveName Name of an archive file, default false
	 * @return array First element is the base dir, then files in that base dir.
	 */
	protected function getThumbnails( $archiveName = false ) {
		if ( $archiveName ) {
			$dir = $this->getArchiveThumbPath( $archiveName );
		} else {
			$dir = $this->getThumbPath();
		}

		$backend = $this->repo->getBackend();
		$files = [ $dir ];
		try {
			$iterator = $backend->getFileList( [ 'dir' => $dir ] );
			if ( $iterator !== null ) {
				foreach ( $iterator as $file ) {
					$files[] = $file;
				}
			}
		} catch ( FileBackendError $e ) {
		} // suppress (T56674)

		return $files;
	}

	/**
	 * Refresh metadata in memcached, but don't touch thumbnails or CDN
	 */
	private function purgeMetadataCache() {
		$this->invalidateCache();
	}

	/**
	 * Delete all previously generated thumbnails, refresh metadata in memcached and purge the CDN.
	 * @stable to override
	 *
	 * @param array $options An array potentially with the key forThumbRefresh.
	 *
	 * @note This used to purge old thumbnails by default as well, but doesn't anymore.
	 */
	public function purgeCache( $options = [] ) {
		// Refresh metadata cache
		$this->maybeUpgradeRow();
		$this->purgeMetadataCache();

		// Delete thumbnails
		$this->purgeThumbnails( $options );

		// Purge CDN cache for this file
		$hcu = MediaWikiServices::getInstance()->getHtmlCacheUpdater();
		$hcu->purgeUrls(
			$this->getUrl(),
			!empty( $options['forThumbRefresh'] )
				? $hcu::PURGE_PRESEND // just a manual purge
				: $hcu::PURGE_INTENT_TXROUND_REFLECTED
		);
	}

	/**
	 * Delete cached transformed files for an archived version only.
	 * @stable to override
	 * @param string $archiveName Name of the archived file
	 */
	public function purgeOldThumbnails( $archiveName ) {
		// Get a list of old thumbnails
		$thumbs = $this->getThumbnails( $archiveName );

		// Delete thumbnails from storage, and prevent the directory itself from being purged
		$dir = array_shift( $thumbs );
		$this->purgeThumbList( $dir, $thumbs );

		$urls = [];
		foreach ( $thumbs as $thumb ) {
			$urls[] = $this->getArchiveThumbUrl( $archiveName, $thumb );
		}

		// Purge any custom thumbnail caches
		$this->getHookRunner()->onLocalFilePurgeThumbnails( $this, $archiveName, $urls );

		// Purge the CDN
		$hcu = MediaWikiServices::getInstance()->getHtmlCacheUpdater();
		$hcu->purgeUrls( $urls, $hcu::PURGE_PRESEND );
	}

	/**
	 * Delete cached transformed files for the current version only.
	 * @stable to override
	 * @param array $options
	 * @phan-param array{forThumbRefresh?:bool} $options
	 */
	public function purgeThumbnails( $options = [] ) {
		$thumbs = $this->getThumbnails();

		// Delete thumbnails from storage, and prevent the directory itself from being purged
		$dir = array_shift( $thumbs );
		$this->purgeThumbList( $dir, $thumbs );

		// Always purge all files from CDN regardless of handler filters
		$urls = [];
		foreach ( $thumbs as $thumb ) {
			$urls[] = $this->getThumbUrl( $thumb );
		}

		// Give the media handler a chance to filter the file purge list
		if ( !empty( $options['forThumbRefresh'] ) ) {
			$handler = $this->getHandler();
			if ( $handler ) {
				$handler->filterThumbnailPurgeList( $thumbs, $options );
			}
		}

		// Purge any custom thumbnail caches
		$this->getHookRunner()->onLocalFilePurgeThumbnails( $this, false, $urls );

		// Purge the CDN
		$hcu = MediaWikiServices::getInstance()->getHtmlCacheUpdater();
		$hcu->purgeUrls(
			$urls,
			!empty( $options['forThumbRefresh'] )
				? $hcu::PURGE_PRESEND // just a manual purge
				: $hcu::PURGE_INTENT_TXROUND_REFLECTED
		);
	}

	/**
	 * Prerenders a configurable set of thumbnails
	 * @stable to override
	 *
	 * @since 1.28
	 */
	public function prerenderThumbnails() {
		global $wgUploadThumbnailRenderMap;

		$jobs = [];

		$sizes = $wgUploadThumbnailRenderMap;
		rsort( $sizes );

		foreach ( $sizes as $size ) {
			if ( $this->isMultipage() ) {
				for ( $page = 1; $page <= $this->pageCount(); $page++ ) {
					$jobs[] = new ThumbnailRenderJob(
						$this->getTitle(),
						[ 'transformParams' => [
							'width' => $size,
							'page' => $page,
						] ]
					);
				}
			} elseif ( $this->isVectorized() || $this->getWidth() > $size ) {
				$jobs[] = new ThumbnailRenderJob(
					$this->getTitle(),
					[ 'transformParams' => [ 'width' => $size ] ]
				);
			}
		}

		if ( $jobs ) {
			JobQueueGroup::singleton()->lazyPush( $jobs );
		}
	}

	/**
	 * Delete a list of thumbnails visible at urls
	 * @stable to override
	 * @param string $dir Base dir of the files.
	 * @param array $files Array of strings: relative filenames (to $dir)
	 */
	protected function purgeThumbList( $dir, $files ) {
		$fileListDebug = strtr(
			var_export( $files, true ),
			[ "\n" => '' ]
		);
		wfDebug( __METHOD__ . ": $fileListDebug" );

		if ( $this->repo->supportsSha1URLs() ) {
			$reference = $this->getSha1();
		} else {
			$reference = $this->getName();
		}

		$purgeList = [];
		foreach ( $files as $file ) {
			# Check that the reference (filename or sha1) is part of the thumb name
			# This is a basic sanity check to avoid erasing unrelated directories
			if ( strpos( $file, $reference ) !== false
				|| strpos( $file, "-thumbnail" ) !== false // "short" thumb name
			) {
				$purgeList[] = "{$dir}/{$file}";
			}
		}

		# Delete the thumbnails
		$this->repo->quickPurgeBatch( $purgeList );
		# Clear out the thumbnail directory if empty
		$this->repo->quickCleanDir( $dir );
	}

	/** purgeDescription inherited */
	/** purgeEverything inherited */

	/**
	 * @stable to override
	 * @param int|null $limit Optional: Limit to number of results
	 * @param string|int|null $start Optional: Timestamp, start from
	 * @param string|int|null $end Optional: Timestamp, end at
	 * @param bool $inc Whether rows whose timestamp equals $start or $end are included
	 * @return OldLocalFile[]
	 */
	public function getHistory( $limit = null, $start = null, $end = null, $inc = true ) {
		if ( !$this->exists() ) {
			return []; // Avoid hard failure when the file does not exist. T221812
		}

		$dbr = $this->repo->getReplicaDB();
		$oldFileQuery = OldLocalFile::getQueryInfo();

		$tables = $oldFileQuery['tables'];
		$fields = $oldFileQuery['fields'];
		$join_conds = $oldFileQuery['joins'];
		$conds = $opts = [];
		// Appended to "<" and ">" below, giving "<=" and ">=" when $inc is true
		$eq = $inc ? '=' : '';
		$conds[] = "oi_name = " . $dbr->addQuotes( $this->title->getDBkey() );

		if ( $start ) {
			$conds[] = "oi_timestamp <$eq " . $dbr->addQuotes( $dbr->timestamp( $start ) );
		}

		if ( $end ) {
			$conds[] = "oi_timestamp >$eq " . $dbr->addQuotes( $dbr->timestamp( $end ) );
		}

		if ( $limit ) {
			$opts['LIMIT'] = $limit;
		}

		// Search backwards for time > x queries
		$order = ( !$start && $end !== null ) ? 'ASC' : 'DESC';
		$opts['ORDER BY'] = "oi_timestamp $order";
		$opts['USE INDEX'] = [ 'oldimage' => 'oi_name_timestamp' ];

		$this->getHookRunner()->onLocalFile__getHistory( $this, $tables, $fields,
			$conds, $opts, $join_conds );

		$res = $dbr->select( $tables, $fields, $conds, __METHOD__, $opts, $join_conds );
		$r = [];

		foreach ( $res as $row ) {
			$r[] = $this->repo->newFileFromRow( $row );
		}

		if ( $order == 'ASC' ) {
			$r = array_reverse( $r ); // make sure it ends up descending
		}

		return $r;
	}

	/**
	 * Returns the history of this file, line by line.
	 * Starts with the current version, then old versions.
	 * Uses $this->historyLine to check which line to return:
	 *  0      return line for current version
	 *  1      query for old versions, return first one
	 *  2, ... return next old version from above query
	 * @stable to override
	 * @return stdClass|bool
	 */
	public function nextHistoryLine() {
		if ( !$this->exists() ) {
			return false; // Avoid hard failure when the file does not exist. T221812
		}

		# Polymorphic function name to distinguish foreign and local fetches
		$fname = static::class . '::' . __FUNCTION__;

		$dbr = $this->repo->getReplicaDB();

		if ( $this->historyLine == 0 ) { // called for the first time, return line from cur
			$fileQuery = self::getQueryInfo();
			$this->historyRes = $dbr->select( $fileQuery['tables'],
				$fileQuery['fields'] + [
					'oi_archive_name' => $dbr->addQuotes( '' ),
					'oi_deleted' => 0,
				],
				[ 'img_name' => $this->title->getDBkey() ],
				$fname,
				[],
				$fileQuery['joins']
			);

			if ( $dbr->numRows( $this->historyRes ) == 0 ) {
				$this->historyRes = null;

				return false;
			}
		} elseif ( $this->historyLine == 1 ) {
			$fileQuery = OldLocalFile::getQueryInfo();
			$this->historyRes = $dbr->select(
				$fileQuery['tables'],
				$fileQuery['fields'],
				[ 'oi_name' => $this->title->getDBkey() ],
				$fname,
				[ 'ORDER BY' => 'oi_timestamp DESC' ],
				$fileQuery['joins']
			);
		}
		$this->historyLine++;

		return $dbr->fetchObject( $this->historyRes );
	}

	/**
	 * Reset the history pointer to the first element of the history
	 * @stable to override
	 */
	public function resetHistory() {
		$this->historyLine = 0;

		if ( $this->historyRes !== null ) {
			$this->historyRes = null;
		}
	}

	/** getHashPath inherited */
	/** getRel inherited */
	/** getUrlRel inherited */
	/** getArchiveRel inherited */
	/** getArchivePath inherited */
	/** getThumbPath inherited */
	/** getArchiveUrl inherited */
	/** getThumbUrl inherited */
	/** getArchiveVirtualUrl inherited */
	/** getThumbVirtualUrl inherited */
	/** isHashed inherited */

	/**
	 * Upload a file and record it in the DB
	 * @param string|FSFile $src Source storage path, virtual URL, or filesystem path
	 * @param string $comment Upload description
	 * @param string $pageText Text to use for the new description page,
	 *   if a new description page is created
	 * @param int|bool $flags Flags for publish()
	 * @param array|bool $props File properties, if known. This can be used to
	 *   reduce the upload time when uploading virtual URLs for which the file
	 *   info is already known
	 * @param string|bool $timestamp Timestamp for img_timestamp, or false to use the
	 *   current time
	 * @param Authority|null $uploader Uploader, or null to use the context authority
	 * @param string[] $tags Change tags to add to the log entry and page revision.
	 *   (This doesn't check $uploader's permissions.)
	 * @param bool $createNullRevision Set to false to avoid creation of a null revision on file
	 *   upload, see T193621
	 * @param bool $revert If this file upload is a revert
	 * @return Status On success, the value member contains the
	 *   archive name, or an empty string if it was a new file.
	 */
	public function upload( $src, $comment, $pageText, $flags = 0, $props = false,
		$timestamp = false, Authority $uploader = null, $tags = [],
		$createNullRevision = true, $revert = false
	) {
		if ( $this->getRepo()->getReadOnlyReason() !== false ) {
			return $this->readOnlyFatalStatus();
		} elseif ( MediaWikiServices::getInstance()->getRevisionStore()->isReadOnly() ) {
			// Check this in advance to avoid writing to FileBackend and the file tables,
			// only to fail on inserting the revision due to the text store being unavailable.
			return $this->readOnlyFatalStatus();
		}

		$srcPath = ( $src instanceof FSFile ) ? $src->getPath() : $src;
		if ( !$props ) {
			if ( FileRepo::isVirtualUrl( $srcPath )
				|| FileBackend::isStoragePath( $srcPath )
			) {
				$props = $this->repo->getFileProps( $srcPath );
			} else {
				$mwProps = new MWFileProps( MediaWikiServices::getInstance()->getMimeAnalyzer() );
				$props = $mwProps->getPropsFromPath( $srcPath, true );
			}
		}

		$options = [];
		$handler = MediaHandler::getHandler( $props['mime'] );
		if ( $handler ) {
not future compatible. Image metadata is expected to shrink in
future.
Bug: T275268
Change-Id: I039785d5b6439d71dcc21dcb972177dba5c3a67d
2021-05-19 00:24:32 +00:00
|
|
|
if ( is_string( $props['metadata'] ) ) {
|
|
|
|
|
// This supports callers directly fabricating a metadata
|
|
|
|
|
// property using serialize(). Normally the metadata property
|
|
|
|
|
// comes from MWFileProps, in which case it won't be a string.
|
|
|
|
|
// phpcs:ignore Generic.PHP.NoSilencedErrors.Discouraged
|
|
|
|
|
$metadata = @unserialize( $props['metadata'] );
|
|
|
|
|
} else {
|
|
|
|
|
$metadata = $props['metadata'];
|
2017-05-30 11:19:49 +00:00
|
|
|
}
|
|
|
|
|
|
2021-05-19 00:24:32 +00:00
|
|
|
if ( is_array( $metadata ) ) {
|
|
|
|
|
$options['headers'] = $handler->getContentHeaders( $metadata );
|
|
|
|
|
}
|
2012-11-20 01:38:17 +00:00
|
|
|
} else {
|
2016-02-17 09:09:32 +00:00
|
|
|
$options['headers'] = [];
|
2012-11-20 01:38:17 +00:00
|
|
|
}
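The branch above accepts `$props['metadata']` either as an array or as a serialize()d string, and only asks the handler for content headers when the result is an array. A minimal standalone sketch of that normalization follows. `normalizeMetadata()` is a hypothetical helper, not a MediaWiki function, and the `allowed_classes` option is an extra precaution that the code above does not use.

```php
<?php
/**
 * Hypothetical helper: return metadata as an array, or null if unusable.
 *
 * @param mixed $raw An array, or a string produced by serialize()
 * @return array|null
 */
function normalizeMetadata( $raw ): ?array {
	if ( is_string( $raw ) ) {
		// unserialize() returns false and raises a notice or warning on
		// malformed input; the @ operator silences it, as in the code above.
		// Assumption: objects are never wanted here, so classes are disallowed.
		$raw = @unserialize( $raw, [ 'allowed_classes' => false ] );
	}
	// Anything that is not an array (false, scalars) is treated as unusable.
	return is_array( $raw ) ? $raw : null;
}

// Usage: both input forms yield the same array; junk input yields null.
var_dump( normalizeMetadata( serialize( [ 'width' => 10 ] ) ) );
var_dump( normalizeMetadata( [ 'width' => 10 ] ) );
var_dump( normalizeMetadata( 'not serialized' ) );
```

Returning null rather than an empty array keeps "no usable metadata" distinct from "metadata that happens to be empty", which mirrors the `is_array()` guard above.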
|
|
|
|
|
|
2012-12-01 19:11:21 +00:00
|
|
|
// Trim spaces on user supplied text
|
|
|
|
|
$comment = trim( $comment );
|
|
|
|
|
|
2019-01-09 16:01:09 +00:00
|
|
|
$this->lock();
|
2016-03-02 23:42:36 +00:00
|
|
|
$status = $this->publish( $src, $flags, $options );
|
2010-09-04 13:48:16 +00:00
|
|
|
|
2014-04-18 19:34:52 +00:00
|
|
|
if ( $status->successCount >= 2 ) {
|
|
|
|
|
// There will be a copy plus one of move, copy or store.
|
|
|
|
|
// The first succeeding does not commit us to updating the DB
|
|
|
|
|
// since it simply copied the current version to a timestamped file name.
|
|
|
|
|
// It is only *preferable* to avoid leaving such files orphaned.
|
|
|
|
|
// Once the second operation goes through, then the current version was
|
|
|
|
|
// updated and we must therefore update the DB too.
|
2016-01-21 17:14:40 +00:00
|
|
|
$oldver = $status->value;
|
2020-09-24 23:36:30 +00:00
|
|
|
|
2021-05-27 19:56:40 +00:00
|
|
|
if ( $uploader === null ) {
|
|
|
|
|
// The uploader argument is optional; fall back to the context authority.
|
|
|
|
|
$uploader = RequestContext::getMain()->getAuthority();
|
2020-09-24 23:36:30 +00:00
|
|
|
}
|
|
|
|
|
|
2020-06-16 01:45:46 +00:00
|
|
|
$uploadStatus = $this->recordUpload3(
|
Block same-file reuploads
When uploading a file, there are a few ways of checking for and blocking
(or at least warning about) duplicate uploads already.
However, it occasionally seems to happen that files get uploaded twice.
The exact same file, usually - submitted at (almost) the exact same time
(possibly some error in whatever submits the file upload, but still)
Given 2 uploads at (almost) the exact same time, both of them are stored,
even if they are the exact same files.
The last upload also ends up with a `logging` entry with `log_page = 0`.
I don’t believe such upload should go through: if we do find that a file
is an exact duplicate of something that already exists, I don’t see any
reason it should go through.
Note that with this patch, it will become impossible to reupload a file
with the exact same hash (which was possible before.)
If we still want to allow same-file reuploads while also blocking these
kind of race-condition same-uploads, we could make the check more strict
(e.g. also check timestamps, or check if page already exists, or …)
Bug: T158480
Change-Id: I76cbd2c64c3b893997f1f85974d6f82cbfe121e1
2017-09-18 13:35:42 +00:00
|
|
|
$oldver,
|
|
|
|
|
$comment,
|
|
|
|
|
$pageText,
|
2021-05-27 19:56:40 +00:00
|
|
|
$uploader,
|
2017-09-18 13:35:42 +00:00
|
|
|
$props,
|
|
|
|
|
$timestamp,
|
2018-05-07 13:41:53 +00:00
|
|
|
$tags,
|
2018-10-01 18:05:44 +00:00
|
|
|
$createNullRevision,
|
|
|
|
|
$revert
|
2017-09-18 13:35:42 +00:00
|
|
|
);
|
|
|
|
|
if ( !$uploadStatus->isOK() ) {
|
|
|
|
|
if ( $uploadStatus->hasMessage( 'filenotfound' ) ) {
|
|
|
|
|
// update filenotfound error with more specific path
|
|
|
|
|
$status->fatal( 'filenotfound', $srcPath );
|
|
|
|
|
} else {
|
|
|
|
|
$status->merge( $uploadStatus );
|
|
|
|
|
}
|
2007-07-22 14:45:12 +00:00
|
|
|
}
|
2007-06-16 02:55:25 +00:00
|
|
|
}
|
2010-09-04 13:48:16 +00:00
|
|
|
|
2019-01-09 16:01:09 +00:00
|
|
|
$this->unlock();
|
2007-07-22 14:45:12 +00:00
|
|
|
return $status;
|
2007-06-16 02:55:25 +00:00
|
|
|
}
|
|
|
|
|
|
2020-03-30 05:05:18 +00:00
|
|
|
/**
|
|
|
|
|
* Record a file upload in the upload log and the image table (version 3)
|
|
|
|
|
* @since 1.35
|
2020-07-13 08:57:12 +00:00
|
|
|
* @stable to override
|
2020-03-30 05:05:18 +00:00
|
|
|
* @param string $oldver Archive name of the previous version; may be empty (T36993)
|
|
|
|
|
* @param string $comment Upload comment, stored as the image description and log comment
|
|
|
|
|
* @param string $pageText
|
2021-05-27 19:56:40 +00:00
|
|
|
* @param Authority $performer Authority performing the upload
|
2020-03-30 05:05:18 +00:00
|
|
|
* @param bool|array $props File properties, or false to read them from the repo
|
|
|
|
|
* @param string|bool $timestamp Timestamp to record, or false to use the current time
|
|
|
|
|
* @param string[] $tags
|
|
|
|
|
* @param bool $createNullRevision Set to false to avoid creation of a null revision on file
|
|
|
|
|
* upload, see T193621
|
|
|
|
|
* @param bool $revert If this file upload is a revert
|
|
|
|
|
* @return Status
|
|
|
|
|
*/
|
|
|
|
|
public function recordUpload3(
|
|
|
|
|
string $oldver,
|
|
|
|
|
string $comment,
|
|
|
|
|
string $pageText,
|
2021-05-27 19:56:40 +00:00
|
|
|
Authority $performer,
|
2020-03-30 05:05:18 +00:00
|
|
|
$props = false,
|
|
|
|
|
$timestamp = false,
|
|
|
|
|
$tags = [],
|
|
|
|
|
bool $createNullRevision = true,
|
|
|
|
|
bool $revert = false
|
2021-07-22 03:11:47 +00:00
|
|
|
): Status {
|
2021-04-19 01:32:42 +00:00
|
|
|
$dbw = $this->repo->getPrimaryDB();
|
2010-09-04 13:48:16 +00:00
|
|
|
|
2014-05-20 21:40:42 +00:00
|
|
|
# Imports or such might force a certain timestamp; otherwise we generate
|
|
|
|
|
# it and can fudge it slightly to keep (name,timestamp) unique on re-upload.
|
2011-04-02 16:49:47 +00:00
|
|
|
if ( $timestamp === false ) {
|
2014-05-20 21:40:42 +00:00
|
|
|
$timestamp = $dbw->timestamp();
|
|
|
|
|
$allowTimeKludge = true;
|
|
|
|
|
} else {
|
|
|
|
|
$allowTimeKludge = false;
|
2011-04-02 16:49:47 +00:00
|
|
|
}
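The comment above says that a generated timestamp may be fudged to keep (name, timestamp) unique on re-upload, while a caller-supplied one is left alone. The bump rule can be sketched as a pure function on Unix timestamps. `bumpTimestamp()` and its parameter names are hypothetical, and the sketch leaves out the `sleep( 1 )` call and the database timestamp conversion used by the real code.

```php
<?php
/**
 * Hypothetical sketch: choose a timestamp strictly newer than the last version.
 *
 * @param int $proposedUnix Timestamp we would like to record
 * @param int|null $lastUnix Timestamp of the current version, or null if none
 * @param bool $allowKludge False when the caller forced a timestamp (e.g. imports)
 * @return int
 */
function bumpTimestamp( int $proposedUnix, ?int $lastUnix, bool $allowKludge = true ): int {
	if ( $allowKludge && $lastUnix !== null && $proposedUnix <= $lastUnix ) {
		// Not newer than the stored version: move one second past it.
		return $lastUnix + 1;
	}
	return $proposedUnix;
}

// Usage: a re-upload within the same second as the current version.
var_dump( bumpTimestamp( 1000, 1000 ) );
// A forced timestamp is never adjusted.
var_dump( bumpTimestamp( 1000, 1000, false ) );
```

Keeping the rule in one pure function makes the edge cases (no previous version, forced timestamp, clock behind the stored value) easy to check in isolation.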
|
|
|
|
|
|
2015-11-01 22:29:05 +00:00
|
|
|
$props = $props ?: $this->repo->getFileProps( $this->getVirtualUrl() );
|
2008-01-20 12:48:39 +00:00
|
|
|
$props['description'] = $comment;
|
2011-04-02 16:49:47 +00:00
|
|
|
$props['timestamp'] = wfTimestamp( TS_MW, $timestamp ); // DB -> TS_MW
|
2007-06-16 02:55:25 +00:00
|
|
|
$this->setProps( $props );
|
|
|
|
|
|
2010-08-31 11:12:33 +00:00
|
|
|
# Fail now if the file isn't there
|
2007-05-30 21:02:32 +00:00
|
|
|
if ( !$this->fileExists ) {
|
2020-06-01 05:00:39 +00:00
|
|
|
wfDebug( __METHOD__ . ": File " . $this->getRel() . " went missing!" );
|
2013-11-23 20:00:11 +00:00
|
|
|
|
2017-09-18 13:35:42 +00:00
|
|
|
return Status::newFatal( 'filenotfound', $this->getRel() );
|
2007-05-30 21:02:32 +00:00
|
|
|
}
|
|
|
|
|
|
2021-02-15 22:26:08 +00:00
|
|
|
$actorNormalization = MediaWikiServices::getInstance()->getActorNormalization();
|
|
|
|
|
|
2015-11-01 22:29:05 +00:00
|
|
|
$dbw->startAtomic( __METHOD__ );
|
2010-09-04 13:48:16 +00:00
|
|
|
|
2021-05-27 19:56:40 +00:00
|
|
|
$actorId = $actorNormalization->acquireActorId( $performer->getUser(), $dbw );
|
|
|
|
|
$this->user = $performer->getUser();
|
2021-02-15 22:26:08 +00:00
|
|
|
|
2007-05-30 21:02:32 +00:00
|
|
|
# Test to see if the row exists using INSERT IGNORE
|
|
|
|
|
# This avoids race conditions by locking the row until the commit, and also
|
|
|
|
|
# doesn't deadlock. SELECT FOR UPDATE causes a deadlock for every race condition.
|
2018-05-04 13:39:33 +00:00
|
|
|
$commentStore = MediaWikiServices::getInstance()->getCommentStore();
|
2018-03-07 16:40:56 +00:00
|
|
|
$commentFields = $commentStore->insert( $dbw, 'img_description', $comment );
|
2021-02-15 22:26:08 +00:00
|
|
|
$actorFields = [ 'img_actor' => $actorId ];
|
2007-05-30 21:02:32 +00:00
|
|
|
$dbw->insert( 'image',
|
2016-02-17 09:09:32 +00:00
|
|
|
[
|
2013-04-20 17:18:13 +00:00
|
|
|
'img_name' => $this->getName(),
|
|
|
|
|
'img_size' => $this->size,
|
|
|
|
|
'img_width' => intval( $this->width ),
|
|
|
|
|
'img_height' => intval( $this->height ),
|
|
|
|
|
'img_bits' => $this->bits,
|
|
|
|
|
'img_media_type' => $this->media_type,
|
|
|
|
|
'img_major_mime' => $this->major_mime,
|
|
|
|
|
'img_minor_mime' => $this->minor_mime,
|
|
|
|
|
'img_timestamp' => $timestamp,
|
2021-05-19 00:24:32 +00:00
|
|
|
'img_metadata' => $this->getMetadataForDb( $dbw ),
|
2013-04-20 17:18:13 +00:00
|
|
|
'img_sha1' => $this->sha1
|
2017-09-12 17:12:29 +00:00
|
|
|
] + $commentFields + $actorFields,
|
2007-05-30 21:02:32 +00:00
|
|
|
__METHOD__,
|
2019-06-07 17:12:35 +00:00
|
|
|
[ 'IGNORE' ]
|
2007-05-30 21:02:32 +00:00
|
|
|
);
|
2015-11-01 22:29:05 +00:00
|
|
|
$reupload = ( $dbw->affectedRows() == 0 );
|
2017-06-06 17:39:14 +00:00
|
|
|
|
2015-11-01 22:29:05 +00:00
|
|
|
if ( $reupload ) {
|
2017-09-18 13:35:42 +00:00
|
|
|
$row = $dbw->selectRow(
|
|
|
|
|
'image',
|
|
|
|
|
[ 'img_timestamp', 'img_sha1' ],
|
|
|
|
|
[ 'img_name' => $this->getName() ],
|
|
|
|
|
__METHOD__,
|
|
|
|
|
[ 'LOCK IN SHARE MODE' ]
|
|
|
|
|
);
|
|
|
|
|
|
|
|
|
|
if ( $row && $row->img_sha1 === $this->sha1 ) {
|
|
|
|
|
$dbw->endAtomic( __METHOD__ );
|
2020-06-01 05:00:39 +00:00
|
|
|
wfDebug( __METHOD__ . ": File " . $this->getRel() . " already exists!" );
|
2017-09-18 13:35:42 +00:00
|
|
|
$title = Title::newFromText( $this->getName(), NS_FILE );
|
|
|
|
|
return Status::newFatal( 'fileexists-no-change', $title->getPrefixedText() );
|
|
|
|
|
}
|
|
|
|
|
|
2014-05-20 21:40:42 +00:00
|
|
|
if ( $allowTimeKludge ) {
|
2014-10-19 02:47:22 +00:00
|
|
|
# $row was read above with LOCK IN SHARE MODE, which bypasses any transaction snapshotting
|
2017-09-18 13:35:42 +00:00
|
|
|
$lUnixtime = $row ? wfTimestamp( TS_UNIX, $row->img_timestamp ) : false;
|
2014-05-20 21:40:42 +00:00
|
|
|
# Avoid a timestamp that is not newer than the last version
|
|
|
|
|
# TODO: the image/oldimage tables should be like page/revision with an ID field
|
|
|
|
|
if ( $lUnixtime && wfTimestamp( TS_UNIX, $timestamp ) <= $lUnixtime ) {
|
|
|
|
|
sleep( 1 ); // otherwise rapid re-uploads would push timestamps far into the future
|
|
|
|
|
$timestamp = $dbw->timestamp( $lUnixtime + 1 );
|
|
|
|
|
$this->timestamp = wfTimestamp( TS_MW, $timestamp ); // DB -> TS_MW
|
|
|
|
|
}
|
|
|
|
|
}
|
|
|
|
|
|
2017-06-06 17:39:14 +00:00
|
|
|
$tables = [ 'image' ];
|
|
|
|
|
$fields = [
|
|
|
|
|
'oi_name' => 'img_name',
|
|
|
|
|
'oi_archive_name' => $dbw->addQuotes( $oldver ),
|
|
|
|
|
'oi_size' => 'img_size',
|
|
|
|
|
'oi_width' => 'img_width',
|
|
|
|
|
'oi_height' => 'img_height',
|
|
|
|
|
'oi_bits' => 'img_bits',
|
2019-01-04 18:55:11 +00:00
|
|
|
'oi_description_id' => 'img_description_id',
|
2017-06-06 17:39:14 +00:00
|
|
|
'oi_timestamp' => 'img_timestamp',
|
|
|
|
|
'oi_metadata' => 'img_metadata',
|
|
|
|
|
'oi_media_type' => 'img_media_type',
|
|
|
|
|
'oi_major_mime' => 'img_major_mime',
|
|
|
|
|
'oi_minor_mime' => 'img_minor_mime',
|
|
|
|
|
'oi_sha1' => 'img_sha1',
|
2019-07-23 17:40:52 +00:00
|
|
|
'oi_actor' => 'img_actor',
|
2017-06-06 17:39:14 +00:00
|
|
|
];
|
|
|
|
|
$joins = [];
|
|
|
|
|
|
2017-02-20 22:44:19 +00:00
|
|
|
# (T36993) Note: $oldver can be empty here, if the previous
|
2012-04-06 17:38:38 +00:00
|
|
|
# version of the file was broken. Allow registration of the new
|
|
|
|
|
# version to continue anyway, because that's better than having
|
2012-03-07 23:00:42 +00:00
|
|
|
# an image that's not fixable by user operations.
|
2007-05-30 21:02:32 +00:00
|
|
|
# Collision, this is an update of a file
|
|
|
|
|
# Insert previous contents into oldimage
|
2017-06-06 17:39:14 +00:00
|
|
|
$dbw->insertSelect( 'oldimage', $tables, $fields,
|
|
|
|
|
[ 'img_name' => $this->getName() ], __METHOD__, [], [], $joins );
|
2007-05-30 21:02:32 +00:00
|
|
|
|
|
|
|
|
# Update the current image row
|
|
|
|
|
$dbw->update( 'image',
|
2016-02-17 09:09:32 +00:00
|
|
|
[
|
2013-11-23 20:00:11 +00:00
|
|
|
'img_size' => $this->size,
|
|
|
|
|
'img_width' => intval( $this->width ),
|
|
|
|
|
'img_height' => intval( $this->height ),
|
|
|
|
|
'img_bits' => $this->bits,
|
|
|
|
|
'img_media_type' => $this->media_type,
|
|
|
|
|
'img_major_mime' => $this->major_mime,
|
|
|
|
|
'img_minor_mime' => $this->minor_mime,
|
|
|
|
|
'img_timestamp' => $timestamp,
|
2021-05-19 00:24:32 +00:00
|
|
|
'img_metadata' => $this->getMetadataForDb( $dbw ),
|
2013-11-23 20:00:11 +00:00
|
|
|
'img_sha1' => $this->sha1
|
2017-09-12 17:12:29 +00:00
|
|
|
] + $commentFields + $actorFields,
|
2016-02-17 09:09:32 +00:00
|
|
|
[ 'img_name' => $this->getName() ],
|
2011-12-26 23:35:40 +00:00
|
|
|
__METHOD__
|
2007-05-30 21:02:32 +00:00
|
|
|
);
|
|
|
|
|
}
|
|
|
|
|
|
|
|
|
|
$descTitle = $this->getTitle();
|
2015-05-17 15:36:29 +00:00
|
|
|
$descId = $descTitle->getArticleID();
|
2011-11-16 20:54:40 +00:00
|
|
|
$wikiPage = new WikiFilePage( $descTitle );
|
|
|
|
|
$wikiPage->setFile( $this );
|
2007-06-16 02:55:25 +00:00
|
|
|
|
2018-10-01 18:05:44 +00:00
|
|
|
// Determine log action. If reupload is done by reverting, use a special log_action.
|
2021-08-17 19:52:34 +00:00
|
|
|
if ( $revert ) {
|
2018-10-01 18:05:44 +00:00
|
|
|
$logAction = 'revert';
|
2021-08-17 19:52:34 +00:00
|
|
|
} elseif ( $reupload ) {
|
2018-10-01 18:05:44 +00:00
|
|
|
$logAction = 'overwrite';
|
|
|
|
|
} else {
|
|
|
|
|
$logAction = 'upload';
|
|
|
|
|
}
|
2015-11-01 22:29:05 +00:00
|
|
|
// Add the log entry...
|
2018-10-01 18:05:44 +00:00
|
|
|
$logEntry = new ManualLogEntry( 'upload', $logAction );
|
2016-01-24 20:31:34 +00:00
|
|
|
$logEntry->setTimestamp( $this->timestamp );
|
2021-05-27 19:56:40 +00:00
|
|
|
$logEntry->setPerformer( $performer->getUser() );
|
2013-04-17 23:39:58 +00:00
|
|
|
$logEntry->setComment( $comment );
|
|
|
|
|
$logEntry->setTarget( $descTitle );
|
|
|
|
|
// Allow people using the API to associate log entries with the upload.
|
|
|
|
|
// The log entry has its own timestamp, which sometimes differs from the upload timestamp.
|
|
|
|
|
$logEntry->setParameters(
|
2016-02-17 09:09:32 +00:00
|
|
|
[
|
2013-04-17 23:39:58 +00:00
|
|
|
'img_sha1' => $this->sha1,
|
|
|
|
|
'img_timestamp' => $timestamp,
|
2016-02-17 09:09:32 +00:00
|
|
|
]
|
2013-04-17 23:39:58 +00:00
|
|
|
);
|
|
|
|
|
// Note we keep $logId around since during new image
|
|
|
|
|
// creation, page doesn't exist yet, so log_page = 0
|
|
|
|
|
// but we want it to point to the page we're making,
|
|
|
|
|
// so we later modify the log entry.
|
2013-06-18 00:11:44 +00:00
|
|
|
// For a similar reason, we avoid making an RC entry
|
|
|
|
|
// now and wait until the page exists.
|
2013-04-17 23:39:58 +00:00
|
|
|
$logId = $logEntry->insert();
|
2007-05-30 21:02:32 +00:00
|
|
|
|
2015-11-01 22:29:05 +00:00
|
|
|
if ( $descTitle->exists() ) {
|
2021-08-17 19:52:34 +00:00
|
|
|
if ( $createNullRevision ) {
|
2020-04-11 05:08:23 +00:00
|
|
|
$revStore = MediaWikiServices::getInstance()->getRevisionStore();
|
|
|
|
|
// Use own context to get the action text in content language
|
|
|
|
|
$formatter = LogFormatter::newFromEntry( $logEntry );
|
|
|
|
|
$formatter->setContext( RequestContext::newExtraneousContext( $descTitle ) );
|
|
|
|
|
$editSummary = $formatter->getPlainActionText();
|
|
|
|
|
$summary = CommentStoreComment::newUnsavedComment( $editSummary );
|
|
|
|
|
$nullRevRecord = $revStore->newNullRevision(
|
|
|
|
|
$dbw,
|
|
|
|
|
$descTitle,
|
|
|
|
|
$summary,
|
|
|
|
|
false,
|
2021-05-27 19:56:40 +00:00
|
|
|
$performer->getUser()
|
2015-11-01 22:29:05 +00:00
|
|
|
);
|
2020-04-11 05:08:23 +00:00
|
|
|
|
|
|
|
|
if ( $nullRevRecord ) {
|
|
|
|
|
$inserted = $revStore->insertRevisionOn( $nullRevRecord, $dbw );
|
|
|
|
|
|
Hooks::run() call site migration
Migrate all callers of Hooks::run() to use the new
HookContainer/HookRunner system.
General principles:
* Use DI if it is already used. We're not changing the way state is
managed in this patch.
* HookContainer is always injected, not HookRunner. HookContainer
is a service, it's a more generic interface, it is the only
thing that provides isRegistered() which is needed in some cases,
and a HookRunner can be efficiently constructed from it
(confirmed by benchmark). Because HookContainer is needed
for object construction, it is also needed by all factories.
* "Ask your friendly local base class". Big hierarchies like
SpecialPage and ApiBase have getHookContainer() and getHookRunner()
methods in the base class, and classes that extend that base class
are not expected to know or care where the base class gets its
HookContainer from.
* ProtectedHookAccessorTrait provides protected getHookContainer() and
getHookRunner() methods, getting them from the global service
container. The point of this is to ease migration to DI by ensuring
that call sites ask their local friendly base class rather than
getting a HookRunner from the service container directly.
* Private $this->hookRunner. In some smaller classes where accessor
methods did not seem warranted, there is a private HookRunner property
which is accessed directly. Very rarely (two cases), there is a
protected property, for consistency with code that conventionally
assumes protected=private, but in cases where the class might actually
be overridden, a protected accessor is preferred over a protected
property.
* The last resort: Hooks::runner(). Mostly for static, file-scope and
global code. In a few cases it was used for objects with broken
construction schemes, out of horror or laziness.
Constructors with new required arguments:
* AuthManager
* BadFileLookup
* BlockManager
* ClassicInterwikiLookup
* ContentHandlerFactory
* ContentSecurityPolicy
* DefaultOptionsManager
* DerivedPageDataUpdater
* FullSearchResultWidget
* HtmlCacheUpdater
* LanguageFactory
* LanguageNameUtils
* LinkRenderer
* LinkRendererFactory
* LocalisationCache
* MagicWordFactory
* MessageCache
* NamespaceInfo
* PageEditStash
* PageHandlerFactory
* PageUpdater
* ParserFactory
* PermissionManager
* RevisionStore
* RevisionStoreFactory
* SearchEngineConfig
* SearchEngineFactory
* SearchFormWidget
* SearchNearMatcher
* SessionBackend
* SpecialPageFactory
* UserNameUtils
* UserOptionsManager
* WatchedItemQueryService
* WatchedItemStore
Constructors with new optional arguments:
* DefaultPreferencesFactory
* Language
* LinkHolderArray
* MovePage
* Parser
* ParserCache
* PasswordReset
* Router
setHookContainer() now required after construction:
* AuthenticationProvider
* ResourceLoaderModule
* SearchEngine
Change-Id: Id442b0dbe43aba84bd5cf801d86dedc768b082c7
2020-03-19 02:42:09 +00:00
|
|
|
$this->getHookRunner()->onRevisionFromEditComplete(
|
|
|
|
|
$wikiPage,
|
|
|
|
|
$inserted,
|
|
|
|
|
$inserted->getParentId(),
|
2021-05-27 19:56:40 +00:00
|
|
|
$performer->getUser(),
|
2020-03-19 02:42:09 +00:00
|
|
|
$tags
|
2020-04-16 01:52:24 +00:00
|
|
|
);
|
|
|
|
|
|
2020-04-24 00:55:09 +00:00
|
|
|
$wikiPage->updateRevisionOn( $dbw, $inserted );
|
2020-04-11 05:08:23 +00:00
|
|
|
// Associate null revision id
|
|
|
|
|
$logEntry->setAssociatedRevId( $inserted->getId() );
|
|
|
|
|
}
|
2011-07-20 00:15:05 +00:00
|
|
|
}
|
2014-04-06 02:50:10 +00:00
|
|
|
|
2015-11-01 22:29:05 +00:00
|
|
|
$newPageContent = null;
|
2007-05-30 21:02:32 +00:00
|
|
|
} else {
|
2015-11-01 22:29:05 +00:00
|
|
|
// Make the description page and RC log entry post-commit
|
|
|
|
|
$newPageContent = ContentHandler::makeContent( $pageText, $descTitle );
|
2007-05-30 21:02:32 +00:00
|
|
|
}
|
2013-06-18 00:11:44 +00:00
|
|
|
|
2021-01-08 11:03:25 +00:00
|
|
|
// NOTE: Even after ending this atomic section, we are probably still in the transaction
|
|
|
|
|
// started by the call to lock() in publishTo(). We cannot yet safely schedule jobs,
|
|
|
|
|
// see T263301.
|
2015-11-01 22:29:05 +00:00
|
|
|
$dbw->endAtomic( __METHOD__ );
|
2018-09-30 14:57:54 +00:00
|
|
|
$fname = __METHOD__;
|
2015-11-01 22:29:05 +00:00
|
|
|
|
2015-10-06 19:02:11 +00:00
|
|
|
# Do some cache purges after final commit so that:
|
|
|
|
|
# a) Changes are more likely to be seen post-purge
|
|
|
|
|
# b) They won't cause rollback of the log publish/update above
|
2021-01-08 11:03:25 +00:00
|
|
|
$purgeUpdate = new AutoCommitUpdate(
|
|
|
|
|
$dbw,
|
|
|
|
|
__METHOD__,
|
|
|
|
|
/** @suppress PhanTypeArraySuspiciousNullable False positives with $this->status->value */
|
|
|
|
|
function () use (
|
2021-05-27 19:56:40 +00:00
|
|
|
$reupload, $wikiPage, $newPageContent, $comment, $performer,
|
2021-01-08 11:03:25 +00:00
|
|
|
$logEntry, $logId, $descId, $tags, $fname
|
|
|
|
|
) {
|
|
|
|
|
# Update memcache after the commit
|
|
|
|
|
$this->invalidateCache();
|
|
|
|
|
|
|
|
|
|
$updateLogPage = false;
|
|
|
|
|
if ( $newPageContent ) {
|
|
|
|
|
# New file page; create the description page.
|
|
|
|
|
# There's already a log entry, so don't make a second RC entry
|
2021-06-24 08:42:19 +00:00
|
|
|
# CDN and file cache for the description page are purged by doUserEditContent.
|
|
|
|
|
$status = $wikiPage->doUserEditContent(
|
2021-01-08 11:03:25 +00:00
|
|
|
$newPageContent,
|
2021-06-24 08:42:19 +00:00
|
|
|
$performer,
|
2021-01-08 11:03:25 +00:00
|
|
|
$comment,
|
2021-06-24 08:42:19 +00:00
|
|
|
EDIT_NEW | EDIT_SUPPRESS_RC
|
2021-01-08 11:03:25 +00:00
|
|
|
);
|
2016-07-21 19:43:26 +00:00
|
|
|
|
2021-01-08 11:03:25 +00:00
|
|
|
if ( isset( $status->value['revision-record'] ) ) {
|
|
|
|
|
/** @var RevisionRecord $revRecord */
|
|
|
|
|
$revRecord = $status->value['revision-record'];
|
|
|
|
|
// Associate new page revision id
|
|
|
|
|
$logEntry->setAssociatedRevId( $revRecord->getId() );
|
2016-07-21 19:43:26 +00:00
|
|
|
}
|
2021-01-08 11:03:25 +00:00
|
|
|
// This relies on the resetArticleID() call in WikiPage::insertOn(),
|
2021-06-24 08:42:19 +00:00
|
|
|
// which is triggered on $descTitle by doUserEditContent() above.
|
2021-01-08 11:03:25 +00:00
|
|
|
if ( isset( $status->value['revision-record'] ) ) {
|
|
|
|
|
/** @var RevisionRecord $revRecord */
|
|
|
|
|
$revRecord = $status->value['revision-record'];
|
|
|
|
|
$updateLogPage = $revRecord->getPageId();
|
2016-07-21 19:43:26 +00:00
|
|
|
}
|
2021-01-08 11:03:25 +00:00
|
|
|
} else {
|
|
|
|
|
# Existing file page: invalidate description page cache
|
|
|
|
|
$title = $wikiPage->getTitle();
|
|
|
|
|
$title->invalidateCache();
|
|
|
|
|
$hcu = MediaWikiServices::getInstance()->getHtmlCacheUpdater();
|
|
|
|
|
$hcu->purgeTitleUrls( $title, $hcu::PURGE_INTENT_TXROUND_REFLECTED );
|
|
|
|
|
# Allow the new file version to be patrolled from the page footer
|
|
|
|
|
Article::purgePatrolFooterCache( $descId );
|
|
|
|
|
}
|
2016-07-21 19:43:26 +00:00
|
|
|
|
2021-01-08 11:03:25 +00:00
|
|
|
# Update associated rev id. This should be done by $logEntry->insert() earlier,
|
|
|
|
|
# but setAssociatedRevId() wasn't called at that point yet...
|
|
|
|
|
$logParams = $logEntry->getParameters();
|
|
|
|
|
$logParams['associated_rev_id'] = $logEntry->getAssociatedRevId();
|
|
|
|
|
$update = [ 'log_params' => LogEntryBase::makeParamBlob( $logParams ) ];
|
|
|
|
|
if ( $updateLogPage ) {
|
|
|
|
|
# Also update log_page, in the case where we just created the page above
|
|
|
|
|
$update['log_page'] = $updateLogPage;
|
|
|
|
|
}
|
2021-04-19 01:32:42 +00:00
|
|
|
$this->getRepo()->getPrimaryDB()->update(
|
2021-01-08 11:03:25 +00:00
|
|
|
'logging',
|
|
|
|
|
$update,
|
|
|
|
|
[ 'log_id' => $logId ],
|
|
|
|
|
$fname
|
|
|
|
|
);
|
2021-04-19 01:32:42 +00:00
|
|
|
$this->getRepo()->getPrimaryDB()->insert(
|
2021-01-08 11:03:25 +00:00
|
|
|
'log_search',
|
|
|
|
|
[
|
|
|
|
|
'ls_field' => 'associated_rev_id',
|
|
|
|
|
'ls_value' => (string)$logEntry->getAssociatedRevId(),
|
|
|
|
|
'ls_log_id' => $logId,
|
|
|
|
|
],
|
|
|
|
|
$fname
|
|
|
|
|
);
|
2016-07-21 19:43:26 +00:00
|
|
|
|
2021-01-08 11:03:25 +00:00
|
|
|
# Add change tags, if any
|
|
|
|
|
if ( $tags ) {
|
|
|
|
|
$logEntry->addTags( $tags );
|
|
|
|
|
}
|
|
|
|
|
|
|
|
|
|
# Uploads can be patrolled
|
|
|
|
|
$logEntry->setIsPatrollable( true );
|
|
|
|
|
|
|
|
|
|
# Now that the log entry is up-to-date, make an RC entry.
|
|
|
|
|
$logEntry->publish( $logId );
|
2016-08-16 11:44:08 +00:00
|
|
|
|
2021-01-08 11:03:25 +00:00
|
|
|
# Run hook for other updates (typically more cache purging)
|
|
|
|
|
$this->getHookRunner()->onFileUpload( $this, $reupload, !$newPageContent );
|
|
|
|
|
|
|
|
|
|
if ( $reupload ) {
|
|
|
|
|
# Delete old thumbnails
|
|
|
|
|
$this->purgeThumbnails();
|
|
|
|
|
# Remove the old file from the CDN cache
|
|
|
|
|
$hcu = MediaWikiServices::getInstance()->getHtmlCacheUpdater();
|
|
|
|
|
$hcu->purgeUrls( $this->getUrl(), $hcu::PURGE_INTENT_TXROUND_REFLECTED );
|
|
|
|
|
} else {
|
|
|
|
|
# Update backlink pages pointing to this title if created
|
2021-09-08 22:07:01 +00:00
|
|
|
$blcFactory = MediaWikiServices::getInstance()->getBacklinkCacheFactory();
|
2021-01-08 11:03:25 +00:00
|
|
|
LinksUpdate::queueRecursiveJobsForTable(
|
|
|
|
|
$this->getTitle(),
|
|
|
|
|
'imagelinks',
|
|
|
|
|
'upload-image',
|
2021-09-08 22:07:01 +00:00
|
|
|
$performer->getUser()->getName(),
|
|
|
|
|
$blcFactory->getBacklinkCache( $this->getTitle() )
|
2021-01-08 11:03:25 +00:00
|
|
|
);
|
2015-11-01 22:29:05 +00:00
|
|
|
}
|
2010-08-31 13:16:42 +00:00
|
|
|
|
2021-01-08 11:03:25 +00:00
|
|
|
$this->prerenderThumbnails();
|
|
|
|
|
}
|
|
|
|
|
);
|
2015-11-01 22:29:05 +00:00
|
|
|
|
2007-05-30 21:02:32 +00:00
|
|
|
# Invalidate cache for all pages using this file
|
2021-01-08 11:03:25 +00:00
|
|
|
$cacheUpdateJob = HTMLCacheUpdateJob::newForBacklinks(
|
2019-03-15 00:23:26 +00:00
|
|
|
$this->getTitle(),
|
|
|
|
|
'imagelinks',
|
2021-05-27 19:56:40 +00:00
|
|
|
[ 'causeAction' => 'file-upload', 'causeAgent' => $performer->getUser()->getName() ]
|
2017-10-30 17:47:30 +00:00
|
|
|
);
|
2021-01-08 11:03:25 +00:00
|
|
|
|
|
|
|
|
// NOTE: We are probably still in the transaction started by the call to lock() in
|
|
|
|
|
// publishTo(). We should only schedule jobs after that transaction was committed,
|
|
|
|
|
// so a job queue failure doesn't cause the upload to fail (T263301).
|
|
|
|
|
// Also, we should generally not schedule any Jobs or the DeferredUpdates that
|
|
|
|
|
// assume the update is complete until after the transaction has been committed and
|
|
|
|
|
// we are sure that the upload was indeed successful.
|
2021-04-29 16:24:12 +00:00
|
|
|
$dbw->onTransactionCommitOrIdle( static function () use ( $reupload, $purgeUpdate, $cacheUpdateJob ) {
|
2021-01-08 11:03:25 +00:00
|
|
|
DeferredUpdates::addUpdate( $purgeUpdate, DeferredUpdates::PRESEND );
|
|
|
|
|
|
|
|
|
|
if ( !$reupload ) {
|
|
|
|
|
// This is a new file, so update the image count
|
|
|
|
|
DeferredUpdates::addUpdate( SiteStatsUpdate::factory( [ 'images' => 1 ] ) );
|
|
|
|
|
}
|
|
|
|
|
|
|
|
|
|
JobQueueGroup::singleton()->lazyPush( $cacheUpdateJob );
|
2021-06-05 08:26:50 +00:00
|
|
|
}, __METHOD__ );
|
2010-09-04 13:48:16 +00:00
|
|
|
|
Block same-file reuploads
When uploading a file, there are already a few ways of checking for and
blocking (or at least warning about) duplicate uploads.
However, it occasionally seems to happen that files get uploaded twice:
usually the exact same file, submitted at (almost) the exact same time
(possibly because of an error in whatever submits the file upload, but still).
Given 2 uploads at (almost) the exact same time, both of them are stored,
even if they are the exact same file.
The last upload also ends up with a `logging` entry with `log_page = 0`.
I don’t believe such an upload should go through: if we do find that a file
is an exact duplicate of something that already exists, I don’t see any
reason it should go through.
Note that with this patch, it will become impossible to reupload a file
with the exact same hash (which was possible before).
If we still want to allow same-file reuploads while also blocking this
kind of race-condition same-upload, we could make the check more strict
(e.g. also check timestamps, or check if the page already exists, or …)
Bug: T158480
Change-Id: I76cbd2c64c3b893997f1f85974d6f82cbfe121e1
2017-09-18 13:35:42 +00:00
|
|
|
return Status::newGood();
|
2007-05-30 21:02:32 +00:00
|
|
|
}
|
|
|
|
|
|
|
|
|
|
/**
|
2008-04-14 07:45:50 +00:00
|
|
|
* Move or copy a file to its public location. If a file exists at the
|
2016-12-21 06:21:34 +00:00
|
|
|
* destination, move it to an archive. Returns a Status object with
|
2010-04-16 15:20:12 +00:00
|
|
|
* the archive name in the "value" member on success.
|
2007-05-30 21:02:32 +00:00
|
|
|
*
|
|
|
|
|
* The archive name should be passed through to recordUpload for database
|
|
|
|
|
* registration.
|
|
|
|
|
*
|
2020-07-13 08:57:12 +00:00
|
|
|
* @stable to override
|
2016-03-02 23:42:36 +00:00
|
|
|
* @param string|FSFile $src Local filesystem path or virtual URL to the source image
|
2013-12-04 16:18:05 +00:00
|
|
|
* @param int $flags A bitwise combination of:
|
2013-11-23 20:00:11 +00:00
|
|
|
* File::DELETE_SOURCE Delete the source file, i.e. move rather than copy
|
2013-03-11 17:15:01 +00:00
|
|
|
* @param array $options Optional additional parameters
|
2016-10-05 22:50:33 +00:00
|
|
|
* @return Status On success, the value member contains the
|
2008-04-14 07:45:50 +00:00
|
|
|
* archive name, or an empty string if it was a new file.
|
2007-05-30 21:02:32 +00:00
|
|
|
*/
|
2020-05-17 23:02:18 +00:00
|
|
|
public function publish( $src, $flags = 0, array $options = [] ) {
|
2016-03-02 23:42:36 +00:00
|
|
|
return $this->publishTo( $src, $this->getRel(), $flags, $options );
|
2011-06-17 17:12:20 +00:00
|
|
|
}
|
2010-09-04 13:48:16 +00:00
|
|
|
|
2011-06-17 17:12:20 +00:00
|
|
|
/**
|
2016-12-21 06:21:34 +00:00
|
|
|
* Move or copy a file to a specified location. Returns a Status
|
2011-06-17 17:12:20 +00:00
|
|
|
* object with the archive name in the "value" member on success.
|
|
|
|
|
*
|
|
|
|
|
* The archive name should be passed through to recordUpload for database
|
|
|
|
|
* registration.
|
|
|
|
|
*
|
2020-07-13 08:57:12 +00:00
|
|
|
* @stable to override
|
2016-03-02 23:42:36 +00:00
|
|
|
* @param string|FSFile $src Local filesystem path or virtual URL to the source image
|
2013-12-04 16:18:05 +00:00
|
|
|
* @param string $dstRel Target relative path
|
|
|
|
|
* @param int $flags A bitwise combination of:
|
2013-11-23 20:00:11 +00:00
|
|
|
* File::DELETE_SOURCE Delete the source file, i.e. move rather than copy
|
2013-03-11 17:15:01 +00:00
|
|
|
* @param array $options Optional additional parameters
|
2016-10-05 22:50:33 +00:00
|
|
|
* @return Status On success, the value member contains the
|
2011-06-17 17:12:20 +00:00
|
|
|
* archive name, or an empty string if it was a new file.
|
|
|
|
|
*/
|
2020-05-19 03:08:56 +00:00
|
|
|
protected function publishTo( $src, $dstRel, $flags = 0, array $options = [] ) {
|
2016-03-02 23:42:36 +00:00
|
|
|
$srcPath = ( $src instanceof FSFile ) ? $src->getPath() : $src;
|
|
|
|
|
|
2015-03-10 13:26:14 +00:00
|
|
|
$repo = $this->getRepo();
|
|
|
|
|
if ( $repo->getReadOnlyReason() !== false ) {
|
2012-03-14 21:30:26 +00:00
|
|
|
return $this->readOnlyFatalStatus();
|
|
|
|
|
}
|
|
|
|
|
|
2019-01-09 16:01:09 +00:00
|
|
|
$this->lock();
|
2011-07-09 03:49:25 +00:00
|
|
|
|
2019-01-02 14:34:59 +00:00
|
|
|
if ( $this->isOld() ) {
|
|
|
|
|
$archiveRel = $dstRel;
|
|
|
|
|
$archiveName = basename( $archiveRel );
|
|
|
|
|
} else {
|
|
|
|
|
$archiveName = wfTimestamp( TS_MW ) . '!' . $this->getName();
|
|
|
|
|
$archiveRel = $this->getArchiveRel( $archiveName );
|
|
|
|
|
}
|
2010-09-04 13:48:16 +00:00
|
|
|
|
2015-03-10 13:26:14 +00:00
|
|
|
if ( $repo->hasSha1Storage() ) {
|
2019-01-18 21:03:45 +00:00
|
|
|
$sha1 = FileRepo::isVirtualUrl( $srcPath )
|
2015-03-10 13:26:14 +00:00
|
|
|
? $repo->getFileSha1( $srcPath )
|
2016-03-02 23:42:36 +00:00
|
|
|
: FSFile::getSha1Base36FromPath( $srcPath );
|
2016-08-30 06:22:22 +00:00
|
|
|
/** @var FileBackendDBRepoWrapper $wrapperBackend */
|
|
|
|
|
$wrapperBackend = $repo->getBackend();
|
2019-08-31 16:14:38 +00:00
|
|
|
'@phan-var FileBackendDBRepoWrapper $wrapperBackend';
|
2016-08-30 06:22:22 +00:00
|
|
|
$dst = $wrapperBackend->getPathForSHA1( $sha1 );
|
2016-03-02 23:42:36 +00:00
|
|
|
$status = $repo->quickImport( $src, $dst );
|
2015-03-10 13:26:14 +00:00
|
|
|
if ( $flags & File::DELETE_SOURCE ) {
|
|
|
|
|
unlink( $srcPath );
|
|
|
|
|
}
|
|
|
|
|
|
|
|
|
|
if ( $this->exists() ) {
|
|
|
|
|
$status->value = $archiveName;
|
|
|
|
|
}
|
2007-05-30 21:02:32 +00:00
|
|
|
} else {
|
2015-03-10 13:26:14 +00:00
|
|
|
$flags = $flags & File::DELETE_SOURCE ? LocalRepo::DELETE_SOURCE : 0;
|
|
|
|
|
$status = $repo->publish( $srcPath, $dstRel, $archiveRel, $flags, $options );
|
|
|
|
|
|
|
|
|
|
if ( $status->value == 'new' ) {
|
|
|
|
|
$status->value = '';
|
|
|
|
|
} else {
|
|
|
|
|
$status->value = $archiveName;
|
|
|
|
|
}
|
2007-05-30 21:02:32 +00:00
|
|
|
}
|
2010-09-04 13:48:16 +00:00
|
|
|
|
2019-01-09 16:01:09 +00:00
|
|
|
$this->unlock();
|
2007-07-22 14:45:12 +00:00
|
|
|
return $status;
|
2007-05-30 21:02:32 +00:00
|
|
|
}
|
|
|
|
|
|
|
|
|
|
/** getLinksTo inherited */
|
|
|
|
|
/** getExifData inherited */
|
|
|
|
|
/** isLocal inherited */
|
|
|
|
|
/** wasDeleted inherited */
|
2008-04-14 07:45:50 +00:00
|
|
|
|
2008-05-03 13:09:34 +00:00
|
|
|
/**
|
|
|
|
|
* Move file to the new title
|
|
|
|
|
*
|
|
|
|
|
* Move the current version, old versions and all thumbnails
|
|
|
|
|
* to the new filename. Old file is deleted.
|
|
|
|
|
*
|
|
|
|
|
* Cache purging is done; checks for validity
|
|
|
|
|
* and logging are caller's responsibility
|
|
|
|
|
*
|
2020-07-13 08:57:12 +00:00
|
|
|
* @stable to override
|
2013-12-04 16:18:05 +00:00
|
|
|
* @param Title $target New file name
|
2016-10-05 22:50:33 +00:00
|
|
|
* @return Status
|
2008-05-03 13:09:34 +00:00
|
|
|
*/
|
2020-05-17 23:02:18 +00:00
|
|
|
public function move( $target ) {
|
2019-06-25 14:30:31 +00:00
|
|
|
$localRepo = MediaWikiServices::getInstance()->getRepoGroup()->getLocalRepo();
|
2012-03-14 21:30:26 +00:00
|
|
|
if ( $this->getRepo()->getReadOnlyReason() !== false ) {
|
|
|
|
|
return $this->readOnlyFatalStatus();
|
|
|
|
|
}
|
|
|
|
|
|
2008-07-06 20:39:11 +00:00
|
|
|
wfDebugLog( 'imagemove', "Got request to move {$this->name} to " . $target->getText() );
|
2008-07-04 13:04:03 +00:00
|
|
|
$batch = new LocalFileMoveBatch( $this, $target );
|
2012-05-07 06:28:03 +00:00
|
|
|
|
2019-01-09 16:01:09 +00:00
|
|
|
$this->lock();
|
2008-05-03 13:09:34 +00:00
|
|
|
$batch->addCurrent();
|
2012-05-09 00:29:34 +00:00
|
|
|
$archiveNames = $batch->addOlds();
|
2008-05-03 13:09:34 +00:00
|
|
|
$status = $batch->execute();
|
2019-01-09 16:01:09 +00:00
|
|
|
$this->unlock();
|
2012-05-07 06:28:03 +00:00
|
|
|
|
2008-07-06 20:39:11 +00:00
|
|
|
wfDebugLog( 'imagemove', "Finished moving {$this->name}" );
|
2010-09-04 13:48:16 +00:00
|
|
|
|
2019-08-23 16:44:24 +00:00
|
|
|
// Purge the source and target files outside the transaction...
|
2019-06-25 14:30:31 +00:00
|
|
|
$oldTitleFile = $localRepo->newFile( $this->title );
|
|
|
|
|
$newTitleFile = $localRepo->newFile( $target );
|
2016-07-19 20:43:17 +00:00
|
|
|
DeferredUpdates::addUpdate(
|
|
|
|
|
new AutoCommitUpdate(
|
2021-04-19 01:32:42 +00:00
|
|
|
$this->getRepo()->getPrimaryDB(),
|
2016-07-19 20:43:17 +00:00
|
|
|
__METHOD__,
|
2021-02-10 22:31:02 +00:00
|
|
|
static function () use ( $oldTitleFile, $newTitleFile, $archiveNames ) {
|
2016-07-19 20:43:17 +00:00
|
|
|
$oldTitleFile->purgeEverything();
|
|
|
|
|
foreach ( $archiveNames as $archiveName ) {
|
2019-08-23 16:44:24 +00:00
|
|
|
/** @var OldLocalFile $oldTitleFile */
|
2019-08-31 16:14:38 +00:00
|
|
|
'@phan-var OldLocalFile $oldTitleFile';
|
2016-07-19 20:43:17 +00:00
|
|
|
$oldTitleFile->purgeOldThumbnails( $archiveName );
|
|
|
|
|
}
|
|
|
|
|
$newTitleFile->purgeEverything();
|
2013-10-17 18:35:00 +00:00
|
|
|
}
|
2016-07-19 20:43:17 +00:00
|
|
|
),
|
|
|
|
|
DeferredUpdates::PRESEND
|
2013-10-17 18:35:00 +00:00
|
|
|
);
|
|
|
|
|
|
2012-05-07 06:28:03 +00:00
|
|
|
if ( $status->isOK() ) {
|
2008-05-07 15:03:18 +00:00
|
|
|
// Now switch the object
|
|
|
|
|
$this->title = $target;
|
|
|
|
|
// Force regeneration of the name and hashpath
|
2019-08-29 13:19:39 +00:00
|
|
|
$this->name = null;
|
|
|
|
|
$this->hashPath = null;
|
2008-05-07 15:03:18 +00:00
|
|
|
}
|
2010-09-04 01:06:34 +00:00
|
|
|
|
2008-05-03 13:09:34 +00:00
|
|
|
return $status;
|
|
|
|
|
}
|
|
|
|
|
|
2020-02-25 22:33:18 +00:00
|
|
|
/**
|
|
|
|
|
* Delete all versions of the file.
|
|
|
|
|
*
|
|
|
|
|
* @since 1.35
|
|
|
|
|
*
|
|
|
|
|
* Moves the files into an archive directory (or deletes them)
|
|
|
|
|
* and removes the database rows.
|
|
|
|
|
*
|
|
|
|
|
* Cache purging is done; logging is caller's responsibility.
|
2020-07-13 08:57:12 +00:00
|
|
|
* @stable to override
|
2020-02-25 22:33:18 +00:00
|
|
|
*
|
|
|
|
|
* @param string $reason
|
2021-04-05 19:43:12 +00:00
|
|
|
* @param UserIdentity $user
|
2020-02-25 22:33:18 +00:00
|
|
|
* @param bool $suppress
|
|
|
|
|
* @return Status
|
|
|
|
|
*/
|
2021-04-05 19:43:12 +00:00
|
|
|
public function deleteFile( $reason, UserIdentity $user, $suppress = false ) {
|
2012-03-14 21:30:26 +00:00
|
|
|
if ( $this->getRepo()->getReadOnlyReason() !== false ) {
|
|
|
|
|
return $this->readOnlyFatalStatus();
|
|
|
|
|
}
|
|
|
|
|
|
2020-02-25 22:33:18 +00:00
|
|
|
$batch = new LocalFileDeleteBatch( $this, $user, $reason, $suppress );
|
2007-05-30 21:02:32 +00:00
|
|
|
|
2019-01-09 16:01:09 +00:00
|
|
|
$this->lock();
|
2012-05-07 06:28:03 +00:00
|
|
|
$batch->addCurrent();
|
2016-07-19 20:43:17 +00:00
|
|
|
// Get old version relative paths
|
2012-05-09 00:29:34 +00:00
|
|
|
$archiveNames = $batch->addOlds();
|
2007-07-22 14:45:12 +00:00
|
|
|
$status = $batch->execute();
|
2019-01-09 16:01:09 +00:00
|
|
|
$this->unlock();
|
2012-05-09 20:59:27 +00:00
|
|
|
|
2012-03-14 21:30:26 +00:00
|
|
|
if ( $status->isOK() ) {
|
2016-02-17 09:09:32 +00:00
|
|
|
DeferredUpdates::addUpdate( SiteStatsUpdate::factory( [ 'images' => -1 ] ) );
|
2007-05-30 21:02:32 +00:00
|
|
|
}
|
2010-09-04 13:48:16 +00:00
|
|
|
|
2016-07-19 20:43:17 +00:00
|
|
|
// To avoid slow purges in the transaction, move them outside...
|
|
|
|
|
DeferredUpdates::addUpdate(
|
|
|
|
|
new AutoCommitUpdate(
|
2021-04-19 01:32:42 +00:00
|
|
|
$this->getRepo()->getPrimaryDB(),
|
2016-07-19 20:43:17 +00:00
|
|
|
__METHOD__,
|
|
|
|
|
function () use ( $archiveNames ) {
|
|
|
|
|
$this->purgeEverything();
|
|
|
|
|
foreach ( $archiveNames as $archiveName ) {
|
|
|
|
|
$this->purgeOldThumbnails( $archiveName );
|
|
|
|
|
}
|
2013-08-21 20:20:40 +00:00
|
|
|
}
|
2016-07-19 20:43:17 +00:00
|
|
|
),
|
|
|
|
|
DeferredUpdates::PRESEND
|
2013-08-21 20:20:40 +00:00
|
|
|
);
|
2013-08-19 18:11:44 +00:00
|
|
|
|
2015-12-10 01:30:47 +00:00
|
|
|
// Purge the CDN
|
2016-02-17 09:09:32 +00:00
|
|
|
$purgeUrls = [];
|
2015-12-01 00:05:56 +00:00
|
|
|
foreach ( $archiveNames as $archiveName ) {
|
|
|
|
|
$purgeUrls[] = $this->getArchiveUrl( $archiveName );
|
|
|
|
|
}
|
2019-03-15 00:23:26 +00:00
|
|
|
|
|
|
|
|
$hcu = MediaWikiServices::getInstance()->getHtmlCacheUpdater();
|
|
|
|
|
$hcu->purgeUrls( $purgeUrls, $hcu::PURGE_INTENT_TXROUND_REFLECTED );
|
2015-12-01 00:05:56 +00:00
|
|
|
|
2007-07-22 14:45:12 +00:00
|
|
|
return $status;
|
2007-05-30 21:02:32 +00:00
|
|
|
}
|
|
|
|
|
|
2020-02-25 22:33:18 +00:00
|
|
|
/**
|
|
|
|
|
* Delete an old version of the file.
|
|
|
|
|
*
|
|
|
|
|
* @since 1.35
|
2020-07-13 08:57:12 +00:00
|
|
|
* @stable to override
|
2020-02-25 22:33:18 +00:00
|
|
|
*
|
|
|
|
|
* Moves the file into an archive directory (or deletes it)
|
|
|
|
|
* and removes the database row.
|
|
|
|
|
*
|
|
|
|
|
* Cache purging is done; logging is caller's responsibility.
|
|
|
|
|
*
|
|
|
|
|
* @param string $archiveName
|
|
|
|
|
* @param string $reason
|
2021-04-05 19:43:12 +00:00
|
|
|
* @param UserIdentity $user
|
2020-02-25 22:33:18 +00:00
|
|
|
* @param bool $suppress
|
|
|
|
|
* @throws MWException Exception on database or file store failure
|
|
|
|
|
* @return Status
|
|
|
|
|
*/
|
2021-04-05 19:43:12 +00:00
|
|
|
public function deleteOldFile( $archiveName, $reason, UserIdentity $user, $suppress = false ) {
|
2012-03-14 21:30:26 +00:00
|
|
|
if ( $this->getRepo()->getReadOnlyReason() !== false ) {
|
|
|
|
|
return $this->readOnlyFatalStatus();
|
|
|
|
|
}
|
|
|
|
|
|
2020-02-25 22:33:18 +00:00
|
|
|
$batch = new LocalFileDeleteBatch( $this, $user, $reason, $suppress );
|
2012-05-09 00:29:34 +00:00
|
|
|
|
2019-01-09 16:01:09 +00:00
|
|
|
$this->lock();
|
2007-07-22 14:45:12 +00:00
|
|
|
$batch->addOld( $archiveName );
|
|
|
|
|
$status = $batch->execute();
|
2019-01-09 16:01:09 +00:00
|
|
|
$this->unlock();
|
2010-09-04 13:48:16 +00:00
|
|
|
|
2012-05-09 00:29:34 +00:00
|
|
|
$this->purgeOldThumbnails( $archiveName );
|
2012-03-14 21:30:26 +00:00
|
|
|
if ( $status->isOK() ) {
|
2007-07-22 14:45:12 +00:00
|
|
|
$this->purgeDescription();
|
2007-05-30 21:02:32 +00:00
|
|
|
}
|
2010-09-04 13:48:16 +00:00
|
|
|
|
2019-03-15 00:23:26 +00:00
|
|
|
$url = $this->getArchiveUrl( $archiveName );
|
|
|
|
|
$hcu = MediaWikiServices::getInstance()->getHtmlCacheUpdater();
|
|
|
|
|
$hcu->purgeUrls( $url, $hcu::PURGE_INTENT_TXROUND_REFLECTED );
|
2013-08-19 18:11:44 +00:00
|
|
|
|
2007-07-22 14:45:12 +00:00
|
|
|
return $status;
|
2007-05-30 21:02:32 +00:00
|
|
|
}
|
|
|
|
|
|
2007-10-01 19:50:25 +00:00
|
|
|
/**
|
2007-05-30 21:02:32 +00:00
|
|
|
* Restore all or specified deleted revisions to the given file.
|
|
|
|
|
* Permissions and logging are left to the caller.
|
|
|
|
|
*
|
|
|
|
|
* May throw database exceptions on error.
|
2020-07-13 08:57:12 +00:00
|
|
|
* @stable to override
|
2007-05-30 21:02:32 +00:00
|
|
|
*
|
2014-04-19 15:19:17 +00:00
|
|
|
* @param array $versions Set of record ids of deleted items to restore,
|
2013-12-04 16:18:05 +00:00
|
|
|
* or empty to restore all revisions.
|
|
|
|
|
* @param bool $unsuppress
|
2016-10-05 22:50:33 +00:00
|
|
|
* @return Status
|
2007-05-30 21:02:32 +00:00
|
|
|
*/
|
2020-05-17 23:02:18 +00:00
|
|
|
public function restore( $versions = [], $unsuppress = false ) {
|
2012-03-14 21:30:26 +00:00
|
|
|
if ( $this->getRepo()->getReadOnlyReason() !== false ) {
|
|
|
|
|
return $this->readOnlyFatalStatus();
|
|
|
|
|
}
|
|
|
|
|
|
2008-03-15 00:27:57 +00:00
|
|
|
$batch = new LocalFileRestoreBatch( $this, $unsuppress );
|
2010-09-04 13:48:16 +00:00
|
|
|
|
2019-01-09 16:01:09 +00:00
|
|
|
$this->lock();
|
2007-10-01 19:50:25 +00:00
|
|
|
if ( !$versions ) {
|
2007-07-22 14:45:12 +00:00
|
|
|
$batch->addAll();
|
|
|
|
|
} else {
|
2007-10-01 19:50:25 +00:00
|
|
|
$batch->addIds( $versions );
|
2007-05-30 21:02:32 +00:00
|
|
|
}
|
2007-07-22 14:45:12 +00:00
|
|
|
$status = $batch->execute();
|
2012-03-14 21:30:26 +00:00
|
|
|
if ( $status->isGood() ) {
|
|
|
|
|
$cleanupStatus = $batch->cleanup();
|
|
|
|
|
$cleanupStatus->successCount = 0;
|
|
|
|
|
$cleanupStatus->failCount = 0;
|
|
|
|
|
$status->merge( $cleanupStatus );
|
2007-05-30 21:02:32 +00:00
|
|
|
}
|
2010-09-04 13:48:16 +00:00
|
|
|
|
2019-01-09 16:01:09 +00:00
|
|
|
$this->unlock();
|
2007-07-22 14:45:12 +00:00
|
|
|
return $status;
|
2007-05-30 21:02:32 +00:00
|
|
|
}
|
|
|
|
|
|
|
|
|
|
/** isMultipage inherited */
|
|
|
|
|
/** pageCount inherited */
|
|
|
|
|
/** scaleHeight inherited */
|
|
|
|
|
/** getImageSize inherited */
|
2008-04-14 07:45:50 +00:00
|
|
|
|
2011-03-02 21:06:54 +00:00
|
|
|
/**
|
|
|
|
|
* Get the URL of the file description page.
|
2020-07-13 08:57:12 +00:00
|
|
|
* @stable to override
|
2019-08-30 08:55:01 +00:00
|
|
|
* @return string|false URL, or false if the file has no title
|
2011-03-02 21:06:54 +00:00
|
|
|
*/
|
2020-05-17 23:02:18 +00:00
|
|
|
public function getDescriptionUrl() {
|
2019-08-30 08:55:01 +00:00
|
|
|
if ( !$this->title ) {
|
|
|
|
|
return false; // Avoid hard failure when the file does not exist. T221812
|
|
|
|
|
}
|
|
|
|
|
|
2013-03-27 13:36:05 +00:00
|
|
|
return $this->title->getLocalURL();
|
2011-03-02 21:06:54 +00:00
|
|
|
}
|
|
|
|
|
|
|
|
|
|
/**
|
|
|
|
|
* Get the HTML text of the description page.
|
|
|
|
|
* This is not used by ImagePage for local files, since (among other things)
|
|
|
|
|
* it skips the parser cache.
|
2020-07-13 08:57:12 +00:00
|
|
|
* @stable to override
|
2013-08-31 05:45:43 +00:00
|
|
|
*
|
2018-04-19 08:30:33 +00:00
|
|
|
* @param Language|null $lang What language to get description in (Optional)
|
|
|
|
|
* @return string|false
|
2011-03-02 21:06:54 +00:00
|
|
|
*/
|
2020-05-17 23:02:18 +00:00
|
|
|
public function getDescriptionText( Language $lang = null ) {
|
2019-08-30 08:55:01 +00:00
|
|
|
if ( !$this->title ) {
|
|
|
|
|
return false; // Avoid hard failure when the file does not exist. T221812
|
|
|
|
|
}
|
|
|
|
|
|
2018-08-14 16:37:30 +00:00
|
|
|
$store = MediaWikiServices::getInstance()->getRevisionStore();
|
2020-07-03 00:20:38 +00:00
|
|
|
$revision = $store->getRevisionByTitle( $this->title, 0, RevisionStore::READ_NORMAL );
|
2013-04-20 17:18:13 +00:00
|
|
|
if ( !$revision ) {
|
|
|
|
|
return false;
|
|
|
|
|
}
|
2018-08-14 16:37:30 +00:00
|
|
|
|
|
|
|
|
$renderer = MediaWikiServices::getInstance()->getRevisionRenderer();
|
2020-09-18 15:07:18 +00:00
|
|
|
$rendered = $renderer->getRenderedRevision(
|
|
|
|
|
$revision,
|
|
|
|
|
ParserOptions::newFromUserAndLang(
|
|
|
|
|
RequestContext::getMain()->getUser(),
|
|
|
|
|
$lang
|
|
|
|
|
)
|
|
|
|
|
);
|
2018-08-14 16:37:30 +00:00
|
|
|
|
|
|
|
|
if ( !$rendered ) {
|
|
|
|
|
// audience check failed
|
2013-04-20 17:18:13 +00:00
|
|
|
return false;
|
|
|
|
|
}
|
2013-11-23 20:00:11 +00:00
|
|
|
|
2018-08-14 16:37:30 +00:00
|
|
|
$pout = $rendered->getRevisionParserOutput();
|
2011-03-02 21:06:54 +00:00
|
|
|
return $pout->getText();
|
|
|
|
|
}
|
2007-05-30 21:02:32 +00:00
|
|
|
|
2021-05-27 16:56:43 +00:00
|
|
|
/**
* Get the user who uploaded this version of the file, or null if the
* uploader is hidden from the given audience.
*
|
|
|
|
|
* @since 1.37
|
|
|
|
|
* @stable to override
|
|
|
|
|
* @param int $audience
|
|
|
|
|
* @param Authority|null $performer
|
|
|
|
|
* @return UserIdentity|null
|
|
|
|
|
*/
|
|
|
|
|
public function getUploader( int $audience = self::FOR_PUBLIC, Authority $performer = null ): ?UserIdentity {
|
|
|
|
|
$this->load();
|
|
|
|
|
if ( $audience === self::FOR_PUBLIC && $this->isDeleted( self::DELETED_USER ) ) {
|
|
|
|
|
return null;
|
|
|
|
|
} elseif ( $audience === self::FOR_THIS_USER && !$this->userCan( self::DELETED_USER, $performer ) ) {
|
|
|
|
|
return null;
|
|
|
|
|
} else {
|
|
|
|
|
return $this->user;
|
|
|
|
|
}
|
|
|
|
|
}
|
|
|
|
|
|
2012-05-10 07:55:33 +00:00
|
|
|
/**
* Get the description of this file version, or an empty string if it is
* hidden from the given audience.
*
|
2020-07-13 08:57:12 +00:00
|
|
|
* @stable to override
|
2013-12-04 16:18:05 +00:00
|
|
|
* @param int $audience
|
2021-04-05 19:43:12 +00:00
|
|
|
* @param Authority|null $performer
|
2012-05-10 07:55:33 +00:00
|
|
|
* @return string
|
|
|
|
|
*/
|
2021-04-05 19:43:12 +00:00
|
|
|
public function getDescription( $audience = self::FOR_PUBLIC, Authority $performer = null ) {
|
2008-01-20 06:48:57 +00:00
|
|
|
$this->load();
|
2012-06-05 22:58:54 +00:00
|
|
|
if ( $audience == self::FOR_PUBLIC && $this->isDeleted( self::DELETED_COMMENT ) ) {
|
|
|
|
|
return '';
|
2021-05-27 19:37:50 +00:00
|
|
|
} elseif ( $audience == self::FOR_THIS_USER && !$this->userCan( self::DELETED_COMMENT, $performer ) ) {
|
2012-06-05 22:58:54 +00:00
|
|
|
return '';
|
|
|
|
|
} else {
|
|
|
|
|
return $this->description;
|
|
|
|
|
}
|
2008-01-20 06:48:57 +00:00
|
|
|
}
|
|
|
|
|
|
2012-05-10 07:55:33 +00:00
|
|
|
/**
|
2020-07-13 08:57:12 +00:00
|
|
|
* @stable to override
|
2012-05-10 07:55:33 +00:00
|
|
|
* @return bool|string
|
|
|
|
|
*/
|
2020-05-19 03:08:56 +00:00
|
|
|
public function getTimestamp() {
|
2007-05-30 21:02:32 +00:00
|
|
|
$this->load();
|
2013-11-23 20:00:11 +00:00
|
|
|
|
2007-05-30 21:02:32 +00:00
|
|
|
return $this->timestamp;
|
|
|
|
|
}
|
2007-07-22 14:45:12 +00:00
|
|
|
|
2015-02-05 03:43:22 +00:00
|
|
|
/**
|
2020-07-13 08:57:12 +00:00
|
|
|
* @stable to override
|
2015-02-05 03:43:22 +00:00
|
|
|
* @return string|false TS_MW timestamp, or false if it cannot be loaded
|
|
|
|
|
*/
|
|
|
|
|
public function getDescriptionTouched() {
|
2019-08-30 08:55:01 +00:00
|
|
|
if ( !$this->exists() ) {
|
|
|
|
|
return false; // Avoid hard failure when the file does not exist. T221812
|
|
|
|
|
}
|
|
|
|
|
|
2015-02-05 03:43:22 +00:00
|
|
|
// The DB lookup might return false, e.g. if the file was just deleted, or the shared DB repo
|
|
|
|
|
// itself gets it from elsewhere. To avoid repeating the DB lookups in such a case, we
|
|
|
|
|
// need to differentiate between null (uninitialized) and false (failed to load).
|
|
|
|
|
if ( $this->descriptionTouched === null ) {
|
2016-02-17 09:09:32 +00:00
|
|
|
$cond = [
|
2015-03-15 02:04:52 +00:00
|
|
|
'page_namespace' => $this->title->getNamespace(),
|
|
|
|
|
'page_title' => $this->title->getDBkey()
|
2016-02-17 09:09:32 +00:00
|
|
|
];
|
2016-11-18 15:42:39 +00:00
|
|
|
$touched = $this->repo->getReplicaDB()->selectField( 'page', 'page_touched', $cond, __METHOD__ );
|
2015-02-05 03:43:22 +00:00
|
|
|
$this->descriptionTouched = $touched ? wfTimestamp( TS_MW, $touched ) : false;
|
|
|
|
|
}
|
|
|
|
|
|
|
|
|
|
return $this->descriptionTouched;
|
|
|
|
|
}
|
|
|
|
|
|
2012-05-10 07:55:33 +00:00
|
|
|
/**
|
2020-07-13 08:57:12 +00:00
|
|
|
* @stable to override
|
2021-04-27 20:48:53 +00:00
|
|
|
* @return string|false
|
2012-05-10 07:55:33 +00:00
|
|
|
*/
|
2020-05-19 03:08:56 +00:00
|
|
|
public function getSha1() {
|
2007-07-22 14:45:12 +00:00
|
|
|
$this->load();
|
2007-08-25 13:54:12 +00:00
|
|
|
// Initialise now if necessary
|
|
|
|
|
if ( $this->sha1 == '' && $this->fileExists ) {
|
2019-01-09 16:01:09 +00:00
|
|
|
$this->lock();
|
2011-12-26 23:35:40 +00:00
|
|
|
|
|
|
|
|
$this->sha1 = $this->repo->getFileSha1( $this->getPath() );
|
2008-10-12 17:07:09 +00:00
|
|
|
if ( !wfReadOnly() && strval( $this->sha1 ) != '' ) {
|
2021-04-19 01:32:42 +00:00
|
|
|
$dbw = $this->repo->getPrimaryDB();
|
2008-04-14 07:45:50 +00:00
|
|
|
$dbw->update( 'image',
|
2016-02-17 09:09:32 +00:00
|
|
|
[ 'img_sha1' => $this->sha1 ],
|
|
|
|
|
[ 'img_name' => $this->getName() ],
|
2007-08-25 13:54:12 +00:00
|
|
|
__METHOD__ );
|
2015-04-28 00:26:58 +00:00
|
|
|
$this->invalidateCache();
|
2007-08-25 13:54:12 +00:00
|
|
|
}
|
2011-12-26 23:35:40 +00:00
|
|
|
|
2019-01-09 16:01:09 +00:00
|
|
|
$this->unlock();
|
2007-08-25 13:54:12 +00:00
|
|
|
}
|
|
|
|
|
|
2007-07-22 14:45:12 +00:00
|
|
|
return $this->sha1;
|
|
|
|
|
}
|
|
|
|
|
|
2012-05-10 07:55:33 +00:00
|
|
|
/**
|
2013-04-24 00:17:50 +00:00
|
|
|
* @return bool Whether to cache in RepoGroup (this avoids OOMs)
|
2012-05-10 07:55:33 +00:00
|
|
|
*/
|
2020-05-19 03:08:56 +00:00
|
|
|
public function isCacheable() {
|
2013-05-14 18:45:34 +00:00
|
|
|
$this->load();
|
2013-11-23 20:00:11 +00:00
|
|
|
|
2013-05-14 18:45:34 +00:00
|
|
|
// If extra data (metadata) was not loaded then it must have been large
|
|
|
|
|
return $this->extraDataLoaded
|
2021-05-19 00:24:32 +00:00
|
|
|
&& strlen( serialize( $this->metadataArray ) ) <= self::CACHE_FIELD_MAX_LEN;
|
2012-04-06 17:38:38 +00:00
|
|
|
}
|
|
|
|
|
|
2016-07-21 17:33:26 +00:00
|
|
|
/**
|
|
|
|
|
* @return Status
|
|
|
|
|
* @since 1.28
|
|
|
|
|
*/
|
|
|
|
|
public function acquireFileLock() {
|
2018-06-13 23:52:31 +00:00
|
|
|
return Status::wrap( $this->getRepo()->getBackend()->lockFiles(
|
2016-07-21 17:33:26 +00:00
|
|
|
[ $this->getPath() ], LockManager::LOCK_EX, 10
|
2018-06-13 23:52:31 +00:00
|
|
|
) );
|
2016-07-21 17:33:26 +00:00
|
|
|
}
|
|
|
|
|
|
|
|
|
|
/**
|
|
|
|
|
* @return Status
|
|
|
|
|
* @since 1.28
|
|
|
|
|
*/
|
|
|
|
|
public function releaseFileLock() {
|
2018-06-13 23:52:31 +00:00
|
|
|
return Status::wrap( $this->getRepo()->getBackend()->unlockFiles(
|
2016-07-21 17:33:26 +00:00
|
|
|
[ $this->getPath() ], LockManager::LOCK_EX
|
2018-06-13 23:52:31 +00:00
|
|
|
) );
|
2016-07-21 17:33:26 +00:00
|
|
|
}
|
|
|
|
|
|
2007-07-22 14:45:12 +00:00
|
|
|
/**
|
2016-07-14 21:51:25 +00:00
|
|
|
* Start an atomic DB section and lock the image for update
|
|
|
|
|
* or increment a reference counter if the lock is already held.
|
|
|
|
|
*
|
2016-07-21 17:33:26 +00:00
|
|
|
* This method should not be used outside of LocalFile/LocalFile*Batch
|
|
|
|
|
*
|
2016-04-19 15:58:49 +00:00
|
|
|
* @throws LocalFileLockError If the lock could not be acquired
|
2014-10-07 21:08:34 +00:00
|
|
|
* @return bool Whether the file lock owns/spawned the DB transaction
|
2007-07-22 14:45:12 +00:00
|
|
|
*/
|
2016-07-21 17:33:26 +00:00
|
|
|
public function lock() {
|
2007-07-22 14:45:12 +00:00
|
|
|
if ( !$this->locked ) {
|
2016-06-30 00:22:03 +00:00
|
|
|
$logger = LoggerFactory::getInstance( 'LocalFile' );
|
2016-07-14 21:51:25 +00:00
|
|
|
|
2021-04-19 01:32:42 +00:00
|
|
|
$dbw = $this->repo->getPrimaryDB();
|
2016-07-14 21:51:25 +00:00
|
|
|
$makesTransaction = !$dbw->trxLevel();
|
|
|
|
|
$dbw->startAtomic( self::ATOMIC_SECTION_LOCK );
|
2017-02-20 22:44:19 +00:00
|
|
|
// T56736: use simple lock to handle when the file does not exist.
|
2014-05-20 21:22:52 +00:00
|
|
|
// SELECT FOR UPDATE prevents changes, not other SELECTs with FOR UPDATE.
|
|
|
|
|
// Also, that would cause contention on INSERT of similarly named rows.
|
2016-07-21 17:33:26 +00:00
|
|
|
$status = $this->acquireFileLock(); // represents all versions of the file
|
2014-05-20 21:22:52 +00:00
|
|
|
if ( !$status->isGood() ) {
|
2016-07-14 21:51:25 +00:00
|
|
|
$dbw->endAtomic( self::ATOMIC_SECTION_LOCK );
|
2016-06-30 00:22:03 +00:00
|
|
|
$logger->warning( "Failed to lock '{file}'", [ 'file' => $this->name ] );
|
|
|
|
|
|
2016-06-08 08:10:02 +00:00
|
|
|
throw new LocalFileLockError( $status );
|
2013-10-01 20:53:13 +00:00
|
|
|
}
|
2016-07-04 18:02:42 +00:00
|
|
|
// Release the lock *after* commit to avoid row-level contention.
|
|
|
|
|
// Make sure it triggers on rollback() as well as commit() (T132921).
|
2016-09-15 21:40:00 +00:00
|
|
|
$dbw->onTransactionResolution(
|
|
|
|
|
function () use ( $logger ) {
|
|
|
|
|
$status = $this->releaseFileLock();
|
|
|
|
|
if ( !$status->isGood() ) {
|
|
|
|
|
$logger->error( "Failed to unlock '{file}'", [ 'file' => $this->name ] );
|
|
|
|
|
}
|
|
|
|
|
},
|
|
|
|
|
__METHOD__
|
|
|
|
|
);
|
2016-07-14 21:51:25 +00:00
|
|
|
// Callers might care if the SELECT snapshot is safely fresh
|
|
|
|
|
$this->lockedOwnTrx = $makesTransaction;
|
2007-07-22 14:45:12 +00:00
|
|
|
}
|
2010-09-04 13:48:16 +00:00
|
|
|
|
2016-07-04 18:02:42 +00:00
|
|
|
$this->locked++;
|
|
|
|
|
|
2014-10-07 21:08:34 +00:00
|
|
|
return $this->lockedOwnTrx;
|
2007-07-22 14:45:12 +00:00
|
|
|
}
|
|
|
|
|
|
|
|
|
|
/**
|
2016-07-14 21:51:25 +00:00
|
|
|
* Decrement the lock reference count and end the atomic section if it reaches zero
|
|
|
|
|
*
|
2016-07-21 17:33:26 +00:00
|
|
|
* This method should not be used outside of LocalFile/LocalFile*Batch
|
|
|
|
|
*
|
2021-02-11 21:09:42 +00:00
|
|
|
* The commit and lock release will happen when no atomic sections are active, which
|
2016-07-14 21:51:25 +00:00
|
|
|
* may happen immediately or at some point after this call.
|
2007-07-22 14:45:12 +00:00
|
|
|
*/
|
2016-07-21 17:33:26 +00:00
|
|
|
public function unlock() {
|
2007-07-22 14:45:12 +00:00
|
|
|
if ( $this->locked ) {
|
|
|
|
|
--$this->locked;
|
2016-07-14 21:51:25 +00:00
|
|
|
if ( !$this->locked ) {
|
2021-04-19 01:32:42 +00:00
|
|
|
$dbw = $this->repo->getPrimaryDB();
|
2016-07-14 21:51:25 +00:00
|
|
|
$dbw->endAtomic( self::ATOMIC_SECTION_LOCK );
|
2013-02-20 19:28:27 +00:00
|
|
|
$this->lockedOwnTrx = false;
|
2007-07-22 14:45:12 +00:00
|
|
|
}
|
|
|
|
|
}
|
|
|
|
|
}
|
|
|
|
|
|
2012-03-14 21:30:26 +00:00
|
|
|
/**
|
|
|
|
|
* @return Status
|
|
|
|
|
*/
|
|
|
|
|
protected function readOnlyFatalStatus() {
|
|
|
|
|
return $this->getRepo()->newFatal( 'filereadonlyerror', $this->getName(),
|
|
|
|
|
$this->getRepo()->getName(), $this->getRepo()->getReadOnlyReason() );
|
|
|
|
|
}
|
2014-03-31 20:36:14 +00:00
|
|
|
|
|
|
|
|
/**
|
|
|
|
|
* Clean up any dangling locks
|
|
|
|
|
*/
|
2019-12-05 17:52:55 +00:00
|
|
|
public function __destruct() {
|
2014-03-31 20:36:14 +00:00
|
|
|
$this->unlock();
|
|
|
|
|
}
|
2016-04-19 15:58:49 +00:00
|
|
|
}
|