Commit graph

126 commits

Author SHA1 Message Date
Umherirrender
e4d1a2c8bd Use __CLASS__/::class to define callback for array_map/_filter/usort
Change-Id: I3519dd5a1ce1ea688de602190cd74755c400c717
2021-01-22 16:39:29 +00:00
Reedy
7acc57cff9 media: Swap second if for elseif in FormatMetadata::sanitizeKeyForAPI()
Also swap == for ===

Noted in T268133 as a minor optimisation

Change-Id: I7c2198b68cd91dc7a642d0c1f6ce3bf39aeccc41
2020-11-18 12:58:55 +00:00
jenkins-bot
1976283835 Merge "Update a lot of unspecific "array" types in PHPDocs" 2020-11-13 21:48:24 +00:00
C. Scott Ananian
c79d3289f8 Downgrade the severity of the non-numeric argument to formatNum warnings
The core Language::formatNum() method w/ a non-numeric argument is
deprecated, but it may take us a while to chase down more of the long
tail of callers so temporarily downgrade the severity of the
hard-deprecation warning.

The uses in media (FormatMetadata) are due to unknown EXIF tags (which
we should eventually teach our EXIF parser about so they get localized
properly) and bad metadata in uploaded images (which we will probably
never fix).  Downgrade the severity of these logs permanently so we
can track down unknown EXIF tags w/o flooding our logs w/ every bad
image uploaded to commons.

Bug: T267370
Bug: T267587
Change-Id: I778daed5c2d23ff880ada9e226902ad97b6d00c4
2020-11-10 12:44:22 -06:00
jenkins-bot
5072cb83b1 Merge "media: Support Google panorama XMP properties" 2020-11-10 15:39:12 +00:00
jenkins-bot
568628e432 Merge "media: EXIF SubSecTime* are text not numeric" 2020-11-10 15:39:06 +00:00
C. Scott Ananian
bdf8bcf71b media: Support Google panorama XMP properties
Defining appropriate handlers for the non-numeric tags
(FirstPhotoDate, LastPhotoDate, ProjectionType, UsePanoramaViewer,
ExposureLockUsed) ensures we don't try to ::formatNum() something
which isn't a number.

But also define localizations for all the tag names, including the
numeric tags.

These extensions are documented at
https://developers.google.com/streetview/spherical-metadata

Bug: T267370
Bug: T230960
Change-Id: I10c8c52190333fcaf025c8a91ff547d00265dbe2
2020-11-10 15:11:44 +00:00
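The handling described in bdf8bcf71b can be pictured with a small Python sketch. This is an illustration only, not the MediaWiki PHP code: `format_value`, `NON_NUMERIC_TAGS` and the tag `SomeNumericTag` are hypothetical names, and the thousands-separator formatting merely stands in for a localized number formatter. The five non-numeric tag names are the ones listed in the commit message.

```python
# Tags the commit message names as non-numeric.
NON_NUMERIC_TAGS = {
    "FirstPhotoDate",
    "LastPhotoDate",
    "ProjectionType",
    "UsePanoramaViewer",
    "ExposureLockUsed",
}


def format_value(tag: str, value) -> str:
    """Format a metadata value, never number-formatting a known text tag."""
    if tag in NON_NUMERIC_TAGS:
        return str(value)  # dedicated handler: keep the text as text
    if isinstance(value, (int, float)) and not isinstance(value, bool):
        return f"{value:,}"  # default fallthrough: stand-in for number formatting
    return str(value)  # unexpected non-numeric value: pass through unchanged


text_result = format_value("ProjectionType", "equirectangular")
number_result = format_value("SomeNumericTag", 8192)
```

The point of the explicit tag set is that the numeric default is only reached by tags nobody has claimed as text.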
C. Scott Ananian
24b77396c7 media: EXIF SubSecTime* are text not numeric
SubSecTime* holds the 'digits after the decimal point'.  It can be a value
such as '014', which is not the same as '14'.

Bug: T267370
Change-Id: Id874295c285cbe587747fc906cd2d3832b75f7f5
2020-11-09 15:49:29 -05:00
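A minimal Python sketch of the SubSecTime point made in 24b77396c7 (illustrative only; `combine_timestamp` is a hypothetical helper, not part of MediaWiki): treating the sub-second digits as a number drops the leading zero and shifts the timestamp.

```python
def combine_timestamp(seconds: int, subsec: str) -> float:
    """Append EXIF sub-second digits, kept as text, to whole seconds."""
    # "014" means .014 s; the digits are positional, so they must stay a string.
    return float(f"{seconds}.{subsec}")


as_text = combine_timestamp(5, "014")              # digits kept as text
as_number = combine_timestamp(5, str(int("014")))  # leading zero lost first
```

Here `as_text` is 5.014 while `as_number` is 5.14, which is a different instant.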
C. Scott Ananian
58e299d95f media: Filter out GPSAltitudeRef exif tag
The tag is processed to normalize the value of the GPSAltitude tag,
but sometimes sticks around in the output if the GPSAltitude is
missing or invalid.  The value of GPSAltitudeRef is single byte,
encoded as a string -- "\0" or "\1" -- but when accessing a file using
instant commons (or generally, using ForeignFileRepo) the "\0" is
munged by the action API which results in a bogus U+FFFD appearing in
the EXIF metadata output.

This patch reverts e6a04bf977ed9372e8b58a44765fde8edd1ec50b
(If85081f023a59b56557b9a74b8431c7be332d7e5) which addressed the same
issue (bogus GPSAltitudeRef values appearing on the File page) but did
it by 'supporting' the bogus U+FFFD value.  Better to preprocess the
EXIF metadata to remove the need for GPSAltitudeRef entirely.

Bug: T267370
Change-Id: Icb705c8f3248e14257532e2689731e67ead7c722
2020-11-09 13:23:39 -05:00
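The preprocessing idea in 58e299d95f can be sketched in Python as follows. This is an assumption-laden illustration, not the actual patch: `normalize_altitude` is a hypothetical name, the metadata is modelled as a plain dict, and the reference byte 1 is taken to mean "below sea level" as in the EXIF specification.

```python
def normalize_altitude(exif: dict) -> dict:
    """Fold GPSAltitudeRef into GPSAltitude, then always drop the ref tag."""
    result = dict(exif)
    # Remove the reference tag unconditionally, so it cannot linger in the
    # output when GPSAltitude is missing or invalid.
    ref = result.pop("GPSAltitudeRef", None)
    altitude = result.get("GPSAltitude")
    if isinstance(altitude, (int, float)) and ref == "\x01":
        result["GPSAltitude"] = -abs(altitude)  # below sea level
    return result


below_sea = normalize_altitude({"GPSAltitude": 12.5, "GPSAltitudeRef": "\x01"})
orphan_ref = normalize_altitude({"GPSAltitudeRef": "\ufffd"})
```

Because the reference tag never reaches the formatter, a munged value such as U+FFFD has nothing to be displayed as.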
C. Scott Ananian
7dc06656db media: Support GPSAltitudeRef exif tag
JPG images generated by the Nikon COOLPIX AW100 camera (and others)
create bogus non-numeric GPSAltitudeRef tags in their EXIF data. That
may not be the camera's fault, maybe it's actually PHP which is
misparsing the tag -- the value is the "unicode replacement character"
U+FFFD which is very suspicious.  In any case, we have a decent number
of images on commons uploaded from these cameras as it turns out, so
it generates a large-ish amount of logspam if we can't handle their
bogus data correctly. Add support for GPSAltitudeRef, including the
non-standard non-numeric value emitted when photos from these cameras
are processed.

Bug: T267370
Change-Id: If85081f023a59b56557b9a74b8431c7be332d7e5
2020-11-06 14:03:52 -05:00
jenkins-bot
b8eb2c4beb Merge "Extend FormatMetadata to handle non-numeric EXIF tags from PDF XMP data" 2020-10-30 22:23:57 +00:00
jenkins-bot
970ad46508 Merge "Deprecate FormatMetadata::flattenArrayContentLang()" 2020-10-30 19:22:55 +00:00
jenkins-bot
6e7d0bf1fe Merge "Ensure FormatMetadata::makeFormattedData always escapes EXIF values" 2020-10-30 19:12:37 +00:00
C. Scott Ananian
8dbf403b1a Extend FormatMetadata to handle non-numeric EXIF tags from PDF XMP data
The \Wikimedia\XMPReader\Info class defines the set of EXIF tags
parsed from PDF XMP data.  It includes a number of EXIF tags not
previously handled by FormatMetadata::makeFormattedData().  These are
EXIF tags, not PDF-specific tags, so they ought to be handled here in
core not in the PDFHandler extension.

Numeric tags are handled via the default fallthrough, but add cases
for structured and string data to ensure we don't try to coerce these
values to numbers.

Bug: T266677
Change-Id: Ifc0bb1eda4c4e390014dae64500bdb01b5c6a2e5
2020-10-30 15:04:04 -04:00
C. Scott Ananian
098bcb38e8 Ensure FormatMetadata::makeFormattedData always escapes EXIF values
This is a step in addressing T266707, by ensuring that every path
through FormatMetadata::makeFormattedData() passes the value through
either a specific formatting method, or else FormatMetadata::literal(),
which should properly wikitext-escape the value.

However, we don't actually want to fully-escape literals, because we
want URL auto-linking to still work. So in this patch, ::literal() is
still using htmlspecialchars() (like the original code did, at least
for the times it remembered to escape strings). That lets some
wikitext meta-characters through. Tightening up
FormatMetadata::literal() is left to a future patch, because it might
break some media pages if editors are deliberately (ab)using wikitext
markup here.

Bug: T266707
Change-Id: I289cf7687565612f8988a5c8a8211386e7904a44
2020-10-30 14:06:56 -04:00
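The trade-off described in 098bcb38e8 can be shown with a rough Python stand-in. Assumptions: `literal` here is a hypothetical analogue built on Python's `html.escape`, which only approximates PHP's htmlspecialchars(); the sample string is invented.

```python
import html


def literal(value: str) -> str:
    """HTML-escape a value while leaving wikitext and URLs untouched."""
    # Escapes &, < and > only; wikitext metacharacters such as [[ ]] survive.
    return html.escape(value, quote=False)


escaped = literal("<script>[[Main Page]] https://example.org</script>")
```

The HTML brackets are neutralised, but the `[[Main Page]]` link syntax and the bare URL pass through, which is the behaviour the commit message says is kept for now.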
jenkins-bot
21c265d524 Merge "Provide mechanism for MediaHandlers to override metadata formatting" 2020-10-30 17:12:40 +00:00
C. Scott Ananian
1c54541204 Deprecate FormatMetadata::flattenArrayContentLang()
There appear to be no callers to flattenArrayContentLang() in code
search.  It also takes a $noHtml parameter which is labeled as a
backward-compat hack and which we'd like to get rid of, in case
some caller of flattenArrayContentLang() does turn up.

Also mark FormatMetadata::flattenArrayReal() as @internal.  It's a
poorly named method and no one outside the media-handling code should
need to use it.

Change-Id: Ie3296d115816280740b5c182e7d0e2df20a4f909
2020-10-30 12:15:59 -04:00
Thiemo Kreuz
fe252d715a media: Fix mismatching/incomplete PHPDocs related to metadata
* Use false instead of bool in PHPDocs, because that's the only bool
value that's allowed.

* This patch also fixes DjVuHandler::getPageText() not returning
strings, but XML objects. This kind of "worked" because all consuming
code magically casts these to strings. But this is an actual violation
of the contract of the method. This is also why the test was doing this
weird (string) cast, instead of actually testing the type of the return
value.

Change-Id: I00db6b910f1de6d37a80543b8a5dd5ea3bab3c76
2020-10-30 11:59:42 -04:00
C. Scott Ananian
916a1777ef Provide mechanism for MediaHandlers to override metadata formatting
This avoids the incorrect use of the default formatting, which (for
extension tags not handled in core) will usually assume the tag value
is numeric.

Bug: T266677
Change-Id: I184a7976f2e63f2e70a87257d7749af688659c9d
2020-10-30 11:50:08 -04:00
Umherirrender
d621adbcb6 build: Updating mediawiki/mediawiki-codesniffer to 32.0.0
Exclude failing sniffs, to be fixed in follow-ups
Includes some simple fixes; most are autofixes

Change-Id: I5bb4743f08618bb6226bc2a4cc7f4d73a7ad142d
2020-10-28 20:06:22 +00:00
Thiemo Kreuz
b0130ca649 Update a lot of unspecific "array" types in PHPDocs
This includes fixing some mistakes, as well as removing
redundant text that doesn't add new information, either because
it literally repeats what the code already says, or is actually
duplicated.

Change-Id: I3a8dd8ce57192deda8916cc444c87d7ab1a36515
2020-10-28 11:01:33 +01:00
Reedy
4760265acb media: Fix case of FlashPixVersion in FormatMetadata::makeFormattedData()
FlashpixVersion was already handled, but the switch is case sensitive.

FlashPixVersion is the correct capitalisation, as per
https://github.com/php/php-src/blame/master/ext/exif/exif.c#L725

Bug: T263592
Change-Id: Ib78c87c357dc2efeb69d89b39a82b6ca38968372
2020-10-26 04:14:52 +00:00
C. Scott Ananian
cd461c211e Don't try to formatNum() non-numeric media metadata
Bug: T263592
Change-Id: I7b6d6011d3db4c93697e3fff1f184c0967fe6389
2020-10-23 14:22:08 -04:00
Ammar Abdulhamid
a31b638245 Typehint FormatMetadata::collapseContactInfo()
The method expects array, but was given a string.
Since there's only one caller, the caller is fixed
and the method typehinted.

Also fix doc comment

Bug: T257497
Change-Id: I67c337c4ee95ca30d968b89251dbbe077d2110e3
2020-07-14 20:26:47 +00:00
Umherirrender
4f45c8b2de media: Remove truthy check on array in FormatMetadata
Change-Id: Iafe3741e619437d9fb72c46882395983b4de27b0
2020-06-20 20:54:38 +00:00
Umherirrender
47f422d59f Check for INF instead of false for message exif-maxaperturevalue-value
The ** operator cannot return false, but it can return INF

Change-Id: Ib08a94720fb70b845753b6640e3c0ff5e6894322
2020-06-19 08:51:39 +00:00
Tim Starling
47a1619027 Remove terminating line breaks from debug messages
A terminating line break has not been required in wfDebug() since 2014,
however no migration was done. Some of these line breaks found their way
into LoggerInterface::debug() calls, where they mess up the formatting
of the debug log.

So, remove terminating line breaks from wfDebug() and
LoggerInterface::debug() calls.

Also:
* Fix the stripping of leading line breaks from the log header emitted
  by Setup.php. This feature, accidentally broken in 2014, allows
  requests to be distinguished in the log file.
* Avoid using the global variable $self.
* Move the logging of the client IP back to Setup.php. It was moved to
  WebRequest in the hopes that it would not always be needed, however
  $wgRequest->getIP() is now called unconditionally a few lines up in
  Setup.php. This means that it is put in its proper place after the
  "start request" message.
* Wrap the log header code in a closure so that variables like $name do
  not leak into global scope.
* In Linker.php, remove a few instances of an unnecessary second
  parameter to wfDebug().

Change-Id: I96651d3044a95b9d210b51cb8368edc76bebbb9e
2020-06-03 12:01:16 +10:00
Tim Starling
68c433bd23 Hooks::run() call site migration
Migrate all callers of Hooks::run() to use the new
HookContainer/HookRunner system.

General principles:
* Use DI if it is already used. We're not changing the way state is
  managed in this patch.
* HookContainer is always injected, not HookRunner. HookContainer
  is a service, it's a more generic interface, it is the only
  thing that provides isRegistered() which is needed in some cases,
  and a HookRunner can be efficiently constructed from it
  (confirmed by benchmark). Because HookContainer is needed
  for object construction, it is also needed by all factories.
* "Ask your friendly local base class". Big hierarchies like
  SpecialPage and ApiBase have getHookContainer() and getHookRunner()
  methods in the base class, and classes that extend that base class
  are not expected to know or care where the base class gets its
  HookContainer from.
* ProtectedHookAccessorTrait provides protected getHookContainer() and
  getHookRunner() methods, getting them from the global service
  container. The point of this is to ease migration to DI by ensuring
  that call sites ask their local friendly base class rather than
  getting a HookRunner from the service container directly.
* Private $this->hookRunner. In some smaller classes where accessor
  methods did not seem warranted, there is a private HookRunner property
  which is accessed directly. Very rarely (two cases), there is a
  protected property, for consistency with code that conventionally
  assumes protected=private, but in cases where the class might actually
  be overridden, a protected accessor is preferred over a protected
  property.
* The last resort: Hooks::runner(). Mostly for static, file-scope and
  global code. In a few cases it was used for objects with broken
  construction schemes, out of horror or laziness.

Constructors with new required arguments:
* AuthManager
* BadFileLookup
* BlockManager
* ClassicInterwikiLookup
* ContentHandlerFactory
* ContentSecurityPolicy
* DefaultOptionsManager
* DerivedPageDataUpdater
* FullSearchResultWidget
* HtmlCacheUpdater
* LanguageFactory
* LanguageNameUtils
* LinkRenderer
* LinkRendererFactory
* LocalisationCache
* MagicWordFactory
* MessageCache
* NamespaceInfo
* PageEditStash
* PageHandlerFactory
* PageUpdater
* ParserFactory
* PermissionManager
* RevisionStore
* RevisionStoreFactory
* SearchEngineConfig
* SearchEngineFactory
* SearchFormWidget
* SearchNearMatcher
* SessionBackend
* SpecialPageFactory
* UserNameUtils
* UserOptionsManager
* WatchedItemQueryService
* WatchedItemStore

Constructors with new optional arguments:
* DefaultPreferencesFactory
* Language
* LinkHolderArray
* MovePage
* Parser
* ParserCache
* PasswordReset
* Router

setHookContainer() now required after construction:
* AuthenticationProvider
* ResourceLoaderModule
* SearchEngine

Change-Id: Id442b0dbe43aba84bd5cf801d86dedc768b082c7
2020-05-30 14:23:28 +00:00
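The injection principle from 68c433bd23 (inject the container, build the runner from it) can be modelled in a few lines of Python. This is a simplified sketch under stated assumptions: the class names mirror the commit message, but every method, the `ExampleService` class and the `PageSaved` hook are hypothetical and do not reproduce the MediaWiki API.

```python
from typing import Callable, Dict, List


class HookContainer:
    """Generic service: stores handlers and can report whether any exist."""

    def __init__(self) -> None:
        self._handlers: Dict[str, List[Callable[..., None]]] = {}

    def register(self, name: str, handler: Callable[..., None]) -> None:
        self._handlers.setdefault(name, []).append(handler)

    def is_registered(self, name: str) -> bool:
        return bool(self._handlers.get(name))

    def run(self, name: str, *args) -> None:
        for handler in self._handlers.get(name, []):
            handler(*args)


class HookRunner:
    """Typed convenience wrapper, cheaply constructed from a container."""

    def __init__(self, container: HookContainer) -> None:
        self._container = container

    def on_page_saved(self, title: str) -> None:
        self._container.run("PageSaved", title)


class ExampleService:
    """Receives the container by injection and derives its own runner."""

    def __init__(self, hook_container: HookContainer) -> None:
        self._hook_container = hook_container
        self._hook_runner = HookRunner(hook_container)

    def save(self, title: str) -> None:
        self._hook_runner.on_page_saved(title)


seen: List[str] = []
container = HookContainer()
container.register("PageSaved", seen.append)
ExampleService(container).save("Main Page")
```

Injecting the container rather than the runner keeps `is_registered()` available to the service while still letting it call typed hook methods.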
Petr Pchelko
204fa7e509 Remove usages of deprecated Language methods
Change-Id: Iad3375b141b1d87c890baec6ecd16ed92f93e699
2020-02-16 00:45:48 +00:00
jenkins-bot
9d8954a372 Merge "build: Upgrade phan to 0.9.0" 2019-12-09 16:15:27 +00:00
Daimona Eaytoy
ce0856b12f Fix more scalar types in docblocks
Change-Id: I574d4e261ab986e028c3ce26c4f0ec648b88a2ac
2019-12-08 17:59:08 +00:00
Daimona Eaytoy
598c4d7fcb build: Upgrade phan to 0.9.0
Scalar casts are still allowed (for now), because there's a huge amount
of false positives. Ditto for invalid array offsets.

Thoughts about the rest: luckily, many false positives with array offsets
have gone. Moreover, since *Internal issues are suppressed in the base
config, we can remove inline suppressions.

Unfortunately, there are a couple of new issues about array additions
with only false positives, because apparently they don't take
branches into account.

Change-Id: I5a3913c6e762f77bfdae55051a395fae95d1f841
2019-12-07 20:16:19 +00:00
jenkins-bot
e185eb21cf Merge "media: Log and fail gracefully on invalid EXIF coordinates" 2019-11-29 21:53:03 +00:00
Thiemo Kreuz
f6787ede2d media: Log and fail gracefully on invalid EXIF coordinates
The $coord value is a value extracted from the EXIF section of an
image file. We expect it to be a float, but there is no guarantee this
is the case. It could, for example, be an empty string.

I suggest this trivial fix. It does have the following effects:
* Instead of logging a PHP notice when floor() hits something that is
  not a number, I try to log something that's more useful for later,
  more in-depth debugging. Note this log call isn't necessarily meant
  to stay, but to find an even better fix for this issue.
* I return the string as it is. If it's "foo", the user will see "foo"
  instead of "0° 0′ 0″ N", which wasn't helpful.

Also note how wrong and misleading the PHPDoc block for this function
was.

Bug: T226751
Change-Id: I1ca98728de4113ee1ae4362bd3e62b425d589388
2019-11-29 14:08:01 +01:00
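The behaviour described in f6787ede2d, sketched in Python under assumptions: `format_coord` and the logger name are hypothetical, the output layout follows the "0° 0′ 0″ N" example in the commit message, and rounding edge cases (seconds rounding up to 60) are ignored.

```python
import logging
import math

logger = logging.getLogger("exif-sketch")


def format_coord(coord, kind: str):
    """Format a decimal coordinate, or hand back invalid input unchanged."""
    if isinstance(coord, bool) or not isinstance(coord, (int, float)):
        # Log for later debugging instead of letting floor() complain.
        logger.debug("Invalid EXIF coordinate: %r", coord)
        return coord  # the user sees "foo" rather than a misleading 0° 0′ 0″
    if kind == "latitude":
        hemisphere = "N" if coord >= 0 else "S"
    else:
        hemisphere = "E" if coord >= 0 else "W"
    value = abs(coord)
    degrees = math.floor(value)
    minutes_float = (value - degrees) * 60
    minutes = math.floor(minutes_float)
    seconds = round((minutes_float - minutes) * 60)
    return f"{degrees}° {minutes}′ {seconds}″ {hemisphere}"


valid = format_coord(52.5, "latitude")
invalid = format_coord("foo", "latitude")
```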
Umherirrender
c7ad21c25f Improve param docs
Change-Id: I746a69f6ed01c3ff000da125457df62b02d13b34
2019-11-28 19:08:59 +01:00
Daimona Eaytoy
e70b5b3309 Unsuppress other phan issues (part 4)
Bug: T231636
Depends-On: I58e67c2b38389df874438deada4239510d21654f
Change-Id: I6e5fba7bd273219b1206559420b5bdb78734aa84
2019-08-31 17:13:39 +00:00
Derk-Jan Hartman
7c68604e4c Recognize exif values for Apple iOS photo modes
CustomRendered value 2-8 are used by Apple to indicate the processing
modes used like HDR, Portrait and Panorama.

Bug: T231385
Change-Id: I767a81a8bebdf25c230b104d35236a4b38cbe4ed
2019-08-27 20:37:02 +00:00
Thiemo Kreuz
3a66680ec5 Simplify a few list() that only care about the first element
The nice thing about explode() is that the resulting array is
guaranteed to contain at least one element. The array can not be
empty.

In some of these cases it might be possible to use strstr() instead,
but that returns false when the needle character is not
found. explode() returns the original string in this case.

Change-Id: I6ad1f3273defeaf36e2305fd871eaaf9d3c1e134
2019-05-17 16:54:47 +02:00
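A Python analogue of the reasoning in 3a66680ec5 (an illustration of the same pitfall, not PHP semantics; both helper names are hypothetical): a split always yields at least one element, whereas a search-based approach needs an explicit "not found" branch.

```python
def first_part(text: str, sep: str) -> str:
    """Split-based: the result list is never empty, so [0] is always safe."""
    return text.split(sep)[0]


def first_part_by_search(text: str, sep: str) -> str:
    """Search-based: the 'separator missing' case must be handled by hand."""
    index = text.find(sep)
    return text if index == -1 else text[:index]


with_sep = first_part("image/jpeg", "/")
without_sep = first_part("plain", "/")
```

Both helpers agree on every input, but only the second one can go wrong if the missing-separator branch is forgotten.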
Aryeh Gregor
90d4f56fe4 Mass conversion of $wgContLang to service
Brought to you by vim macros.

Bug: T200246
Change-Id: I79e919f4553e3bd3eb714073fed7a43051b4fb2a
2018-08-11 22:44:29 -06:00
Fomafix
6866cfec37 Simplify PHP by using ?? and ?:
Also remove unnecessary surrounding parentheses.

Change-Id: I0eb5c9c1bdfb09a800258379cdcefb5fd4d3d21c
2018-07-10 20:03:17 +00:00
Edward Chernenko
0a4a274b62 Fix PHP7 warning "non well formed numeric value encountered"
PHP 7.1 warns when a non-numeric string is implicitly cast to integer.

Change-Id: Ia46ea793e9495548c7d421b3372f6deaeda163f5
2018-06-26 01:46:47 +03:00
Edward Chernenko
d88e924b6e Fix PHP warnings "preg_replace(): [...] invalid range in character class"
This was spotted when running tests on Travis (PHP 7.3 nightly, trusty).

Two expressions inside preg_replace() contained non-escaped "-" inside [],
where this "-" meant an actual "-" character.
The warning is because "-" has special meaning inside [] ("a-z" for range),
and things like [\w-.] are considered "invalid range".

Solution is to escape "-" like this: [\w\-.]

Change-Id: I41cc217081f00f54d957b6d8052ee209412f5ff6
2018-06-19 00:11:33 +00:00
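The character-class issue from d88e924b6e also exists in other regex engines. The Python sketch below is an analogue, not the PHP code: Python's `re` module rejects the unescaped form outright instead of warning, and `is_valid_pattern` and the sample file name are hypothetical.

```python
import re


def is_valid_pattern(pattern: str) -> bool:
    """Report whether the regex engine accepts the pattern."""
    try:
        re.compile(pattern)
        return True
    except re.error:
        return False


ambiguous_ok = is_valid_pattern(r"[\w-.]")   # "-" after \w reads as a range
escaped_ok = is_valid_pattern(r"[\w\-.]")    # escaped "-" is a plain hyphen

# Replace everything except word characters, hyphens and dots.
cleaned = re.sub(r"[^\w\-.]", "_", "my file-name.jpg")
```

Escaping the hyphen removes the ambiguity regardless of where it sits inside the brackets.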
Bartosz Dziewoński
4fd27f006f Use PHP 5.6 '**' operator instead of 'pow()' function
Change-Id: Ieb22e1dbfcffaa4e7b3dcfabbcc999e5dd59a4bf
2018-05-30 18:05:19 -07:00
Kunal Mehta
e0193327bd Fix MediaWiki.Commenting.LicenseComment.InvalidLicenseTag errors
Change-Id: I936c3f5fca1a0061f215e80469f5d882cb32ee29
2018-05-23 16:23:42 -07:00
Mark A. Hershberger
47a7977b15 Make FormatMetadata::flattenArrayReal() work for an associative array
Bug: T87572
Change-Id: I19490ebbbdc3613ae2116c6890ca470bb9f332db
2018-01-05 17:31:38 +00:00
Sébastien Santoro
241e741377 media: Ensure there is enough data to extract software version
The Software EXIF / other metadata field was expected to contain
the software name followed by the version number.

There are occurrences in Wikimedia production logs of errors showing
that's not always the case.

Bug: T178130
Change-Id: I4187a41b5fd8d7b5574ab50523668d8feb11bccc
2017-12-06 21:16:17 +00:00
Brad Jorsch
2c34fd6e0e Replace uses of each()
It's deprecated in PHP 7.2, may as well replace it now.

I note that, contrary to claims at
https://wiki.php.net/rfc/deprecations_php_7_2#each, none of our uses
were trivially replaceable with foreach.

* wfArrayDiff2_cmp() is processing two arrays by value in parallel.
* MagicWordArray::parseMatch() is doing something funky with the data
  structure returned by preg_match().
* HashRing was using it like "nextKey()", replaced with calls to key()
  and next().
* FormatMetadata and IndexPager were both using it as a shorter way to
  get both key() and current() for the first element in the array. I
  suppose a foreach(){ break; } would do the same, but that's confusing.

Bug: T174354
Change-Id: I36169a04c764fdf1bfd6603395111c6fe0aae5eb
2017-09-20 09:51:28 -04:00
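The FormatMetadata/IndexPager use case in 2c34fd6e0e (fetching the key and value of the first element) looks like this in Python. It is only an analogue of the idea; `first_item` and the sample data are hypothetical.

```python
def first_item(mapping: dict):
    """Return the (key, value) pair of the first element, or None if empty."""
    # Equivalent in spirit to reading key() and current() on a fresh array,
    # without the confusing "loop that breaks immediately" form.
    return next(iter(mapping.items()), None)


pair = first_item({"Software": "ExampleCam 1.0", "Model": "X"})
empty = first_item({})
```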
Umherirrender
a9007e8baf Add missing & to @param documentation to match function call
Change-Id: I81e68310abcbc59964b22e0e74842d509f6b1fb9
2017-08-11 18:47:46 +02:00
jenkins-bot
171255101f Merge "Use file width/height instead of metadata for getContentHeaders" 2017-05-24 06:44:54 +00:00
Kunal Mehta
e37a7f257a media: Avoid deprecated wfMemcKey()
And ObjectCache::getMainWANInstance() while we're at it.

Change-Id: Ib22bd134c3faa56f8d8f111bb9ed99d826cbed40
2017-05-23 21:01:09 -07:00