147 commits

Author SHA1 Message Date
James D. Forrester
4bae64d1c7 Namespace includes/context
Bug: T353458
Change-Id: I4dbef138fd0110c14c70214282519189d70c94fb
2024-02-08 11:07:01 -05:00
thiemowmde
0ea7599e3e Replace unspecific exceptions with InvalidArgumentException
A LogicException is very generic and doesn't mean much. An
InvalidArgumentException is also a LogicException, but more
specific: A method was called (for whatever reason – bad code,
bad user input – we don't know) with an invalid, unsupported
argument. This is exactly what's going on in these cases.

BadMethodCallException does have a confusing name and is often
misused because of this. It's documented as "thrown if a callback
refers to an undefined method or if some arguments are missing".
That's something else and not what's going on in these cases.

Change-Id: Id446227f578ba701e22acd5e530ffb795e76c147
2024-01-20 21:10:12 +01:00
Piotr Miazga
cfb64b8585 media: handle empty strings when parsing flash exif metadata
On commons wiki there is at least one PNG file with Flash exif
metadata set to an empty string ( `exif:Flash=""` ). Every pageview
causes FormatMetadata::makeFormattedData() to throw a PHP warning,
as it cannot perform bitwise operations between an empty string
and an integer.
Exif tools recognize this value as "flash not fired", so when we
get an empty string, let's treat it as 0.

Bug: T350893
Change-Id: Idd7d4a2bbac377d5f6506f5df2c91e4912559c41
2023-11-30 19:29:03 +01:00
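
A minimal sketch of the idea in commit cfb64b8585, assuming a hypothetical helper named normalizeFlashValue() (the real change lives inside FormatMetadata and may look different). It shows only the normalization step: an empty string becomes 0 before any bitwise test is applied.

```php
<?php
// Hypothetical helper for illustration; not the actual FormatMetadata code.
function normalizeFlashValue( $val ): int {
	// An empty string is read as 0, i.e. "flash not fired".
	if ( $val === '' ) {
		return 0;
	}
	return (int)$val;
}

$flash = normalizeFlashValue( '' );
// Bit 0 of the Exif Flash value indicates whether the flash fired.
$fired = ( $flash & 0x01 ) !== 0;
```

With the value normalized to an int first, the bitwise test never sees a string operand.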
Daimona Eaytoy
154f04299c Remove redundant empty() constructs (2)
empty() only makes sense when the expression it checks is possibly
undefined, otherwise it's equivalent to a truthiness check with the
additional downside of suppressing errors when it's not wanted.

Replace it with simple truthiness checks, using strict comparison when
that seems to help with polymorphic variables.

These were caught by a bespoke phan plugin.

Change-Id: I70b629dbf9e47cf3ba48ff439b18f19e839677f4
2023-09-08 23:28:11 +02:00
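
The replacement described in commit 154f04299c can be sketched roughly as follows. The function names summarizeTags() and firstLine() are hypothetical and do not come from the patch; they only contrast a plain truthiness check with a strict comparison on a polymorphic value.

```php
<?php
// Hypothetical examples; not taken from the actual patch.
function summarizeTags( array $tags ): string {
	// $tags is always defined here, so empty( $tags ) would add nothing
	// over a plain truthiness check.
	if ( !$tags ) {
		return 'no metadata';
	}
	return count( $tags ) . ' tag(s)';
}

/**
 * @param string|false $text Polymorphic: a string, or false on failure
 */
function firstLine( $text ): string {
	// A strict comparison separates false from falsy strings
	// such as '' and '0'.
	if ( $text === false ) {
		return '(unavailable)';
	}
	return explode( "\n", $text, 2 )[0];
}
```

Here firstLine( '0' ) returns '0', whereas an empty() check would have treated that input as missing.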
Daimona Eaytoy
44e8c7885b media: Replace deprecated MWException
Introduce new classes for checked exceptions.

Bug: T328220
Change-Id: Idbcdc09647a857e359e41ecec98212a8937c5c2e
2023-06-09 18:51:07 +02:00
Derk-Jan Hartman
9556256bda media: code style improvements
- Avoid unnecessary else branching
- Static vs non-static fixes
- string, int and float casting instead of strval,
  intval, floatval (faster and more readable)
- Strict comparisons (but not for all '' and 0 as might have
  implicit falsey behavior)
- Few spelling mistakes
- Remove TimestampException handling, caught by parent function

Change-Id: I08725c8e391965529a2766dfaf5d8f6cf8a86db8
2023-03-09 11:07:32 +00:00
Amir Sarabadani
7d8768e931 Reorg: Move HTML-related classes out of includes/ to Html/
Bug: T321882
Change-Id: I5dc1f7e9c303cd3f5b9dd7010d6bb470d8400a18
2023-02-16 20:40:01 +01:00
Matěj Suchánek
bd42c2720e Remove FormatMetadata::flattenArrayContentLang
It has been deprecated since 1.36 and it is unused.

Change-Id: I51fdd62207d9cc448ece1d2133c2d2cde8f77cc8
2022-11-04 09:24:14 -04:00
jenkins-bot
24d2b7430e Merge "Account for null values in Exif data" 2022-10-25 06:48:13 +00:00
Tim Starling
0077c5da15 Use short array destructuring instead of list()
Introduced in PHP 7.1. Because it's shorter and looks nice.

I used regex replacement.

Change-Id: I0555e199d126cd44501f859cb4589f8bd49694da
2022-10-21 15:33:37 +11:00
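
For reference, the syntax change in commit 0077c5da15 looks roughly like this. The variable names and the input string are made up for illustration.

```php
<?php
// Older form using list().
list( $width, $height ) = explode( 'x', '640x480' );

// Equivalent short array destructuring, available since PHP 7.1.
[ $shortWidth, $shortHeight ] = explode( 'x', '640x480' );
```

Both statements assign the same values; only the spelling differs.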
Sam Wilson
ea0e09e0cf Account for null values in Exif data
* Get Exif::validate() to return early for null values.
* Handle null values provided to FormatMetadata::literal() and
  formatNum().
* Change the return type hints accordingly.

Bug: T315202
Change-Id: Ic53031b02a518aa46b902e4bb66083834b539276
2022-10-21 12:49:49 +11:00
Fomafix
fdcaf249e3 Use spacey style also for code in comments and documentation
Change-Id: I5ab30489fb1241b89d1bec88a4d1fcebcc69bb9a
2022-08-12 08:59:10 +00:00
Matěj Suchánek
99aad54ea8 Hard deprecate FormatMetadata::flattenArrayContentLang
It has been deprecated since 1.36 and it is unused.

Change-Id: I316333f55526116e9510e4719fce2a93e7683ea6
2022-07-08 07:01:10 +00:00
Mark Shenouda
d4e3a74e9e FormatMetadata: PHP Notice: Array to string conversion
[x] Logging more data when it happens again

Bug: T297403
Change-Id: I4bf2f4204e990b2869c497ffd3a79af7cde34434
2022-07-05 17:09:15 +02:00
Umherirrender
8d235dd805 media: Improve documentation on Handler functions
Add missing @return and use false instead of bool where possible

Change-Id: Ie85a40987422e32aa02c56969e103371ec2e5b5f
2022-03-29 17:50:17 +00:00
Umherirrender
89e5632737 media: Limit result array of explode() to minimum needed
Limit the array for the input to the expected size (or one bigger).
This avoids big arrays on bad input and allows processing to continue
with the expected values in the array.

Change-Id: Iec1a85c29d928966c14cc0273b1a251dc0b6b738
2022-03-01 20:54:15 +01:00
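
A small sketch of the technique in commit 89e5632737, using a made-up input string. The third argument of explode() caps the number of elements, and the last element receives the unsplit remainder.

```php
<?php
// Illustrative input; two parts are expected ("numerator/denominator").
$raw = '1/250/extra/junk';

// Limit of 3 is one more than expected, so surplus input is detectable
// without building a large array.
$parts = explode( '/', $raw, 3 );
// $parts is [ '1', '250', 'extra/junk' ]

$hasSurplus = count( $parts ) > 2;
```

Choosing a limit one larger than the expected size keeps the expected values in place while still revealing malformed input.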
Umherirrender
9efd9ca45e Add explicit casts between scalar types
* Some functions accept only strings, so cast ints and floats to string
* After regex matches or explode(), cast numbers to int to do maths
* Cast unix timestamps to int to do maths
* Cast return values from timestamp format function to int
* Cast result of bitwise operator to bool when needed as bool

* PHP internal functions like floor/round/ceil are documented to return
  float; in most cases the result is used as int, so casts were added

Found by phan strict checks

Change-Id: Icb2de32107f43817acc45fe296fb77acf65c1786
2022-03-01 18:19:33 +01:00
Umherirrender
6b9a4a1fd8 Improve indent of very short lines in FormatMetadata
Change-Id: Icf69cb13d8f5b2d92cb350af372ce34116e9f7e3
2021-12-18 16:18:36 +00:00
Tim Starling
5150f19d65 media: Ignore EXIF tag GPSAltitudeRef in FormatMetadata
GPSAltitudeRef is no longer written to img_metadata, but for
compatibility with old rows, don't raise a formatnum warning, just
ignore the tag.

Bug: T285213
Change-Id: Icc074bc0d7bd6a84f73a26e8dd001be85cbef165
2021-06-29 00:44:25 +00:00
Tim Starling
9c3c0b704b Use array_fill_keys() instead of array_flip() if that reflects the developer's intention
array_fill_keys() was introduced in PHP 5.2.0 and works like
array_flip() except that it does only one thing (copying keys) instead
of two things (copying keys and values). That makes it faster and more
obvious.

When array_flip() calls were paired, I left them as is, because that
pattern is too cute. I couldn't kill something so cute.

Sometimes it was hard to figure out whether the values in array_flip()
result were used. That's the point of this change. If you use
array_fill_keys(), the intention is obvious.

Change-Id: If8d340a8bc816a15afec37e64f00106ae45e10ed
2021-06-15 00:11:10 +00:00
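
The difference described in commit 9c3c0b704b can be seen in a short sketch. The tag names are illustrative only.

```php
<?php
$tags = [ 'Make', 'Model' ];

// array_flip() swaps keys and values, so the old indexes become values.
$flipped = array_flip( $tags );
// $flipped is [ 'Make' => 0, 'Model' => 1 ]

// array_fill_keys() only builds the key set, with one explicit value.
$set = array_fill_keys( $tags, true );
// $set is [ 'Make' => true, 'Model' => true ]

// Either result works for a membership test, but only the second
// makes it obvious that the values are not meaningful.
$known = isset( $set['Make'] );
```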
Umherirrender
03ed01445d Avoid double escape of exif message in FormatMetadata
Change-Id: I1aece2b2c26a06b887c6aa71719de8d4574f9dcc
2021-04-12 23:24:33 +02:00
Umherirrender
e4d1a2c8bd Use __CLASS__/::class to define callback for array_map/_filter/usort
Change-Id: I3519dd5a1ce1ea688de602190cd74755c400c717
2021-01-22 16:39:29 +00:00
Reedy
7acc57cff9 media: Swap second if for elseif in FormatMetadata::sanitizeKeyForAPI()
Also swap == for ===

Noted in T268133 as a minor optimisation

Change-Id: I7c2198b68cd91dc7a642d0c1f6ce3bf39aeccc41
2020-11-18 12:58:55 +00:00
jenkins-bot
1976283835 Merge "Update a lot of unspecific "array" types in PHPDocs" 2020-11-13 21:48:24 +00:00
C. Scott Ananian
c79d3289f8 Downgrade the severity of the non-numeric argument to formatNum warnings
The core Language::formatNum() method w/ a non-numeric argument is
deprecated, but it may take us a while to chase down more of the long
tail of callers so temporarily downgrade the severity of the
hard-deprecation warning.

The uses in media (FormatMetadata) are due to unknown EXIF tags (which
we should eventually teach our EXIF parser about so they get localized
properly) and bad metadata in uploaded images (which we will probably
never fix).  Downgrade the severity of these logs permanently so we
can track down unknown EXIF tags w/o flooding our logs w/ every bad
image uploaded to commons.

Bug: T267370
Bug: T267587
Change-Id: I778daed5c2d23ff880ada9e226902ad97b6d00c4
2020-11-10 12:44:22 -06:00
jenkins-bot
5072cb83b1 Merge "media: Support Google panorama XMP properties" 2020-11-10 15:39:12 +00:00
jenkins-bot
568628e432 Merge "media: EXIF SubSecTime* are text not numeric" 2020-11-10 15:39:06 +00:00
C. Scott Ananian
bdf8bcf71b media: Support Google panorama XMP properties
Defining appropriate handlers for the non-numeric tags
(FirstPhotoDate, LastPhotoDate, ProjectionType, UsePanoramaViewer,
ExposureLockUsed) ensures we don't try to ::formatNum() something
which isn't a number.

But also define localizations for all the tag names, including the
numeric tags.

These extensions are documented at
https://developers.google.com/streetview/spherical-metadata

Bug: T267370
Bug: T230960
Change-Id: I10c8c52190333fcaf025c8a91ff547d00265dbe2
2020-11-10 15:11:44 +00:00
C. Scott Ananian
24b77396c7 media: EXIF SubSecTime* are text not numeric
SubSecTime are the 'digits after the decimal point'.  It can be a value
such as '014' which is not the same as '14'.

Bug: T267370
Change-Id: Id874295c285cbe587747fc906cd2d3832b75f7f5
2020-11-09 15:49:29 -05:00
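
A brief sketch of why commit 24b77396c7 treats SubSecTime as text. The timestamp value is made up, and the string concatenation is only an illustration, not the formatting FormatMetadata actually performs.

```php
<?php
// SubSecTime holds the digits after the decimal point, as text.
$subSec = '014';

// Converting to a number drops the leading zero and changes the meaning.
$asInt = (int)$subSec;

// Kept as a string, the fraction stays intact: .014 rather than .14.
$withFraction = '15:49:29' . '.' . $subSec;
$wrongFraction = '15:49:29' . '.' . $asInt;
```

The two results differ by a factor of ten in the fractional part, which is the distinction the commit message points out.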
C. Scott Ananian
58e299d95f media: Filter out GPSAltitudeRef exif tag
The tag is processed to normalize the value of the GPSAltitude tag,
but sometimes sticks around in the output if the GPSAltitude is
missing or invalid.  The value of GPSAltitudeRef is single byte,
encoded as a string -- "\0" or "\1" -- but when accessing a file using
instant commons (or generally, using ForeignFileRepo) the "\0" is
munged by the action API which results in a bogus U+FFFD appearing in
the EXIF metadata output.

This patch reverts e6a04bf977ed9372e8b58a44765fde8edd1ec50b
(If85081f023a59b56557b9a74b8431c7be332d7e5) which addressed the same
issue (bogus GPSAltitudeRef values appearing on the File page) but did
it by 'supporting' the bogus U+FFFD value.  Better to preprocess the
EXIF metadata to remove the need for GPSAltitudeRef entirely.

Bug: T267370
Change-Id: Icb705c8f3248e14257532e2689731e67ead7c722
2020-11-09 13:23:39 -05:00
C. Scott Ananian
7dc06656db media: Support GPSAltitudeRef exif tag
JPG images generated by the Nikon COOLPIX AW100 camera (and others)
create bogus non-numeric GPSAltitudeRef tags in their EXIF data. That
may not be the camera's fault, maybe it's actually PHP which is
misparsing the tag -- the value is the "unicode replacement character"
U+FFFD which is very suspicious.  In any case, we have a decent number
of images on commons uploaded from these cameras as it turns out, so
it generates a large-ish amount of logspam if we can't handle their
bogus data correctly. Add support for GPSAltitudeRef, including the
non-standard non-numeric value emitted when photos from these cameras
are processed.

Bug: T267370
Change-Id: If85081f023a59b56557b9a74b8431c7be332d7e5
2020-11-06 14:03:52 -05:00
jenkins-bot
b8eb2c4beb Merge "Extend FormatMetadata to handle non-numeric EXIF tags from PDF XMP data" 2020-10-30 22:23:57 +00:00
jenkins-bot
970ad46508 Merge "Deprecate FormatMetadata::flattenArrayContentLang()" 2020-10-30 19:22:55 +00:00
jenkins-bot
6e7d0bf1fe Merge "Ensure FormatMetadata::makeFormattedData always escapes EXIF values" 2020-10-30 19:12:37 +00:00
C. Scott Ananian
8dbf403b1a Extend FormatMetadata to handle non-numeric EXIF tags from PDF XMP data
The \Wikimedia\XMPReader\Info class defines the set of EXIF tags
parsed from PDF XMP data.  It includes a number of EXIF tags not
previously handled by FormatMetadata::makeFormattedData().  These are
EXIF tags, not PDF-specific tags, so they ought to be handled here in
core not in the PDFHandler extension.

Numeric tags are handled via the default fallthrough, but add cases
for structured and string data to ensure we don't try to coerce these
values to numbers.

Bug: T266677
Change-Id: Ifc0bb1eda4c4e390014dae64500bdb01b5c6a2e5
2020-10-30 15:04:04 -04:00
C. Scott Ananian
098bcb38e8 Ensure FormatMetadata::makeFormattedData always escapes EXIF values
This is a step in addressing T266707, by ensuring that every path
through FormatMetadata::makeFormattedData() passes the value through
either a specific formatting method, or else FormatMetadata::literal(),
which should properly wikitext-escape the value.

However, we don't actually want to fully-escape literals, because we
want URL auto-linking to still work. So in this patch, ::literal() is
still using htmlspecialchars() (like the original code did, at least
for the times it remembered to escape strings). That lets some
wikitext meta-characters through. Tightening up
FormatMetadata::literal() is left to a future patch, because it might
break some media pages if editors are deliberately (ab)using wikitext
markup here.

Bug: T266707
Change-Id: I289cf7687565612f8988a5c8a8211386e7904a44
2020-10-30 14:06:56 -04:00
jenkins-bot
21c265d524 Merge "Provide mechanism for MediaHandlers to override metadata formatting" 2020-10-30 17:12:40 +00:00
C. Scott Ananian
1c54541204 Deprecate FormatMetadata::flattenArrayContentLang()
There appear to be no callers to flattenArrayContentLang() in code
search.  It also takes a $noHtml parameter which is labeled as a
backward-compat hack and which we'd like to get rid of, in case
some caller of flattenArrayContentLang() does turn up.

Also mark FormatMetadata::flattenArrayReal() as @internal.  It's a
poorly named method and no one outside the media-handling code should
need to use it.

Change-Id: Ie3296d115816280740b5c182e7d0e2df20a4f909
2020-10-30 12:15:59 -04:00
Thiemo Kreuz
fe252d715a media: Fix mismatching/incomplete PHPDocs related to metadata
* Use false instead of bool in PHPDocs, because that's the only bool
value that's allowed.

* This patch also fixes DjVuHandler::getPageText() not returning
strings, but XML objects. This kind of "worked" because all consuming
code magically casts these to strings. But this is an actual violation
of the contract of the method. This is also why the test was doing this
weird (string) cast, instead of actually testing the type of the return
value.

Change-Id: I00db6b910f1de6d37a80543b8a5dd5ea3bab3c76
2020-10-30 11:59:42 -04:00
C. Scott Ananian
916a1777ef Provide mechanism for MediaHandlers to override metadata formatting
This avoids the incorrect use of the default formatting, which (for
extension tags not handled in core) will usually assume the tag value
is numeric.

Bug: T266677
Change-Id: I184a7976f2e63f2e70a87257d7749af688659c9d
2020-10-30 11:50:08 -04:00
Umherirrender
d621adbcb6 build: Updating mediawiki/mediawiki-codesniffer to 32.0.0
Exclude failing sniffs to fix in follow-ups
Includes some simple fixes; most are autofixes

Change-Id: I5bb4743f08618bb6226bc2a4cc7f4d73a7ad142d
2020-10-28 20:06:22 +00:00
Thiemo Kreuz
b0130ca649 Update a lot of unspecific "array" types in PHPDocs
This includes fixing some mistakes, as well as removing
redundant text that doesn't add new information, either because
it literally repeats what the code already says, or is actually
duplicated.

Change-Id: I3a8dd8ce57192deda8916cc444c87d7ab1a36515
2020-10-28 11:01:33 +01:00
Reedy
4760265acb media: Fix case of FlashPixVersion in FormatMetadata::makeFormattedData()
FlashpixVersion was already handled, but the switch is case sensitive.

FlashPixVersion is the correct capitalisation, as per
https://github.com/php/php-src/blame/master/ext/exif/exif.c#L725

Bug: T263592
Change-Id: Ib78c87c357dc2efeb69d89b39a82b6ca38968372
2020-10-26 04:14:52 +00:00
C. Scott Ananian
cd461c211e Don't try to formatNum() non-numeric media metadata
Bug: T263592
Change-Id: I7b6d6011d3db4c93697e3fff1f184c0967fe6389
2020-10-23 14:22:08 -04:00
Ammar Abdulhamid
a31b638245 Typehint FormatMetadata::collapseContactInfo()
The method expects array, but was given a string.
Since there's only one caller, the caller is fixed
and the method typehinted.

Also fix doc comment

Bug: T257497
Change-Id: I67c337c4ee95ca30d968b89251dbbe077d2110e3
2020-07-14 20:26:47 +00:00
Umherirrender
4f45c8b2de media: Remove truthy check on array in FormatMetadata
Change-Id: Iafe3741e619437d9fb72c46882395983b4de27b0
2020-06-20 20:54:38 +00:00
Umherirrender
47f422d59f Check for INF instead of false for message exif-maxaperturevalue-value
The ** operator cannot return false, only INF

Change-Id: Ib08a94720fb70b845753b6640e3c0ff5e6894322
2020-06-19 08:51:39 +00:00
Tim Starling
47a1619027 Remove terminating line breaks from debug messages
A terminating line break has not been required in wfDebug() since 2014,
however no migration was done. Some of these line breaks found their way
into LoggerInterface::debug() calls, where they mess up the formatting
of the debug log.

So, remove terminating line breaks from wfDebug() and
LoggerInterface::debug() calls.

Also:
* Fix the stripping of leading line breaks from the log header emitted
  by Setup.php. This feature, accidentally broken in 2014, allows
  requests to be distinguished in the log file.
* Avoid using the global variable $self.
* Move the logging of the client IP back to Setup.php. It was moved to
  WebRequest in the hopes that it would not always be needed, however
  $wgRequest->getIP() is now called unconditionally a few lines up in
  Setup.php. This means that it is put in its proper place after the
  "start request" message.
* Wrap the log header code in a closure so that variables like $name do
  not leak into global scope.
* In Linker.php, remove a few instances of an unnecessary second
  parameter to wfDebug().

Change-Id: I96651d3044a95b9d210b51cb8368edc76bebbb9e
2020-06-03 12:01:16 +10:00
Tim Starling
68c433bd23 Hooks::run() call site migration
Migrate all callers of Hooks::run() to use the new
HookContainer/HookRunner system.

General principles:
* Use DI if it is already used. We're not changing the way state is
  managed in this patch.
* HookContainer is always injected, not HookRunner. HookContainer
  is a service, it's a more generic interface, it is the only
  thing that provides isRegistered() which is needed in some cases,
  and a HookRunner can be efficiently constructed from it
  (confirmed by benchmark). Because HookContainer is needed
  for object construction, it is also needed by all factories.
* "Ask your friendly local base class". Big hierarchies like
  SpecialPage and ApiBase have getHookContainer() and getHookRunner()
  methods in the base class, and classes that extend that base class
  are not expected to know or care where the base class gets its
  HookContainer from.
* ProtectedHookAccessorTrait provides protected getHookContainer() and
  getHookRunner() methods, getting them from the global service
  container. The point of this is to ease migration to DI by ensuring
  that call sites ask their local friendly base class rather than
  getting a HookRunner from the service container directly.
* Private $this->hookRunner. In some smaller classes where accessor
  methods did not seem warranted, there is a private HookRunner property
  which is accessed directly. Very rarely (two cases), there is a
  protected property, for consistency with code that conventionally
  assumes protected=private, but in cases where the class might actually
  be overridden, a protected accessor is preferred over a protected
  property.
* The last resort: Hooks::runner(). Mostly for static, file-scope and
  global code. In a few cases it was used for objects with broken
  construction schemes, out of horror or laziness.

Constructors with new required arguments:
* AuthManager
* BadFileLookup
* BlockManager
* ClassicInterwikiLookup
* ContentHandlerFactory
* ContentSecurityPolicy
* DefaultOptionsManager
* DerivedPageDataUpdater
* FullSearchResultWidget
* HtmlCacheUpdater
* LanguageFactory
* LanguageNameUtils
* LinkRenderer
* LinkRendererFactory
* LocalisationCache
* MagicWordFactory
* MessageCache
* NamespaceInfo
* PageEditStash
* PageHandlerFactory
* PageUpdater
* ParserFactory
* PermissionManager
* RevisionStore
* RevisionStoreFactory
* SearchEngineConfig
* SearchEngineFactory
* SearchFormWidget
* SearchNearMatcher
* SessionBackend
* SpecialPageFactory
* UserNameUtils
* UserOptionsManager
* WatchedItemQueryService
* WatchedItemStore

Constructors with new optional arguments:
* DefaultPreferencesFactory
* Language
* LinkHolderArray
* MovePage
* Parser
* ParserCache
* PasswordReset
* Router

setHookContainer() now required after construction:
* AuthenticationProvider
* ResourceLoaderModule
* SearchEngine

Change-Id: Id442b0dbe43aba84bd5cf801d86dedc768b082c7
2020-05-30 14:23:28 +00:00
Petr Pchelko
204fa7e509 Remove usages of deprecated Language methods
Change-Id: Iad3375b141b1d87c890baec6ecd16ed92f93e699
2020-02-16 00:45:48 +00:00