The core Language::formatNum() method w/ a non-numeric argument is
deprecated, but it may take us a while to chase down more of the long
tail of callers so temporarily downgrade the severity of the
hard-deprecation warning.
The uses in media (FormatMetadata) are due to unknown EXIF tags (which
we should eventually teach our EXIF parser about so they get localized
properly) and bad metadata in uploaded images (which we will probably
never fix). Downgrade the severity of these logs permanently so we
can track down unknown EXIF tags w/o flooding our logs w/ every bad
image uploaded to commons.
Bug: T267370
Bug: T267587
Change-Id: I778daed5c2d23ff880ada9e226902ad97b6d00c4
Defining appropriate handlers for the non-numeric tags
(FirstPhotoDate, LastPhotoDate, ProjectionType, UsePanoramaViewer,
ExposureLockUsed) ensures we don't try to ::formatNum() something
which isn't a number.
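A minimal sketch of the idea, not the actual
FormatMetadata::makeFormattedData() code: formatTagSketch() is a
hypothetical helper, number_format() stands in for
Language::formatNum(), and treating all five tags as plain strings is
an assumption made for illustration.

```php
<?php
/**
 * Hypothetical helper: format one metadata value by tag name.
 * @param string $tag
 * @param mixed $value Scalar value read from the file
 * @return string
 */
function formatTagSketch( string $tag, $value ): string {
	switch ( $tag ) {
		// Non-numeric tags get an explicit case, so they never
		// reach the numeric default below.
		case 'FirstPhotoDate':
		case 'LastPhotoDate':
		case 'ProjectionType':
		case 'UsePanoramaViewer':
		case 'ExposureLockUsed':
			return htmlspecialchars( (string)$value );
		default:
			// Numeric fallthrough; still guard against bad data.
			if ( is_numeric( $value ) ) {
				return number_format( (float)$value );
			}
			return htmlspecialchars( (string)$value );
	}
}

echo formatTagSketch( 'ProjectionType', 'equirectangular' ), "\n";
echo formatTagSketch( 'SomeNumericTag', 4000 ), "\n";
```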
But also define localizations for all the tag names, including the
numeric tags.
These extensions are documented at
https://developers.google.com/streetview/spherical-metadata
Bug: T267370
Bug: T230960
Change-Id: I10c8c52190333fcaf025c8a91ff547d00265dbe2
SubSecTime holds the 'digits after the decimal point'. It can be a value
such as '014' which is not the same as '14'.
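To illustrate why the value must stay a string, here is a small
sketch. The variable names and the sample timestamp are made up for
the example.

```php
<?php
// Hypothetical raw EXIF values.
$dateTime = '2020:11:09 12:34:56';
$subSecTime = '014';

// Treating the value as a number drops the leading zero.
$asNumber = (string)(int)$subSecTime;

// As a string the fraction is .014 s; as a number it becomes .14 s.
$correct = $dateTime . '.' . $subSecTime;
$wrong = $dateTime . '.' . $asNumber;

echo $correct, "\n";
echo $wrong, "\n";
```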
Bug: T267370
Change-Id: Id874295c285cbe587747fc906cd2d3832b75f7f5
The GPSAltitudeRef tag is processed to normalize the value of the
GPSAltitude tag, but sometimes sticks around in the output if the
GPSAltitude is missing or invalid. The value of GPSAltitudeRef is a
single byte,
encoded as a string -- "\0" or "\1" -- but when accessing a file using
instant commons (or generally, using ForeignFileRepo) the "\0" is
munged by the action API which results in a bogus U+FFFD appearing in
the EXIF metadata output.
This patch reverts e6a04bf977ed9372e8b58a44765fde8edd1ec50b
(If85081f023a59b56557b9a74b8431c7be332d7e5) which addressed the same
issue (bogus GPSAltitudeRef values appearing on the File page) but did
it by 'supporting' the bogus U+FFFD value. Better to preprocess the
EXIF metadata to remove the need for GPSAltitudeRef entirely.
Bug: T267370
Change-Id: Icb705c8f3248e14257532e2689731e67ead7c722
JPG images generated by the Nikon COOLPIX AW100 camera (and others)
contain bogus non-numeric GPSAltitudeRef tags in their EXIF data. That
may not be the camera's fault; maybe PHP is actually misparsing the
tag -- the value is the "unicode replacement character" U+FFFD, which
is very suspicious. In any case, we have a decent number
of images on commons uploaded from these cameras as it turns out, so
it generates a large-ish amount of logspam if we can't handle their
bogus data correctly. Add support for GPSAltitudeRef, including the
non-standard non-numeric value emitted when photos from these cameras
are processed.
Bug: T267370
Change-Id: If85081f023a59b56557b9a74b8431c7be332d7e5
The \Wikimedia\XMPReader\Info class defines the set of EXIF tags
parsed from PDF XMP data. It includes a number of EXIF tags not
previously handled by FormatMetadata::makeFormattedData(). These are
EXIF tags, not PDF-specific tags, so they ought to be handled here in
core not in the PDFHandler extension.
Numeric tags are handled via the default fallthrough, but add cases
for structured and string data to ensure we don't try to coerce these
values to numbers.
Bug: T266677
Change-Id: Ifc0bb1eda4c4e390014dae64500bdb01b5c6a2e5
This is a step in addressing T266707, by ensuring that every path
through FormatMetadata::makeFormattedData() passes the value through
either a specific formatting method, or else FormatMetadata::literal(),
which should properly wikitext-escape the value.
However, we don't actually want to fully-escape literals, because we
want URL auto-linking to still work. So in this patch, ::literal() is
still using htmlspecialchars() (like the original code did, at least
for the times it remembered to escape strings). That lets some
wikitext meta-characters through. Tightening up
FormatMetadata::literal() is left to a future patch, because it might
break some media pages if editors are deliberately (ab)using wikitext
markup here.
Bug: T266707
Change-Id: I289cf7687565612f8988a5c8a8211386e7904a44
There appear to be no callers to flattenArrayContentLang() in code
search. It also takes a $noHtml parameter, which is labeled as a
backward-compat hack and which we'd like to get rid of, should some
caller of flattenArrayContentLang() turn up after all.
Also mark FormatMetadata::flattenArrayReal() as @internal. It's a
poorly named method and no one outside the media-handling code should
need to use it.
Change-Id: Ie3296d115816280740b5c182e7d0e2df20a4f909
* Use false instead of bool in PHPDocs, because that's the only bool
value that's allowed.
* This patch also fixes DjVuHandler::getPageText() not returning
strings, but XML objects. This kind of "worked" because all consuming
code magically casts these to strings. But this is an actual violation
of the contract of the method. This is also why the test was doing this
weird (string) cast, instead of actually testing the type of the return
value.
Change-Id: I00db6b910f1de6d37a80543b8a5dd5ea3bab3c76
This avoids the incorrect use of the default formatting, which (for
extension tags not handled in core) will usually assume the tag value
is numeric.
Bug: T266677
Change-Id: I184a7976f2e63f2e70a87257d7749af688659c9d
This includes fixing some mistakes, as well as removing
redundant text that doesn't add new information, either because
it literally repeats what the code already says, or is actually
duplicated.
Change-Id: I3a8dd8ce57192deda8916cc444c87d7ab1a36515
The method expects an array, but was given a string.
Since there's only one caller, the caller is fixed
and the method typehinted.
Also fix the doc comment.
Bug: T257497
Change-Id: I67c337c4ee95ca30d968b89251dbbe077d2110e3
A terminating line break has not been required in wfDebug() since 2014,
however no migration was done. Some of these line breaks found their way
into LoggerInterface::debug() calls, where they mess up the formatting
of the debug log.
So, remove terminating line breaks from wfDebug() and
LoggerInterface::debug() calls.
Also:
* Fix the stripping of leading line breaks from the log header emitted
by Setup.php. This feature, accidentally broken in 2014, allows
requests to be distinguished in the log file.
* Avoid using the global variable $self.
* Move the logging of the client IP back to Setup.php. It was moved to
WebRequest in the hopes that it would not always be needed, however
$wgRequest->getIP() is now called unconditionally a few lines up in
Setup.php. This means that it is put in its proper place after the
"start request" message.
* Wrap the log header code in a closure so that variables like $name do
not leak into global scope.
* In Linker.php, remove a few instances of an unnecessary second
parameter to wfDebug().
Change-Id: I96651d3044a95b9d210b51cb8368edc76bebbb9e
Migrate all callers of Hooks::run() to use the new
HookContainer/HookRunner system.
General principles:
* Use DI if it is already used. We're not changing the way state is
managed in this patch.
* HookContainer is always injected, not HookRunner. HookContainer
is a service, it's a more generic interface, it is the only
thing that provides isRegistered() which is needed in some cases,
and a HookRunner can be efficiently constructed from it
(confirmed by benchmark). Because HookContainer is needed
for object construction, it is also needed by all factories.
* "Ask your friendly local base class". Big hierarchies like
SpecialPage and ApiBase have getHookContainer() and getHookRunner()
methods in the base class, and classes that extend that base class
are not expected to know or care where the base class gets its
HookContainer from.
* ProtectedHookAccessorTrait provides protected getHookContainer() and
getHookRunner() methods, getting them from the global service
container. The point of this is to ease migration to DI by ensuring
that call sites ask their local friendly base class rather than
getting a HookRunner from the service container directly.
* Private $this->hookRunner. In some smaller classes where accessor
methods did not seem warranted, there is a private HookRunner property
which is accessed directly. Very rarely (two cases), there is a
protected property, for consistency with code that conventionally
assumes protected=private, but in cases where the class might actually
be overridden, a protected accessor is preferred over a protected
property.
* The last resort: Hooks::runner(). Mostly for static, file-scope and
global code. In a few cases it was used for objects with broken
construction schemes, out of horror or laziness.
Constructors with new required arguments:
* AuthManager
* BadFileLookup
* BlockManager
* ClassicInterwikiLookup
* ContentHandlerFactory
* ContentSecurityPolicy
* DefaultOptionsManager
* DerivedPageDataUpdater
* FullSearchResultWidget
* HtmlCacheUpdater
* LanguageFactory
* LanguageNameUtils
* LinkRenderer
* LinkRendererFactory
* LocalisationCache
* MagicWordFactory
* MessageCache
* NamespaceInfo
* PageEditStash
* PageHandlerFactory
* PageUpdater
* ParserFactory
* PermissionManager
* RevisionStore
* RevisionStoreFactory
* SearchEngineConfig
* SearchEngineFactory
* SearchFormWidget
* SearchNearMatcher
* SessionBackend
* SpecialPageFactory
* UserNameUtils
* UserOptionsManager
* WatchedItemQueryService
* WatchedItemStore
Constructors with new optional arguments:
* DefaultPreferencesFactory
* Language
* LinkHolderArray
* MovePage
* Parser
* ParserCache
* PasswordReset
* Router
setHookContainer() now required after construction:
* AuthenticationProvider
* ResourceLoaderModule
* SearchEngine
Change-Id: Id442b0dbe43aba84bd5cf801d86dedc768b082c7
Scalar casts are still allowed (for now), because there's a huge amount
of false positives. Ditto for invalid array offsets.
Thoughts about the rest: luckily, many false positives with array offsets
have gone. Moreover, since *Internal issues are suppressed in the base
config, we can remove inline suppressions.
Unfortunately, there are a couple of new issues about array additions
with only false positives, because apparently they don't take
branches into account.
Change-Id: I5a3913c6e762f77bfdae55051a395fae95d1f841
The $coord value is a value extracted from the EXIF section of an
image file. We expect it to be a float, but there is no guarantee this
is the case. It could, for example, be an empty string.
I suggest this trivial fix. It does have the following effects:
* Instead of logging a PHP notice when floor() hits something that is
not a number, I try to log something that's more useful for later,
more in-depth debugging. Note this log call isn't necessarily meant
to stay, but to find an even better fix for this issue.
* I return the string as it is. If it's "foo", the user will see "foo"
instead of "0° 0′ 0″ N", which wasn't helpful.
Also note how wrong and misleading the PHPDoc block for this function
was.
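A rough sketch of the behaviour described above, assuming a scalar
input. formatCoordSketch() is a hypothetical stand-in, not the real
method, and error_log() stands in for the actual logging call.

```php
<?php
/**
 * Hypothetical sketch of the guard, not the real method.
 * @param mixed $coord Scalar value read from EXIF, ideally a float
 * @param string $type 'latitude' or 'longitude'
 * @return string
 */
function formatCoordSketch( $coord, string $type ): string {
	if ( !is_numeric( $coord ) ) {
		// Log something useful for later debugging, then hand the
		// value back unchanged instead of a misleading 0° 0′ 0″.
		error_log( "formatCoordSketch: \"$coord\" is not a number" );
		return (string)$coord;
	}
	$coord = (float)$coord;
	if ( $type === 'latitude' ) {
		$ref = $coord < 0 ? 'S' : 'N';
	} else {
		$ref = $coord < 0 ? 'W' : 'E';
	}
	// Split the absolute value into degrees, minutes and seconds.
	$abs = abs( $coord );
	$deg = (int)floor( $abs );
	$minFloat = ( $abs - $deg ) * 60;
	$min = (int)floor( $minFloat );
	$sec = round( ( $minFloat - $min ) * 60, 2 );
	return "{$deg}° {$min}′ {$sec}″ {$ref}";
}

echo formatCoordSketch( 'foo', 'latitude' ), "\n";
```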
Bug: T226751
Change-Id: I1ca98728de4113ee1ae4362bd3e62b425d589388
CustomRendered values 2-8 are used by Apple to indicate the processing
mode used, such as HDR, Portrait and Panorama.
Bug: T231385
Change-Id: I767a81a8bebdf25c230b104d35236a4b38cbe4ed
The nice thing about explode() is that the resulting array is
guaranteed to contain at least one element. The array can not be
empty.
In some of these cases it might be possible to use strstr() instead,
but that returns false (which casts to an empty string) when the
needle character is not found. explode() returns the original string
in this case.
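A short illustration of the difference, using made-up input strings:

```php
<?php
$withSlash = 'image/jpeg';
$withoutSlash = 'jpeg';

// explode() always yields at least one element, so [0] is safe.
$major = explode( '/', $withSlash )[0];
$fallback = explode( '/', $withoutSlash )[0];

// strstr() with $before_needle = true returns the part before the
// needle, but a falsy value (false) when the needle is missing.
$found = strstr( $withSlash, '/', true );
$missing = strstr( $withoutSlash, '/', true );

var_dump( $major, $fallback, $found, $missing );
```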
Change-Id: I6ad1f3273defeaf36e2305fd871eaaf9d3c1e134
This was spotted when running tests on Travis (PHP 7.3 nightly, trusty).
Two expressions inside preg_replace() contained non-escaped "-" inside [],
where this "-" meant an actual "-" character.
The warning is because "-" has special meaning inside [] ("a-z" for range),
and things like [\w-.] are considered "invalid range".
Solution is to escape "-" like this: [\w\-.]
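For illustration, with a made-up input string (the pattern is only an
example of the escaped form, not one of the two patched expressions):

```php
<?php
// Hypothetical input.
$input = 'foo-bar.baz qux!';

// With "-" escaped it is a literal hyphen, so this strips word
// characters, "-" and "." without any "invalid range" warning.
$output = preg_replace( '/[\w\-.]/', '', $input );

var_dump( $output );
```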
Change-Id: I41cc217081f00f54d957b6d8052ee209412f5ff6
The Software EXIF / other metadata field was expected to contain
the software name followed by the version number.
There are occurrences in Wikimedia production logs of errors showing
that's not always the case.
Bug: T178130
Change-Id: I4187a41b5fd8d7b5574ab50523668d8feb11bccc
each() is deprecated in PHP 7.2, so we may as well replace it now.
I note that, contrary to claims at
https://wiki.php.net/rfc/deprecations_php_7_2#each, none of our uses
were trivially replaceable with foreach.
* wfArrayDiff2_cmp() is processing two arrays by value in parallel.
* MagicWordArray::parseMatch() is doing something funky with the data
structure returned by preg_match().
* HashRing was using it like "nextKey()", replaced with calls to key()
and next().
* FormatMetadata and IndexPager were both using it as a shorter way to
get both key() and current() for the first element in the array. I
suppose a foreach(){ break; } would do the same, but that's confusing.
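A sketch of the two replacement patterns mentioned above, using a
made-up array rather than the actual data from those classes:

```php
<?php
$data = [ 'x-default' => 'Sunset', 'de' => 'Sonnenuntergang' ];

// Old form (each() was removed in PHP 8.0):
//   list( $key, $value ) = each( $data );
// Replacement for "key and value of the first element":
reset( $data );
$key = key( $data );
$value = current( $data );

// "nextKey()"-style use: read the current key, then advance.
$first = key( $data );
next( $data );
$second = key( $data );

var_dump( $key, $value, $first, $second );
```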
Bug: T174354
Change-Id: I36169a04c764fdf1bfd6603395111c6fe0aae5eb