All the other ways of doing it were ridiculous and much harder to read,
and usually required repeating the needle expression (to get its
length). I found these occurrences by grepping for various expressions,
but I undoubtedly missed some.
I didn't try replacing the many instances of strpos(...) === 0 with
str_starts_with(...), because I think they're readable enough as-is
(although less efficient). Likewise I didn't try porting strpos(...) !==
false to str_contains(...). For case-insensitive comparisons, Tim
Starling requested that we stick with substr_compare() because it's more
efficient than calling strtolower().
On PHP < 8 these functions will be included with a polyfill via
vendor/autoload.php. This is included at the beginning of
includes/AutoLoader.php, so if our autoloader has been included the
polyfill will be available. This means it should be safe to call these
functions from any code that would not be usable without our autoloader.
Three uses that Tim Starling identified as being performance-sensitive
have been split out to a separate commit for porting after the switch to
PHP 8.
Change-Id: I113a8d052b6845852c15969a2f0e6fbbe3e9f8d9
Called on every request. Profiling attributes this to the no-op
for the wfDebug() call in Setup.php, but in practice there are later
calls as well.
Use str_contains() while at it. This isn't where the time is spent
and isn't native until PHP 8, but might as well start using it.
Bug: T85805
Change-Id: Ia7117e6d71327edd396b0cc0e59d90af158d381e
Added support for an easy to configure multi-tenant ("wiki farm") mode:
Settings for each site can be placed in a directory specified by
$wgWikiFarmSettingsDirectory. Site detection is controlled by
$wgWikiFarmSiteDetector and defaults to the requested host name.
Instructions for manual testing: https://etherpad.wikimedia.org/p/T221535
Bug: T221535
Change-Id: I7581921b7d99ba1fe7e25523fde691d76b67a99c
Make phan stricter about conditional variable declaration
Remaining false positive issues are suppressed.
The suppression and the setting change can only be done together
Bug: T259172
Change-Id: I1f200ac37df7448453688bf464a8250c97313e5d
Make phan stricter about null types by setting null_casts_as_any_type to
false (the default in mediawiki-phan-config)
Remaining false positive issues are suppressed.
The suppression and the setting change can only be done together
Bug: T242536
Bug: T301991
Change-Id: I0f295382b96fb3be8037a01c10487d9d591e7e01
Make phan stricter about scalar types by setting scalar_implicit_cast to
false (the default in mediawiki-phan-config)
Bug: T242536
Bug: T301991
Change-Id: Ia2fe30b17804186571722e728578121c8b75d455
Prior art in I8ce1fa31f, Id37619cead, I9db7b3f36f, If219463d80a.
Start to nudge future development in a better direction by starting
to discourage getVal() since it is an odd hybrid after all, that
for the most part could probably be replaced by getRawVal, except
that there is a handful of really critical getVal() calls for
"title" in many places that are difficult to track down, so for now
we're probably stuck with this.
Change-Id: I18767cd809f67bde6784bda5e3c85974ea2e7205
This is as useless on load.php as it already was on api.php, and for
things like thumb.php, rest.php, and opensearch.php it's probably not
helping, either.
Change-Id: I86745ad244ae9d969d4d33bfd5e37f1e7eea39f6
And use it in core
Avoid direct use of super global $_FILES
This can breaks all FauxRequest relaying on $_FILES
in tests or production code via FauxRequest::getUpload.
Falls back to $_FILES for the moment
Bug: T48163
Change-Id: I7392acc9bb682ec6b7025dbed0734c142f45c91a
This was made @internal in MW 1.35 (507501d6ee; If2b82759f3).
Callers outside core never existed as far everything that Codesearch
is aware of.
Time to make this no longer public.
Change-Id: I4b638ef33cea2a987d1a44af0ca64ed04de1df67
I don't actually think this will fix T273256, but it does help rule
out the 99.9% of possible causes that I don't currently suspect.
This is a static method meant to read global state. Rather than
accessing it through a chain of twenty global singletons and one
hundred method calls, just read the string already.
Bug: T273256
Change-Id: Ia542ac7aab7bdcbd15bdba5023be9efa9fca5c83
For example, documenting the method getUser() with "get the User
object" does not add any information that's not already there.
But I have to read the text first to understand that it doesn't
document anything that's not already obvious from the code.
Some of this is from a time when we had a PHPCS sniff that was
complaining when a line like `@param User $user` doesn't end
with some descriptive text. Some users started adding text like
`@param User $user The User` back then. Let's please remove
this.
Change-Id: I0ea8d051bc732466c73940de9259f87ffb86ce7a
This is a follow-up for Ifcf0bc4.
I checked the callers of this method. The majority of them does not
care about these numeric values, only about the language codes. Most
callers even do an array_keys() immediatelly. The only effect I could
find will be visible in the ApiQueryUserInfo API module. Example:
https://en.wikipedia.org/wiki/Special:ApiSandbox#action=query&meta=userinfo&formatversion=2&uiprop=acceptlang
In this response, the numbers are currently reported as strings, but
will be reported as floats. I would argue this counts as a bugfix and
doesn't need anouncement.
This patch also splits the responsible regular expression pattern for
improved readability, utilizing the /x extended mode. No change is
made to the pattern.
Change-Id: I42de46f56d7544f9781b02a2f0ed49ef85ee44a1
According to php.net docs, the warnings this used to emit have
been removed in PHP 5.3.
<https://www.php.net/manual/en/function.parse-url.php>
Thanks Ammarpad for showing this during development of
Ib829afc7b33419b0.
Change-Id: Ia078049c34f5fc6b1534f0ea0551ecc2c1fc3388
* Add $wgCookieSameSite, which controls the SameSite attribute for login
cookies. This will need to be set to "None" on WMF and other wikis
with a CentralAuth installation spanning multiple registrable domains.
* Add $wgUseSameSiteLegacyCookies, which causes a "legacy" cookie to be
sent without a SameSite attribute whenever a SameSite=None cookie is
sent. I used the prefix "ss0" since it's like SameSite version 0, and
that's shorter than "legacy". It's a prefix instead of a suffix to
avoid the need to update the VCL config which identifies cookie types
by their name suffix.
* Simplify WebRequest::getCookie() removing the unnecessary unicode
normalization. This was added by analogy with GET/POST, I don't
believe it was ever necessary for cookies.
* Add WebRequest::getCrossSiteCookie(), which implements the read side
of the legacy SameSite cookie support.
* Fix Doxygen formatting of the parameter list in
WebResponse::setCookie().
* To work around the lack of SameSite cookie support in PHP 7.2, emulate
setcookie() with header() where necessary.
Bug: T252236
Change-Id: I141ea114fea007a72a4f24bfc34dd81100854d68
A terminating line break has not been required in wfDebug() since 2014,
however no migration was done. Some of these line breaks found their way
into LoggerInterface::debug() calls, where they mess up the formatting
of the debug log.
So, remove terminating line breaks from wfDebug() and
LoggerInterface::debug() calls.
Also:
* Fix the stripping of leading line breaks from the log header emitted
by Setup.php. This feature, accidentally broken in 2014, allows
requests to be distinguished in the log file.
* Avoid using the global variable $self.
* Move the logging of the client IP back to Setup.php. It was moved to
WebRequest in the hopes that it would not always be needed, however
$wgRequest->getIP() is now called unconditionally a few lines up in
Setup.php. This means that it is put in its proper place after the
"start request" message.
* Wrap the log header code in a closure so that variables like $name do
not leak into global scope.
* In Linker.php, remove a few instances of an unnecessary second
parameter to wfDebug().
Change-Id: I96651d3044a95b9d210b51cb8368edc76bebbb9e
Migrate all callers of Hooks::run() to use the new
HookContainer/HookRunner system.
General principles:
* Use DI if it is already used. We're not changing the way state is
managed in this patch.
* HookContainer is always injected, not HookRunner. HookContainer
is a service, it's a more generic interface, it is the only
thing that provides isRegistered() which is needed in some cases,
and a HookRunner can be efficiently constructed from it
(confirmed by benchmark). Because HookContainer is needed
for object construction, it is also needed by all factories.
* "Ask your friendly local base class". Big hierarchies like
SpecialPage and ApiBase have getHookContainer() and getHookRunner()
methods in the base class, and classes that extend that base class
are not expected to know or care where the base class gets its
HookContainer from.
* ProtectedHookAccessorTrait provides protected getHookContainer() and
getHookRunner() methods, getting them from the global service
container. The point of this is to ease migration to DI by ensuring
that call sites ask their local friendly base class rather than
getting a HookRunner from the service container directly.
* Private $this->hookRunner. In some smaller classes where accessor
methods did not seem warranted, there is a private HookRunner property
which is accessed directly. Very rarely (two cases), there is a
protected property, for consistency with code that conventionally
assumes protected=private, but in cases where the class might actually
be overridden, a protected accessor is preferred over a protected
property.
* The last resort: Hooks::runner(). Mostly for static, file-scope and
global code. In a few cases it was used for objects with broken
construction schemes, out of horror or laziness.
Constructors with new required arguments:
* AuthManager
* BadFileLookup
* BlockManager
* ClassicInterwikiLookup
* ContentHandlerFactory
* ContentSecurityPolicy
* DefaultOptionsManager
* DerivedPageDataUpdater
* FullSearchResultWidget
* HtmlCacheUpdater
* LanguageFactory
* LanguageNameUtils
* LinkRenderer
* LinkRendererFactory
* LocalisationCache
* MagicWordFactory
* MessageCache
* NamespaceInfo
* PageEditStash
* PageHandlerFactory
* PageUpdater
* ParserFactory
* PermissionManager
* RevisionStore
* RevisionStoreFactory
* SearchEngineConfig
* SearchEngineFactory
* SearchFormWidget
* SearchNearMatcher
* SessionBackend
* SpecialPageFactory
* UserNameUtils
* UserOptionsManager
* WatchedItemQueryService
* WatchedItemStore
Constructors with new optional arguments:
* DefaultPreferencesFactory
* Language
* LinkHolderArray
* MovePage
* Parser
* ParserCache
* PasswordReset
* Router
setHookContainer() now required after construction:
* AuthenticationProvider
* ResourceLoaderModule
* SearchEngine
Change-Id: Id442b0dbe43aba84bd5cf801d86dedc768b082c7
apache_request_headers() is a vendor-specific function - it got used
when present and alternative code paths were exercised otherwise.
These preserved certain "special" headers, e.g. Content-Type, only
inconsistently.
The function getallheaders() is an alias[1] for apache_request_headers()
on systems where the latter is present. Alternatively, there is a
polyfill (ralouphie/getallheaders) which is already installed in
mediawiki-vendor[2] (by virtue of guzzle).
Using getallheaders() exclusively, will make sure these "special"
headers are consistently available alongside their "regular"[3] peers
and helps MediaWiki code focus on its domain.
The dependency to ralouphie/getallheaders is made explicit in the same
version in which it is currently locked in mediawiki-vendor[4].
This surfaced because the deprecation warning for API POST requests
without a Content-Type header, introduced in bba1a0f, appeared in my
development system (somewhat dated addshore/mediawiki-docker-dev/) even
though the client did a fine job.
Interesting implementation detail: While WebRequest keeps track of
headers using keys in all upper case, REST RequestFromGlobals does so in
all lower case - but both use retrieval logic complementary to their
respective approach however. In case of REST RequestFromGlobals this is
encapsulated inside of HeaderContainer (setting and retrieving), while
WebRequest does all of this by itself. Cf. [5] and [6]
[1]: https://www.php.net/manual/en/function.getallheaders.php
[2]: https://github.com/wikimedia/mediawiki-vendor/tree/8f2967d/ralouphie/getallheaders
[3]: https://www.php.net/manual/en/reserved.variables.server.php#110763
[4]: https://github.com/wikimedia/mediawiki-vendor/blob/8f2967d/composer.lock#L3250
[5]: https://www.w3.org/Protocols/rfc2616/rfc2616-sec4.html#sec4.2
[6]: https://www.php.net/manual/en/function.apache-request-headers.php#124236
Bug: T245535
Change-Id: Iba52f152e15928473b729a2588c2462e76e85634
It has become apparent that $_SERVER['SCRIPT_NAME'] may contain the same
thing as REQUEST_URI, for example in WMF production. PATH_INFO is not
set, so there is no way to split the URL into SCRIPT_NAME and PATH_INFO
components apart from configuration.
* Revert the fix for T34486, which added a route for SCRIPT_NAME to the
PathRouter for the benefit of img_auth.php. In T235357, the route thus
added contained $1, breaking everything.
* Remove calls to WebRequest::getPathInfo() from everywhere other than
index.php. Dynamic modification of $wgArticlePath in order to make
PathRouter work was weird and broken anyway. All that is really needed
is a suffix of REQUEST_URI, so I added a function which provides that.
* Add $wgImgAuthPath, for use as a last resort workaround for T34486.
* Avoid the use of $_SERVER['SCRIPT_NAME'] to detect the currently
running script.
* Deprecated wfGetScriptUrl(), a fairly simple wrapper for SCRIPT_NAME.
Apparently no callers in core or extensions.
Bug: T235357
Change-Id: If2b82759f3f4aecec79d6e2d88cd4330927fdeca
* Deprecate WebRequest::checkUrlExtension() and have it always return
true. This reverts the security fixes made for T30235.
* Remove IEUrlExtension. This is a helper for checkUrlExtension() which
is not used in any extensions.
* Remove CSS sanitization code which is specific to IE6. This reverts
the changes made to fix T57332, and related followups. I confirmed
that the relevant test cases do not result in XSS on IE8.
* Remove related tests.
Bug: T232563
Change-Id: I7318ea4a63210252ebc64968691d4f62d79a63e9
The variable is also read in a few other places, such as to
export the value from api.php (siteinfo) and load.php (mw.config)
but those requests don't need to be held back by this extra
logic.
Alternatively, if we really want to require this for all consumption,
we should probably let PathRouter provide the value and require
consumers to use it. E.g. services->getPathRouter->getArticlePath,
or something like that.
As easy first step, I'm moving it to PathRouter, called from
WebRequest::getPathInfo which is still called on all index.php
requests for any wiki page action in any namespace (incl Special)
when the wiki uses anything other than the default 'index.php?title='
article path.
Test Plan:
* Set '$wgArticlePath = 'bla';`
* View /mediawiki/index.php/Main_Page, and observe the fatal
error message (same as before this change).
Bug: T189966
Change-Id: Id06c2557e2addb58faeef0b6f7767a65b8de55a5
These were discovered by setting `null_casts_as_any_type` to true in
phan, and filtering by `PhanTypeMismatchReturnNullable`. Of course there
are others, some of which are false positives, but we cannot suppress
them now (or the UnusedSuppressionPlugin will complain).
Change-Id: Ia8443e575c22f47a6d8c63038f4e7ac36815fc27
I doubt there was ever a good reason for mangling $_GET to add the
title, this was just b/c for the sake of b/c. It was formerly used in
core but that was so long ago that I doubt there was any usage in
extensions at the time. Now there is one usage of $_GET['title'] in an
unmaintained extension, but it was only added in 2017.
Also I added WebRequest::getQueryValuesOnly() which is an interface to
the unmodified $_GET. The motivation is allowing OAuth to work with the
REST API, since OAuth needs an unmangled view of $_GET for signature
generation. The Action API gets around the problem with a special hack
in interpolateTitle(), disabling it for the Action API only.
A review of callers of getQueryValues() suggests that many would
benefit from using getQueryValuesOnly() instead. But I only changed it for
callers in api.php and thumb.php since the effect of the change there is
certainly beneficial, whereas callers under index.php may possibly be using
the path parameters to construct self-links.
Rest\RequestFromGlobals uses $_GET directly, which means that this
change causes it to not return PathRouter matches as GET parameters
anymore.
Change-Id: Ic469577fae17c0b1ac69466df7bc9f03e61c74e3
This is only relevant when processing page views or when constructing
Title urls with an 'action' query. Pretty important stuff, and worth
optimising for if we had to choose, but we can defer it in this case
without slowing it down, which is better for everything else.
It also means we don't mutate configuration (beyond setting whole values
as dynamic defaults), which seems desirable, and makes the overall behaviour
easier to test. Handling absence of 'view' should be PathRouter's
responsibility, not Setup.
Bug: T189966
Change-Id: I9c1eea2dcea74be0e283eb2b175268315ced1793
Called for all PHP web requests from WebStart.php via
WebRequest::interpolateTitle.
* Use isset() instead of empty() where we only need to check
that the key is supported.
* Don't import global `$wgUsePathInfo` in the common case.
* Migrate from deprecated wrapper Wikimedia\suppressWarnings
to AtEase::suppressWarnings.
* Use strpos() instead of preg_match(). Consistently faster,
albeit not by much (for 100 iterations: 0.04ms vs 0.23ms).
* Don't create unused $matches array for the common case.
Bug: T189966
Change-Id: I0de126953c25f3629cb85a0d4e46598baf261c15