This is a follow-up for Ifcf0bc4.
I checked the callers of this method. The majority of them does not
care about these numeric values, only about the language codes. Most
callers even do an array_keys() immediatelly. The only effect I could
find will be visible in the ApiQueryUserInfo API module. Example:
https://en.wikipedia.org/wiki/Special:ApiSandbox#action=query&meta=userinfo&formatversion=2&uiprop=acceptlang
In this response, the numbers are currently reported as strings, but
will be reported as floats. I would argue this counts as a bugfix and
doesn't need anouncement.
This patch also splits the responsible regular expression pattern for
improved readability, utilizing the /x extended mode. No change is
made to the pattern.
Change-Id: I42de46f56d7544f9781b02a2f0ed49ef85ee44a1
According to php.net docs, the warnings this used to emit have
been removed in PHP 5.3.
<https://www.php.net/manual/en/function.parse-url.php>
Thanks Ammarpad for showing this during development of
Ib829afc7b33419b0.
Change-Id: Ia078049c34f5fc6b1534f0ea0551ecc2c1fc3388
* Add $wgCookieSameSite, which controls the SameSite attribute for login
cookies. This will need to be set to "None" on WMF and other wikis
with a CentralAuth installation spanning multiple registrable domains.
* Add $wgUseSameSiteLegacyCookies, which causes a "legacy" cookie to be
sent without a SameSite attribute whenever a SameSite=None cookie is
sent. I used the prefix "ss0" since it's like SameSite version 0, and
that's shorter than "legacy". It's a prefix instead of a suffix to
avoid the need to update the VCL config which identifies cookie types
by their name suffix.
* Simplify WebRequest::getCookie() removing the unnecessary unicode
normalization. This was added by analogy with GET/POST, I don't
believe it was ever necessary for cookies.
* Add WebRequest::getCrossSiteCookie(), which implements the read side
of the legacy SameSite cookie support.
* Fix Doxygen formatting of the parameter list in
WebResponse::setCookie().
* To work around the lack of SameSite cookie support in PHP 7.2, emulate
setcookie() with header() where necessary.
Bug: T252236
Change-Id: I141ea114fea007a72a4f24bfc34dd81100854d68
A terminating line break has not been required in wfDebug() since 2014,
however no migration was done. Some of these line breaks found their way
into LoggerInterface::debug() calls, where they mess up the formatting
of the debug log.
So, remove terminating line breaks from wfDebug() and
LoggerInterface::debug() calls.
Also:
* Fix the stripping of leading line breaks from the log header emitted
by Setup.php. This feature, accidentally broken in 2014, allows
requests to be distinguished in the log file.
* Avoid using the global variable $self.
* Move the logging of the client IP back to Setup.php. It was moved to
WebRequest in the hopes that it would not always be needed, however
$wgRequest->getIP() is now called unconditionally a few lines up in
Setup.php. This means that it is put in its proper place after the
"start request" message.
* Wrap the log header code in a closure so that variables like $name do
not leak into global scope.
* In Linker.php, remove a few instances of an unnecessary second
parameter to wfDebug().
Change-Id: I96651d3044a95b9d210b51cb8368edc76bebbb9e
Migrate all callers of Hooks::run() to use the new
HookContainer/HookRunner system.
General principles:
* Use DI if it is already used. We're not changing the way state is
managed in this patch.
* HookContainer is always injected, not HookRunner. HookContainer
is a service, it's a more generic interface, it is the only
thing that provides isRegistered() which is needed in some cases,
and a HookRunner can be efficiently constructed from it
(confirmed by benchmark). Because HookContainer is needed
for object construction, it is also needed by all factories.
* "Ask your friendly local base class". Big hierarchies like
SpecialPage and ApiBase have getHookContainer() and getHookRunner()
methods in the base class, and classes that extend that base class
are not expected to know or care where the base class gets its
HookContainer from.
* ProtectedHookAccessorTrait provides protected getHookContainer() and
getHookRunner() methods, getting them from the global service
container. The point of this is to ease migration to DI by ensuring
that call sites ask their local friendly base class rather than
getting a HookRunner from the service container directly.
* Private $this->hookRunner. In some smaller classes where accessor
methods did not seem warranted, there is a private HookRunner property
which is accessed directly. Very rarely (two cases), there is a
protected property, for consistency with code that conventionally
assumes protected=private, but in cases where the class might actually
be overridden, a protected accessor is preferred over a protected
property.
* The last resort: Hooks::runner(). Mostly for static, file-scope and
global code. In a few cases it was used for objects with broken
construction schemes, out of horror or laziness.
Constructors with new required arguments:
* AuthManager
* BadFileLookup
* BlockManager
* ClassicInterwikiLookup
* ContentHandlerFactory
* ContentSecurityPolicy
* DefaultOptionsManager
* DerivedPageDataUpdater
* FullSearchResultWidget
* HtmlCacheUpdater
* LanguageFactory
* LanguageNameUtils
* LinkRenderer
* LinkRendererFactory
* LocalisationCache
* MagicWordFactory
* MessageCache
* NamespaceInfo
* PageEditStash
* PageHandlerFactory
* PageUpdater
* ParserFactory
* PermissionManager
* RevisionStore
* RevisionStoreFactory
* SearchEngineConfig
* SearchEngineFactory
* SearchFormWidget
* SearchNearMatcher
* SessionBackend
* SpecialPageFactory
* UserNameUtils
* UserOptionsManager
* WatchedItemQueryService
* WatchedItemStore
Constructors with new optional arguments:
* DefaultPreferencesFactory
* Language
* LinkHolderArray
* MovePage
* Parser
* ParserCache
* PasswordReset
* Router
setHookContainer() now required after construction:
* AuthenticationProvider
* ResourceLoaderModule
* SearchEngine
Change-Id: Id442b0dbe43aba84bd5cf801d86dedc768b082c7
apache_request_headers() is a vendor-specific function - it got used
when present and alternative code paths were exercised otherwise.
These preserved certain "special" headers, e.g. Content-Type, only
inconsistently.
The function getallheaders() is an alias[1] for apache_request_headers()
on systems where the latter is present. Alternatively, there is a
polyfill (ralouphie/getallheaders) which is already installed in
mediawiki-vendor[2] (by virtue of guzzle).
Using getallheaders() exclusively, will make sure these "special"
headers are consistently available alongside their "regular"[3] peers
and helps MediaWiki code focus on its domain.
The dependency to ralouphie/getallheaders is made explicit in the same
version in which it is currently locked in mediawiki-vendor[4].
This surfaced because the deprecation warning for API POST requests
without a Content-Type header, introduced in bba1a0f, appeared in my
development system (somewhat dated addshore/mediawiki-docker-dev/) even
though the client did a fine job.
Interesting implementation detail: While WebRequest keeps track of
headers using keys in all upper case, REST RequestFromGlobals does so in
all lower case - but both use retrieval logic complementary to their
respective approach however. In case of REST RequestFromGlobals this is
encapsulated inside of HeaderContainer (setting and retrieving), while
WebRequest does all of this by itself. Cf. [5] and [6]
[1]: https://www.php.net/manual/en/function.getallheaders.php
[2]: https://github.com/wikimedia/mediawiki-vendor/tree/8f2967d/ralouphie/getallheaders
[3]: https://www.php.net/manual/en/reserved.variables.server.php#110763
[4]: https://github.com/wikimedia/mediawiki-vendor/blob/8f2967d/composer.lock#L3250
[5]: https://www.w3.org/Protocols/rfc2616/rfc2616-sec4.html#sec4.2
[6]: https://www.php.net/manual/en/function.apache-request-headers.php#124236
Bug: T245535
Change-Id: Iba52f152e15928473b729a2588c2462e76e85634
It has become apparent that $_SERVER['SCRIPT_NAME'] may contain the same
thing as REQUEST_URI, for example in WMF production. PATH_INFO is not
set, so there is no way to split the URL into SCRIPT_NAME and PATH_INFO
components apart from configuration.
* Revert the fix for T34486, which added a route for SCRIPT_NAME to the
PathRouter for the benefit of img_auth.php. In T235357, the route thus
added contained $1, breaking everything.
* Remove calls to WebRequest::getPathInfo() from everywhere other than
index.php. Dynamic modification of $wgArticlePath in order to make
PathRouter work was weird and broken anyway. All that is really needed
is a suffix of REQUEST_URI, so I added a function which provides that.
* Add $wgImgAuthPath, for use as a last resort workaround for T34486.
* Avoid the use of $_SERVER['SCRIPT_NAME'] to detect the currently
running script.
* Deprecated wfGetScriptUrl(), a fairly simple wrapper for SCRIPT_NAME.
Apparently no callers in core or extensions.
Bug: T235357
Change-Id: If2b82759f3f4aecec79d6e2d88cd4330927fdeca
* Deprecate WebRequest::checkUrlExtension() and have it always return
true. This reverts the security fixes made for T30235.
* Remove IEUrlExtension. This is a helper for checkUrlExtension() which
is not used in any extensions.
* Remove CSS sanitization code which is specific to IE6. This reverts
the changes made to fix T57332, and related followups. I confirmed
that the relevant test cases do not result in XSS on IE8.
* Remove related tests.
Bug: T232563
Change-Id: I7318ea4a63210252ebc64968691d4f62d79a63e9
The variable is also read in a few other places, such as to
export the value from api.php (siteinfo) and load.php (mw.config)
but those requests don't need to be held back by this extra
logic.
Alternatively, if we really want to require this for all consumption,
we should probably let PathRouter provide the value and require
consumers to use it. E.g. services->getPathRouter->getArticlePath,
or something like that.
As easy first step, I'm moving it to PathRouter, called from
WebRequest::getPathInfo which is still called on all index.php
requests for any wiki page action in any namespace (incl Special)
when the wiki uses anything other than the default 'index.php?title='
article path.
Test Plan:
* Set '$wgArticlePath = 'bla';`
* View /mediawiki/index.php/Main_Page, and observe the fatal
error message (same as before this change).
Bug: T189966
Change-Id: Id06c2557e2addb58faeef0b6f7767a65b8de55a5
These were discovered by setting `null_casts_as_any_type` to true in
phan, and filtering by `PhanTypeMismatchReturnNullable`. Of course there
are others, some of which are false positives, but we cannot suppress
them now (or the UnusedSuppressionPlugin will complain).
Change-Id: Ia8443e575c22f47a6d8c63038f4e7ac36815fc27
I doubt there was ever a good reason for mangling $_GET to add the
title, this was just b/c for the sake of b/c. It was formerly used in
core but that was so long ago that I doubt there was any usage in
extensions at the time. Now there is one usage of $_GET['title'] in an
unmaintained extension, but it was only added in 2017.
Also I added WebRequest::getQueryValuesOnly() which is an interface to
the unmodified $_GET. The motivation is allowing OAuth to work with the
REST API, since OAuth needs an unmangled view of $_GET for signature
generation. The Action API gets around the problem with a special hack
in interpolateTitle(), disabling it for the Action API only.
A review of callers of getQueryValues() suggests that many would
benefit from using getQueryValuesOnly() instead. But I only changed it for
callers in api.php and thumb.php since the effect of the change there is
certainly beneficial, whereas callers under index.php may possibly be using
the path parameters to construct self-links.
Rest\RequestFromGlobals uses $_GET directly, which means that this
change causes it to not return PathRouter matches as GET parameters
anymore.
Change-Id: Ic469577fae17c0b1ac69466df7bc9f03e61c74e3
This is only relevant when processing page views or when constructing
Title urls with an 'action' query. Pretty important stuff, and worth
optimising for if we had to choose, but we can defer it in this case
without slowing it down, which is better for everything else.
It also means we don't mutate configuration (beyond setting whole values
as dynamic defaults), which seems desirable, and makes the overall behaviour
easier to test. Handling absence of 'view' should be PathRouter's
responsibility, not Setup.
Bug: T189966
Change-Id: I9c1eea2dcea74be0e283eb2b175268315ced1793
Called for all PHP web requests from WebStart.php via
WebRequest::interpolateTitle.
* Use isset() instead of empty() where we only need to check
that the key is supported.
* Don't import global `$wgUsePathInfo` in the common case.
* Migrate from deprecated wrapper Wikimedia\suppressWarnings
to AtEase::suppressWarnings.
* Use strpos() instead of preg_match(). Consistently faster,
albeit not by much (for 100 iterations: 0.04ms vs 0.23ms).
* Don't create unused $matches array for the common case.
Bug: T189966
Change-Id: I0de126953c25f3629cb85a0d4e46598baf261c15
For tracing and logging purposes, we want to be able to see/generate the
list of all of the requests that happen in the environment for a given
external incoming request. To that end, allow Mediawiki to accept the
request ID provided by the incoming request as its own.
Since this may be problematic for set-ups that don't have an entity in
front of MW that sanitises the headers on the way in, introduce a new
global variable, `$wgAllowExternalReqID`, that can disable this
behaviour. By default, the feature is disabled.
Bug: T201409
Change-Id: I605471fb8b5bbc290baeecc7d80d9d715cb240c9
Previously it relied on wfGetServerUrl to decide the url,
which passed PROTO_CURRENT on to wfExpandUrl(), when then uses
global $wgRequest and checks getProtocol() on that.
For typical web requests, these would be the same object by
reference, but the needless indirection makes the code harder
to reason about and harder to test.
Instead, check this locally and pass on an explicit HTTP or HTTPS
to wfGetServerUrl/wfExpandUrl.
Change-Id: I797bd2f909625536c3af36f015fb2e94bf922ba9
* Call $contLang->checkTitleEncoding() only with a string value as
parameter. (A boolean value here caused T218883.)
* MediaWikiServices::getInstance()->getContentLanguage() always return a
Language object. An exist check is not necessary.
Change-Id: Idd3e3a2baa5072e862d7502e30079a1b33d6a866
Added spaces around .
Removed empty return statement which are not required
Removed return after phpunit markTestIncomplete,
which is throwing to exit the test, no need for a return
Change-Id: I2c80b965ee52ba09949e70ea9e7adfc58a1d89ce
If the request URL was not normalized, for example having a double slash
in it, this could cause it to fail to match in the PathRouter. But the
canonicalizing redirect was using the normalized URL, causing a redirect
loop exception.
So:
* If the PathRouter fails to match with the original URL, try matching
against the normalized URL. This allows it to still work for
normalized URLs with a double slash in the title part of the path.
* Have WebRequest::getFullRequestURL() always return the URL without
removing dot segments or interpreting double slashes. Just append
the path to the server.
* Make MediaWikiTest.php use WebRequest instead of FauxRequest, allowing
it to reproduce the exception in question. Add relevant test.
* Add tests for the new PathRouter behaviour.
Bug: T100782
Change-Id: Ic0f3a0060904abc364f75dae920480b81175d52f
Find: /isset\(\s*([^()]+?)\s*\)\s*\?\s*\1\s*:\s*/
Replace with: '\1 ?? '
(Everywhere except includes/PHPVersionCheck.php)
(Then, manually fix some line length and indentation issues)
Then manually reviewed the replacements for cases where confusing
operator precedence would result in incorrect results
(fixing those in I478db046a1cc162c6767003ce45c9b56270f3372).
Change-Id: I33b421c8cb11cdd4ce896488c9ff5313f03a38cf
Disable it in specific files and places where there are legitimate uses
to access $_GET and $_POST directly.
For EditPage, which wants to output $_POST for debugging information,
introduce WebRequest::getPostValues() as a wrapper, matching the
existing ::getQueryValues().
Change-Id: I2cb0a7012fb7ed29dcd720056b42f56508ddc5fa
* Remove left-over mention of the .php5 entry points in docs.
* Remove dead logic in NoLocalSettings for php5 entry points.
* Remove dead match in WebRequest for php5 entry points (they'd
redirect since 1.25, and not seen by PHP).
Change-Id: Ia0ee8588591860b8fe34030c8503f38e9bce31f3
One bug fixed: a timestamp of '00' or similar would get interpreted as
'now' by mistake instead of Unix timestamp 0, without throwing the
warning for using 0 instead of 'now'. This is because it called
wfTimestamp() once to parse the input date, got a Unix timestamp of 0
back, and then tried passing that 0 back to wfTimestamp again to
reformat as a wiki date, but it got reinterpreted as 'now'.
Also fixed parameters with type "user" to validate usernames more
correctly. This might be risky, though, if I missed any valid
usernames, or if API clients were for some reason relying on passing in
invalid usernames. If we don't actually want to do this, we should add
a comment explaining why we're allowing any title without a fragment
rather than validating properly.
Still lots more work to do here.
Change-Id: I56b4290263df8698efdbddda71a7eabd9e303abc
Use human-readable stack trace instead of array dump,
try to display the URL and the request ID, use the same
message format as exceptions,
Bug: T189851
Change-Id: I3edf2dbd5639ceecc668719c065ecdce33157ff5
* Use $_SERVER['REQUEST_TIME_FLOAT'] unconditionally in WebRequest.php
and libs/Timing.php. WebStart.php was doing this already without issue.
The key existst since PHP 5.4, for both Web and CLI (we require 5.5).
* In wfDebug() and wfReportTime(), use $_SERVER['REQUEST_TIME_FLOAT'] instead.
* In ApiFormatBase and MWDebug, use WebRequest::getElapsedTime() instead.
* In Maintenance.php, remove setting of $wgRequestTime.
* In rebuildFileCache.php, update mocking to $_SERVER['REQUEST_TIME_FLOAT']
so that we avoid re-introducing bug T24852.
Change-Id: I1b647da2862f815029caa533b592ec8a05b33806
These comments do not add anything. I argue they are worse than having
no comments, because I have to read them first to understand they
actually don't explain anything. Removing them makes room for actual
improvements in the future (if needed).
Change-Id: Iee70aad681b3385e9af282d5581c10addbb91ac4
It's unreasonable to expect newbies to know that "bug 12345" means "Task T14345"
except where it doesn't, so let's just standardise on the real numbers.
Change-Id: I6f59febaf8fc96e80f8cfc11f4356283f461142a