CVE-2025-32699
Ensure that Unicode NFC normalization can be applied to our HTML
output safely. Even though the W3C officially recommends against
normalizing HTML
(https://www.w3.org/International/questions/qa-html-css-normalization#converting),
this is still easily done inadvertently, especially when using the
MediaWiki action API, which normalizes parameters and results by
default.
See also I671648603c4635a35585c860b4857f5ea085e47f in Parsoid, and
T266140 / I2e78e660ba1867744e34eda7d00ea527ec016b71 for another similar
issue.
The following changes are made:
* The various HTML serializers (Remex/Tidy-derived, as well as the
Html::* helpers) are tweaked to entity-escape U+0338 wherever it
appears.
* Similarly, Message::escaped() is tweaked to entity-escape U+0338.
* Finally, a post-processing pass is added to the OutputTransform
pipeline to catch any remaining U+0338 and entity-escape them.
This catches U+0338 added during any of the previous OutputTransform
stages (like TOC insertion, section edit links, etc).
*When backporting* this code will likely need to be moved to
ParserOutput::getText(), as the OutputTransform pipeline wasn't added
until MW 1.42.
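For illustration only, a minimal Python sketch (not MediaWiki code;
escape_u0338 is a hypothetical helper) of why U+0338 needs escaping:
under NFC, ">" followed by U+0338 composes into U+226F, so the ">" that
closes a tag disappears unless the combining character is emitted as an
entity first.

```python
import unicodedata

def escape_u0338(html: str) -> str:
    """Replace U+0338 COMBINING LONG SOLIDUS OVERLAY with a character reference."""
    return html.replace("\u0338", "&#x338;")

# A tag immediately followed by U+0338.
raw = "<b>\u0338x"

# Unescaped: NFC merges ">" + U+0338 into U+226F, destroying the tag close.
broken = unicodedata.normalize("NFC", raw)

# Escaped: the output is pure ASCII here, so NFC leaves it untouched.
safe = unicodedata.normalize("NFC", escape_u0338(raw))
```

The same composition applies to "<" (U+226E) and "=" (U+2260), which is
why the escaping has to be unconditional rather than context-dependent.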
Bug: T387130
Change-Id: I66564e14e730f5393f4fa5780b80f24de6075af5
MWCryptHKDF was added ten years ago (in af66c04d39), and as far as
I can tell, it was never used anywhere. It seems unlikely that CryptHKDF
will be used in the future, at least in its current form, for several
reasons:
* PHP 7.1.2+ has hash_hkdf(), so HKDF() would not be needed.
* At the time MWCryptHKDF was created, access to a CSPRNG was dependent
on server configuration: operating system, enabled PHP extensions,
open_basedir, etc. The "clock drift" RNG used as a last resort was not
considered to be secure or fast enough for generating large amounts of
output.[1] random_bytes(), added in PHP 7, changed the situation.
* Depleting the input pool of Linux's RNG is no longer a concern; there
is no more blocking output pool for /dev/random.[2][3] In 2022, this
change and others, including some that improved performance,[4] were
backported to stable kernels as old as 4.9.[5]
* $wgAuthenticationTokenVersion obviated the primary use case of
quickly resetting the user_token field for all users, assuming all
the existing tokens are unique.
* CryptHKDF seems to perform much slower than random_bytes(), at least
on Linux, making it pointless to use given that the other reasons for
its existence no longer apply.
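For readers unfamiliar with the construction, here is a rough Python
sketch of HKDF as specified in RFC 5869, next to a plain CSPRNG call.
It is not a port of MWCryptHKDF; hkdf_sha256 and the "user_token" info
string are names made up for this example.

```python
import hashlib
import hmac
import os

def hkdf_sha256(ikm: bytes, length: int, salt: bytes = b"", info: bytes = b"") -> bytes:
    """HKDF (RFC 5869) with HMAC-SHA256: extract, then expand."""
    hash_len = hashlib.sha256().digest_size
    if length > 255 * hash_len:
        raise ValueError("requested output is too long")
    # Extract: PRK = HMAC(salt, IKM); an empty salt means hash_len zero bytes.
    prk = hmac.new(salt or b"\x00" * hash_len, ikm, hashlib.sha256).digest()
    # Expand: T(i) = HMAC(PRK, T(i-1) | info | i), concatenated and truncated.
    okm, block, counter = b"", b"", 1
    while len(okm) < length:
        block = hmac.new(prk, block + info + bytes([counter]), hashlib.sha256).digest()
        okm += block
        counter += 1
    return okm[:length]

# Deterministic derivation from existing key material...
derived = hkdf_sha256(b"secret key material", 32, info=b"user_token")
# ...versus asking the operating system's CSPRNG directly.
token = os.urandom(32)
```

With a reliable CSPRNG available everywhere, the second call covers the
token-generation use case on its own.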
[1]: https://bots.wmflabs.org/logs/%23mediawiki-core/20161004.txt
[2]: https://lwn.net/Articles/808575/
[3]: https://lore.kernel.org/all/cover.1577088521.git.luto@kernel.org/
[4]: https://www.zx2c4.com/projects/linux-rng-5.17-5.18/
[5]: https://lore.kernel.org/all/Yo3pmh9hiUFtQz77@zx2c4.com/T/
Change-Id: I29136fad826341d21728671aa30285d5551f1162
Add SpecialWhatLinksHereQueryHook hook that allows extensions
to modify the query builder to add more conditions based on
the filters added in the SpecialPageBeforeFormDisplay hook.
Bug: T216368
Change-Id: I221d4e0ad671feab6937719d4a2f894ad6154bb1
This deprecates a number of methods which returned arrays by reference and
exposed internal representation details of the ParserOutput. It also
regularizes the return values to return consistent LinkTarget values,
working around the wide variety of different internal storage formats
used for links.
In the future, once these methods which expose the internal representation
are removed, we can simplify our internal storage as well. But for the
moment we add the new getter without changing the internal representation.
Note that by returning TitleValue objects this new interface also provides
a means to fix the issue identified in T204792 where interwiki and namespace
prefixes were getting confused. A TitleValue properly distinguishes between
these -- although the callers will still have to be careful to use it as
a TitleValue and not attempt to reparse it.
These methods also correctly handle fragments, which are present for the
language link type but stripped for the other link types.
Bug: T204792
Change-Id: I48a2077b9645124f83082afd953d6bf7a861270b
Why:
To facilitate the evaluation of conditions not directly
"known" by the lookup, e.g. those owned by extensions.
What:
- Add a ConditionalDefaultOptionsAddCondition hook, which
runs before ConditionalDefaultsLookup is instantiated and
lets handlers add conditions for evaluation to the
$extraConditions array.
- Evaluate the configured conditional default against the
extra added conditions after evaluation of "known"
conditions.
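The evaluation order described above could look roughly like the
following Python sketch. This is an illustration, not the PHP
implementation: the function names, the rule layout and the
"registered-after" / "is-mentor" conditions are all invented for the
example.

```python
def check_condition(name, args, user, known, extra):
    """Try the built-in ("known") conditions first, then the extra ones."""
    if name in known:
        return known[name](user, args)
    if name in extra:
        return extra[name](user, args)
    raise ValueError(f"unsupported condition: {name}")

def conditional_default(rules, user, known, extra):
    """rules: list of (default value, [(condition name, args), ...])."""
    for value, conditions in rules:
        if all(check_condition(n, a, user, known, extra) for n, a in conditions):
            return value
    return None  # no conditional default applies

# Hypothetical condition sets: one built in, one contributed by an extension.
known = {"registered-after": lambda user, args: user["registered"] > args[0]}
extra = {"is-mentor": lambda user, args: user.get("mentor", False)}
rules = [("1", [("registered-after", (2020,)), ("is-mentor", ())])]

value = conditional_default(rules, {"registered": 2023, "mentor": True}, known, extra)
```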
Bug: T376918
Change-Id: Ife6f96397eafd61fdb40528aac315ddde1ef2774
This adds support for serializing/deserializing objects which
implement the JsonCodecable interface from the wikimedia/json-codec
library used by Parsoid. JsonCodecable allows customizing the encoding
of objects of a given class using a class-specific codec object, and
JsonCodecable is an interface which is defined and can be used outside
mediawiki core.
In addition json-codec supports deserialization in the presence of
aliased class names, fixing T353883.
Backward and forward compatibility established via the mechanism
described in
https://www.mediawiki.org/wiki/Manual:Parser_cache/Serialization_compatibility
Test data generated by this patch was added in
I109640b510cef9b3b870a8c188f3b4f086d75d06 to ensure forward
compatibility with the output after this patch is merged.
Benchmarks:
                      PHP 7.4.33          PHP 8.2.19          PHP 8.3.6
                      BEFORE    AFTER     BEFORE    AFTER     BEFORE    AFTER
Serialize:            926.7/s   1424.8/s  978.5/s   1542.4/s  1023.5/s  1488.6/s
Serialize (assoc):    930.2/s   1378.6/s  974.6/s   1541.9/s  1022.4/s  1463.4/s
Deserialize:          1942.7/s  1961.3/s  2118.8/s  2175.9/s  2129.8/s  2063.5/s
Deserialize (assoc):  1952.0/s  1905.7/s  2107.5/s  2192.1/s  2153.3/s  2011.1/s
These numbers definitely do not have as many significant digits as
written here. But they should be sufficient to demonstrate that
performance is not impaired by this patch and in fact serialization
speed improves slightly.
Bug: T273540
Bug: T327439
Bug: T346829
Bug: T353883
Depends-On: If1d70ba18712839615c1f4fea236843ffebc8645
Change-Id: Ia1017dcef462f3ac1ff5112106f7df81f5cc384f
In T340552, the official PHP OpenTelemetry client was effectively
rejected for inclusion in MediaWiki due to its size. Implement a minimal
tracing library instead that eschews conformance with the OTEL client
specification in favor of simplicity, while remaining capable of
emitting trace data in OTLP format and thus retaining compatibility with
any ingestion endpoint capable of handling OTLP.
In its current state, the library supports a basic feature set that
should be sufficient for basic tracing integration:
* Span creation, inclusive span activation and automatic parent span
assignment,
* Span attributes and span kinds,
* Basic resource (process/request)-level metadata generation,
* Data export over OTLP.
Additional functionality, such as trace propagation, can then be
incrementally added to the library.
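As a language-neutral illustration of the first two features, here is a
small Python sketch of span activation with automatic parent
assignment. It does not mirror the PHP library's classes or method
names; Span, Tracer, start_active_span and end_span are assumptions
made for this example, and OTLP export is left out.

```python
import secrets
import time
from dataclasses import dataclass, field
from typing import Dict, List, Optional

@dataclass
class Span:
    name: str
    trace_id: str
    span_id: str
    parent_span_id: Optional[str]
    kind: str = "internal"
    attributes: Dict[str, str] = field(default_factory=dict)
    start_ns: int = 0
    end_ns: Optional[int] = None

class Tracer:
    def __init__(self) -> None:
        self._active: List[Span] = []   # stack of currently active spans
        self.finished: List[Span] = []  # spans ready for export

    def start_active_span(self, name: str, kind: str = "internal") -> Span:
        # The innermost active span, if any, becomes the parent.
        parent = self._active[-1] if self._active else None
        span = Span(
            name=name,
            trace_id=parent.trace_id if parent else secrets.token_hex(16),
            span_id=secrets.token_hex(8),
            parent_span_id=parent.span_id if parent else None,
            kind=kind,
            start_ns=time.time_ns(),
        )
        self._active.append(span)
        return span

    def end_span(self, span: Span) -> None:
        span.end_ns = time.time_ns()
        self._active.remove(span)
        self.finished.append(span)

tracer = Tracer()
request = tracer.start_active_span("handle request", kind="server")
query = tracer.start_active_span("database query", kind="client")
query.attributes["db.system"] = "mysql"
tracer.end_span(query)
tracer.end_span(request)
```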
Bug: T340552
Change-Id: Ibc3910058cd7ed064cad293a3cdc091344e66b86
We've discovered some new requirements.
Follow-up to 31f614f732.
The hook was not in a release yet, so we can rename it.
Bug: T371530
Change-Id: I82d8ae69c27a38c45eab5d19c063f0b9515b8ec8
Replaces 'copyright' with 'copyright-footer' and 'history_copyright'
with 'copyright-footer-history' (the original still takes precedence
if set). Adds SkinCopyrightFooterMessage hook which works the same
way as SkinCopyrightFooter for the new messages. Allows disabling
the old messages by setting $wgAllowRawHtmlCopyrightMessages = false.
Co-Authored-By: Gergő Tisza <tgr.huwiki@gmail.com>
Bug: T45646
Change-Id: I5fd5607f8d43b6e934c8d4d35097cec430c56043
Why:
* A hook is needed which is called when User::spreadAnyEditBlock
is called, so that extensions which provide alternative blocking
mechanisms (such as the GlobalBlocking extension) can spread
their blocks when local blocks are spread.
What:
* Add SpreadAnyEditBlockHook, which is called from
User::spreadAnyEditBlock whenever that method is called, except
when the user is not registered.
** The hook is called even if the user is not locally blocked
* The return value of User::spreadAnyEditBlock is modified to
return true if either a local block or alternative blocking
mechanism spread blocks.
* Update UserTest to test this new behaviour.
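The resulting control flow, sketched in Python rather than PHP. The
function and parameter names are placeholders, and returning False for
an unregistered user is an assumption of this sketch, not something
stated above.

```python
from typing import Callable, Iterable

def spread_any_edit_block(
    is_registered: bool,
    spread_local_block: Callable[[], bool],
    hook_handlers: Iterable[Callable[[], bool]],
) -> bool:
    if not is_registered:
        # The hook is not run for unregistered users (return value assumed).
        return False
    blocked_locally = spread_local_block()
    # Handlers run even when no local block was spread.
    spread_elsewhere = False
    for handler in hook_handlers:
        spread_elsewhere = handler() or spread_elsewhere
    return blocked_locally or spread_elsewhere

# No local block, but an alternative mechanism spreads one.
result = spread_any_edit_block(True, lambda: False, [lambda: True])
```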
Bug: T374857
Change-Id: Id302a6362d6177c89da9cdf4e677b3822ecb85f1
The discovery endpoint provides basic information about accessing the
wiki's APIs, as well as a directory of available modules.
Bug: T365753
Change-Id: I161aa68566da91867b650e13c8aadc87cd0c428c
Why:
* The logging table on en.wikipedia.org contains an entry from
2005 which is an unblock of an autoblock. However, the log_title
contains the namespace, which makes the code that looks for
logs which target an autoblock fail (because it checks for the
first character being '#').
* Fixing the log_title to remove the 'User:' prefix from rows which
are autoblocks (i.e. searching for log_titles which start with
'User:#') should address the exceptions seen on Special:Log for
these rows.
** The search can be limited to rows which have the 'unblock'
log_action, as this has only been seen for this type of log.
What:
* Create fixAutoblockLogTitles.php which searches for the entries
and then updates the log_title value to no longer include the
'User:' prefix
** The queries to search are split, such that the expensive LIKE
query is performed on batches of row IDs. If the LIKE query is
applied directly to all rows in the table, the query takes 30s
to run on WMF production.
* Add this maintenance script to update.php. It will be run once
as the class extends LoggedUpdateMaintenance.
* Test the newly added maintenance script to ensure it works.
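The batching idea can be sketched as follows, using Python and SQLite
instead of the actual maintenance script. The function name and batch
size are illustrative, the table is reduced to three columns, and the
sketch updates in a single statement per batch where the script
searches first and then updates.

```python
import sqlite3

def fix_autoblock_log_titles(conn: sqlite3.Connection, batch_size: int = 2) -> int:
    """Strip the 'User:' prefix from autoblock unblock logs, one ID range at a time."""
    (max_id,) = conn.execute("SELECT MAX(log_id) FROM logging").fetchone()
    fixed = 0
    start = 1
    while max_id is not None and start <= max_id:
        end = start + batch_size - 1
        # The LIKE filter only ever scans the rows inside this ID range.
        cur = conn.execute(
            "UPDATE logging SET log_title = substr(log_title, 6) "
            "WHERE log_id BETWEEN ? AND ? "
            "AND log_action = 'unblock' AND log_title LIKE 'User:#%'",
            (start, end),
        )
        fixed += cur.rowcount
        start = end + 1
    conn.commit()
    return fixed

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE logging (log_id INTEGER PRIMARY KEY, log_action TEXT, log_title TEXT)"
)
conn.executemany(
    "INSERT INTO logging (log_action, log_title) VALUES (?, ?)",
    [
        ("unblock", "User:#1234"),
        ("block", "Example"),
        ("unblock", "Example2"),
        ("unblock", "User:#99"),
        ("block", "User:#5"),
    ],
)
fixed = fix_autoblock_log_titles(conn)
```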
Bug: T373929
Change-Id: Ia62db56eda456bb764303b5f4b5a29be8f2d8fff
Fix file doc blocks while at it.
> Remove duplicate description from file block in favour of class doc.
> This reduces needless duplication and is often incorrect or outdated,
> and helps make file headers more consistently (visually) ignorable.
>
> Add missing `ingroup` to class doc (and remove any from file doc)
> as otherwise the file is indexed twice (e.g. in Doxygen) which makes
> navigation on doc.wikimedia.org rather messy.
>
> Ref https://gerrit.wikimedia.org/r/q/message:ingroup+is:merged+owner:Krinkle+branch:master
Bug: T364652
Change-Id: Icc36566da1c7190b0f4269719f34d3d6a83026c1
WinCache is an APCu equivalent for use with Microsoft IIS, but in recent
years has been unmaintained and lacks support for PHP 8 and newer.[1]
So, remove support for it as MediaWiki will be raising the minimum
supported PHP version to 8.1.
[1] https://www.php.net/manual/en/install.windows.recommended.php
Bug: T365691
Change-Id: I4d2dc01a9119bb1f858132f0146b894750c1e86d
This allows the CategoryTree extension to change how category links
are rendered without skipping the update of mCategoryData and
mCategories, which leads to wgCategories = [] (T372155).
The new hook will be used in extension CategoryTree by
Ic86f210474cbc0e2dcebf664cf2309a4a4408f60.
Bug: T372155
Change-Id: Id82a77a57d1f12233d974ea4c1b093f50c5ab74f
Add a new hook that can be used to prevent authentication just
before AuthManager takes the main action (writing the session
for login, creating the local user account for account creation).
The driving use case is a wiki which supports both a local and
a central (wiki-farm-level) login or signup flow - various
security options (such as 2FA) are needed during local login
but unnecessary during central login (which will have those
security features centrally), so we need to skip much of the
security when the user is taking the central route, and a bug
in how that's done could result in circumvention of security
features during local login. The hook makes it easy to inspect
and potentially interrupt login near the end, when we know for
sure what route it took. (Specifically, we know which primary
provider was used. The hook doesn't expose other details,
such as the list of preauth or secondary providers that were
invoked, because they were not needed for the immediate use
case, but they are easy to add in the future.)
The hook is called after the secondary providers for login
and before them for account creation, since secondaries can
interrupt login but cannot interrupt account creation.
A shortcoming is that since the hook is called after a primary
provider succeeded, it cannot prevent the primary provider from
doing work, i.e. it cannot prevent creation of the remote account
during account creation (although it will prevent the creation
of the local account). This is not great but acceptable, since
creating a new account isn't very security-sensitive.
This also means the hook would not be useful during account
linking, as AuthManager does not do anything there, all the work
happens in the primary provider. This is even less great but
few authentication extensions implement account linking.
The hook is not called for authentication happening via
CreatedAccountAuthenticationRequest, which is a weird internal
hack hook handlers should not have to know about.
Also rename a confusingly named variable.
Change-Id: I835b2fe2f43e6e81f23348165cbb9c93832e6583
Allow disabling authentication providers. This allows for
extensions to replace core providers with their own.
This is using the $wgAuthManagerAutoConfig keys instead of
AuthenticationProvider::getUniqueId() as the keys to filter.
This makes it more useful for site administrators, and also
it's probably the better known of the two identifiers so
more intuitive.
No effort is made to prevent the hook from filtering
differently in different steps of the same authentication
process.
Bug: T369180
Change-Id: If5435b54a4fc08f685c04fc10eb44c6d72cd78fa
Why:
* The fixDefaultJsonContentPages.php maintenance script was added
in 2986d47c90, which was in MW 1.27.
* Per the version policy, wikis wishing to upgrade to MW 1.43
should upgrade via 1.35 or 1.39 before moving to 1.43.
* As such, this script will have been already run for any wiki
upgrading to 1.43 and therefore this script is unused.
* Removing the script reduces the amount of untested code in the
maintenance directory and avoids unnecessary maintenance of
now-unused code.
What:
* Remove fixDefaultJsonContentPages.php
Bug: T373335
Change-Id: Ie20f55c6a8723573aa7e9acd67766af9dfb67269
Why:
* The populatePPSortKey.php maintenance script was added in
993ce4d411, which was in at least MW 1.34.
* Per the version policy, wikis wishing to upgrade to MW 1.43
should upgrade via 1.35 or 1.39 before moving to 1.43.
* As such, this script will have been already run for any wiki
upgrading to 1.43 and therefore this script is unused.
* Removing the script reduces the amount of untested code in the
maintenance directory and avoids unnecessary maintenance of
now-unused code.
What:
* Remove populatePPSortKey.php
Bug: T373334
Change-Id: Iaa86bb193bf8feae9f5e2fe33255182e864d2e4f
Why:
* The populateBacklinkNamespace.php maintenance script was
added to update.php in MW 1.24 in b8c038f678.
* Per the version policy, wikis wishing to upgrade to MW 1.43
should upgrade via 1.35 or 1.39 before moving to 1.43.
* As such, this script will have been already run for any wiki
upgrading to 1.43 and therefore this script is unused.
* Removing the script reduces the amount of untested code in the
maintenance directory and avoids unnecessary maintenance of
now-unused code.
What:
* Remove populateBacklinkNamespace.php
Bug: T373333
Change-Id: Ia70fdd5c5ae087d7f5bdf4499185701bbb106c1f
Why:
* The addRFCandPMIDInterwiki.php maintenance script last had
its update key modified in MW 1.23 in commit bd38435848.
* Per the version policy, wikis wishing to upgrade to MW 1.43
should upgrade via 1.35 or 1.39 before moving to 1.43.
* As such, this script will have been already run for any wiki
upgrading to 1.43 and therefore this script is unused.
* Removing the script reduces the amount of untested code in the
maintenance directory and avoids unnecessary maintenance of
now-unused code.
What:
* Remove addRFCandPMIDInterwiki.php
Bug: T373331
Change-Id: Ie6791c0cc2cfab5e09aa0b1c7ddcea1099cd1f79