Implicitly marking parameter $... as nullable is deprecated in PHP 8.4;
the explicit nullable type must be used instead.
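
A minimal sketch of the kind of change this deprecation asks for. The
function name and its behavior are hypothetical and are not taken from
the patch itself:

```php
<?php
// Hypothetical function for illustration; not part of the actual patch.

// Deprecated since PHP 8.4: the null default implicitly makes the
// parameter type nullable.
// function findTitle( string $prefix = null ): string { ... }

// Explicit nullable type, accepted by PHP 7.1 and later.
function findTitle( ?string $prefix = null ): string {
	return $prefix === null ? 'Main Page' : $prefix . ':Main Page';
}
```

Callers are unaffected: both findTitle() and findTitle( 'Help' ) keep
working as before.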
Created with autofix from Ide15839e98a6229c22584d1c1c88c690982e1d7a
Break one long line in SpecialPage.php
Bug: T376276
Change-Id: I807257b2ba1ab2744ab74d9572c9c3d3ac2a968e
… instead of running the same (possibly expensive) regular expression
twice. The new code does the same as before, as proven by the tests.
I leave the error handling untouched for the moment, even though we
know from T321234 that probably the only reason the preg_… calls here
can fail is broken UTF-8.
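
To illustrate that failure mode, here is a small sketch. The pattern
and the input string are made up for this example and are not the ones
used in the class:

```php
<?php
// "\xFF" can never appear in valid UTF-8, so with the /u modifier PCRE
// rejects the subject string before it even tries to match.
$brokenText = "__NOTOC__ \xFF";
$result = preg_match_all( '/__\w+__/u', $brokenText, $matches );

// preg_match_all() signals the failure with false (not 0), and
// preg_last_error() reports the reason.
$failedOnUtf8 = $result === false
	&& preg_last_error() === PREG_BAD_UTF8_ERROR;
```

Note that the check has to use === false, because a return value of 0
only means "no matches".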
The error message still mentions "preg_match_all". This is intentional
to make it easier to find old and new errors under the same name in
logstash.
I benchmarked this and it's indeed about 25% faster.
However, performance was not the main motivation for this patch but
the readability of the code.
Bug: T321234
Change-Id: I5b7a04abc008dd095dc87d0a05d06e061eb8d6b2
Garbage in, garbage out. When the wikitext is broken, it's still
helpful if the user can see the broken wikitext. Even if it's not
fully parsed. It's not the job of this class to fix broken UTF-8.
The worst thing that can happen is that the wikitext contains some
unparsed magic words. However, this is really only relevant for
very old revisions (20 years old, see T321234). It's very normal
that old revisions can't be 100% parsed any more, most notably
because of deleted templates. This here is not much different.
Bug: T321234
Change-Id: I0ce40f6575668847ef309599ee32de52190ab212
The extra code that scans for duplicates and throws an exception was
added via I95dea67 in 2017. I'm not entirely sure why. This should
be impossible in all relevant real-world scenarios. Maybe it happened
in a local dev scenario?
Even if it did, duplicates are harmless. Let me explain:
The only way a duplicate can end up here is when the same magic word is
added twice to the $this->names array. The only thing that happens
then is that the resulting regex contains one of the sub-patterns
twice. It doesn't matter which one matches. We know these sub-patterns
are identical. Unfortunately the PCRE compiler doesn't know and
assumes duplicate names are a problem. We have two options to fix
this: Strip duplicates in $this->names with array_unique() or tell
the PCRE compiler that duplicates are ok with the /J modifier.
I would like to avoid the extra, potentially expensive array_unique()
because, as said, duplicates never happen in real-world scenarios.
The /J modifier is supported since PHP 7.2.
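
A minimal sketch of the difference the /J modifier makes. The pattern
and the group name "notoc" are made up for this example; the real
regex is built from $this->names:

```php
<?php
// Hypothetical pattern: the same sub-pattern with the same group name
// appears twice, as it would if a magic word were added twice.
$alternatives = '(?<notoc>__NOTOC__)|(?<notoc>__NOTOC__)';

// Without /J the PCRE compiler rejects the duplicate name.
// preg_match() emits a warning (suppressed here) and returns false.
$withoutJ = @preg_match( "/$alternatives/", 'a __NOTOC__ b' );

// With /J duplicate names are allowed and the pattern matches as usual.
$withJ = preg_match( "/$alternatives/J", 'a __NOTOC__ b', $m );
```

Here $withoutJ is false and $withJ is 1, with the matched text in
$m[0]. No extra pass over the names array is needed.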
Change-Id: I5f113abdbb44354fcc01be7f36fbc7d07f75876c
* MagicWord::getId was added in r24808 (164bb322f2) but never used.
At the time, access modifiers like 'private' were not yet in use.
Deprecate the method with warnings, for removal in a future release.
* Fix zero coverage for MagicWord. Because the constructor is
internal, the class is only intended to be created via the array
and factory classes. Let their tests cover this class.
* Remove redundant file-level description and ensure the class desc
and ingroup tag are on the class block instead.
Ref https://gerrit.wikimedia.org/r/q/owner:Krinkle+message:ingroup
* Mark constructor `@internal` (was already implied by
stable interface policy), and explain where to get the object
instead.
* Mark load() `@internal`. Method was introduced in 1.1 when the
class (and PHP) did not yet use visibility modifiers for private
methods. The only way to get an instance of MagicWord
(MagicWordFactory::get) already calls load(), the method is not
a no-op if called a second time, and (fortunately) there exist no
callers to this outside this class that I could find.
* MagicWordArray::getBaseRegex was marked as internal
in change I17f1b7207db8d2203c904508f3ab8a64b68736a8.
Change-Id: I4084f858bb356029c142fbdb699f91cf0d6ec56f
… instead of the generic MWException and even more generic Exception.
Most, if not all of these should be unreachable anyway. I.e. these
are what we call "unchecked" exceptions, see T240672.
We also have a polyfill for preg_last_error_msg. No need to wrap it
in a function_exists (any more).
Change-Id: Ie26bef3b4371d011ec3f1874986072605692f486
The original motivation was readability. I added comments that
hopefully explain better what's going on here.
I also benchmarked this and it's more than 10 times faster than
before. The main difference comes from foreach itself.
Change-Id: I5e717ea0f3c0ce12f4beffac7105314b63cb752a
This patch is intentionally "incomplete". It's limited to places
where we can be 100% sure about the type just from looking at the
code. More to be done in later patches.
Change-Id: Ideea49ea9603127038ef08c6a9805f40a0b86b6d
Intentionally split across multiple patches. This is only about
documentation and cannot break anything (other than Phan).
MagicWordArray::matchAndRemove is particularly confusing because the
documentation and structure of the returned array make it look like
it would support parameters. But it never (!) did.
The method was added like this in 2008 via commit 269a9103 (r31113).
There was always only a single caller in the Parser class. The
parser never used the array values, only the keys (via isset). Which
makes sense because that code in the parser is about "double
underscore" magic words (e.g. __NOTOC__). These don't support
parameters anyway.
Change-Id: Ife92fc3d6d5b03606ba2b209a886cadef3451fea