Why:
* Maintenance scripts in core have bolierplate code that is
added before and after the class to allow directly running
the maintenance script.
* Running the maintenance script directly has been deprecated
since 1.40, so this boilerplate code is only to support a now
deprecated method of running maintenance scripts.
* This code cannot also be marked as covered, due to PHPUnit
not recognising code coverage for files.
* Therefore, it is best to ignore this boilerplate code in code
coverage reports as it cannot be marked as covered and also
is for deprecated code.
What:
* Wrap the boilerplate code (requiring Maintenance.php and then
later defining the maintenance script class and running if the
maintenance script was called directly) with @codeCoverageIgnore
comments.
* Some files use a different boilerplate code, however, these
should also be marked as ignored for coverage for the same
reason that coverage is not properly reported for files.
Bug: T371167
Change-Id: I32f5c6362dfb354149a48ce9c28da9a7fc494f7c
Rename Sanitizer::removeHTMLtags() into an @internal method named
::internalRemoveHtmlTags() so that we can deprecate external use.
Code search:
https://codesearch.wmcloud.org/deployed/?q=removeHTMLtags&i=nope&files=&excludeFiles=&repos=
Followup-To: Ic864c01471c292f11799c4fbdac4d7d30b8bc50f
Depends-On: Iaca83ed06e9c61d8366579cd2283cba653c82319
Depends-On: I1963bfe9a99198ea02ca482a5769467ce806cd58
Depends-On: I83923d8b38d33f3638cd53958dd10f257ec21f7c
Depends-On: I018b34bb5f6e113056da9b04cc72d4318422adce
Change-Id: I202826f8b27519f7be89643e24eda47a6e3fc9f6
The existing Sanitizer::removeHTMLtags() method, in addition to having
dodgy capitalization, uses regular expressions to parse the HTML.
That produces corner cases like T298401 and T67747 and is not guaranteed
to yield balanced or well-formed HTML.
Instead, introduce and use a new Sanitizer::removeSomeTags() method
which is guaranteed to always return balanced and well-formed HTML.
Note that Sanitizer::removeHTMLtags()/::removeSomeTags() take a callback
argument which (as far as I can tell) is never used outside core. Mark
that argument as @internal, and clean up the version used by
::removeSomeTags().
Use the new ::removeSomeTags() method in the two places where
DISPLAYTITLE is handled (following up on T67747). The use by the
legacy parser is more difficult to replace (and would have a
performace cost), so leave the old ::removeHTMLtags() method in place
for that call site for now: when the legacy parser is replaced by
Parsoid the need for the old ::removeHTMLtags() will go away. In a
follow-up patch we'll rename ::removeHTMLtags() and mark it @internal
so that we can deprecate ::removeHTMLtags() for external use.
Some benchmarking code added. On my machine, with PHP 7.4, the new
method tidies short 30-character title strings at a rate of about
6764/s while the tidy-based method being replaced here managed 6384/s.
Sanitizer::removeHTMLtags blazes through short strings 20x faster
(120,915/s); some of this difference is due to the set up cost of
creating the tag whitelist and the Remex pipeline, so further
optimizations could doubtless be done if Sanitizer::removeSomeTags()
is more widely used.
Bug: T299722
Bug: T67747
Change-Id: Ic864c01471c292f11799c4fbdac4d7d30b8bc50f
Also move Benchmarker.php to maintenance/includes/ because
that file is not itself a benchmark, and that base class is
actually covered by tests and should remain included.
Change-Id: I4acd88242dde56a884d319dfc141a3511a8221a3