CVE-2025-32699

Ensure that Unicode NFC normalization can be applied to our HTML output safely.

Even though the W3C officially recommends against normalizing HTML

https://www.w3.org/International/questions/qa-html-css-normalization#converting

this is still easily done inadvertently, especially when using the MediaWiki action API which normalizes parameters and results by default.

See also I671648603c4635a35585c860b4857f5ea085e47f in Parsoid, and T266140 / I2e78e660ba1867744e34eda7d00ea527ec016b71 for another similar issue.

The following changes are made:

* The various HTML serializers (Remex/Tidy-derived, as well as the Html::* helpers) are tweaked to entity-escape U+0338 wherever it appears.
* Similarly, Message::escaped() is tweaked to entity-escape U+0338.
* Finally, a post-processing pass is added to the OutputTransform pipeline to catch any remaining U+0338 and entity-escape them. This catches U+0338 added during any of the previous OutputTransform stages (like TOC insertion, section edit links, etc). *When backporting* this code will likely need to be moved to ParserOutput::getText(), as the OutputTransform pipeline wasn't added until MW 1.42.

Bug: T387130
Change-Id: I66564e14e730f5393f4fa5780b80f24de6075af5

| | | |
|---|---|---|
| Stages | ||
| ContentDOMTransformStage.php | ||
| ContentTextTransformStage.php | ||
| DefaultOutputPipelineFactory.php | ||
| OutputTransformPipeline.php | ||
| OutputTransformStage.php | ||
| README.md | ||
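
The commit message above turns on one Unicode detail: U+0338 (COMBINING LONG SOLIDUS OVERLAY) composes under NFC with a preceding `>`, `<`, or `=`, so normalizing HTML can silently destroy markup delimiters. The following is a minimal sketch of that effect and of the escaping idea, written in Python for brevity rather than in MediaWiki's PHP. The helper name `escape_u0338` and the exact reference form `&#x338;` are assumptions made for illustration, not taken from the MediaWiki source.

```python
import unicodedata

# U+0338 COMBINING LONG SOLIDUS OVERLAY
COMBINING_LONG_SOLIDUS_OVERLAY = "\u0338"


def escape_u0338(html: str) -> str:
    """Replace every raw U+0338 with a numeric character reference.

    Hypothetical helper: the reference form "&#x338;" is an assumption.
    """
    return html.replace(COMBINING_LONG_SOLIDUS_OVERLAY, "&#x338;")


# A tag immediately followed by U+0338 in the text content.
raw = "<b>\u0338x</b>"

# NFC composes ">" + U+0338 into U+226F (NOT GREATER-THAN),
# so the closing bracket of the <b> tag disappears.
broken = unicodedata.normalize("NFC", raw)

# Escaping first leaves nothing for NFC to compose with ">".
safe = unicodedata.normalize("NFC", escape_u0338(raw))
```

In this sketch `broken` no longer contains the `>` that closed the opening tag, while `safe` survives normalization unchanged because the combining character is only present as an ASCII character reference.
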
Output transformation pipelines for wikitext
The classes in the Stages/ subdirectory contain HTML and DOM transforms for use in
output processing pipelines, i.e. postprocessors for ParserOutput objects that either
result directly from a parse or are fetched from ParserCache.
The default pipeline is created by DefaultOutputPipelineFactory; it corresponds to
what was previously contained in ParserOutput::getText. The shouldRun method of each
stage uses defaults that indicate whether the stage runs in the default
OutputTransformPipeline.
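
The stage and pipeline relationship described above can be sketched roughly as follows. This is a simplified illustration in Python, not the PHP interfaces in this directory: the class names `Output`, `WrapStage`, `EscapeU0338Stage`, `Pipeline`, and `build_default_pipeline`, the `wrap` option, and the method signatures are all hypothetical. Only the overall shape (each stage decides via a should-run check with a default, and the pipeline applies the stages in order) is taken from the text.

```python
from dataclasses import dataclass


@dataclass
class Output:
    """Stand-in for a ParserOutput; holds only the rendered text."""
    text: str


class WrapStage:
    """Hypothetical stage that wraps the output in a <div>."""

    def should_run(self, output: Output, options: dict) -> bool:
        # The default (option absent) is to run in the default pipeline.
        return options.get("wrap", True)

    def transform(self, output: Output, options: dict) -> Output:
        return Output(f"<div>{output.text}</div>")


class EscapeU0338Stage:
    """Hypothetical final pass that escapes any remaining U+0338."""

    def should_run(self, output: Output, options: dict) -> bool:
        return True  # always runs

    def transform(self, output: Output, options: dict) -> Output:
        return Output(output.text.replace("\u0338", "&#x338;"))


class Pipeline:
    """Runs each stage in insertion order if its should_run check passes."""

    def __init__(self) -> None:
        self.stages = []

    def add_stage(self, stage) -> "Pipeline":
        self.stages.append(stage)
        return self

    def run(self, output: Output, options: dict) -> Output:
        for stage in self.stages:
            if stage.should_run(output, options):
                output = stage.transform(output, options)
        return output


def build_default_pipeline() -> Pipeline:
    # The escaping pass goes last so it also sees text added by earlier stages.
    return Pipeline().add_stage(WrapStage()).add_stage(EscapeU0338Stage())
```

For example, `build_default_pipeline().run(Output("a\u0338"), {})` runs both stages, while passing `{"wrap": False}` skips the wrapping stage and still runs the escaping pass.
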