Use HTML5 semantics for self-closed HTML tags in wikitext

The old handling of self-closed HTML tags has been deprecated, with a
tracking category, since 1.28.  It is time to remove the temporary
parameter added to Sanitizer::removeHTMLtags() and (finally) change the
behavior to match HTML5.

Bug: T134423
Change-Id: I5c725175d05854139c95a2b3d8d35ff63cb6707b
C. Scott Ananian 2020-04-02 11:17:41 -04:00
parent 7097484a21
commit 05bc687111
5 changed files with 19 additions and 37 deletions
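
The behavior change described in the commit message can be illustrated with a small standalone sketch. It is not MediaWiki's Sanitizer code: the function name dropSelfClosingSlash() and the abbreviated VOID_ELEMENTS list are assumptions made for illustration, standing in for the $htmlsingle / $htmlsingleonly lookups in the real method. It shows only the core idea, that a trailing "/>" on a non-void tag is treated as a plain ">" so the element stays open, as an HTML5 parser would treat it. Edge cases such as unquoted or broken attribute values are ignored.

```php
<?php
// Hypothetical, simplified illustration; NOT the MediaWiki Sanitizer.

/** Abbreviated, assumed list of void elements that may keep a trailing "/>". */
const VOID_ELEMENTS = [ 'br', 'hr', 'img', 'wbr' ];

/**
 * Rewrite "<tag ... />" to "<tag ...>" for non-void tags, so the
 * self-closing slash is dropped instead of producing an empty element.
 */
function dropSelfClosingSlash( string $html ): string {
	$result = preg_replace_callback(
		// Group 1: tag name. Group 2: optional attribute text.
		'#<([a-z][a-z0-9]*)((?:\s[^<>]*?)?)\s*/>#i',
		function ( array $m ): string {
			if ( in_array( strtolower( $m[1] ), VOID_ELEMENTS, true ) ) {
				// Void element: the slash is harmless, keep the tag as written.
				return $m[0];
			}
			// Non-void element: drop the slash, leaving an ordinary start tag.
			return '<' . $m[1] . $m[2] . '>';
		},
		$html
	);
	// preg_replace_callback() returns null on a regex error; fall back to the input.
	return $result ?? $html;
}
```

Under these assumptions, dropSelfClosingSlash( '<font id="bug" />Centered text' ) yields '<font id="bug">Centered text'. The font element is then left open until something later closes it, which is consistent with the changed expected output in the parser tests of this commit.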

@@ -500,6 +500,8 @@ because of Phabricator reports.
parameter. Passing a user was deprecated in 1.33.
* Sanitizer::setupAttributeWhitelist() and Sanitizer::attributeWhitelist(),
deprecated in 1.34, have been removed. They should not have been public.
* The $warnCallback parameter to Sanitizer::removeHTMLtags, deprecated since
its introduction in 1.28, has been removed.
* SpecialRecentChanges::filterByCategories(), deprecated in 1.31, was removed.
* The `ArticleContentViewCustom` hook, deprecated in 1.32, was removed.
* AuthManager::callLegacyAuthPlugin, deprecated in 1.33, was removed.

@@ -1560,8 +1560,7 @@ class Parser {
},
false,
[],
[],
[ $this, 'addTrackingCategory' ]
[]
);
Hooks::run( 'InternalParseBeforeLinks', [ &$parser, &$text, &$this->mStripState ] );

@@ -488,14 +488,10 @@ class Sanitizer {
* @param array|bool $args Arguments for the processing callback
* @param array $extratags For any extra tags to include
* @param array $removetags For any tags (default or extra) to exclude
* @param callable|null $warnCallback (Deprecated) Callback allowing the
* addition of a tracking category when bad input is encountered.
* DO NOT ADD NEW PARAMETERS AFTER $warnCallback, since it will be
* removed shortly.
* @return string
*/
public static function removeHTMLtags( $text, $processCallback = null,
$args = [], $extratags = [], $removetags = [], $warnCallback = null
$args = [], $extratags = [], $removetags = []
) {
$tagData = self::getRecognizedTagData( $extratags, $removetags );
$htmlpairs = $tagData['htmlpairs'];
@@ -526,14 +522,9 @@ class Sanitizer {
}
if ( $brace == '/>' && !( isset( $htmlsingle[$t] ) || isset( $htmlsingleonly[$t] ) ) ) {
// Eventually we'll just remove the self-closing
// slash, in order to be consistent with HTML5
// semantics.
// $brace = '>';
// For now, let's just warn authors to clean up.
if ( is_callable( $warnCallback ) ) {
call_user_func_array( $warnCallback, [ 'deprecated-self-close-category' ] );
}
// Remove the self-closing slash, to be consistent
// with HTML5 semantics. T134423
$brace = '>';
}
if ( !self::validateTag( $params, $t ) ) {
$badtag = true;

@@ -9710,8 +9710,8 @@ Failing to transform badly formed HTML into correct XHTML
</p>
!!end
## FIXME: Is Parsoid's acceptance of self-closing html-tags
## a feature or a bug? See https://phabricator.wikimedia.org/T76962
## Parsoid's behavior w/ self-closing HTML tags is now a bug;
## see T134423. Legacy Parser now implements proper HTML5 semantics.
!! test
Handling html with a div self-closing tag
!! wikitext
@@ -9721,20 +9721,13 @@ Handling html with a div self-closing tag
<div title=bar />
<div title=bar/>
<div title=bar/ >
!! html/php+tidy
<div title=""></div>
<div title=""></div>
!! html+tidy
<div title="">
<div title="bar"></div>
<div title="bar"></div>
<div title="bar/"></div></div>
!! html/parsoid
<div title="" data-parsoid='{"stx":"html","selfClose":true}'></div>
<div title="" data-parsoid='{"stx":"html","selfClose":true}'></div>
<div title="" data-parsoid='{"stx":"html","autoInsertedEnd":true}'>
<div title="bar" data-parsoid='{"stx":"html","selfClose":true}'></div>
<div title="bar" data-parsoid='{"stx":"html","selfClose":true}'></div>
<div title="bar/" data-parsoid='{"stx":"html","autoInsertedEnd":true}'></div></div>
<div title="">
<div title="">
<div title="bar">
<div title="bar">
<div title="bar/"></div></div></div></div></div></div>
!! end
!! test
@@ -19753,8 +19746,7 @@ Self closed html pairs (T7487)
<center><font id="bug" />Centered text</center>
<div><font id="bug2" />In div text</div>
!! html+tidy
<center><font id="bug"></font>Centered text</center>
<div><font id="bug2"></font>In div text</div>
<center><font id="bug">Centered text</font></center><font id="bug"><div><font id="bug2">In div text</font></div></font>
!! end
!! test
@@ -28220,10 +28212,8 @@ Self-closed tag with broken attribute value quoting
parsoid=wt2html,html2html
!! wikitext
<div title="Hello world />Foo
!! html/php+tidy
<div title="Hello world"></div><p>Foo</p>
!! html/parsoid
<div title="Hello world " data-parsoid='{"stx":"html","selfClose":true}'></div><p>Foo</p>
!! html+tidy
<div title="Hello world">Foo</div>
!! end
!! test

@@ -50,7 +50,7 @@ class SanitizerTest extends MediaWikiTestCase {
// former testSelfClosingTag
[
'<div>Hello world</div />',
'<div>Hello world</div></div>',
'<div>Hello world</div>',
'Self-closing closing div'
],
// Make sure special nested HTML5 semantics are not broken