Allow HTML5 <rtc> tag (ruby support for East Asian typography).

We currently allow <ruby>, <rt>, <rb>, and <rp> but not the W3C HTML5
<rtc> element.  Fix that.

(Note that <rb> and <rtc> are new additions to HTML5 which currently
appear in the W3C but not the WHATWG version of the HTML5 spec.
Support for these has already been merged into Gecko and WebKit, and
the editor plans to update the WHATWG spec.)

Bug: 67042
Change-Id: I8c0e65d782b6d23057a9723b87323b28e8bf8852
C. Scott Ananian 2014-06-24 14:15:25 -04:00
parent b4ed05d6a2
commit fb125de072
3 changed files with 47 additions and 7 deletions
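
The effect of adding 'rtc' to a tag whitelist can be illustrated with a small standalone sketch. This is not MediaWiki's Sanitizer: the function name escapeDisallowedTags() and both whitelist arrays are hypothetical, and attributes are simply dropped rather than validated. It only shows the general idea that a tag missing from the whitelist is escaped as text, while a whitelisted tag passes through.

```php
<?php
// Hypothetical helper for illustration only; MediaWiki's real Sanitizer
// also validates attributes, nesting, and much more.
function escapeDisallowedTags( string $text, array $allowed ): string {
	return preg_replace_callback(
		// Capture: optional closing slash, tag name, and any attribute text.
		'!<(/?)([a-zA-Z][a-zA-Z0-9]*)([^>]*)>!',
		function ( array $m ) use ( $allowed ): string {
			$name = strtolower( $m[2] );
			if ( in_array( $name, $allowed, true ) ) {
				// Whitelisted: re-emit the bare tag, dropping attributes.
				return '<' . $m[1] . $name . '>';
			}
			// Not whitelisted: show the tag as literal text.
			return htmlspecialchars( $m[0], ENT_NOQUOTES );
		},
		$text
	);
}

// Assumed whitelists: ruby-related tags without and with 'rtc'.
$before = [ 'ruby', 'rb', 'rp', 'rt' ];
$after  = [ 'ruby', 'rb', 'rp', 'rt', 'rtc' ];
$input  = '<ruby><rb>旧</rb><rtc>San Francisco</rtc></ruby>';

echo escapeDisallowedTags( $input, $before ), "\n";
// <ruby><rb>旧</rb>&lt;rtc&gt;San Francisco&lt;/rtc&gt;</ruby>
echo escapeDisallowedTags( $input, $after ), "\n";
// <ruby><rb>旧</rb><rtc>San Francisco</rtc></ruby>
```

With the shorter list the <rtc> element is rendered as escaped text; with 'rtc' added, the same input passes through unchanged.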

@@ -383,7 +383,7 @@ class Sanitizer {
'h2', 'h3', 'h4', 'h5', 'h6', 'cite', 'code', 'em', 's',
'strike', 'strong', 'tt', 'var', 'div', 'center',
'blockquote', 'ol', 'ul', 'dl', 'table', 'caption', 'pre',
-'ruby', 'rt', 'rb', 'rp', 'p', 'span', 'abbr', 'dfn',
+'ruby', 'rb', 'rp', 'rt', 'rtc', 'p', 'span', 'abbr', 'dfn',
'kbd', 'samp', 'data', 'time', 'mark'
);
$htmlsingle = array(
@@ -1685,10 +1685,10 @@ class Sanitizer {
# http://www.whatwg.org/html/text-level-semantics.html#the-ruby-element
'ruby' => $common,
# rbc
-# rtc
'rb' => $common,
-'rt' => $common, #array_merge( $common, array( 'rbspan' ) ),
'rp' => $common,
+'rt' => $common, #array_merge( $common, array( 'rbspan' ) ),
+'rtc' => $common,
# MathML root element, where used for extensions
# 'title' may not be 100% valid here; it's XHTML

@@ -378,7 +378,7 @@ class CoreParserFunctions {
// list of disallowed tags for DISPLAYTITLE
// these will be escaped even though they are allowed in normal wiki text
$bad = array( 'h1', 'h2', 'h3', 'h4', 'h5', 'h6', 'div', 'blockquote', 'ol', 'ul', 'li', 'hr',
-'table', 'tr', 'th', 'td', 'dl', 'dd', 'caption', 'p', 'ruby', 'rb', 'rt', 'rp', 'br' );
+'table', 'tr', 'th', 'td', 'dl', 'dd', 'caption', 'p', 'ruby', 'rb', 'rt', 'rtc', 'rp', 'br' );
// disallow some styles that could be used to bypass $wgRestrictDisplayTitle
if ( $wgRestrictDisplayTitle ) {

@@ -1087,7 +1087,7 @@ Non-html5 tags should be accepted
</p>
!! end
-## a,rtc not permitted
+## a not permitted
## i,b,br omitted
!! test
Text-level semantic html elements in wikitext
@@ -1109,7 +1109,7 @@ Text-level semantic html elements in wikitext
<sub>text</sub>
<u>text</u>
<mark>text</mark>
-<ruby><rb>明日<rp>(</rp><rt>Ashita</rt><rp>)</rp></rb></ruby>
+<ruby><rb>明日</rb><rp>(</rp><rt>Ashita</rt><rp> </rp><rtc>あした</rtc><rp>)</rp></ruby>
<bdi>text</bdi>
<bdo>text</bdo>
<span>text</span>
@@ -1132,7 +1132,7 @@ Text-level semantic html elements in wikitext
<sub>text</sub>
<u>text</u>
<mark>text</mark>
-<ruby><rb>明日<rp>(</rp><rt>Ashita</rt><rp>)</rp></rb></ruby>
+<ruby><rb>明日</rb><rp>(</rp><rt>Ashita</rt><rp> </rp><rtc>あした</rtc><rp>)</rp></ruby>
<bdi>text</bdi>
<bdo>text</bdo>
<span>text</span>
@@ -1140,6 +1140,46 @@ Text-level semantic html elements in wikitext
</p>
!! end
+# test cases taken from
+# http://www.w3.org/TR/html5/text-level-semantics.html#the-ruby-element
+!! test
+Ruby markup (W3C-style)
+!! wikitext
+; Mono-ruby for individual base characters
+: <ruby>日<rt>に</rt>本<rt>ほん</rt>語<rt>ご</rt></ruby>
+; Group ruby
+: <ruby>今日<rt>きょう</rt></ruby>
+; Jukugo ruby
+: <ruby>法<rb>華</rb><rb>経</rb><rt>ほ</rt><rt>け</rt><rt>きょう</rt></ruby>
+; Inline ruby
+: <ruby>東<rb>京</rb><rp>(</rp><rt>とう</rt><rt>きょう</rt><rp>)</rp></ruby>
+; Double-sided ruby
+: <ruby><rb>旧</rb><rb>金</rb><rb>山</rb><rt>jiù</rt><rt>jīn</rt><rt>shān</rt><rtc>San Francisco</rtc></ruby>
+<ruby>
+<rb>♥</rb><rtc><rt>Heart</rt></rtc><rtc lang=fr><rt>Cœur</rt></rtc>
+<rb>☘</rb><rtc><rt>Shamrock</rt></rtc><rtc lang=fr><rt>Trèfle</rt></rtc>
+<rb>✶</rb><rtc><rt>Star</rt></rtc><rtc lang=fr><rt>Étoile</rt></rtc>
+</ruby>
+!! html
+<dl><dt> Mono-ruby for individual base characters</dt>
+<dd> <ruby>日<rt>に</rt>本<rt>ほん</rt>語<rt>ご</rt></ruby></dd>
+<dt> Group ruby</dt>
+<dd> <ruby>今日<rt>きょう</rt></ruby></dd>
+<dt> Jukugo ruby</dt>
+<dd> <ruby>法<rb>華</rb><rb>経</rb><rt>ほ</rt><rt>け</rt><rt>きょう</rt></ruby></dd>
+<dt> Inline ruby</dt>
+<dd> <ruby>東<rb>京</rb><rp>(</rp><rt>とう</rt><rt>きょう</rt><rp>)</rp></ruby></dd>
+<dt> Double-sided ruby</dt>
+<dd> <ruby><rb>旧</rb><rb>金</rb><rb>山</rb><rt>jiù</rt><rt>jīn</rt><rt>shān</rt><rtc>San Francisco</rtc></ruby></dd></dl>
+<p><ruby>
+<rb>♥</rb><rtc><rt>Heart</rt></rtc><rtc lang="fr"><rt>Cœur</rt></rtc>
+<rb>☘</rb><rtc><rt>Shamrock</rt></rtc><rtc lang="fr"><rt>Trèfle</rt></rtc>
+<rb>✶</rb><rtc><rt>Star</rt></rtc><rtc lang="fr"><rt>Étoile</rt></rtc>
+</ruby>
+</p>
+!! end
!! test
Non-word characters don't terminate tag names (bug 17663, 40670, 52022)
!! wikitext