Commit graph

743 commits

Author SHA1 Message Date
Brad Jorsch
83b798bbab Hide <style> tags from Tidy
Some versions of html-tidy (e.g. the one currently in use on WMF wikis)
will try to move all <style> tags in the body into the head, effectively
removing them for our purposes. We need to avoid that for TemplateStyles.

Bug: T167349
Change-Id: I133776d16f366cad73ed30af0e5a665fdf9f5ed9
2017-06-13 13:02:57 -04:00
jenkins-bot
a2254d32bf Merge "Sync up with Parsoid parserTests.txt" 2017-06-12 20:51:54 +00:00
Arlo Breault
a0e6bc14b6 Sync up with Parsoid parserTests.txt
This now aligns with Parsoid commit 108eed81b3eb78b77ade5ba5daac71fb43fff6de

Change-Id: Ib2b31f008adaf16866de16ef963bc58d6cabb088
2017-06-12 15:39:42 -04:00
Fomafix
fbf939cdda Remove id selector for toctitle
In 1bf5a652 the id selector was changed to a class selector for toctitle.
The cached HTML has expired by now, so the id selector is no longer
necessary.

Also remove the id selector #toc.tochidden for the print style. It is not
necessary because the tochidden class is only added to .toc and not to #toc.

Change-Id: I43cfffdb0807e8ed8f6b7b8732ba857b709bee80
2017-06-08 10:05:00 +02:00
C. Scott Ananian
b9280829ef Sync up with Parsoid parserTests.txt
This now aligns with Parsoid commit ebac189004d6edc4801719a5802766113bc84beb

Change-Id: I09bd3a72fd6210abc200bead2a16ad4106bcc6f9
2017-05-23 13:28:40 -04:00
C. Scott Ananian
186a182a15 Protect language converter markup in the preprocessor (take 2).
This revises 2877402276, which was
reverted in master due to unexpected issues with `-{{...}} ` markup
on translatewiki and enwiki.  Test cases are added to ensure that this
is parsed as a template, not as language converter markup.

https://www.mediawiki.org/wiki/Preprocessor_ABNF is the canonical
documentation for the preprocessor; this will be updated after this
patch is merged.  The basic principles described in that page are
maintained in this patch:

* Rightmost opening structure has precedence: `-{{` is parsed as a
dash followed by template opening.

* `{{{` has precedence over `{{` and `-{`: `-{{{{` is parsed as
`-{` `{{{` since we first grab the rightmost `{{{`.

A bunch of test cases were added to verify the "ideal precedence"
order described on that wiki page.

This patch introduced some minor incompatibilities in existing
markup, in particular with chemical formulae in templates.
Fixes for these are being tracked at
https://www.mediawiki.org/wiki/Parsoid/Language_conversion/Preprocessor_fixups

Bug: T146304
Bug: T153761
Change-Id: I2f0c186c75e392c95e1a3d89266cae2586349150
2017-05-23 15:43:49 +01:00
Brion Vibber
33e4ac5b22 Add \b to regexes in BlockLevelPass to avoid confusing tr & track
With TimedMediaHandler in video.js mode, videos can be inline,
without a wrapper div.

Previously, in this mode two paragraphs where one contained a
video would end up merged into one paragraph, due to BlockLevelPass
matching "<track .../>" against "<tr" in its regexes.

Added \b to a couple of the regexes to protect against such errors,
and corrected a parser test case that had bad output listed, where
"<link .../>" matched against "<li".

Bug: T165817
Change-Id: I06e82b881f5ebddae5e7df7fb940adfa54f6b659
2017-05-20 00:53:05 +02:00
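
Illustration for 33e4ac5b22 above: a minimal sketch of why the added \b matters. The two patterns are simplified, hypothetical stand-ins, not the actual BlockLevelPass regexes, and Python's re module is used here only to demonstrate the word-boundary behavior.

```python
import re

# Hypothetical, simplified stand-ins for block-level tag patterns.
WITHOUT_BOUNDARY = re.compile(r"<(tr|li)", re.IGNORECASE)
WITH_BOUNDARY = re.compile(r"<(tr|li)\b", re.IGNORECASE)


def opens_block(pattern, html):
    """Return True if the pattern finds a block-level opener in html."""
    return pattern.search(html) is not None


# Without \b, "<track" is mistaken for "<tr" and "<link" for "<li".
print(opens_block(WITHOUT_BOUNDARY, '<track src="a.vtt"/>'))
# With \b, the tag name must end right after "tr" or "li".
print(opens_block(WITH_BOUNDARY, '<track src="a.vtt"/>'))
print(opens_block(WITH_BOUNDARY, '<tr class="x">'))
```

The boundary only asserts that the tag name ends there, so real <tr> and <li> tags with or without attributes still match.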
C. Scott Ananian
f9de807e28 Sync up with Parsoid parserTests.txt
This now aligns with Parsoid commit d163deefa3aaeb0926af09a91075d6a611b36363

Change-Id: I9e543f0803247ffc264e634bf66b6bd0e143f187
2017-05-16 15:12:40 -04:00
Brad Jorsch
1aac0a2992 Wrap parser output in <div class="mw-parser-output">
This will allow CSS to target just the parser output, without also
accidentally targeting the edit form, diff tables, and so on.

Bug: T37247
Change-Id: If4eb5bf71f94fa366ec4eddb6964e8f4df6b824a
Depends-On: I330c6aa4aaee045614b1801ed34bc9e03be69650
Depends-On: I52a518fa44e017841fe78474012cd69823e0a41d
2017-05-08 05:32:03 +00:00
Arlo Breault
ed1afdee35 Sync up with Parsoid parserTests.txt
This now aligns with Parsoid commit 906375badbbf3d10455f36d9ecbaa8f66f5e6425

Change-Id: I1a102a4b6988eb972215eb7210a44cdf19d04c47
2017-04-10 17:15:05 -04:00
Ed Sanders
1bf5a652d6 Use classes instead of IDs for TOC collapsing
One may want to have multiple TOCs on the page (e.g. in VisualEditor).

Change-Id: I19701c4037b653b2944e407752e50f444861f883
2017-04-10 17:00:03 +00:00
jenkins-bot
4f43a9608f Merge "Valid tags on a gallery should correspond to unordered list" 2017-04-03 18:25:37 +00:00
Arlo Breault
aea349488c Valid tags on a gallery should correspond to unordered list
* This was introduced in 4d3446a8e3 when galleries were tables.
  However, 05579cf0e6 switched them to <ul>s but missed updating the
  sanitization.

* As an example, the test shows that summary is currently wrongly
  permitted.

Change-Id: I8c52477dc65499d0c8a1ee5cc661a5f9ae78cc07
2017-04-01 09:59:21 -04:00
Fomafix
7a3418ae33 Use consistent spaces at start and end of comments
Change-Id: Idbb09b69aa1ef4e46433319aaea62f34f0dbc038
2017-03-30 22:06:40 +02:00
Arlo Breault
c593079972 Sync up with Parsoid parserTests.txt
This now aligns with Parsoid commit b1b271460c341e844e21641b5307794cf4dd7feb

Change-Id: I792787b38c464efcb8c68b66b52a4dc9a0b41c36
2017-03-30 10:23:38 -04:00
James D. Forrester
9f53096051 Linker: Render selflinks as href-less classed <a>s, not <strong>s
Self-links are still semantically links, and representing them as <strong>s
is inelegant and, more importantly, a real pain to work with, especially in
contexts where they may change state (like inside an editor).

Instead, render them as <a>, with no href to avoid user agent style
overrides and with a class to style them as before, named 'mw-selflink' to go
with 'mw-redirect'. This allows much easier adjustment later. The old CSS
class 'selflink' is retained for backwards compatibility, but deprecated.

Bug: T160480
Change-Id: If058843924c3b30c116df2520aef93a004d98a5d
2017-03-29 23:12:17 +00:00
Aaron Schulz
488a647831 Move IDatabase/IMaintainableDatabase to Rdbms namespace
Change-Id: If7e8a8ff574661fd827de8bcec11d2c39a687300
2017-03-28 15:32:38 -07:00
Bartosz Dziewoński
bebe29662b Do not use real message names in 'All_system_messages' preprocessor test
This file seems to be a stress-test for the MediaWiki preprocessor.
It doesn't really matter whether the messages referenced here exist.
As messages are occasionally renamed or deleted, and since this file
was generated in 2011, people keep getting confused when they grep
for a message name and run into this list (and sometimes needlessly
spend their time updating this file, as seen in its Git history).

This commit replaces all of the message names with their SHA1 hash
truncated to 8 hex characters.

Regexps used for matching:
(?<=\?title=MediaWiki\:)([^&{}<>|\[\]]+)
(?<=int:)([^&{}<>|\[\]]+)
(?<=\[\[MediaWiki_talk:)([^&{}<>|\[\]]+)
(?<=action=edit )([^&{}<>|\[\]]+)

Change-Id: I52a71c0cc0e6fa21a61420d52df755066c6e9a08
2017-03-08 17:02:53 +01:00
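
Illustration for bebe29662b above: a rough sketch of the replacement, reusing the four regexps quoted in the commit message. How the replacement was actually applied is not stated, so the function names and the sequential application of the patterns are assumptions.

```python
import hashlib
import re

# The four lookbehind patterns quoted in the commit message.
PATTERNS = [
    r"(?<=\?title=MediaWiki\:)([^&{}<>|\[\]]+)",
    r"(?<=int:)([^&{}<>|\[\]]+)",
    r"(?<=\[\[MediaWiki_talk:)([^&{}<>|\[\]]+)",
    r"(?<=action=edit )([^&{}<>|\[\]]+)",
]


def short_sha1(match):
    """SHA1 of the matched message name, truncated to 8 hex characters."""
    digest = hashlib.sha1(match.group(1).encode("utf-8")).hexdigest()
    return digest[:8]


def anonymize(text):
    """Hypothetical helper: replace every matched message name with its hash."""
    for pattern in PATTERNS:
        text = re.sub(pattern, short_sha1, text)
    return text


print(anonymize("{{int:aboutsite}}"))
```

The sketch is meant for a single pass over the original file; running it again would hash the hashes.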
C. Scott Ananian
3e32d21210 Strip U+0000 in wikitext
U+0000 is not allowed in HTML5, there's no reason to allow it in wikitext.

It simplifies our code if we can just strip them at the start.  Strip in
PST as well so they don't sneak into our database either.

Tweaked the EXT_LINK URLs to account for the fact that invalid characters
get transformed into U+FFFD when using Preprocessor_DOM.  See 73649741ed
(r65967) for context on that change.

Bug: T159174
Change-Id: I3f67e92b61aacc87a40c3662085c84d1dac08bfb
2017-03-06 22:23:38 +00:00
Brad Jorsch
69be6f316a Make the parser tests' "subpage" option actually enable for all subpages
The option says "enable subpages (disabled by default)", but it
currently just enables subpages for namespaces 0 and 2. This tripped me
up when writing some parser tests for TemplateStyles where I need
subpages enabled for namespace 10.

There's probably no reason not to have it enable subpages for all
namespaces.

Change-Id: Icf864dafc4208a76af7b3e71f5f9c97576c065b7
2017-03-03 05:32:36 +00:00
jenkins-bot
8a65a5486f Merge "Remove duplicate test" 2017-02-23 02:10:12 +00:00
Arlo Breault
44d4992a95 Remove duplicate test
Change-Id: If99b0672c631e0428550a73a2a6116394ef32bb9
2017-02-22 11:58:06 -08:00
Arlo Breault
0366498177 Sync up with Parsoid parserTests.txt
This now aligns with Parsoid commit e23a818554548cd922ee262ea1d8da47ea457248

Change-Id: Ib26b170c51aa4425a54871fa32543b2eef5db41e
2017-02-22 09:25:57 -08:00
James D. Forrester
1e9c361960 tests: Replace implicit Bugzilla bug numbers with Phab ones
It's unreasonable to expect newbies to know that "bug 12345" means "Task T14345"
except where it doesn't, so let's just standardise on the real numbers.

Change-Id: I46261416f7603558dceb76ebe695a5cac274e417
2017-02-21 02:14:34 +00:00
James D. Forrester
d3a9f574c3 parserTests.txt: Replace implicit Bugzilla bug numbers with Phab ones
It's unreasonable to expect newbies to know that "bug 12345" means "Task T14345"
except where it doesn't, so let's just standardise on the real numbers.

Change-Id: I3eeffe40e0a752e1e3c79e65fa2fb556950d9a24
2017-02-21 02:14:18 +00:00
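
Illustration for 1e9c361960 and d3a9f574c3 above: the example in the messages ("bug 12345" means "Task T14345") implies a fixed offset of 2000 between Bugzilla and Phabricator numbers. The sketch below assumes that offset holds everywhere, which the messages themselves say is not always true; the function name is hypothetical.

```python
import re

# Assumed offset, taken from the "bug 12345" -> "T14345" example.
BUGZILLA_OFFSET = 2000


def bug_to_task(text):
    """Rewrite 'bug N' references as Phabricator-style 'T(N + 2000)'."""
    def replace(match):
        return "T" + str(int(match.group(1)) + BUGZILLA_OFFSET)

    return re.sub(r"\bbug (\d+)\b", replace, text, flags=re.IGNORECASE)


print(bug_to_task("Regression test for bug 12345"))
```

References that do not follow the offset would still need to be checked by hand.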
C. Scott Ananian
3a8a986e35 Don't bail on single-line definition list due to excess close tags.
When parsing a single line definition list, we track nested tags so that:

	; <b>foo:bar</b>: baz

breaks before `baz`, not between `foo` and `bar`.  But we currently bail
out of this algorithm entirely if we see a mismatched close tag.  We should
just ignore the unmatched tag, like Parsoid does.

Change-Id: I6306dcad6347abeb6ab001d35562f1ab9f374bd1
2017-02-17 16:34:55 -05:00
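
Illustration for 3a8a986e35 above: a simplified sketch of finding the split colon while tracking nested tags. It is not the parser's actual implementation; the tag regex, the function name, and the treatment of self-closing tags are assumptions made for the example.

```python
import re

# Simplified tag matcher: optional "/", a tag name, then anything up to ">".
TAG = re.compile(r"<(/?)([a-zA-Z][a-zA-Z0-9]*)[^>]*>")


def find_split_colon(line):
    """Return the index of the first ':' outside any open tag, or -1."""
    open_tags = []
    i = 0
    while i < len(line):
        match = TAG.match(line, i)
        if match:
            closing, name = match.group(1), match.group(2).lower()
            if match.group(0).endswith("/>"):
                pass  # self-closing tag: nothing to track
            elif not closing:
                open_tags.append(name)
            elif open_tags and open_tags[-1] == name:
                open_tags.pop()
            # Otherwise: an unmatched close tag, ignored instead of bailing.
            i = match.end()
            continue
        if line[i] == ":" and not open_tags:
            return i
        i += 1
    return -1


term = "<b>foo:bar</b>: baz"
print(term[find_split_colon(term) + 1:])
```

In the sketch, an unmatched close tag such as the </i> in "foo</i>:bar" is skipped and the colon after it is still found.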
C. Scott Ananian
ee002d67c9 Protect -{...}- variant constructs in definition lists.
Given the wikitext:

	;-{zh-cn:AAA;zh-tw:BBB}-

Prevent `doBlockLevels` from trying to split the definition list at the
embedded colon and using `AAA;zh-tw:BBB}-` as the `<dd>` portion.

Bug: T153135
Change-Id: I3a4d02f1fbd0d0fe8278d6b7c66005f0dd3dd36b
2017-02-17 15:52:44 -05:00
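
Illustration for ee002d67c9 above: a minimal sketch of skipping colons inside -{...}- when looking for the definition list split point. This is not the doBlockLevels code; the function name and the nesting behavior are assumptions.

```python
def find_colon_outside_variants(text):
    """Return the index of the first ':' not inside -{...}-, or -1."""
    depth = 0
    i = 0
    while i < len(text):
        if text.startswith("-{", i):
            depth += 1
            i += 2
        elif depth > 0 and text.startswith("}-", i):
            depth -= 1
            i += 2
        elif depth == 0 and text[i] == ":":
            return i
        else:
            i += 1
    return -1


# The colons inside the variant construct are not split points.
print(find_colon_outside_variants("-{zh-cn:AAA;zh-tw:BBB}-"))
```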
Arlo Breault
e9ee631c88 Don't test for tidy class on hhvm
* It doesn't support the OO interface.

* Should make the tidy tests run in CI.

Bug: T157730
Change-Id: Ied80f70b7cafcf64d736cb0eeb1a30d52c1d7921
2017-02-13 14:54:48 -08:00
Kunal Mehta
1deada4f7f parser test editor: Fix emitting of !! hooks
The first newline was missing so a block like:
 !! hooks
 source
 !! endhooks

would turn into:
 !! hookssource
 !! endhooks

Change-Id: I2a4c5e52050d55fb0c9b4f5d0494eb00e34b233c
2017-01-31 03:12:15 +00:00
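
Illustration for 1deada4f7f above: a tiny sketch of emitting a section with the newline after the header line in place. The function is hypothetical and is not the parser test editor's code.

```python
def emit_section(name, body):
    """Emit a '!! name' block; the newline after the header is the fix."""
    return "!! " + name + "\n" + body.rstrip("\n") + "\n!! end" + name + "\n"


print(emit_section("hooks", "source"), end="")
```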
Kunal Mehta
54666773bf parserTests: Avoid using <big> for adoption agency algorithm test
The behavior of <big> may change in the future, cf. T154067.

Change-Id: I817894c25cab96a491028fe2a9443140ea1d6e97
2017-01-30 01:52:23 -08:00
jenkins-bot
90f0807c54 Merge "Update html/php clauses for subpage parserTests." 2017-01-25 19:08:23 +00:00
jenkins-bot
4a07505402 Merge "Sync up with Parsoid parserTests." 2017-01-25 18:37:08 +00:00
C. Scott Ananian
121962101f Sync up with Parsoid parserTests.
This now aligns with Parsoid commit 643d5392bcf4dfebf906102627c51e8a608125bf

Change-Id: I4d7dc7378ca7cfdb3919f33959f58eb5c4d88ca8
2017-01-25 12:51:14 -05:00
Liangent
4312c2ad40 Prevent unexpected }- in converter output
Previously for input -{<span title="-{X}-">X</span>}-, the converter
sees -{<span title="-&#123;X}-">A</span>}-, so <span title="-&#123;X
becomes the content in the first block, and a stray }- is left to output.

Now, the converter sees -{<span title="-&#123;X&#125;-">A</span>}- with
this change. In further processing, the span tag may be parsed and have
its title attrib converted. For cases where the content is not processed
further (eg. "R" = raw flag), "-{X}-" is left as is in the attrib, which
is not so ideal, but at least it's better than the original extra }-
outside the whole tag.

Change-Id: Idbaaf53f914f362e5b8cc9fad02a524f8d591bb7
2017-01-25 17:38:08 +00:00
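
Illustration for 4312c2ad40 above: a sketch of the escaping described in the message, where both the opening and the closing brace of a variant construct inside an attribute value are replaced by character references. The function name is hypothetical and this is not the converter's actual code.

```python
def guard_attribute_value(value):
    """Escape -{ and }- so the converter does not see them in an attribute."""
    return value.replace("-{", "-&#123;").replace("}-", "&#125;-")


# Before the change only the opening brace was escaped, leaving a stray }-.
print(guard_attribute_value("-{X}-"))
```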
C. Scott Ananian
bb3a0c8251 Update html/php clauses for subpage parserTests.
The NS_USER namespace needs to have subpages enabled before these tests
work on the PHP parser.

Change-Id: I8e5e3bbd0dea6fc12f3b9ff9feeb58812fc51af1
2017-01-25 10:16:06 -05:00
This, that and the other
79cbecae83 Parser: Trim leading whitespace from links before checking for leading :
The leading spaces on the link only cause us problems, such as for the
$noforce check 20 lines later.

Bug: T129218
Change-Id: I93a8da1f73b38fa3da362f8f27479b3039ed3f13
2017-01-21 23:55:02 +00:00
Fomafix
ce6f7faadd Remove trailing empty lines in PHP
Performed using
find . -name \*.php -exec sed -i -e :a -e '/./,$!d;/^\n*$/{$d;N;};/\n$/ba' {} \;

Change-Id: I5d0627f94c73690cf3a8a453539c22c760c2aa60
2017-01-16 22:06:43 +01:00
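
Illustration for ce6f7faadd above: as far as the sed expression can be read, it drops empty lines at the start of a file (/./,$!d) as well as at the end. A rough Python equivalent for one file's content, assuming LF line endings and a file that ends with a newline; the function name is hypothetical.

```python
def strip_blank_edge_lines(content):
    """Drop empty lines at both ends, keeping one final newline."""
    stripped = content.strip("\n")
    return stripped + "\n" if stripped else ""


print(repr(strip_blank_edge_lines("<?php\necho 1;\n\n\n")))
```

Lines containing only spaces or tabs are left alone here, as they are not empty.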
TTO
7a5aaad742 Additional test for {{PAGELANGUAGE}} magic word
Make sure it returns the default content language on pages where the
language is not explicitly set.

Bug: T59603
Change-Id: I7b1437bf1650166c8be77e5bd84181c577961f27
2017-01-07 11:28:12 +00:00
divadsn
e8d13ad1a2 Add a new {{PAGELANGUAGE}} variable for use in wikitext
Returns the language code of the page being parsed.

Bug: T59603
Change-Id: I229edd6251cf1120b3395d1811dbb9d96d9cd8ee
2017-01-07 02:03:53 +00:00
Arlo Breault
6b97c82272 Sync up with Parsoid parserTests.
This now aligns with Parsoid commit 974dd5b3d70acf59bb15e057dc37e3702195f3e0

Change-Id: Ia45d8e2539e7fec23503706be1b40a6eaf1f5888
2017-01-05 13:20:36 -08:00
C. Scott Ananian
046c463635 Revert "Protect language converter markup in the preprocessor."
This effectively reverts commit 2877402276 in
order to unblock the deploy train.  The underlying behavior might not be
incorrect, but it was unexpected.

Bug: T153761
Change-Id: Ifc9c7cf3482dd5d222ff4da24a6d4cc401e9d965
2017-01-03 17:23:28 -05:00
jenkins-bot
12846c08cb Merge "Sync up with Parsoid parserTests." 2016-12-22 17:56:27 +00:00
Subramanya Sastry
c67a69411e Sync up with Parsoid parserTests.
This now aligns with Parsoid commit 471eb1031510e06a82a18007222396934b34264f

Change-Id: I676b656c9d45c73cfb4e8fdddd23d9c3c85845cd
2016-12-22 11:29:32 -06:00
C. Scott Ananian
23fd64afde Don't parse language converter markup as a cell parameter in tables.
Bug: T153140
Change-Id: I799363727162a0f337652b26bb69fe35c61a8553
2016-12-22 11:09:50 -05:00
C. Scott Ananian
5b050be643 Allow HTML tags in LanguageConverter output.
A "remove HTML tags to avoid disrupting the layout" block is removed
(previously added in f16d1e4ed7).

This is a follow-up to I9b099273203482ffb570a5654d8ba50c833e526d.

Bug: T54192
Change-Id: I565fac58b3b0da7bfaedf64f5001c364f52e2244
2016-12-22 01:32:24 +00:00
C. Scott Ananian
ae934157b2 Protect -{...}- variant constructs in galleries
This also protects naked external links, which are internally surrounded by
`-{R|...}-` by LanguageConverter::markNoConversion.

Originally found in failed tests in I7fa2d85d6.

Bug: T54190
Change-Id: I9b099273203482ffb570a5654d8ba50c833e526d
2016-12-20 22:14:37 +00:00
C. Scott Ananian
51d54b4b91 Protect -{...}- variant constructs in images.
A protected version of explode is factored out as
`StringUtils::delimiterExplode`, since it will be used in follow-up
patches in this series.  The `delimiterExplode` implementation creates
an intermediate array of the exploded results, which is reasonable as
the number of image options is small; but since an Iterator is
returned the implementation can be upgraded in the future (at the cost
of additional complexity) to avoid this.  The additional code in that
case would be similar to ExplodeIterator.

Bug: T146305
Change-Id: I1327685e9e8c07ef476dceaa6f6dae4ba40989ef
2016-12-20 22:08:36 +00:00
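
Illustration for 51d54b4b91 above: a sketch of a "protected explode" in the spirit described, splitting on a separator except inside delimited regions. It builds an intermediate list and returns an iterator over it, as the message describes; the parameter names and the nesting behavior are assumptions, and this is not the StringUtils::delimiterExplode source.

```python
def delimiter_explode(start, end, separator, subject):
    """Split subject on separator, ignoring separators between start and end."""
    parts = []
    part_start = 0
    depth = 0
    i = 0
    while i < len(subject):
        if subject.startswith(start, i):
            depth += 1
            i += len(start)
        elif depth > 0 and subject.startswith(end, i):
            depth -= 1
            i += len(end)
        elif depth == 0 and subject.startswith(separator, i):
            parts.append(subject[part_start:i])
            i += len(separator)
            part_start = i
        else:
            i += 1
    parts.append(subject[part_start:])
    # Intermediate list, exposed only as an iterator so callers do not
    # depend on the list and the internals can change later.
    return iter(parts)


print(list(delimiter_explode("-{", "}-", "|", "thumb|-{R|foo}-|caption")))
```

Returning an iterator keeps the option of a lazy implementation open without changing callers.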
jenkins-bot
dfaa26a7b1 Merge "Add dotall modifier to EDITSECTION_REGEX" 2016-12-18 02:53:47 +00:00
C. Scott Ananian
2877402276 Protect language converter markup in the preprocessor.
This ensures that `{{echo|-{R|foo}-}}` is parsed correctly as
a template invocation with a single argument, not as two separate
arguments split by the `|`.

Bug: T146304
Change-Id: I709d007c70a3fd19264790055042c615999b2f67
2016-12-15 23:50:44 +00:00
C. Scott Ananian
caebba387a Sync up with Parsoid parserTests.
This now aligns with Parsoid commit 73798df0632e10313b82987d0b99e93c73407ca7

Change-Id: Ia0e511311eb05276617cc7bdff72b07347591ca3
2016-12-14 15:00:38 -05:00