30 commits

Author SHA1 Message Date
Thiemo Kreuz
1fc8d79ac6 Remove documentation that literally repeats the code
For example, documenting the method getUser() with "get the User
object" does not add any information that's not already there.
But I have to read the text first to understand that it doesn't
document anything that's not already obvious from the code.

Some of this is from a time when we had a PHPCS sniff that was
complaining when a line like `@param User $user` doesn't end
with some descriptive text. Some users started adding text like
`@param User $user The User` back then. Let's please remove
this.

Change-Id: I0ea8d051bc732466c73940de9259f87ffb86ce7a
2020-10-27 19:20:26 +00:00
C. Scott Ananian
b8abd8e01e Hard-deprecate Sanitizer::escapeIdReferenceList()
Code search: https://codesearch.wmcloud.org/search/?q=escapeIdReferenceList&i=nope&files=&repos=

Followup-To: Ifce057b0c436eabec310f812394e86ee7123e7c8
Change-Id: I18f2c47ad6b4f6256d1727f24314cc3c5e13f466
2020-08-20 19:59:13 -04:00
C. Scott Ananian
36da9ef204 Remove all methods of MWTidy except for MWTidy::tidy()
These methods were either @internal or deprecated in 1.35

Bug: T198214
Change-Id: Ica1d1fdfd2a23a2040eac90c71f6211a4513c916
2020-08-17 18:15:37 +00:00
addshore
959bc315f2 MediaWikiTestCase to MediaWikiIntegrationTestCase
The name change happened some time ago, and I think it's
about time to start using the new name!
(Done with a find and replace)

My personal motivation for doing this is that I have started
trying out VS Code as an IDE for MediaWiki development, and
right now it doesn't appear to handle PHP class aliases very
well, or at all.

Change-Id: I412235d91ae26e4c1c6a62e0dbb7e7cf3c5ed4a6
2020-06-30 17:02:22 +01:00
C. Scott Ananian
86fb3b14af Use 'list of allowed attributes' in Sanitizer, instead of 'whitelist'
Bug: T254646
Change-Id: I48d1a5b318c3511fae94291d84f65e5c9cd05a27
2020-06-10 15:58:39 -04:00
Thiemo Kreuz
6aa6d10e86 Replace all call_user_func(_array) in all tests
There is native support for all of this now in PHP, thanks to changes
and additions that have been made in later versions. There should be no
need any more to ever use call_user_func() or call_user_func_array().

Reviewing this should be fairly easy: Because this patch touches
exclusively tests, but no production code, there is no such thing as
"insufficient test coverage". As long as CI goes green, this should be
fine.

Change-Id: Ib9690103687734bb5a85d3dab0e5642a07087bbc
2020-06-06 18:41:20 +02:00
C. Scott Ananian
05bc687111 Use HTML5 semantics for self-closed HTML tags in wikitext
This behavior has been deprecated, with a tracking category, since
1.28.  Time to remove the temporary parameter added to
Sanitizer::removeHTMLtags() and (finally) tweak the behavior to match
HTML5.

Bug: T134423
Change-Id: I5c725175d05854139c95a2b3d8d35ff63cb6707b
2020-05-27 11:59:18 -04:00
C. Scott Ananian
83a22b7fcd Remove codepaths which ran parser in 'untidy' mode
Disabling tidy has been deprecated since 1.33.  This cleans up the code
paths which still used untidy output.

Bug: T198214
Change-Id: I821ef3b8f59b272d983583d407b2f0794fe1e791
2020-04-13 21:34:04 +00:00
Brian Wolff
0bdce21381 Make id attributes not include ascii whitespace per spec
HTML5 says id attributes should not have whitespace, where
whitespace is defined as LF, CR, FF, TAB or SPACE (oddly enough
VT does not count). Firefox in my testing actually was fine with
these except CR. Nonetheless we should follow the spec, so this converts
these whitespace characters to _. I don't think this will
cause any back-compat issues, since it's very hard to produce these
characters in wikitext (other than space, which was already
being converted): doing so basically requires either Lua or HTML
entities (and FF seems to be impossible).

Bug: T238385
Depends-On: Ie6fa40798f06a358f6082110b4d8cc0028c80321
Change-Id: Ie2b7c9429691e2c491c3359d5b400d8f078aa789
2020-02-25 05:27:33 -08:00
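
The whitespace rule described in 0bdce21381 can be illustrated with a short Python sketch. This is not the Sanitizer's PHP code, and the function name is hypothetical; it only replaces the five characters the message lists (LF, CR, FF, TAB, SPACE) with an underscore and leaves VT alone.

```python
import re

# ASCII whitespace as listed in the commit message: TAB, LF, FF, CR, SPACE.
# VT (\x0b) is deliberately not part of the set.
_ID_WHITESPACE = re.compile(r"[\t\n\f\r ]")


def replace_id_whitespace(value):
    """Hypothetical helper: swap each ASCII whitespace character for '_'."""
    return _ID_WHITESPACE.sub("_", value)


print(replace_id_whitespace("a b\tc"))  # a_b_c
```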
jenkins-bot
9b1445e1c5 Merge "Escape % sign if form valid percent-encoding in fragment identifiers"
2020-02-25 11:33:38 +00:00
Brian Wolff
28d44262aa Escape % sign if form valid percent-encoding in fragment identifiers
Currently, if a headline combines a valid percent-encoding with an
unescaped character that is reserved in URLs, the TOC link does not
work. E.g. ==`%41== needs #`%2541 but we currently
generate #`%41 which matches ==`A== instead.

Tested in Firefox and Chrome.

Bug: T238385
Change-Id: Ice2bbf79bed612d488ed6feb7510035e9dfb33af
2020-02-15 02:54:32 -08:00
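
As a hedged illustration of the rule in 28d44262aa (a Python sketch with a hypothetical function name, not the actual Sanitizer code), a % that already forms a valid percent-encoding can be rewritten as %25 so that it survives as a literal character:

```python
import re

# A '%' followed by two hex digits already looks like a percent-encoded byte.
_VALID_PERCENT = re.compile(r"%(?=[0-9A-Fa-f]{2})")


def escape_valid_percent(fragment):
    """Hypothetical helper: turn such a '%' into '%25' so it stays literal."""
    return _VALID_PERCENT.sub("%25", fragment)


print(escape_valid_percent("`%41"))  # `%2541
```

A % that is not followed by two hex digits (for example in "100%") is left untouched by this sketch.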
James D. Forrester
ea8195fb5f Follow-up 0437877: SanitizerTest: Fix whitespace, test false state too
Change-Id: Ibec845224493c409049b5212812e737d63abaad7
2020-02-04 11:06:45 -08:00
C. Scott Ananian
0437877656 Whitelist aria-hidden attribute in Sanitizer
Bug: T204618
Change-Id: I34b9b729eccd7658d5165b6661e5fd45a733b36c
2020-01-28 21:54:16 +00:00
C. Scott Ananian
2d4aced658 Remove Sanitizer::attributeWhitelist()/setupAttributeWhitelist()
These methods were deprecated in 1.34 and should never have been public
in the first place.  New private methods have replaced them.

Code search:
https://codesearch.wmflabs.org/deployed/?q=attributeWhitelist%5C%28&i=nope&files=&repos=

Change-Id: I363530b7edaced77f2c5b06721b1930d85e2e9dc
2020-01-25 13:06:19 -05:00
Max Semenik
48a323f702 tests: Add explicit return type void to setUp() and tearDown()
Bug: T192167
Depends-On: I581e54278ac5da3f4e399e33f2c7ad468bae6b43
Change-Id: I3a21fb55db76bac51afdd399cf40ed0760e4f343
2019-10-30 14:31:22 -07:00
James D. Forrester
83d76f4cb5 phpcs: Enable MediaWiki.Commenting.PhpunitAnnotations.ForbiddenExpectedException* and make pass
Change-Id: I63f97497714a32236268be6965c5e181dade6c58
2019-10-14 12:48:48 -07:00
Umherirrender
5bd311b1a2 Add public as visibility in tests folder
Add public, protected or private to functions missing a visibility.
Enable the phpcs sniff for the tests folder.

Change-Id: Ibefce76ea9984c47e08c94889ea2eafca7565e2c
2019-10-10 21:55:37 +02:00
Amir Sarabadani
d23af35764 Unset all globals unneeded for unit tests, assert correct directory
* Unset globals to avoid tests that look like unit tests but actually rely on
  globals
* move some tests out of unit directory so that the test suite will pass.
* Assert that tests which extend MediaWikiUnitTestCase are in a directory with
  "/unit/" in its path name

Depends-On: I67b37b1bde94eaa3d4298d9bd98ac57995ce93b9
Depends-On: I90921679518ee95fe393f8b1bbd9134daf0ba032
Bug: T87781
Change-Id: I16691fc8ac063705ba0c2bc63b96c4534ca8660b
2019-07-09 14:09:29 -04:00
Amir Sarabadani
7ec9745444 Split SanitizerTest to unit and integration tests
Out of 150 tests in SanitizerTest.php, 100 are pure unit tests.
They are moved to the new file in the new structure; the rest stay.

Change-Id: I366d37607abff4bcd624a56fb8b2299729fbc088
2019-07-08 09:48:07 +02:00
C. Scott Ananian
bda42cef3c Deprecate Sanitizer::setupAttributeWhitelist/attributeWhitelist
These methods should be made private in the next release, but
hard-deprecate them for 1.34.

Tweak the return value of the attribute whitelist to be an
associative rather than a sequential array, which makes the
lookup of allowed attributes more efficient and avoids an
array_flip for every html element sanitized.

Bug: T221677
Change-Id: I17d734937accec6c2679dbe17328cf9554bd556a
2019-06-20 14:42:20 -04:00
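
The lookup change described in bda42cef3c can be sketched in Python, with a dict standing in for a PHP associative array. The attribute names and the function are hypothetical and chosen only for illustration. The point is that the name-keyed mapping is built once, so each check is a key lookup rather than a per-element flip or scan of a sequential list.

```python
# Hypothetical attribute names, for illustration only.
ALLOWED_SEQUENTIAL = ["class", "id", "lang", "title"]

# Built once: attribute name -> True, so membership is a hash lookup.
ALLOWED_ASSOCIATIVE = {name: True for name in ALLOWED_SEQUENTIAL}


def filter_attributes(attributes):
    """Keep only attributes whose name is a key of the prebuilt mapping."""
    return {k: v for k, v in attributes.items() if k in ALLOWED_ASSOCIATIVE}


print(filter_attributes({"id": "x", "onclick": "evil()"}))  # {'id': 'x'}
```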
Max Semenik
214b37ff07 SECURITY: blacklist CSS var()
Bug: T208881
Change-Id: I9a4ced2bc47eb5f96cf35e693bf5261c48acb126
2019-06-06 16:15:55 +00:00
Erik Bernhardson
aef02d516d Improve RemexStripTagHandler working with tables
HTML, generated by some infoboxes and perhaps other places, gets
stripped in a way that merges words together that should not be
merged. Add tr, th, and td to the list of tags that should force
word separation.

Bug: T218001
Change-Id: Ib374339628b1f543ea4e07f24aa3e3b76f3117b5
2019-03-14 13:11:59 -07:00
C. Scott Ananian
6db35b3c98 Remove most support for configuring Tidy, including Raggett
Remex is pure PHP so there is no reason to use an external tidy any
more. Configuration variables and implementation classes were
deprecated in 1.32 or earlier.  We've kept only $wgTidyConfig
which can be used for experimental features or debugging Remex.

Bug: T198214
Change-Id: I99d48f858d97b6e1d1e6cd76a42c960cc2c61f9f
2018-11-15 12:22:06 -05:00
C. Scott Ananian
54ac31f94d Hard deprecate codepaths where tidy is disabled
Future parsers will not support the output generated with tidy disabled.

Parser tests using untidied output will also be deprecated (and
rewritten) in a follow-up patch.

No new release notes necessary since user-visible tidy configuration
was deprecated previously (in 1.32), and individual methods which had
disabled tidy during execution were individually release-noted as they
were updated.

Bug: T198214
Depends-On: I0f417f75a49dfea873e9a2f44d81796a48b9f428
Depends-On: If5c619cdd3e7f786687cfc2ca166074d9197ca11
Change-Id: I592e0e0dfef7d929f05c60ffe4d60e09725b39cc
2018-11-05 18:49:16 +00:00
Erik Bernhardson
0d779c1ac6 Preserve whitespace in search index text content
Certain html tags imply a word break, but our html stripping doesn't
understand that at all. Adjust the html stripping to inject whitespace
for all block level tags (per MDN) along with the <br> element.

Bug: T195389
Change-Id: I9fbfac765ea88628e4f9b2794fb54e1cd0060203
2018-09-14 11:10:35 -07:00
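
The idea in 0d779c1ac6 can be sketched with Python's standard html.parser. This is not RemexStripTagHandler. The tag set is a small illustrative subset rather than MDN's full block-level list, and the class and function names are hypothetical.

```python
from html.parser import HTMLParser

# Illustrative subset only; the commit refers to all block-level tags plus <br>.
BREAK_TAGS = {"p", "div", "li", "br"}


class TextWithBreaks(HTMLParser):
    """Collects text content, emitting a space wherever a break tag occurs."""

    def __init__(self):
        super().__init__()
        self.parts = []

    def handle_starttag(self, tag, attrs):
        if tag in BREAK_TAGS:
            self.parts.append(" ")

    def handle_endtag(self, tag):
        if tag in BREAK_TAGS:
            self.parts.append(" ")

    def handle_data(self, data):
        self.parts.append(data)


def strip_tags_keep_word_breaks(html):
    parser = TextWithBreaks()
    parser.feed(html)
    parser.close()
    # Collapse runs of whitespace so injected spaces do not pile up.
    return " ".join("".join(parser.parts).split())


print(strip_tags_keep_word_breaks("<p>foo</p><p>bar</p>"))  # foo bar
```

Inline tags such as <b> are not in the set, so "foo<b>bar</b>" still yields a single word in this sketch.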
James D. Forrester
846f4f58f5 Remove $wgExperimentalHtmlIds and related code, deprecated in 1.30
Bug: T139744
Change-Id: Ia15d5ab6e7637fd40d5c3399822a3dbeb7b383b5
2018-05-01 14:34:02 -07:00
Kunal Mehta
2ab7ae9d24 Add @covers for RemexStripTagHandler
This internal class is only used by Sanitizer::stripAllTags().

Change-Id: Ib913ee14524539216305da7e3183c07ab7d72cb5
2018-02-05 21:15:52 -08:00
Kunal Mehta
546980e537 Add @covers tags to parser tests
Change-Id: I7bce04bef5e981fd203ad819882482e72ca3f61b
2017-12-24 23:29:00 -08:00
Roan Kattouw
ddb4913f53 Use Remex in Sanitizer::stripAllTags()
Using a real HTML tokenizer fixes bugs when < or > appear in attribute
values. The old implementation used delimiterReplace(), which didn't
handle this case:

    > print Sanitizer::stripAllTags( '<p data-foo="a&lt;b>c">Hello</p>' );
    c">Hello

We also can't use PHP's built-in strip_tags() because it doesn't handle
<?php and <? correctly:

    > print strip_tags('1<span class="<?php">2</span>3');
    1
    > print strip_tags('1<span class="<?">2</span>3');
    1

Bug: T179978
Change-Id: I53b98e6c877c00c03ff110914168b398559c9c3e
2017-11-15 17:31:31 -08:00
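
The failure mode described in ddb4913f53 can be reproduced with a Python sketch. The regex below is only a stand-in for delimiter-based stripping (it is not delimiterReplace()), and html.parser stands in for a real tokenizer such as Remex. Both function names are hypothetical.

```python
import re
from html.parser import HTMLParser


def strip_tags_naive(html):
    """Delimiter matching: drop everything between '<' and the next '>'."""
    return re.sub(r"<[^>]*>", "", html)


class _TextOnly(HTMLParser):
    """Keeps character data and ignores every tag."""

    def __init__(self):
        super().__init__()
        self.parts = []

    def handle_data(self, data):
        self.parts.append(data)


def strip_tags_tokenized(html):
    parser = _TextOnly()
    parser.feed(html)
    parser.close()
    return "".join(parser.parts)


sample = '<p data-foo="a&lt;b>c">Hello</p>'
print(strip_tags_naive(sample))      # c">Hello
print(strip_tags_tokenized(sample))  # Hello
```

The naive version stops at the first > it sees, which here sits inside the quoted attribute value. The tokenizer-based version knows it is inside an attribute and does not.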
Roan Kattouw
7980e38a84 Move Sanitizer.php to includes/parser/
Change-Id: Id08d91c747ec77d715459b89b03eee247ccd4e1b
2017-11-15 15:16:41 -08:00
Renamed from tests/phpunit/includes/SanitizerTest.php