(Previously done in f51d0d9a81 and
reverted in 543f46e9c08e0ff8c5e8b4e917fcc045730ef1bc.)
I think it's saner to treat this as invalid syntax, and output the
mismatched tag code verbatim. The current behavior is particularly
annoying for <ref> tags, which often swallow everything afterwards.
This does not affect HTML tags, though. Assuming Tidy is enabled, they
are still auto-closed at the end of the page content. (For tags that
"shadow" an HTML tag name, this results in the tag being treated as an
HTML tag. This currently only affects <pre> tags: if unclosed, they
are still displayed as preformatted text, but without suppressing
wikitext formatting.)
It also does not affect <includeonly>, <noinclude> and <onlyinclude>
tags. Changing this behavior now would be too disruptive to existing
content, and is the reason why the previous attempt was reverted. (They
are already special-cased enough that this isn't too weird, for example
mismatched closing tags are hidden.)
Related to T17712 and T58306. I think this brings the PHP parser closer
to Parsoid's interpretation.
It reduces performance somewhat in the worst case, though. Testing with
https://phabricator.wikimedia.org/F3245989 (a 1 MB page starting with
3000 opening tags of 15 different types), parsing time rises from
~0.2 seconds to ~1.1 seconds on my setup. We go from O(N) to O(kN),
where N is bytes of input and k is the number of types of tags present
on the page. Maximum k shouldn't exceed 30 or so in reasonable setups
(depends on installed extensions, it's 20 on English Wikipedia).
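
The O(kN) bound can be illustrated with a minimal Python sketch. This is
not the preprocessor code: render_tags, the bracketed output format and
the simplified tag syntax (lowercase names, no attributes, no
self-closing tags) are assumptions made for illustration. The sketch
remembers tag names for which no closing tag was found, so each of the
k tag names causes at most one failed scan to the end of the input.

```python
import re

OPEN_TAG = re.compile(r"<([a-z]+)>")


def render_tags(text, tag_names):
    """Return (output, failed_scans) for a simplified tag scanner."""
    out = []
    no_close_after = {}  # tag name -> position after which no closing tag exists
    failed_scans = 0
    i = 0
    while i < len(text):
        m = OPEN_TAG.match(text, i)
        if not m or m.group(1) not in tag_names:
            out.append(text[i])
            i += 1
            continue
        name, body_start = m.group(1), m.end()
        close = -1
        if name not in no_close_after:
            # Full forward scan; this is the O(N) step paid once per tag name.
            close = text.find(f"</{name}>", body_start)
            if close == -1:
                failed_scans += 1
                no_close_after[name] = body_start
        if close == -1:
            # Unclosed tag: emit the opening tag verbatim and resume after it.
            out.append(m.group(0))
            i = body_start
        else:
            out.append(f"[{name}:{text[body_start:close]}]")
            i = close + len(name) + 3  # skip past "</name>"
    return "".join(out), failed_scans
```

With input such as "<ref><ref><ref><pre><pre>" and tag names ref and
pre, the text comes back unchanged after two failed scans, one per tag
name, rather than one per opening tag.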
Change-Id: Ide8b034e464eefb1b7c9e2a48ed06e21a7f8d434
This reverts commit f51d0d9a81.
Breaks templates with non-closed </noinclude> tags, which
were previously acceptable.
Bug: T125754
Change-Id: I8bafb15eefac4e1d3e727c1c84782636d8b82c2b
I think it's saner to treat this as invalid syntax, and output the
mismatched tag code verbatim. The current behavior is particularly
annoying for <ref> tags, which often swallow everything afterwards.
This does not affect HTML tags, though. Assuming Tidy is enabled, they
are still auto-closed at the end of the page content.
Related to T17712 and T58306. I think this brings the PHP parser closer
to Parsoid's interpretation.
It reduces performance somewhat in the worst case, though. Testing with
https://phabricator.wikimedia.org/F3245989 (a 1 MB page starting with
3000 opening tags of 15 different types), parsing time rises from
~0.2 seconds to ~1.1 seconds on my setup. We go from O(N) to O(kN),
where N is bytes of input and k is the number of types of tags present
on the page. Maximum k shouldn't exceed 30 or so in reasonable setups
(depends on installed extensions, it's 20 on English Wikipedia).
To consider:
* Should we keep previous behavior for unclosed <includeonly> /
<noinclude>? This would be particularly disruptive for these if
someone relied on the old behavior, and they're already
special-cased in places.
* Unclosed <pre> tags are now treated as HTML tags, and are still
displayed as preformatted text, but without suppressing wikitext
formatting.
Change-Id: Ia2f24dbfb3567c4b0778761585e6c0303d11ddd0
Changed some old bugzilla links to new phabricator links in comments,
test data and error message. This reduces the need for redirects from
old bugzilla to new phabricator from our source code.
Change-Id: Id98278e26ce31656295a23f3cadb536859c4caa5
Originally, only spaces were allowed around comments. Now tabs are
allowed as well. This should affect very few pages, but it improves
predictability and keeps the PHP preprocessor consistent with Parsoid.
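
As a rough illustration, the Python sketch below compares the two rules.
It assumes a simplified model in which a comment that sits alone on its
line is dropped together with that line. The pattern names and
strip_comment_lines are hypothetical, and the real preprocessor handles
more cases than this.

```python
import re

# Old rule: only spaces may surround the comment on its line.
SPACES_ONLY = re.compile(r"^ *<!--.*?--> *\n", re.MULTILINE | re.DOTALL)
# New rule: spaces or tabs may surround the comment.
SPACES_OR_TABS = re.compile(r"^[ \t]*<!--.*?-->[ \t]*\n", re.MULTILINE | re.DOTALL)


def strip_comment_lines(text, pattern=SPACES_OR_TABS):
    """Remove lines that contain nothing but a comment and allowed whitespace."""
    return pattern.sub("", text)
```

A line such as "\t<!-- note -->\t" is removed under the new rule but
left in place under the spaces-only rule.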
Change-Id: Icb3ff6eec08aaa83ae332d03c910c13995c9c9ee
Parser tests also included, test case and original patch supplied by
Bergi on bugzilla. Tested against the current version.
Change-Id: Id7ec4e694783dd0f682f65f39d8b9e59f82e58aa
Fix almost all occurrences of the following sniffs:
Generic.CodeAnalysis.UselessOverridingMethod.Found
Generic.Formatting.NoSpaceAfterCast.SpaceFound
Generic.Functions.FunctionCallArgumentSpacing.SpaceBeforeComma
Generic.Functions.OpeningFunctionBraceKernighanRitchie.BraceOnNewLine
Generic.PHP.LowerCaseConstant.Found
PSR2.Classes.PropertyDeclaration.ScopeMissing
PSR2.Files.EndFileNewline.TooMany
PSR2.Methods.MethodDeclaration.StaticBeforeVisibility
Change-Id: I96aacef5bafe5a2bca659744fba1380999cfc37d
Some classes extending MediaWikiTestCase did not call the parent setUp
method. We most probably always want to do so, since
MediaWikiTestCase::setUp() does garbage collection and might do more in
the future.
Change-Id: I68dde370a62c8f4a779836ca0c4ad06844fdc916
This commit depends on the introduction of
MediaWikiTestCase::setMwGlobals in change Iccf6ea81f4.
Various tests already set their globals, but forgot to restore
them afterwards, or forgot to call the parent setUp, tearDown...
Either way they won't have to anymore with setMwGlobals.
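
The stash-and-restore idea can be sketched in Python as an analogy. This
is not the MediaWikiTestCase implementation: CONFIG, GlobalsTestCase and
set_globals are hypothetical stand-ins for the PHP globals and for
setMwGlobals.

```python
import unittest

CONFIG = {"wgLanguageCode": "en"}  # hypothetical stand-in for PHP globals
_MISSING = object()  # marks keys that did not exist before the test


class GlobalsTestCase(unittest.TestCase):
    def setUp(self):
        super().setUp()
        self._stashed = {}

    def set_globals(self, **overrides):
        for key, value in overrides.items():
            # Remember only the first (original) value of each key.
            self._stashed.setdefault(key, CONFIG.get(key, _MISSING))
            CONFIG[key] = value

    def tearDown(self):
        # Restore every overridden key, removing keys the test introduced.
        for key, original in self._stashed.items():
            if original is _MISSING:
                CONFIG.pop(key, None)
            else:
                CONFIG[key] = original
        super().tearDown()
```

A test that overrides a value through set_globals needs no cleanup code
of its own, provided any setUp or tearDown it defines calls the parent
method.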
Consistent use of function characteristics:
* protected function setUp
* protected function tearDown
* public static function (provide..)
(Matching the function signature with PHPUnit/Framework/TestCase.php)
Replaces:
* public function (setUp|tearDown)\(
* protected function $1(
* \tfunction (setUp|tearDown)\(
* \tprotected function $1(
* \tfunction (data|provide)\(
* \tpublic static function $1\(
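
For illustration, the replacements listed above can be approximated in
Python. The patterns are adapted rather than copied: the line anchors
and the \w* that keeps the rest of a provider name are assumptions, and
RULES and normalize are hypothetical names.

```python
import re

RULES = [
    # public setUp/tearDown become protected
    (re.compile(r"public function (setUp|tearDown)\("),
     r"protected function \1("),
    # setUp/tearDown without a visibility keyword become protected
    (re.compile(r"^\tfunction (setUp|tearDown)\(", re.MULTILINE),
     r"\tprotected function \1("),
    # data providers without a visibility keyword become public static
    (re.compile(r"^\tfunction ((?:data|provide)\w*)\(", re.MULTILINE),
     r"\tpublic static function \1("),
]


def normalize(source):
    """Apply each rule in order to the PHP source text."""
    for pattern, replacement in RULES:
        source = pattern.sub(replacement, source)
    return source
```

Note that the third rule would also catch non-provider helpers whose
names start with "data", so the result of such a rewrite still needs a
manual check.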
Also renamed a few "data#", "provider#" and "provides#" functions
to "provide#" for consistency. This also removes confusion where
the /media tests had a few private methods called dataFile(),
which were sometimes expected to be data providers.
Fixes:
TimestampTest often failed due to a previous test setting a
different language (it tests "1 hour ago", so the language needs to
be set to English).
MWNamespaceTest became a lot cleaner now that it executes with
a known context. Though the now-redundant code that was removed
didn't work anyway because wgContentNamespaces isn't keyed by
namespace id, it had them as values...
FileBackendTest:
* Fixed: "PHP Fatal: Using $this when not in object context"
HttpTest
* Added comment about:
"PHP Fatal: Call to protected MWHttpRequest::__construct()"
(too much unrelated code to fix in this commit)
ExternalStoreTest
* Add an assertTrue as well, without it the test is useless
because regardless of whether wgExternalStores is true or false
it only uses it if it is an array.
Change-Id: I9d2b148e57bada64afeb7d5a99bec0e58f8e1561
Now that we have finally switched to PHP 5.3 for MW 1.20, we can get rid of the silly dirname(__FILE__) stuff :)
Change-Id: Id9b2c9cd2e678197aa81c78adced5d1d31ff57b1
Some preprocessors could return a native type from preprocessToObj.
PHP Fatal error: Call to a member function __toString() on a non-object in phase3/tests/phpunit/includes/parser/PreprocessorTest.php on line 119
Using same technique as ApiExpandTemplates to serialize the object tree back to XML, rather than asking for the DOM implementation's internal XML return function.
Have to also perform normalization on the test cases, as they aren't normalized to what libxml2 serializes. :P
Note that there are 4 test failures currently with Preprocessor_Hash, as it makes a separate <equals> element around = which doesn't appear to be in Preprocessor_Dom's output.
sed -i 's/<root><\(template\|tplarg\)>/<root><\1 lineStart=\\"1\\">/' phpunit/includes/parser/PreprocessorTest.php
sed -i 's/<root><\(template\|tplarg\)>/<root><\1 lineStart="1">/' parser/preprocess/*.expected
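
For readers less familiar with sed, the second substitution above is
roughly equivalent to this Python sketch. add_line_start is a
hypothetical helper name. Unlike sed without the g flag, which replaces
only the first match on each line, this version replaces every match in
the string.

```python
import re

ROOT_CHILD = re.compile(r"<root><(template|tplarg)>")


def add_line_start(xml):
    """Add lineStart="1" to a template or tplarg element directly after <root>."""
    return ROOT_CHILD.sub(r'<root><\1 lineStart="1">', xml)
```

Elements that do not immediately follow <root> are left untouched.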
Fixed method call in Preprocessor_Native.php.
Added support for tags containing spaces (r80025), following the same odd order-dependent behavior as the PHP preprocessors.
Extensions shouldn't rely on it. See http://thread.gmane.org/gmane.science.linguistics.wikipedia.technical/51496
As a result of these changes, there is much less worst-case lookahead now.
in_array.{c,h} are now unused.