This is a clarification of what already happens in practice for lists
generated from wikitext syntax, since that parsing happens
simultaneously.
Parsoid, for its part, does list handling prior to paragraph wrapping,
so it must make use of these definitions.
Further, this helps reduce paragraph wrapping in interstitial spacing of
lists from HTML syntax, as spec'd in the tests, though the possibility
isn't eliminated entirely.
The TOC generation code is altered to reduce the number of newlines
emitted in between list items, since those are now left intact.
Change-Id: I6888b6e8e6768b0737565b87924fefa5a06ebd18
When this part of BlockLevelPass::execute() encounters a block-level tag,
such as <hr>, one of $openMatch or $closeMatch will be truthy.
Without this patch, $this->closeParagraph() is unconditionally called in
this situation, which sets $this->inPre = false. If we're already inside a
<pre> tag, this makes the parser think we're no longer in a <pre>
environment, so it starts wrapping the <pre> tag's content in <p> tags as
if it were processing regular content.
We should only call $this->closeParagraph() in the case that (a) we are not
inside a <pre> tag, or (b) the block-level tag that is being opened is
itself a <pre> tag (in which case $preOpenMatch will be truthy, and
$this->inPre will have already been set to true).
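The condition above can be sketched in isolation. This is an illustrative Python rendering of the rule only, not the PHP code in BlockLevelPass::execute(); the function and parameter names are hypothetical stand-ins for $this->inPre and $preOpenMatch.

```python
def should_close_paragraph(in_pre: bool, pre_open_match: bool) -> bool:
    """Decide whether a block-level tag should trigger closeParagraph().

    Hypothetical helper: close the paragraph only when (a) we are not
    inside a <pre>, or (b) the block-level tag being opened is itself
    a <pre>.
    """
    return (not in_pre) or pre_open_match


# An <hr> seen while inside a hook-generated <pre> must not reset the
# <pre> state, so no paragraph close happens here.
print(should_close_paragraph(in_pre=True, pre_open_match=False))
```

Keeping the check as a single boolean expression mirrors the two cases listed above and makes the "inside <pre>, other block tag" case the only one that skips the call.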
This doesn't affect the parsing of <pre> tags that are written in wikitext,
since their content isn't parsed. It only affects hooks and the like that
return <pre> tags.
This doesn't solve the task T7718 that is mentioned in the code comment,
but if the testwiki test cases linked there are anything to go by, it
doesn't make the problem worse in any way.
This is required for Poem change I754f2e84f7d6efc0829765c82297f2de5f9ca149.
Change-Id: I469e633fc41d8ca73653c7e982c591092dcb1708
* Matmarex had implemented whitespace trimming for wikitext headings in b3dd3881.
* This patch extends this to wikitext list items and wikitext table cells.
* Updated RELEASE NOTES.
tests/parser/parserTests.txt:
* All whitespace removed in output of list items, table cells, and
headings. Removed corresponding whitespace in the input wikitext
except for a few tests where the whitespace is significant ("| +"
or "| -", for example).
* Updated output of html/parsoid sections as well.
* Added new tests to spec white-space trimming behavior.
tests/phpunit/*:
* Fixed a few tests that used whitespace in list items and table cells.
Bug: T157418
Change-Id: I8ea34c7ab893c0c125c81d810feeb3c581e4bba1
If <style> or <link> tags are by themselves on a line, don't wrap them
in <p> tags. But, at the same time, don't end an existing paragraph if
we find <style> or <link> in the middle (like we would if we just
treated them as block tags).
If <style> or <link> is on a line with other text, though, let it be
wrapped in a paragraph along with that other text.
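A rough sketch of the "by themselves on a line" test, in Python. The pattern and the function name are assumptions made for illustration and are not the regex used in the parser. The sketch only models the decision to skip <p> wrapping; it does not model leaving an already-open paragraph intact.

```python
import re

# Hypothetical pattern: a line made up solely of <style>...</style>
# elements and/or <link ...> tags, with optional whitespace around them.
ONLY_STYLE_OR_LINK = re.compile(
    r"^\s*(?:(?:<style\b[^>]*>(?:(?!</style>).)*</style>|<link\b[^>]*>)\s*)+$",
    re.IGNORECASE | re.DOTALL,
)


def takes_part_in_paragraph_wrapping(line: str) -> bool:
    """Return False when the line holds only <style>/<link> tags."""
    return ONLY_STYLE_OR_LINK.match(line) is None


for sample in ('<style>.a{}</style>', 'text <style>.a{}</style>'):
    print(sample, takes_part_in_paragraph_wrapping(sample))
```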
Bug: T186965
Change-Id: Ide4005842cdab537226aa538cb5f7d8e363ba95d
* An open <dt> (;) should be closed when we encounter a new <dd> (:)
char even if it is on a new line that has other nested lists inside.
* Tidy was hiding this PHP parser bug by closing the <dt> and opening
a <dd> when given this HTML: "<dl><dt>a<ul><li>b</li></ul></dt></dl>".
It generates "<dl><dt>a</dt><dd><ul><li>b</li></ul></dd></dl>".
However, HTML5 parsers such as RemexHTML, domino (used by Parsoid),
and those in browsers don't do this fixup.
* So, what I thought was a bug in RemexHTML turned out to be a bug
in the PHP parser that was being hidden by the use of Tidy.
* Added a regression test.
Bug: T175099
Change-Id: I6d5b225b82cecf9a43f23837ed8ec359b31aadad
With TimedMediaHandler in video.js mode, videos can be inline,
without a wrapper div.
Previously, in this mode two paragraphs where one contained a
video would end up merged into one paragraph, due to BlockLevelPass
matching "<track .../>" against "<tr" in its regexes.
Added \b to a couple of the regexes to protect against such errors,
and corrected a parser test case that had bad output listed, where
"<link .../>" matched against "<li".
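The effect of the added \b can be reproduced with a reduced pattern. The two patterns below are simplified stand-ins written for this illustration (the real regexes list many more tags), shown in Python rather than PHP.

```python
import re

# Simplified stand-ins for the block-tag patterns, before and after the fix.
WITHOUT_BOUNDARY = re.compile(r"<(tr|li)", re.IGNORECASE)
WITH_BOUNDARY = re.compile(r"<(tr|li)\b", re.IGNORECASE)


def looks_like_block_tag(pattern: re.Pattern, line: str) -> bool:
    """Return True if the pattern finds a block-level tag in the line."""
    return pattern.search(line) is not None


for tag in ('<track src="a.vtt"/>', '<link rel="x"/>', '<tr>', '<li class="a">'):
    print(tag,
          looks_like_block_tag(WITHOUT_BOUNDARY, tag),
          looks_like_block_tag(WITH_BOUNDARY, tag))
```

With the word boundary, "<tr" only matches when the tag name ends there, so "<track" and "<link" are no longer mistaken for "<tr" and "<li".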
Bug: T165817
Change-Id: I06e82b881f5ebddae5e7df7fb940adfa54f6b659
It's unreasonable to expect newbies to know that "bug 12345" means "Task T14345"
except where it doesn't, so let's just standardise on the real numbers.
Change-Id: I6f59febaf8fc96e80f8cfc11f4356283f461142a
When parsing a single line definition list, we track nested tags so that:
; <b>foo:bar</b>: baz
breaks before `baz`, not between `foo` and `bar`. But we currently bail
out of this algorithm entirely if we see a mismatched close tag. We should
just ignore the unmatched tag, like Parsoid does.
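A minimal Python sketch of the intended behavior, assuming a much simplified tag grammar. The function name and the regex are hypothetical and this is not the parser's actual algorithm; it only shows nested tags hiding a colon and an unmatched close tag being skipped instead of aborting the search.

```python
import re
from typing import Optional

# Simplified tag matcher: optional slash, tag name, anything up to ">".
TAG = re.compile(r"<(/?)([A-Za-z][A-Za-z0-9]*)[^>]*>")


def find_split_colon(text: str) -> Optional[int]:
    """Return the index of the first colon outside any open tag, or None."""
    open_tags = []
    i = 0
    while i < len(text):
        m = TAG.match(text, i)
        if m:
            is_close = m.group(1) == "/"
            name = m.group(2).lower()
            if is_close:
                if open_tags and open_tags[-1] == name:
                    open_tags.pop()
                # Otherwise the close tag is unmatched: ignore it, keep going.
            elif not m.group(0).endswith("/>"):
                open_tags.append(name)
            i = m.end()
            continue
        if text[i] == ":" and not open_tags:
            return i
        i += 1
    return None


term = "<b>foo:bar</b>: baz"
pos = find_split_colon(term)
print(term[:pos], "|", term[pos + 1:])
```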
Change-Id: I6306dcad6347abeb6ab001d35562f1ab9f374bd1
Given the wikitext:
;-{zh-cn:AAA;zh-tw:BBB}-
Prevent `doBlockLevels` from trying to split the definition list at the
embedded colon and using `AAA;zh-tw:BBB}-` as the `<dd>` portion.
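The idea can be sketched as a colon search that skips anything between "-{" and "}-". This Python function is a hypothetical illustration of that rule, not the code in `doBlockLevels`.

```python
from typing import Optional


def find_colon_outside_converter_markup(text: str) -> Optional[int]:
    """Return the index of the first colon not inside -{ ... }-, or None."""
    depth = 0
    i = 0
    while i < len(text):
        if text.startswith("-{", i):
            depth += 1
            i += 2
        elif depth > 0 and text.startswith("}-", i):
            depth -= 1
            i += 2
        elif text[i] == ":" and depth == 0:
            return i
        else:
            i += 1
    return None


# The colons inside the language-converter markup are not split points.
print(find_colon_outside_converter_markup("-{zh-cn:AAA;zh-tw:BBB}-"))
```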
Bug: T153135
Change-Id: I3a4d02f1fbd0d0fe8278d6b7c66005f0dd3dd36b
* Improve some comments
* In getCommon() use min() to improve readability
* Use strict equals where possible
* Use camel case in variable names
* Remove the "case 0" optimisation, which made sense in PHP 4, but those
are compile-time constants now and are presumably treated as such in both
PHP and HHVM.
* oLine -> inputLine, no idea what "o" stood for
* pos -> colonPos, lt -> ltPos: descriptive
* stack -> level, paragraphStack -> pendingPTag: they're not stacks
* Remove "m" prefix from member variable names
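For reference, a Python sketch of a min()-based common-prefix count, assuming getCommon() returns the length of the shared leading part of two list-prefix strings. That reading of getCommon() and the names below are assumptions; this is not the PHP code.

```python
def get_common(prefix1: str, prefix2: str) -> int:
    """Return the length of the common leading part of two strings."""
    shorter = min(len(prefix1), len(prefix2))
    i = 0
    while i < shorter and prefix1[i] == prefix2[i]:
        i += 1
    return i


# "**#" and "**:" share the first two list markers.
print(get_common("**#", "**:"))
```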
Change-Id: I6c1144c792ba3e1935be88a009a6d6c110d11090
It's independent of the rest of the Parser, but quite intrusive, with
its own instance variables and several private functions. It's also
pretty big (500 lines).
I removed a few functions from Parser here which were always marked
@private in the doc comment, but were inappropriately marked
"public" in the function declaration after migration to PHP 5. I grepped
core and deployed extensions and found no callers.
The helper functions are now all private, and the constructor is
private, with just a single public static entry point, reflecting
the self-contained nature of the module and its lack of hooks.
Change-Id: I1693ed48a9194719611b4afd9d989d44f0610f8d