Commit graph

18 commits

Author SHA1 Message Date
jenkins-bot
bd78869618 Merge "No yoda conditions" 2018-12-09 01:34:23 +00:00
Alangi Derick
19adaa6a4b parser: Fix PHPDoc annotations in parser module
Change-Id: I09680d72516f943051e86655b5fddf9ff2988e4e
2018-12-08 13:07:10 +00:00
Fomafix
3ee1560232 No yoda conditions
Replace
  if ( 42 === $foo )
by
  if ( $foo === 42 )

Change-Id: Ice320ef1ae64a59ed035c20134326b35d454f943
2018-11-21 17:54:39 +01:00
Arlo Breault
ce9f5c2546 Put <dt>/<dd>/<li> in the always-suppressing category of doBlockLevels
This is a clarification of what already happens in practice for lists
generated from wikitext syntax, since that parsing happens
simultaneously.

Parsoid, for its part, does list handling prior to paragraph wrapping,
so must make use of these definitions.

Further, this helps reduce paragraph wrapping in interstitial spacing of
lists from HTML syntax, as spec'd in the tests, though the possibility
isn't eliminated entirely.

The TOC generation code is altered to reduce the number of newlines
emitted in between list items, since those are now left intact.

Change-Id: I6888b6e8e6768b0737565b87924fefa5a06ebd18
2018-07-13 12:40:49 -04:00
Arlo Breault
1f907d500a Cleanup the element matches in doBlockLevels a bit
The MARKER_PREFIX is removed since `unstripGeneral()` happens before we
get here.

Change-Id: Ic668784fd8bbaa8395cd5449c83a993abda141eb
2018-04-25 13:22:40 -04:00
jenkins-bot
2384948256 Merge "Fix parsing of <pre> tags generated by extension tag hooks" 2018-04-04 19:36:31 +00:00
This, that and the other
5ff5dbc7dc Fix parsing of <pre> tags generated by extension tag hooks
When this part of BlockLevelPass::execute() encounters a block-level tag,
such as <hr>, one of $openMatch or $closeMatch will be truthy.

Without this patch, $this->closeParagraph() is unconditionally called in
this situation, which sets $this->inPre = false. If we're already inside a
<pre> tag, this makes the parser think we're no longer in a <pre>
environment, so it starts wrapping the <pre> tag's content in <p> tags as
if it were processing regular content.

We should only call $this->closeParagraph() in the case that (a) we are not
inside a <pre> tag, or (b) the block-level tag that is being opened is
itself a <pre> tag (in which case $preOpenMatch will be truthy, and
$this->inPre will have already been set to true).

This doesn't affect the parsing of <pre> tags that are written in wikitext,
since their content isn't parsed. It only affects hooks and the like that
return <pre> tags.

This doesn't solve the task T7718 that is mentioned in the code comment,
but if the testwiki test cases linked there are anything to go by, it
doesn't make the problem worse in any way.

This is required for Poem change I754f2e84f7d6efc0829765c82297f2de5f9ca149.

Change-Id: I469e633fc41d8ca73653c7e982c591092dcb1708
2018-03-17 23:20:50 +11:00
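The guard described in 5ff5dbc7dc can be sketched as a single boolean condition. This is a minimal illustration, not the actual patch: `shouldCloseParagraph()` is a hypothetical helper, and its two parameters stand in for `$this->inPre` and `$preOpenMatch` from the commit message.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper (not part of BlockLevelPass): decide whether a
 * block-level tag match should close the current paragraph.
 */
function shouldCloseParagraph( bool $inPre, bool $preOpenMatch ): bool {
    // (a) we are not inside a <pre> tag, or
    // (b) the block-level tag being opened is itself a <pre> tag.
    return !$inPre || $preOpenMatch;
}
```

Under this sketch, a block-level tag such as <hr> encountered inside a <pre> leaves the paragraph state untouched, which is the behaviour the message asks for.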
Subramanya Sastry
30495ea1f9 RFC T157418: Trim whitespace in table cells, list items, headings
* Matmarex had implemented this for wikitext headings in b3dd3881.
* This patch extends this to wikitext list items and wikitext table cells.
* Updated RELEASE NOTES.

tests/parser/parserTests.txt:
* All whitespace removed in output of list items, table cells, and
  headings. Removed corresponding whitespace in the input wikitext,
  except in a few tests where the whitespace is significant (for
  example, "| +" or "| -").
* Updated output of html/parsoid sections as well.
* Added new tests to spec white-space trimming behavior.

tests/phpunit/*:
* Fixed a few tests that used whitespace in list items and table cells.

Bug: T157418
Change-Id: I8ea34c7ab893c0c125c81d810feeb3c581e4bba1
2018-03-16 13:42:55 -05:00
Brad Jorsch
b3e575bb8f Parser: Don't wrap <style> or <link> tags in paragraphs
If <style> or <link> tags are by themselves on a line, don't wrap them
in <p> tags. But, at the same time, don't end an existing paragraph if
we find <style> or <link> in the middle (like we would if we just
treated them as block tags).

If <style> or <link> is on a line with other text, though, let it be
wrapped in a paragraph along with that other text.

Bug: T186965
Change-Id: Ide4005842cdab537226aa538cb5f7d8e363ba95d
2018-02-28 14:12:49 -05:00
Max Semenik
b3f2e653c8 Fix variable name to match code
Change-Id: Idb97c9c5379d2ba4f0874ceaffcf48870bdd682e
2018-01-19 16:36:52 -08:00
Huji Lee
e74bfe13f6 Require indentation of CASE statements in PHP code
Bug: T182546
Change-Id: I91a9555893a08e4ec58da97c6cc4d1e70000ff6b
2017-12-10 22:07:50 -05:00
Subramanya Sastry
ea3b95e20d Fix bug in dl-dt list output generation
* An open <dt> (;) should be closed when we encounter a new <dd> (:)
  char even if it is on a new line that has other nested lists inside.

* Tidy was hiding this PHP parser bug by closing a <dt> and opening
  a <dd> when given this HTML: "<dl><dt>a<ul><li>b</li></ul></dt></dl>"
  It generates "<dl><dt>a</dt><dd><ul><li>b</li></ul></dd></dl>"

  However, HTML5 parsers such as RemexHTML and domino (used by Parsoid),
  as well as browsers, don't do this fixup.

* So, what I thought was a bug in RemexHTML turned out to be a bug
  in the PHP parser that was being hidden by the use of Tidy.

* Added a regression test.

Bug: T175099
Change-Id: I6d5b225b82cecf9a43f23837ed8ec359b31aadad
2017-09-12 01:10:44 +00:00
Brion Vibber
33e4ac5b22 Add \b to regexes in BlockLevelPass to avoid confusing tr & track
With TimedMediaHandler in video.js mode, videos can be inline,
without a wrapper div.

Previously, in this mode two paragraphs where one contained a
video would end up merged into one paragraph, due to BlockLevelPass
matching "<track .../>" against "<tr" in its regexes.

Added \b to a couple of the regexes to protect against such errors,
and corrected a parser test case that had bad output listed, where
"<link .../>" matched against "<li".

Bug: T165817
Change-Id: I06e82b881f5ebddae5e7df7fb940adfa54f6b659
2017-05-20 00:53:05 +02:00
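The effect of the `\b` added in 33e4ac5b22 can be reproduced with a pair of simplified patterns. The two constants and `matchesBlockTag()` below are assumptions made for illustration; the real BlockLevelPass regexes list many more tag names.

```php
<?php
declare(strict_types=1);

// Simplified stand-ins for the block-tag regexes (hypothetical names).
const WITHOUT_BOUNDARY = '/<(tr|li)/i';
const WITH_BOUNDARY = '/<(tr|li)\b/i';

/** Return true if the pattern finds a block-level tag in the HTML. */
function matchesBlockTag( string $pattern, string $html ): bool {
    return preg_match( $pattern, $html ) === 1;
}

// Without \b, "<track" is mistaken for "<tr" and "<link" for "<li".
var_dump( matchesBlockTag( WITHOUT_BOUNDARY, '<track src="a.vtt" />' ) );
// With \b, the tag name must end at a word boundary, so neither matches.
var_dump( matchesBlockTag( WITH_BOUNDARY, '<track src="a.vtt" />' ) );
var_dump( matchesBlockTag( WITH_BOUNDARY, '<link rel="stylesheet" />' ) );
// Genuine <tr> and <li> tags still match.
var_dump( matchesBlockTag( WITH_BOUNDARY, '<tr class="x">' ) );
```

The word boundary works because `\b` only matches between a word character and a non-word character, and the "a" in "track" or the "n" in "link" is still part of the word.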
James D. Forrester
9635dda73a includes: Replace implicit Bugzilla bug numbers with Phab ones
It's unreasonable to expect newbies to know that "bug 12345" means "Task T14345"
except where it doesn't, so let's just standardise on the real numbers.

Change-Id: I6f59febaf8fc96e80f8cfc11f4356283f461142a
2017-02-21 18:13:24 +00:00
C. Scott Ananian
3a8a986e35 Don't bail on single-line definition list due to excess close tags.
When parsing a single line definition list, we track nested tags so that:

	; <b>foo:bar</b>: baz

breaks before `baz`, not between `foo` and `bar`.  But we currently bail
out of this algorithm entirely if we see a mismatched close tag.  We should
just ignore the unmatched tag, like Parsoid does.

Change-Id: I6306dcad6347abeb6ab001d35562f1ab9f374bd1
2017-02-17 16:34:55 -05:00
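The tag tracking described in 3a8a986e35 can be sketched as a depth counter that ignores unmatched close tags. This is a deliberately simplified illustration: `findSplitColon()` is a hypothetical function, and the real parser is a larger state machine that also handles comments, quoted attributes and other cases not modelled here.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical sketch: return the offset of the first colon that is not
 * nested inside an open tag, or null if there is none.
 */
function findSplitColon( string $line ): ?int {
    $depth = 0;
    $len = strlen( $line );
    for ( $i = 0; $i < $len; $i++ ) {
        $ch = $line[$i];
        if ( $ch === '<' ) {
            $end = strpos( $line, '>', $i );
            if ( $end === false ) {
                return null; // unterminated tag: give up in this sketch
            }
            $tag = substr( $line, $i, $end - $i + 1 );
            if ( $tag[1] === '/' ) {
                // Close tag: pop if something is open, otherwise ignore it
                // instead of bailing out.
                if ( $depth > 0 ) {
                    $depth--;
                }
            } elseif ( substr( $tag, -2 ) !== '/>' ) {
                $depth++; // open tag (self-closing tags do not nest)
            }
            $i = $end; // skip the rest of the tag
        } elseif ( $ch === ':' && $depth === 0 ) {
            return $i;
        }
    }
    return null;
}
```

For the example in the message, `findSplitColon( '<b>foo:bar</b>: baz' )` skips the colon inside `<b>` and returns the offset of the one before `baz`; a stray `</b>` with nothing open is ignored rather than aborting the search.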
C. Scott Ananian
ee002d67c9 Protect -{...}- variant constructs in definition lists.
Given the wikitext:

	;-{zh-cn:AAA;zh-tw:BBB}-

Prevent `doBlockLevels` from trying to split the definition list at the
embedded colon and using `AAA;zh-tw:BBB}-` as the `<dd>` portion.

Bug: T153135
Change-Id: I3a4d02f1fbd0d0fe8278d6b7c66005f0dd3dd36b
2017-02-17 15:52:44 -05:00
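The protection added in ee002d67c9 amounts to skipping over `-{...}-` spans while looking for the colon that separates term from definition. A minimal sketch follows; `findColonOutsideVariants()` is a hypothetical name, nested variant constructs are not handled, and returning null for an unclosed construct is a choice made for this sketch only.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical sketch: return the offset of the first colon outside any
 * -{...}- language-variant construct, or null if there is none.
 */
function findColonOutsideVariants( string $text ): ?int {
    $len = strlen( $text );
    for ( $i = 0; $i < $len; $i++ ) {
        if ( substr( $text, $i, 2 ) === '-{' ) {
            $end = strpos( $text, '}-', $i + 2 );
            if ( $end === false ) {
                return null; // unclosed construct: do not split
            }
            $i = $end + 1; // the loop increment then moves past "}-"
            continue;
        }
        if ( $text[$i] === ':' ) {
            return $i;
        }
    }
    return null;
}
```

With the wikitext from the message (minus the leading `;`), `findColonOutsideVariants( '-{zh-cn:AAA;zh-tw:BBB}-' )` finds no split point, so the whole construct stays in the `<dt>`.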
Tim Starling
a5159d8f23 BlockLevelPass: minor changes due to initial code review
* Improve some comments
* In getCommon() use min() to improve readability
* Use strict equals where possible
* Use camel case in variable names
* Remove the "case 0" optimisation. It made sense in PHP 4, but those are
  compile-time constants now and are presumably treated as such in both
  PHP and HHVM.
* oLine -> inputLine, no idea what "o" stood for
* pos -> colonPos, lt -> ltPos: descriptive
* stack -> level, paragraphStack -> pendingPTag: they're not stacks
* Remove "m" prefix from member variable names

Change-Id: I6c1144c792ba3e1935be88a009a6d6c110d11090
2016-05-06 14:42:58 +10:00
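The `min()` change to `getCommon()` in a5159d8f23 might look roughly like the following. The message only states that `min()` is used; that the function returns the length of the shared prefix of two list-marker strings is an assumption here, as are the parameter names and type declarations.

```php
<?php
declare(strict_types=1);

/**
 * Sketch under the assumption above: length of the common prefix of two
 * list-marker strings such as "*#" and "**".
 */
function getCommon( string $st1, string $st2 ): int {
    // Only compare up to the length of the shorter string.
    $shorter = min( strlen( $st1 ), strlen( $st2 ) );
    for ( $i = 0; $i < $shorter; $i++ ) {
        if ( $st1[$i] !== $st2[$i] ) {
            break;
        }
    }
    return $i;
}
```

Computing the bound once with `min()` keeps the loop condition to a single comparison, which is presumably the readability gain the message refers to.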
Tim Starling
2e3c1f87e2 Split out doBlockLevels() into its own class
It's independent of the rest of the Parser, but quite intrusive, with
its own instance variables and several private functions. It's also
pretty big (500 lines).

I removed a few functions from Parser here which were always marked
@private in the doc comment, but were inappropriately marked
"public" in the function declaration after migration to PHP 5. I grepped
core and deployed extensions and found no callers.

The helper functions are now all private, and the constructor is
private, with just a single public static entry point, reflecting
the self-contained nature of the module and its lack of hooks.

Change-Id: I1693ed48a9194719611b4afd9d989d44f0610f8d
2016-05-06 14:40:20 +10:00