* b3dd3881 was trimming whitespace in wikitext as well as HTML headings
whereas the whitespace-trimming proposal was going to leave HTML tags
untouched.
* 30495ea1 missed this because, coincidentally, the test I added there
  for HTML headings had a typo and used <h2>...<h2> instead of
  <h2>...</h2>, which caused the test to magically pass.
* This patch trims whitespace in
  doHeadings (which deals with wikitext headings) instead of
  formatHeadings (which deals with all headings); see the illustration below.
* Updated parser tests to account for this.
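For illustration, the intended split (sketched, not verbatim test cases):

    ==  foo  ==         wikitext heading: inner whitespace is trimmed
    <h2>  foo  </h2>    HTML heading: left untouched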
Change-Id: I854f20b4c39a0a8e03d70155b269de77acf02cae
* Matmarex had implemented this for wikitext headings in b3dd3881.
* This patch extends this to wikitext list items and wikitext table
  cells (illustrated below).
* Updated RELEASE NOTES.
tests/parser/parserTests.txt:
* All whitespace removed in output of list items, table cells, and
  headings. Removed corresponding whitespace in the input wikitext,
  except for a few tests where the whitespace is significant
  ("| +" or "| -", for example).
* Updated output of html/parsoid sections as well.
* Added new tests to spec white-space trimming behavior.
tests/phpunit/*:
* Fixed a few tests that used whitespace in list items and table cells.
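For illustration, the sort of trimming now applied (sketched, not
verbatim test output):

    '* foo '    renders as    <li>foo</li>
    '| foo '    renders as    <td>foo</td>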
Bug: T157418
Change-Id: I8ea34c7ab893c0c125c81d810feeb3c581e4bba1
The ability for URLs to be marked free (i.e. rendered as if
unbracketed) even if they use bracketed syntax but "sorta look free"
was added 13 years ago in 2d71cb3080 (r7074).
It seemed like a reasonable idea at the time: make printed output a little
prettier by marking "sorta free" URLs as free. But this complicates the
semantics of wikitext, and introduces all sorts of strange corner cases,
for example:
    [http://example.com/& http://example.com/&]
isn't marked as free, even though the parser output is:
    <a rel="nofollow" class="external text" href="http://example.com/&">http://example.com/&</a>
This functionality isn't actually needed: if you want the pretty printed
output of an unbracketed URL, then actually use an unbracketed URL.
In recent years we're more concerned with simplifying the semantics of
wikitext and eliminating corner cases, such that the content of our wikis
can be effectively archived. The "effectively free" URLs are low-hanging
fruit in this quest.
Change-Id: I339e8698786c60c96a37a73443cb9a04362662c4
* Add a new limit to the parser which limits the size of the output
  generated by StripState (sketched below). The relevant bug shows
  exponential blowup in output size.
* Remove the $prefix parameter from the StripState constructor. Used by
no Gerrit-hosted extensions, hard-deprecated since 1.26.
* Convert the existing unstrip recursion depth limit to a normal parser
limit with limit report row, warning and tracking category. Provide
the same features in the new limit.
* Add an optional $parser parameter to the StripState constructor so
that warnings and tracking categories can be added.
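A minimal sketch of the shape of the new size check (method and message
names are illustrative, not the actual StripState code):

    private function checkSizeLimit( $textLength ) {
        $this->expandSize += $textLength;
        if ( $this->expandSize > $this->sizeLimit ) {
            if ( $this->parser ) {
                // same treatment as the depth limit:
                // warning plus tracking category
                $this->parser->limitationWarn(
                    'unstrip-size', $this->expandSize, $this->sizeLimit );
            }
            return false; // caller emits an error marker instead of expanding
        }
        return true;
    }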
Bug: T187833
Change-Id: Ie5f6081177610dc7830de4a0a40705c0c8cb82f1
This shouldn't depend on the current definition of legal title
characters and the strip marker.
See the test "<nowiki> inside a link"
Change-Id: I0d87aca1bb0adf4ec5ac480e0373a65fcd150a72
This also removes all the in-core calls that had been kept for the
benefit of extensions, and causes them to have no effect, since
anything that had been calling them was already either a no-op or will
probably be broken now that nothing in core sets or checks the
flags.
Change-Id: Id22c1a5a6d6a249debb14063ae3f8838d105b634
Used by the `setWidths` and `setHeights` methods to make sure we are
using correct values.
Makes `parseWidthParam` static so it can be used in the gallery class.
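For reference, a sketch of the validation this enables (the return
shape is assumed from parseWidthParam's existing behavior):

    $dims = Parser::parseWidthParam( '120x80px' );
    // e.g. [ 'width' => 120, 'height' => 80 ]; setWidths()/setHeights()
    // can reject values that don't parse instead of trusting the input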
Bug: T129372
Change-Id: I38b9ef0ea26e3748ad5d5458fadd2545f677ef93
These comments do not add anything. I argue they are worse than having
no comments, because I have to read them first to understand that they
actually don't explain anything. Removing them makes room for actual
improvements in the future (if needed).
Change-Id: Iee70aad681b3385e9af282d5581c10addbb91ac4
Clean up use of @codingStandardsIgnore
- @codingStandardsIgnoreFile -> phpcs:ignoreFile
- @codingStandardsIgnoreLine -> phpcs:ignore
- @codingStandardsIgnoreStart -> phpcs:disable
- @codingStandardsIgnoreEnd -> phpcs:enable
For phpcs:disable, the necessary sniffs are always specified.
Some start/end pairs are changed to a single-line phpcs:ignore
(example below).
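For example (sniff name illustrative):

    // Before:
    // @codingStandardsIgnoreStart
    // ...
    // @codingStandardsIgnoreEnd

    // After:
    // phpcs:disable Generic.Files.LineLength
    // ...
    // phpcs:enable Generic.Files.LineLength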
Change-Id: I92ef235849bcc349c69e53504e664a155dd162c8
In the conversion away from extract(), the $title variable was missed. This
broke LabeledSectionTransclusion.
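A sketch of the failure mode (variable and key names illustrative):

    // Before: extract( $args ); silently defined $text, $title, ...
    // The conversion replaced that with explicit assignments:
    $text  = $args['text'];
    $title = $args['title'];  // this is the assignment that was missed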
Change-Id: If4c140aedf16fc16a4ae2361f465798055748255
This is a re-submission of I4f24e7fbb68.
As a first major step towards Multi-Content-Revisions (MCR),
this patch turns the Revision class into a legacy proxy for
the new RevisionRecord and RevisionStore classes.
Backwards compatibility is maintained for all but some
rare edge cases, like constructing a completely empty
Revision object.
For more information on MCR, see
<https://www.mediawiki.org/wiki/Requests_for_comment/Multi-Content_Revisions>.
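For illustration, the old and new access paths (sketched; the
RevisionStore service and methods are the ones introduced by this
series):

    // Legacy callers keep working through the proxy:
    $rev = Revision::newFromId( $id );

    // What the proxy delegates to internally:
    $store = MediaWikiServices::getInstance()->getRevisionStore();
    $record = $store->getRevisionById( $id ); // returns a RevisionRecord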
NOTE: once this is merged, verify create/delete/restore cycle on beta,
ideally with emulated replication lag.
Bug: T174025
Change-Id: Ia4c20a91e98df0b9b14b138eb4825c55e5200384
This reverts commit 9dcc56b3c9.
With this patch applied, newly created revisions are sometimes not found
just after submitting an edit, until replicas have caught up.
Our best theory is that it somehow interferes with ChronologyProtector,
but we don't have a good idea how.
Also, as legoktm mentioned, the commit message is terrible and needs fixing.
Change-Id: Idf3404f3fa8f8d08a7fb2ab8268726e2c1edecfe
It was 100 lines. Also update a few nearby comments. The one about just
handling <nowiki> sections was actually written by Lee, and is
hilariously outdated now.
Change-Id: I12ee2a7e488a3c787b36d3a457c6166bbbb46aff
Split up guessSectionNameFromWikiText() into pieces to reduce code
duplication, and provide guessSectionNameFromStrippedText() which
doesn't do link stripping.
Really these should be named guessSection*ANCHOR*From... because they
return an anchor (with encoding and a '#' prefix) instead of a section
name, but I didn't want to rename the existing one.
Also make normalizeSectionName static (it doesn't use $this) so that
guessSectionNameFromStrippedText() can be static as well.
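Roughly the resulting shape (simplified sketch; the real code also
handles the legacy encoding path):

    public function guessSectionNameFromWikiText( $text ) {
        // strip links, then share the remaining logic
        $text = $this->stripSectionName( $text );
        return self::guessSectionNameFromStrippedText( $text );
    }

    public static function guessSectionNameFromStrippedText( $text ) {
        $text = self::normalizeSectionName( $text );
        // note: returns an anchor, i.e. encoded text with a '#' prefix
        return '#' . Sanitizer::escapeIdForLink( $text );
    }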
Change-Id: I56b9dda805a51517549c5ed709f4bd747ca04577
In Parser#getVariableValue, Language#formatNum was called without the
commafy parameter for the following magic variables:
currentmonth, currentmonth1, currentday, currentday2, localmonth,
localmonth1, localday, localday2
The default value for formatNum's nocommafy parameter is false, meaning
formatNum will do commafication. In the above context, commafy is not
needed since the passed values are often month values like 02, 03, etc.
Commafy is a no-op on these values.
Explicitly pass true for formatNum's nocommafy argument.
Language#formatNum's documentation for nocommafy also recommends
setting it to true for dates.
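For reference (illustrative values; language-specific digit transforms
may still apply):

    $lang->formatNum( '1234567' );     // default: "1,234,567" in English
    $lang->formatNum( '02', true );    // nocommafy: "02", right for date parts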
Change-Id: I3233d5458af8cef583e5d1d599d9408542ba08c9
* in links (T103714)
* in indicators (T104196)
This change removes the automatic Sanitizer::decodeCharReferences from
Sanitizer::escapeId and Sanitizer::escapeIdInternal. Where decoding of
HTML entities is wanted, an explicit call to
Sanitizer::decodeCharReferences is added (see below).
Explicitly decode HTML entities in non-local autocomments. (T104311)
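To illustrate the new calling convention (sketched):

    // Before: escapeId() decoded character references implicitly.
    // Now callers that want decoding must say so explicitly:
    $id = Sanitizer::escapeId( Sanitizer::decodeCharReferences( $text ) );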
Bug: T103714
Bug: T104196
Bug: T104311
Change-Id: I88e8e2077e6f5eec2b232391f7818370894a62dc
This special case complicates wikitext semantics and ought to be
unnecessary. Parsoid doesn't include this special case; if this patch
to the PHP parser isn't merged, we should write one for Parsoid to
implement the missing special case logic.
Bug: T175416
Change-Id: I3865c51b21de9d63ac5d06dcc3a3fa9108129d6c
Previous attempt was I943cd9bec0855d9a326b0b50739d686a29995370, reverted in
e687f2da3e due to T174639.
There's still a weird behavior with newline stripping between links, which
I'll try to tackle in a follow-on patch (T175416).
Bug: T2087
Bug: T10897
Bug: T87753
Bug: T174639
Change-Id: I8228cdd3b80faf899000adb511a983edc454bc76
* Parser::getRandomString() (deprecated in 1.26) was removed.
* Parser::uniqPrefix() (deprecated in 1.26) was removed.
* Parser::extractTagsAndParams() now only accepts three arguments. The fourth, $uniq_prefix, was deprecated in 1.26 and has now been removed (example below).
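For example (per the remaining three-argument signature):

    $text = $parser->extractTagsAndParams( [ 'gallery' ], $text, $matches );
    // the former fourth argument, $uniq_prefix, is no longer part of
    // the signature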
Bug: T61113
Change-Id: I7333fff4eb8b9a754b4596992f2a69bbdaac664d
- mostly auto-fixes
- some overly long lines fixed
- ignore the amp-space sniff in one case of passing by reference
Change-Id: I6472f83bc3cbf4bd629d83050cc3319b19ec465c
It adds the ability to replace the current section ID escaping
scheme (.C0.DE) with an HTML5-compliant escaping scheme that is
displayed as Unicode in many modern browsers (example below).
See the linked bug for discussion of various options that were
considered before the implementation. A few remarks:
* Because Sanitizer::escapeId() is used in a bunch of places without
escaping, I'm deprecating it without altering its behavior.
* The bug described in the comments for Parser::guessLegacySectionNameFromWikiText()
  is still present in some Edge versions, which display mojibake.
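For example, a heading like:

    == Überschrift ==

gets the anchor #.C3.9Cberschrift under the old scheme, but
#Überschrift under the HTML5 scheme (percent-encoded as
%C3%9Cberschrift where it appears in URLs).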
Bug: T152540
Change-Id: Id304010a0342efbb7ef2d56c5b8b244f2e4fb2c5
And auto-fix all errors.
The `<exclude-pattern>` stanzas are now included in the default ruleset
and don't need to be repeated.
Change-Id: I928af549dc88ac2c6cb82058f64c7c7f3111598a
* Right now, one or two are permitted. This patch limits it to one.
The current behaviour seems more a byproduct of refactoring than an
explicit goal.
* Note that this will break links on a handful of pages surfaced in
Parsoid's roundtrip testing.
Change-Id: Icabd34bbf15781bb891bd8e0c079d1a65eb28595
The phpcs version in use has a bug, so version 0.9.0 cannot be enforced
at the moment.
This will be fixed in the next version; see T167168
Changed:
- Remove duplicate newline at end of file
- Add space between function and ( for closures
- and -> &&, or -> ||
Change-Id: I4172fb08861729bccd55aecbd07e029e2638d311