$length is user input, so cast it to an int before passing it to min().
If there is nothing to add at that point, return immediately.
In PHP 7.1+ this raised an "A non-numeric value encountered" warning:
min() could return the junk value as a string, and subtracting an int
(the return value of mb_strlen()) from that string triggered the
warning.
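The cast-then-clamp order described above can be sketched as follows.
This is an illustration in Python, not the PHP of the patch;
`php_int_cast`, `pad_left` and the 500-character cap are assumptions
made up for the example.

```python
import re

MAX_PAD_LENGTH = 500  # assumed cap, for illustration only


def php_int_cast(value: str) -> int:
    """Roughly mimic PHP's (int) cast: leading integer digits, else 0."""
    match = re.match(r"\s*[+-]?\d+", value)
    return int(match.group()) if match else 0


def pad_left(text: str, raw_length: str, padding: str = "0") -> str:
    # Cast the user-supplied length first, so min() only ever sees ints.
    length = min(php_int_cast(raw_length), MAX_PAD_LENGTH)
    missing = length - len(text)
    # Nothing to add: return immediately.
    if missing <= 0 or padding == "":
        return text
    # Repeat the padding and cut it to exactly the missing width.
    fill = (padding * missing)[:missing]
    return fill + text
```

With a junk length such as "junk" the cast yields 0, so the function
returns the input unchanged instead of doing arithmetic on a string.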
Added a parser test to verify the behavior, and confirmed that it
triggers warnings without the patch.
Bug: T180403
Change-Id: I614750962104f6251a864519035366ac9798fc0f
Find: /isset\(\s*([^()]+?)\s*\)\s*\?\s*\1\s*:\s*/
Replace with: '\1 ?? '
(Everywhere except includes/PHPVersionCheck.php)
(Then manually fixed some line length and indentation issues.)
Then manually reviewed the replacements for cases where confusing
operator precedence would produce incorrect results
(fixing those in I478db046a1cc162c6767003ce45c9b56270f3372).
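For reference, the find/replace pair above behaves like this when
applied mechanically. A minimal sketch using Python's re module; the
function name and the sample PHP line are made up for illustration.

```python
import re

# The find/replace pair from above, written for Python's re module.
ISSET_TERNARY = re.compile(r"isset\(\s*([^()]+?)\s*\)\s*\?\s*\1\s*:\s*")


def to_null_coalesce(php_source: str) -> str:
    """Rewrite `isset( X ) ? X : ` into `X ?? ` wherever the pattern matches."""
    return ISSET_TERNARY.sub(r"\1 ?? ", php_source)


before = "$limit = isset( $params['limit'] ) ? $params['limit'] : 50;"
after = to_null_coalesce(before)
print(after)
```

The back-reference only matches when the tested expression and the
"then" branch are textually identical, and [^()] skips expressions
that contain parentheses. The rewrite is purely textual, which is why
the precedence review mentioned above was still needed.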
Change-Id: I33b421c8cb11cdd4ce896488c9ff5313f03a38cf
This test covers the code branch taken when $mTplRedirCache is already
populated, by using the same template that redirects twice.
Change-Id: Ie0ce277c75366b7b060e0da6873175976621aff9
Unicode 6.3.0 (September 2013) added additional directional
formatting characters:
U+061C ARABIC LETTER MARK
U+2066 LEFT-TO-RIGHT ISOLATE
U+2067 RIGHT-TO-LEFT ISOLATE
U+2068 FIRST STRONG ISOLATE
U+2069 POP DIRECTIONAL ISOLATE
https://www.fileformat.info/info/unicode/version/6.3/index.htm
This change strips the new directional formatting characters from the
title, as is already done for the directional formatting characters
from Unicode 1.1.0 (June 1993).
Any existing titles containing the new Unicode directional formatting
characters can be cleaned up by running maintenance/cleanupTitles.php
after deployment.
This change also allows the new Unicode directional formatting
characters to be inserted into the DISPLAYTITLE.
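The stripping can be sketched as a single character class. This is an
illustration in Python, not the code of the patch. The older code
points (U+200E, U+200F, U+202A to U+202E) are an assumption about
which characters "from Unicode 1.1.0" refers to, and `strip_bidi` is a
hypothetical name.

```python
import re

# Assumed older set: U+200E, U+200F (marks) and U+202A..U+202E
# (embeddings and overrides).
# Added in Unicode 6.3.0: U+061C and U+2066..U+2069 (listed above).
BIDI_FORMATTING = re.compile("[\u061c\u200e\u200f\u202a-\u202e\u2066-\u2069]")


def strip_bidi(title: str) -> str:
    """Remove every directional formatting character from a title."""
    return BIDI_FORMATTING.sub("", title)
```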
Change-Id: I2279f51048f5252c2e4280ec6a13f060ff9967cb
This change strips all soft hyphens from the title. This is already
done for Unicode bidi characters (T5696).
URLs with soft hyphens (%C2%AD) get redirected (301) to the URL without
soft hyphens (T145605):
https://de.wikipedia.org/wiki/Bosnatal%C2%ADbahn gets redirected to
https://de.wikipedia.org/wiki/Bosnatalbahn
A wikitext link containing a soft hyphen, "[[Bosnatal<AD>bahn]]" (where
"<AD>" stands for a soft hyphen), links to "Bosnatalbahn" but displays
"Bosnatal<AD>bahn".
This change also allows soft hyphens to be inserted into the
displaytitle (T66528), which makes manual hyphenation possible in first
headings for titles with very long words.
This change prevents access to any existing articles containing soft
hyphens in the title. After deploying this change, a run of
maintenance/cleanupTitles.php must be performed to rename existing
titles with soft hyphens. Before deploying this change, existing
articles and redirects with soft hyphens in the title can already be
renamed or deleted.
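The URL example above can be reproduced with a short sketch. This is
Python for illustration only; `normalize_title` is a hypothetical
helper, not a function from the patch.

```python
from urllib.parse import unquote

SOFT_HYPHEN = "\u00ad"  # encoded as %C2%AD in UTF-8 URLs


def normalize_title(url_title: str) -> str:
    """Percent-decode a title from a URL and drop every soft hyphen."""
    return unquote(url_title).replace(SOFT_HYPHEN, "")
```

Because the normalized title no longer matches a stored title that
still contains the soft hyphen, such pages become unreachable until
they are renamed.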
Bug: T121979
Bug: T66528
Change-Id: Ie13626c433cdb460dbf00b3bba28d1bb5a7b6d6a
The content language object caches namespaces, so it might not take
into account $wgExtraNamespaces as set by the parser test suite, which
causes unknown-namespace errors.
Ensure the new language object has a clean cache.
Repro:
php phpunit.php --filter '(ParserMethodsTest::testValidCovers|T53680)'
Bug: T190554
Change-Id: I9c4104d7bb3a0c84b60d7e7b4154743cbe58348c
* b3dd3881 was trimming whitespace in wikitext as well as HTML headings
whereas the whitespace-trimming proposal was going to leave HTML tags
untouched.
* 30495ea1 missed this because coincidentally, the test I added there
for HTML headings had a typo and used <h2>...<h2> instead of
<h2>...</h2> which caused the test to magically pass.
* This patch trims whitespace in doHeadings (which deals with wikitext
headings) instead of formatHeadings (which deals with all headings).
* Updated parser tests to account for this.
Change-Id: I854f20b4c39a0a8e03d70155b269de77acf02cae
When this part of BlockLevelPass::execute() encounters a block-level tag,
such as <hr>, one of $openMatch or $closeMatch will be truthy.
Without this patch, $this->closeParagraph() is unconditionally called in
this situation, which sets $this->inPre = false. If we're already inside a
<pre> tag, this makes the parser think we're no longer in a <pre>
environment, so it starts wrapping the <pre> tag's content in <p> tags as
if it was processing regular content.
We should only call $this->closeParagraph() in the case that (a) we are not
inside a <pre> tag, or (b) the block-level tag that is being opened is
itself a <pre> tag (in which case $preOpenMatch will be truthy, and
$this->inPre will have already been set to true).
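Conditions (a) and (b) reduce to one boolean expression. A minimal
sketch in Python, for illustration only; the function name is
hypothetical and the arguments stand in for $this->inPre and
$preOpenMatch.

```python
def should_close_paragraph(in_pre: bool, pre_open_match: bool) -> bool:
    """Decide whether a block-level tag should end the current paragraph.

    in_pre mirrors $this->inPre, pre_open_match mirrors $preOpenMatch.
    """
    # (a) not inside <pre>, or (b) the tag being opened is itself a <pre>.
    return (not in_pre) or pre_open_match
```

The only combination that skips the call is a non-<pre> block-level
tag met while already inside a <pre>.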
This doesn't affect the parsing of <pre> tags that are written in wikitext,
since their content isn't parsed. It only affects hooks and the like that
return <pre> tags.
This doesn't solve the task T7718 that is mentioned in the code comment,
but if the testwiki test cases linked there are anything to go by, it
doesn't make the problem worse in any way.
This is required for Poem change I754f2e84f7d6efc0829765c82297f2de5f9ca149.
Change-Id: I469e633fc41d8ca73653c7e982c591092dcb1708
* Matmarex had implemented this for wikitext headings in b3dd3881.
* This patch extends this to wikitext list items and wikitext table cells.
* Updated RELEASE NOTES.
tests/parser/parserTests.txt:
* All whitespace removed in output of list items, table cells, and
headings. Removed corresponding whitespace in the input wikitext
except for a few tests where the whitespace is significant
("| +" or "| -", for example).
* Updated output of html/parsoid sections as well.
* Added new tests to spec white-space trimming behavior.
tests/phpunit/*:
* Fixed a few tests that used whitespace in list items and table cells.
Bug: T157418
Change-Id: I8ea34c7ab893c0c125c81d810feeb3c581e4bba1
The ability for URLs to be marked free even if they use bracketed syntax
but "sorta look free" (aka unbracketed) was added 13 years ago in
2d71cb3080 (r7074).
It seemed like a reasonable idea at the time: make printed output a little
prettier by marking "sorta free" URLs as free. But this complicates the
semantics of wikitext, and introduces all sorts of strange corner cases,
for example:
[http://example.com/& http://example.com/&]
isn't marked as free, even though the parser output is:
<a rel="nofollow" class="external text" href="http://example.com/&">http://example.com/&</a>
This functionality isn't actually needed: if you want the pretty printed
output of an unbracketed URL, then actually use an unbracketed URL.
In recent years we're more concerned with simplifying the semantics of
wikitext and eliminating corner cases, such that the content of our wikis
can be effectively archived. The "effectively free" URLs are low-hanging
fruit in this quest.
Change-Id: I339e8698786c60c96a37a73443cb9a04362662c4
* RemexHtml is the future of "tidy" in MediaWiki,
so run our parser tests using it.
* This is a necessary step before we can make it
the default in MediaWiki (T185753).
* Cleaned up a bunch of tests:
(a) where html/php+tidy and html/parsoid match up,
retained a html+tidy section and removed the others.
(b) where html/php and html/php+tidy match up,
retained the html/php section and removed the
html/php+tidy section.
* Annotated tests with explanations where Parsoid & Remex
output differ. This is usually for one of two reasons:
(a) Parsoid has Tidy-emulation code in some cases (which
we can consider stripping away separately).
(b) Parsoid does a bunch of cleanup on the DOM (which was
probably done to emulate Tidy output, but which could
probably be retained). Since Parsoid (in some form)
will be the default parser in the future, there is no
reason to try to port this cleanup (in broken markup
scenarios) into Remex.
* Left a bunch of FIXMEs for later followup.
Unrelated cleanup:
* Renamed a few tests since the functionality in Parsoid
was fixed up. There is no more "implicit <td>" support.
Those all now lead to fostered content.
* Fixed some clearly broken output in html/parsoid sections
for some tests.
Co-Authored-by: Kunal Mehta <legoktm@member.fsf.org>
Co-Authored-by: Subramanya Sastry <ssastry@wikimedia.org>
Bug: T188167
Depends-On: I646dbabb3c2ed28c1ea72c5bd8f7f92d03f57c75
Change-Id: Ic7c34d57a300dbd36a37f03fbfe33391b2950b44
If <style> or <link> tags are by themselves on a line, don't wrap them
in <p> tags. But, at the same time, don't end an existing paragraph if
we find <style> or <link> in the middle (like we would if we just
treated them as block tags).
If <style> or <link> is on a line with other text, though, let it be
wrapped in a paragraph along with that other text.
Bug: T186965
Change-Id: Ide4005842cdab537226aa538cb5f7d8e363ba95d
Storing the user name or IP in every row in large tables like revision
and logging takes up space and makes operations on these tables slower.
This patch begins the process of moving those into one "actor" table
which other tables can reference with a single integer field.
A subsequent patch will remove the old columns.
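The idea can be sketched with an in-memory SQLite database. This is
an illustration in Python with a hypothetical, heavily simplified
schema. The column names and the `actor_id_for` helper are
assumptions, not the schema introduced by the patch.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical, simplified schema: one row per distinct user name or IP.
    CREATE TABLE actor (
        actor_id   INTEGER PRIMARY KEY,
        actor_name TEXT NOT NULL UNIQUE
    );
    -- Large tables store only the integer reference.
    CREATE TABLE revision (
        rev_id    INTEGER PRIMARY KEY,
        rev_actor INTEGER NOT NULL REFERENCES actor (actor_id)
    );
""")


def actor_id_for(name: str) -> int:
    """Return the id for a name, creating the actor row on first use."""
    conn.execute("INSERT OR IGNORE INTO actor (actor_name) VALUES (?)", (name,))
    row = conn.execute(
        "SELECT actor_id FROM actor WHERE actor_name = ?", (name,)
    ).fetchone()
    return row[0]


for name in ("Alice", "192.0.2.1", "Alice"):
    conn.execute("INSERT INTO revision (rev_actor) VALUES (?)", (actor_id_for(name),))

rows = conn.execute(
    "SELECT rev_id, actor_name FROM revision JOIN actor ON actor_id = rev_actor"
    " ORDER BY rev_id"
).fetchall()
```

Three revisions share two actor rows here, so a repeated name costs
one integer per revision instead of one string.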
Bug: T167246
Depends-On: I9293fd6e0f958d87e52965de925046f1bb8f8a50
Change-Id: I8d825eb02c69cc66d90bd41325133fd3f99f0226
$wgParserTestFiles is deprecated, so this wasn't running the core parser
tests. Using ParserTestRunner::getParserTestFiles() includes everything,
including autodiscovered extension parser tests.
Change-Id: Ie3b02565c184e8e06931ab52a39ca8ae0877aab9
And deprecate passing false for ParserOptions::setWrapOutputClass().
There are three cases for the Parser wrapper: the default
mw-parser-output, a custom wrapper, or no wrapper. As things currently
stand, we have to fragment the parser cache on each of these options,
which uses a nontrivial amount of storage space (T167784).
Ideally we'd do all the wrapping as a post-cache transform, but
TemplateStyles needs to know the wrapper in use in order to properly
prefix its CSS rules (that's why we added the wrapper in the first
place). So the second-best option is to make *un*wrapping a post-cache
transform and make "custom wrapper" uncacheable.
This patch does the first bit (unwrapping as a post-cache transform),
and a followup will do the second part once the deprecation process is
satisfied.
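Unwrapping as a post-cache transform can be sketched as follows. This
is Python for illustration only; the function names and the `unwrap`
flag are hypothetical and do not reflect the ParserOutput API.

```python
DEFAULT_WRAPPER_OPEN = '<div class="mw-parser-output">'
WRAPPER_CLOSE = "</div>"


def wrap_for_cache(html: str) -> str:
    """Always cache output with the default wrapper, so one entry suffices."""
    return DEFAULT_WRAPPER_OPEN + html + WRAPPER_CLOSE


def get_text(cached_html: str, unwrap: bool = False) -> str:
    """Post-cache transform: optionally strip the wrapper added before caching."""
    if (
        unwrap
        and cached_html.startswith(DEFAULT_WRAPPER_OPEN)
        and cached_html.endswith(WRAPPER_CLOSE)
    ):
        return cached_html[len(DEFAULT_WRAPPER_OPEN):-len(WRAPPER_CLOSE)]
    return cached_html
```

Callers that want no wrapper read the same cached entry and strip it
on the way out, so the "wrapped" and "unwrapped" cases no longer need
separate cache entries.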
Bug: T181846
Change-Id: Iba16e78c41be992467101e7d83e9c3134765b101
Used by the `setWidths` and `setHeights` methods to make sure we are
using correct values.
Makes `parseWidthParam` static so it can be used in the gallery class.
Bug: T129372
Change-Id: I38b9ef0ea26e3748ad5d5458fadd2545f677ef93