944 commits

Author SHA1 Message Date
Umherirrender
130ec2523d Fix PhanTypeMismatchDeclaredParam
Auto fix MediaWiki.Commenting.FunctionComment.DefaultNullTypeParam sniff

Change-Id: I865323fd0295aabd06f3e3c75e0e5043fb31069e
2018-07-07 00:34:30 +00:00
Fomafix
a60dcdc2e3 Armor against French spaces detection in HTML attributes
This change also solves T13874 in a generic way.

Bug: T5158
Change-Id: Id8cdb887182f346acab2d108836ce201626848af
2018-06-21 19:24:07 +02:00
jenkins-bot
84fa176c9c Merge "Avoid deprecated LinkCache::singleton()"
2018-06-14 23:48:54 +00:00
C. Scott Ananian
7de2c566dd Deprecate Language::markNoConversion, which confuses readers
Language::markNoConversion is used only within Parser.php and differs
from LanguageConverter::markNoConversion in that, contrary to its name
and its namesake, it only protects *things which look like URLs* from
language conversion.

This wasted several days of my time before I realized what was going on.
It's needless; just hoist the "looks like a URL" special casing inline
to the single place where that functionality is used.  (And I wonder
if the "looks like a URL" case is actually needed at all any more,
since most of those cases are probably free external links, which
go through a different code path, not bracketed external links.)

This is a clean-up to the clean-up that liangent performed in 2012
with e01adbfc0b.

Change-Id: I80479600f34170651732b032e8881855aa1204d8
2018-06-13 13:26:58 -04:00
C. Scott Ananian
dbda7cdfb0 Remove unnecessary Parser::getConverterLanguage() indirection
The getConverterLanguage() method was added in March 2012 in commit
561424c266 as a workaround for a regression
in MediaWiki 1.19.  It was an indirection which checked the global variable
$wgBug34832TransitionalRollback to return a different converter language
for Chinese wikis.

When this temporary bugfix was reverted in January 2013 in commit
a3fbdaaa2c, the temporary global variable
was removed, but not the getConverterLanguage() indirection.  Since then,
new code in the parser seems to have faithfully used getConverterLanguage()
instead of getTargetLanguage(), even though they are identical and the
need for getConverterLanguage() has long since passed.

Strike a small blow for elegant minimalism by removing the completely
unnecessary Parser::getConverterLanguage() indirection.  Well, sort
of: since this blight has been slowly growing inside Parser.php for
so long, we need to deprecate getConverterLanguage() first just in
case any external dependency has been infected.  Next release we
can finally excise the unnecessary method.

Change-Id: I567c29c9c7699020955699b76cbe8578d02e2fe6
2018-06-12 23:33:03 +00:00
Fomafix
e1630b6a53 PHP: Use short ternary operator (?:) where possible
Change-Id: Idcc7e4fcdd4d8302ceda44bf6d294fa8c2219381
2018-06-11 11:26:35 +02:00
Kunal Mehta
c4e5a9dd97 Avoid deprecated LinkCache::singleton()
Change-Id: Ie0e5c4ef0fe6ec896378bb2433af0898655dd907
2018-06-10 23:55:11 -07:00
Max Semenik
6e956d55aa Replace call_user_func_array(), part 2
Uses new PHP 5.6 syntax like ...parameter unpacking and
calling anything looking like a callback to make the code more readable.
There are many more occurrences, but this commit is intentionally limited
to an easily reviewable size.

In one occurrence, a simple conditional instead of trickery was much more readable.

This patch finishes all the easy stuff in core; the remainder is either non-obvious
or would result in smaller readability gains. It will be carefully dealt with in
further commits.

Change-Id: I79a16c48bfb98b75e5b99f2f6f4fa07b3ae02c5b
2018-06-07 20:19:26 -07:00
Bartosz Dziewoński
485f66f174 Use PHP 7 '??' operator instead of '?:' with 'isset()' where convenient
Find: /isset\(\s*([^()]+?)\s*\)\s*\?\s*\1\s*:\s*/
Replace with: '\1 ?? '

(Everywhere except includes/PHPVersionCheck.php)
(Then, manually fix some line length and indentation issues)

Then manually reviewed the replacements for cases where confusing
operator precedence would result in incorrect results
(fixing those in I478db046a1cc162c6767003ce45c9b56270f3372).

Change-Id: I33b421c8cb11cdd4ce896488c9ff5313f03a38cf
2018-05-30 18:06:13 -07:00
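
The find/replace pair quoted in 485f66f174 can be tried out on sample input. The following is a minimal sketch that transcribes the regular expression to Python's `re` module; the function name and the sample PHP line are illustrative assumptions, not part of the commit.

```python
import re

# The "Find" pattern from the commit message, written as a Python raw string.
# Group 1 captures the expression inside isset(); \1 requires the same
# expression to reappear as the "true" branch of the ternary.
ISSET_TERNARY = re.compile(r"isset\(\s*([^()]+?)\s*\)\s*\?\s*\1\s*:\s*")


def to_null_coalescing(php_source: str) -> str:
    """Rewrite `isset( X ) ? X : ` into `X ?? ` wherever the pattern matches."""
    return ISSET_TERNARY.sub(r"\1 ?? ", php_source)


# Hypothetical sample line of PHP source, used only for illustration.
before = "$limit = isset( $opts['limit'] ) ? $opts['limit'] : 50;"
after = to_null_coalescing(before)
print(after)
```

Because the pattern only matches when both operands are textually identical, a line such as `isset( $a ) ? $b : $c` is left untouched. The regex cannot see operator precedence in the surrounding expression, which is why the commit mentions a manual review afterwards.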
Bartosz Dziewoński
b191e5e860 Use PHP 7 '<=>' operator in 'sort()' callbacks
`$a <=> $b` returns `-1` if `$a` is lesser, `1` if `$b` is lesser,
and `0` if they are equal, which are exactly the values 'sort()'
callbacks are supposed to return.

It also enables the neat idiom `$a[x] <=> $b[x] ?: $a[y] <=> $b[y]`
to sort arrays of objects first by 'x', and by 'y' if they are equal.

* Replace a common pattern like `return $a < $b ? -1 : 1` with the
  new operator (and similar patterns with the variables, the numbers
  or the comparison inverted). Some of the uses were previously not
  correctly handling the variables being equal; this is now
  automatically fixed.
* Also replace `return $a - $b`, which is equivalent to `return
  $a <=> $b` if both variables are integers but less intuitive.
* (Do not replace `return strcmp( $a, $b )`. It is also equivalent
  when both variables are strings, but if any of the variables is not,
  'strcmp()' converts it to a string before comparison, which could
  give different results than '<=>', so changing this would require
  careful review and isn't worth it.)
* Also replace `return $a > $b`, which presumably sort of works most
  of the time (returns `1` if `$b` is lesser, and `0` if they are
  equal or `$a` is lesser) but is erroneous.

Change-Id: I19a3d2fc8fcdb208c10330bd7a42c4e05d7f5cf3
2018-05-30 18:05:20 -07:00
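
The comparator behaviour described in b191e5e860 can be sketched outside PHP. The example below is an analogy in Python, which has no `<=>` operator: `spaceship()` is a hypothetical stand-in for it, and `or` plays the role of PHP's `?:` because a result of `0` is falsy in both languages.

```python
from functools import cmp_to_key


def spaceship(a, b):
    """Stand-in for PHP's `$a <=> $b`: returns -1, 0 or 1."""
    return (a > b) - (a < b)


def by_x_then_y(a, b):
    # Mirrors `$a[x] <=> $b[x] ?: $a[y] <=> $b[y]`: the second comparison
    # is only consulted when the first one reports a tie (0).
    return spaceship(a["x"], b["x"]) or spaceship(a["y"], b["y"])


def broken_greater_than(a, b):
    # Mirrors `return $a > $b`: the result is only ever 0 or 1, so
    # "a is lesser" cannot be told apart from "a equals b".
    return int(a > b)


rows = [{"x": 2, "y": 1}, {"x": 1, "y": 9}, {"x": 1, "y": 3}]
sorted_rows = sorted(rows, key=cmp_to_key(by_x_then_y))
print(sorted_rows)
```

The `broken_greater_than()` helper shows why the last bullet of the commit message calls that pattern erroneous: a comparator that never returns a negative value hides half of the ordering information from the sort routine.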
Kunal Mehta
5952d84307 Deprecate Parser::fetchFile() since it's unused
Change-Id: Ic2bc3dd0479a373159a22da5f0a6961e212352ff
2018-05-27 23:17:50 -07:00
Kunal Mehta
e2ed410492 Parser: Don't catch exception just to rethrow it
This is left over from 4ff813680.

Change-Id: I624c2c22b7736af249647997565fe06f52d40fe2
2018-05-26 18:34:43 -07:00
Aaron Schulz
ec46c46787 Reduce impact of revision day/month/year variables on edit stashing
Change-Id: I0ddebdfa8a13844ab003aad577624e89daba7d6b
2018-05-20 08:11:22 +00:00
Kunal Mehta
230958d97c Autofix MediaWiki.Commenting.FunctionComment.SpacingDoc* errors
Change-Id: I63761ebce04c03b9b13237919c27cc10180f198f
2018-05-19 14:07:03 -07:00
Arlo Breault
5970ecec86 parser: Don't unnecessarily add and remove a pipe
Change-Id: I884ab88f9e8ac6f402cd4b3a54e33ccbd30637a2
2018-05-16 16:39:48 +00:00
Umherirrender
52338150c8 Fix return type for html strings
Change-Id: Ifc1ae7740ad1b130186b4b970d3d84651b016177
2018-04-06 13:07:01 +02:00
Subramanya Sastry
87c7ccd9bc Fix whitespace trimming in headings
* b3dd3881 was trimming whitespace in wikitext as well as HTML headings
  whereas the whitespace-trimming proposal was going to leave HTML tags
  untouched.

* 30495ea1 missed this because coincidentally, the test I added there
  for HTML headings had a typo and used <h2>...<h2> instead of
  <h2>...</h2> which caused the test to magically pass.

* This patch trims whitespace in
  doHeadings (which deals with wikitext headings) instead of
  formatHeadings (which deals with all headings).

* Updated parser tests to account for this.

Change-Id: I854f20b4c39a0a8e03d70155b269de77acf02cae
2018-03-23 11:42:01 -05:00
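
The distinction drawn in 87c7ccd9bc, trimming wikitext headings while leaving HTML headings alone, can be illustrated with a small sketch. This is a simplified assumption of how such a split might look, not the parser's doHeadings code; the function name and regular expression are hypothetical.

```python
import re

# Matches a whole line of the form `== text ==` with 1 to 6 equals signs.
WIKITEXT_HEADING = re.compile(r"^(={1,6})(.+?)\1[ \t]*$", re.MULTILINE)


def trim_wikitext_headings(text: str) -> str:
    """Convert wikitext headings to HTML, trimming whitespace inside them.

    Literal <hN> tags never match the pattern, so their inner whitespace
    is preserved exactly as written.
    """
    def render(match):
        level = len(match.group(1))
        return "<h%d>%s</h%d>" % (level, match.group(2).strip(), level)

    return WIKITEXT_HEADING.sub(render, text)


sample = "==  Foo  ==\n<h2>  Bar  </h2>"
print(trim_wikitext_headings(sample))
```

Doing the trim at the point where wikitext syntax is recognised, rather than in a later pass that sees every heading, is what keeps the HTML form untouched in this sketch.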
Subramanya Sastry
30495ea1f9 RFC T157418: Trim whitespace in table cells, list items, headings
* Matmarex had implemented this for wikitext headings in b3dd3881.
* This patch extends this to wikitext list items and wikitext table cells.
* Updated RELEASE NOTES.

tests/parser/parserTests.txt:
* All whitespace removed in output of list items, table cells, and
  headings. Removed corresponding whitespace in the input wikitext
  except for a few tests where the whitespace is significant "| +"
  or "| -", for example.
* Updated output of html/parsoid sections as well.
* Added new tests to spec white-space trimming behavior.

tests/phpunit/*:
* Fixed a few tests that used whitespace in list items and table cells.

Bug: T157418
Change-Id: I8ea34c7ab893c0c125c81d810feeb3c581e4bba1
2018-03-16 13:42:55 -05:00
C. Scott Ananian
65fcb7a945 Use class="free external" only on unbracketed URLs
The ability for URLs to be marked free even if they use bracketed syntax
but "sorta look free" (aka unbracketed) was added 13 years ago in
2d71cb3080 (r7074).

It seemed like a reasonable idea at the time: make printed output a little
prettier by marking "sorta free" URLs as free.  But this complicates the
semantics of wikitext, and introduces all sorts of strange corner cases,
for example:

  [http://example.com/&amp; http://example.com/&]

isn't marked as free, even though the parser output is:

  <a rel="nofollow" class="external text" href="http://example.com/&amp;">http://example.com/&amp;</a>

This functionality isn't actually needed: if you want the pretty printed
output of an unbracketed URL, then actually use an unbracketed URL.

In recent years we're more concerned with simplifying the semantics of
wikitext and eliminating corner cases, such that the content of our wikis
can be effectively archived.  The "effectively free" URLs are low-hanging
fruit in this quest.

Change-Id: I339e8698786c60c96a37a73443cb9a04362662c4
2018-03-07 00:20:09 -05:00
Tim Starling
f0247e05bd StripState testing and cleanup
* Added StripState unit tests
* Deprecated unmaintained "half-parsed" serialization experiment
* Renamed some variables for brevity and removed unused "prefix"

Change-Id: I838d7ac7f9a2189e13d39c6939dba5d70e74a6b7
2018-03-05 16:43:58 +11:00
Tim Starling
3dfda8c155 Limit total expansion size in StripState and improve limit handling
* Add a new limit to the parser which limits the size of the output
  generated by StripState. The relevant bug shows exponential blowup in
  output size.
* Remove the $prefix parameter from the StripState constructor. Used by
  no Gerrit-hosted extensions, hard-deprecated since 1.26.
* Convert the existing unstrip recursion depth limit to a normal parser
  limit with limit report row, warning and tracking category. Provide
  the same features in the new limit.
* Add an optional $parser parameter to the StripState constructor so
  that warnings and tracking categories can be added.

Bug: T187833
Change-Id: Ie5f6081177610dc7830de4a0a40705c0c8cb82f1
2018-03-05 05:16:04 +00:00
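
The two limits discussed in 3dfda8c155 can be pictured with a toy model. The class below is a hypothetical illustration, not MediaWiki's StripState: the marker format, the names, and the choice to raise an exception (rather than emit a warning and tracking category) are all assumptions made to keep the sketch short.

```python
import re


class ExpansionLimitExceeded(Exception):
    """Raised when unstripping would exceed a configured limit."""


class TinyStripState:
    # Hypothetical marker format: \x7fM<index>\x7f
    MARKER = re.compile(r"\x7fM(\d+)\x7f")

    def __init__(self, size_limit, depth_limit):
        self.items = []
        self.size_limit = size_limit
        self.depth_limit = depth_limit
        self.expanded_size = 0  # running total across all expansions

    def add(self, content):
        """Store content and return the marker that stands in for it."""
        self.items.append(content)
        return "\x7fM%d\x7f" % (len(self.items) - 1)

    def unstrip(self, text, depth=0):
        """Replace markers by their content, recursively, within limits."""
        if depth > self.depth_limit:
            raise ExpansionLimitExceeded("recursion depth limit exceeded")

        def expand(match):
            content = self.items[int(match.group(1))]
            self.expanded_size += len(content)
            if self.expanded_size > self.size_limit:
                raise ExpansionLimitExceeded("expansion size limit exceeded")
            return self.unstrip(content, depth + 1)

        return self.MARKER.sub(expand, text)


# Each stored item references the previous marker twice, so the fully
# expanded output doubles with every level of nesting.
state = TinyStripState(size_limit=1000, depth_limit=20)
marker = state.add("ab")
for _ in range(12):
    marker = state.add(marker + marker)
```

Here the nesting is only 13 levels deep, well under the depth limit, yet full expansion would produce 8192 characters from a handful of short items. A depth limit alone does not catch this kind of growth, which is the gap a total-size limit closes.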
Arlo Breault
ee1787dd51 Ensure abort link parsing on xmlish tag in link title position
This shouldn't be dependent on the current definition of legal title
chars and strip marker.

See the test "<nowiki> inside a link"

Change-Id: I0d87aca1bb0adf4ec5ac480e0373a65fcd150a72
2018-03-01 14:08:24 -05:00
Brad Jorsch
2791fb0861 Hard-deprecate ParserOutput stateful transform methods
This also removes all the in-core calls that had been kept for the
benefit of extensions, and causes them to not have any effect since
anything that had been calling them was already either a no-op or will
probably be broken now that nothing in core is setting or checking the
flags.

Change-Id: Id22c1a5a6d6a249debb14063ae3f8838d105b634
2018-02-13 12:28:36 -05:00
Umherirrender
3124a990a2 Use ::class to resolve class names in includes files
This helps to find renamed or misspelled classes earlier.
Phan will check the class names.

Change-Id: I07a925c2a9404b0865e8a8703864ded9d14aa769
2018-01-27 20:34:29 +01:00
Prateek Saxena
60a64e8912 Gallery: Use Parser::parseWidthParam() for gallery dimensions
Used by the `setWidths` and `setHeights` methods to make sure we are
using correct values.

Makes `parseWidthParam` static to be used in the gallery class.

Bug: T129372
Change-Id: I38b9ef0ea26e3748ad5d5458fadd2545f677ef93
2018-01-25 17:35:40 -05:00
jenkins-bot
a18476eab3 Merge "Remove @param comments that literally repeat what the code says"
2018-01-11 23:48:03 +00:00
Thiemo Mättig
ef470ebf7f Remove @param comments that literally repeat what the code says
These comments do not add anything. I argue they are worse than having
no comments, because I have to read them first to understand they
actually don't explain anything. Removing them makes room for actual
improvements in the future (if needed).

Change-Id: Iee70aad681b3385e9af282d5581c10addbb91ac4
2018-01-10 14:14:26 +01:00
Roan Kattouw
7f68220db6 Follow-up 6f07389ef2: fix variable name
Caused Notice: Undefined variable: text

Bug: T184123
Change-Id: I950a02134b145a2928af33995ca37a6965f265e4
2018-01-04 21:31:41 +00:00
Umherirrender
255d76f2a1 build: Updating mediawiki/mediawiki-codesniffer to 15.0.0
Clean up use of @codingStandardsIgnore
- @codingStandardsIgnoreFile -> phpcs:ignoreFile
- @codingStandardsIgnoreLine -> phpcs:ignore
- @codingStandardsIgnoreStart -> phpcs:disable
- @codingStandardsIgnoreEnd -> phpcs:enable

For phpcs:disable, the necessary sniffs are always provided.
Some start/end pairs are changed to a line ignore.

Change-Id: I92ef235849bcc349c69e53504e664a155dd162c8
2018-01-01 14:10:16 +01:00
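
The annotation mapping listed in 255d76f2a1 is a straight textual substitution, which can be sketched as follows. The helper name is hypothetical, and the sketch deliberately does not append sniff names after `phpcs:disable`, which the commit says was done by hand.

```python
# Old-style annotations and their replacements, as listed in the commit.
REPLACEMENTS = {
    "@codingStandardsIgnoreFile": "phpcs:ignoreFile",
    "@codingStandardsIgnoreLine": "phpcs:ignore",
    "@codingStandardsIgnoreStart": "phpcs:disable",
    "@codingStandardsIgnoreEnd": "phpcs:enable",
}


def migrate_annotations(source: str) -> str:
    """Replace each old annotation with its phpcs: counterpart."""
    for old, new in REPLACEMENTS.items():
        source = source.replace(old, new)
    return source


sample = "// @codingStandardsIgnoreStart\nfoo();\n// @codingStandardsIgnoreEnd"
print(migrate_annotations(sample))
```

None of the four old names is a prefix of another, so the order of replacement does not matter here.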
Kunal Mehta
37480222fb Parser: extract $title, follow-up 3d560be428
In the conversion away from extract(), the $title variable was missed. This
broke LabeledSectionTransclusion.

Change-Id: If4c140aedf16fc16a4ae2361f465798055748255
2017-12-30 18:50:06 +00:00
jenkins-bot
1a40e0cc86 Merge "Change php extract() to explicit code"
2017-12-28 09:44:59 +00:00
daniel
6af796f3e0 MCR: Deprecate and gut Revision class
This is a re-submission of I4f24e7fbb68.

As a first major step towards Multi-Content-Revisions (MCR),
this patch turns the Revision class into a legacy proxy for
the new RevisionRecord and RevisionStore classes.

Backwards compatibility is maintained for all but some
rare edge cases, like constructing a completely empty
Revision object.

For more information on MCR, see
<https://www.mediawiki.org/wiki/Requests_for_comment/Multi-Content_Revisions>.

NOTE: once this is merged, verify create/delete/restore cycle on beta,
      ideally with emulated replication lag.

Bug: T174025
Change-Id: Ia4c20a91e98df0b9b14b138eb4825c55e5200384
2017-12-21 18:08:54 +00:00
Daniel Kinzler
09bf4f5bb2 Revert "[MCR] Turn Revision into a proxy to new code."
This reverts commit 9dcc56b3c9.

With this patch applied, newly created revisions are sometimes not found
just after submitting an edit, until replicas have caught up.

Our best theory is that it somehow interferes with ChronologyProtector,
but we don't have a good idea how.

Also, as legoktm mentioned, the commit message is terrible and needs fixing.

Change-Id: Idf3404f3fa8f8d08a7fb2ab8268726e2c1edecfe
2017-12-19 12:38:48 +00:00
jenkins-bot
3d95da4952 Merge "Require indentation of CASE statements in PHP code"
2017-12-19 12:21:59 +00:00
daniel
9dcc56b3c9 [MCR] Turn Revision into a proxy to new code.
Change-Id: I4f24e7fbb683cb51f3fd8b250732bae9c7541ba2
2017-12-18 14:37:29 +00:00
jenkins-bot
47818f1b44 Merge "Split limit report out of Parser::parse()"
2017-12-15 05:04:01 +00:00
jenkins-bot
3844fd9d63 Merge "Parser: Add guessSectionNameFromStrippedText() and refactor"
2017-12-12 13:10:55 +00:00
Huji Lee
e74bfe13f6 Require indentation of CASE statements in PHP code
Bug: T182546
Change-Id: I91a9555893a08e4ec58da97c6cc4d1e70000ff6b
2017-12-10 22:07:50 -05:00
Phantom42
6c3a9662b2 Add quotes to comment based strip markers
Bug: T180159
Change-Id: Ic9dbb8ef3948fe751d16c3963769b616b5db2fc7
2017-12-08 17:00:26 +02:00
Umherirrender
3d560be428 Change php extract() to explicit code
Avoid php magic and make var settings more visible

Change-Id: I223874fd871104b0ac6a80d7f39c6dd997d0551d
2017-12-08 14:46:33 +01:00
Tim Starling
6a2a43f285 Split limit report out of Parser::parse()
It was 100 lines. Also update a few nearby comments. The one about just
handling <nowiki> sections was actually written by Lee, and is
hilariously outdated now.

Change-Id: I12ee2a7e488a3c787b36d3a457c6166bbbb46aff
2017-12-08 16:33:05 +11:00
Roan Kattouw
6f07389ef2 Parser: Add guessSectionNameFromStrippedText() and refactor
Split up guessSectionNameFromWikiText() into pieces to reduce code
duplication, and provide guessSectionNameFromStrippedText() which
doesn't do link stripping.

Really these should be named guessSection*ANCHOR*From... because they
return an anchor (with encoding and a '#' prefix) instead of a section
name, but I didn't want to rename the existing one.

Also make normalizeSectionName static (it doesn't use $this) so that
guessSectionNameFromStrippedText() can be static as well.

Change-Id: I56b9dda805a51517549c5ed709f4bd747ca04577
2017-12-07 10:22:45 -08:00
Max Semenik
129067c907 Remove nbsp and similar characters from section IDs
Bug: T90902
Change-Id: I71bdb7dd43c3e532287290e3c691d9739da45475
2017-11-02 19:35:11 -07:00
Santhosh Thottingal
f07b32a7dd Parser: Disable commafy for magic variables for month and day
In Parser#getVariableValue for the following magic variables
Language#formatNum was called without commafy parameter:

currentmonth, currentmonth1, currentday, currentday2, localmonth,
localmonth1, localday, localday2

The default value for formatNum's nocommafy parameter is false, meaning
formatNum will do commafication. For the above context, commafy is not
needed since the passed values are often month values like 02, 03 etc.
Commafy is a no-op on these values.

Explicitly pass a true value for formatNum's nocommafy argument.
The Language#formatNum method documentation for nocommafy also recommends
setting it to true in case of dates.

Change-Id: I3233d5458af8cef583e5d1d599d9408542ba08c9
2017-10-16 08:35:26 +00:00
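
The effect of the nocommafy flag described in f07b32a7dd can be shown with a toy formatter. This is a hypothetical stand-in that only does English-style digit grouping on plain digit strings; it is not Language::formatNum, which also handles locale-specific digits and separators.

```python
import re


def format_num(digits: str, nocommafy: bool = False) -> str:
    """Insert thousands separators unless nocommafy is set."""
    if nocommafy:
        return digits
    # Insert a comma at every position that is preceded by a digit and
    # followed by a whole number of three-digit groups up to the end.
    return re.sub(r"(?<=\d)(?=(?:\d{3})+$)", ",", digits)


print(format_num("02"))          # two digits: grouping has nothing to do
print(format_num("2017"))        # four digits: a separator is inserted
print(format_num("2017", True))  # grouping explicitly disabled
```

For two-digit month and day values the grouping step changes nothing, so disabling it mainly documents the intent and avoids running the grouping step at all.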
jenkins-bot
079d61fb79 Merge "Remove "only newlines in trailer" special case for category/language links"
2017-09-29 22:20:52 +00:00
jenkins-bot
a5b41a26b6 Merge "Fix link prefix/suffixes around Category and Language links (take 2)."
2017-09-19 16:59:58 +00:00
Fomafix
b6c895ddc5 Do not double decode HTML entities for IDs
* in links (T103714)
* in indicators (T104196)

This change removes the automatic Sanitizer::decodeCharReferences from
Sanitizer::escapeId and Sanitizer::escapeIdInternal. Where decoding of
HTML entities is wanted, an explicit call to
Sanitizer::decodeCharReferences is added.

Explicitly decode HTML entities in non-local autocomments. (T104311)

Bug: T103714
Bug: T104196
Bug: T104311
Change-Id: I88e8e2077e6f5eec2b232391f7818370894a62dc
2017-09-12 15:42:17 +02:00
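
The double-decoding problem addressed by b6c895ddc5 is easy to reproduce in isolation. The sketch below uses Python's standard `html.unescape` as a stand-in for Sanitizer::decodeCharReferences; the sample string is an assumption chosen to make the difference visible.

```python
import html

# A string in which the author wants a literal "&lt;" to appear,
# so the ampersand itself has been escaped once.
source = "&amp;lt;"

decoded_once = html.unescape(source)         # the intended result
decoded_twice = html.unescape(decoded_once)  # what an extra implicit decode does

print(decoded_once)
print(decoded_twice)
```

When a helper decodes implicitly and its caller has already decoded, the second pass turns escaped text into a different character, which is why the commit moves decoding to explicit call sites.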
jenkins-bot
2480aae0c9 Merge "Show a warning in edit preview when a template loop is detected"
2017-09-11 18:26:11 +00:00
C. Scott Ananian
5676481c6d Remove "only newlines in trailer" special case for category/language links
This special case complicates wikitext semantics and ought to be
unnecessary.  Parsoid doesn't include this special case; if this patch
to the PHP parser isn't merged, we should write one for Parsoid to
implement the missing special case logic.

Bug: T175416
Change-Id: I3865c51b21de9d63ac5d06dcc3a3fa9108129d6c
2017-09-08 23:44:42 -04:00
C. Scott Ananian
6d5fd8077f Fix link prefix/suffixes around Category and Language links (take 2).
Previous attempt was I943cd9bec0855d9a326b0b50739d686a29995370, reverted in
e687f2da3e due to T174639.

There's still a weird behavior with newline stripping between links, which
I'll try to tackle in a follow-on patch (T175416).

Bug: T2087
Bug: T10897
Bug: T87753
Bug: T174639
Change-Id: I8228cdd3b80faf899000adb511a983edc454bc76
2017-09-08 16:12:21 -04:00