Commit graph

717 commits

Author SHA1 Message Date
Brad Jorsch
8eeb906f93 Break accidental references in Parser::__clone
If you have a reference *to* an object field (anywhere in the call
stack) when you clone the object, the field will be cloned as a
reference rather than as a value.

So we have to break those unexpected references in the cloned object
manually, which is easy enough by making a non-reference copy and then
rebinding the cloned object's reference to this copy.

Bug: 56226
Change-Id: I9c600e9c0845b4fde0366126ce3809d74e2240b4
2014-09-22 13:44:49 -04:00
Brad Jorsch
e2c9d4dfa9 Improve/rename Parser::replaceUnusualEscapes
The previous implementation would unescape '&', '=', '+', and '%'. The
first three will break the URL when unescaped in the query string, and
the last will break when unescaped anywhere.

The code is now changed to treat the path, query, and fragment parts of
the URL separately when unescaping. We also escape any unsafe characters
and ensure all percent-encodings use uppercase hexits.

And since the old name is no longer accurate,
Parser::replaceUnusualEscapes is deprecated in favor of
Parser::normalizeLinkUrl.

Bug: 57909
Change-Id: I77dc308d0d016c395ad737c08cf10a7711e25bbd
2014-09-16 23:00:16 +00:00
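A rough illustration of the per-part unescaping described in e2c9d4dfa9, written in Python rather than MediaWiki's PHP. The names `normalize_link_url` and `_normalize_part`, and the sets of characters treated as safe to decode, are assumptions made for this sketch and not the actual Parser::normalizeLinkUrl rules. Escaping of unsafe literal characters is left out.

```python
import re
import string

# Unreserved characters (RFC 3986) are decoded in every part of the URL.
UNRESERVED = set(string.ascii_letters + string.digits + "-._~")

# Assumption for this sketch: '&', '=' and '+' may be decoded in the path and
# fragment, but never in the query string, where they act as delimiters.
# '%' is in none of the sets, so "%25" is never decoded.
EXTRA_SAFE = {
    "path": set("&=+"),
    "query": set(),
    "fragment": set("&=+"),
}


def _normalize_part(part, kind):
    """Decode safe escapes in one URL part and uppercase the remaining hexits."""
    allowed = UNRESERVED | EXTRA_SAFE[kind]

    def fix(match):
        hexits = match.group(1)
        char = chr(int(hexits, 16))
        if char in allowed:
            return char
        return "%" + hexits.upper()

    return re.sub(r"%([0-9A-Fa-f]{2})", fix, part)


def normalize_link_url(url):
    """Treat path, query, and fragment separately when unescaping."""
    head, hash_sep, fragment = url.partition("#")
    path, query_sep, query = head.partition("?")
    return (
        _normalize_part(path, "path")
        + query_sep + _normalize_part(query, "query")
        + hash_sep + _normalize_part(fragment, "fragment")
    )
```

With these assumed rules, an escaped '&' is decoded in the path but stays as `%26` in the query string, and an escaped '%' is left alone everywhere.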
umherirrender
896f835ea9 Refactor: Use local variables for editsections in Parser
In Parser.php an array was built and then only its elements were used;
this is replaced by local vars.

In ParserOutput.php also use local vars to make the code more readable.
Also inlined a private callback by using an anonymous function.

Change-Id: I1c31c9e4855f93a8fb65e1c21faba46fcdcb1f4b
2014-09-05 13:33:05 +00:00
This, that and the other
fb7e8b876a Fix URL protocol detection regex for file link= parameter
This regex looked something like /^(?i)bitcoin:|ftp://|ftps://|.../, which
meant the anchoring ^ only applied to the first name. This meant that any
link= value that happened to contain a URL protocol anywhere within it
(e.g. wikinews:Foo containing "news:") got incorrectly matched by this
regex.

Bug: 69317
Change-Id: Ide1c4f64137666db99f8e3b6816df01ef5099c8e
2014-08-16 22:09:42 +10:00
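The anchoring problem described in fb7e8b876a can be reproduced with a small Python sketch. The protocol list here is a shortened, assumed sample, and the variable names are illustrative only; the point is that `^` binds to the first alternative unless the alternation is grouped.

```python
import re

# Assumed sample of URL protocols; the real list is longer.
PROTOCOLS = ["bitcoin:", "ftp://", "ftps://", "news:"]
alternation = "|".join(re.escape(p) for p in PROTOCOLS)

# Buggy form: ^ applies only to "bitcoin:", so the other protocols
# can match anywhere inside the value.
buggy = re.compile("^" + alternation, re.IGNORECASE)

# Fixed form: the group makes ^ apply to every alternative.
fixed = re.compile("^(?:" + alternation + ")", re.IGNORECASE)


def looks_like_url(pattern, value):
    """Return True if the pattern finds a protocol in the link= value."""
    return pattern.search(value) is not None
```

With the buggy pattern, `wikinews:Foo` is treated as a URL because it contains `news:`; with the grouped pattern it is not, while `news:foo` still is.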
addshore
61c989cfc0 Fix phpcs issues in parser
This fixes all issues except for:
 - class names
 - line length

Change-Id: Ie91b010d5b3eec49d3b80b6e93b125a901ef43c6
2014-08-12 01:00:15 +00:00
jenkins-bot
bfc3710111 Merge "Don't include images/categories when behind a local interwiki prefix" 2014-08-09 11:51:07 +00:00
umherirrender
c332e33c2b Doc: Parser::getTargetLanguage cannot return null
Change-Id: I979d3d5010dc3d0ada3d82ca6d9546c5e800aaec
2014-08-08 21:03:46 +02:00
This, that and the other
9883b2471c Don't include images/categories when behind a local interwiki prefix
This solution is somewhat imperfect, as the logic being added here to
MediaWikiTitleCodec really belongs in the parser. However, given the
current state of this code, this is the cleanest possible solution at
the moment.

Modified the existing release note for this.

Bug: 68802
Change-Id: I38309186bdcad23f49e23beb26daaf3ef5bceea1
2014-08-01 18:20:51 +10:00
umherirrender
dd8921c9d9 Cleanup some docs (includes/[m-r])
- Swap "$variable type" to "type $variable"
- Added missing types
- Fixed spacing inside docs
- Makes beginning of @param/@return/@var/@throws in capital
- Changed some types to match the more common spelling

Change-Id: I8ebfbcea0e2ae2670553822acedde49c1aa7e98d
2014-07-24 19:43:25 +02:00
This, that and the other
e349358a5d No interlanguage links after local interwiki prefixes
This was noticed on enwiki after w: was marked as a local interwiki prefix
there. Links like [[w:de:Foo]] ought to act like [[:de:Foo]], not
[[de:Foo]].

Also adding a number of additional parser tests related to interwiki links.

Bug: 68085
Change-Id: If39af06edb4af2da85c9bcf43df7088181809fcf
2014-07-22 15:01:07 +02:00
umherirrender
de39f3e019 Use some callable hints on @param docs
Callbacks can be given as a string or array, so the hint 'callable' is
used.

Change-Id: I3842606f74c8c3705dffc70bf13e31f44a37fa65
2014-07-03 21:20:35 +02:00
Max Semenik
467f4affd1 New hook, AfterParserFetchFileAndTitle
It is needed for PageImages to collect information about galleries, improving results
for Commons mainspace.

Bug: 66510
Change-Id: I3136d648ef2c1841767db0ab33855cd168e3de3e
2014-07-01 17:40:11 -07:00
Jackmcbarn
c313a75c80 Support {{!}} as a magic word
Add {{!}} as a magic word that expands to a pipe. Parsoid already does
this, so we know it isn't going to cause major breakage.

Change-Id: I1f857417d224d6443504074a5add852df3975b89
2014-06-26 14:56:04 -07:00
jenkins-bot
ddeadfc49b Merge "Prevent OutputPage::addWikiText and friends from causing UNIQ fails" 2014-06-26 09:25:19 +00:00
Brian Wolff
4e6b0e4f4d Prevent OutputPage::addWikiText and friends from causing UNIQ fails
If you transclude a special page, OutputPage::addWikiText can cause
problems. This prevents that from happening, by using a new object
if currently in a parsing operation.

Bug: 14562
Bug: 65826
Change-Id: I7c38fa9e2fbd270e45f73f522612451e77ab8cbb
2014-06-25 15:16:14 -03:00
Brian Wolff
d7d8458bc0 Allow fragments in link= parameter in <gallery> tags.
This brings the image syntax in gallery tags in line with normal
syntax. Handle <gallery>File:foo.png|link=bar#baz</gallery>
properly.

Bug: 62343
Change-Id: If6149ccc19f70605ad4481e4da2ca55676d6001d
2014-06-23 19:45:31 -03:00
jenkins-bot
2da03f8806 Merge "Allow interlanguage link prefixes that are not language codes" 2014-06-20 15:19:32 +00:00
This, that and the other
7665f7d767 Allow interlanguage link prefixes that are not language codes
$wgExtraInterlanguageLinkPrefixes holds a list of interwiki prefixes to be
treated as language codes if $wgInterwikiMagic is true.

To set the display text for the interlanguage links generated by this
code, you need to create MediaWiki:Interlanguage-link-foo, where "foo" is
the interwiki prefix.  To provide a friendly site name for the link title
text, use MediaWiki:Interlanguage-link-sitename-foo.  On the WMF cluster,
these messages could be set using the WikimediaMessages extension.

Information about extra language links (in the site language only) is
provided via the API in meta=siteinfo&prop=interwikimap.

Bug: 32189
Change-Id: I3d04760e2d9fb3320bb71e3d5ad115eed54a899c
2014-06-20 11:29:05 +10:00
Thiemo Mättig
f6cff5e392 Update documentation of what a "section" is
There are so many slightly different understandings of what a
"section" is or can be. I'm aware the documentation was improved
just a few weeks ago. I still find it incomplete and confusing.

1. I renamed it to $sectionId to make it more clear what it
really is.

2. Sections are usually numbers. 0, 1 and so on. There is no
reason to disallow the use of ints or even floats (this works
because the string representation of 0.0 is "0"). The code never
disallowed numbers.

3. 'T1' never was supported, as far as I can tell. 'T-1' is
supported. See Parser::extractSections().

4. null and false and '' all mean "the whole page" in
WikiPage::replaceSectionAtRev() but for some reason this meaning got
lost in WikitextContent::replaceSection(). I made it the same again.

Change-Id: Icc3997722d2ed742bf7703cd7c06d09199225720
2014-06-12 18:13:23 +02:00
jenkins-bot
93405c852c Merge "Update list item newline handling to follow Parsoid's model" 2014-06-09 18:13:14 +00:00
Gabriel Wicke
b33b5d5840 Update list item newline handling to follow Parsoid's model
This improves on commit 34bd573144 by matching
Parsoid's newline handling in the PHP parser. It is the outcome of a
discussion with Erwin, where we agreed that

* foo
* bar

should produce

<ul><li>foo</li>
<li>bar</li></ul>

See the discussion in https://gerrit.wikimedia.org/r/#/c/94443/

The original rendering issue this tried to address is no longer present after
a change to the template. The pure CSS solution is now working.

Bug: 39617
Bug: 56809
Change-Id: Ib7aa9449bbd994cb23b83b3f23cff944b1cddadf
2014-06-09 11:01:52 -07:00
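The output agreed on in b33b5d5840 can be reproduced with a toy renderer. This is a Python sketch for illustration only: `render_bullet_list` is a hypothetical name, and it handles nothing beyond single-level `*` items.

```python
def render_bullet_list(wikitext):
    """Render single-level '*' list items, one <li> per output line."""
    items = [
        line[1:].strip()
        for line in wikitext.splitlines()
        if line.startswith("*")
    ]
    # A newline follows each closing </li> except the last one, which is
    # immediately followed by </ul>.
    return "<ul>" + "\n".join("<li>" + item + "</li>" for item in items) + "</ul>"
```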
Brad Jorsch
d18ba4e9df Add PPFrame::isVolatile and PPFrame::setVolatile
Most wikitext is safe to parse once and then cache for when that same
wikitext is used again, such as for multiple transclusions of the same
template within a page. There are occasions, though, where some piece of
wikitext has side effects and so should not be cached; a prominent
example of such wikitext is the <ref> and <references> tags in Cite.php.

This change adds PPFrame::setVolatile so parser hooks such as <ref> and
<references> can indicate that they have done something that should not
be cached, and PPFrame::isVolatile so that callers of PPFrame::expand
can know when to avoid caching.

Bug: 46815
Bug: 31834
Change-Id: I95b3cf8781cf047cdb63da221cef45f3e7d1632e
2014-05-30 14:07:06 -04:00
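The idea behind d18ba4e9df can be sketched in a few lines of Python. `Frame` and `expand_cached` are hypothetical stand-ins and not the PPFrame API; the sketch only shows a caller skipping the cache when an expansion has marked its frame volatile.

```python
class Frame:
    """Minimal stand-in for a preprocessor frame with a volatile flag."""

    def __init__(self):
        self._volatile = False

    def set_volatile(self, flag=True):
        self._volatile = flag

    def is_volatile(self):
        return self._volatile


def expand_cached(cache, key, expand):
    """Expand once and cache the result, unless the expansion was volatile.

    `expand` is a callable taking a Frame; a hook with side effects is
    expected to call frame.set_volatile() during expansion.
    """
    if key in cache:
        return cache[key]
    frame = Frame()
    result = expand(frame)
    if not frame.is_volatile():
        cache[key] = result
    return result
```

A hook that numbers footnotes would call `frame.set_volatile()`, so each use is expanded afresh, while plain wikitext is expanded once and then served from the cache.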
Jackmcbarn
2094e578b4 Restrict empty-frame cache entries to their parent
Remove the parser's global $mTplExpandCache, and replace it with an
alternative that is separated by parent frame. This allows the integrity
of the empty-frame expansion cache to be maintained while also allowing
parent frame access.

A page with 3 copies of 
http://ja.wikipedia.org/wiki/%E4%B8%AD%E5%A4%AE%E7%B7%9A_(%E9%9F%93%E5%9B%BD) 
has the following statistics: Without this change, there are 4625 cache hits
on this page, and a sample of 3 parses took 16.6, 16.9, and 16.8 seconds.
With this change, there are 2588 cache hits, and a sample of 3 parses took
16.7, 16.7, and 17.0 seconds.

Change-Id: I621e9075e0f136ac188a4d2f53418b7cc957408d
2014-05-30 01:38:15 +00:00
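A minimal Python sketch of the per-parent cache described in 2094e578b4. `Frame`, `child_cache`, and `expand_empty_frame` are hypothetical names, not the parser's own; the sketch assumes that an argument-less template's output may still depend on its parent frame, which is why a single global cache would be unsafe.

```python
class Frame:
    """Stand-in for a parent frame that owns its children's cache."""

    def __init__(self, args):
        self.args = args
        # Expansions of argument-less templates, valid only under this frame.
        self.child_cache = {}


def expand_empty_frame(parent, title, expand):
    """Expand a template called without arguments, caching per parent frame."""
    if title not in parent.child_cache:
        parent.child_cache[title] = expand(parent, title)
    return parent.child_cache[title]
```

Because each parent keeps its own entries, the same template is expanded once per parent rather than once per page, which is consistent with the lower cache-hit count reported in the commit message.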
umherirrender
48cd71a339 Fix @since of Parser::stripOuterParagraph
Was merged after release branch.

Follow-Up: I6bb3597898324556df912a23a7ffc9ff250b8f58
Change-Id: Idab16dc1e322ede31f6688236fddae5365ac133c
2014-05-16 19:50:30 +02:00
Ori.livneh
df983f6642 Revert "Declare visibility on class properties of includes/parser/"
See https://bugzilla.wikimedia.org/65375#c4

This reverts commit f359cdf614.

Bug: 65375
Change-Id: I12a60b5cc52a07a6deabcbf47c7c99cd2faac3c3
2014-05-16 00:52:24 +00:00
Bartosz Dziewoński
c3aa5ef597 Create Parser::stripOuterParagraph to avoid code duplication
We've had the logic for stripping the outer <p/> element in three
separate places. The version in OutputPage was missing the '$' at the
end of the regex, that was most likely a mistake caused by the
duplication.

Also, extend the logic in order not to generate invalid HTML if the
input contains more than one <p/> tag. Added tests for this and the
previous behaviour.

https://www.mail-archive.com/mediawiki-api@lists.wikimedia.org/msg03188.html

Change-Id: I6bb3597898324556df912a23a7ffc9ff250b8f58
2014-05-15 12:20:19 -04:00
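The behaviour described in c3aa5ef597 can be approximated in Python as follows. This is a sketch under assumptions, not the PHP code: `strip_outer_paragraph` and `OUTER_P` are illustrative names, and `fullmatch` plays the role of the `^...$` anchoring whose missing `$` the commit mentions.

```python
import re

# One <p>...</p> wrapper, optionally followed by a newline before and after
# the closing tag. DOTALL lets the content span several lines.
OUTER_P = re.compile(r"<p>(.*?)\n?</p>\n?", re.DOTALL)


def strip_outer_paragraph(html):
    """Strip the outer <p> element, but only if it is the sole paragraph."""
    match = OUTER_P.fullmatch(html)
    # If the captured content itself contains </p>, the input held more than
    # one paragraph and stripping would produce invalid HTML.
    if match and "</p>" not in match.group(1):
        return match.group(1)
    return html
```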
Siebrand Mazeland
90254361a2 Change visibility of some methods in Parser and update docs accordingly
Change-Id: Ibe9d817325b4abafe137cd3f2fc6ccc25740cf58
2014-05-11 16:28:07 +00:00
Siebrand Mazeland
dfc7416fbe Various documentation updates for includes/parser/
Change-Id: I16dd3a792cc83f8c80b3652d42c055730f6d177a
2014-05-11 18:18:26 +02:00
Siebrand Mazeland
2527cca6de Fix most CodeSniffer issues in includes/parser/
Remaining are the classes containing underscores and possibly a few other
issues that will be addressed soonish.

Change-Id: Icf56374c71afc134420ebbcfecf12dcb29dc9564
2014-05-11 08:44:52 +00:00
Siebrand Mazeland
f359cdf614 Declare visibility on class properties of includes/parser/
Change-Id: If03a9bd5eb83be4d15f54e73f49f42540fb7d5fc
2014-05-11 02:25:00 +02:00
jenkins-bot
ac971d9a06 Merge "Add $wgServerName" 2014-05-09 10:18:09 +00:00
Ori Livneh
72c0ce43a8 Add $wgServerName
This partially reverts r73950 which removed $wgServerName on the grounds that it
was only used for {{SERVERNAME}}. When it was pointed out that $wgServerName was
also used by several extensions, the response was not to restore the variable, but
to proceed to remove it from extensions as well.

It is a useful variable to have, as the discussion on Id819246a9 makes clear
(see Tim's comment on PS12 and Timo's reply). So let's reintroduce it, and expose
it in mw.config and ApiQuerySiteInfo as well.

Change-Id: I40a6fd427d38c64c628f70a2f407b145443ea204
2014-05-09 11:53:56 +02:00
Brian Wolff
5a81ad0e8a Throw an error if calling parser recursively
People accidentally (or sometimes intentionally) calling the
parser recursively has been a major source of bugs over the
years. I think it's much better to fail suddenly, instead
of having unclear signs like UNIQ's all over the place.

Change-Id: I0e42aa69835c15a5df7aecb0dc5c3dec946bdf6a
2014-05-09 09:53:21 +02:00
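A fail-fast re-entrancy guard of the kind 5a81ad0e8a describes might look like this in Python. `ToyParser` and its `hook` parameter are hypothetical; the hook stands in for any extension code that runs in the middle of a parse.

```python
class ToyParser:
    """Toy parser that refuses to be re-entered while a parse is running."""

    def __init__(self):
        self._in_parse = False

    def parse(self, text, hook=None):
        if self._in_parse:
            # Fail loudly instead of producing corrupted output later.
            raise RuntimeError("ToyParser.parse() called recursively")
        self._in_parse = True
        try:
            if hook is not None:
                hook(self)  # stands in for an extension hook invoked mid-parse
            return text.strip()
        finally:
            # Always release the guard so the parser stays usable after errors.
            self._in_parse = False
```

A hook that calls `parser.parse()` on the same instance raises immediately, and the `finally` clause leaves the parser usable for the next, non-recursive call.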
umherirrender
7f9fd63901 Fixed some @params documentation (includes/parser)
Swapped some "$var type" to "type $var" or added missing types
before the $var. Changed some other types to match the more common
spelling. Makes beginning of some text in capital.
Also added some missing @param.

Change-Id: I49f8f48b521878de7abd9cc40efdeff6cf9a37e0
2014-04-22 01:38:39 +02:00
umherirrender
b9cd789fce docs: closure -> Closure; callback -> callable
Changed closure to capital word Closure in doc and type hint,
also changed callback in docs to callable

Change-Id: I52c8e8f13d38a837052101c38b9986be780ca057
2014-04-19 08:43:31 +02:00
jenkins-bot
91372c2225 Merge "Remove deprecated function mw.util.toggleToc" 2014-04-17 15:32:28 +00:00
Fomafix
a54ef1a203 Remove deprecated function mw.util.toggleToc
* Remove dependency from mediawiki.util to mediawiki.toc.
* Load module mediawiki.toc only when toc is existent.

Gadgets that use the messages "showtoc" or "hidetoc" should explicitly
load the module mediawiki.toc or use their own messages.

Follows-up I3ca2acb70db98d00e3f1b (implements mediawiki.toc).

Change-Id: If0438b7b6f4649434e2b83133d6f583f2f8eff16
2014-04-17 17:23:43 +02:00
Chad Horohoe
61a854fadb Remove FakeTitle
This doesn't seem to be used anywhere anymore and it's an awful class

Change-Id: Ie9047a346e410099c3082725ced83818846e95c2
2014-04-17 14:51:32 +00:00
jenkins-bot
85d4e39ff0 Merge "Handle conflicting image format options in predictable way." 2014-04-15 16:50:38 +00:00
umherirrender
725d9d125d Removed unneeded spaces and colons in @param and friends
Also swapped some "$var type" to "type $var" or added missing types
before the $var. Changed some other types to match the more common
spelling. Makes beginning of some text in capital.

Change-Id: Ic36c8c7820a6c2d603f1138130670c6bf6a1ca59
2014-04-08 16:02:49 +00:00
Jackmcbarn
730c2c01a8 Allow passing parameters to preload
When pages are loaded in the edit box via preload, allow parameter
substitution. The interface-style $1 is used rather than the
template-style {{{1}}} to avoid conflicts with preloads that add template
parameters. Syntax is:
action=edit&preload=Foo&preloadparams[]=first&preloadparams[]=second

Bug: 12853
Change-Id: If02cf4b3dba9f9d22a956d8bfff224677cbce00d
2014-04-06 21:03:02 -04:00
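The substitution described in 730c2c01a8 can be sketched in Python. `apply_preload_params` is a hypothetical helper, and leaving out-of-range placeholders untouched is an assumption of this sketch; the query string is the one given in the commit message.

```python
import re
from urllib.parse import parse_qs


def apply_preload_params(text, params):
    """Replace interface-style $1, $2, ... with the given parameters.

    Template-style {{{1}}} is deliberately not touched.
    """
    def substitute(match):
        index = int(match.group(1)) - 1
        if 0 <= index < len(params):
            return params[index]
        return match.group(0)  # assumption: unknown placeholders stay as-is

    return re.sub(r"\$(\d+)", substitute, text)


# Repeated preloadparams[] keys arrive as a list, in order.
query = "action=edit&preload=Foo&preloadparams[]=first&preloadparams[]=second"
params = parse_qs(query).get("preloadparams[]", [])
```

Because only `$n` is substituted, a preloaded page can still contain `{{{1}}}` for templates it adds without the two syntaxes colliding.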
umherirrender
23fab68274 Fix spacing after @param and friends in comments
Searched for:
\@(param|return|throws|since|deprecated|access|todo|var)[ \t]{2,}

Change-Id: Icce22ba9fe0635455691ca58d9872d618151f346
2014-04-05 20:02:29 +00:00
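The search pattern quoted in 23fab68274 can be tried out in Python. The commit message gives only the search expression, so replacing the match with a single space is an assumption of this sketch, and `fix_tag_spacing` is a hypothetical name.

```python
import re

# The pattern from the commit message: a doc tag followed by two or more
# spaces or tabs.
DOC_TAG_SPACING = re.compile(
    r"@(param|return|throws|since|deprecated|access|todo|var)[ \t]{2,}"
)


def fix_tag_spacing(comment):
    """Collapse the run of whitespace after a doc tag to one space."""
    return DOC_TAG_SPACING.sub(r"@\1 ", comment)
```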
jenkins-bot
ed8668b925 Merge "Add missing line breaks to wfDebug() calls" 2014-03-31 11:50:41 +00:00
Brian Wolff
c00fd14c0e Follow-up I7d4bb90: Message tweak + add code comment
When I was saying it in my head, "No description available."
sounded better to me than "Description not available", although
perhaps that's just me.

Add a code comment to Parser::addTrackingCategory telling people
to register their tracking categories.

Change-Id: I17bb0cf4b3118dda8647e0607588685c2b5cdb86
2014-03-30 22:04:08 -03:00
Alexandre Emsenhuber
449ee32451 Add missing line breaks to wfDebug() calls
Also removed true as second parameter to it from CloneDatabase.php
since it is the default value of that parameter.

Change-Id: I727ebae2bd4df0e26019985ce8c7ce73381c5642
2014-03-29 11:52:07 +01:00
C. Scott Ananian
083ec382c4 Handle conflicting image format options in predictable way.
The PHP parser now uses the first image format option that appears,
and ignores subsequent format options.  This enforces the "zero or one"
language in
https://en.wikipedia.org/wiki/Wikipedia:Extended_image_syntax#Type
and makes parser behavior more predictable.  This also matches Parsoid
behavior.

Change-Id: Ifa32238b3d274123c7b98022cf688c33edfd7197
2014-03-18 14:21:33 -04:00
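The "first format option wins" rule from 083ec382c4 can be illustrated with a short Python sketch. `FORMAT_OPTIONS` is an assumed, incomplete set of format keywords and `split_image_options` is a hypothetical helper, not the parser's own code.

```python
# Assumption: a small illustrative subset of image format keywords.
FORMAT_OPTIONS = {"thumb", "thumbnail", "frame", "framed", "frameless"}


def split_image_options(options):
    """Return (chosen_format, other_options) for a list of image options.

    The first format option is kept; later format options are ignored.
    """
    chosen = None
    others = []
    for option in options:
        if option in FORMAT_OPTIONS:
            if chosen is None:
                chosen = option
            # A format option after the first one is dropped.
        else:
            others.append(option)
    return chosen, others
```

For the options `thumb`, `frame`, `200px`, the sketch selects `thumb` and keeps `200px` as a non-format option.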
umherirrender
7c314de876 Rename some local vars to start with a lowercase letter
Change-Id: I6e5975ed7351c1439eda19afaba5120c6afa50f1
2014-03-15 21:03:05 +00:00
umherirrender
047c86f26e Fix spacing between two functions
Added and removed some new lines to have one new line between two
functions

Change-Id: I1ccfbd575dd26b160396ef3d3e2e079f5cdbe196
2014-03-15 20:57:23 +00:00
Ladsgroup
a90f1a2d79 Changing URLs of mediawiki.org in scripts to the SSL-based website
http://www.mediawiki.org --> https://www.mediawiki.org

Part 3

Change-Id: Ica633881b1744fa2854f4b012b79dbf5a7e5e7e2
2014-03-13 22:28:14 +00:00
Chad Horohoe
4e2b1eef2c Actually make Parser::pstPass2() private
Nothing else uses this anywhere in SVN or Git as far as I can tell

Change-Id: I0ea0ebe5d11ab50fef455dd0239912e206606cd8
2014-02-05 13:29:01 -08:00