HTML doesn't allow certain semicolon-less HTML entities in attribute
values to avoid breaking legacy markup like:
<a href="http://example.com?foo&param=bar">...</a>
(Note that the & in that URL is not properly entity-escaped as `&amp;`.)
Unlike wikitext, HTML generally allows semicolon-less legacy entities
in text.
Our alt and link option processing shoves text through
Sanitizer::stripAllTags, which does entity decoding including these
legacy semicolon-less entities. Wikitext doesn't allow semicolon-less
entities, so escape & characters where appropriate to protect alt/link
options and avoid breaking URLs.
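A minimal sketch of the kind of escaping this means (illustrative only;
not the actual Parser code, and the exact condition is an assumption):
```
// Escape bare '&' characters that do not already start a
// semicolon-terminated entity, so a later entity-decoding pass cannot
// turn "&param" into "¶m".
function escapeBareAmpersands( $text ) {
	return preg_replace( '/&(?![A-Za-z0-9#]+;)/', '&amp;', $text );
}

// escapeBareAmpersands( 'http://example.com?foo&param=bar' )
//   => 'http://example.com?foo&amp;param=bar'
```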
This was a "regression" in how alt options were handled starting in
ddb4913f53 when we switched to using
Remex for Sanitizer::stripAllTags -- semicolon-less entities (previously
invalid in wikitext) were now being decoded when stripAllTags was
called on alt text. This change became a problem when
ad80f0bca2 sent link option text through
Sanitizer::stripAllTags (with the new semicolon-less entity decode)
instead of PHP's strip_tags (which, in addition to its other faults,
doesn't do entity decode at all). This suddenly started decoding
"non-wikitext" entities like `¶` inside URLs, breaking links.
Filed T210437 as a follow-up to consider changing the behavior
of Sanitizer::stripAllTags() globally to prevent it from decoding
semicolon-less entities for all callers.
Bug: T209236
Change-Id: I5925e110e335d83eafa9de935c4e06806322f4a9
This adds a method to LinkFilter that builds the query conditions
necessary to use it properly, and adjusts callers to use the new method.
This also takes the opportunity to clean up the calculation of el_index:
IPs are handled more sensibly and IDNs are canonicalized.
Also weird edge cases for invalid hosts like "http://.example.com" and
corresponding searches like "http://*..example.com" are now handled more
regularly instead of being treated as if the extra dot were omitted,
while explicit specification of the DNS root like "http://example.com./"
is canonicalized to the usual implicit specification.
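Roughly, the el_index form reverses the host labels so that wildcard
searches become index prefix scans; a simplified sketch (not the actual
LinkFilter code, and details such as the trailing dot are from memory):
```
// Simplified sketch, not the actual LinkFilter code: reverse the host
// labels so searches like "*.example.com" become index prefix scans.
function makeIndexHost( $host ) {
	// IP addresses are kept as-is (the real code canonicalizes them).
	if ( filter_var( $host, FILTER_VALIDATE_IP ) !== false ) {
		return $host;
	}
	// "www.example.com" => "com.example.www."
	return implode( '.', array_reverse( explode( '.', strtolower( $host ) ) ) ) . '.';
}
```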
Note that this patch will break link searches for links where the host
is an IP or IDN until refreshExternallinksIndex.php is run.
Bug: T59176
Bug: T130482
Change-Id: I84d224ef23de22dfe179009ec3a11fd0e4b5f56d
Future parsers will not support the output generated with tidy disabled.
Parser tests using untidied output will also be deprecated (and
rewritten) in a follow-up patch.
No new release notes necessary since user-visible tidy configuration
was deprecated previously (in 1.32), and individual methods which had
disabled tidy during execution were individually release-noted as they
were updated.
Bug: T198214
Depends-On: I0f417f75a49dfea873e9a2f44d81796a48b9f428
Depends-On: If5c619cdd3e7f786687cfc2ca166074d9197ca11
Change-Id: I592e0e0dfef7d929f05c60ffe4d60e09725b39cc
Previously, translatable SVGs were always displayed in the default
language unless forced explicitly in wikitext, e.g. [[File:Foo.svg|lang=ru]].
This change adds a feature flag that enables always trying to display
them in the page language.
* If enabled, Parser will pass a new parameter - 'pagelang' - to
the media handler.
* SvgHandler uses page language when determining what language to
render the image in.
* 'pagelang' can always be overridden by 'lang'.
* If no translation in the page language is available, the default
language (English) will be used for thumbnail URLs, to prevent
cluttering media storage and HTTP caches with useless copies (see
the sketch below).
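Roughly, the selection order described above looks like this (an
illustrative sketch; the function and parameter names other than
'lang'/'pagelang' are made up):
```
// Illustrative sketch of the selection order, not the actual SvgHandler code.
function pickRenderLanguage( array $params, array $availableTranslations ) {
	if ( isset( $params['lang'] ) ) {
		// An explicit lang= option always wins.
		return $params['lang'];
	}
	if ( isset( $params['pagelang'] ) &&
		in_array( $params['pagelang'], $availableTranslations, true )
	) {
		// Use the page language if a translation exists for it.
		return $params['pagelang'];
	}
	// Otherwise fall back to the default language for thumbnail URLs.
	return 'en';
}
```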
Performance: this requires accessing the image's metadata during parsing.
My testing indicates there was no code path where this wasn't already the
case, so no performance hit is expected; however, we should still keep an
eye on page save performance.
Bug: T205040
Change-Id: I348840ef405e1370cc0c17d69051bce30153c9c0
Use Parser::stripAltText() consistently to handle link and alt options
in both Parser::makeImage() and Parser::renderImageGallery(). This
ensures that link option text can use <nowiki> to escape problematic
text so that (for example) the following works:
```
[[File:Foobar.jpg|link=<nowiki>a''b''c</nowiki>|alt=<nowiki>a''b''c</nowiki>]]
<gallery>
File:Foobar.jpg|link=<nowiki>a''b''c</nowiki>|alt=<nowiki>a''b''c</nowiki>
</gallery>
```
Previously the handling of the link option in
Parser::renderImageGallery() used a bespoke `strip_tags` invocation
which didn't replace <nowiki>s, and the handling of the link option in
Parser::makeImage() didn't strip tags at all, nor did it replace
<nowiki>s. For example, in Parser::makeImage() wikitext italics markup
(doubled apostrophes, as in a''b''c) in titles would be converted to
embedded `<i>` tags before being passed to Parser::parseLinkParameter(),
with predictably poor results.
Tests added to confirm behavior of alt/link with HTML-escaped
entities and <nowiki>s exposed a bug in Remex: T207088. Tests
will fail on PHP 7.0 until that is fixed.
Bug: T206940
Depends-On: Ide67bba20f771868c0e119cb2874464dcf1d758a
Change-Id: Ife4c0edaa85e0cb294c5d4c1e31d5d7d828d9df4
This injects the new, unsaved RevisionRecord object into the Parser used
for Pre-Save Transform, and sets the user and timestamp on that revision,
to allow {{subst:REVISIONUSER}} and {{subst:REVISIONTIMESTAMP}} to function.
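Very roughly, the idea is the following (a hedged sketch; the actual
change wires this through the page update code, and the surrounding
variables are assumed):
```
// Sketch only: build an unsaved MediaWiki\Revision\MutableRevisionRecord
// carrying the saving user and timestamp, so {{subst:REVISIONUSER}} and
// {{subst:REVISIONTIMESTAMP}} resolve during the pre-save transform.
$rev = new MutableRevisionRecord( $title );
$rev->setContent( SlotRecord::MAIN, $content );
$rev->setUser( $user );
$rev->setTimestamp( wfTimestampNow() );
// The Parser doing the PST is then handed this $rev instead of trying to
// load a saved revision that does not exist yet.
```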
Bug: T203583
Change-Id: I31a97d0168ac22346b2dad6b88bf7f6f8a0dd9d0
This replaces the builtin taints that are removed in
Ic1e1983a51c. Additionally, parse will no longer warn about
double escaping - there are many situations where such warnings
are wrong (e.g. using Html::rawElement()). However, this also
means that Parser::parse( wfMessage( 'foo' )->parse() ); will
no longer give a double escaping warning, which is unfortunate.
Bug: T202380
Change-Id: Ia52d37411beb62b112c6ff102438063c3d750769
This is not strictly accurate, because Parser::internalParse() actually
returns half-parsed HTML, which is not safe for output. But it is safe for
output from a parser tag.
Maybe the phan-taint-check plugin needs to learn about half-parsed HTML as an
extra taint type, and make that an acceptable thing for parser tags to return,
but not other things.
But this fixes the failures for the Listings extension, so I think it's
worthwhile in the meantime.
Change-Id: Idf87f5c3dcf81dd210de73a4ff15e3b1aabd9f89
RevisionRenderer is the MCR replacement for Content::getParserOutput,
as outlined in <https://www.mediawiki.org/wiki/User:Daniel_Kinzler_(WMDE)/MCR-PageUpdater>.
Note: This change also introduces quite a bit of code for
merging ParserOutput objects.
Bug: T194048
Change-Id: I871978bf79f67c9e7954fb3fc8528d6e365f2cc1
Instead of applying wrapping in the parser and unwrapping in
ParserOutput::getText(), turn this around and apply wrapping in getText(),
and only if desired.
This avoids search&replace logic for unwrapping, and it also makes it a lot
easier to merge the output of multiple slots for MCR output.
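Call sites then look roughly like this (variable names are illustrative,
and the exact option key is the one this change introduces; treat it as
an assumption):
```
// Wrapping is now applied on output, and only if desired.
$html = $parserOutput->getText();                       // wrapped in the div
$bare = $parserOutput->getText( [ 'unwrap' => true ] ); // no wrapper div
```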
This changes behavior in two hopefully irrelevant ways:
1) the limit report comments will be inside the wrapper div, instead of
following it.
2) if HTML with a wrapper div is explicitly injected into a ParserOutput
object, it will not be possible to unwrap the text.
Bug: T174035
Change-Id: I1641b7995af9bd297f1acd610d583fbf874f34e0
This means that now:
* Entries actually get deleted when expired
* The transclusion cache is shared across wikis
* Large blobs that do not fit in cache no longer cause DB errors
* DB writes are not triggered on GET requests
* Keys are hashed and no longer need to be so restrictive
Also, add a "check key" based purge system and process cache the
text/html values similar to how regular revision text is cached.
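The check-key purge follows the usual WANObjectCache idiom, roughly (a
generic sketch, not the actual transclusion-cache code; key names and
the render helper are made up):
```
// Generic WANObjectCache check-key sketch; key names are illustrative.
$cache = MediaWikiServices::getInstance()->getMainWANObjectCache();
$value = $cache->getWithSetCallback(
	$cache->makeGlobalKey( 'example-transclusion-text', $hash ),
	$cache::TTL_DAY,
	function () use ( $source ) {
		return expensiveRender( $source ); // recomputed on miss or purge
	},
	// Bumping the check key lazily invalidates every entry that lists it.
	[ 'checkKeys' => [ $cache->makeGlobalKey( 'example-transclusion-purge' ) ] ]
);
// Purge: $cache->touchCheckKey( $cache->makeGlobalKey( 'example-transclusion-purge' ) );
```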
Bug: T189702
Change-Id: I8ac12b53c02bb26857175dd5a4af29d49e03dc33
So callers don't need to do this manually. Pointed out by Tim in T201799.
Depends-On: Ia6c36d5a650095e35093bf47e275e081e89b3daf
Change-Id: Ida62767f3ca53f99609cae01d3a20051bb92ccab
I wasn't sure how to convert the rest of the occurrences in core (there
are a significant number).
Bug: T200881
Change-Id: I114bba946cd3ea8a293121e275588c3c4d174f94
We don't want to display the stylesheet as part of the TOC entry if
someone uses TemplateStyles in a heading.
Bug: T198618
Change-Id: I2f7316daaba0cce662b6a4702ab87322e6783655
Language::markNoConversion is used only within Parser.php and differs
from LanguageConverter::markNoConversion in that, contrary to its name
and its namesake, it only protects *things which look like URLs* from
language conversion.
This wasted several days of my time before I realized what was going on.
It's needless; just hoist the "looks like a URL" special casing inline
to the single place where that functionality is used. (And I wonder
if the "looks like a URL" case is actually needed at all any more,
since most of those cases are probably free external links, which
go through a different code path, not bracketed external links.)
This is a clean-up to the clean-up that liangent performed in 2012
with e01adbfc0b.
Change-Id: I80479600f34170651732b032e8881855aa1204d8
The getConverterLanguage() method was added in March 2012 in commit
561424c266 as a workaround for a regression
in MediaWiki 1.19. It was an indirection which checked the global variable
$wgBug34832TransitionalRollback to return a different converter language
for Chinese wikis.
When this temporary bugfix was reverted in January 2013 in commit
a3fbdaaa2c, the temporary global variable
was removed, but not the getConverterLanguage() indirection. Since then,
new code in the parser seems to have faithfully used getConverterLanguage()
instead of getTargetLanguage(), even though they are identical and the
need for getConverterLanguage() has long since passed.
Strike a small blow for elegant minimalism by removing the completely
unnecessary Parser::getConverterLanguage() indirection. Well, sort
of: since this blight has been slowly growing inside Parser.php for
so long, we need to deprecate getConverterLanguage() first just in
case any external dependency has been infected. Next release we
can finally excise the unnecessary method.
Change-Id: I567c29c9c7699020955699b76cbe8578d02e2fe6
Uses new PHP 5.6 syntax, such as `...` parameter unpacking and calling
anything that looks like a callback, to make the code more readable.
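A generic before/after (not a specific call site in core; $callback is
assumed to be a closure or a function name):
```
// Before: PHP 5.3-era indirection
$result = call_user_func_array( $callback, $args );

// After: argument unpacking with a direct call on the callable
$result = $callback( ...$args );
```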
There are many more occurrences, but this commit is intentionally limited
to an easily reviewable size.
In one occurrence, a simple conditional instead of trickery was much more readable.
This patch finishes all the easy stuff in core; the remainder is either
unobvious or would result in smaller readability gains. It will be
carefully dealt with in further commits.
Change-Id: I79a16c48bfb98b75e5b99f2f6f4fa07b3ae02c5b
Find: /isset\(\s*([^()]+?)\s*\)\s*\?\s*\1\s*:\s*/
Replace with: '\1 ?? '
(Everywhere except includes/PHPVersionCheck.php)
(Then, manually fix some line length and indentation issues)
Then manually reviewed the replacements for cases where confusing
operator precedence would result in incorrect results
(fixing those in I478db046a1cc162c6767003ce45c9b56270f3372).
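The resulting rewrite is the standard null coalescing pattern, e.g.
(variable and key names made up):
```
// Before
$limit = isset( $opts['limit'] ) ? $opts['limit'] : 50;

// After
$limit = $opts['limit'] ?? 50;
```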
Change-Id: I33b421c8cb11cdd4ce896488c9ff5313f03a38cf
`$a <=> $b` returns `-1` if `$a` is lesser, `1` if `$b` is lesser,
and `0` if they are equal, which are exactly the values 'usort()'
callbacks are supposed to return.
It also enables the neat idiom `$a[x] <=> $b[x] ?: $a[y] <=> $b[y]`
to sort arrays of objects first by 'x', and by 'y' if they are equal.
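For example (a generic usort() call; the field names are illustrative):
```
// Sort rows by 'x', breaking ties by 'y'.
usort( $rows, function ( $a, $b ) {
	return $a['x'] <=> $b['x'] ?: $a['y'] <=> $b['y'];
} );
```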
* Replace a common pattern like `return $a < $b ? -1 : 1` with the
new operator (and similar patterns with the variables, the numbers
or the comparison inverted). Some of the uses were previously not
correctly handling the variables being equal; this is now
automatically fixed.
* Also replace `return $a - $b`, which is equivalent to `return
$a <=> $b` if both variables are integers but less intuitive.
* (Do not replace `return strcmp( $a, $b )`. It is also equivalent
when both variables are strings, but if any of the variables is not,
'strcmp()' converts it to a string before comparison, which could
give different results than '<=>', so changing this would require
careful review and isn't worth it.)
* Also replace `return $a > $b`, which presumably sort of works most
of the time (returns `1` if `$b` is lesser, and `0` if they are
equal or `$a` is lesser) but is erroneous.
Change-Id: I19a3d2fc8fcdb208c10330bd7a42c4e05d7f5cf3