Thijs/wiki.techinc.nl

Author	SHA1	Message	Date
C. Scott Ananian	b975a0bfe0	Don't break autolinks by stripping the final semicolon from an entity. Autolinking free external links is clever about making sure that trailing punctuation isn't included in the link. But if an HTML entity happens to terminate the URL, the semicolon from the entity is stripped from the url, breaking it. Fix this corner case. This also unifies autolink parsing with Parsoid. See: I5ae8435322c78dd1df170d7a3543fff3642759b1 Change-Id: I5482782c25e12283030b0fd2150ac55092f7979b	2014-12-18 17:27:55 -05:00
Brad Jorsch	5c1eeb2464	Normalize "\r" newlines in preSaveTransform The behavior of the different preprocessors differs when given \r or \r\n newlines. We already normalize the latter here, so may as well do the former here too. Bug: T78488 Change-Id: Id6390f64a73ea01088729f25d79103388c1fe7e8	2014-12-15 16:59:42 -05:00
C. Scott Ananian	25d35fc65c	Enforce spaces around magic links (RFC, PMID, and ISBN). Ensure that there is a \b boundary before and after RFC, PMID, and ISBN links. (Previously we enforced \b boundaries only before free external links and after ISBN links.) Consistency is a good thing! In addition: * \b is not a PHP escape sequence, so you don't need to write \\b inside a string. * \b before the numeric part of an ISBN is pointless: by the structure of the regexp there will always be a space on the left and a word character (a digit) on the right. Bug: 65278 Change-Id: Ic315b988091a5c7530a8285b9249804db72e55db	2014-12-11 03:41:23 +00:00
Aaron Schulz	e369f66d00	Replace wfRunHooks calls with direct Hooks::run calls * This avoids the overhead of an extra function call Change-Id: I8ee996f237fd111873ab51965bded3d91e61e4dd	2014-12-10 12:26:59 -08:00
umherirrender	489d793882	Fixed spacing - Added/removed spaces around parenthesis - Added newline in empty blocks - Added space after switch/foreach/function - Use tabs at begin of line - Add newline at end of file Change-Id: I244cdb2c333489e1020931bf4ac5266a87439f0d	2014-12-05 22:28:07 +01:00
jenkins-bot	0b8f48c535	Merge "Use Parser::SFH_NO_HASH/SFH_OBJECT_ARGS class const"	2014-11-27 05:55:07 +00:00
Aaron Schulz	88ad1bd9a7	Cleaned up template profile report tabbing Change-Id: I46abfc856d718d4db73d0510bde3e2b589341b10	2014-11-18 14:58:02 -08:00
umherirrender	91f26d50ee	Use Parser::SFH_NO_HASH/SFH_OBJECT_ARGS class const Instead of the global const Add hint to Defines, that they should not be used. Change-Id: I3e1dcf46fe18a97a05e3406c209815adb7e0e083	2014-11-18 21:19:22 +01:00
Chad Horohoe	b8d93fb4fd	Refactor profiling output from profiling * Added a standard getFunctionStats() method for Profilers to return per function data as maps. This is not toolbar specific like getRawData(). * Cleaned up the interface of SectionProfiler::getFunctionStats() a bit. * Removed unused cpu_sq, real_sq fields from profiler UDP output. * Moved getTime/getInitialTime to ProfilerStandard. Co-Authored-By: Aaron Schulz <aschulz@wikimedia.org> Change-Id: I266ed82031a434465f64896eb327f3872fdf1db1	2014-11-17 19:26:04 -07:00
Aaron Schulz	0bfa6b6264	Move request-only template profiling to an always-on parser report Change-Id: I0660c8d6cac0dadab648eac9736504b7939320f3	2014-11-12 18:06:00 -08:00
Reedy	aa5c2493cb	Remove documentation hinting LinkHolderArray::replace() should return value Return value not used in any code in our repo Removes FIXME too Change-Id: Ia2ec35099f0b54ea39c2f6b9371e94c3034bddb0	2014-11-10 18:32:57 +00:00
MZMcBride	627ccbcd7b	Minor code comment tweaks for spelling and consistency Change-Id: I51391f45d0f81e4245ccc0e435a71ccd5b0e3ca3	2014-11-08 14:07:19 -05:00
Bartosz Dziewoński	565e9fa077	Correctly parse <indicator/> contents, Parser rejiggering includes/parser/Parser.php * Pull out a chunk of code we need to reuse from parse() to internalParseHalfParsed(). This is a fully backwards-compatible change. Code changes: * Add a guard for running ParserBeforeTidy and ParserAfterTidy hooks, as extensions might not expect them to be called for snippets, only full page content. * Change $options to $this->mOptions. The bulk of parsing work is now done in internalParse() and internalParseHalfParsed(), parse() only handles four things: * Resetting parser state when a parse starts/finishes * Page title language conversion * Outputting limit report and limitation warnings * Running ParserAfterParse hook (dunno why, but it's documented) * Expand documentation for recursiveTagParse(), with some uppercase warnings so that no one does the stupid thing I did ever again. * Add new public method recursiveTagParseFully(), which is a recursive parser entry point that produces fully parsed HTML ready for inclusion in HTML output. Compared to Parser::parse(), it doesn't produce limit reports and doesn't run the ParserAfterParse hook. includes/parser/CoreTagHooks.php * Use the new recursiveTagParseFully() method. * Use Parser::stripOuterParagraph() to remove silly tags. Bug: 72887 Change-Id: I89ae9a50b82245f9a9e4a903563aeb1c51b6103e	2014-11-04 10:25:58 +00:00
Reedy	8e6fa108b8	or -> || Change-Id: Ic591f06f70c68bb2912b7f028f7f988eb658375d	2014-10-24 11:26:14 +01:00
Chad Horohoe	6c30fff0ba	Swap and for && Change-Id: I7821a62586cc2d2f929fb3d7d5046958a70efbd0	2014-10-23 13:03:14 -07:00
Tim Starling	ce8e466e44	Revert "Use a fixed regex for StripState" Breaks extensions, doesn't entirely fix the problem it was meant to fix. This reverts commit `6da3f169ac`. Change-Id: Ic193abcff8c72b0c8b434fcac514f88603a45beb	2014-10-20 21:42:53 +00:00
jenkins-bot	cf93d76c03	Merge "Remove hitcounters and associated code"	2014-10-20 21:12:54 +00:00
Chad Horohoe	90d90dad6e	Remove hitcounters and associated code The hitcounter implementation in MediaWiki is flawed and needs removal. For proper metrics, it is suggested to use something like Piwik or Google Analytics. RFC: https://www.mediawiki.org/wiki/Requests_for_comment/Removing_hit_counters_from_MediaWiki_core Change-Id: I0e5006a7e8a09c800f8fa4effa9399e8afdd7a57	2014-10-20 13:01:55 -07:00
Tim Starling	6da3f169ac	Use a fixed regex for StripState The JIT compiler in newer versions of PCRE experiences lock contention when multithreaded applications perform a high rate of concurrent compilations. We are seeing some performance impact on HHVM under normal production traffic. The random part of the strip marker is just there to protect against deliberate insertion of strip markers into the source text, which is very rare. So use a generic regex to find strip markers, and check in the callback whether the random state ID is correct. StripState::killMarkers() will be slower when it has to remove many strip markers, but most calls to it will not match any strip markers, so overall performance should be improved due to reduced JIT compilation. Bug: 72205 Change-Id: I8d37ae929a8c669c9e39adc8096b89e5732b68d0	2014-10-19 14:38:09 -07:00
Gergő Tisza	382d4df858	Move addTrackingCategory from Parser to ParserOutput addTrackingCategory is more in line with ParserOutput's functionality (addLink, addCategory etc), and tracking categories are useful even for content types which do not use the parser at all. There is no reason to require the caller to obtain a Parser object just to be able to add tracking categories. Change-Id: I89d9ea1db3a4e6486e77eee940bd438f7753b776	2014-09-28 23:35:52 +00:00
jenkins-bot	0755177e64	Merge "Add parser callback to get a page's current revision"	2014-09-25 22:52:10 +00:00
Brad Jorsch	8eeb906f93	Break accidental references in Parser::__clone If you have a reference to an object field (anywhere in the call stack) when you clone the object, the field will be cloned as a reference rather than as a value. So we have to break those unexpected references in the cloned object manually, which is easy enough by making a non-reference copy and then rebinding the cloned object's reference to this copy. Bug: 56226 Change-Id: I9c600e9c0845b4fde0366126ce3809d74e2240b4	2014-09-22 13:44:49 -04:00
Jackmcbarn	edc9f2acd9	Add parser callback to get a page's current revision Add Parser::fetchCurrentRevisionOfTitle(). By default, this just calls Revision::newFromTitle, but a callback can be set in ParserOptions that will override it. Anything that runs as part of a parse should use this wherever possible. Bug: 70495 Change-Id: I521f1f68ad819cf0f37e63240806f10c1cceef9c	2014-09-19 11:59:58 -04:00
Brad Jorsch	e2c9d4dfa9	Improve/rename Parser::replaceUnusualEscapes The previous implementation would unescape '&', '=', '+', and '%'. The first three will break the URL when unescaped in the query string, and the last will break when unescaped anywhere. The code is now changed to treat the path, query, and fragment parts of the URL separately when unescaping. We also escape any unsafe characters and ensure all percent-encodings use uppercase hexits. And since the old name is no longer accurate, Parser::replaceUnusualEscapes is deprecated in favor of Parser::normalizeLinkUrl. Bug: 57909 Change-Id: I77dc308d0d016c395ad737c08cf10a7711e25bbd	2014-09-16 23:00:16 +00:00
umherirrender	896f835ea9	Refactor: Use local variables for editsections in Parser In Parser.php an array was built and then the elements of that array were used, replaced this by local vars. In ParserOutput.php also use local vars to make the code more readable. Also inlined a private callback by using an anonymous function. Change-Id: I1c31c9e4855f93a8fb65e1c21faba46fcdcb1f4b	2014-09-05 13:33:05 +00:00
This, that and the other	fb7e8b876a	Fix URL protocol detection regex for file link= parameter This regex looked something like /^(?i)bitcoin:|ftp://|ftps://|.../, which meant the anchoring ^ only applied to the first name. This meant that any link= value that happened to contain a URL protocol anywhere within it (e.g. wikinews:Foo containing "news:") got incorrectly matched by this regex. Bug: 69317 Change-Id: Ide1c4f64137666db99f8e3b6816df01ef5099c8e	2014-08-16 22:09:42 +10:00
addshore	61c989cfc0	Fix phpcs issues in parser This fixes all issues except for: - class names - line length Change-Id: Ie91b010d5b3eec49d3b80b6e93b125a901ef43c6	2014-08-12 01:00:15 +00:00
jenkins-bot	bfc3710111	Merge "Don't include images/categories when behind a local interwiki prefix"	2014-08-09 11:51:07 +00:00
umherirrender	c332e33c2b	Doc: Parser::getTargetLanguage cannot return null Change-Id: I979d3d5010dc3d0ada3d82ca6d9546c5e800aaec	2014-08-08 21:03:46 +02:00
This, that and the other	9883b2471c	Don't include images/categories when behind a local interwiki prefix This solution is somewhat imperfect, as the logic being added here to MediaWikiTitleCodec really belongs in the parser. However, given the current state of this code, this is the cleanest possible solution at the moment. Modified the existing release note for this. Bug: 68802 Change-Id: I38309186bdcad23f49e23beb26daaf3ef5bceea1	2014-08-01 18:20:51 +10:00
umherirrender	dd8921c9d9	Cleanup some docs (includes/[m-r]) - Swap "$variable type" to "type $variable" - Added missing types - Fixed spacing inside docs - Makes beginning of @param/@return/@var/@throws in capital - Changed some types to match the more common spelling Change-Id: I8ebfbcea0e2ae2670553822acedde49c1aa7e98d	2014-07-24 19:43:25 +02:00
This, that and the other	e349358a5d	No interlanguage links after local interwiki prefixes This was noticed on enwiki after w: was marked as a local interwiki prefix there. Links like [[w:de:Foo]] ought to act like [[:de:Foo]], not [[de:Foo]]. Also adding a number of additional parser tests related to interwiki links. Bug: 68085 Change-Id: If39af06edb4af2da85c9bcf43df7088181809fcf	2014-07-22 15:01:07 +02:00
umherirrender	de39f3e019	Use some callable hints on @param docs Callbacks can be given as a string or array, so the hint 'callable' is used. Change-Id: I3842606f74c8c3705dffc70bf13e31f44a37fa65	2014-07-03 21:20:35 +02:00
Max Semenik	467f4affd1	New hook, AfterParserFetchFileAndTitle It is needed for PageImages to collect information about galleries, improving results for Commons mainspace. Bug: 66510 Change-Id: I3136d648ef2c1841767db0ab33855cd168e3de3e	2014-07-01 17:40:11 -07:00
Jackmcbarn	c313a75c80	Support {{!}} as a magic word Add {{!}} as a magic word that expands to a pipe. Parsoid already does this, so we know it isn't going to cause major breakage. Change-Id: I1f857417d224d6443504074a5add852df3975b89	2014-06-26 14:56:04 -07:00
jenkins-bot	ddeadfc49b	Merge "Prevent OutputPage::addWikiText and friends from causing UNIQ fails"	2014-06-26 09:25:19 +00:00
Brian Wolff	4e6b0e4f4d	Prevent OutputPage::addWikiText and friends from causing UNIQ fails If you transclude a special page, OutputPage::addWikiText can cause problems. This prevents that from happening, by using a new object if currently in a parsing operation. Bug: 14562 Bug: 65826 Change-Id: I7c38fa9e2fbd270e45f73f522612451e77ab8cbb	2014-06-25 15:16:14 -03:00
Brian Wolff	d7d8458bc0	Allow fragments in link= parameter in <gallery> tags. This brings the image syntax in gallery tags inline with normal syntax. Handle <gallery>File:foo.png|link=bar#baz</gallery> properly. Bug: 62343 Change-Id: If6149ccc19f70605ad4481e4da2ca55676d6001d	2014-06-23 19:45:31 -03:00
jenkins-bot	2da03f8806	Merge "Allow interlanguage link prefixes that are not language codes"	2014-06-20 15:19:32 +00:00
This, that and the other	7665f7d767	Allow interlanguage link prefixes that are not language codes $wgExtraInterlanguageLinkPrefixes holds a list of interwiki prefixes to be treated as language codes if $wgInterwikiMagic is true. To set the display text for the interlanguage links generated by this code, you need to create MediaWiki:Interlanguage-link-foo, where "foo" is the interwiki prefix. To provide a friendly site name for the link title text, use MediaWiki:Interlanguage-link-sitename-foo. On the WMF cluster, these messages could be set using the WikimediaMessages extension. Information about extra language links (in the site language only) is provided via the API in meta=siteinfo&prop=interwikimap. Bug: 32189 Change-Id: I3d04760e2d9fb3320bb71e3d5ad115eed54a899c	2014-06-20 11:29:05 +10:00
Thiemo Mättig	f6cff5e392	Update documentation of what a "section" is There are so many slightly different understandings of what a "section" is or can be. I'm aware the documentation was improved just a few weeks ago. I still find it incomplete and confusing. 1. I renamed it to $sectionId to make it more clear what it really is. 2. Sections are usually numbers. 0, 1 and so on. There is no reason to disallow the use of ints or even floats (this works because the string representation of 0.0 is "0"). The code never disallowed numbers. 3. 'T1' never was supported, as far as I can tell. 'T-1' is supported. See Parser::extractSections(). 4. null and false and '' all mean "the whole page" in WikiPage::replaceSectionAtRev() but for some reason this meaning got lost in WikitextContent::replaceSection(). I made it the same again. Change-Id: Icc3997722d2ed742bf7703cd7c06d09199225720	2014-06-12 18:13:23 +02:00
jenkins-bot	93405c852c	Merge "Update list item newline handling to follow Parsoid's model"	2014-06-09 18:13:14 +00:00
Gabriel Wicke	b33b5d5840	Update list item newline handling to follow Parsoid's model This improves on commit `34bd573144` by matching Parsoid's newline handling in the PHP parser. It is the outcome of a discussion with Erwin, where we agreed that * foo * bar should produce <ul><li>foo</li> <li>bar</li></ul> See the discussion in https://gerrit.wikimedia.org/r/#/c/94443/ The original rendering issue this tried to address is no longer present after a change to the template. The pure CSS solution is now working. Bug: 39617 Bug: 56809 Change-Id: Ib7aa9449bbd994cb23b83b3f23cff944b1cddadf	2014-06-09 11:01:52 -07:00
Brad Jorsch	d18ba4e9df	Add PPFrame::isVolatile and PPFrame::setVolatile Most wikitext is safe to parse once and then cache for when that same wikitext is used again, such as for multiple transclusions of the same template within a page. There are occasions, though, where some piece of wikitext has side effects and so should not be cached; a prominent example of such wikitext is the <ref> and <references> tags in Cite.php. This change adds PPFrame::setVolatile so parser hooks such as <ref> and <references> can indicate that they have done something that should not be cached, and PPFrame::isVolatile so that callers of PPFrame::expand can know when to avoid caching. Bug: 46815 Bug: 31834 Change-Id: I95b3cf8781cf047cdb63da221cef45f3e7d1632e	2014-05-30 14:07:06 -04:00
Jackmcbarn	2094e578b4	Restrict empty-frame cache entries to their parent Remove the parser's global $mTplExpandCache, and replace it with an alternative that is separated by parent frame. This allows the integrity of the empty-frame expansion cache to be maintained while also allowing parent frame access. A page with 3 copies of http://ja.wikipedia.org/wiki/%E4%B8%AD%E5%A4%AE%E7%B7%9A_(%E9%9F%93%E5%9B%BD) has the following statistics: Without this change, there are 4625 cache hits on this page, and a sample of 3 parses took 16.6, 16.9, and 16.8 seconds. With this change, there are 2588 cache hits, and a sample of 3 parses took 16.7, 16.7, and 17.0 seconds. Change-Id: I621e9075e0f136ac188a4d2f53418b7cc957408d	2014-05-30 01:38:15 +00:00
umherirrender	48cd71a339	Fix @since of Parser::stripOuterParagraph Was merged after release branch. Follow-Up: I6bb3597898324556df912a23a7ffc9ff250b8f58 Change-Id: Idab16dc1e322ede31f6688236fddae5365ac133c	2014-05-16 19:50:30 +02:00
Ori.livneh	df983f6642	Revert "Declare visibility on class properties of includes/parser/" See https://bugzilla.wikimedia.org/65375#c4 This reverts commit `f359cdf614`. Bug: 65375 Change-Id: I12a60b5cc52a07a6deabcbf47c7c99cd2faac3c3	2014-05-16 00:52:24 +00:00
Bartosz Dziewoński	c3aa5ef597	Create Parser::stripOuterParagraph to avoid code duplication We've had the logic for stripping the outer <p/> element in three separate places. The version in OutputPage was missing the '$' at the end of the regex, that was most likely a mistake caused by the duplication. Also, extend the logic in order not to generate invalid HTML if the input contains more than one <p/> tag. Added tests for this and the previous behaviour. https://www.mail-archive.com/mediawiki-api@lists.wikimedia.org/msg03188.html Change-Id: I6bb3597898324556df912a23a7ffc9ff250b8f58	2014-05-15 12:20:19 -04:00
Siebrand Mazeland	90254361a2	Change visibility of some methods in Parser and update docs accordingly Change-Id: Ibe9d817325b4abafe137cd3f2fc6ccc25740cf58	2014-05-11 16:28:07 +00:00
Siebrand Mazeland	dfc7416fbe	Various documentation updates for includes/parser/ Change-Id: I16dd3a792cc83f8c80b3652d42c055730f6d177a	2014-05-11 18:18:26 +02:00

739 commits
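
Note on commit fb7e8b876a (URL protocol detection regex): the anchoring problem described in that message can be reproduced in a few lines. The sketch below is in Python rather than the repository's PHP. The protocol list is a hypothetical stand-in for the list elided as "..." in the commit message, and the names PROTOCOLS, buggy, fixed and starts_with_protocol are illustrative only. The log does not show how the commit itself fixed the regex, so grouping the alternation is an assumption about one common way to make ^ apply to every alternative.

```python
import re

# Hypothetical protocol list for illustration; the real list is elided
# ("...") in the commit message.
PROTOCOLS = ["bitcoin:", "ftp://", "ftps://", "news:"]
alternation = "|".join(re.escape(p) for p in PROTOCOLS)

# Buggy shape: ^ binds only to the first alternative ("bitcoin:"), so the
# remaining alternatives can match anywhere in the value.
buggy = re.compile("^" + alternation, re.IGNORECASE)

# One possible fix (an assumption, not taken from the commit): wrap the
# alternation in a non-capturing group so ^ anchors every alternative.
fixed = re.compile("^(?:" + alternation + ")", re.IGNORECASE)


def starts_with_protocol(pattern, value):
    """Return True if the pattern finds a match in the given link= value."""
    return pattern.search(value) is not None


if __name__ == "__main__":
    for value in ("wikinews:Foo", "news:Foo", "FTP://example.org"):
        print(value, starts_with_protocol(buggy, value),
              starts_with_protocol(fixed, value))
```

With the ungrouped pattern, "wikinews:Foo" is reported as starting with a protocol because "news:" matches in the middle of the string, which is the false positive the commit message describes.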