Thijs/wiki.techinc.nl

Author	SHA1	Message	Date
Florian	2d50e28975	Allow to enable OOUI via a parser tag extension This change adds the possibility to enable OOUI out of the parser, which enables parser tag functions to easily enable OOUI, if they need it, for every page view out of the function that handles the parser tag. Bug: T106949 Change-Id: If1e139d4f07be98e418e11470794ea42e8a9b2eb	2015-07-25 17:36:33 +02:00
Arlo Breault	0b4208e645	Allow whitespace between indent and table start tag * \s matches the trim on the line. * Since leading space is ok for table start tags, and you can use them in ":" context, you should be able to compose the two together. Bug: T105238 Change-Id: Id08e24e5dd2bb8ca09453adec87b21225df4a840	2015-07-18 20:41:33 +00:00
Chad Horohoe	b8ced862bb	Protect against non-text output from StripState going into Title::newFromText() Non-string input shouldn't be fed into newFromText(). We currently handle this indirectly with relying on Title to do it. Instead just return earlier and not try to construct a title from bad input. Bug: T102321 Change-Id: I9bc96111378d9d4ed5981bffc6f150cbd0c1e331	2015-07-10 20:05:06 +00:00
Arlo Breault	ba00a957fb	Cleanup in doTableStuff Change-Id: I75c0a943b24f96a30c6ee1efc3f0b11388f892b7	2015-07-09 04:57:52 +00:00
Brad Jorsch	359e77d7c9	Parser: Avoid producing <span></span> in the TOC If someone renames a section but wants old targeted links to still work, <span id="old-anchor"></span> is the usual solution. And sometimes people put it inside the section header markup, like == <span id="old-anchor"></span>New name == since putting it before makes it be considered part of the previous section while putting it after causes the browser to scroll the section header off the screen. But this has the unfortunate side effect that the TOC text for that section will be "<span></span>New name". We should strip that useless empty span. Bug: T96153 Change-Id: I47a33ceb79d48f6d0c38fa3b3814a378feb5e31e	2015-07-08 17:11:21 +00:00
Bartosz Dziewoński	e688bea6a5	Parser: Correct setHook() documentation Change-Id: Iaeaac9ea79b696dfa39adb6608ed68edd3754516	2015-06-30 19:02:42 +00:00
umherirrender	70f3afd548	Remove unneeded empty lines at begin of if/else/foreach body An if body must not begin with an empty line Change-Id: I62b058be337fcc85a120fcd3dadce564db59a271	2015-06-19 20:05:45 +02:00
Vivek Ghaisas	9f5b6f5aeb	Fix whitespace issues around parentheses Fix issues found by MediaWiki.WhiteSpace.SpaceyParenthesis sniff. Bug: T102617 Change-Id: Iec7f71e64081659fba373ec20d9d2006306a98f4	2015-06-16 22:14:02 +03:00
jenkins-bot	0e1c80e6e1	Merge "Check result of preg_match_all in Parser.php"	2015-06-02 22:08:42 +00:00
Ori Livneh	12571bde26	Use a fixed marker prefix string in the Parser and MWTidy Generating one-time, unique strip markers hurts us in multiple ways: * The strip marker regexes don't benefit from JIT compilation, so they are slower to execute than they could be. * Although the regexes don't benefit from JIT compilation, they are still compiled, because HHVM bets on regexes getting reused. This extra work is fairly costly (1-2% of CPU usage on the app servers) and doesn't pay off. * The size of the PCRE JIT cache is finite, and the caching of one-off regexes displaces from the cache regexes which are in fact reused. Tim's preferred solution (per his review comment on https://gerrit.wikimedia.org/r/167530/) is to use fixed strip markers. So: * Replace usage of $parser->mUniqPrefix with Parser::MARKER_PREFIX, which complements the existing Parser::MARKER_SUFFIX. * Deprecate Parser::mUniqPrefix and its accessor, Parser::uniqPrefix(). * Deprecate Parser::getRandomString(), since it is no longer useful. * In Preprocessor_:preprocessToObj() and Parser::fetchTemplateAndTitle, replace any occurrences of \x7f with '?', to prevent strip marker forgery. \x7f is not valid input anyway. Deprecate the $prefix parameter for StripState::__construct, since a custom prefix may no longer be specified. Change-Id: I31d4556bbb07acb72c33fda335fa5a230379a03f	2015-05-31 19:33:36 -07:00
umherirrender	c430850154	Check result of preg_match_all in Parser.php preg_match_all can return false on failure, which then results in undefined index access. Check the result and just keep it as nothing found by processing an empty array Change-Id: I1f11894240dc6869506d68d3513715abdc3abb5d	2015-05-29 05:16:08 +00:00
Jackmcbarn	62c3fe221f	Allow running code during unstrip When adding strip markers, allow closures to be passed in place of text. The closure is then called during unstrip. Also, add a hook that runs after unstripGeneral. This is needed for Extension:Cite's I0e136f952. Change-Id: If83b0623671fd67e5ccc9deaaaab456a6679af8f	2015-05-13 02:44:20 +00:00
Timo Tijhof	d06855ecbe	Parser: Say tildes instead of ~~~ in comment to fix Doxygen fatal Doxygen was unable to parse the file past validateSig(). > Parser.php:6397: warning: reached end of file while inside a ~~~ block! > The command that should end the block seems to be missing! Change-Id: I3d1b547968302611d2bd78a7c11dd0738b40d23a	2015-04-06 12:32:25 +00:00
Max Semenik	08762b02de	Minor cleanups * Declare undeclared variables * Kill unused variables * Fix comments including PHPDoc Change-Id: I60015f6b6740aa9088bda3745f4dc4e65e29fcb1	2015-04-02 16:22:42 +00:00
Aaron Schulz	ac0de3c430	Fixed {{REVISION(TIMESTAMP|USER|SIZE)}} on new revisions * This makes use of the injected new revision object used elsewhere in Parser to solve this problem. Bug: T94407 Change-Id: I7881583cf7cb2bc799c89ffaa2a344a2d4ca3a4e	2015-03-30 21:10:09 -07:00
Chad Horohoe	c33f4de066	Profile all external HTTP requests from MW Change-Id: Ie980b080da2ef21ec7d9fc32f1accc55710de140	2015-03-03 20:54:30 -08:00
Jackmcbarn	cab99af90e	Fix TOC anchor name collisions in edge cases Currently, the parser adds a "_2" to the second of two identical headlines to avoid collisions, but there's still a collision if another headline actually ends in "_2". This change causes the new headline to also be checked for a collision, and advances to "_3" or beyond if there is one. Bug: T26787 Change-Id: Id0a55aa4c1917bac2f8f0d4863fcb85bd3dff1ca	2015-02-17 20:59:33 +00:00
Timo Tijhof	d62a2b76b1	Replace dev.w3.org with more permanent or stable urls * Sanitizer: dev.w3.org/html5/spec-preview Follows-up `8e8b15afc6`. Use stable reference to www.w3.org/TR/html5 (currently from October 2014) instead of an old preview branch from 2012. * parserTests: dev.w3.org/html5 Follows-up `959aa336a1`. Url is now a dead end. Replaced with link to a draft from around that time. The relevant section no longer exists in the current spec as it got split off into a separate spec. Maybe this one: https://url.spec.whatwg.org/#percent-encoded-bytes * Parser, HTMLIntField: dev.w3.org/html5 Use stable reference to www.w3.org/TR/html5 instead. * HTMLFloatField.php: dev.w3.org/html5 Url is now a dead end. Draft from around that time: http://www.w3.org/TR/2011/WD-html5-20110525/common-microsyntaxes.html#real-numbers The section "Real numbers" no longer exists in the current spec, but the Infrastructure chapter has a section on floating point numbers that describes the same sequence now. Change-Id: I7dcd49b6cd39785fb1b294e4eeaf39bda52337b2	2015-02-14 14:21:33 +00:00
Sam Reed	f41e2ddb6a	Don't split regex string unnecessarily Change-Id: Id5912e64916ce5c7be2991478c32531596917540	2015-01-28 16:17:41 +00:00
Aaron Schulz	6921770414	Updated some try-catch statements: MWException -> Exception Change-Id: I76601a86e30f4984e3b1a8c8ec5ef5a0f652433a	2015-01-09 17:20:22 -08:00
Ricordisamoa	2ae155da52	Fix phpcs errors in includes/ Mostly Squiz.WhiteSpace.SuperfluousWhitespace.EmptyLines Change-Id: I678b2f0902f11cd1dfa1611b9da24e7237df9122	2015-01-08 20:15:07 +01:00
Aaron Schulz	4ff8136807	Removed remaining profile calls Change-Id: I31c81c78715048004fc8fca0f27d09c1fa71c118	2015-01-08 02:49:33 -08:00
Chad Horohoe	aa21e125a3	Remove obvious function-level profiling Xhprof generates this data now. Custom profiling of various sub-function units are kept. Calls to profiler represented about 3% of page execution time on Special:BlankPage (1.5% in/out); after this change it's down to about 0.98% of page execution time. Change-Id: Id9a1dc9d8f80bbd52e42226b724a1e1213d07af7	2015-01-07 11:14:24 -08:00
Amir E. Aharoni	144d741196	Shorten lines to pass phpcs test Change-Id: I5588e1f16f1a23d77160cd180058bd2000a93ab6	2014-12-29 17:14:08 +02:00
Derk-Jan Hartman	e20e64eb6b	Parser: Add <bdi> to the whitelist for TOC links Bug: 72884 Change-Id: Id5aa9a4eb32fb185881141e55de700ae36f806c5	2014-12-27 21:24:42 +01:00
C. Scott Ananian	54a8199f87	Don't allow embedded newlines in magic links, but do allow &nbsp; This continues the work started in T67278 to make magic link parsing more consistent with wiki text parsing in general, and closes two long-standing bugs. Bug: T30950 Bug: T31025 Change-Id: I71f8b337543163569c64bbfdec154eb9b69d7264	2014-12-22 04:14:55 +00:00
C. Scott Ananian	b975a0bfe0	Don't break autolinks by stripping the final semicolon from an entity. Autolinking free external links is clever about making sure that trailing punctuation isn't included in the link. But if an HTML entity happens to terminate the URL, the semicolon from the entity is stripped from the url, breaking it. Fix this corner case. This also unifies autolink parsing with Parsoid. See: I5ae8435322c78dd1df170d7a3543fff3642759b1 Change-Id: I5482782c25e12283030b0fd2150ac55092f7979b	2014-12-18 17:27:55 -05:00
Brad Jorsch	5c1eeb2464	Normalize "\r" newlines in preSaveTransform The behavior of the different preprocessors differs when given \r or \r\n newlines. We already normalize the latter here, so may as well do the former here too. Bug: T78488 Change-Id: Id6390f64a73ea01088729f25d79103388c1fe7e8	2014-12-15 16:59:42 -05:00
C. Scott Ananian	25d35fc65c	Enforce spaces around magic links (RFC, PMID, and ISBN). Ensure that there is a \b boundary before and after RFC, PMID, and ISBN links. (Previously we enforced \b boundaries only before free external links and after ISBN links.) Consistency is a good thing! In addition: * \b is not a PHP escape sequence, so you don't need to write \\b inside a string. * \b before the numeric part of an ISBN is pointless: by the structure of the regexp there will always be a space on the left and a word character (a digit) on the right. Bug: 65278 Change-Id: Ic315b988091a5c7530a8285b9249804db72e55db	2014-12-11 03:41:23 +00:00
Aaron Schulz	e369f66d00	Replace wfRunHooks calls with direct Hooks::run calls * This avoids the overhead of an extra function call Change-Id: I8ee996f237fd111873ab51965bded3d91e61e4dd	2014-12-10 12:26:59 -08:00
umherirrender	489d793882	Fixed spacing - Added/removed spaces around parenthesis - Added newline in empty blocks - Added space after switch/foreach/function - Use tabs at begin of line - Add newline at end of file Change-Id: I244cdb2c333489e1020931bf4ac5266a87439f0d	2014-12-05 22:28:07 +01:00
jenkins-bot	0b8f48c535	Merge "Use Parser::SFH_NO_HASH/SFH_OBJECT_ARGS class const"	2014-11-27 05:55:07 +00:00
Aaron Schulz	88ad1bd9a7	Cleaned up template profile report tabbing Change-Id: I46abfc856d718d4db73d0510bde3e2b589341b10	2014-11-18 14:58:02 -08:00
umherirrender	91f26d50ee	Use Parser::SFH_NO_HASH/SFH_OBJECT_ARGS class const Instead of the global const Add hint to Defines, that they should not be used. Change-Id: I3e1dcf46fe18a97a05e3406c209815adb7e0e083	2014-11-18 21:19:22 +01:00
Chad Horohoe	b8d93fb4fd	Refactor profiling output from profiling * Added a standard getFunctionStats() method for Profilers to return per function data as maps. This is not toolbar specific like getRawData(). * Cleaned up the interface of SectionProfiler::getFunctionStats() a bit. * Removed unused cpu_sq, real_sq fields from profiler UDP output. * Moved getTime/getInitialTime to ProfilerStandard. Co-Authored-By: Aaron Schulz <aschulz@wikimedia.org> Change-Id: I266ed82031a434465f64896eb327f3872fdf1db1	2014-11-17 19:26:04 -07:00
Aaron Schulz	0bfa6b6264	Move request-only template profiling to an always-on parser report Change-Id: I0660c8d6cac0dadab648eac9736504b7939320f3	2014-11-12 18:06:00 -08:00
Reedy	aa5c2493cb	Remove documentation hinting LinkHolderArray::replace() should return value Return value not used in any code in our repo Removes FIXME too Change-Id: Ia2ec35099f0b54ea39c2f6b9371e94c3034bddb0	2014-11-10 18:32:57 +00:00
MZMcBride	627ccbcd7b	Minor code comment tweaks for spelling and consistency Change-Id: I51391f45d0f81e4245ccc0e435a71ccd5b0e3ca3	2014-11-08 14:07:19 -05:00
Bartosz Dziewoński	565e9fa077	Correctly parse <indicator/> contents, Parser rejiggering includes/parser/Parser.php * Pull out a chunk of code we need to reuse from parse() to internalParseHalfParsed(). This is a fully backwards-compatible change. Code changes: * Add a guard for running ParserBeforeTidy and ParserAfterTidy hooks, as extensions might not expect them to be called for snippets, only full page content. * Change $options to $this->mOptions. The bulk of parsing work is now done in internalParse() and internalParseHalfParsed(), parse() only handles four things: * Resetting parser state when a parse starts/finishes * Page title language conversion * Outputting limit report and limitation warnings * Running ParserAfterParse hook (dunno why, but it's documented) * Expand documentation for recursiveTagParse(), with some uppercase warnings so that no one does the stupid thing I did ever again. * Add new public method recursiveTagParseFully(), which is a recursive parser entry point that produces fully parsed HTML ready for inclusion in HTML output. Compared to Parser::parse(), it doesn't produce limit reports and doesn't run the ParserAfterParse hook. includes/parser/CoreTagHooks.php * Use the new recursiveTagParseFully() method. * Use Parser::stripOuterParagraph() to remove silly tags. Bug: 72887 Change-Id: I89ae9a50b82245f9a9e4a903563aeb1c51b6103e	2014-11-04 10:25:58 +00:00
Reedy	8e6fa108b8	or -> || Change-Id: Ic591f06f70c68bb2912b7f028f7f988eb658375d	2014-10-24 11:26:14 +01:00
Chad Horohoe	6c30fff0ba	Swap and for && Change-Id: I7821a62586cc2d2f929fb3d7d5046958a70efbd0	2014-10-23 13:03:14 -07:00
Tim Starling	ce8e466e44	Revert "Use a fixed regex for StripState" Breaks extensions, doesn't entirely fix the problem it was meant to fix. This reverts commit `6da3f169ac`. Change-Id: Ic193abcff8c72b0c8b434fcac514f88603a45beb	2014-10-20 21:42:53 +00:00
jenkins-bot	cf93d76c03	Merge "Remove hitcounters and associated code"	2014-10-20 21:12:54 +00:00
Chad Horohoe	90d90dad6e	Remove hitcounters and associated code The hitcounter implementation in MediaWiki is flawed and needs removal. For proper metrics, it is suggested to use something like Piwik or Google Analytics. RFC: https://www.mediawiki.org/wiki/Requests_for_comment/Removing_hit_counters_from_MediaWiki_core Change-Id: I0e5006a7e8a09c800f8fa4effa9399e8afdd7a57	2014-10-20 13:01:55 -07:00
Tim Starling	6da3f169ac	Use a fixed regex for StripState The JIT compiler in newer versions of PCRE experiences lock contention when multithreaded applications perform a high rate of concurrent compilations. We are seeing some performance impact on HHVM under normal production traffic. The random part of the strip marker is just there to protect against deliberate insertion of strip markers into the source text, which is very rare. So use a generic regex to find strip markers, and check in the callback whether the random state ID is correct. StripState::killMarkers() will be slower when it has to remove many strip markers, but most calls to it will not match any strip markers, so overall performance should be improved due to reduced JIT compilation. Bug: 72205 Change-Id: I8d37ae929a8c669c9e39adc8096b89e5732b68d0	2014-10-19 14:38:09 -07:00
Gergő Tisza	382d4df858	Move addTrackingCategory from Parser to ParserOutput addTrackingCategory is more in line with ParserOutput's functionality (addLink, addCategory etc), and tracking categories are useful even for content types which do not use the parser at all. There is no reason to require the caller to obtain a Parser object just to be able to add tracking categories. Change-Id: I89d9ea1db3a4e6486e77eee940bd438f7753b776	2014-09-28 23:35:52 +00:00
jenkins-bot	0755177e64	Merge "Add parser callback to get a page's current revision"	2014-09-25 22:52:10 +00:00
Brad Jorsch	8eeb906f93	Break accidental references in Parser::__clone If you have a reference to an object field (anywhere in the call stack) when you clone the object, the field will be cloned as a reference rather than as a value. So we have to break those unexpected references in the cloned object manually, which is easy enough by making a non-reference copy and then rebinding the cloned object's reference to this copy. Bug: 56226 Change-Id: I9c600e9c0845b4fde0366126ce3809d74e2240b4	2014-09-22 13:44:49 -04:00
Jackmcbarn	edc9f2acd9	Add parser callback to get a page's current revision Add Parser::fetchCurrentRevisionOfTitle(). By default, this just calls Revision::newFromTitle, but a callback can be set in ParserOptions that will override it. Anything that runs as part of a parse should use this wherever possible. Bug: 70495 Change-Id: I521f1f68ad819cf0f37e63240806f10c1cceef9c	2014-09-19 11:59:58 -04:00
Brad Jorsch	e2c9d4dfa9	Improve/rename Parser::replaceUnusualEscapes The previous implementation would unescape '&', '=', '+', and '%'. The first three will break the URL when unescaped in the query string, and the last will break when unescaped anywhere. The code is now changed to treat the path, query, and fragment parts of the URL separately when unescaping. We also escape any unsafe characters and ensure all percent-encodings use uppercase hexits. And since the old name is no longer accurate, Parser::replaceUnusualEscapes is deprecated in favor of Parser::normalizeLinkUrl. Bug: 57909 Change-Id: I77dc308d0d016c395ad737c08cf10a7711e25bbd	2014-09-16 23:00:16 +00:00
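
The percent-encoding rule described in commit `e2c9d4dfa9` can be illustrated with a small sketch. This is a simplified Python illustration, not the PHP code of `Parser::normalizeLinkUrl`: the function name `normalize_escapes` and the `UNRESERVED` set are assumptions made for this example, and it applies one rule to the whole string instead of handling path, query, and fragment separately as the commit describes. It only shows the two ideas named in the message: never unescape characters such as '&', '=', '+', and '%', and write every remaining escape with uppercase hex digits.

```python
import re

# Assumption for this sketch: only RFC 3986 "unreserved" characters are
# ever decoded. The real commit uses different rules per URL part.
UNRESERVED = set(
    "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
    "abcdefghijklmnopqrstuvwxyz"
    "0123456789-._~"
)


def normalize_escapes(part: str) -> str:
    """Decode %XX escapes of unreserved characters; uppercase all others."""

    def fix(match: re.Match) -> str:
        char = chr(int(match.group(1), 16))
        if char in UNRESERVED:
            # Safe to show literally, e.g. %41 becomes "A".
            return char
        # Keep the escape, but normalize the hex digits to uppercase.
        return "%" + match.group(1).upper()

    return re.sub(r"%([0-9A-Fa-f]{2})", fix, part)


if __name__ == "__main__":
    print(normalize_escapes("a%2fb%41%3d%25"))
```

In this sketch `%3d` ('=') and `%25` ('%') stay escaped, which avoids the breakage the commit message attributes to the previous implementation.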


765 commits
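
The anchor-collision fix described in commit `cab99af90e` can be sketched as follows. This is a minimal Python illustration of the idea stated in the commit message (check the suffixed name for a collision too, and advance to "_3" or beyond), not the parser's actual PHP code. The function name `unique_anchors` is hypothetical, and details of the real implementation, such as how anchors are normalized before comparison, are not covered here.

```python
def unique_anchors(headlines):
    """Return one anchor per headline, with no two anchors identical."""
    used = set()
    anchors = []
    for name in headlines:
        candidate = name
        suffix = 2
        # Keep advancing the suffix until the candidate is unused. Checking
        # the suffixed candidate as well is what avoids the edge case where
        # another headline literally ends in "_2".
        while candidate in used:
            candidate = f"{name}_{suffix}"
            suffix += 1
        used.add(candidate)
        anchors.append(candidate)
    return anchors


if __name__ == "__main__":
    print(unique_anchors(["Foo", "Foo_2", "Foo"]))
```

With the headlines "Foo", "Foo_2", "Foo", the third anchor skips the already taken "Foo_2" and becomes "Foo_3".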