Thijs/wiki.techinc.nl

Author	SHA1	Message	Date
Daimona Eaytoy	2a0de02aab	phpunit: Avoid TestUser in non-database tests TestUser creates the user and therefore needs the database. Avoid using it in non-database tests. Add ApiQueryBlockInfoTraitTest to the Database group because it needs the database. Add DeleteUserEmailTest to the Database group because since `3bedffa8` the default user is not created any more in non-database tests Change-Id: Iff438964dde47a47a2fa4a314d55010bd8c7fee5	2023-07-29 14:26:50 +00:00
C. Scott Ananian	e22d93a6bb	Hard-deprecate ParserOutput::{get,set}Flag() These were deprecated in 1.38; users are expected to use ParserOutput::{get,set}OutputFlag() instead, which helps eliminate a confusing aliasing of many MW methods named "flag". Original deprecation: `06ab90f163` Code search: https://codesearch.wmcloud.org/search/?q=%5BOo%5Dut%28put%29%3F%28%5C%28%5C%29%29%3F-%3E%28g%7Cs%29etFlag%5C%28&i=nope&files=&excludeFiles=&repos= Patches for non-production extensions: PageProperties: I592d43e2c912df635cd9162180ed20a6136535f1 CIForms: I238a6c557891bb6d271d2641261ef69542b7957e Bug: T292868 Bug: T305161 Change-Id: I4525443ab0932241b0cf64ab606f7ab7d6d70b6e	2023-07-28 13:51:02 -04:00
jenkins-bot	1929084e47	Merge "Rename newly-added ParserOutput::appendOutputString() method"	2023-07-28 17:29:12 +00:00
jenkins-bot	4a4da63cd2	Merge "Fix incomplete/broken ParserFactoryTest & ParserTest"	2023-07-28 17:08:30 +00:00
C. Scott Ananian	ea51801f79	Rename newly-added ParserOutput::appendOutputString() method Tweaked the pluralization of the newly-added ParserOutput::appendOutputString() method (now ::appendOutputStrings() and ::getOutputStrings()), and name of the ParserOutputStrings class (now ParserOutputStringSets), in an effort to continue repainting bikesheds until the color is juuuust right. Also extended the new method to cover ::addModules() and ::addModuleStyles() and added support for these string sets in ::collectMetadata(). (These methods and the enumeration class were originally added in b2cfa31eb6173e9f5e8607eadd126c33f8ce440b.) Depends-On: I8bdffa55498d90e990af5bfc3332e3028b0a3539 Change-Id: Ibd41485d5db7779f01642e2144c50ed49d409812	2023-07-28 12:10:56 -04:00
thiemowmde	8a2b869945	Fix incomplete/broken ParserFactoryTest & ParserTest Some details: * Just use a real MagicWord object. It doesn't do anything that needs mocking. * Add missing methods to mocks. * Remove not needed details from mocks. * Remove duplicate test that does the same. * Remove pointless assertions that are impossible to ever fail. Change-Id: I177242429a528d2c7109ca757840b538b772711c	2023-07-28 14:22:46 +00:00
Isabelle Hurbain-Palatin	b2cfa31eb6	Add append/getOutputString to ParserOutput This aims at providing an interface similar to setOutputFlag for string sets, such as the ones used in CSP properties. Change-Id: I6f103bd88802e66611e483403a2f8a540d54aae9	2023-07-27 11:37:11 +02:00
thiemowmde	8a9dd67139	Avoid calling overrideConfigValue() multiple times Same as I7a82951. overrideConfigValue() and overrideConfigValues() both call setMwGlobals(), which calls resetServices(). This is surprisingly expensive. It's much better to call it once with an array. Change-Id: I4ff2f6b902b1a1e0b554ce6fc76f3b612f703fae	2023-07-20 14:59:42 +02:00
Umherirrender	d2a09384a7	tests: Change some setMwGlobals to overrideConfigValue Change-Id: I21b9bf907e313947360b1607f11ae9917488f109	2023-07-17 23:02:32 +02:00
Daimona Eaytoy	5035ecd2f7	CoreParserFunctionsTest: Avoid username pattern reserved for temp users The leading "*" is currently used as the username pattern for temp users, meaning this test will fail if $wgAutoCreateTempUser['enabled'] = true; Put the star at the end instead, and use a variable for the username instead of repeating it multiple times. Change-Id: Ie0414de5f9d9054dfec540f14bd0dc9ec7b4cb72	2023-07-16 19:55:50 +02:00
daniel	c4033734db	HookContainer: deprecate old hook handler formats This reduces the acceptable forms for hook handlers to three things: * a callable (in the form of a string, an array, or a closure) * an object, which is expected to have a public "on" method that matches the hook name. * an array containing an object spec in the "handler" key, for use with ExtensionRegistry. All other forms will trigger a deprecation warning. Bug: T339167 Depends-On: I980f2d45e6bb8c6a04058e68c758f71bbcf709de Depends-On: Ieae405f70caa01d84602583cc214b0ee3fadc796 Depends-On: If15df4b598c02ed9bda5eea0ae89a16ebbf4f2e2 Depends-On: Id70276fa1e1821bd400dc0ae5cea722a21d524d5 Change-Id: I83bc81d1b3033c38b9313884a9c70a187fdde227	2023-06-21 11:40:10 +00:00
Daimona Eaytoy	518a5da533	Replace deprecated MWException Bug: T328220 Change-Id: I0408575ee71e58d1c9e9ebedabab35bd3813f515	2023-06-12 12:27:49 +00:00
Umherirrender	d36073cdcf	tests: Make some PHPUnit data providers static Initially used a new sniff with autofix (T333745), but some providers are defined non-static in TestBase class and need more work to make them static in a compatible way Bug: T332865 Change-Id: I889d33424f0c01fb26f2d86f8d4fc3de3e568843	2023-05-20 01:05:27 +02:00
Volker E	2c1729e4e9	HTML: Remove self-closing XHTML syntax from core Syntactical leftover with no significance in modern web. Bug: T309150 Depends-On: I3a029ca950db42b938962b2452ad136ae8ddea6f Depends-On: Id0557ac19583de36d7226b14a4c06933da47fe97 Depends-On: I17580a72e4a9384d7d774866e610197e950900cb Change-Id: I4bbfa47fbf6e30fb90d920d6d02cdf6e0b1cdb46	2023-05-03 10:44:41 +02:00
thiemowmde	bee13a2a6d	Avoid calling setMwGlobals multiple times Turns out this method is rather expensive because of the final resetServices() it does internally. It's much better to call it with an array. Change-Id: I7a82951e281512d535ffc5a86929f4441f3ddc4e	2023-05-02 15:48:12 +02:00
Umherirrender	997726c4ee	tests: Use array_fill_keys instead of array_combine/array_fill Change-Id: I3bee4452b182a982b99017beed4ff929e96a10c6	2023-04-29 15:51:03 +02:00
jenkins-bot	65812ee715	Merge "parser: Make all LinkHolderArray properties private"	2023-04-08 22:28:31 +00:00
Aaron Schulz	366a0afd63	parser: improve cache TTL accuracy for CURRENT/LOCAL magic words Consolidate cache TTL handling within CoreMagicVariables. Make the TTL account for how many seconds away the value is from changing. For example, CURRENTHOUR should change soon after the next hour is reached. There is a minimum adjustment TTL to avoid parser-after-save delays. This allows for longer caching in most cases, as well as more up-to-date rendering when the hour/day/week/year is about to change. Previously, there were blind TTLs, which are either way too pessimistic or way too generous. This commit does not change the CURRENTTIME, CURRENTTIMESTAMP, LOCALTIME, and LOCALTIMESTAMP words, since there is no reasonable way to cache output while keeping them up-to-date. Bug: T320668 Change-Id: I9acb42b0d9ff67798a1624cbf9c7cac99c8fbe2f	2023-03-28 22:35:17 +00:00
C. Scott Ananian	cfd9c516e1	Allow setting a ParserOption to generate Parsoid HTML This is an initial quick-and-dirty implementation. The ParsoidParser class will eventually inherit from \Parser, but this is an initial placeholder to unblock other Parsoid read views work. Currently Parsoid does not fully implement all the ParserOutput metadata set by the legacy parser, but we're working on it. This patch also addresses T300325 by ensuring the Page HTML APIs use ParserOutput::getRawText(), which will return the entire Parsoid HTML document without post-processing. This is what the Parsoid team refers to as "edit mode" HTML. The ParserOutput::getText() method returns only the <body> contents of the HTML, and applies several transformations, including inserting Table of Contents and style deduplication; this is the "read views" flavor of the Parsoid HTML. We need to be careful of the interaction of the `useParsoid` flag with the ParserCacheMetadata. Effectively `useParsoid` should always be marked as "used" or else the ParserCache will assume its value doesn't matter and will serve legacy content for parsoid requests and vice-versa. T330677 is a follow up to address this more thoroughly by splitting the parser cache in ParserOutputAccess; the stop gap in this patch is fragile and, because it doesn't fork the ParserCacheMetadata cache, may corrupt the ParserCacheMetadata in the case when Parsoid and the legacy parser consult different sets of options to render a page. Bug: T300191 Bug: T330677 Bug: T300325 Change-Id: Ica09a4284c00d7917f8b6249e946232b2fb38011	2023-03-26 21:46:05 -04:00
Tim Starling	5e30a927bc	tests: Make some PHPUnit data providers static Just methods where adding "static" to the declaration was enough, I didn't do anything with providers that used $this. Initially by search and replace. There were many mistakes which I found mostly by running the PHPStorm inspection which searches for $this usage in a static method. Later I used the PHPStorm "make static" action which avoids the more obvious mistakes. Bug: T332865 Change-Id: I47ed6692945607dfa5c139d42edbd934fa4f3a36	2023-03-24 02:53:57 +00:00
thiemowmde	4ebc778eb7	parser: Make all LinkHolderArray properties private I could not find any use outside of core, or even outside of this class. The class is instantiated a single time in core: https://codesearch.wmcloud.org/search/?q=new%5CW%2BLinkHolderArray&files=%5C.php%24 This instance is not used anywhere else: https://codesearch.wmcloud.org/search/?q=mLinkHolders&files=%5C.php%24 I would argue this doesn't really qualify as a breaking change. This was always meant to be private. Change-Id: I4c614dae1fe1d61c9cf8b7a03c37eb93fae33873	2023-03-15 10:44:04 +01:00
jenkins-bot	6de76f1fad	Merge "Add ParserOutput::getLanguage()"	2023-03-13 14:18:47 +00:00
jenkins-bot	bd5cccf7c4	Merge "Deprecate ParserOutput::{get,set}TOCHTML()"	2023-03-12 21:41:20 +00:00
libraryupgrader	7375f3a5fe	build: Updating mediawiki/mediawiki-codesniffer to 41.0.0 The following sniffs are failing and were disabled: * MediaWiki.Usage.ForbiddenFunctions.eval Change-Id: I6fd0a9296c88a77c3abec6e5e8d568bb469c2d6e	2023-03-11 19:04:09 +00:00
C. Scott Ananian	29853113f7	Deprecate ParserOutput::{get,set}TOCHTML() No uses in deployed code outside mediawiki-core: https://codesearch.wmcloud.org/deployed/?q=%5Bgs%5DetTOCHTML%5C%28&i=nope&files=&excludeFiles=&repos= Bug: T293513 Change-Id: I3fd82150ac581afbeb94f401672702063586fff0	2023-03-10 20:34:33 -05:00
C. Scott Ananian	183a6da420	Add ParserOutput::getLanguage() Provide a way for backend code to determine the primary language of a ParserOutput, eg for setting the Content-Language header of an API response. This is read-only and backed by extension data at the moment for transition purposes; if this API sticks we'll graduate it to a "real" property in the future, with appropriate serialization to/from JSON (T303329). Similarly, this patch only includes the most basic code to handle the various ParserOutput merge cases in ParserOutput::merge{Internal,Html,Tracking}MetaDataFrom(), ParserOutput::collectMetadata(), and OutputPage::addParserOutput{Content,Metadata,Text,}(); mostly inherited from the fact that the storage is backed by extension data at the moment. Generally only the "top-level" parser output gets to set the primary language; we'll presumably need to ensure that the language is consistent during merge. Change-Id: I767daba22805a877d9b806fd77334e508902844b	2023-03-10 18:42:29 -05:00
James D. Forrester	ad06527fb4	Reorg: Namespace the Title class This is moderately messy. Process was principally: * xargs rg --files-with-matches '^use Title;' | grep 'php$' | \ xargs -P 1 -n 1 sed -i -z 's/use Title;/use MediaWiki\\Title\\Title;/1' * rg --files-without-match 'MediaWiki\\Title\\Title;' . | grep 'php$' | \ xargs rg --files-with-matches 'Title\b' | \ xargs -P 1 -n 1 sed -i -z 's/\nuse /\nuse MediaWiki\\Title\\Title;\nuse /1' * composer fix Then manual fix-ups for a few files that don't have any use statements. Bug: T166010 Follows-Up: Ia5d8cb759dc3bc9e9bbe217d0fb109e2f8c4101a Change-Id: If8fc9d0d95fc1a114021e282a706fc3e7da3524b	2023-03-02 08:46:53 -05:00
Amir Sarabadani	0f13e81a15	Reorg: Move five page-related classes to page/ out of includes/ These classes: - MergeHistory - MovePage - ProtectionForm - BadFileLookup (to MediaWiki\Page\File) - FileDeleteForm (to MediaWiki\Page\File) Bug: T321882 Change-Id: Ibeb488ba322c62a34042a0307bbb5562773bcad1	2023-02-23 17:03:49 +01:00
C. Scott Ananian	d5b39490ca	Remove back-compatibility code for ToC marker Before 1.39 we used <mw:toc> and in 1.39 we switched to <mw:tocplace/> (commit `24949480eb`). This was changed to a <meta> tag in 1.40 (commit `0b10563895` and `fa8646ca7b`) and the old content has long since expired from the ParserCache. Clean up the old ParserCache transition code. Change-Id: I3254d0acba31e107b50767797a2b0ad28aba59ee	2023-02-10 00:03:54 -05:00
Amir Sarabadani	c8116223b4	Reorg: Move category-related classes from includes/ to Category/ Bug: T321882 Change-Id: I0b86acfdeaa3a2a0a14b7763fd088122820bafdc	2023-02-09 20:18:54 +01:00
C. Scott Ananian	439656e019	Generate TOC HTML on demand in ParserOutput::getText() * Rather than computing TOC HTML in Parser and setting it in ParserOutput, compute it on demand based on section metadata. This will let Parsoid set section metadata in ParserOutput and have the TOC generated automatically. * This required fixing some "bugs" in Linker's generateTOC which didn't properly close tags and relied on Tidy to fix up unclosed li and ul tags. * This patch relies on converting section metadata objects to array objects, but Linker::generateTOC could be converted to use TOC data instead. * Since TOC generation is now moved to getText(), this is done post-PC load and this eliminates the parser cache split on user language for TOC heading localization. Bug: T293513 Change-Id: Ief1bba326d3612b40930440c872a61abadffab10	2023-01-25 16:42:16 -05:00
jenkins-bot	8220c7dce3	Merge "Generate/set/get TOCData/SectionMetadata objects instead of arrays"	2023-01-19 21:36:56 +00:00
Subramanya Sastry	d8d6ecd39f	Generate/set/get TOCData/SectionMetadata objects instead of arrays * ParserOutput::setSections()/::getSections() are expected to be deprecated. Uses in extensions and skins will need to be migrated in follow up patches once the new interface has stabilized. * In the skins code, the metadata is converted back to an array. Downstream skin TOC consumers will need to be migrated as well before we can remove the toLegacy() conversion. * Fixed SerializationTestTrait's validation method - Not sure if this is overkill but should handle all future complex objects we might stuff into the ParserCache. * This patch emits a backward-compatible Sections property in order to avoid changing the parser cache serialization format. T327439 has been filed to eventually use the JsonCodec support for object serialization, but for this initial patch it makes sense to avoid the need for a concurrent ParserCache format migration by using a backward-compatible serialization. * TOCData is nullable because the intent is that ParserOutput::setTOCData() is MW_MERGE_STRATEGY_WRITE_ONCE; that is, only the top-level fragment composing a page will set the TOCData. This will be enforced in the future via wfDeprecated() (T327429), but again our first patch is as backward-compatible as possible. Bug: T296025 Depends-On: I1b267d23cf49d147c5379b914531303744481b68 Co-Authored-By: C. Scott Ananian <cananian@wikimedia.org> Co-Authored-By: Subramanya Sastry <ssastry@wikimedia.org> Change-Id: I8329864535f0b1dd5f9163868a08d6cb1ffcb78f	2023-01-19 16:18:13 -05:00
C. Scott Ananian	96e4f5d840	JsonCodec: fix en/decoding of nested objects and stdClass objects Add a type annotation when encoding `stdClass` objects so that we can be sure to decode them as objects instead of arrays. This avoids issues such as that seen in the Graph extension (T312589) where an extension data key is stored as a stdClass. If ParserOutput was computed fresh, a subsequent getExtensionData(..) call will return a stdClass object, but if the ParserOutput was cached, getExtensionData() would return an array. After this change the return type is always consistent. Properly handle nested objects: encode all object values returned by JsonSerializable::jsonSerialize() (so that client is not responsible for implementing this correctly), and decode all object values before calling JsonUnserializable::newFromJsonArray (again, so that the client is not responsible for decoding its property values). The new behavior matches how serialize/unserialize is handled in the 'naive' JsonUnserializable{Sub,Super}Class test cases; ParserOutput (the only users of JsonCodec in core) was doing an extra manual decode for the ExtensionData array in ParserOutput::initFromJson that is no longer necessary. The GrowthExperiments and SemanticMediaWiki extensions were working around the non-recursive nature of JsonCodec; this patch depends on patches to GrowthExperiments to make it agnostic about whether object unserialization occurs before or after ::newFromJsonArray() is called, which can then be further cleaned up once this is released. A pull request for SemanticMediaWiki has also been submitted. Bug: T312589 Depends-On: I3413609251f056893d3921df23698aeed40754ed Change-Id: Id7d0695af40b9801b42a9b82f41e46118da288dc	2023-01-12 14:12:32 -05:00
jenkins-bot	d3ecbc93a3	Merge "parser: Optimize regex patterns used in LinkHolderArray"	2023-01-07 16:21:50 +00:00
thiemowmde	69c5757243	parser: Optimize regex patterns used in LinkHolderArray Two micro-optimizations are done in this patch: 1. We know exactly how these placeholders are built in the makeHolder() method. In »<!--IWLINK'" 1-->« it's guaranteed to be a single number and in »<!--LINK'" 1:2-->« it's two numbers. The most extreme synthetic micro benchmark I did cuts the runtime of these regular expressions down to about 25%. It won't make much of a difference in real-world scenarios but is still worth it, I believe. It also makes the code more specific and less confusing (see below). 2. We don't need to use the full string »<!--LINK'" 1:2-->« as array key when the only thing that matters is the part »1:2«. Note the same is done just a few lines below in the replaceInterwiki() method. This code does have outstanding test coverage via all the parser tests, I believe. Any change here that doesn't make a test fail should be safe. Note the unit tests have been written many years later via I2c12cc7, using "dummy" strings and such instead of the expected numeric namespace and link ids. Most of this is already fixed via previous patches. The last mistake addressed in this patch is that getPrefixedDBkey() is supposed to be a title. It can't contain one of these placeholders. Follow-Up: I2c12cc76a9bf01eb527db3ea038e4adc59446cac Change-Id: Ie994059092df8861ddb97c098acd082698d45c53	2023-01-07 13:25:33 +00:00
Amir Sarabadani	523ab7cff8	Reorg: Move RawMessage to under language/ To follow Message. This is approved as part of RFC T166010. Also namespace it but doing it properly with PSR-4 would require namespacing every class under language/ and that will take some time. Bug: T321882 Change-Id: I195cf4c67bd51410556c2dd1e33cc9c1033d5d18	2022-12-16 11:30:19 +01:00
Umherirrender	fd516a98e1	Fix whitespaces after comma Change-Id: Ide6de0a53661e6f650099d7b1f274a02699441df	2022-12-15 01:24:14 +01:00
jenkins-bot	be2ff28b48	Merge "Reorg: Move MagicWord related files to under parser/"	2022-12-11 18:15:48 +00:00
Amir Sarabadani	a1b4699fea	Reorg: Move MagicWord related files to under parser/ This is approved as part of T166010 RFC. Bug: T321882 Change-Id: Ia4498c0a20e38a6a288dc14065ea8242c84fbc49	2022-12-09 13:48:35 +01:00
thiemowmde	800fd1d4c4	Fix bogus nextLinkID in LinkHolderArrayIntegrationTest Parser::nextLinkID cannot return a string. It returns a positive integer number. Note a very similar mistake was already fixed before via I7e71ffc. Change-Id: Ifce71d0f4db31787bf0eb84e621cfdeb07c674ef	2022-12-09 11:45:09 +01:00
Reedy	0cb2c3c106	Fix casing of class and function name usages Bug: T253628 Change-Id: I5c64f436d3cf757390b751ce3e34bfc7872bc176	2022-12-04 19:09:30 +00:00
Subramanya Sastry	bcb7009c41	Use real section metadata in tests * Most of the files were generated from the validate* script. * Post-processing of these generated files to fix problems: - Some of the files were binary-edited via "vi -b" to fix some issues with bad property names used in the prior step. 1.36, 1.38, 1.39 files were all fixed up this way. - In addition, the 1.36 file had bad data (not sure if the wrong php version was used) but I fixed this by splicing in data from the 1.38 file to revert incorrect changes to "Categories" and "IndexPolicy" properties. - The 1.35 data file was binary edited by splicing data from the now 1.36 version. Change-Id: I4e22b94ce30c2ad9b1f544c15e1c3cd0dd0bce6b	2022-11-23 12:45:27 -05:00
Subramanya Sastry	623625e8f2	Followup to `fb747bc0`: Fix bad property names Change-Id: I362b0cf8feca13a91fd91961d400579f2e4ea97e	2022-11-18 16:12:06 -06:00
Subramanya Sastry	fb747bc038	Add section metadata parsercache serialization tests for MW 1.40 * Generate data files for 1.40 only since the new formats only showed up in 1.40 and won't be present in the parser cache for older MW versions. Change-Id: I6f297e3091ec2faab7c2203c138800551b01e32a	2022-11-17 15:48:15 -06:00
daniel	118d4980b2	Track the reason for rendering. Allow the causeAction that triggers page rendering to be looped through to ParserCache, so we can count what causes writes to the cache. Change-Id: I6ad8e105a3ce457e3ab4f85cd154f47a32085e0d	2022-11-09 09:38:57 +00:00
daniel	8c1c1ae35a	Enable pig-latin variant for testing Having pig-latin enabled per default in dev environments is convenient for manual testing. More importantly, it will allow us to write end-to-end tests for variant conversion. Depends-On: I9dc2f743ac487b0f7cfb667150c0f6950d5e7fce Depends-On: I85b66c85be3959d48a048733af17197bc4cf70af Change-Id: Ia80ad33cbf5e311fa8b84bd765a8df8d156f4c38	2022-11-08 17:45:51 +05:30
Tim Starling	0077c5da15	Use short array destructuring instead of list() Introduced in PHP 7.1. Because it's shorter and looks nice. I used regex replacement. Change-Id: I0555e199d126cd44501f859cb4589f8bd49694da	2022-10-21 15:33:37 +11:00
C. Scott Ananian	d96207ab86	Auto-discover core parser test files Make parser test discovery in core work the same way as it does in extensions: any file ending with *.txt under tests/parser is run as a parser test file. This search is recursive, which is motivation to also move some unrelated files under tests/parser/preprocess over to tests/phpunit/data/preprocess where they belong; they are used by tests/phpunit/includes/parser/PreprocessorTest.php and are unrelated to the parser test infrastructure. Change-Id: I8c84b4b853e1309929dceb700aab1e79a598d8ab	2022-10-13 10:41:15 -04:00
Jon Robson	d1662dca59	Parser: Use linkAnchor in section definition as well as anchor The anchor property comes from Sanitizer::escapeIdForAttribute() and should be used if you want to (eg) look up an element by ID using document.getElementById(). The linkAnchor property comes from Sanitizer::escapeIdForLink() and contains additional escaping appropriate for use in a URL fragment, and should be used (eg) if you are creating the href attribute of an <a> tag. Bug: T315222 Change-Id: Icecf9640a62117c2729dca04af343fb1ddaaf8f8	2022-09-14 12:54:36 -04:00
jenkins-bot	61cbd18ff3	Merge "parser: Use a <meta> tag for the internal TOC_PLACEHOLDER"	2022-09-09 21:12:34 +00:00
Subramanya Sastry	c8a944a94b	Add support to enable Scribunto & Parsoid to handle nowikis properly * Lua modules have been written to inspect nowiki strip state markers and extract nowiki content to further process them. Callers might have used nowikis in arguments for any number of reasons including needing to have the argument be treated as raw text instead of wikitext. While we might add first-class typing features to wikitext, templates, extensions, and the like in the future which would let Parsoid process template arguments based on type info (rather than as wikitext always), we need a solution now to enable modules to work properly with Parsoid. * The core issue is the decoupled model used by Parsoid where transclusions are preprocessed before further processing. Since nowikis cannot be processed and stripped during preprocessing, Lua modules don't have access to nowiki strip markers in this model. * In this patch, we change extension tag processing for nowikis. When generating HTML, nowikis are replaced with a 'nowiki' strip marker with the nowiki's "innerXML" (only tag contents). In this patch, during preprocessing, instead of adding a 'general' strip marker with the "outerXML" (tag contents and the tag wrapper), we add a 'nowiki' strip marker with its "outerXML". * Since Parsoid (and any clients using the preprocessed output) will unstrip all strip markers, the shift from a general to nowiki strip marker won't make a difference. * To support Scribunto and Lua modules unstrip usage, this patch adds new functionality to StripState to replace the (preprocessing-)nowiki strip markers with whatever its users want. So, Scribunto could pass in a callback that replaces these with the "innerXML" by stripping out the tag wrapper. * Hat tip to Tim Starling for recommending this strategy. * Updated strip state tests. Bug: T272507 Bug: T299103 Depends-On: Id6ea611549e98893f53094116a3851e9c42b8dc8 Change-Id: Ied0295feab06027a8df885b3215435e596f0353b	2022-09-01 21:04:42 +00:00
Bartosz Dziewoński	f7158c396d	Add markup to page titles to distinguish the namespace and the main text Pages outside of the main namespace now have the following markup in their <h1> page titles, using 'Talk:Hello' as an example: <h1> <span class="mw-page-title-namespace">Talk</span> <span class="mw-page-title-separator">:</span> <span class="mw-page-title-main">Hello</span> </h1> (line breaks and spaces added for readability) Pages in the main namespace only have the last part, e.g. for 'Hello': <h1> <span class="mw-page-title-main">Hello</span> </h1> The change is motivated by a desire to style the titles differently on talk pages in the DiscussionTools extension (T313636), but it could also be used for other things: * Language-specific tweaks (e.g. adding typographically-correct spaces around the colon separator: T249149, or replacing it with a different character: T36295) * Site-specific tweaks (e.g. de-emphasize or emphasize specific namespaces like 'Draft': T62973 / T236215) The markup is also added to automatically language-converted titles. It is not added when the title is overridden using the wikitext `{{DISPLAYTITLE:…}}` or `-{T|…}-` forms. I think this is a small limitation, as those forms are mostly used in the main namespace, where the extra markup isn't very helpful anyway. This may be improved in the future. As a workaround, users could also just add the same HTML markup to their wikitext (as those forms accept it). It is also not added when the title is overridden by an extension like Translate. Maybe we'll have a better API before anyone wants to do that. If not, one could un-mark Parser::formatPageTitle() as @internal, and use that method to add the markup themselves. Bug: T306440 Change-Id: I62b17ef22de3606d736e6c261e542a34b58b5a05	2022-08-16 23:36:21 +00:00
C. Scott Ananian	0b10563895	parser: Use a <meta> tag for the internal TOC_PLACEHOLDER Split out from the I44045b3b9e78e change. This is consistent with what Parsoid will use for the TOC marker. Bug: T287767 Bug: T270199 Bug: T311502 Depends-On: I1f607cf1ef1b61fb4d2e1880de756fb94d5a6b22 Change-Id: Ie63eed07b9bca1bfa07d4c256aba3728cedd8f93	2022-08-16 06:05:17 +00:00
C. Scott Ananian	fa8646ca7b	parser: Prepare to use a <meta> tag for the internal TOC_PLACEHOLDER Split out from the I44045b3b9e78e and Ie63eed07b9bca changes. We first add code to handle the new tag as well as the old tag in ParserCache contents. This will allow us to safely rollback if needed when deploying the follow-on patch which actually changes the tag used. Bug: T287767 Bug: T270199 Bug: T311502 Change-Id: Ib3e5e010b9f5ca2c4ea7c4fe28080170b6a88812	2022-08-15 18:54:52 -04:00
Derick Alangi	5e8cd2c838	Migrate from `setMwGlobals()` to `overrideConfigValue(s)` Change-Id: I3f167d0e7d59a5aa091c3095a7d96c889d6e7e78	2022-08-02 10:14:10 +01:00
Brian Wolff	f79ea41072	parser: Mock WikiPage::getContentModel in ParserCacheTest to fix php8.1 PHP 8.1 doesn't like this returning null. Bug: T313663 Change-Id: I59eb21301aab946b6362fea956b398337af8d971	2022-07-25 20:51:51 +00:00
Thiemo Kreuz	61ae7504df	Replace trivial usage of mock builder with createMock() shortcut createMock() does the same, but is much easier to read. A small difference is that some of the replacements made in this patch didn't use disableOriginalConstructor() before. In case this was relevant we should see the respective test fail. If not we can save some CPU cycles and skip these constructors. Change-Id: Ib98fb06e0fe753b7a53cb087a47e1159515a8ad5	2022-07-15 16:43:48 +00:00
Umherirrender	246bc931f6	tests: Set wgLang with MediaWikiIntegrationTestCase::setUserLang Change-Id: Ic1247a6719032b3a0ea1f76514edc5ffd5a7854a	2022-07-13 00:59:46 +02:00
Umherirrender	047c184bfe	tests: Use Title::makeTitle instead of Title::newFromText Avoid parsing known titles in tests to improve performance Change-Id: Ibfccfe696f0b8bfda0b99abae324e60bbecef7d8	2022-07-06 00:44:00 +02:00
Derick Alangi	d01e3ed739	Replace deprecated calls `ParserOptions::newCanonical( 'canonical' )` This is a quick find & replace of calls to the deprecated method ParserOptions::newCanonical() when the context is the string literal 'canonical'. This can be safely replaced by calling newFromAnon(). Change-Id: If7bb68459b11e0c5f5de188f10fdae85ad1a78bf	2022-06-16 14:22:24 +01:00
jenkins-bot	b494330aa7	Merge "ParserCache: always use JSON"	2022-06-07 14:12:29 +00:00
daniel	697f28df32	ParserCache: always use JSON When JSON support was introduced into ParserCache in 1.36, it was controlled by a feature flag, $wgParserCacheUseJson. The feature flag was "born deprecated" in 1.36. It can now be removed. This means that ParserCache will always store entries as JSON. Support for reading old non-JSON entries remains intact. This is needed when updating wikis from a version older than 1.36 to the current version. Change-Id: Id04e42bfb458d98414bac50e0d6c505e8878e5c0	2022-06-07 15:19:45 +02:00
Reedy	41c42d5435	Tests: Cleanup some unnecessary nested function calls Replace ->will( ->return with ->willReturn( Change-Id: Ia2dfafa03cac8169d86d6fa5a30b73bfad1fe9fa	2022-06-06 01:02:34 +01:00
Umherirrender	8557249ac6	tests: Update namespace for MediaWiki\SpecialPage\SpecialPageFactory MediaWiki\Special\SpecialPageFactory is deprecated since 1.35 Change-Id: I558a59e781edef4a78b4e902961809ba07f4f695	2022-05-28 01:31:53 +02:00
Nikki Nikkhoui	b5fe60a7e1	Introduce PageBundleJsonTrait for serialization New trait for PageBundle class to serialize & deserialize PageBundle object into json before stashing and after unstashing. Change-Id: I486fab5b3d01bcef2b535af579cd9672403b2102	2022-05-23 17:54:48 +01:00
Brian Wolff	bec8dada48	Clarify generate-html and make ParserOutput behave as expected Previously: * It was unclear that generate-html is an optional optimization * Most of MediaWiki core was doing $parserOutput->setText('') if html wasn't generated. However this is wrong and will cause $parserOutput->hasText() to return true and also potentially cause cache pollution if a content handler both does that and supports parser cache (Like MassMessage; see T299896) * The default value of mText in the constructor was '', and most of the time MW used that default. This doesn't seem right. If setText() is never called, the ParserOutput should not be considered to have text * It was impossible to set mText to null, as $parserOutput->setText(null) was a no-op. Docs implied you were supposed to do this, so it was very confusing. This patch clarifies docs, changes the default value for ParserOutput::$mText from '' to null, and makes $parserOutput->setText(null) do what you expect it to. The last two are arguably breaking changes, although the previous behaviours were unexpected, mostly undocumented and based on a code search do not appear to be relied on. It seems like the main reason this only broke MassMessage is most content handlers either don't support generateHtml, or they don't support parser cache. Bug: T306591 Change-Id: I49cdf21411c6b02ac9a221a13393bebe17c7871e Depends-On: I68ad491735b2df13951399312a4f9c37b63a08fa	2022-05-03 11:23:08 +02:00
Aryeh Gregor	b85391120b	Use UrlUtils in Parser Change-Id: I65f851ea29efe482ee225565a200d623fa85bc20	2022-04-28 17:14:51 +03:00
Tim Starling	d6a3b6cfa8	TempUser EditPage and permissions * Allow EditPage to create a user on page save. This has to be enabled in config and then activated by the UI/API caller. * Add an autocreate source for temporary users. * Allow editing by anonymous users via automatic account creation when $wgGroupPermisions['']['edit'] = false. On an edit GET request, use an unsaved placeholder user to stand in for post-create permissions. On preview or aborted save, the username to be created is stashed in a session and restored on subsequent requests. * On a (likely) successful page save, create the account. * Put regular non-temporary users in a "named" group so that they can be given additional permissions. * Use a different "~~~" signature for temporary users * Show account creation warnings on edit and preview. Change-Id: I67b23abf73cc371280bfb2b6c43b3ce0e077bfe5	2022-04-26 14:10:53 +10:00
Umherirrender	2909d06a08	Use new namespace for revision related classes All revision related classes are namespaced MediaWiki\Revision instead of MediaWiki\Storage since 1.32. The old namespaced class names are deprecated and only kept for backwards-compatibility. Bug: T305784 Change-Id: I34e492d84d9fc4bc78481667202716d93b3c43cb	2022-04-14 23:03:43 +02:00
Tim Starling	13c1839735	Fix SignatureValidatorFactory circular dependency Parser is using the service container to get a SignatureValidator because, as noted in Gerrit comments on the relevant commit, there is a circular dependency Parser -> SignatureValidatorFactory -> Parser. So, have SignatureValidatorFactory::__construct() take a closure which returns a Parser, instead of an actual Parser or ParserFactory. Change-Id: I7bf4660f84ec8c8fb1d5b3b8581fe5d82bc3156e	2022-04-13 12:38:00 +10:00
jenkins-bot	0827d5ffea	Merge "Fix notice from ParserCacheSerializationTestCases"	2022-04-10 15:22:58 +00:00
Alexander Vorwerk	62a70ec7c7	Use new namespace for revision related classes All revision related classes are namespaced MediaWiki\Revision instead of MediaWiki\Storage since 1.32. The old namespaced class names are deprecated and only kept for backwards-compatibility. Bug: T305784 Change-Id: Ia0030814ce2176d06e2898acffe533d31633fccb	2022-04-09 20:22:36 +02:00
Tim Starling	0d94c44743	Fix notice from ParserCacheSerializationTestCases Change-Id: I6e65952367dd6de30916bfc574d1e4a5db84b998	2022-04-08 10:57:46 +10:00
jenkins-bot	1a91fcb41e	Merge "Emit deprecation warnings for ParserOutput::addOutputHook()"	2022-04-07 21:27:33 +00:00
C. Scott Ananian	05eda60400	Emit deprecation warnings for ParserOutput::addOutputHook() Once no one is calling ::addOutputHook() we can stub out ::getOutputHook() to just return an empty array. Code search: https://codesearch.wmcloud.org/deployed/?q=-%3E%28addOutputHook%7CgetOutputHooks%29%5C%28&i=nope&files=&excludeFiles=&repos= Bug: T292321 Change-Id: I1081696c4cc2e67c3c38b8f6e53054e62ac71502	2022-04-07 02:48:57 +00:00
C. Scott Ananian	c1a326f44e	Emit warnings when accessing deprecated public properties of Parser Code search: https://codesearch.wmcloud.org/deployed/?q=-%3E%28mLinkID%7CmIncludeSizes%7CmDoubleUnderscores%7CmShowToc%7CmRevisionId%7CmRevisionTimestamp%7CmRevisionUser%7CmRevisionSize%7CmInputSize%7CmInParse%7CmFirstCall%7CmGeneratedPPNodeCount%29&i=nope&files=&excludeFiles=&repos= The following @deprecated properties are not included in this patch in order to keep it conservative: * Hard to code search because of generic name: $mTitle, $ot, $mOptions * Should be @internal, not @deprecated, because they are used internally: $mPPNodeCount, $mHighestExpansionDepth * Used by SyntaxHighlight_GeSHi and TemplateStyles extensions (even though they could/should use their own independent unique ID): $mMarkerIndex * Used by test cases for Wikibase: $mExpensiveFunctionCount Change-Id: I1dadff934ead767cbd25615c08768e8e935d6b2e	2022-03-31 19:25:33 -04:00
Alexander Vorwerk	82739980fd	parser: change 'level' in parse api back to string We changed to operate on an int internally in I92daeb0f7be8a0. Let's cast it back to a string for the api in order to prevent a breaking change, which is not really necessary. Bug: T304171 Change-Id: I5f5a9203b4dd085cb5defba72c6650532bc9e8d1	2022-03-18 19:52:24 +01:00
jenkins-bot	c268687d46	Merge "Hard deprecate Sanitizer::removeHTMLtags()"	2022-03-08 19:29:55 +00:00
jenkins-bot	d1cfc0317d	Merge "Add explicit casts between scalar types"	2022-03-08 17:32:26 +00:00
Umherirrender	6ea3d6ac2c	Add explicit casts between scalar types php internal functions like floor/round/ceil documented to return float, most cases the result is used as int, added casts Found by phan strict checks Change-Id: I92daeb0f7be8a0566fd9258f66ed3aced9a7b792	2022-03-08 16:59:01 +00:00
C. Scott Ananian	d6576e5dc6	Hard deprecate Sanitizer::removeHTMLtags() Rename Sanitizer::removeHTMLtags() into an @internal method named ::internalRemoveHtmlTags() so that we can deprecate external use. Code search: https://codesearch.wmcloud.org/deployed/?q=removeHTMLtags&i=nope&files=&excludeFiles=&repos= Followup-To: Ic864c01471c292f11799c4fbdac4d7d30b8bc50f Depends-On: Iaca83ed06e9c61d8366579cd2283cba653c82319 Depends-On: I1963bfe9a99198ea02ca482a5769467ce806cd58 Depends-On: I83923d8b38d33f3638cd53958dd10f257ec21f7c Depends-On: I018b34bb5f6e113056da9b04cc72d4318422adce Change-Id: I202826f8b27519f7be89643e24eda47a6e3fc9f6	2022-03-07 22:04:56 -05:00
C. Scott Ananian	9f14fbd002	Add Sanitizer::removeSomeTags() which uses Remex to tokenize The existing Sanitizer::removeHTMLtags() method, in addition to having dodgy capitalization, uses regular expressions to parse the HTML. That produces corner cases like T298401 and T67747 and is not guaranteed to yield balanced or well-formed HTML. Instead, introduce and use a new Sanitizer::removeSomeTags() method which is guaranteed to always return balanced and well-formed HTML. Note that Sanitizer::removeHTMLtags()/::removeSomeTags() take a callback argument which (as far as I can tell) is never used outside core. Mark that argument as @internal, and clean up the version used by ::removeSomeTags(). Use the new ::removeSomeTags() method in the two places where DISPLAYTITLE is handled (following up on T67747). The use by the legacy parser is more difficult to replace (and would have a performance cost), so leave the old ::removeHTMLtags() method in place for that call site for now: when the legacy parser is replaced by Parsoid the need for the old ::removeHTMLtags() will go away. In a follow-up patch we'll rename ::removeHTMLtags() and mark it @internal so that we can deprecate ::removeHTMLtags() for external use. Some benchmarking code added. On my machine, with PHP 7.4, the new method tidies short 30-character title strings at a rate of about 6764/s while the tidy-based method being replaced here managed 6384/s. Sanitizer::removeHTMLtags blazes through short strings 20x faster (120,915/s); some of this difference is due to the set up cost of creating the tag whitelist and the Remex pipeline, so further optimizations could doubtless be done if Sanitizer::removeSomeTags() is more widely used. Bug: T299722 Bug: T67747 Change-Id: Ic864c01471c292f11799c4fbdac4d7d30b8bc50f	2022-03-04 14:06:02 -05:00
jenkins-bot	24aa34d06c	Merge "phpcs: Disable `Generic.Files.LineLength` for test files"	2022-02-21 15:51:29 +00:00
jenkins-bot	99dee6855a	Merge "Change return value of ParserOutput::getPageProperty() when property is missing"	2022-02-19 00:49:48 +00:00
C. Scott Ananian	c39ef6c6c9	Change return value of ParserOutput::getPageProperty() when property is missing The old ParserOutput::getProperty() method returned `false` when a property was missing. This requires callers to use the `?:` syntax to supply default values, which then causes any falsey value to be treated as missing. So, for example, setting the defaultsort to '0' will cause the default sort to be ignored. Modern php convention is to use `null` for missing values, and the `??` syntax is a better/more restrictive alternative to `?:`. We renamed `ParserOutput::getProperty()` to `::getPageProperty()` in 1.38 (Ie963eea5aa0f0e984ced7c4dfa0fd65d57313cfa/T287216) but kept the return value convention. Before this actually makes it into a 1.38 release, take the opportunity to fix the return value for the new `ParserOutput::getPageProperty()` method to return `null` when the property is missing. We need to do some temporary workarounds to the places we'd already swapped over to use the new `::getPageProperty()` method to allow them to handle either `false` or `null` as a return value; we'll clean that up once this is merged. Code search: https://codesearch.wmcloud.org/deployed/?q=-%3EgetPageProperty%5C%28\|T301915&i=nope&files=&excludeFiles=&repos= Bug: T301915 Depends-On: I3f11ce604970e47b41fc1c123792df8c3045626f Depends-On: Ie7533f49fe4cad01ebfda29760d23c61e9867b10 Depends-On: Ic5c09f5caa4c897bc553c614fbae9cee159566a2 Depends-On: I0278b2eafd90e77e4fee41c45a1165fb79ddf47e Depends-On: I383abb6b7dc5e96c0061af13957609f6e31a1065 Depends-On: I79f9f4078e415284af29b15047bafd1c823d7f5b Depends-On: I02276c48c49f5d2d241a69eb0a6cdf439b572d8b Depends-On: I71628661b4539a4e35ae32846e719f92bcf782e0 Depends-On: I7e215cb43de0ce150a6bcc00f92481dcdcfed383 Change-Id: Iaa25c390118d2db2b6578cdd558f2defd5351d15	2022-02-18 21:15:58 +00:00
Timo Tijhof	8d406bbcd6	phpcs: Disable `Generic.Files.LineLength` for test files There is a common and reasonable need for longer lines in tests. The nudge for shorter lines doesn't seem valuable here. The natural breaks will likely still fall in 80-100 given the enforced practice for non-test code, e.g. whether through habit, or 80-100 column markers in text editors, or the finite width of diff and code review interfaces. Change-Id: I879479e13551789a67624ce66f0946d2f185e6ee	2022-02-18 18:32:05 +00:00
C. Scott Ananian	3c211fdb3c	Update ParserCache serialization test cases to use valid category keys Category keys are supposed to be non-null strings. The test cases use bogus integer values, which causes issues when refactoring more strictly enforces validity checks on category sort key values. Change-Id: If2937a694ba6bd4c522336f33aa58d40949b5a54	2022-02-17 12:12:53 -05:00
C. Scott Ananian	b46dfcc351	Update ParserCache serialization test cases to use a valid index policy Valid values for ParserCache::$mIndexPolicy are '', 'noindex', and 'index'. The test cases use the bogus value 'policy1', which causes issues when refactoring more strictly enforces validity checks on index policy values. Change-Id: I2d00ff4e3ded043d18942c8482a39fac14ec60bc	2022-02-09 12:47:27 -05:00
Func	7f74a2e50c	Clean up tests that misused the parameters of assertSame/Equals Expected value is the first parameter to assertSame() or assertEquals(). And turn to use assertCount() for some assertions against count of array. Based on code search `assert(?:Same\|Equals)\(.+,.+expected` and I look through files roughly, so some assertions that don't contain 'expected' are also fixed. In the meantime, some assertions that I am not clear about are not touched. Change-Id: I75798b60d29fd19b33f4fdf34ed3c788db420d01	2022-02-08 07:21:10 +00:00
jenkins-bot	44c1145dff	Merge "Add ParserOutput::appendExtensionData()"	2022-02-04 22:20:47 +00:00
jenkins-bot	fe5f09812d	Merge "Add ParserOutput::{set,append}JsConfigVar()"	2022-02-04 21:15:56 +00:00
C. Scott Ananian	baaee141e4	Add ParserOutput::appendExtensionData() Soft-deprecate the use of ::setExtensionData() to destructively update the value stored under a single key. Add the new ::appendExtensionData() method to use where multiple values are desired. This accommodates the asynchronous and incremental parsing goals on the Parsoid roadmap. Bug: T300981 Change-Id: I2dea4ba71ea506428854a9983c1abd906b2efd5f	2022-02-04 13:43:22 -05:00
C. Scott Ananian	0f5dc718ce	Add ParserOutput::{set,append}JsConfigVar() Deprecate ParserOutput::addJsConfigVars() and add setter methods which better ensure that the ParserOutput contents are independent of parse order. This accommodates the asynchronous and incremental parsing goals on the Parsoid roadmap. Bug: T300307 Change-Id: I4f08d1098da211f7bf5c43c08c620de224cbf37f	2022-02-04 13:42:59 -05:00
daniel	026133bb05	remove access to config globals from includes/parser Loops ServiceOptions through to CoreParserFunctions and CoreTagHooks to avoid access to the main config from static methods. Bug: T294739 Change-Id: Ia6c97f2d0952964c2ad6189f8053ad127589b37c	2022-02-01 07:48:57 -08:00
Alexander Vorwerk	decbaf4f38	phpunit: use ->getServiceContainer() in integration tests Change-Id: I38299cb65eeaadfdc0eb05db4e8c0b0119cfb37d	2022-01-27 22:04:16 +01:00
Alexander Vorwerk	1f78d6a249	ParserCacheSerializationTestCases: call ::addModule(Style)?s with an array Bug: T299865 Change-Id: Ifb0dd97c7023154ba1d834e574a913cfe9ff0f1f	2022-01-23 17:26:32 +01:00
Reedy	044259b8d5	ParserOutputTest: Call ParserOutput::addModule(Style)?s with an array Bug: T299747 Change-Id: I14bb2b12f515369a3890e70e8effeef4c501ecbd	2022-01-21 10:40:46 +00:00
Kosta Harlan	0c2cc804e1	phpunit: Use is_file/is_dir instead of file_exists Yes, it's a micro-optimization. See https://bugs.php.net/bug.php?id=78285 and https://thephp.cc/articles/caching-makes-everything-faster-right for more info. Change-Id: Ib8e8e9794e15066476f35cdb1236df8b983274d6	2022-01-03 21:47:56 +01:00
Thiemo Kreuz	b4c63c64ae	Remove some more comments that literally repeat the code Nothing to learn from these. You can find a longer explanation in the comments in I93751e6. Change-Id: I195aae70fc282b58be5b18160783f27d38605d15	2021-12-09 19:01:36 +01:00

684 commits