Thijs/wiki.techinc.nl

Author	SHA1	Message	Date
C. Scott Ananian	e22d93a6bb	Hard-deprecate ParserOutput::{get,set}Flag() These were deprecated in 1.38; users are expected to use ParserOutput::{get,set}OutputFlag() instead, which helps eliminate a confusing aliasing of many MW methods named "flag". Original deprecation: `06ab90f163` Code search: https://codesearch.wmcloud.org/search/?q=%5BOo%5Dut%28put%29%3F%28%5C%28%5C%29%29%3F-%3E%28g%7Cs%29etFlag%5C%28&i=nope&files=&excludeFiles=&repos= Patches for non-production extensions: PageProperties: I592d43e2c912df635cd9162180ed20a6136535f1 CIForms: I238a6c557891bb6d271d2641261ef69542b7957e Bug: T292868 Bug: T305161 Change-Id: I4525443ab0932241b0cf64ab606f7ab7d6d70b6e	2023-07-28 13:51:02 -04:00
Isabelle Hurbain-Palatin	b2cfa31eb6	Add append/getOutputString to ParserOutput This aims at providing an interface similar to setOutputFlag for string sets, such as the ones used in CSP properties. Change-Id: I6f103bd88802e66611e483403a2f8a540d54aae9	2023-07-27 11:37:11 +02:00
thiemowmde	3c631a59f2	More specific array type hints in ParserOutput/OutputPage Change-Id: I7dbecebb8b26e57afda13f46d3b895f085c4e95e	2023-07-03 15:52:18 +02:00
Subramanya Sastry	0e9656e6da	Add return type to getIndicators() in ParserOutput & OutputPage This is in preparation for changes on the Parsoid side to make sure its signature is compatible with the ContentMetadataCollector interface there. Change-Id: Ife4ae81dbc304097da7dcba40b143f7030b959f3	2023-06-02 16:13:01 +05:30
jenkins-bot	e1c1632d9c	Merge "ParserOutput: Ensure page title is updated after merging properties"	2023-05-11 18:23:20 +00:00
Umherirrender	e04d3a28f6	Replace internal Hooks::runner The Hooks class contains deprecated functions and the whole class is going to get removed, so remove the convenience function and inline the code. Bug: T335536 Change-Id: I8ef3468a64a0199996f26ef293543fcacdf2797f	2023-05-11 06:17:38 +00:00
Subramanya Sastry	632481c382	ParserOutput: Ensure page title is updated after merging properties Eventually we should merge the "title text" and "display title" in ParserOutput (T293514) but for now mirror the logic in ParserOutput::mergeHtmlMetadataFrom() and update the title text from the source if it hasn't already been set in the destination. This patch ensures that after page properties are merged during metadata collection, the title text is suitably updated if the 'displaytitle' property is set. This will let Parsoid pass displaytitle (metadata) tests in integrated mode since Parsoid relies on merging metadata from multiple ParserOutput objects (in the DataAccess object that is used to expand templates, etc.) Once this patch is merged, Parsoid patches may start failing CI till we submit a patch there to fix up the integrated test failures list since some previously failing tests may now pass. Bug: T293514 Bug: T294621 Change-Id: Ia673f1261ccd03caf455122b71cfb9769b02f22e	2023-05-10 08:53:41 +00:00
jenkins-bot	c5152db020	Merge "Remove back-compat for <editsection>"	2023-04-28 15:59:12 +00:00
Subramanya Sastry	3e297c43ad	Fix breakages generating TOC for API Help pages * TOCData in Parsoid expects to process non-string-key indexed arrays. * Don't use 'null' as the default for maxtoclevel to ensure that TOC is always displayed even when it isn't passed in as a param by callers. * Follows up on `05535be6` which only partially fixed the breakage caused by `153a4157` and `439656e0` Bug: T334551 Change-Id: I8883b58574ea8ed0566de2c44dba3408a47d2d0c	2023-04-12 15:37:03 -05:00
jenkins-bot	90997943f9	Merge "Parser: Remove back-compatibility NO_TOC_CONVERSION code"	2023-03-27 20:43:53 +00:00
C. Scott Ananian	cfd9c516e1	Allow setting a ParserOption to generate Parsoid HTML This is an initial quick-and-dirty implementation. The ParsoidParser class will eventually inherit from \Parser, but this is an initial placeholder to unblock other Parsoid read views work. Currently Parsoid does not fully implement all the ParserOutput metadata set by the legacy parser, but we're working on it. This patch also addresses T300325 by ensuring the Page HTML APIs use ParserOutput::getRawText(), which will return the entire Parsoid HTML document without post-processing. This is what the Parsoid team refers to as "edit mode" HTML. The ParserOutput::getText() method returns only the <body> contents of the HTML, and applies several transformations, including inserting Table of Contents and style deduplication; this is the "read views" flavor of the Parsoid HTML. We need to be careful of the interaction of the `useParsoid` flag with the ParserCacheMetadata. Effectively `useParsoid` should always be marked as "used" or else the ParserCache will assume its value doesn't matter and will serve legacy content for parsoid requests and vice-versa. T330677 is a follow up to address this more thoroughly by splitting the parser cache in ParserOutputAccess; the stop gap in this patch is fragile and, because it doesn't fork the ParserCacheMetadata cache, may corrupt the ParserCacheMetadata in the case when Parsoid and the legacy parser consult different sets of options to render a page. Bug: T300191 Bug: T330677 Bug: T300325 Change-Id: Ica09a4284c00d7917f8b6249e946232b2fb38011	2023-03-26 21:46:05 -04:00
C. Scott Ananian	8aae904254	Parser: Remove back-compatibility NO_TOC_CONVERSION code The TOC used to be language-converted in ParserOutput::getText(), but it wasn't possible to apply custom rules defined in the wikitext article body at ::getText() time. Remove the various hacks that we'd added in an attempt to do so, which were made unnecessary by I321cd31dae64bbf845d53282e5d28a55bc4ec319. Bug: T306862 Change-Id: Ib12cd02e9ade91d5794462e8833f2aa3b45a51f2	2023-03-24 22:14:42 +00:00
C. Scott Ananian	99e9d4927f	Remove back-compat for <editsection> The tag has been <mw:editsection> since at least 2011 (`f0fd318a4e`), we no longer need to include the ancient <editsection> variant in our regexp and test cases. Change-Id: I5fd783556810ea13b07a69066ea6762d1a1863e1	2023-03-15 13:53:01 -04:00
jenkins-bot	6de76f1fad	Merge "Add ParserOutput::getLanguage()"	2023-03-13 14:18:47 +00:00
C. Scott Ananian	29853113f7	Deprecate ParserOutput::{get,set}TOCHTML() No uses in deployed code outside mediawiki-core: https://codesearch.wmcloud.org/deployed/?q=%5Bgs%5DetTOCHTML%5C%28&i=nope&files=&excludeFiles=&repos= Bug: T293513 Change-Id: I3fd82150ac581afbeb94f401672702063586fff0	2023-03-10 20:34:33 -05:00
C. Scott Ananian	183a6da420	Add ParserOutput::getLanguage() Provide a way for backend code to determine the primary language of a ParserOutput, eg for setting the Content-Language header of an API response. This is read-only and backed by extension data at the moment for transition purposes; if this API sticks we'll graduate it to a "real" property in the future, with appropriate serialization to/from JSON (T303329). Similarly, this patch only includes the most basic code to handle the various ParserOutput merge cases in ParserOutput::merge{Internal,Html,Tracking}MetaDataFrom(), ParserOutput::collectMetadata(), and OutputPage::addParserOutput{Content,Metadata,Text,}(); mostly inherited from the fact that the storage is backed by extension data at the moment. Generally only the "top-level" parser output gets to set the primary language; we'll presumably need to ensure that the language is consistent during merge. Change-Id: I767daba22805a877d9b806fd77334e508902844b	2023-03-10 18:42:29 -05:00
C. Scott Ananian	d2446a77dd	Deprecate ParserOutput::getCategories() This undocumented method returns a reference to ParserOutput's private storage array, yet very few callers actually require a reference or try to use this to mutate the internal storage. Further, the keys of the array can be converted to `int` when the category names are numeric, which can further confuse users. Most users found through codesearch can/should use ::getCategoryNames() instead. Add a new ::getCategorySortKey() method to provide access to the sort keys for those few callers who require them, in a manner which doesn't expose that the internal `mCategories` array stores numeric category names as 'int'. Bug: T331727 Change-Id: I8dc85e76bfbb9ed49a603d990c14b7ee798bd821	2023-03-10 10:02:42 -05:00
C. Scott Ananian	e34b25a09f	Ensure categories are returned as strings Numeric category strings like '1' are converted to ints when they are used as array keys. Convert back to strings as needed to ensure this doesn't surprise any clients. Bug: T331084 Change-Id: Ib39707216d213e414c09226a6378047ffaf43892	2023-03-10 10:02:23 -05:00
James D. Forrester	ad06527fb4	Reorg: Namespace the Title class This is moderately messy. Process was principally: * xargs rg --files-with-matches '^use Title;' | grep 'php$' | \ xargs -P 1 -n 1 sed -i -z 's/use Title;/use MediaWiki\\Title\\Title;/1' * rg --files-without-match 'MediaWiki\\Title\\Title;' . | grep 'php$' | \ xargs rg --files-with-matches 'Title\b' | \ xargs -P 1 -n 1 sed -i -z 's/\nuse /\nuse MediaWiki\\Title\\Title;\nuse /1' * composer fix Then manual fix-ups for a few files that don't have any use statements. Bug: T166010 Follows-Up: Ia5d8cb759dc3bc9e9bbe217d0fb109e2f8c4101a Change-Id: If8fc9d0d95fc1a114021e282a706fc3e7da3524b	2023-03-02 08:46:53 -05:00
jenkins-bot	9a96857757	Merge "Reorg: Move HTML-related classes out of includes/ to Html/"	2023-02-21 15:37:53 +00:00
Kosta Harlan	b16d2b7fc9	ParserOutput: Don't assume that TOC extension data exists When running PHPUnit integration tests locally for Extension:GrowthExperiments, $toc['extensionData'] isn't defined, leading to failures for various tests. Follows-Up: I67397c49f2d0764e5c755101264631bea6603e16 Change-Id: I3ef45a86c236863dbeafbd121f1a5951947c5dc6	2023-02-17 09:44:23 +01:00
Amir Sarabadani	7d8768e931	Reorg: Move HTML-related classes out of includes/ to Html/ Bug: T321882 Change-Id: I5dc1f7e9c303cd3f5b9dd7010d6bb470d8400a18	2023-02-16 20:40:01 +01:00
jenkins-bot	855004747a	Merge "Ensure CacheTime properties are reflected by ParserOutput::collectMetadata"	2023-02-13 22:04:40 +00:00
C. Scott Ananian	fc62d1325d	Ensure CacheTime properties are reflected by ParserOutput::collectMetadata In order to break a cyclic dependency, Parsoid doesn't know about core's `ParserOutput` class; it defines its own `ContentMetadataCollector` interface which exposes those portions of the ParserOutput metadata which the parser needs to supply. Other bits of the ParserOutput metadata are specific to MediaWiki internals and Parsoid doesn't have to explicitly know about them: extensions and core implementations of parser functions (eg) can take the ContentMetadataCollector supplied by Parsoid and downcast it back to a ParserOutput in order to propagate internal information (like ParserCache lifetimes) "behind Parsoid's back" - aka, without violating abstraction boundaries by exposing every implementation detail of MediaWiki to Parsoid. When Parsoid calls into core to expand magic words like `currenttimestamp` they update the cache TTL in the ParserOutput using this mechanism. Using ParserOutput::collectMetadata() ensures these values are propagated to the final ParserOutput, even though Parsoid doesn't (shouldn't have to) explicitly know about them. Bug: T329067 Change-Id: Ia92efff4293841330674df09e82897d0775ef4d6	2023-02-13 16:41:08 -05:00
jenkins-bot	91e9cccc04	Merge "Use a SectionMetadata object in Linker::generateTOC()"	2023-02-10 22:48:18 +00:00
jenkins-bot	eaa368f09d	Merge "Remove back-compatibility code for ToC marker"	2023-02-10 20:50:13 +00:00
C. Scott Ananian	d5b39490ca	Remove back-compatibility code for ToC marker Before 1.39 we used <mw:toc> and in 1.39 we switched to <mw:tocplace/> (commit `24949480eb`). This was changed to a <meta> tag in 1.40 (commit `0b10563895` and `fa8646ca7b`) and the old content has long since expired from the ParserCache. Clean up the old ParserCache transition code. Change-Id: I3254d0acba31e107b50767797a2b0ad28aba59ee	2023-02-10 00:03:54 -05:00
C. Scott Ananian	153a415742	Use a SectionMetadata object in Linker::generateTOC() This updates Linker::generateTOC() so it uses a TOCData object, not a "legacy" associative array. Change-Id: I8fa83afd17b769df69bdd61ebd1b2ef3fe8b540f	2023-02-09 23:20:52 -05:00
C. Scott Ananian	38767bcabf	Temporarily preserve TOC top-level extension data The TOCData should be serialized with the JsonCodec which will also allow preserving the TOC top-level extension data. But for now, use a hack to ensure it is not lost when we use the "legacy" associative array format to serialize/deserialize TOCData. Change-Id: I67397c49f2d0764e5c755101264631bea6603e16	2023-02-10 04:16:14 +00:00
C. Scott Ananian	439656e019	Generate TOC HTML on demand in ParserOutput::getText() * Rather than computing TOC HTML in Parser and setting it in ParserOutput, compute it on demand based on section metadata. This will let Parsoid set section metadata in ParserOutput and have the TOC generated automatically. * This required fixing some "bugs" in Linker's generateTOC which didn't properly close tags and relied on Tidy to fix up unclosed li and ul tags. * This patch relies on converting section metadata objects to array objects, but Linker::generateTOC could be converted to use TOC data instead. * Since TOC generation is now moved to getText(), this is done post-PC load and this eliminates the parser cache split on user language for TOC heading localization. Bug: T293513 Change-Id: Ief1bba326d3612b40930440c872a61abadffab10	2023-01-25 16:42:16 -05:00
jenkins-bot	8220c7dce3	Merge "Generate/set/get TOCData/SectionMetadata objects instead of arrays"	2023-01-19 21:36:56 +00:00
Subramanya Sastry	d8d6ecd39f	Generate/set/get TOCData/SectionMetadata objects instead of arrays * ParserOutput::setSections()/::getSections() are expected to be deprecated. Uses in extensions and skins will need to be migrated in follow up patches once the new interface has stabilized. * In the skins code, the metadata is converted back to an array. Downstream skin TOC consumers will need to be migrated as well before we can remove the toLegacy() conversion. * Fixed SerializationTestTrait's validation method - Not sure if this is overkill but should handle all future complex objects we might stuff into the ParserCache. * This patch emits a backward-compatible Sections property in order to avoid changing the parser cache serialization format. T327439 has been filed to eventually use the JsonCodec support for object serialization, but for this initial patch it makes sense to avoid the need for a concurrent ParserCache format migration by using a backward-compatible serialization. * TOCData is nullable because the intent is that ParserOutput::setTOCData() is MW_MERGE_STRATEGY_WRITE_ONCE; that is, only the top-level fragment composing a page will set the TOCData. This will be enforced in the future via wfDeprecated() (T327429), but again our first patch is as backward-compatible as possible. Bug: T296025 Depends-On: I1b267d23cf49d147c5379b914531303744481b68 Co-Authored-By: C. Scott Ananian <cananian@wikimedia.org> Co-Authored-By: Subramanya Sastry <ssastry@wikimedia.org> Change-Id: I8329864535f0b1dd5f9163868a08d6cb1ffcb78f	2023-01-19 16:18:13 -05:00
C. Scott Ananian	96e4f5d840	JsonCodec: fix en/decoding of nested objects and stdClass objects Add a type annotation when encoding `stdClass` objects so that we can be sure to decode them as objects instead of arrays. This avoids issues such as that seen in the Graph extension (T312589) where an extension data key is stored as a stdClass. If ParserOutput was computed fresh, a subsequent getExtensionData(..) call will return a stdClass object, but if the ParserOutput was cached, getExtensionData() would return an array. After this change the return type is always consistent. Properly handle nested objects: encode all object values returned by JsonSerializable::jsonSerialize() (so that client is not responsible for implementing this correctly), and decode all object values before calling JsonUnserializable::newFromJsonArray (again, so that the client is not responsible for decoding its property values). The new behavior matches how serialize/unserialize is handled in the 'naive' JsonUnserializable{Sub,Super}Class test cases; ParserOutput (the only user of JsonCodec in core) was doing an extra manual decode for the ExtensionData array in ParserOutput::initFromJson that is no longer necessary. The GrowthExperiments and SemanticMediaWiki extensions were working around the non-recursive nature of JsonCodec; this patch depends on patches to GrowthExperiments to make it agnostic about whether object unserialization occurs before or after ::newFromJsonArray() is called, which can then be further cleaned up once this is released. A pull request for SemanticMediaWiki has also been submitted. Bug: T312589 Depends-On: I3413609251f056893d3921df23698aeed40754ed Change-Id: Id7d0695af40b9801b42a9b82f41e46118da288dc	2023-01-12 14:12:32 -05:00
jenkins-bot	ece6ba5417	Merge "ParserOutput: point to documentation for serialization compatibility."	2023-01-03 18:27:59 +00:00
daniel	f2febebb30	ParserOutput: point to documentation for serialization compatibility. Any changes to the way ParserOutput is serialized must follow the instructions at <https://www.mediawiki.org/wiki/Manual:Parser_cache/Serialization_compatibility>. Change-Id: Ic16a6804ca0a65f8f9abbc3112359cc239febde3	2023-01-03 19:08:22 +01:00
Amir Sarabadani	523ab7cff8	Reorg: Move RawMessage to under language/ To follow Message. This is approved as part of RFC T166010. Also namespace it but doing it properly with PSR-4 would require namespacing every class under language/ and that will take some time. Bug: T321882 Change-Id: I195cf4c67bd51410556c2dd1e33cc9c1033d5d18	2022-12-16 11:30:19 +01:00
Matěj Suchánek	a592d47e91	Clean up redundant array manipulation PHP does this implicitly. Change-Id: I009a7c93d44fb5e8c430c971cfc637fa04a8e68d	2022-12-11 12:42:29 +01:00
Amir Sarabadani	2d60ba0c63	Reorg: Move DummyLinker and Linker to linker/ This feels like a no-brainer unless I'm missing something obvious Bug: T321882 Change-Id: Id49c3d0dd6ea4593211048850856b5b8e05a8fb3	2022-12-08 06:38:17 +01:00
Umherirrender	1b342a8893	Various doc fixes about false and null on method arguments/return types Doc-only changes Change-Id: Ice974b3ba41708859dfe646e94b31c5ebbf26410	2022-11-03 18:55:47 +01:00
Tim Starling	0077c5da15	Use short array destructuring instead of list() Introduced in PHP 7.1. Because it's shorter and looks nice. I used regex replacement. Change-Id: I0555e199d126cd44501f859cb4589f8bd49694da	2022-10-21 15:33:37 +11:00
thiemowmde	d81f01e417	Replace various `array` type hints with more specific `string[]` There are many, many more. I touch only a few where I'm sure it's never anything but an array of strings. Change-Id: I8b798f2e9d48f07a241b95ce0ace8fa9d981695d	2022-09-27 09:24:22 +02:00
Umherirrender	5c5498a202	Remove unused key variable from foreach loops Change-Id: Id2d91e30a6f7cc4eb93427b50efc1c5c77f14b75	2022-09-21 21:18:43 +02:00
C. Scott Ananian	6c242a8a11	OutputPage::addParserOutputText(): use default ParserOutput options from skin This addresses the common case patched by I530d71d0f9279b40a263cd62467d3ef8c76975c3, If6267f3389b166043fc94d7f952bc54122b1a378 and probably the code in Article.php from I44045b3b9e78e7ab793da3f37e3c0dbc91cd7d39 by ensuring that "injectTOC" in the options passed to ParserOutput::getText() defaults to the correct value based on the skin being used by OutputPage. Bug: T317333 Change-Id: Ica30569efbb5730eff5b807e8fc34beb2e13e74f	2022-09-08 15:46:23 -04:00
jenkins-bot	6d840fa896	Merge "ParserOutput::mergeMapStrategy - use a more robust comparison for objects"	2022-07-21 02:51:31 +00:00
Umherirrender	e00a52e6f5	Clean up line indent with mixed tabs and whitespaces Change-Id: Ifcd15ecc4212d4ebfc26b2e18d6f1da47abf2a86	2022-07-09 22:21:53 +02:00
C. Scott Ananian	541542e588	ParserOutput::mergeMapStrategy - use a more robust comparison for objects Map values can include JsonUnserializable objects, and strict (reference) equality comparison of these objects is not going to reflect value equality. Serialize the values and compare strings instead; this case should be hit very infrequently given that rewriting the same extension data key is discouraged. Bug: T312588 Change-Id: I942e7fa662b2f1a5e32fd55ef65eaa10a22afcfb	2022-07-08 16:07:18 +00:00
C. Scott Ananian	577879841c	ParserOutput::mergeMapStrategy: don't crash if merging non-array values The PHP `isset(...)` construct covers a multitude of possible "wrong types" for the left hand side of an array access, but it still crashes (with "Cannot use object of type stdClass as array") if the left hand side is an object. Bug: T312242 Change-Id: I35026c573fb941004764d46d5652ebcddc559c03	2022-07-07 15:02:57 +00:00
jenkins-bot	3ed9d3a6f9	Merge "Use the same tooltip for transcluded sections as normal ones"	2022-06-22 18:31:43 +00:00
daniel	697f28df32	ParserCache: always use JSON When JSON support was introduced into ParserCache in 1.36, it was controlled by a feature flag, $wgParserCacheUseJson. The feature flag was "born deprecated" in 1.36. It can now be removed. This means that ParserCache will always store entries as JSON. Support for reading old non-JSON entries remains intact. This is needed when updating wikis from a version older than 1.36 to the current version. Change-Id: Id04e42bfb458d98414bac50e0d6c505e8878e5c0	2022-06-07 15:19:45 +02:00
Isabelle Hurbain-Palatin	1277d9f154	Still collect metadata on multiple writes Follow-up to I9d1f0f6bab1305552a0350667d6142a24bc04049. That patch was not collecting data at all (not even overwriting them over and over again) - the assignment operation was, in practice, a NOP. This patch fixes this. Bug: T303014 Bug: T303015 Change-Id: I7d09b532f3270edf4327c16e032d665353d992f6	2022-05-17 11:14:51 -04:00
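
Commits e34b25a09f and d2446a77dd above note that numeric category names such as '1' turn into int keys when stored in a PHP array. The following is only a minimal sketch of that language behavior and of casting the keys back to strings. The function name `categoryNamesAsStrings` and the sample data are hypothetical and are not taken from MediaWiki's ParserOutput code.

```php
<?php
/**
 * Hypothetical helper (not MediaWiki's implementation): return the keys of a
 * "category name => sort key" map as strings, undoing PHP's automatic
 * conversion of numeric-string array keys to integers.
 *
 * @param array<string|int,string> $categories
 * @return string[]
 */
function categoryNamesAsStrings( array $categories ): array {
	return array_map( 'strval', array_keys( $categories ) );
}

// Sample data for illustration only.
$categories = [
	'1' => 'sortkey-a',   // PHP stores this key as int 1
	'Foo' => 'sortkey-b', // non-numeric keys stay strings
];

$rawKeys = array_keys( $categories );
var_dump( $rawKeys[0] );  // the key comes back as an int, not a string

$names = categoryNamesAsStrings( $categories );
var_dump( $names[0] );    // the same key, cast back to a string
```

A caller that compares category names with `===` would be affected by the int key in `$rawKeys`, which is the kind of surprise those commit messages describe.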

379 commits