Thijs/wiki.techinc.nl

Author	SHA1	Message	Date
C. Scott Ananian	b4cf4aa6bd	ParserOutput: Temporarily write (unused) TOCHTML to ParserCache Even though this JSON property is unused on master, the previous train release read it from the JSON (and threw the value away). In order to provide error-free roll-forward and roll-back of the train, temporarily write an empty string as the value of TOCHTML so that the read from `$jsonData['TOCHTML']` won't cause a PHP notice in the logs if we roll back. This patch is only needed for one train release, and can then be removed. Bug: T363107 Change-Id: I77add3bd7f00941cb81481f738bc59d6008c2406	2024-04-22 11:26:10 -04:00
Umherirrender	8d97313f81	Fix some line indent Change-Id: I8f82724197d20f9289d80e138d80310f1eab29f2	2024-04-20 00:25:15 +02:00
C. Scott Ananian	195ac55bfe	[ParserOutput] Remove deprecated ::getTOCHTML() and ::setTOCHTML() methods These were deprecated with warnings in 1.40. Change-Id: I8027bc26c71ae94d3d5c7e5112545cd1b35749aa	2024-04-16 13:00:58 -04:00
C. Scott Ananian	db2f1ad606	[ParserOutput] Remove deprecated ::getCategories() method This was deprecated with warnings in 1.40. Change-Id: I7b8a86f6efbdd86c1f493db6741c37bfb325e9bb	2024-04-16 12:57:17 -04:00
jenkins-bot	1caf41bb73	Merge "ParserOutput: Rename ::setIndexedPageProperty() to ::setNumericPageProperty()"	2024-04-16 10:57:58 +00:00
C. Scott Ananian	2429785470	ParserOutput: Rename ::setIndexedPageProperty() to ::setNumericPageProperty() Before this method name gets baked forever into the 1.42 release, rename the ParserOutput::setIndexedPageProperty() and ::setUnindexedPageProperty() methods to ::setNumericPageProperty() and ::setUnsortedPageProperty() to try to address some confusion about whether the presence of the page property is still indexed (it is!), in contrast to whether there's an additional "sort key" associated with the value assigned to the page property. This naming is compatible with the feature request in T357783 to have the sort key and property value specified independently. The new method signature in that case would be: ...setSortedPageProperty( string $name, string $value, int|float $sortKey ) Although PHP 8.0 will throw a TypeError if a non-numeric type is coerced to numeric using `0 + ...`, use an explicit is_numeric check to obtain the same behavior in PHP 7.x. Change-Id: Ia94c192c429d0482c58467bed787fd2e0aca052f	2024-04-15 15:13:56 -04:00
C. Scott Ananian	f1a45cf2b9	Expand documentation of ParserOutput class Note that not all ParserOutputs represent parsed articles, and describe the merging operations on ParserOutputs in more depth. The interaction with Content and ContentHandlers is also described (thanks, Daniel!). Followup-To: Id2e3124652315a74869f504056fa8a99ad794350 Change-Id: I5c1016532eba1b71dc4d3d5d5d0c46775713efb5	2024-04-12 12:53:23 -04:00
Lucas Werkmeister	1adefb10e3	ParserOutput: clarify that “indexed” refers to value Bug: T305158 Change-Id: Ic6ea22b5188e575b288d57c8f692f492cb69452d	2024-04-12 12:09:02 +02:00
C. Scott Ananian	b4721e24aa	ParserOutput::setUnindexedPageProperty(): use empty string as default value If a placeholder value is needed, it is recommended to use the empty string to avoid wasting database space unnecessarily. Operationalize this recommendation by providing a default value for the method argument. Bug: T305158 Bug: T350224 Change-Id: I9ea8d93298d771c2d38fdfb451a2817220ca679a	2024-04-11 11:58:13 -04:00
jenkins-bot	2472cd9247	Merge "Substitute category default sort key when filling links table, not at parse time"	2024-04-11 14:59:33 +00:00
jenkins-bot	e4981c9702	Merge "Add ParserOutput::setIndexedPageProperty(); deprecate numeric properties"	2024-04-10 17:44:00 +00:00
C. Scott Ananian	de57c4e7c2	Add ParserOutput::setIndexedPageProperty(); deprecate numeric properties Deprecate non-string values to ::setPageProperty(), which introduce easy traps for programmers to fall into. If page properties are intended to be indexed, use the new ::setIndexedPageProperty() instead. Also add ::setUnindexedPageProperty() for symmetry, with a tighter string type on the value. Bug: T305158 Bug: T350224 Change-Id: I8a39a7c90341dfee932aa819c9a0a637a8782f69	2024-04-05 19:12:29 -04:00
C. Scott Ananian	01590b89bf	ParserOutput: Emit deprecation warning if interwiki passed to addTemplate Bug: T361330 Depends-On: Ia8fd49a6f9af18e32d47d1dcd052c5f33123f44b Change-Id: Id4104dff4acaa60d94155d7915b9c1f2af4baaf0	2024-04-04 10:38:45 -04:00
C. Scott Ananian	c2df535b9c	Substitute category default sort key when filling links table, not at parse time This ensures uniform treatment of all places that call `addCategory` without duplicating the `defaultsort` code; it also ensures that the effect of the {{DEFAULTSORT}} parser function is independent of page position. Bug: T40435 Bug: T353530 Change-Id: I4480a6d59e766fa4eddc9ec9117c58b66771bb47	2024-03-29 18:30:02 -04:00
James D. Forrester	8e940c4f21	Standardise all our class alias deprecation comments for ease of grepping Change-Id: I7f85d931d3b79da23e87b4e5692b2e14be8fcaa0	2024-03-19 20:11:29 +00:00
jenkins-bot	9232985bd8	Merge "ParserOutput::setPageProperty(): Emit deprecation warning for non-scalar values"	2024-03-11 17:08:20 +00:00
Umherirrender	f3524224f0	build: Fix line indents Fixed SkinModuleTest::provideGetFeatureFilePathsOrder as nesting of arrays for parameters is wrong Change-Id: I9875008adf62d284c48662ebfbd245d72e5be064	2024-03-11 00:14:16 +01:00
jenkins-bot	a62f5c7911	Merge "[ParserOutput] Rename $mText to $mRawText and ::setText() to ::setRawText()"	2024-02-21 17:11:00 +00:00
C. Scott Ananian	72c4945a72	[ParserOutput] Rename $mText to $mRawText and ::setText() to ::setRawText() ParserOutput::getText() is not a simple getter, but does transformations on the "text" of the ParserOutput; the simple getter is named ::getRawText(). To maintain consistency, rename ParserOutput::setText() to ::setRawText() and the property name ParserOutput::$mText to ::$mRawText so future readers are not confused. The JSON property name as it appears in the serialized ParserCache is left as 'Text' so that we don't have any forward- or backward- rollback issues. Change-Id: I3ef34814ab9473cc70d0a6806e8c5a4a02b73491	2024-02-20 17:13:28 +00:00
C. Scott Ananian	6846f8aa10	ParserOutput::setPageProperty(): Emit deprecation warning for non-scalar values Non-scalar values passed to ParserOutput::setPageProperty() have never "worked"; they've been stringified (and null has been stored as an empty string). Emit a warning so we can fail harder in future releases. Bug: T305158 Depends-On: Ib36787d04c0ca713587dc8b814ca1c5a827f6f72 Change-Id: I38234084fdc7427ca577bb33a7fce1541581188d	2024-02-20 11:29:49 -05:00
C. Scott Ananian	b5d44bf339	ParserOutput::setPageProperty(): Update documentation String and non-string values behave very differently when passed to ::setPageProperty(), resulting in some unexpected gotchas for the unaware caller. Bug: T350224 Bug: T305158 Change-Id: I23b35b250f27a117d1353ea8a26d2b3f77c568e7	2024-02-20 11:26:38 -05:00
Subramanya Sastry	e55cc517da	Move Parser to Mediawiki\Parser namespace Bug: T166010 Co-Authored-By: Daimona Eaytoy <daimona.wiki@gmail.com> Co-Authored-By: James Forrester <jforrester@wikimedia.org> Co-Authored-By: Subramanya Sastry <ssastry@wikimedia.org> Change-Id: I79b4e732c45095eedbaa80afa5eb7479b387ed8a	2024-02-16 09:18:38 -05:00
jenkins-bot	2ca5bb9a96	Merge "ParserOutput: update task id in documentation"	2024-02-15 23:36:35 +00:00
C. Scott Ananian	13873a35b9	ParserOutput: update task id in documentation We closed T296023 and opened a new task for the work remaining, so update the comments in the code to match. The task relating to `addLanguageLink` is actually T296019. Change-Id: I28b942a57ed41751d44d8565a290d925f6d7f180	2024-02-15 15:23:57 -05:00
C. Scott Ananian	28a3371382	[OutputTransform] Remove broken and unused 'bodyContentOnly' option This was formerly used by the REST api, but instead that code just uses ParserOutput::getRawText() when it needs the full HTML document. This option has been broken, with various passes like RenderDebugInfo and AddWrapperDiv adding content in inappropriate places if bodyContentOnly was false. Change-Id: Ib45f95ded59c81c16d61803f977d1edbfe82b262	2024-02-15 13:05:53 -05:00
C. Scott Ananian	770d2bf040	[ParserOutput] Make 'enableSectionEditLinks' a ParserOption This will allow the Translate extension to set this parser option in the ArticleParserOptions hook, instead of mutating $options passed to ParserOutput::getText() in the ParserOutputPostCacheTransform hook. It ought to also help to handle the many places which call: ... = $parserOutput->getText( [ 'enableSectionEditLinks' => false, ] ); by allowing them to set the appropriate ParserOption instead of passing arguments to ::getText(). Bug: T350626 Change-Id: I719c115194059060f7f888608417a194ac80cc92	2024-02-09 23:42:03 +00:00
C. Scott Ananian	242c6d2cf9	Introduce ParserOutput::setFromParserOptions() and use for preview flag Bug: T341010 Co-Authored-by: cananian <cananian@wikimedia.org> Co-Authored-by: ihurbain <ihurbainpalatin@wikimedia.org> Change-Id: I03125fdaa7dd71ba57d593e85ecb98be6806f3f6	2024-02-07 21:22:06 -05:00
C. Scott Ananian	52320c0902	Move ParsoidRenderID to MediaWiki\Edit This class belongs with the rest of the Parsoid output stash code. This class has been marked @unstable since 1.39 and thus the move does not need release notes. Change-Id: I16061c0c28b1549fbe90ea082cc717fee4a09a6e	2024-02-07 21:22:06 -05:00
C. Scott Ananian	1858e1cdd7	Rename ParserOutput::{get,set}Timestamp() to ::{get,set}RevisionTimestamp() This avoids confusion with the "render timestamp" held by the cache, and is consistent with ::get*RevisionId() etc. The old ::getTimestamp() and ::setTimestamp() methods have been deprecated. Change-Id: Idb5e687709c98086c5d3075d31885c58a0723197	2024-02-07 21:22:06 -05:00
C. Scott Ananian	0de13d7662	Add ParserOutput::{get,set}RenderId() and set render id in ContentRenderer Set the render ID for each parse stored into cache so that we are able to identify a specific parse when there are dependencies (for example in an edit based on that parse). This is recorded as a property added to the ParserOutput, not the parent CacheTime interface. Even though the render ID is /related/ to the CacheTime interface, CacheTime is also used directly as a parser cache key, and the UUID should not be part of the lookup key. In general we are trying to move the location where these cache properties are set as early as possible, so we check at each location to ensure we don't overwrite a previously-set value. Eventually we can convert most of these checks into assertions that the cache properties have already been set (T350538). The primary location for setting cache properties is the ContentRenderer. Moved setting the revision timestamp into ContentRenderer as well, as it was set along the same code paths. An extra parameter was added to ContentRenderer::getParserOutput() to support this. Added merge code to ParserOutput::mergeInternalMetaDataFrom() which should ensure that cache time, revision, timestamp, and render id are all set properly when multiple slots are combined together in MCR. In order to ensure the render ID is set on all codepaths we needed to plumb the GlobalIdGenerator service into ContentRenderer, ParserCache, ParserCacheFactory, and RevisionOutputCache. Eventually (T350538) it should only be necessary in the ContentRenderer. Bug: T350538 Bug: T349868 Followup-To: Ic9b7cc0fcf365e772b7d080d76a065e3fd585f80 Change-Id: I72c5e6f86b7f081ab5ce7a56f5365d2f75067a78	2024-02-07 21:22:06 -05:00
Daimona Eaytoy	1d6776fdbc	Replace deprecated MWException Also remove some unchecked exception from doc comments. Bug: T328220 Bug: T240672 Change-Id: I88b1e948ce5da77d9c4862a2b98793d6ba00cf8b	2024-01-19 21:58:42 +00:00
Brian Wolff	f1af33be38	Add taint annotations for ParserOutput Change-Id: Id73b8f22f8877442f114bf7b41d0f9ea47fb4283	2024-01-12 14:17:21 +00:00
C. Scott Ananian	f2d910844f	ParserOutput: Convert category name back to a LinkTarget when merging CMC When we are merging a ParserOutput into a ContentMetadataCollector, convert categories to LinkTarget, which is the preferred parameter type of CMC::addCategory(). This also reverts the temporary fix in I0715f4fbc870e401e5759dd7c7a3c19077c40a6a. Note that the category names should be in dbkey form for proper deduplication, but both TitleValue::tryNew() and CategoryLinksTable::setParserOutput() will renormalize if needed (see I2b08edd90666e0fa4eafe91444a58806909b02d6 / T328477). Depends-On: Iea894aa2cee90f4ca5c7688493b0654e4605ce23 Change-Id: I5a903396edb4da0900ecef37cb3bf4bd03b5ba68	2023-12-18 21:01:51 +00:00
C. Scott Ananian	df1f18cc9d	ParserOutput: Temporarily move "merge categories" in ::collectMetadata Due to a botched signature change on the Parsoid side, in -a8 Parsoid only accepts `string|int` for ContentMetadataCollector::addCategory() and in -a9 Parsoid only accepts `LinkTarget`. The ParserOutput in core, of course, accepts both. So move the code which merges categories into the section of ContentMetadataCollector::collectMetadata() where we know that the CMC we're merging with is really a ParserOutput. Change-Id: I0715f4fbc870e401e5759dd7c7a3c19077c40a6a	2023-12-18 14:31:19 -05:00
jenkins-bot	0d45f127f5	Merge "ParserOutput: keep modules and module styles unique"	2023-12-16 04:25:53 +00:00
James D. Forrester	9bfb75ff90	Namespace ParserOutput Most used non-namespaced class! Bug: T353458 Change-Id: I4c2cbb0a808b3881a4d6ca489eee5d8c8ebf26cf	2023-12-14 14:57:34 -05:00
Isabelle Hurbain-Palatin	3935cd1b05	ParserOutput::getText(): do not clone ParserOutput when invoking pipeline OutputPage::getParserOutputText/addParserOutputContent expects ParserOutput to be mutated (e.g. by PostCacheTransformHookRunner). Hence, cloning it before running the pipeline is breaking DiscussionTools, probably among others. Suppress the clone for the case where the output pipeline is invoked from ParserOutput::getText() (which is a deprecated method anyway) and additionally suppress the side-effects to ParserOutput::$mText on that code path. Bug: T353257 Co-Authored-By: C. Scott Ananian <cananian@wikimedia.org> Co-Authored-By: Isabelle Hurbain-Palatin <ihurbainpalatin@wikimedia.org> Change-Id: I85c690fd37b781cb27c21970467639e852113b2a	2023-12-12 11:39:32 -05:00
jenkins-bot	c57120300a	Merge "ParserOutput: Allow passing LinkTarget to title-related methods"	2023-12-11 18:02:25 +00:00
Isabelle Hurbain-Palatin	a3f51c732d	Refactor DefaultOutputTransform into a pipeline of transforms Bug: T348253 Change-Id: I53551ec6d6471569709c71c1155729e550f64de8	2023-12-08 18:06:19 -05:00
C. Scott Ananian	4b83285954	ParserOutput: Allow passing LinkTarget to title-related methods Broadened the argument type to allow passing LinkTarget to: * ParserOutput::addCategory() * ParserOutput::addLanguageLink() * ParserOutput::addLink() * ParserOutput::addImage() * ParserOutput::addTemplate() This allows for a tighter interface with Parsoid's ContentMetadataCollector class and avoids errors caused by passing the wrong form of string title ("text" with spaces versus "dbkey" with underscores). There are a few performance problems remaining after this patch, which only apply to use by Parsoid (not the legacy parser): 1. ::addLink() does inefficient db requests to fetch the page id for each link if the optional $id parameter is not passed. These lookups should be deferred and a LinkBatch used. (The legacy parser always passes $id.) 2. ::addTemplate() similarly requires $page_id (and $rev_id) to be passed, so is not currently usable by Parsoid. 3. ::addLanguageLink() uses Title::getFullText() which is not present in LinkTarget and is currently implemented as a full Title lookup. This is not an issue for the legacy parser, because it already has a Title object so the lookup is a no-op, but could be improved for Parsoid's use. Bug: T296023 Change-Id: If21ec8563c8a619bdde7c0cb6534bb9009480a21	2023-12-08 17:50:29 -05:00
jenkins-bot	b7fc1b2f43	Merge "Only cache expensive renderings"	2023-11-30 21:24:34 +00:00
daniel	e3fb964439	Only cache expensive renderings Pages that are fast to render can be omitted from the parser cache to preserve disk space and cache write operations. The threshold is configurable per namespace, so the tradeoff can be evaluated based on different access patterns. For example, pages that are accessed rarely, like file description pages on Commons, may have a high threshold configured, while pages that are read frequently, like Wikipedia articles, may be configured to be always cached, using a 0 threshold. Filtering is based on a time profile recorded in the ParserOutput. A generic mechanism for capturing the timing profile is implemented in the ContentHandler base class. Subclasses may implement a more rigorous capture mechanism. Bug: T346765 Change-Id: I38a6f3ef064f98f3ad6a7c60856b0248a94fe9ac	2023-11-30 20:56:12 +00:00
C. Scott Ananian	f8178369b6	ParserOutput: remove getFlag()/setFlag(), deprecated since 1.38 These were hard deprecated in 1.41. Bug: T305161 Change-Id: I0cd7f416a7cb132747b3dbd35d83af1c60f6a8dc	2023-11-28 15:07:51 -05:00
C. Scott Ananian	cea029f718	ParserOutput: keep modules and module styles unique Instead of waiting until ParserOutput::mergeList() is called to uniquify the list of modules, use an array<string,true> to ensure that the modules and module styles are a set. Change-Id: I49673bc369dec373bce23fe7b831e6be5a256c46	2023-11-27 22:04:54 -05:00
C. Scott Ananian	78442cdb63	[typo] Fix typo in ParserOutput comment Change-Id: If73dc2d2e0bf668a370d1bda16d53a59b3dc15ee	2023-11-17 16:13:26 +00:00
Isabelle Hurbain-Palatin	c1a4034ec1	Extract EDITSECTION_REGEX and renderDebugInfo to DefaultOutputTransform Follow-up to I312f3748ebfb0373ee3542ba0abdeefe7db1d488 Bug: T348253 Change-Id: Ia5f7108b3d27ad27785cbc5fca6486634a8c8849	2023-11-07 15:45:34 +01:00
jenkins-bot	c53a75856d	Merge "parser: Move lang/dir and mw-content-ltr to ParserOutput::getText"	2023-11-06 17:38:10 +00:00
Timo Tijhof	d0a96db0f9	parser: Move lang/dir and mw-content-ltr to ParserOutput::getText == Skin::wrapHTML == Skin::wrapHTML no longer has to perform any guessing of the ParserOutput language. Nor does it have to special-case wiki pages vs special pages in this regard. Yay, code removal. == ImagePage == On URLs like /wiki/File:Example.jpg, the main output handler is ImagePage::view. This calls the parent Article::view to handle most of its output. Article::view obtains the ParserOptions, and then fetches ParserOutput, and then adds `<div class=mw-parser-output>` and its metadata to OutputPage. Before this change, ImagePage::view was creating a wrapper based on "predicting" what language the ParserOutput will contain. It couldn't call the new OutputPage::getContentLanguage or some equivalent as Article::view wouldn't have populated that yet. This leaky abstraction is fixed by this change as now the `<div>` from ParserOutput no longer comes with a "please wrap it properly" contract that Article subclasses couldn't possibly implement correctly (it couldn't wrap it after the fact because Article::view writes to OutputPage directly). RECENT (T310445): A special case was recently added for file pages about translated SVGs. For those, we decide which language to use for the "fullMedia" thumb atop the page. This was recently changed as part of T310445 from a hardcoded $wgLanguageCode (site content lang) to new problematic Title::getPageViewLanguage, which tries to guesstimate the page language of the rendered ParserOutput and then gets the preferred variant for the current user. The motivation for this was to support language variants but used Title::getPageViewLanguage as a kitchen sink to achieve that minor side-effect. The only part of this now-deprecated method that we actually need is LanguageConverter::getPreferredVariant(). Test plan: Covered by ImagePageTest. == Skin mainpage-title == RECENT (T331095, T298715): A special case was added to Skin::getTemplateData that powers the mainpage-title interface message feature. This is empty by default, but when created via MediaWiki:mainpage-title allows interface admins to replace the H1 with a custom and localised page heading. A few months ago, in Ifc9f0a7174, Title::getPageViewLanguage was applied here to support language variants. Replace with the same fix as for ImagePage. Revert back to Message::inContentLanguage() but refactor to inLanguage() via MediaWikiServices::getContentLanguage so that LanguageConverter::getPreferredVariant can be applied. == EditPage == This was doing similar "predicting" of the ParserOutput language to create an empty preview placeholder for use by preview.js. Now that ApiParse (via ParserOutput::getText) returns a usable element without any secret "you magically know the right class, lang, and dir" contract, this placeholder is no longer needed. Test Plan: * EditPage: Default preview 1. index.php?title=Main_Page&action=edit 2. Show preview 3. Assert <div class="mw-content-ltr mw-parser-output" lang=en dir=ltr> * EditPage: JS preview 1. Preferences > Editing > Show preview without reload 2. index.php?title=Main_Page&action=edit 3. Show preview 4. Assert <div class="mw-content-ltr mw-parser-output" lang=en dir=ltr> 5. Type something and 'Show preview' again 6. Assert old element gone, new text is shown, and new element attributes are the same as the above. == McrUndoAction == Same as EditPage basically, but without the JS preview use case. == DifferenceEngine == Test: 1. Open /w/index.php?title=Main_Page&diff=0 (this shows the latest diff, can do manually by viewing /wiki/Main_Page, click "View history", click "Compare selected revisions") 2. Assert <div class="mw-content-ltr mw-parser-output" lang=en dir=ltr> 3. Open /w/index.php?title=Main_Page&diff=0&action=render 4. Assert <div class="mw-content-ltr mw-parser-output" lang=en dir=ltr> == Special:ExpandTemplates == Test: 1. /wiki/Special:ExpandTemplates 2. Write "Hello". 3. "OK" 4. Assert <div class="mw-content-ltr mw-parser-output" lang=en dir=ltr> Bug: T341244 Depends-On: Icd9c079f5896ee83d86b9c2699636dc81d25a14c Depends-On: I4e7484b3b94f1cb6062e7cef9f20626b650bb4b1 Depends-On: I90b88f3b3a3bbeba4f48d118f92f54864997e105 Change-Id: Ib130a055e46764544af0f1a46d2bc2b3a7ee85b7	2023-11-03 19:24:47 -04:00
C. Scott Ananian	221a33208c	Deprecate ParserOutput::setLanguageLinks() Core does not use ::setLanguageLinks() at all any more, and only a very small number of extensions use it: https://codesearch.wmcloud.org/search/?q=-%3EsetLanguageLinks&files=&excludeFiles=&repos= This can be deprecated and (eventually) removed as we prepare to update ::addLanguageLink(). Bug: T296019 Change-Id: I5bcbc65c8dccb6e3037f528bd9e7e9e27514ea5b	2023-11-03 22:28:38 +00:00
Isabelle Hurbain-Palatin	36b4ab44f6	Refactor ParserOutput::getText into DefaultOutputTransform service This also introduces the ephemeral field "$mTransformedText" to store the result of transformation in ParserOutput. This is a first step before the transformation uses HtmlHolder as input and output. Bug: T348253 Change-Id: I312f3748ebfb0373ee3542ba0abdeefe7db1d488	2023-10-16 13:11:38 +02:00
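
The "Only cache expensive renderings" entry (e3fb964439) describes a per-namespace threshold: renderings that are fast enough are left out of the parser cache, and a threshold of 0 means a page is always cached. The decision can be sketched roughly as follows. This is an illustrative Python sketch, not MediaWiki's PHP implementation; `should_cache`, `THRESHOLDS_BY_NAMESPACE`, the namespace keys, and the threshold values are all hypothetical, and the commit message does not say whether the real comparison is strict.

```python
# Hypothetical per-namespace thresholds in seconds; 0 means "always cache".
# Names and values are illustrative assumptions, not MediaWiki defaults.
THRESHOLDS_BY_NAMESPACE = {
    "main": 0.0,  # frequently read pages: always cached
    "file": 1.0,  # rarely read pages: cache only slow renderings
}
DEFAULT_THRESHOLD = 0.0  # assumed fallback: cache everything


def should_cache(namespace: str, render_seconds: float) -> bool:
    """Return True when the recorded render time meets the namespace threshold."""
    threshold = THRESHOLDS_BY_NAMESPACE.get(namespace, DEFAULT_THRESHOLD)
    # Assumption: a rendering exactly at the threshold is still cached.
    return render_seconds >= threshold
```

With these assumed values, a 0.2 s rendering in the hypothetical "file" namespace would be skipped, while any rendering in "main" would be stored.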

459 commits