Thijs/wiki.techinc.nl

Author	SHA1	Message	Date
Umherirrender	1951aea6b8	Fix various version mention for class_alias Versions are changed in `8e940c4f21`, but that makes the version wrong Follow-Up: I7f85d931d3b79da23e87b4e5692b2e14be8fcaa0 Change-Id: Iae43725b8e0fffc4d44bf57f6227334b41290bd9	2024-07-05 18:39:49 +02:00
Bartosz Dziewoński	c7f52f0ddb	Make MessageValue implement JsonDeserializable MessageValue and friends are pure value objects and newable, so it makes sense for them to be (de)serializable too. There are some places where we want to serialize messages, such as in ParserOutput. The structure of the resulting JSON is inspired by the way we represent Message objects as plain values elsewhere in MediaWiki, e.g. StatusValue::getStatusArray(). Co-Authored-By: C. Scott Ananian <cscott@cscott.net> Depends-On: Ia32f95a6bdf342262b4ef044140527f0676402b9 Depends-On: I7bafe80cd36c2558517f474871148286350a4e76 Change-Id: Id47d58b5e26707fa0e0dbdd37418c0d54c8dd503	2024-06-12 15:47:37 -04:00
James D. Forrester	19f4e6945a	Rename JsonUnserial… to JsonDeserial… This is to make it clearer that they're related to converting serialized content back into JSON, rather than stating that things are not representable in JSON. Change-Id: Ic440ac2d05b5ac238a1c0e4821d3f2d858bc3d76	2024-06-12 14:50:58 -04:00
C. Scott Ananian	47bdd8b1c8	[ParserOutput] Remove unused TOCHTML from ParserCache serialization This reverts commit `b4cf4aa6bd`, which is no longer needed for ParserCache compatibility across trains. REL1_42 contains `b4cf4aa6bd`, so MW 1.43 will not need this. This also adds new serialization test cases for 1.43 with this field removed; see https://www.mediawiki.org/wiki/Manual:Parser_cache/Serialization_compatibility Change-Id: I716e2efe7a491002e6e6b2300016165fffe3c0d6	2024-05-17 21:46:00 +00:00
C. Scott Ananian	19ee8c4f91	Serialization test cases: fix filename after ParserOutput namespacing The serialization test cases look for files based on the name of the class they are testing. After the namespacing of ParserOutput, they were looking for files named like: 1.42-MediaWiki\Parser\ParserOutput-binaryPageProperties.json The embedded backslashes in these filenames would wreak havoc on Windows machines. What's more, none of the existing ParserOutput tests will actually be checked anymore because the filenames don't match up with what is expected after namespacing. Fix this by stripping the namespace from the classname when forming the test file names. When this is done, the test cases for GhostFieldAccess begin running again, revealing that they were broken when GhostFieldTestClass was re-namespaced. Add a class alias for the GhostFieldTestClass to fix this. Finally, PHP <= 8.1 does not deserialize private properties correctly after a class is renamed and aliased, because the internal name of the private property contains the "old" class name in the serialization. Add a new ::restoreAliasedGhostField() method to the GhostFieldAccessTrait to work around this issue and restore proper deserialization of ParserOutput. Bug: T365060 Followup-To: I9c64a631b0b4e8e4fef8a72ee0f749d35f918052 Followup-To: I4c2cbb0a808b3881a4d6ca489eee5d8c8ebf26cf Change-Id: I7bafe80cd36c2558517f474871148286350a4e76	2024-05-17 17:07:47 -04:00
Bartosz Dziewoński	73de566949	Use 'scalar' type alias to shorten PHPDoc annotations 'string|int|float|bool' (in any order) can be replaced by 'scalar'. 'string|int|float|bool|null' (likewise) can be replaced by '?scalar'. This is convenient for functions that can accept any primitive value, which comes up sometimes when serializing things as SQL, JSON etc. Change-Id: I4a711ee59611d76d6745f3640e4aa6bebec02918	2024-05-11 23:21:22 +00:00
jenkins-bot	b671e574eb	Merge "Add ParserOptions::setCollapsibleSections()"	2024-04-29 21:17:15 +00:00
C. Scott Ananian	8d031bcf87	Add ParserOptions::setCollapsibleSections() This is a non-default option that will add a <div> wrapper around section contents to allow client-side collapsing. This is intended for use by MobileFrontEnd, but could eventually be enabled for desktop read views as well. Since this parser option is in the "cache-varying options" set, any caller who sets this option will fork the cache for that page, which is reasonable as the parser options sets a ParserOutput property. In the future our caching strategy will get smarter and we'll add code which avoids the cache split and just transfers the appropriate values from ParserOptions to ParserOutput flags after the cached output is retrieved. Bug: T359001 Change-Id: Ie93959a056ed15a728404eb293e4bb6eeaeb15c0	2024-04-29 12:11:09 -04:00
C. Scott Ananian	b4cf4aa6bd	ParserOutput: Temporarily write (unused) TOCHTML to ParserCache Even though this JSON property is unused on master, the previous train release read it from the JSON (and threw the value away). In order to provide error-free roll-forward and roll-back of the train, temporarily write an empty string as the value of TOCHTML so that the read from `$jsonData['TOCHTML']` won't cause a PHP notice in the logs if we roll back. This patch is only needed for one train release, and can then be removed. Bug: T363107 Change-Id: I77add3bd7f00941cb81481f738bc59d6008c2406	2024-04-22 11:26:10 -04:00
Umherirrender	8d97313f81	Fix some line indent Change-Id: I8f82724197d20f9289d80e138d80310f1eab29f2	2024-04-20 00:25:15 +02:00
C. Scott Ananian	195ac55bfe	[ParserOutput] Remove deprecated ::getTOCHTML() and ::setTOCHTML() methods These were deprecated with warnings in 1.40. Change-Id: I8027bc26c71ae94d3d5c7e5112545cd1b35749aa	2024-04-16 13:00:58 -04:00
C. Scott Ananian	db2f1ad606	[ParserOutput] Remove deprecated ::getCategories() method This was deprecated with warnings in 1.40. Change-Id: I7b8a86f6efbdd86c1f493db6741c37bfb325e9bb	2024-04-16 12:57:17 -04:00
jenkins-bot	1caf41bb73	Merge "ParserOutput: Rename ::setIndexedPageProperty() to ::setNumericPageProperty()"	2024-04-16 10:57:58 +00:00
C. Scott Ananian	2429785470	ParserOutput: Rename ::setIndexedPageProperty() to ::setNumericPageProperty() Before this method name gets baked forever into the 1.42 release, rename the ParserOutput::setIndexedPageProperty() and ::setUnindexedPageProperty() methods to ::setNumericPageProperty() and ::setUnsortedPageProperty() to try to address some confusion about whether the presence of the page property is still indexed (it is!), in contrast to whether there's an additional "sort key" associated with the value assigned to the page property. This naming is compatible with the feature request in T357783 to have the sort key and property value specified independently. The new method signature in that case would be: ...setSortedPageProperty( string $name, string $value, int|float $sortKey ) Although PHP 8.0 will throw a TypeError if a non-numeric type is coerced to numeric using `0 + ...`, use an explicit is_numeric check to obtain the same behavior in PHP 7.x. Change-Id: Ia94c192c429d0482c58467bed787fd2e0aca052f	2024-04-15 15:13:56 -04:00
C. Scott Ananian	f1a45cf2b9	Expand documentation of ParserOutput class Clarify that not all ParserOutputs represent parsed articles, and describe the merging operations on ParserOutputs in more depth. The interaction with Content and ContentHandlers is also described (thanks, Daniel!). Followup-To: Id2e3124652315a74869f504056fa8a99ad794350 Change-Id: I5c1016532eba1b71dc4d3d5d5d0c46775713efb5	2024-04-12 12:53:23 -04:00
Lucas Werkmeister	1adefb10e3	ParserOutput: clarify that “indexed” refers to value Bug: T305158 Change-Id: Ic6ea22b5188e575b288d57c8f692f492cb69452d	2024-04-12 12:09:02 +02:00
C. Scott Ananian	b4721e24aa	ParserOutput::setUnindexedPageProperty(): use empty string as default value If a placeholder value is needed, it is recommended to use the empty string to avoid wasting database space unnecessarily. Operationalize this recommendation by providing a default value for the method argument. Bug: T305158 Bug: T350224 Change-Id: I9ea8d93298d771c2d38fdfb451a2817220ca679a	2024-04-11 11:58:13 -04:00
jenkins-bot	2472cd9247	Merge "Substitute category default sort key when filling links table, not at parse time"	2024-04-11 14:59:33 +00:00
jenkins-bot	e4981c9702	Merge "Add ParserOutput::setIndexedPageProperty(); deprecate numeric properties"	2024-04-10 17:44:00 +00:00
C. Scott Ananian	de57c4e7c2	Add ParserOutput::setIndexedPageProperty(); deprecate numeric properties Deprecate non-string values to ::setPageProperty(), which introduce easy traps for programmers to fall into. Instead if page properties are intended to be indexed, use the new ::setIndexedPageProperty() instead. Also add ::setUnindexedPageProperty() for symmetry, with a tighter string type on the value. Bug: T305158 Bug: T350224 Change-Id: I8a39a7c90341dfee932aa819c9a0a637a8782f69	2024-04-05 19:12:29 -04:00
C. Scott Ananian	01590b89bf	ParserOutput: Emit deprecation warning if interwiki passed to addTemplate Bug: T361330 Depends-On: Ia8fd49a6f9af18e32d47d1dcd052c5f33123f44b Change-Id: Id4104dff4acaa60d94155d7915b9c1f2af4baaf0	2024-04-04 10:38:45 -04:00
C. Scott Ananian	c2df535b9c	Substitute category default sort key when filling links table, not at parse time This ensures uniform treatment of all places that call `addCategory` without duplicating the `defaultsort` code; it also ensures that the effect of the {{DEFAULTSORT}} parser function is independent of page position. Bug: T40435 Bug: T353530 Change-Id: I4480a6d59e766fa4eddc9ec9117c58b66771bb47	2024-03-29 18:30:02 -04:00
James D. Forrester	8e940c4f21	Standardise all our class alias deprecation comments for ease of grepping Change-Id: I7f85d931d3b79da23e87b4e5692b2e14be8fcaa0	2024-03-19 20:11:29 +00:00
jenkins-bot	9232985bd8	Merge "ParserOutput::setPageProperty(): Emit deprecation warning for non-scalar values"	2024-03-11 17:08:20 +00:00
Umherirrender	f3524224f0	build: Fix line indents Fixed SkinModuleTest::provideGetFeatureFilePathsOrder as nesting of arrays for parameters is wrong Change-Id: I9875008adf62d284c48662ebfbd245d72e5be064	2024-03-11 00:14:16 +01:00
jenkins-bot	a62f5c7911	Merge "[ParserOutput] Rename $mText to $mRawText and ::setText() to ::setRawText()"	2024-02-21 17:11:00 +00:00
C. Scott Ananian	72c4945a72	[ParserOutput] Rename $mText to $mRawText and ::setText() to ::setRawText() ParserOutput::getText() is not a simple getter, but does transformations on the "text" of the ParserOutput; the simple getter is named ::getRawText(). To maintain consistency, rename ParserOutput::setText() to ::setRawText() and the property name ParserOutput::$mText to ::$mRawText so future readers are not confused. The JSON property name as it appears in the serialized ParserCache is left as 'Text' so that we don't have any forward- or backward- rollback issues. Change-Id: I3ef34814ab9473cc70d0a6806e8c5a4a02b73491	2024-02-20 17:13:28 +00:00
C. Scott Ananian	6846f8aa10	ParserOutput::setPageProperty(): Emit deprecation warning for non-scalar values Non-scalar values passed to ParserOutput::setPageProperty() have never "worked"; they've been stringified (and null has been stored as an empty string). Emit a warning so we can fail harder in future releases. Bug: T305158 Depends-On: Ib36787d04c0ca713587dc8b814ca1c5a827f6f72 Change-Id: I38234084fdc7427ca577bb33a7fce1541581188d	2024-02-20 11:29:49 -05:00
C. Scott Ananian	b5d44bf339	ParserOutput::setPageProperty(): Update documentation String and non-string values behave very differently when passed to ::setPageProperty(), resulting in some unexpected gotchas for the unaware caller. Bug: T350224 Bug: T305158 Change-Id: I23b35b250f27a117d1353ea8a26d2b3f77c568e7	2024-02-20 11:26:38 -05:00
Subramanya Sastry	e55cc517da	Move Parser to Mediawiki\Parser namespace Bug: T166010 Co-Authored-By: Daimona Eaytoy <daimona.wiki@gmail.com> Co-Authored-By: James Forrester <jforrester@wikimedia.org> Co-Authored-By: Subramanya Sastry <ssastry@wikimedia.org> Change-Id: I79b4e732c45095eedbaa80afa5eb7479b387ed8a	2024-02-16 09:18:38 -05:00
jenkins-bot	2ca5bb9a96	Merge "ParserOutput: update task id in documentation"	2024-02-15 23:36:35 +00:00
C. Scott Ananian	13873a35b9	ParserOutput: update task id in documentation We closed T296023 and opened a new task for the work remaining, so update the comments in the code to match. The task relating to `addLanguageLink` is actually T296019. Change-Id: I28b942a57ed41751d44d8565a290d925f6d7f180	2024-02-15 15:23:57 -05:00
C. Scott Ananian	28a3371382	[OutputTransform] Remove broken and unused 'bodyContentOnly' option This was formerly used by the REST api, but instead that code just uses ParserOutput::getRawText() when it needs the full HTML document. This option has been broken, with various passes like RenderDebugInfo and AddWrapperDiv adding content in inappropriate places if bodyContentOnly was false. Change-Id: Ib45f95ded59c81c16d61803f977d1edbfe82b262	2024-02-15 13:05:53 -05:00
C. Scott Ananian	770d2bf040	[ParserOutput] Make 'enableSectionEditLinks' a ParserOption This will allow the Translate extension to set this parser option in the ArticleParserOptions hook, instead of mutating $options passed to ParserOutput::getText() in the ParserOutputPostCacheTransform hook. It ought to also help to handle the many places which call: ... = $parserOutput->getText( [ 'enableSectionEditLinks' => false, ] ); by allowing them to set the appropriate ParserOption instead of passing arguments to ::getText(). Bug: T350626 Change-Id: I719c115194059060f7f888608417a194ac80cc92	2024-02-09 23:42:03 +00:00
C. Scott Ananian	242c6d2cf9	Introduce ParserOutput::setFromParserOptions() and use for preview flag Bug: T341010 Co-Authored-by: cananian <cananian@wikimedia.org> Co-Authored-by: ihurbain <ihurbainpalatin@wikimedia.org> Change-Id: I03125fdaa7dd71ba57d593e85ecb98be6806f3f6	2024-02-07 21:22:06 -05:00
C. Scott Ananian	52320c0902	Move ParsoidRenderID to MediaWiki\Edit This class belongs with the rest of the Parsoid output stash code. This class has been marked @unstable since 1.39 and thus the move does not need release notes. Change-Id: I16061c0c28b1549fbe90ea082cc717fee4a09a6e	2024-02-07 21:22:06 -05:00
C. Scott Ananian	1858e1cdd7	Rename ParserOutput::{get,set}Timestamp() to ::{get,set}RevisionTimestamp() This avoids confusion with the "render timestamp" held by the cache, and is consistent with ::get*RevisionId() etc. The old ::getTimestamp() and ::setTimestamp() methods have been deprecated. Change-Id: Idb5e687709c98086c5d3075d31885c58a0723197	2024-02-07 21:22:06 -05:00
C. Scott Ananian	0de13d7662	Add ParserOutput::{get,set}RenderId() and set render id in ContentRenderer Set the render ID for each parse stored into cache so that we are able to identify a specific parse when there are dependencies (for example in an edit based on that parse). This is recorded as a property added to the ParserOutput, not the parent CacheTime interface. Even though the render ID is /related/ to the CacheTime interface, CacheTime is also used directly as a parser cache key, and the UUID should not be part of the lookup key. In general we are trying to move the location where these cache properties are set as early as possible, so we check at each location to ensure we don't overwrite a previously-set value. Eventually we can convert most of these checks into assertions that the cache properties have already been set (T350538). The primary location for setting cache properties is the ContentRenderer. Moved setting the revision timestamp into ContentRenderer as well, as it was set along the same code paths. An extra parameter was added to ContentRenderer::getParserOutput() to support this. Added merge code to ParserOutput::mergeInternalMetaDataFrom() which should ensure that cache time, revision, timestamp, and render id are all set properly when multiple slots are combined together in MCR. In order to ensure the render ID is set on all codepaths we needed to plumb the GlobalIdGenerator service into ContentRenderer, ParserCache, ParserCacheFactory, and RevisionOutputCache. Eventually (T350538) it should only be necessary in the ContentRenderer. Bug: T350538 Bug: T349868 Followup-To: Ic9b7cc0fcf365e772b7d080d76a065e3fd585f80 Change-Id: I72c5e6f86b7f081ab5ce7a56f5365d2f75067a78	2024-02-07 21:22:06 -05:00
Daimona Eaytoy	1d6776fdbc	Replace deprecated MWException Also remove some unchecked exception from doc comments. Bug: T328220 Bug: T240672 Change-Id: I88b1e948ce5da77d9c4862a2b98793d6ba00cf8b	2024-01-19 21:58:42 +00:00
Brian Wolff	f1af33be38	Add taint annotations for ParserOutput Change-Id: Id73b8f22f8877442f114bf7b41d0f9ea47fb4283	2024-01-12 14:17:21 +00:00
C. Scott Ananian	f2d910844f	ParserOutput: Convert category name back to a LinkTarget when merging CMC When we are merging a ParserOutput into a ContentMetadataCollector, convert categories to LinkTarget, which is the preferred parameter type of CMC::addCategory(). This also reverts the temporary fix in I0715f4fbc870e401e5759dd7c7a3c19077c40a6a. Note that the category names should be in dbkey form for proper deduplication, but both TitleValue::tryNew() and CategoryLinksTable::setParserOutput() will renormalize if needed (see I2b08edd90666e0fa4eafe91444a58806909b02d6 / T328477). Depends-On: Iea894aa2cee90f4ca5c7688493b0654e4605ce23 Change-Id: I5a903396edb4da0900ecef37cb3bf4bd03b5ba68	2023-12-18 21:01:51 +00:00
C. Scott Ananian	df1f18cc9d	ParserOutput: Temporarily move "merge categories" in ::collectMetadata Due to a botched signature change on the Parsoid side, in -a8 Parsoid only accepts `string|int` for ContentMetadataCollector::addCategory() and in -a9 Parsoid only accepts `LinkTarget`. The ParserOutput in core, of course, accepts both. So move the code which merges categories into the section of ContentMetadataCollector::collectMetadata() where we know that the CMC we're merging with is really a ParserOutput. Change-Id: I0715f4fbc870e401e5759dd7c7a3c19077c40a6a	2023-12-18 14:31:19 -05:00
jenkins-bot	0d45f127f5	Merge "ParserOutput: keep modules and module styles unique"	2023-12-16 04:25:53 +00:00
James D. Forrester	9bfb75ff90	Namespace ParserOutput Most used non-namespaced class! Bug: T353458 Change-Id: I4c2cbb0a808b3881a4d6ca489eee5d8c8ebf26cf	2023-12-14 14:57:34 -05:00
Isabelle Hurbain-Palatin	3935cd1b05	ParserOutput::getText(): do not clone ParserOutput when invoking pipeline OutputPage::getParserOutputText/addParserOutputContent expects ParserOutput to be mutated (e.g. by PostCacheTransformHookRunner). Hence, cloning it before running the pipeline is breaking DiscussionTools, probably among others. Suppress the clone for the case where the output pipeline is invoked from ParserOutput::getText() (which is a deprecated method anyway) and additionally suppress the side-effects to ParserOutput::$mText on that code path. Bug: T353257 Co-Authored-By: C. Scott Ananian <cananian@wikimedia.org> Co-Authored-By: Isabelle Hurbain-Palatin <ihurbainpalatin@wikimedia.org> Change-Id: I85c690fd37b781cb27c21970467639e852113b2a	2023-12-12 11:39:32 -05:00
jenkins-bot	c57120300a	Merge "ParserOutput: Allow passing LinkTarget to title-related methods"	2023-12-11 18:02:25 +00:00
Isabelle Hurbain-Palatin	a3f51c732d	Refactor DefaultOutputTransform into a pipeline of transforms Bug: T348253 Change-Id: I53551ec6d6471569709c71c1155729e550f64de8	2023-12-08 18:06:19 -05:00
C. Scott Ananian	4b83285954	ParserOutput: Allow passing LinkTarget to title-related methods Broadened the argument type to allow passing LinkTarget to: * ParserOutput::addCategory() * ParserOutput::addLanguageLink() * ParserOutput::addLink() * ParserOutput::addImage() * ParserOutput::addTemplate() This allows for a tighter interface with Parsoid's ContentMetadataCollector class and avoids errors caused by passing the wrong form of string title ("text" with spaces versus "dbkey" with underscores). There are a few performance problems remaining after this patch, which only apply to use by Parsoid (not the legacy parser): 1. ::addLink() does inefficient db requests to fetch the page id for each link if the optional $id parameter is not passed. These lookups should be deferred and a LinkBatch used. (The legacy parser always passes $id.) 2. ::addTemplate() similarly requires $page_id (and $rev_id) to be passed, so is not currently usable by Parsoid. 3. ::addLanguageLink() uses Title::getFullText() which is not present in LinkTarget and is currently implemented as a full Title lookup. This is not an issue for the legacy parser, because it already has a Title object so the lookup is a no-op, but could be improved for Parsoid's use. Bug: T296023 Change-Id: If21ec8563c8a619bdde7c0cb6534bb9009480a21	2023-12-08 17:50:29 -05:00
jenkins-bot	b7fc1b2f43	Merge "Only cache expensive renderings"	2023-11-30 21:24:34 +00:00
daniel	e3fb964439	Only cache expensive renderings Pages that are fast to render can be omitted from the parser cache to preserve disk space and cache write operations. The threshold is configurable per namespace, so the tradeoff can be evaluated based on different access patterns. For example, pages that are accessed rarely, like file description pages on commons, may have a high threshold configured, while pages that are read frequently, like wikipedia articles, may be configured to be always cached, using a 0 threshold. Filtering is based on a time profile recorded in the ParserOutput. A generic mechanism for capturing the timing profile is implemented in the ContentHandler base class. Subclasses may implement a more rigorous capture mechanism. Bug: T346765 Change-Id: I38a6f3ef064f98f3ad6a7c60856b0248a94fe9ac	2023-11-30 20:56:12 +00:00

467 commits