Thijs/wiki.techinc.nl

Author	SHA1	Message	Date
Ebrahim Byagowi	efda4cae32	Use a better bidi aware markup in CommentParser As noted in the comments, this needed a markup that works better in bidi scenarios and as a part of replacing bidi control codes with HTML markup I was able to test different bidi scenarios using <bdi> HTML tags. Bug: T375975 Change-Id: If2af751fc9f78869acf7b7e93199fa927de2cc19	2024-10-04 10:50:02 +03:30
C. Scott Ananian	714a7146d6	Sync up core repo with Parsoid This now aligns with Parsoid commit b19f73d7beadedcb6991640aac7eb7d6e7aec8f5 Change-Id: Ief91b25769f777169af65c9720faa767850f6239	2024-10-02 10:43:47 -04:00
C. Scott Ananian	7495f9bc15	Deduplicate language links in ParserOutput and OutputPage Move deduplication of language links out of Parser.php and into the ParserOutput in order to be compatible with alternate Parsers (Parsoid). Clean up various inconsistencies: ensure deduplication also happens in OutputPage when multiple ParserOutputs are merged into the final output, and ensure that the deduplication in LinksUpdate is done in the same order (first link prevails) as in Parser/ParserOutput/OutputPage. Deprecate OutputPage::setLanguageLinks() (the matching ParserOutput::setLanguageLinks() was deprecated in 1.42). As a breaking change, return an array, not an array reference, from ParserOutput::getLanguageLinks(). This allows us to safely modify the internal representation of language links. As far as I can tell, no one used the returned reference to sneakily modify the list of language links, and there was not a good way to deprecate this before making the breaking change. While we're at it, we've added tests to ensure that language link fragments are preserved. Bug: T26502 Bug: T358950 Bug: T375005 Change-Id: I82a05a51d94782ebb9fa87ff889ca0f633b3e15c	2024-09-26 15:28:49 -04:00
C. Scott Ananian	25b27ce309	Sync up core repo with Parsoid This now aligns with Parsoid commit fc9ab0949952d5e784acb012096860f5c8663fc7 Change-Id: I5d72f551c75de80b0834ea98d8a1d3cb5852e866	2024-09-26 13:04:36 -04:00
C. Scott Ananian	ec4e4648dd	Sync up core repo with Parsoid This now aligns with Parsoid commit dea42dd799d9c40fb7fedb42122ec264d6ef6ded Change-Id: I4b2614ce3a83bfea0af53927464e7fbde6a92df9	2024-09-24 12:36:03 -04:00
C. Scott Ananian	25da911334	Parser tests: add additional options to test ParserOutput metadata New options added: `iwl`, `links`, `special`, `extlinks`, and `templates`, and handling of existing `ill` option tweaked to be consistent. Added some tests to exercise these options, focusing on the handling of title fragments. Attempted to make the output formatting consistent among options; a future unification (I32df68714ffdf2f0745b974f47bc3ccceef1f41c) should help DRY these out further. Bug: T310512 Change-Id: Ic9c766ae4362969de124ad9d66eb47cfa68395c6	2024-09-13 14:42:27 -04:00
Yiannis Giannelos	0509dbebad	Sync up core repo with Parsoid This now aligns with Parsoid commit 80bc41a395b19221e7f26b36dfbe0ab15a025819 Change-Id: Iec571f78e7a55991aea69ede2519803b84c05936	2024-09-12 18:58:43 +03:00
C. Scott Ananian	7249c4c982	parserTests.txt: Update documentation about cat/ill options Parsoid does support these options now. Change-Id: I9caedd10b8f7229602ad4f963275b62777aca104	2024-09-10 19:30:07 +00:00
dvorapa	10ab0e40a9	parser: Add a new {{USERLANGUAGE}} magic word for use in wikitext Depending on configuration, this returns either the interface language code of the current user or the current page language. Bug: T4085 Change-Id: Iab7fda272ec81af88c74612727ff6bed014d4a81	2024-09-07 19:16:32 +00:00
jenkins-bot	512c78b8ea	Merge "Make {{#language}} consistent with {{#dir}} and {{#bcp47}}"	2024-07-31 11:42:16 +00:00
jenkins-bot	52a10a36b1	Merge "Add {{#bcp47}} parser function"	2024-07-31 11:42:08 +00:00
jenkins-bot	f338ac3295	Merge "Add {{#dir}} parser function"	2024-07-30 20:34:27 +00:00
C. Scott Ananian	450fe7fcd8	Make {{#language}} consistent with {{#dir}} and {{#bcp47}} Add the same no-arg options for language code that {{#dir}} and {{#bcp47}} have, for consistency: * `{{#language}}` will return the name of the target language (for articles, the content language; for messages, the user language) The default value for the "in language" argument should be the autonym. This was working previously but only via a baroque code flow path for invalid language codes. Make this a bit clearer and add tests. Since non-autonym language code translations are added via the [[Extension:CLDR]] in production, hook LanguageGetTranslatedLanguageNames in the ParserTestRunner to ensure that we can test this. Followup-To: Ice1c671c5b3cc077d2bb80ea5dc25c5eabbfeb36 Followup-To: I19c3e91a924e080f37dc95a0d4e61493583b533e Change-Id: Ibf6e7f194cc056eadb48a5ad8e6d01a761d9351c	2024-07-30 20:27:17 +00:00
C. Scott Ananian	416c33bb6a	Add {{#bcp47}} parser function Template:Bcp47 is one of the most used templates in Wikimedia Commons. Providing its functionality as a parser function, tied to MediaWiki's language-handling code, reduces code duplication and will allow us to reduce template usage on commons. As with the {{#dir}} parser function, support one special case: * `{{#bcp47}}` will return the BCP-47 code of the target language (for articles, the content language; for messages, the user language) Note the following slight differences from [[Template:BCP47]] on Commons, documented in an added parser test: * 'simple' maps to 'en-simple' (not just 'en') * 'roa-tara' maps to 'nap-x-tara' (not 'it-x-tara') Bug: T366623 Change-Id: Ice1c671c5b3cc077d2bb80ea5dc25c5eabbfeb36	2024-07-30 20:27:03 +00:00
Ebrahim Byagowi	e1385d3bdf	Add {{#dir}} parser function Template:Dir is one of the most used templates in Wikimedia Commons, this tries to provide parts of its functionality in hope we can perhaps simplify or get rid of the template eventually for clarity and performance reasons. As a convenience, `{{#dir}}` and `{{#dir:}}` are synonyms for `{{#dir:{{PAGELANGUAGE}}}}`: they return the direction of the target language. For articles, the target language is the content language; for messages, the target language is the user language. In addition, to avoid confusion between BCP-47 language codes and MediaWiki-internal language codes, an optional second parameter can be supplied. If the second parameter is the (localizable) string 'bcp47', the language code given in the first parameter will be treated as a BCP-47 code. For example: `{{#dir:sr-Cyrl|bcp47}}`. (See LanguageCode::bcp47ToInternal() for a description of the differences and overlaps between MediaWiki internal and BCP-47 codes. These overlaps so far don't result in any case where encouraging editors to be precise about which set of enumerated string values they are using for consistency with other language-related functions, and because MediaWiki internally differentiates between BCP-47 codes and internal codes.) Bug: T359761 Change-Id: I19c3e91a924e080f37dc95a0d4e61493583b533e	2024-07-19 16:57:48 -04:00
Tim Starling	ebf3c9be86	ParserTestRunner: add timezone and user language options * Add wgLocaltimezone to the list of global variables which may be set in parser test options. * Add userLanguage option, which is passed through to ParserOptions. Bug: T223772 Change-Id: I8498527c276288feae854868a8f4b1f3205a49e8	2024-07-12 11:35:33 +10:00
C. Scott Ananian	c8e77a3707	Sync up core repo with Parsoid This now aligns with Parsoid commit 2508e24a2aeb54b55eb54f7f65bedc4d477fc9cf Change-Id: Ibb9f1c6287c6ec3e982f0fa3ddf908b01484973a	2024-06-10 23:29:02 -04:00
Bartosz Dziewoński	f0c7fa9234	Move section edit links outside headings (new heading HTML) Legacy parser can now output headings using a more accessible markup, which is also identical to the markup used by the Parsoid parser. Changes to client-side JS and CSS necessary to support the new markup have already been merged in earlier commits. includes/skins/Skin.php includes/ServiceWiring.php * Define a new skin option, 'supportsMwHeading', which can be used to toggle the new markup per-skin. * Update the built-in fallback skin to enable it. This affects the output in parser tests. docs/config-schema.yaml includes/config-schema.php includes/config-vars.php includes/MainConfigNames.php includes/MainConfigSchema.php * Add a new configuration setting, 'ParserEnableLegacyHeadingDOM', which can be used to toggle the new markup per-site. includes/OutputTransform/Stages/HandleSectionLinks.php * Output new heading HTML for skins that enabled the option. tests/* * Duplicate parser tests that cover heading generation to cover both new and old markup. Update other parser tests to use new markup. * Add some unit and integration tests for the behavior of the skin option and some parser tests for edge cases of the new markup. Bug: T13555 Change-Id: I1180169a8e83af834c2984ba16089e6277f2a8dd	2024-05-06 12:25:33 -04:00
Subramanya Sastry	33f2164096	Sync up core repo with Parsoid This now aligns with Parsoid commit 902eb345ed701b635b98f03557276aa48b564cc2 Change-Id: I91c663a4f2ca00157fbd9337d1d0c72a98452591	2024-04-26 14:57:58 +05:30
Arlo Breault	de01ef7d20	Sync up core repo with Parsoid This now aligns with Parsoid commit c296dca4af9a1d47200a3699e12d9884acc43150 Change-Id: I5a0e246171e9b58d77b2be945b802f381c1f40b2	2024-04-11 12:59:32 -04:00
jenkins-bot	2472cd9247	Merge "Substitute category default sort key when filling links table, not at parse time"	2024-04-11 14:59:33 +00:00
jenkins-bot	71b809f9c2	Merge "Don't strip non-newline whitespace from left side of language links"	2024-04-04 16:56:28 +00:00
jenkins-bot	31a686f9da	Merge "Sync up core repo with Parsoid"	2024-04-01 04:00:01 +00:00
Subramanya Sastry	0cd8ecf2a5	Sync up core repo with Parsoid This now aligns with Parsoid commit 16e27722c6c50618c78230952c1ad27948fc3a0b Change-Id: I21067c1b22a494422184abf7c4bb50424b4fad56	2024-04-01 08:16:27 +05:30
C. Scott Ananian	63293370e5	Don't strip non-newline whitespace from left side of language links This follows up on I5e87b33a956e296cdaf671fa99c9555944b73479 and makes (invisible) language links consistent with how we handle (invisible) category links. Bug: T359886 Followup-To: I5e87b33a956e296cdaf671fa99c9555944b73479 Change-Id: I3e5567a91b47e0b04da928450644f3f475aaf51b	2024-03-29 18:46:16 -04:00
C. Scott Ananian	bf7120f80e	Don't strip non-newline whitespace from left side of [[Category]] links This follows up on a long series of tweaks to whitespace handling around [[Category]] links (T2087, T87753, T174639) which aimed to simplify and make intelligible the whitespace handling around category links without allowing categories to break lists or paragraphs in which they are found. Removing newlines but not other whitespace on the left-hand side of category links should preserve the valuable features of T2087 et al while still ensuring that the following all render equivalently: ABC [[Category:Foo]]DEF ABC[[Category:Foo]] DEF ABC [[Category:Foo]] DEF Added parser test to document the new behavior; it's worth noting that although there were plenty of tests documenting the expected interaction of category links and newlines, there were previously no tests covering the interaction of non-newline whitespace and category links; the one test which needed to be altered added non-semantic whitespace (ie, extra whitespace to the test output which did not affect the way the HTML would display). This patch brings the legacy parser into parity with Parsoid's parsing of category links. Bug: T359886 Change-Id: I5e87b33a956e296cdaf671fa99c9555944b73479	2024-03-29 22:30:59 +00:00
C. Scott Ananian	c2df535b9c	Substitute category default sort key when filling links table, not at parse time This ensures uniform treatment of all places that call `addCategory` without duplicating the `defaultsort` code; it also ensures that the effect of the {{DEFAULTSORT}} parser function is independent of page position. Bug: T40435 Bug: T353530 Change-Id: I4480a6d59e766fa4eddc9ec9117c58b66771bb47	2024-03-29 18:30:02 -04:00
thiemowmde	a15b6d516f	parser: Fix formatdate parser function for ISO year 0 = 1 BC I'm not sure how this ever happened, but I'm sure it's a mistake. The following test scenario should make it very obvious: * {{#formatdate:-0002-12-31|mdy}} * {{#formatdate:-0001-12-31|mdy}} * {{#formatdate:0000-12-31|mdy}} * {{#formatdate:0001-12-31|mdy}} * {{#formatdate:0002-12-31|mdy}} Expected output: 3 BC, 2 BC, 1 BC, 1, 2, … Current output: 3 BC, 2 BC, 0 (?), 1, 2, … Note how "1 BC" is skipped and shown as "0" instead. Everything else is correct, e.g. the ISO year -1 is already displayed as "2 BC". It's really only this single outlier. In case you don't know: There is no year 0 when the BC specifier is used. There is either year 1 after or year 1 before Christ. This is different in ISO, mostly to make calculations easier. That's why the DateFormatter already does an extra `- 1` and `+ 1` in the two makeIsoYear and makeNormalYear methods. The problematic line of code was originally written in 2003, see https://phabricator.wikimedia.org/rMW98fc03e6 The core parser function has existed since 2009, see https://phabricator.wikimedia.org/rMWb9ffb5a7 Change-Id: Iaeb7a954579a409fefd87dab4e2a15778ab39fb4	2024-02-27 17:17:36 +01:00
C. Scott Ananian	3cebc721bb	Sync up core repo with Parsoid This now aligns with Parsoid commit 51baccc8741108a9e3f763f2c19c6ce6eda55ac4 Three tests needed to be disabled because they had dependencies on features not included in core's CI: * {{#if}} used in tests added by I71c38b42ac9bfb7137f2e34df70bdfa139abced7 but only provided by the ParserFunctions extension * <poem> used in tests added by I5a6356a82251881a5f841b36a7f26879fc611138 but only provided by the Poem extension In addition, the "multiline" part of the "Expansion of multi-line..." parser tests seems to have been lost at some point. My best guess is that the definition of `Template:1x` initially included an extra newline which was lost, maybe during an unrelated stripping of leading/trailing whitespace in `!! article` clauses. In any case, these tests are no longer testing the thing they say they are. These will be fixed in a follow up. Change-Id: Ia9144634625f176fbea11f3d2ef4b21a5492e99b	2024-02-21 15:04:08 -05:00
Reedy	2295da3004	Fix more incorrect casing of MediaWiki Change-Id: I331e5636823a0beae8d804148f648cfaffd6a1f8	2024-02-19 14:35:34 +00:00
Isabelle Hurbain-Palatin	7f63d5250e	Revert "Use Remex for DeduplicateStyles transform" This reverts commit `82da9cf14b`. Passing through Remex seems to have unexpected consequences to be investigated but, for the sake of unbreaking the UBN, let's revert this first. Bug: T353920 Change-Id: Iaac7942aa77aee5ab525852ac5b41dd516ff13c9	2023-12-22 11:26:09 +01:00
jenkins-bot	132a7955ae	Merge "Make two messages not raw HTML"	2023-12-18 18:59:57 +00:00
C. Scott Ananian	82da9cf14b	Use Remex for DeduplicateStyles transform The previous implementation was using an ad-hoc regular expression which was matching inside the data-mw attribute of Parsoid output, eg: <sup about="#mwt42" [...] typeof="mw:Extension/ref mw:Error" data-mw="{&quot;name&quot;:&quot;ref&quot;,&quot;attrs&quot;:{&quot;name&quot;:&quot;infobox_stats_ref_rail&quot;},&quot;body&quot;:{&quot;html&quot;:&quot;<style data-mw-deduplicate=\&quot;TemplateStyles:r1133582631\&quot; typeof=\&quot;..."> After substitution, the <link> element inserted contained " instead of &quot; and so broke out of the attribute. Instead use a proper HTML tokenizer (via wikimedia/remex-html) so that we don't allow bogus matches inside attribute values. To fix up tests: * Don't deduplicate styles when parsing UX messages (also helps performance) * Don't deduplicate styles in ContentHandler integration tests * Don't deduplicate styles by default in parser tests (unless explicit option is set) Depends-On: Id9801a9ff540bd818a32bc6fa35c48a9cff12d3a Depends-On: I5111f1fdb7140948b82113adbc774af286174ab3 Followup-To: Ic0b17e361bf6eb0e71c498abc17f5f67f82318f8 Change-Id: I32d3d1772243c3819e1e1486351d16871b6e21c4	2023-12-15 17:49:21 +01:00
Jon Harald Søby	0e8a92d9ff	Make two messages not raw HTML Two messages were added to wgRawHtmlMessages instead of just fixing the way they were parsed so they can't contain raw HTML. This fixes that. In order to avoid breakage on-wiki for old customized messages that took advantage of them being parsed as raw HTML, rename the messages too. Also rename a few other messages from the same set to stay consistent. Note: These messages are suppressed in favour of Echo's messages when Echo is enabled, and Echo is enabled on all Wikimedia wikis, so the existing customized messages on Wikimedia wikis are basically no-ops. Bug: T353316 Change-Id: Ib0d1c79247fe091f2806b7c23ffb2fe22cc4df4a	2023-12-15 11:10:37 +01:00
Subramanya Sastry	f1772fc150	Sync up core repo with Parsoid This now aligns with Parsoid commit f73c9f0f665a57f5c0247ad1973a4f33f165f96b Change-Id: Ibd531ddb1d545c1286e3cd3c3c6c08536f954768	2023-11-07 13:11:04 -06:00
Bartosz Dziewoński	e9a281ef4c	Add parser test for escaped wikitext in section heading Change-Id: I4f0c2107541b668f6ddd093dadcb6f391724d57f	2023-10-09 14:46:51 +00:00
Isabelle Hurbain-Palatin	8706602346	Sync up core repo with Parsoid This now aligns with Parsoid commit 273c783374efdb148f26d7a0f3d590eb6ae66551 Change-Id: I742825115730b5697a1da47ce5d135adcdef1f8c	2023-07-13 18:15:47 +02:00
Arlo Breault	498c00ab25	Sync up core repo with Parsoid This now aligns with Parsoid commit 0dc439dd46b5db02bd515d642caa15f9e081270d Change-Id: I513703b4c1f002c75afd7d4792d47aa3cca0e726	2023-05-26 17:05:07 -04:00
jenkins-bot	a4cb5e6519	Merge "Sync up core repo with Parsoid"	2023-05-25 18:18:29 +00:00
Arlo Breault	a9ea70bf6c	Sync up core repo with Parsoid This now aligns with Parsoid commit db0772cd77d89ea166bf6ea162f9d223264a6f50 Change-Id: I988d8e3bd4953fdf8e71ca0ed72f2f0755e4948c	2023-05-25 13:45:34 -04:00
Matt Fitzpatrick	a7e4d70d45	Sanitizer: Permit the `aria-level` HTML attribute in wikitext Allows editors to identify a pseudo-heading as a heading of a given hierarchical level to assistive technologies. Also allows levels 7 and deeper. <div role="heading" aria-level="2">Example</div> See also https://www.w3.org/WAI/GL/wiki/Using_role%3Dheading Change-Id: Ia465a076db334d08cd1f548f2363a0f7cafe7690	2023-05-21 12:57:53 +03:00
C. Scott Ananian	85a3cc74c4	Sync up core repo with Parsoid This now aligns with Parsoid commit ede7e1c0afab3dea5c02033b9ad4e9a064e27717 Change-Id: Ib8ec513f3cef75c071b6d08913a18515a15ec82a	2023-05-11 14:49:23 -04:00
Arlo Breault	cd0d6aeba0	Sync up core repo with Parsoid This now aligns with Parsoid commit eb7a6ce7afac292b7e8a43c622fea6ac65791fc1 Change-Id: Ie704588c71bff4525632e6aa918ae6d0bd3364fb	2023-04-26 14:11:39 -04:00
Arlo Breault	30b8fe564b	Add classes on elements inside the media structure The purpose of which is to improve the performance of the css selectors targeting the output, as analyzed in T270150#8524965, as well as eliminate some of the brittleness in depending on direct descendents and first-child, which can be seen in T320285 and T304010. mw-file-element is targeted to apply margins, borders, and vertical-alignment to that element. The current css rules have wildcard selectors in the rightmost position, which, since css is parsed right-to-left, can be quite slow on a wiki page. The legacy parser has an equivalent class, thumbimage, when rendering thumbs but here we apply the class more broadly. A follow-up patch in I70c61493fe492445702f036e5b24ef87fc3bdf43 will remove the redundant wildcard selectors once parsercache has turned over. Bug: T270150 Bug: T314097 Depends-On: Ie85ee7048273023a2c51f42a333a9c1493360404 Depends-On: Ie0ec018ac6c2c42c05610b342d7ef87493dfdc42 Depends-On: Ifc17fdf530af515b066de706ca5e69e118fd1c5b Depends-On: Ib60edacdae2ff41a0de2b2b584718fd9ce925f97 Change-Id: Ifd4001e312a5fa4b7beaad63ba8c4e79e3201b9b	2023-04-26 12:39:25 -04:00
C. Scott Ananian	fe40b55f7d	ParserTestRunner: use TOCData::prettyPrint() for 'showtocdata' This provides a bit of isolation from the actual layout and names of properties in the object, as well as being a touch more readable when debugging test failures. Change-Id: I5ddca850f577b2ac24e237a2518f03983e79a51d	2023-03-10 16:41:49 -05:00
C. Scott Ananian	4e4008c976	Don't clear LanguageConverter display title when converting ToC The LanguageConverter::convert()/::convertTo() methods clear the converted title and reset other (less important) bits of LanguageConverter state. Add an optional parameter in order to skip this reset. (The LanguageConverter::translate() methods are available which don't reset LanguageConverter state, but they also don't process embedded language converter markup. Since headings can contain embedded markup, the ::translate() methods aren't appropriate.) Bug: T306862 Bug: T331316 Change-Id: Ifb2745e45974755ba5a6068c13e84be6c4e3f329	2023-03-09 13:08:01 -05:00
C. Scott Ananian	93073d4632	ParserTestRunner: handle metadata output as separate section If a ParserTest mixes HTML output and metadata properties, it can complicate HTML normalization and other test processes, especially for Parsoid-mode bidirectional tests. Support splitting metadata output into a separate section, named `!! metadata`, with the standard options for legacy and parsoid variants, like `!! metadata/php` and `!! metadata/parsoid` and `!! metadata/parsoid+integrated` etc. For compatibility, if the metadata flags are present on the test and the new section is not present, we'll continue to handle the metadata output as we have before, aka append or prepend the metadata to the HTML. Code search for uses of these options (uses in parsoid and core can be ignored; uses of 'pst' are harmless when they are not combined with another option): https://codesearch.wmcloud.org/search/?q=%28%5E%7C%20%29%28%28showtitle%7Cshowindicators%7Cill%7Ccat%7Cpst%7Cshowflags%29%28%20%7C%24%29%7C%28extension%3D%7Cproperty%3D%29%29&i=nope&files=%5Etests%2Fparser%2F.*%5C.txt&excludeFiles=&repos= Change-Id: I845694d4f2109a8b9125410e8533ca69bbea50fa	2023-02-28 17:26:08 -05:00
C. Scott Ananian	e7a762fd59	Language-convert Table of Contents at parse time In `24949480eb` (Oct 2021) injection of the Table of Contents was moved from Parser to ParserOutput::getText(); that is, from parse time to "postprocess text possibly fetched from the cache" time. Unfortunately, this meant that language conversion wasn't done on the table of contents (!), for either traditional skins or the vector-2022 skin. This was fixed for traditional skins by `059e62cde6` (Nov 2021), later amended by `0955046ca5` (Mar 2022), which added explicit language conversion to the TOC injection process in ParserOutput::getText(). This fix was still not complete, however, since editor-defined custom language-conversion rules defined in the article body were no longer available to the language converter when conversion was done in ParserOutput::getText(); the ToC title was also being double-converted. Further, neither of these short-term fixes addressed the output of ParserOutput::getSections() (now ParserOutput::getTOCData()) which was used by vector-2022 to generate the ToC in the sidebar and which remained entirely unconverted. With `439656e019` (Jan 2023), we started using the ::getSections()/::getTOCData() output for main article text as well, but we kept the previous hack which post-converted the generated HTML. This kept old skins at parity with the post-Oct-2021 status, but also didn't address the conversion issue for vector-2022. The solution here is to perform language conversion on the ToC lines at parse time along with the rest of the language conversion, and store converted headings in TOCData. This has a number of side effects: 1. The ToC information array available via the action API is now language converted. This is probably what you wanted in the first place, but could potentially be disruptive. 2. The ToC is consistently converted with the full set of editor-defined custom conversion rules. 
Before Oct 2021, the ToC was converted using the set of custom conversion rules active at the point at which the ToC was inserted (which was usually near the beginning of the article). When all conversion rules appear at the very top of the article (best practice!): -{en:Foo; en-x-piglatin:Bar;}- Lead section text == Introduction == == Foo == There should be no difference between pre-Oct 2021 behavior and the behavior after this patch: in both cases the rule defined in the article body will be applied both to the heading and to the TOC, and they will be consistent. (After Oct 2021 and before this patch, Foo would be converted in the heading but not in the table of contents.) But in cases where conversion rules are defined after the TOC insertion point, the section heading as it appears in the body text could appear different from the section heading as it appears in the ToC. For example, if you defined a conversion rule just before using a term in a heading: == Introduction == -{en:Foo; en-x-piglatin:Bar;}- == Foo == Before Oct 2021, this rule would be applied to the heading, but not to the TOC (because the TOC insertion point was before the rule definition). This would also be the behavior before this patch (since rules defined in the article body are currently not applied at all). After this patch, the rule will be applied to both the heading and the TOC (because the rule application location is effectively "at the very end of the article"). In the rare cases when rules are not defined in glossaries at the top of the article, this type of usage (definition immediately preceding first use) is expected to be the most common and the behavior after this patch is more correct. But alternatively, if you defined a conversion rule after using the term in a heading: == Introduction == == Foo == -{en:Foo; en-x-piglatin:Bar;}- Before Oct 2021, this rule wouldn't be applied to the heading or the TOC. 
Before this patch, this would also be the case (because rules defined in the article body are not applied at all). After this patch, the rule will be applied to the ToC but not the heading, since the application point for the TOC is effectively at the end of the article. This inconsistency is probably not desirable, but this case is expected to be rare, and (assuming the editor intended 'Foo' to be unconverted) the editor can work around the inconsistency by explicitly protecting 'Foo' from conversion: == -{Foo}- == -{en:Foo; en-x-piglatin:Bar;}- And if the editor /intended/ Foo to be converted, the rule definition should be moved earlier in the article. Again, putting all rules at the top of the article is the preferred style, and works better with the glossary style used by the zhwiki community (see also https://www.mediawiki.org/wiki/Requests_for_comment/Scoped_language_converter ). Bug: T306862 Depends-On: I0c9c9fec920f7cb028d935e552a8f11475a23ba7 Change-Id: I321cd31dae64bbf845d53282e5d28a55bc4ec319	2023-02-24 10:09:53 -05:00
C. Scott Ananian	e73aef2d97	Sync up core repo with Parsoid This now aligns with Parsoid commit 90bc541138035d4ff6b62efa0050bd03161bc43b Change-Id: I9f16f71996da5e5baf1e0506129342a25c2ece75	2023-02-23 10:30:18 -05:00
jenkins-bot	9200e04403	Merge "ParserTestRunner: Move 'showflags' handling inside ::addParserOutputInfo()"	2023-02-23 03:57:48 +00:00

855 commits