Thijs/wiki.techinc.nl

Author	SHA1	Message	Date
C. Scott Ananian	0955046ca5	Ensure that ToC is converted into the proper target language This patch exports the necessary information from the Parser into the ParserOutput to ensure that the Table of Contents can be properly language-converted: both ensuring that the target language is correct (in cases where it differs from the content language) and that various conversion-suppression mechanisms are functional. When the ParserCache does not (yet) have the new properties from Parser, the behavior is unchanged from before (the content language is used, and its "preferred variant"). This is a follow up to the "quick fix" deployed in Ic14b3a49a8ee7ed600485d4f8a363a206035a847 to fix an UBN regression. Parser tests have also been added to verify that ToC conversion is correctly done (T299973). Task T303329 has been opened to (eventually) rename the 'core:target-lang' and 'core:target-lang-variant' properties added to the ParserOutput in this patch. Bug: T303235 Bug: T295187 Bug: T299973 Followup-To: Ic14b3a49a8ee7ed600485d4f8a363a206035a847 Followup-To: Ib273f88531c340b561072ee9f616aa60725091e6 Change-Id: Ie0f1d7b6daffc8ff47228f6f086a257518f72717	2022-03-09 00:08:57 -05:00
Arlo Breault	4f22d39828	Fix parserTest name Follow up to I206f1ccbfee1a601f3e5a4b52cb6acb5a6fbf113 Change-Id: I684e140b9a8d907f72f8e45ba2d7402dcb9d102d	2022-03-01 17:56:08 -05:00
Arlo Breault	1f381ce2cf	Sync up with Parsoid parserTests files This now aligns with Parsoid commit 3326011bc6ac9539eb197a22a22497347a4b2e35 Change-Id: Ifda86614572455d51e1855e5336c9ee613f842fd	2022-03-01 17:28:21 -05:00
Arlo Breault	350721cc2c	Add mw-file-description class on links to the file description page Matches Parsoid output. Bug: T292657 Depends-On: Iccee2dcbc7b06d80bcb4e026eedc11042585550b Change-Id: I206f1ccbfee1a601f3e5a4b52cb6acb5a6fbf113	2022-03-01 15:53:05 -05:00
Arlo Breault	30ae2b3a3b	Sync up with Parsoid parserTests files This now aligns with Parsoid commit 0a99d4ef38f1eba7637e6c0ddb6be434ccd5e72e Change-Id: I0251d7c56dfb5fa917ab3666a586217e90043bec	2022-02-19 09:58:09 -05:00
Arlo Breault	4ae0758db2	Revert "Add "resource" attribute to img tags" This reverts commit `5809ef7caa`. Bug: T292657 Bug: T297984 Depends-On: I261887a3b2d15130894b947d18a2e85537d50a1f Change-Id: Id4e8d16344ce0f420bfd3e0d5833c67d1cf85fd8	2022-02-18 20:14:20 -05:00
Tim Starling	80a22645f6	Allow parser tests to test the value of extension data and properties * Add "property" and "extension" options to parser tests * Slightly refactor the relevant code since it's getting big. * Slightly refactor the documentation too. Change-Id: Idc4ac4eb4e20d8e3e2fdbd093ff75f26d3af0d57	2022-01-24 12:46:34 +11:00
Arlo Breault	5254ddea2d	Sync up with Parsoid parserTests.txt This now aligns with Parsoid commit d9cbfe3ae95ead0111e935f214408ccb49aa12a6 Change-Id: Id5dd98d2bd54187a060ae7523de634747e5d7594	2022-01-07 15:58:30 -05:00
jenkins-bot	7d62f16f77	Merge "Add "resource" attribute to img tags"	2022-01-07 19:23:48 +00:00
Arlo Breault	20be2afb0a	Sync up with Parsoid parserTests.txt This now aligns with Parsoid commit 1b39be0995a201b4cc949bee32c5964053bdf77e Change-Id: I243885af834baab178cf4c4fa71de5f59e8609aa	2022-01-05 18:59:28 -05:00
Arlo Breault	5809ef7caa	Add "resource" attribute to img tags This is spec'd out at https://www.mediawiki.org/wiki/Specs/HTML#Media It's also useful in the bug to determine when the link is pointing at the resource, and hence MediaViewer should open. Previously that was distinguished with .image class on the link but that's now omitted in getDescLinkAttribs. FIXME: Should the "resource" contain querystrings? Maybe this needs to be done on the Parsoid side as well. Bug: T292657 Depends-On: Idb60e418f79dcb6a121de2a11e6e0ed0b31fd3ff Change-Id: Ia94138383ebdbfc2feef75fdf651b969085a72b1	2021-12-15 19:16:44 -05:00
Arlo Breault	7406194be4	Disable the legacy media dom on a few more tests Just newer and overlooked tests. All the media in those galleries are invalid and the gallery changes went in later in Iff2bdc3aa02f84f0bf4ca55d177706823934cc08. Change-Id: I6d03037af1b5c90e6d57fd048506da2b4e4bc704	2021-12-15 16:39:35 -05:00
C. Scott Ananian	4f60541f49	Sync up with Parsoid parserTests.txt This now aligns with Parsoid commit 819630e57c646038215a144fc03e6e9c29c12328 Change-Id: Ia814636d0d8550ebb6c76be6b4b7964b2b5ce105	2021-12-10 14:33:34 -05:00
Bartosz Dziewoński	dd4d1db814	TestRunner: Set local interwiki URLs to match wgServer, like in production Matching Parsoid change I6e7bdcdea6bc2fd955f0a04f25f09314ec1230c8. Change-Id: I6e7bdcdea6bc2fd955f0a04f25f09314ec1230c8	2021-12-07 16:20:26 -05:00
Reedy	a349d6b677	parserTexts.txt: Remove usages of "sanity" Bug: T254646 Change-Id: Iaf8a1df2a88a59e787c0a039c8c7becbd51dfcb5	2021-11-21 23:07:26 +00:00
Winston Sung	6eda8891a0	Update 台灣 to 臺灣 according to Wikipedia-zh village pump discussions https://zh.wikipedia.org/wiki/Wikipedia:互助客栈/其他/存档/2019年2月?oldid=61018059#「台灣」「正體」？ Follow-up of https://gerrit.wikimedia.org/r/c/mediawiki/core/+/700626 Change-Id: I6d2a128f682e71312400b97333ffbfffe9968ee7	2021-10-26 11:02:07 +00:00
Fomafix	e86f180bd4	Merge "Encode & to &amp; in displaytitle fallback"	2021-10-14 17:58:06 +00:00
Bartosz Dziewoński	3223981217	Move parser test with stray carriage return to extraParserTests.txt All of my favorite text editors corrupt this test case whenever I edit parserTests.txt. extraParserTests.txt contains other tests with weird characters that may get corrupted by normal text editors. (I had to use `vi` to make this patch, and I wouldn't wish this on anyone.) Change-Id: Id474469180fc284e3e28b55f65808be727507875	2021-10-14 00:49:58 +02:00
Fomafix	eed3121a8f	Encode & to &amp; in displaytitle fallback The value in the attribute displaytitle must contain valid HTML. The sanitizer of the {{DISPLAYTITLE}} parser ensures that only valid HTML is accepted. If there is no {{DISPLAYTITLE}} in the wikitext then displaytitle falls back to $title->getPrefixedText(). Here an HTML encoding of special characters is necessary. This affects only the replacement of & by &amp; because other special characters like < and > are not allowed in the title. This change affects the displaytitle fallback on the following places: * ApiParse * ApiQueryInfo * InfoAction * Parser The displaytitle fallback in OutputPage is also updated to this behavior although Sanitizer::normalizeCharReferences( Sanitizer::removeHTMLtags( $html ) also replaces & by &amp;. Also add test cases with & in the displaytitle to: * ApiParseTest * ApiQueryInfoTest * parserTests Bug: T291985 Change-Id: I8ee1e2731d9bfa49725d663b34986e7e3073e4ca	2021-10-05 18:09:15 +00:00
Subramanya Sastry	c417f6eb5f	Sync up with Parsoid (legacyMediaP|mediaP|p)arserTests.txt This now aligns with Parsoid commit 29f8e7051529ecbb62fc52bff6726a4df8bf20c2 Change-Id: I4f1be053aad137c974a18291ce018f9ce8fa8f82	2021-09-30 14:57:54 -05:00
Isabelle Hurbain-Palatin	1fd9493285	Sync up with Parsoid parserTests.txt This now aligns with Parsoid commit 9cf6f53f8adf52e92ae6fd0dc6fc6505ab6fce1f Change-Id: Ib35dc9decc2acda0d244bd0ec7ea983867903b4e	2021-09-29 15:29:00 +02:00
Arlo Breault	9c854614a3	Sync up with Parsoid parserTests.txt This now aligns with Parsoid commit 356629d62ad930d67798c54aa8c11f45f328d030 Change-Id: I5179a5f6fc4fdb221f1b4fd92fe0bfb3fa4442e5	2021-06-25 10:55:22 -04:00
Arlo Breault	fdd8f864b8	Emit media structure as piloted in Parsoid Gated behind the flag $wgParserEnableLegacyMediaDOM. The scattershot usage of it is a little unfortunate but isn't expected to live very long so maybe that's acceptable. Further details can be found at, https://www.mediawiki.org/wiki/Parsing/Media_structure Bug: T51097 Bug: T266148 Bug: T271129 Change-Id: I978187f9f6e9e0a105521ab3e26821e36a96b911	2021-06-24 23:32:40 +00:00
Arlo Breault	c32e539bcd	Sync up with Parsoid parserTests.txt This now aligns with Parsoid commit 760eb7ea841efff29a9e740662985c330501601b Change-Id: I06928d461e2948db2b23806e64adb2de4ef2c724	2021-04-26 15:09:39 -04:00
jenkins-bot	ee3e2a572d	Merge "Don't p-wrap <aside> tags in extension HTML"	2021-04-26 18:50:46 +00:00
Amir Aharoni	c8caf26ffd	Remove RLM/LRM from Names.php This character is no longer required here. It was added to ensure correct display of parentheses in mixed LTR/RTL environment, for example an interlanguage link from an RTL wiki to an LTR language with parentheses in its name. However, the Unicode bidirectional algorithm was updated to handle parentheses more cleverly and automatically, making manual adjustment with RTL/LRM unnecessary. This update was implemented years ago in all browsers and operating systems. I've tested this in Firefox, Chrome, Edge, and Internet Explorer 11, and it works correctly without the RLM/LRM characters. Parser tests are updated accordingly. Bug: T280435 Change-Id: I63107f623ade3b8367eae579a8e96d7e2c18b747	2021-04-22 08:27:41 +00:00
Máté Szabó	377c53ae51	Don't p-wrap <aside> tags in extension HTML Our PortableInfobox extension uses the HTML5 <aside> tag in its generated HTML. This tag isn't recognized as a block element (in the way e.g. <div> is) by the legacy parser, resulting in some spurious empty paragraphs in the output. As a fix, make the legacy parser aware of <aside> tags to avoid unnecessary p-wrapping. Also add <aside> to the Sanitizer's internal attribute check. I3e57f55ac69d2c1ee8a1d41c21b692e56fc7e628 takes care of updating Parsoid-PHP accordingly. Bug: T278565 Change-Id: I89dbdf7770e13e1b62320228a366c64e64217b0b	2021-04-06 16:26:12 +02:00
Umherirrender	dc7cfa0434	Add parser test for {{safesubst:self}} This fails without the follow-up patch with the same exception as on the task Bug: T276476 Follow-Up: I014da3a333f8ee6ca623b98c415b8d9f9d1be084 Change-Id: Ib61e9ea44a6fdc31e10b89c3504cecec5b9fd208	2021-04-04 22:22:29 +02:00
James D. Forrester	7c74fc35e2	parserTests: Avoid problematic language in comments Bug: T277986 Change-Id: I1e079d670ecfb5338223a26df507427b45e28121	2021-03-28 21:23:37 -07:00
jenkins-bot	800e1f8cea	Merge "Don't worry about something before when armoring french spaces"	2021-03-02 01:28:03 +00:00
Arlo Breault	6222a1aee8	Don't worry about something before when armoring french spaces We lost some insight in `c44a395` because we're no longer analysing the entire dom as a serialized string, but instead running our regexp on individual text nodes. This patch as written here just allows for the space to be at the start of the text node. However, some git spelunking shows that in `9dc65ef`, the condition for there being a non-whitespace character previous to the space was only because armoring French spacing happened before doBlockLevels and wanted to protect indent pre's. That's certainly not the case anymore, so we can probably get away with dropping the condition altogether now. Bug: T275918 Change-Id: I654a09b0f98937379b9fad3f325134ead7f2d8a6	2021-03-01 11:52:27 -05:00
Arlo Breault	ed543be03b	Sync up with Parsoid parserTests.txt This now aligns with Parsoid commit 241a08fb80cd5b4b16146eb99054b25c0261998c Change-Id: I8d176b76891e10de096247f6ad3ed52ec6f5735e	2021-02-18 11:23:19 -05:00
Arlo Breault	c44a3958a3	Don't apply French spacing in raw text elements This also means we don't need to take special care for French spacing in attributes, since it's no longer applied there. Adds a test that captures this change. Note that the test "Nowiki and french spacing" wonders whether this escaping should be applied to nowiki content. Bug: T255007 Change-Id: Ic8965e81882d7cf024bdced437f684064a30ac86	2021-02-16 19:26:29 -05:00
jenkins-bot	7b2a853019	Merge "Parser test for Balinese language conversion"	2021-01-30 15:22:25 +00:00
Arlo Breault	21dfb00fa3	Sync up with Parsoid parserTests.txt This now aligns with Parsoid commit 4dd80737783737621bf1fc0e0b7e954f3d1bbf3c Change-Id: Ib780af2f1e71aa6df8369d17cebf66d3bc85686b	2021-01-29 17:28:43 -05:00
jenkins-bot	03e2d471c4	Merge "Rewrite <langconvert> to support BCP 47 tags"	2021-01-28 16:30:31 +00:00
Tim Starling	0384793a2e	Parser test for Balinese language conversion Bug: T263082 Change-Id: I0a51656c54fbd547a6283dd23a7ee571dfb43d08	2021-01-28 03:43:07 +00:00
jenkins-bot	e845067ab6	Merge "Adopt pipe trick with Arabic comma"	2021-01-16 03:29:36 +00:00
Arlo Breault	96d9eaa8c7	Sync up with Parsoid parserTests.txt This now aligns with Parsoid commit ebf0a41507ec09a17f247acd2fdbb72555cbf2af Change-Id: Ic3f59b93ae7b3132e1f410d0dfd35b1a4f6852be	2021-01-14 10:21:57 -05:00
David Kamholz	cdbd2e791d	Rewrite <langconvert> to support BCP 47 tags This validates langconvert's "from" and "to" arguments as valid BCP 47 tags. For example, it will accept "sr-Cyrl" and "sr-cyrl" and reject the non-standard internal MediaWiki code "sr-ec". I made the BCP 47 matching case insensitive as that seems to conform with how MediaWiki handles it elsewhere and case sensitive matching would probably be a headache for users. Bug: T271758 Change-Id: I9f765fe650279820d61c3a7e499ca99468df3d14	2021-01-13 19:00:47 -08:00
Arlo Breault	78e85ab9e5	Split out media parser tests Bug: T111604 Bug: T271129 Change-Id: I9893d11d50b8e5884239da2bb41262e093afc47f	2021-01-13 15:53:33 -05:00
Ebrahim Byagowi	9fe1d1f734	Adopt pipe trick with Arabic comma Currently MediaWiki turns `[[test, abc]]` to `[[test, abc|test]]` while saving the page but that comma isn't in use in Persian so this patch makes MediaWiki to treat Arabic comma the same way as regular comma. Change-Id: Ib8051023abc25b7c4f97a3f50246f35650057ec9	2021-01-11 21:43:33 +00:00
C. Scott Ananian	a41f284324	CoreTagHooks: First argument passed to parser tags can be null Document and enforce the correct type for the first argument to a Parser tag hook, which will be `null` if the tag is self-closed. Mark the methods in CoreTagHooks @internal. They are apparently unused outside MediaWiki core: https://codesearch.wmcloud.org/search/?q=CoreTagHooks&i=nope&files=&repos= Add coverage test cases to ensure that all tag hooks properly handle the `null` value of the first argument; prior to this patch the `<html>` tag emitted a broken strip tag in this case. The other hooks passed the null to other callees in violation of their type signatures, but eventually every other hook managed to safely cast the null to the empty string without throwing an exception or emitting a warning. For those, this patch does not change existing behavior---it just makes the cast to the empty string much more obvious to the reader. Change-Id: I69fde6c06eabb2db27bb1cc23d2cb19b99273391	2021-01-05 14:19:44 -05:00
Subramanya Sastry	94705b1e6a	Sync up with Parsoid parserTests.txt This now aligns with Parsoid commit b8c7ac91f5d4ec5860e23455e17a09d6c579b338 Change-Id: I80e93b2e22e10129e48a3a0312c46090e8d02551	2020-12-21 18:06:37 -06:00
C. Scott Ananian	727a77a19e	ParserTestRunner: add interwiki prefixes used by Parsoid tests Bug: T254181 Change-Id: Ia79992e8e44435746f8512b2f05408c560c80533	2020-12-21 16:53:43 -05:00
Arlo Breault	b12f5d8e20	Sync up with Parsoid parserTests.txt This now aligns with Parsoid commit 67180924cc1d78eed9b300b6f867498da51c35bc Change-Id: Icb1c8c3cc4e19db9fa5c93b62a6afadb9f6676dc	2020-12-18 12:30:25 -05:00
Arlo Breault	c2cef6cb58	Consistent label escaping in makeBrokenImageLinkObj Html::element is more lenient about which characters it escapes. But really this is just factored out of the next patch for ease of review. Change-Id: I9abb4d866a624df7bf4628ab9cc581967e715160	2020-12-18 11:41:09 -05:00
Arlo Breault	c203c574bd	Sync up with Parsoid parserTests.txt This now aligns with Parsoid commit c2952b434c1dc52d7c73154ca47bda19f2c2602f Change-Id: Ic878dd183592c0ace77e3e078c40df40e54b7eab	2020-12-17 13:18:44 -05:00
David Kamholz	a7ad0547bc	Implement <langconvert> tag The <langconvert> tag takes two attributes: from (language variant from) and to (language variant to). It returns the content of the tag converted using LanguageConverter. It returns an error if the attributes are not present, if the variants do not exist, or if the variants belong to different languages. Currently it does not work for IuConverter, because the variants use the code ike rather than iu, and ike isn't in the list of languages with converters available. This patchset reimplements from a parser function to a tag, and renames from transliterate to langconvert. Bug: T263082 Change-Id: Idc3a32c66d5a0466c63e7ce8753d2619354c30b0	2020-12-14 19:40:31 -08:00
Arlo Breault	acb40ea0d5	Sync up with Parsoid parserTests.txt This now aligns with Parsoid commit 010856ed7d4aeb9617ac264782809cc58d94fc47 Change-Id: Iae87302abe2c11deb36088100e50a638d58cffe6	2020-10-05 16:09:18 -04:00

757 commits