Commit graph

1590 commits

Author SHA1 Message Date
WMDE-Fisch
ad10a5c5f0 ParserTestRunner: Avoid json_decode deprecation warning
json_decode now emits a deprecation warning when called with null.
Previously it would just return null in these cases, so I introduced
a workaround that avoids calling the method in the first place.

Bug: T382590
Change-Id: I47b7aca331a405bb3d2865cc280ef3ced537f84b
(cherry picked from commit b2fad75337256aaabd6e892bdd4bea8f86b47d5c)
2024-12-26 23:49:11 +00:00
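
The guard described in the commit above can be sketched as follows. This is only a Python analogy: the actual change is in PHP and its code is not shown in this log, and `decode_or_none` is a hypothetical helper name.

```python
import json
from typing import Any, Optional


def decode_or_none(raw: Optional[str]) -> Any:
    """Decode a JSON string, skipping the decoder entirely when there is no input."""
    # Hypothetical helper: return early instead of handing None to the decoder.
    if raw is None:
        return None
    return json.loads(raw)


print(decode_or_none(None))
print(decode_or_none('{"a": 1}'))
```

The point of the early return is that the decoder is never called with a null value, so no warning (or, in Python, no `TypeError`) can be triggered.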
Reedy
7a1f0dff6d tests: Use namespaced ParserOptions
Change-Id: Id7b04b61d22ab6ef8980897f1f2e2eb3eee4e619
2024-10-16 01:35:06 +01:00
jenkins-bot
23ec5ff94e Merge "Add namespace to maintenance/includes classes" 2024-10-15 22:50:33 +00:00
jenkins-bot
26a696ecdd Merge "ParserTestRunner: consistent normalization of "known failure" output" 2024-10-11 22:12:54 +00:00
James D. Forrester
9f02d18eac Add namespace to maintenance/includes classes
Also a few other fixes of PHP class aliases spotted by phan.

Bug: T353458
Change-Id: Ie79d65722c47c24f8f20f1293355cfd3c2e8c2ad
2024-10-09 11:02:09 -04:00
thiemowmde
b1c9ec74fa Remove meaningless @var documentation from constants
A constant is not a variable. The type is hard-coded via the value
and can never change. While the extra @var probably doesn't hurt much,
it's redundant and error-prone and can't provide any additional
information.

Change-Id: Iee1f36a1905d9b9c6b26d0684b7848571f0c1733
2024-10-09 09:33:12 +02:00
C. Scott Ananian
1c3078b96c ParserTestRunner: consistent normalization of "known failure" output
Improve the output when a parser test fails because the "known failure"
output differs from the actual output, and use the same normalization
function which Parsoid's test runner does for this comparison.

Previously it would display the actual/expected output from the test,
which had no relationship to the "real" reason the test was failing.

Depends-On: I7ff5b27415d98e45d1364161ed6cdaac2d156a81
Change-Id: I0d56b60abb78e37d539267f744afb52c092cb997
2024-10-08 17:59:21 -04:00
jenkins-bot
8f8677e2c0 Merge "Use a better bidi aware markup in CommentParser" 2024-10-04 12:21:17 +00:00
Ebrahim Byagowi
efda4cae32 Use a better bidi aware markup in CommentParser
As noted in the comments, this needed markup that works better
in bidi scenarios. As part of replacing bidi control codes
with HTML markup, I was able to test different bidi scenarios
using <bdi> HTML tags.

Bug: T375975
Change-Id: If2af751fc9f78869acf7b7e93199fa927de2cc19
2024-10-04 10:50:02 +03:30
James D. Forrester
9203493606 Add namespace to remaining parts of Wikimedia\FileBackend
Bug: T353458
Change-Id: I49c843c9d8f6459c0fbf774afeea7a82fa564b59
2024-10-03 16:21:22 +00:00
jenkins-bot
f4dc788b5c Merge "Allow localized gallery widths; avoid spurious "double px" tracking category" 2024-10-02 21:39:40 +00:00
C. Scott Ananian
714a7146d6 Sync up core repo with Parsoid
This now aligns with Parsoid commit b19f73d7beadedcb6991640aac7eb7d6e7aec8f5

Change-Id: Ief91b25769f777169af65c9720faa767850f6239
2024-10-02 10:43:47 -04:00
jenkins-bot
315de0e434 Merge "Deduplicate language links in ParserOutput and OutputPage" 2024-09-27 22:43:43 +00:00
James D. Forrester
9e5c1e8ac7 Add namespace to IDBAccessObject and DBAccessObjectUtils
Bug: T353458
Change-Id: I23cf7991f8792d4d000d1780463d8ce76dc0aee0
2024-09-27 16:19:10 -04:00
C. Scott Ananian
7495f9bc15 Deduplicate language links in ParserOutput and OutputPage
Move deduplication of language links out of Parser.php and into the
ParserOutput in order to be compatible with alternate Parsers (Parsoid).
Clean up various inconsistencies: ensure deduplication also happens in
OutputPage when multiple ParserOutputs are merged into the final output,
and ensure that the deduplication in LinksUpdate is done in the same
order (first link prevails) as in Parser/ParserOutput/OutputPage.

Deprecate OutputPage::setLanguageLinks() (the matching
ParserOutput::setLanguageLinks() was deprecated in 1.42).

As a breaking change, return an array, not an array *reference*, from
ParserOutput::getLanguageLinks().  This allows us to safely modify the
internal representation of language links. As far as I can tell, no one
used the returned reference to sneakily modify the list of language
links, and there is not a good way to have deprecated this before making
the breaking change.

While we're at it, we've added tests to ensure that language link
fragments are preserved.

Bug: T26502
Bug: T358950
Bug: T375005
Change-Id: I82a05a51d94782ebb9fa87ff889ca0f633b3e15c
2024-09-26 15:28:49 -04:00
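
The "first link prevails" rule from the commit above might look roughly like the sketch below. It is an illustration in Python, not the MediaWiki implementation: it assumes links are plain `prefix:Title#fragment` strings and that deduplication is keyed on the part before the first colon. `dedupe_language_links` is a hypothetical name.

```python
def dedupe_language_links(links):
    """Keep the first link seen for each language prefix, preserving order."""
    seen = {}
    for link in links:
        # Assumption: the language prefix is everything before the first colon.
        prefix, _, _target = link.partition(":")
        # setdefault leaves an existing entry alone, so the first link prevails.
        seen.setdefault(prefix, link)
    return list(seen.values())


print(dedupe_language_links(["en:Foo", "fr:Bar#Section", "en:Baz"]))
```

Because each link is stored whole, a fragment such as `#Section` survives deduplication unchanged.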
C. Scott Ananian
25b27ce309 Sync up core repo with Parsoid
This now aligns with Parsoid commit fc9ab0949952d5e784acb012096860f5c8663fc7

Change-Id: I5d72f551c75de80b0834ea98d8a1d3cb5852e866
2024-09-26 13:04:36 -04:00
C. Scott Ananian
ec4e4648dd Sync up core repo with Parsoid
This now aligns with Parsoid commit dea42dd799d9c40fb7fedb42122ec264d6ef6ded

Change-Id: I4b2614ce3a83bfea0af53927464e7fbde6a92df9
2024-09-24 12:36:03 -04:00
Umherirrender
fdac97eaf3 Pass function name to IDatabase functions
Change-Id: Ie2a1e5052e5b61bbb5b89905de942f47d3f1413d
2024-09-19 21:02:52 +02:00
C. Scott Ananian
25da911334 Parser tests: add additional options to test ParserOutput metadata
New options added: `iwl`, `links`, `special`, `extlinks`, and `templates`,
and handling of existing `ill` option tweaked to be consistent.

Added some tests to exercise these options, focusing on the handling
of title fragments.  Attempted to make the output formatting consistent
among options; a future unification (I32df68714ffdf2f0745b974f47bc3ccceef1f41c)
should help DRY these out further.

Bug: T310512
Change-Id: Ic9c766ae4362969de124ad9d66eb47cfa68395c6
2024-09-13 14:42:27 -04:00
Yiannis Giannelos
0509dbebad Sync up core repo with Parsoid
This now aligns with Parsoid commit 80bc41a395b19221e7f26b36dfbe0ab15a025819

Change-Id: Iec571f78e7a55991aea69ede2519803b84c05936
2024-09-12 18:58:43 +03:00
C. Scott Ananian
95cfe68e3b Allow localized gallery widths; avoid spurious "double px" tracking category
The `widths` and `heights` attributes of the <gallery> tag weren't being
properly localized with the `img_width` magic word, which meant that the
trailing 'px' wasn't stripped, causing it to trigger the "double-px"
tracking category when it shouldn't.

Bug: T374311
Change-Id: I538bc0975f858f62cdd20619fc6f337abb9698eb
2024-09-11 10:33:55 -04:00
James D. Forrester
2b11d61577 Migrate all uses of deprecated URL global functions to use wfGetUrlUtils()
wfGetUrlUtils() is also deprecated, but less so, so we can do this first
and then properly replace the individual uses with dependency injection
in local pieces of work.

Also:
* Switching Parser::getExternalLinkRel to UrlUtils::matchesDomainList
  exposed a type error in media.txt where $wgNoFollowDomainExceptions
  was set to a string (which is invalid) instead of an array.

Bug: T319340
Change-Id: Icb512d7241954ee155b64c57f3782b86acfd9a4c
2024-09-10 16:50:02 -07:00
C. Scott Ananian
7249c4c982 parserTests.txt: Update documentation about cat/ill options
Parsoid does support these options now.

Change-Id: I9caedd10b8f7229602ad4f963275b62777aca104
2024-09-10 19:30:07 +00:00
dvorapa
10ab0e40a9 parser: Add a new {{USERLANGUAGE}} magic word for use in wikitext
Depending on configuration, this returns either the interface language
code of the current user or the current page language.

Bug: T4085
Change-Id: Iab7fda272ec81af88c74612727ff6bed014d4a81
2024-09-07 19:16:32 +00:00
C. Scott Ananian
0450b5e4d5 Add double-px-category tracking category for deprecated image size syntax
For decades MediaWiki has allowed "extra" px modifiers in image size
specifications, for example `100pxpx`.  It has been suggested since at least
2008 (T15500#174968) that this behavior should be deprecated.  This is
not localized, so (for example) on eowiki we allow `100rapx` as well (!).

As one small step toward eventually removing this weird corner-case behavior,
add a tracking category whenever it is used on wiki.

In the process, emit deprecation warnings for
ImageGalleryBase::setWidths() or ::setHeights() if called without
ImageGalleryBase::setParser() having been set.  The ::setParser() method
already includes in its documentation that "If you do not set this and
the output of this gallery ends up in parser cache, the javascript will
break!", so please set the parser appropriately.

Bug: T15436
Bug: T15500
Bug: T372935
Change-Id: If86d949189a7d105595404d21447477499873b03
2024-08-29 17:54:38 -04:00
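
To make the "double px" case above concrete, here is a minimal Python sketch of detecting an extra `px` suffix. It is a simplification under stated assumptions: it handles only the English `px` suffix and a bare width (no `100x200px` form, no localized suffixes), and `parse_width` is a hypothetical name, not a MediaWiki function.

```python
import re

# Assumed simplified grammar: digits followed by one or more "px" suffixes.
_WIDTH_RE = re.compile(r"^(\d+)((?:px)+)$")


def parse_width(spec):
    """Return (width, has_extra_px), or None if spec is not a width."""
    match = _WIDTH_RE.match(spec.strip())
    if match is None:
        return None
    width = int(match.group(1))
    # More than one "px" means the deprecated syntax was used.
    has_extra_px = len(match.group(2)) > len("px")
    return width, has_extra_px


print(parse_width("100px"))
print(parse_width("100pxpx"))
```

In this sketch, a caller would add the tracking category only when the second element of the result is true.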
Isabelle Hurbain-Palatin
a3cf629d2f Remove ParserOutput::getText() calls from core (direct pipeline)
This is the second patch of a series of patches to remove
ParserOutput::getText() calls from core. This series of patches should
be functionally equivalent to I2b4bcddb234f10fd8592570cb0496adf3271328e.

This patch replaces the calls to getText where the legacy parser is
called directly by creating a pipeline and invoking it on the generated output.
These should probably eventually use the Content framework to generate
output instead of using Parser directly (T371008), which will also allow
them to transparently support Parsoid.

Bug: T293512
Change-Id: I45951a49e57a8031887ee6e4546335141d231c18
2024-08-23 18:15:00 +02:00
Ebrahim Byagowi
697e19e461 Add MediaWiki\Registration namespace to registration classes
Bug: T353458
Change-Id: Ifa3b6a6e0353bb4ce21a3f4456f1fc696c8d377c
2024-08-10 10:08:22 +00:00
Ebrahim Byagowi
4c270a72ac Add namespace to WikitextContent
This adds the MediaWiki\Content namespace to WikitextContent
and two related classes.

Change-Id: Ib74e4c5b3edac6aa0e35d3b2093ce1d0b794cb6d
2024-08-06 17:42:51 +03:30
jenkins-bot
512c78b8ea Merge "Make {{#language}} consistent with {{#dir}} and {{#bcp47}}" 2024-07-31 11:42:16 +00:00
jenkins-bot
52a10a36b1 Merge "Add {{#bcp47}} parser function" 2024-07-31 11:42:08 +00:00
jenkins-bot
f338ac3295 Merge "Add {{#dir}} parser function" 2024-07-30 20:34:27 +00:00
C. Scott Ananian
450fe7fcd8 Make {{#language}} consistent with {{#dir}} and {{#bcp47}}
Add the same no-arg options for language code that
{{#dir}} and {{#bcp47}} have, for consistency:
* `{{#language}}` will return the name of the *target language*
  (for articles, the content language; for messages, the user language)

The default value for the "in language" argument should be the autonym.
This was working previously but only via a baroque code flow path for
invalid language codes.  Make this a bit clearer and add tests.

Since non-autonym language code translations are added via the
[[Extension:CLDR]] in production, hook LanguageGetTranslatedLanguageNames
in the ParserTestRunner to ensure that we can test this.

Followup-To: Ice1c671c5b3cc077d2bb80ea5dc25c5eabbfeb36
Followup-To: I19c3e91a924e080f37dc95a0d4e61493583b533e
Change-Id: Ibf6e7f194cc056eadb48a5ad8e6d01a761d9351c
2024-07-30 20:27:17 +00:00
C. Scott Ananian
416c33bb6a Add {{#bcp47}} parser function
Template:Bcp47 is one of the most used templates in Wikimedia Commons.
Providing its functionality as a parser function, tied to MediaWiki's
language-handling code, reduces code duplication and will allow us to
reduce template usage on Commons.

As with the {{#dir}} parser function, support one special case:

* `{{#bcp47}}` will return the BCP-47 code of the *target language*
  (for articles, the content language; for messages, the user language)

Note the following slight differences from [[Template:BCP47]] on Commons,
documented in an added parser test:

* 'simple' maps to 'en-simple' (not just 'en')
* 'roa-tara' maps to 'nap-x-tara' (not 'it-x-tara')

Bug: T366623
Change-Id: Ice1c671c5b3cc077d2bb80ea5dc25c5eabbfeb36
2024-07-30 20:27:03 +00:00
jenkins-bot
cf36ccf3c4 Merge "ParserTestRunner: add timezone and user language options" 2024-07-23 13:34:02 +00:00
Ebrahim Byagowi
e1385d3bdf Add {{#dir}} parser function
Template:Dir is one of the most used templates in Wikimedia Commons.
This tries to provide parts of its functionality in the hope that we can
eventually simplify or get rid of the template, for clarity and
performance reasons.

As a convenience, `{{#dir}}` and `{{#dir:}}` are synonyms for
`{{#dir:{{PAGELANGUAGE}}}}`: they return the direction of the target
language.  For articles, the target language is the content language;
for messages, the target language is the user language.

In addition, to avoid confusion between BCP-47 language codes and
MediaWiki-internal language codes, an optional second parameter can be
supplied.  If the second parameter is the (localizable) string
'bcp47', the language code given in the first parameter will be
treated as a BCP-47 code.  For example: `{{#dir:sr-Cyrl|bcp47}}`.

(See LanguageCode::bcp47ToInternal() for a description of the
differences and overlaps between MediaWiki internal and BCP-47
codes.  These overlaps *so far* don't result in any case where the
returned direction would differ, but we are encouraging editors to be
precise about which set of enumerated string values they are using,
for consistency with other language-related functions and because
MediaWiki internally differentiates between BCP-47 codes and internal
codes.)

Bug: T359761
Change-Id: I19c3e91a924e080f37dc95a0d4e61493583b533e
2024-07-19 16:57:48 -04:00
Umherirrender
cfe48fc3ef Use expression builder to avoid IDatabase::addQuotes
Bug: T361023
Change-Id: Ic28b548947894921134cbec0a7347e7e27e2aaf2
2024-07-18 18:44:34 +00:00
Tim Starling
ebf3c9be86 ParserTestRunner: add timezone and user language options
* Add wgLocaltimezone to the list of global variables which may be set
  in parser test options.
* Add userLanguage option, which is passed through to ParserOptions.

Bug: T223772
Change-Id: I8498527c276288feae854868a8f4b1f3205a49e8
2024-07-12 11:35:33 +10:00
Arlo Breault
f3e9477465 Sync up core repo with Parsoid
This now aligns with Parsoid commit 3e10bb7a56619fe3881ea7c759ada21b96dc592e

Change-Id: Iff282e927bdcdb1feef00ba630e33c54c63bec5f
2024-07-10 14:41:17 -04:00
Lucas Werkmeister
d19f2543c1 Update expected test output after i18n change
The Arabic translation of red-link-title was modified again in change
Id00b720194 (commit 30c622c091), requiring another fix mirroring change
I8f2930802a (commit 6cbd9e5263).

Test with:

    composer phpunit -- --testsuite parsertests --filter=T236183

Bug: T369694
Change-Id: I19090fe523e0a5c17bc1c30ee31edce24f541e6b
Follows-Up: Id00b720194d3a715050cbce55e40ca11b34212ce
2024-07-10 11:38:39 +02:00
Arlo Breault
b8258c5c15 Sync up core repo with Parsoid
This now aligns with Parsoid commit fefcac4e949707536530828366d74f06aac88861

Change-Id: I6bf0be053982454d7ccba7a03268440644813cf2
2024-06-27 11:09:10 -04:00
jenkins-bot
1dba954b3f Merge "Remove image and gallery image caption trimming" 2024-06-27 13:54:58 +00:00
jenkins-bot
f412e1b3b8 Merge "Fix mw-selflink-fragment on variant fragment links" 2024-06-27 10:42:44 +00:00
Arlo Breault
6b05fa3a21 Remove image and gallery image caption trimming
Post I5039c7ef9e07199c256fd568b4f94714e5831d17, gallery image captions
are no longer placed on new lines, so the presence of leading whitespace
shouldn't be significant.

This fixes an inconsistency in gallery image caption trimming, where
only the first and last option had start and end trimming, respectively.

It also matches Parsoid output, where no trimming takes place, as seen
in the updated tests.

Change-Id: I2a80198c43598dc8c7fa61cb4b0340a97d2ee895
2024-06-26 21:51:40 -04:00
Lucas Werkmeister
6cbd9e5263 Update expected test output after i18n change
The Arabic translation of red-link-title was modified in change
I47769df5dc (commit dfd748033e).

Bug: T368383
Change-Id: I8f2930802a6b161ba44205d6b6d114d223be3cde
Follows-Up: I47769df5dc91ad12d817e117fe931b1452e8b2ef
2024-06-25 14:59:14 +00:00
Arlo Breault
c356dfed72 Fix mw-selflink-fragment on variant fragment links
Should have been part of 1fca3b5b.

The fix to doVariants can be seen in old output linking [[Dуна#Foo]] to
Дуна despite [[Dуна]] being a self-link in the test above.

Bug: T198652
Change-Id: Id38cfc47041492c5cc68b4f8f9566f421c9168bd
2024-06-19 08:50:46 +00:00
Arlo Breault
867b158c51 Fix test about applying mw-selflink-fragment to variant links
A byte sequence (d0 b0) was presumably erroneously introduced in the
link [[Duna#Foo]] which prevented it from being recognized as the same
target as the title associated with the current test.  An article (Дуна)
that's a variant of Duna exists, and variant conversion determines that
the link points at that page instead.  Removing those bytes updates the
output to add the mw-selflink-fragment class.

The test itself is updated to reflect that, as of 1fca3b5b, fragment
links are selflinks and a FIXME is added for a follow-up to correct the
variant fragment links to add the class.

Change-Id: Iee0326f85c919a672397d0378f3549f583c17e28
2024-06-18 15:31:35 -04:00
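
The stray-byte problem described in the commit above is easy to reproduce. The following Python sketch assumes, for illustration only, that the affected character was the final "a" of the link target; `d0 b0` is the UTF-8 encoding of CYRILLIC SMALL LETTER A (U+0430), which looks identical to the Latin letter.

```python
latin = "Duna"
mixed = "Dun\u0430"  # assumed: last letter is CYRILLIC SMALL LETTER A (U+0430)

# The two strings render alike but are different titles.
print(latin == mixed)
print(latin.encode("utf-8").hex(" "))
print(mixed.encode("utf-8").hex(" "))
```

Since the two strings compare unequal, a link written with the Cyrillic letter is not recognized as pointing at the current title.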
Arlo Breault
adf0beeaab Sync up core repo with Parsoid
This now aligns with Parsoid commit af12018f9905bceeb7fbf201f8685bcc39d8cdf4

A lot of langParserTests were split out of parserTests in
Idcce741402233fb4768ba06868f09bff0397172a but langParserTests.txt wasn't
being sync'd to the core repo.

Change-Id: Icff4a70fba4bf154f7438d354d17972173c4469c
2024-06-17 16:03:28 -04:00
jenkins-bot
45c105ec46 Merge "Parser: Avoid extra escaping in replaceTableOfContentsMarker" 2024-06-12 19:51:18 +00:00
vahurzpu
fbba3bb2cf Parser: Avoid extra escaping in replaceTableOfContentsMarker
I60fdfc2c52 changed replaceTableOfContentsMarker from using
preg_replace, which supports backreferences in the replacement, and
thus expects literal backslashes and dollar signs to be escaped,
to using preg_replace_callback, which does not expect any escaping.
This caused unwanted backslashes in headings. This patch removes the
escaping.

Bug: T365413
Change-Id: Idbdc3074c7ad007627c4c259a1aaf090a5d0c7f9
2024-06-12 06:10:58 -04:00
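
The escaping mismatch fixed in the commit above has a close analogue in Python's `re.sub`, sketched below. A replacement *string* is treated as a template in which backslashes are special, whereas the return value of a replacement *callback* is inserted literally. The marker text and helper name are hypothetical; this is not the PHP code from the patch.

```python
import re

MARKER = "<toc-marker/>"  # hypothetical placeholder, not MediaWiki's real marker
text = "x" + MARKER + "y"
heading = r"Back\slash"


def escape_for_template(s):
    """Escape backslashes so a replacement *string* yields them literally."""
    return s.replace("\\", "\\\\")


# String replacement: the template is interpreted, so escaping is required.
via_string = re.sub(re.escape(MARKER), escape_for_template(heading), text)

# Callback replacement: the return value is used as-is, so no escaping.
via_callback = re.sub(re.escape(MARKER), lambda m: heading, text)

# The bug pattern: keeping the escaping after switching to a callback.
buggy = re.sub(re.escape(MARKER), lambda m: escape_for_template(heading), text)

print(via_string == via_callback)
print(buggy)
```

The `buggy` result contains a doubled backslash, which mirrors the unwanted backslashes in headings that the commit describes.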
C. Scott Ananian
c8e77a3707 Sync up core repo with Parsoid
This now aligns with Parsoid commit 2508e24a2aeb54b55eb54f7f65bedc4d477fc9cf

Change-Id: Ibb9f1c6287c6ec3e982f0fa3ddf908b01484973a
2024-06-10 23:29:02 -04:00