Commit graph

66 commits

Author SHA1 Message Date
thiemowmde
52963bbcc0 tests: Make use of ?? and ??= operators in test code
I believe the more recent syntax is quite a bit more readable. The
most obvious benefit is that it allows for much less duplication.

Note this patch is intentionally only touching tests, so it can't
have any effect on production code.

Change-Id: Ibdde9e37edf80f0d3bb3cb9056bee5f7df8010ee
2024-08-08 15:51:20 +02:00
Yiannis Giannelos
90bac43f11 Extract StatsFactory methods in parsoid SiteConfig
* It's not very clean to import Wikimedia\Stats in Parsoid
  * MediaWiki depends on Parsoid
* As a workaround we can extract the 2 methods we need in SiteConfig

Bug: T354908
Change-Id: I696131cfba6ccc26ae1f705f216e221a7c3db175
2024-07-10 18:01:56 +02:00
Isabelle Hurbain-Palatin
f65d1c44d0 Make $headers['content-language'] a string instead of Bcp47Code
Page bundle headers should not contain objects, as they are supposed
to represent plaintext HTTP headers.

Change-Id: I2a87a8233b9e42cbafdba63bdf513abe00d826ce
2024-06-11 11:08:34 +02:00
C. Scott Ananian
a565e388f9 Move ParsoidOutputAccess::supportsContentModel() into Parsoid SiteConfig
The `supportsContentModel` method is really querying Parsoid for the
set of content models it supports, so it makes sense to put it in the
Parsoid-specific SiteConfig service.

This is part of the work to deprecate and remove ParsoidOutputAccess.

Change-Id: I81eb2df8cef93ede95361a4e03185b3d58e5b84b
2024-05-22 10:57:37 -04:00
thiemowmde
52ddf3e8ce Remove all @package comments
I don't think these do anything with the documentation generators
we currently use. Especially not in tests. How are tests part of a
"package" when the code is not?

Note how most of these are simply identical to the namespace. They
are most probably auto-generated by some IDEs but don't actually
mean anything.

Change-Id: I771b5f2041a8e3b077865c79cbebddbe028543d1
2024-05-10 13:53:15 +02:00
jenkins-bot
b6df1fd6f7 Merge "Re-enable test after bumping Parsoid"
2024-04-22 04:34:20 +00:00
jenkins-bot
76a7f4aefd Merge "Skip test to bump Parsoid version"
2024-04-22 03:30:13 +00:00
Arlo Breault
9731a015f5 Re-enable test after bumping Parsoid
Follows-Up: I10b77b800dd23f00707011f545817182d3cb58b7
Change-Id: Id1b684876b6fbcafc96e4ae35cd9712720bad1c9
2024-04-19 20:11:26 -04:00
Arlo Breault
0ab4f85ed2 Skip test to bump Parsoid version
The method was moved / renamed.

Needed-By: I441699e7fe9827a5e06e4638ce88c685deb9b856
Change-Id: I10b77b800dd23f00707011f545817182d3cb58b7
2024-04-19 20:10:42 -04:00
Umherirrender
8d97313f81 Fix some line indent
Change-Id: I8f82724197d20f9289d80e138d80310f1eab29f2
2024-04-20 00:25:15 +02:00
jenkins-bot
cf35b37992 Merge "HtmlOutputRendererHelper: fall back to page language"
2024-03-13 15:57:24 +00:00
jenkins-bot
c3cc71b430 Merge "test: Add PHPUnit tests for ParsoidParserFactory"
2024-03-13 08:49:12 +00:00
Doğu Abaris
8a1eae0684 test: Add PHPUnit tests for PageContent
Covered:
- Constructor initialization with correct dependencies.
- Retrieve roles assigned to page content.
- Check if the specified role exists in the page content slots.
- Retrieve model name for specified role in page content
- Handle exception for non-existent role when retrieving model
- Retrieve content format for specified role in page content
- Retrieve serialized content for specified role in page content
- Handle exception for non-existent role when retrieving content

Change-Id: Ia2129e37b15bb8c09c0b26e487a9e311e66b932f
2024-03-08 14:56:00 +00:00
daniel
e7f21f6e64 HtmlOutputRendererHelper: fall back to page language
HtmlOutputRendererHelper should not crash hard if the ParserOutput has
no language set. ParserOutput may come from a variety of places, we
should be lenient about it not having a language.

However, we should try harder to actually set a language on ParserOutput
if we have one available. So this also updates
PageBundleParserOutputConverter to keep the ParserOutput's language in
sync with the language header in the PageBundle.

Bug: T349868
Bug: T353689
Bug: T359426
Change-Id: I2edf20dc3b199e22cda2f32bc858c21ca7d8f4bd
2024-03-06 17:18:16 +00:00
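
The fallback described in e7f21f6e64 can be sketched roughly as follows. This is a minimal illustration, not the MediaWiki code: `ParserOutputSketch`, `effective_language`, and `sync_language_from_header` are hypothetical names, and language codes are modelled as plain strings.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ParserOutputSketch:
    """Hypothetical stand-in for a ParserOutput; only the language is modelled."""
    language: Optional[str] = None


def effective_language(output: ParserOutputSketch, page_language: str) -> str:
    # Be lenient: use the page language when the output carries none.
    return output.language if output.language is not None else page_language


def sync_language_from_header(output: ParserOutputSketch, headers: dict) -> None:
    # Keep the output's language in step with the bundle's language header, if any.
    header_value = headers.get("content-language")
    if header_value is not None:
        output.language = header_value
```

The first function covers the "do not crash" half of the message; the second covers the "try harder to set a language" half.
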
James D. Forrester
fe1fbb3a5c build: Upgrade mediawiki/mediawiki-codesniffer to v43.0.0
Depends-On: I5349d3378b5acd04f0d7c60072a9b1e3dd8f2052
Change-Id: I3b7fd4c460418e72ed0c36febef75f41bad0afb1
2024-03-01 15:58:13 -05:00
jenkins-bot
a62f5c7911 Merge "[ParserOutput] Rename $mText to $mRawText and ::setText() to ::setRawText()"
2024-02-21 17:11:00 +00:00
C. Scott Ananian
72c4945a72 [ParserOutput] Rename $mText to $mRawText and ::setText() to ::setRawText()
ParserOutput::getText() is not a simple getter, but does
transformations on the "text" of the ParserOutput; the simple getter
is named ::getRawText().

To maintain consistency, rename ParserOutput::setText() to
::setRawText() and the property name ParserOutput::$mText to
::$mRawText so future readers are not confused.

The JSON property name as it appears in the serialized ParserCache
is left as 'Text' so that we don't have any forward- or backward-
rollback issues.

Change-Id: I3ef34814ab9473cc70d0a6806e8c5a4a02b73491
2024-02-20 17:13:28 +00:00
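
The idea in 72c4945a72 of renaming a property while leaving its serialized name untouched can be sketched like this. The class and method names are hypothetical and the JSON layout is reduced to a single key; only the key name 'Text' comes from the commit message.

```python
import json


class RawTextHolder:
    """Hypothetical class whose field was renamed from `text` to `raw_text`."""

    def __init__(self, raw_text: str = "") -> None:
        self.raw_text = raw_text

    def to_json(self) -> str:
        # The serialized key keeps its old name, 'Text', so payloads written
        # before and after the rename stay mutually readable.
        return json.dumps({"Text": self.raw_text})

    @classmethod
    def from_json(cls, payload: str) -> "RawTextHolder":
        return cls(json.loads(payload)["Text"])
```

Because only the in-memory name changes, a cache entry written by older code can still be read after a deploy, and entries written by newer code survive a rollback.
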
Doğu Abaris
e8a13d0266 test: Add PHPUnit tests for ParsoidParserFactory
Covered:
- `testCreate`: Test the create method to create a new Parsoid parser.

Change-Id: I8aba66397e3beae5ddb765398a4ff83a606f4076
2024-02-18 21:18:08 +00:00
C. Scott Ananian
19ae795ac2 [Parsoid\Config\SiteConfig] enable Parsoid support for disabling magic links
Bug: T145590
Change-Id: Ic35c964e1ae224ca6985ddc01ad9eda5671fb7b6
2024-02-17 01:57:42 +00:00
Reedy
85396a9c99 tests: Fix @covers and @coversDefaultClass to have leading \
Change-Id: I5629f91387f2ac453ee4341bfe4bba310bd52f03
2024-02-16 22:43:56 +00:00
Reedy
e94e265a93 tests: Add Tests to PHP namespacing
Change-Id: I849268172751d50292e93aa75abe8094873f56bc
2024-02-16 19:10:11 +00:00
Subramanya Sastry
e55cc517da Move Parser to MediaWiki\Parser namespace
Bug: T166010
Co-Authored-By: Daimona Eaytoy <daimona.wiki@gmail.com>
Co-Authored-By: James Forrester <jforrester@wikimedia.org>
Co-Authored-By: Subramanya Sastry <ssastry@wikimedia.org>
Change-Id: I79b4e732c45095eedbaa80afa5eb7479b387ed8a
2024-02-16 09:18:38 -05:00
C. Scott Ananian
52320c0902 Move ParsoidRenderID to MediaWiki\Edit
This class belongs with the rest of the Parsoid output stash code.

This class has been marked @unstable since 1.39 and thus the move
does not need release notes.

Change-Id: I16061c0c28b1549fbe90ea082cc717fee4a09a6e
2024-02-07 21:22:06 -05:00
Daimona Eaytoy
175c0c4abf Replace more instances of deprecated MWException
Bug: T328220
Change-Id: Iba90f7f9b5766bccc05380d040138d74d5e9558a
2024-01-19 23:11:59 +00:00
James D. Forrester
9bfb75ff90 Namespace ParserOutput
Most used non-namespaced class!

Bug: T353458
Change-Id: I4c2cbb0a808b3881a4d6ca489eee5d8c8ebf26cf
2023-12-14 14:57:34 -05:00
Martin Urbanec
29af4dd074 Move user options related classes into their own namespace
There are a couple of user options related classes already,
and the T321527 work on dynamic defaults is going to add
even more. Let's move them into a separate namespace
to make core a bit more organized.

Old name is kept as an alias for compatibility purposes.

Bug: T321527
Bug: T352284
Change-Id: I9822eb1553870b876d0b8a927e4e86c27d83bd52
2023-11-29 13:27:13 +01:00
jenkins-bot
a1f4fb418a Merge "Allow Bcp47Code as parameter to LanguageCode::bcp47ToInternal()"
2023-09-29 21:27:27 +00:00
C. Scott Ananian
f47de6ec61 Allow Bcp47Code as parameter to LanguageCode::bcp47ToInternal()
This nominally takes a string-valued language code conforming to the
BCP-47 standard, but this is often generated from a Bcp47Code object.
Since the MediaWiki Language code implements Bcp47Code, we may have
the case where we have a Language object in hand (but typed as a
Bcp47Code not Language) and call Language::toBcp47Code() only to pass
it to LanguageCode::bcp47ToInternal to convert it back to a
mediawiki-internal code.

We can save steps and be more efficient if we allow the parameter to be a
Bcp47Code object, and write a fast path for the special case where
that Bcp47Code happens to be a Language object and we can simply call
Language::getCode() to obtain the internal code.

Change-Id: I24932449b8c40e3a5072748d87667184f4befa67
2023-09-29 15:10:29 -04:00
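
A rough sketch of the fast path described in f47de6ec61, assuming hypothetical classes `Bcp47CodeSketch` and `LanguageSketch` and a one-entry illustrative mapping table. It is not a reproduction of the logic in LanguageCode::bcp47ToInternal().

```python
class Bcp47CodeSketch:
    """Hypothetical wrapper for a BCP-47 code."""

    def __init__(self, code: str) -> None:
        self._code = code

    def to_bcp47_code(self) -> str:
        return self._code


class LanguageSketch(Bcp47CodeSketch):
    """Hypothetical Language object that knows both of its codes."""

    def __init__(self, internal_code: str, bcp47_code: str) -> None:
        super().__init__(bcp47_code)
        self._internal = internal_code

    def get_code(self) -> str:
        return self._internal


# Illustrative entry only; the real mapping table is not reproduced here.
BCP47_TO_INTERNAL = {"lzh": "zh-classical"}


def bcp47_to_internal(code) -> str:
    if isinstance(code, LanguageSketch):
        # Fast path: the object already holds its internal code.
        return code.get_code()
    if isinstance(code, Bcp47CodeSketch):
        code = code.to_bcp47_code()
    lowered = code.lower()
    return BCP47_TO_INTERNAL.get(lowered, lowered)
```

The caller no longer has to turn a Language into a BCP-47 string just to have it mapped back to the internal code.
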
James D. Forrester
c1599c91b3 Namespace Config-related classes under \MediaWiki\Config
Bug: T166010
Change-Id: I4066885a7ea071d22497abcdb3f95e73e154d08c
2023-09-21 05:41:58 +00:00
Umherirrender
04a039e135 parser: Delay Parser creation in ParsoidSiteConfig/ParsoidDataAccess
Remove parser creation from service creation

In ParsoidSiteConfig inject the ParserFactory and call getMainInstance
later, ParsoidSiteConfig is created often without calls to the parser.
For ParsoidDataAccess store the factory and call it when needed.

Bug: T343070
Change-Id: Ib3acadaf190383e4a8b3d266a9fd75c9b20c6649
2023-09-19 22:19:30 +02:00
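
The deferred creation described in 04a039e135 follows a common pattern: inject the factory and ask it for the parser only when a method actually needs one. A minimal sketch, with hypothetical class names and a call counter added purely for illustration (whether the real class caches the instance is an assumption here):

```python
class ParserFactorySketch:
    """Hypothetical factory; counts how often the parser is requested."""

    def __init__(self) -> None:
        self.calls = 0

    def get_main_instance(self) -> object:
        self.calls += 1
        return object()


class SiteConfigSketch:
    def __init__(self, parser_factory: ParserFactorySketch) -> None:
        # Store the factory only; constructing the config stays cheap.
        self._parser_factory = parser_factory
        self._parser = None

    def _get_parser(self) -> object:
        if self._parser is None:
            self._parser = self._parser_factory.get_main_instance()
        return self._parser

    def method_needing_parser(self) -> object:
        return self._get_parser()
```

Callers that construct the config but never touch parser-dependent methods never trigger the factory call.
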
James D. Forrester
a8a6cfd966 Namespace NamespaceInfo under \MediaWiki\Title
One of the big ones, so doing this alone.

Bug: T166010
Change-Id: Ibe103cd362535d3cb94cb8931e95fc74099d1497
2023-09-19 05:17:04 +00:00
Subramanya Sastry
062fd08e51 Remove all Parsoid debugApi references and uses
* Was used during the Parsoid JS -> PHP port and is no longer used.
* This also eliminated the need to inject ParsoidSettings into some
  classes.
* Once this merges and lands in core, I'll remove this from the Parsoid
  repo as well.

Change-Id: I008d30ea81f5a3db26e512c87762b90e3ca3c4ff
2023-09-14 14:48:48 -05:00
Daimona Eaytoy
b65482ac1b phpunit: Prevent access to ExtensionRegistry in unit tests
Unit tests should not access the ExtensionRegistry singleton. This is
similar to how MediaWikiServices is disabled, but needs to be done
separately because ExtensionRegistry is not a service.

Make ExtensionRegistryTest use a mocked SettingsBuilder to avoid
triggering the exception when SettingsBuilder tries to access the global
instance of ExtensionRegistry.

Inject data from ExtensionRegistry into Parsoid's SiteConfig to keep
SiteConfigTest a working unit test.

Change-Id: I0a04c82250582fed7a66c1e10868d9b4f3823a28
2023-09-12 00:04:31 +00:00
James D. Forrester
35b934ffcb General whitespace clean-up of tabs followed by multiple spaces
Change-Id: I22090062274dceec96d43e23eb227a7e3b1e36fa
2023-09-06 14:28:43 +01:00
Daimona Eaytoy
31fcbb83c1 Replace usages of wfParseUrl
wfParseUrl falls back to the global service locator as of I706ef8a5.
This will soon be disallowed in unit tests (see I5117eab9), and all the
classes updated in this patch are covered by a unit test that would then
fail.

SiteConfig already has a UrlUtils object available, so just use that.

In the other classes, there is no need to inject a UrlUtils service and
we can instead adopt parse_url, because these didn't depend on our
site-configurable or custom parsing logic. For precedent see also
change I6492f5142861513e4a7, I1e76d2f5aef, and lots of other examples
in Codesearch for parse_url().

The warnings about parse_url() in UrlUtils.php have been obsolete
since about PHP 5.4, when it started to support protocol-relative
URLs, non-slash protocols like "mailto", and deal with spaces/newlines
correctly (https://3v4l.org/YWUkl).

This patch was partly copied from PS 20 of I5117eab9.

Co-Authored-by: Timo Tijhof <krinkle@fastmail.com>
Change-Id: I98ea4670e842d11598664f058d8c90a900477be4
2023-08-11 00:00:25 +00:00
C. Scott Ananian
cb371f2d91 Bcp47Code fixes to ParsoidParser and LanguageVariantConverterUnitTest
LanguageVariantConverterUnitTest: don't mock a method in the Parsoid
class that no longer exists.

ParsoidParser: pass a Bcp47Code (in the form of a Language object),
not a string, when selecting the preferred variant for the output

Followup-To: Ib8554f98b1c653df3864110e0e66796b8da67b5f
Change-Id: I32fd64a9495b8aed729b0b5b00535180006e0223
2023-08-07 17:31:04 -04:00
C. Scott Ananian
bc213907c4 Replace test code calls to SiteConfig methods which are deprecated in Parsoid
* SiteConfig::variants() was replaced by ::variantsFor()
* SiteConfig::langConverterEnabled() was replaced by ::langConverterEnabledBcp47()

Change-Id: I2dc510fcf0f03304f01c14cff92d5dd50736f062
2023-07-31 22:17:15 +00:00
Subramanya Sastry
68805e2f50 ParsoidParser: Record ParserOptions watcher on ParserOutput object
* ParsoidParser hadn't registered a watcher on ParserOptions so far.
  Because of this, you can see that the current parser cache key
  (in deployed production code) doesn't have 'useParsoid=1' in it.

  Ex: View source on enwiki:Hospet shows that the parser cache key
  there is "enwiki:parsoid-pcache:idhash:2360619-0!canonical".

  The only reason this doesn't conflict with legacy parser output
  is because we use "parsoid-pcache", a different cache instance than
  "pcache" used for legacy parser output. But if/when we decide to use
  the same parser cache instance, this could cause cache corruptions.

  With FlaggedRevisions, where a single "stable-pcache" parser cache
  instance is used, in local testing, this was causing Parsoid HTML to be
  saved without "useParsoid=1", and so Parsoid HTML was being returned
  for legacy parser cache requests.

* In addition, fix the code in PageBundleParserOutputConverter to copy
  over internal metadata (which includes used options). This ensures
  that any tracked parser options aren't lost and the right parser cache
  key is constructed later on.

* Added / updated a number of tests that verify that usedOptions
  is tracked correctly in the useParsoid code paths. The tests fail
  without the code changes in this patch.

Bug: T340703
Bug: T335157
Needed-By: I0e954949768044eea6ec275a36d0d6d7ed457e8e
Change-Id: I076d5d362bdfd9d4b2ca8886bf6b30c1a746aee7
2023-07-11 10:53:11 -05:00
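
Why a missing watcher leaves 'useParsoid=1' out of the cache key (68805e2f50) can be illustrated with a small sketch. All names are hypothetical, and the key format is only loosely modelled on the example key quoted in the message.

```python
class ParserOptionsSketch:
    """Hypothetical options object that reports each option read to a watcher."""

    def __init__(self, values: dict) -> None:
        self._values = dict(values)
        self._watcher = None

    def register_watcher(self, callback) -> None:
        self._watcher = callback

    def get(self, name: str):
        if self._watcher is not None:
            self._watcher(name)  # record that this option influenced the output
        return self._values[name]


def cache_key(page_id: int, values: dict, used_options: set) -> str:
    # Only options recorded as "used" become part of the key.
    parts = [f"{name}={values[name]}" for name in sorted(used_options)]
    return "!".join([f"idhash:{page_id}-0", "canonical", *parts])
```

If no watcher is registered, `used_options` stays empty and two renderings that differ only in `useParsoid` map to the same key, which is the collision the commit guards against.
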
daniel
4f0da43cee PageBundleParserOutputConverter: don't mutate original ParserOutput
This is an issue introduced by I8711a51fc1bcac48, which
caused duplicate variant conversion to be applied in some cases.
The reason is that the $parserOutput and $processedParserOutput fields
in HtmlOutputRendererHelper ended up being the same object.

Change-Id: Ic1fbc8815ef74beba6dae927563a9945b6dab1a1
2023-06-18 18:26:43 +02:00
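
The aliasing problem fixed in 4f0da43cee can be reproduced in miniature. In this sketch the names are hypothetical and "variant conversion" is replaced by a trivial transformation with a counter; the point is only that converting in place makes the "original" and "processed" references the same object.

```python
import copy


class OutputSketch:
    """Hypothetical output object; counts how often it was converted."""

    def __init__(self, text: str) -> None:
        self.text = text
        self.conversions = 0


def convert_variant(output: OutputSketch) -> None:
    # Placeholder for a transformation that must be applied exactly once.
    output.text = output.text.upper()
    output.conversions += 1


def process_in_place(output: OutputSketch) -> OutputSketch:
    convert_variant(output)  # mutates the caller's object
    return output


def process_on_copy(output: OutputSketch) -> OutputSketch:
    result = copy.copy(output)  # leave the original untouched
    convert_variant(result)
    return result
```

With `process_in_place`, a later step that converts the "original" again applies the conversion twice; with `process_on_copy`, the two objects stay independent.
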
jenkins-bot
ba80b34b9d Merge "LanguageVariantConverter: Use LanguageConverter::hasVariant() to check source"
2023-04-28 15:58:22 +00:00
Bartosz Dziewoński
6ba47296d9 Fix Phan suppressions related to Title::castFrom*() and friends
There is no way to express that Title::castFromPageIdentity(),
Title::castFromPageReference() and Title::castFromLinkTarget()
can only return null when the parameter is null. We need to add
Phan suppressions or explicit types almost everywhere that these
methods are used with parameters that are known to not be null.

Instead, introduce new methods Title::newFromPageIdentity() and
Title::newFromPageReference() (Title::newFromLinkTarget() already
exists), without the null-coalescing behavior, and use them when
the parameter is not null. This lets static analysis tools, and
humans, easily understand where nulls can't appear.

Do the same with the corresponding TitleFactory methods.

Change the obvious uses of castFrom*() to newFrom*() (if there is
a Phan suppression, a type check, or a method call on the result).

Change-Id: Ida4da75953cf3bca372a40dc88022443109ca0cb
2023-04-22 16:45:09 +02:00
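
The distinction drawn in 6ba47296d9 between null-tolerant castFrom*() methods and strict newFrom*() methods can be expressed with type annotations. This is a sketch with hypothetical names; pages are modelled as plain strings.

```python
from typing import Optional


class TitleSketch:
    """Hypothetical title class with a strict and a null-tolerant constructor."""

    def __init__(self, text: str) -> None:
        self.text = text

    @classmethod
    def new_from_page(cls, page: str) -> "TitleSketch":
        # Strict: never accepts or returns None, so no suppression is needed.
        return cls(page)

    @classmethod
    def cast_from_page(cls, page: Optional[str]) -> Optional["TitleSketch"]:
        # Null-tolerant: None in, None out.
        return None if page is None else cls.new_from_page(page)
```

A type checker can see that `new_from_page(...).text` is always safe, whereas the result of `cast_from_page` has to be checked for None first.
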
Arlo Breault
bc1601f874 Remove the nativeGalleryEnabled parsoidSetting
This is now enabled in production (Ic5a4a9950d51f63b17f4c5e70516bec87b981aa5)
and not something we want to remain configurable.

It is removed from Parsoid in I52ddfd21ff2e72a34cb5eb68742e3dfb85c6ccf6

Change-Id: I6a4d7d33fb42270fc5da3a922aa0a959180fb33f
2023-03-30 17:52:56 -04:00
Tim Starling
5e30a927bc tests: Make some PHPUnit data providers static
Just methods where adding "static" to the declaration was enough, I
didn't do anything with providers that used $this.

Initially by search and replace. There were many mistakes which I
found mostly by running the PHPStorm inspection which searches for
$this usage in a static method. Later I used the PHPStorm "make static"
action which avoids the more obvious mistakes.

Bug: T332865
Change-Id: I47ed6692945607dfa5c139d42edbd934fa4f3a36
2023-03-24 02:53:57 +00:00
Subramanya Sastry
580d3a3d76 SiteConfig: Get rid of Cite-specific method
Bug: T268777
Depends-On: Ie6bc2c1cef2aca3166a8af6921cad29ebb8ef3a2
Change-Id: I0c01f62a4f290862d91436eca1baa0f5ee1af5fc
2023-03-14 10:21:38 -05:00
C. Scott Ananian
424bf408df LanguageVariantConverter: Use LanguageConverter::hasVariant() to check source
This is a slightly stricter test than we'd previously used to check
the validity of the provided source language parameter.

Change-Id: I22e9c5cf6c30ce737884162970a1eb349549c86d
2023-03-13 16:51:12 -04:00
jenkins-bot
5434c71393 Merge "Use Bcp47Code when interfacing with Parsoid"
2023-03-13 19:11:03 +00:00
jenkins-bot
d1300d649e Merge "Revert "Revert "TransformHandler: Load stashed page bundle based on ETag."""
2023-03-13 18:38:43 +00:00
C. Scott Ananian
5ad8dea80a Use Bcp47Code when interfacing with Parsoid
It is very easy for developers and maintainers to mix up "internal
MediaWiki language codes" and "BCP-47 language codes"; the latter are
standards-compliant and used in web protocols like HTTP, HTML, and
SVG; but much of WMF production is very dependent on historical codes
used by MediaWiki which in some cases predate the IANA standardized
name for the language in question.

Phan and other static checking tools aren't much help distinguishing
BCP-47 from internal codes when both are represented with the PHP
string type, so the wikimedia/bcp-47-code package introduced a very
lightweight wrapper type in order to uniquely identify BCP-47 codes.
Language implements Bcp47Code, and LanguageFactory::getLanguage() is
an easy way to convert (or downcast) between Bcp47Code and Language
objects.

This patch updates the Parsoid integration code and the associated
REST handlers to use Bcp47Code in APIs so that the standalone Parsoid
library does not need to know anything about MediaWiki-internal codes.
The principle has been, first, to try to convert a string to a
Bcp47Code as soon as possible and as close to the original input as
possible, so it is easy to see *why* a given string is a BCP-47 code
(usually, because it is coming from HTTP/HTML/etc) and we're not stuck
deep inside some method trying to figure out where a string we're
given is coming from and therefore what sort of string code it might
be.  Second, we've added explicit compatibility code to accept
MediaWiki internal codes and convert them to Bcp47Code for backward
compatibility with existing clients, using the @internal
LanguageCode::normalizeNonstandardCodeAndWarn() method.  The intention
is to gradually remove these backward compatibility thunks and replace
them with HTTP 400 errors or wfDeprecated messages in order to
identify and repair callers who are incorrectly using
non-standard-compliant language codes in web standards
(HTTP/HTML/SVG/etc).

Finally, maintaining a code as a Bcp47Code and not immediately
converting to Language helps us delay or even avoid full loading of a
Language object in some cases, which is another reason to occasionally
push Bcp47Code (instead of Language) down the call stack.

Bug: T327379
Depends-On: I830867d58f8962d6a57be16ce3735e8384f9ac1c
Change-Id: I982e0df706a633b05dcc02b5220b737c19adc401
2023-03-13 13:25:09 -04:00
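
The two principles in 5ad8dea80a (wrap the string in a dedicated type as close to the input as possible, and keep a warning compatibility path for internal codes) might look like this in outline. `Bcp47Tag`, `from_request_header`, and the one-entry mapping are hypothetical and illustrative; this is not the behaviour of LanguageCode::normalizeNonstandardCodeAndWarn().

```python
import warnings

# Illustrative entry only; not MediaWiki's full mapping of internal codes.
NONSTANDARD_TO_BCP47 = {"zh-classical": "lzh"}


class Bcp47Tag:
    """Hypothetical lightweight wrapper marking a string as a BCP-47 code."""

    def __init__(self, tag: str) -> None:
        self.tag = tag


def from_request_header(value: str) -> Bcp47Tag:
    # Convert at the boundary, where it is obvious why the value is BCP-47.
    mapped = NONSTANDARD_TO_BCP47.get(value.lower())
    if mapped is not None:
        # Compatibility path: accept the internal code, but warn the caller.
        warnings.warn(f"non-standard language code {value!r}", DeprecationWarning)
        return Bcp47Tag(mapped)
    return Bcp47Tag(value)
```

Code further down the call stack then receives a `Bcp47Tag` rather than a bare string, so it does not have to guess which kind of code it was handed.
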
jenkins-bot
4e2d0f1fe5 Merge "Preserve non-PageBundle metadata set by Parsoid"
2023-03-13 17:14:50 +00:00
C. Scott Ananian
bce63d1912 Preserve non-PageBundle metadata set by Parsoid
The Parsoid entrypoints should always have a "real" ParserOutput
passed as the ContentMetadataCollector object, so that recursive
invocations of extensions, etc, can set appropriate metadata
properties in the ParserOutput.

This is part of a belt-and-suspenders fix for T331084, where a
StubMetadataCollector is being used in production -- production should
never use a stub, it should always use a real ParserOutput object.
The other fix for T331084 is
I30ea2bb24e6c9b0950a8f46dc8e5b9bf5ee3378b, which ensures that if you
*were* to use a StubMetadataCollector in production, it wouldn't throw
an error when a numeric category string was encountered.

Bug: T331084
Change-Id: I8711a51fc1bcac48eae92ab1ba15a33fe05937ed
2023-03-13 11:24:57 -04:00