HtmlOutputRendererHelper was logging an error when the user doesn't have
permission to access a revision. It should just send a 403 (or 404,
depending on the case).
Bug: T356157
Change-Id: I5ed2f85edbd1350ef5872238be4cecccdec9bd4b
This allows HtmlOutputRendererHelper to function for all kinds of
content.
Bug: T311728
Bug: T311648
Bug: T359426
Change-Id: Ib32af7cf2a7ad989eb0b13ecca37c857fc9199ec
HtmlOutputRendererHelper should not crash hard if the ParserOutput has
no language set. ParserOutput may come from a variety of places, so we
should be lenient about it not having a language.
However, we should try harder to actually set a language on ParserOutput
if we have one available. So this also updates
PageBundleParserOutputConverter to keep the ParserOutput's language in
sync with the language header in the PageBundle.
Bug: T349868
Bug: T353689
Bug: T359426
Change-Id: I2edf20dc3b199e22cda2f32bc858c21ca7d8f4bd
PoolCounterWorkArticleView was not designed for use by all callers of
getParserOutput. It provides stampede protection but does not generally
prevent duplicate concurrent parsing, and it may result in stale cache
entries being returned to the caller. This is acceptable for page views,
but not other use cases like editing or updating secondary derived data.
Bug: T352837
Change-Id: Ie532c17e5b86e8e1adbb57ecd5c5c6405b83bf8f
When using the accept-language header to determine the desired target
variant, use only a single language code from the header, instead of the
entire string. Using the entire string was causing failures in ETag
processing, since the accept-language header may contain spaces, while
the ETag may not.
Note that for now, we just use the first language code. In the future,
it would be nice if we could find the supported target variant that is
most preferred in the header.
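
The "first language code only" behavior described above can be sketched
as follows. This is a minimal Python illustration, not the MediaWiki
implementation; the function name first_language_code is hypothetical.

```python
def first_language_code(accept_language: str) -> str:
    """Return only the first language code from an Accept-Language value.

    Hypothetical helper for illustration: it ignores q-values and does
    not try to find the most preferred supported variant.
    """
    # Keep the first comma-separated language range only.
    first_range = accept_language.split(",", 1)[0]
    # Drop any parameters such as ";q=0.8", then surrounding whitespace.
    return first_range.split(";", 1)[0].strip()


if __name__ == "__main__":
    # The result contains no spaces, so it is safe to embed in an ETag.
    print(first_language_code("sr-Latn, sr-Cyrl;q=0.8, en;q=0.5"))
```

Because the result is a single code without spaces, it avoids the ETag
problem caused by passing the whole header value through.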
Bug: T350852
Change-Id: I32af29627f68eafff6f097e2fc17723d2ebb39fa
This class belongs with the rest of the Parsoid output stash code.
This class has been marked @unstable since 1.39 and thus the move
does not need release notes.
Change-Id: I16061c0c28b1549fbe90ea082cc717fee4a09a6e
This avoids confusion with the "render timestamp" held by the cache,
and is consistent with ::get*RevisionId() etc.
The old ::getTimestamp() and ::setTimestamp() methods have been
deprecated.
Change-Id: Idb5e687709c98086c5d3075d31885c58a0723197
Set the render ID for each parse stored into cache so that we are able
to identify a specific parse when there are dependencies (for example
in an edit based on that parse). This is recorded as a property added
to the ParserOutput, not the parent CacheTime interface. Even though
the render ID is /related/ to the CacheTime interface, CacheTime is
also used directly as a parser cache key, and the UUID should not be
part of the lookup key.
In general we are trying to move the location where these cache
properties are set as early as possible, so we check at each location
to ensure we don't overwrite a previously-set value. Eventually we
can convert most of these checks into assertions that the cache
properties have already been set (T350538). The primary location for
setting cache properties is the ContentRenderer.
Moved setting the revision timestamp into ContentRenderer as well, as
it was set along the same code paths. An extra parameter was added to
ContentRenderer::getParserOutput() to support this.
Added merge code to ParserOutput::mergeInternalMetaDataFrom() which
should ensure that cache time, revision, timestamp, and render id are
all set properly when multiple slots are combined together in MCR.
In order to ensure the render ID is set on all codepaths we needed to
plumb the GlobalIdGenerator service into ContentRenderer, ParserCache,
ParserCacheFactory, and RevisionOutputCache. Eventually (T350538) it
should only be necessary in the ContentRenderer.
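
The "set only if not already set" check described above can be sketched
roughly like this. It is a Python illustration under stated assumptions:
RenderedOutput and ensure_render_id are hypothetical names, and
uuid.uuid4() merely stands in for whatever the GlobalIdGenerator service
produces.

```python
import uuid
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class RenderedOutput:
    """Hypothetical stand-in for a parser output object."""
    html: str
    render_id: Optional[str] = None


def ensure_render_id(
    output: RenderedOutput,
    new_id: Callable[[], str] = lambda: str(uuid.uuid4()),
) -> RenderedOutput:
    """Assign a render ID only when none was set earlier in the pipeline."""
    if output.render_id is None:
        output.render_id = new_id()
    return output


if __name__ == "__main__":
    out = ensure_render_id(RenderedOutput("<p>hi</p>"))
    first = out.render_id
    # A later stage calling the same check must not overwrite the ID.
    ensure_render_id(out)
    print(out.render_id == first)
```

Once every code path sets the ID early, a check like this could be
replaced by an assertion that the value is already present.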
Bug: T350538
Bug: T349868
Followup-To: Ic9b7cc0fcf365e772b7d080d76a065e3fd585f80
Change-Id: I72c5e6f86b7f081ab5ce7a56f5365d2f75067a78
This introduces RestAuthorizeTrait to ensure proper error reporting
after calls to Authorizer methods to avoid misleading error reports,
see T350117 and T350202.
This reverts commit e047668d9f.
This restores change 701ff30193.
Change-Id: I617cb7ba24a1614c39e2b1072888f0ee7b3127e3
Per T310476, it looks like the Authority interface now supports
rate limiting, so we can just use that instead of a heavy full user
object.
Possible followups
==================
More can be done to make all consumers of the HTML helper's `init()`
method inject Authority instead.
This patch is needed for the work happening in I08ebea5e8a601f161f.
NOTE: This is technically not a breaking change. Both the UserAuthority
and User classes implement the Authority interface, so consumers
passing a full user object should still work even though we changed the
signature of a public method in HtmlOutputRendererHelper.
Change-Id: I025cd83cc81f73ded861fcab943ba3b942d7c390
* This reverts commit c1b82097.
* This reverts commit 56025174.
* This updates a test change from commit c8d0470f.
* Now that ParsoidOutputAccess has become a thin wrapper over
ParserOutputAccess and the code has landed in production without
needing to be reverted, we can revert the above hacks as soon as the
hits from the 'parsoid' instance start to go down to a small number.
At the time of creating this patch, of the combined hits to the
'parsoid' and 'parsoid_pcache' instance, over 90% are now from the
'parsoid_pcache' instance. We can wait for a couple more days to
watch how this number changes.
* Note that once we deploy this patch, the accesses which would have
hit in the 'parsoid' instance (with this hack) will instead result
in a cache miss thus adding the full parse latency to REST API
requests (whether by VisualEditor or by other clients). So, we need
to figure out what the cutoff point is. While 3 weeks is a guaranteed
switchover timeframe (because all entries in 'parsoid' cache will
expire at that time and we'll get no more hits from there after that),
note that we are at < 10% hits in this cache just 4 days after the
train rollout. So, there is a good chance we could get beyond 95%
by the end of this week.
Bug: T347632
Change-Id: Ibd741b92b860b4d4b03ca220863debaf53fab44a
* Parsoid REST API which considers both title and revid will soon
be made internal. The core REST API only has endpoints where either
the title OR the revid is provided and is not subject to this issue.
* This patch ignores page id mismatches and simply uses the revision
page id where the mismatch is detected. This is only supported for
ParsoidHandler which sets the lenient revision handling while fetching
the HtmlOutputRendererHelper.
For these API requests, the output is not cached.
* Local testing shows that this fixes the issue.
* Added new phpunit tests to ParsoidHandlerTest to verify expectations.
Also verified that disabling this fix fails that test.
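
The lenient handling described above can be sketched as follows. This is
a Python illustration only; resolve_page_id, PageIdMismatch and the
lenient flag are hypothetical names, not the actual handler API.

```python
class PageIdMismatch(Exception):
    """Raised when the requested page and the revision's page differ."""


def resolve_page_id(
    requested_page_id: int, revision_page_id: int, lenient: bool = False
) -> int:
    """Pick the page ID to render for.

    In lenient mode a mismatch is ignored and the revision's page ID
    wins; otherwise the mismatch is reported as an error.
    """
    if requested_page_id == revision_page_id:
        return requested_page_id
    if lenient:
        return revision_page_id
    raise PageIdMismatch(
        f"revision belongs to page {revision_page_id}, "
        f"not page {requested_page_id}"
    )


if __name__ == "__main__":
    print(resolve_page_id(7, 9, lenient=True))
```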
Bug: T349235
Change-Id: I2f4a4a644710ee1e3894e6dc6a066eb37846bdfd
* Updated ParserOutput to set Parsoid render ids that REST API
functionality expects in ParserOutput objects.
* CacheThresholdTime functionality no longer exists since it was
implemented in ParsoidOutputAccess and ParserOutputAccess doesn't
support it. This is tracked in T346765.
* Enforce the constraint that uncacheable parses are only for fake or
mutable revisions. Updated tests that violated this constraint to
use 'getParseOutput' instead of calling the parse method directly.
* Had to make some changes in ParsoidParser around use of preferredVariant
passed to Parsoid. I also left some TODO comments for future fixes.
T267067 is also relevant here.
PARSOID-SPECIFIC OPTIONS:
* logLinterData: linter data is always logged by default -- removed
support to disable it. Linter extension handles stale lints properly
and it is better to let it handle it rather than add special cases
to the API.
* offsetType: Moved this support to ParsoidHandler as a post-processing
of byte-offset output. This eliminates the need to support this
Parsoid-specific option in the ContentHandler hierarchies.
* body_only / wrapSections: Handled this in HtmlOutputRendererHelper
as a post-processing of regular output by removing sections and
returning the body content only. This does result in some useless
section-wrapping work with Parsoid, but the simplification is probably
worth it. If in the future, we support Parsoid-specific options in
the ContentHandler hierarchy, we could re-introduce this. But, in any
case, this "fragment" flavor option is likely to get moved out of
core into the VisualEditor extension code.
DEPLOYMENT:
* This patch changes the cache key by setting the useParsoid option
in ParserOptions. The parent patch handles this to ensure we don't
encounter a cold cache on deploy.
TESTS:
* Updated tests and mocks to reflect new reality.
* Do we need any new tests?
Bug: T332931
Change-Id: Ic9b7cc0fcf365e772b7d080d76a065e3fd585f80
This reverts commit cbde6b69de to re-apply
the initial patch. This should only be merged once
I2acfd0b7a1e48aec107ded3bbe4963e2df24f4d3 is deployed.
Change-Id: If12ab65b1d773946fca6c8601ff51290136549c8
This reverts commit ec22840c4a.
This patch currently creates issues on beta, which still runs with the
latest vendor version of Parsoid. If, for some reason, Parsoid doesn't
get deployed with this patch, I2acfd0b7a1e48aec107ded3bbe4963e2df24f4d3
doesn't get included, and the REST page handler breaks.
Staggered deploy seems safer in this context, hence the proposal for a
temporary revert, and a merge after the next Parsoid vendor patch is
deployed.
Change-Id: I3f859fa807a04892a67323cd4e98be0d3fbb1676
* Parsoid's rt-testing script is still a node.js script and hence needs
ucs2 offsets for its syntactic / semantic diff classification.
* So, we cannot let 1aa71cf5 ride the train since it will break
Parsoid's rt-testing. We'll figure out an alternative way of handling
it, but for now, I am reverting that part of the patch.
* Document in the ParsoidHandlerTest test that ucs2 offsets are used and
cannot be changed to 'byte'
Bug: T347426
Change-Id: Ifa833e01ef117d7bcd6da1c7eb542535192662eb
The Helper classes are deprecated since 1afd52e3e4.
Depends-On: I2acfd0b7a1e48aec107ded3bbe4963e2df24f4d3
Change-Id: Ie9973c6d6474bb7b4720c0641ca7492dc946d923
* This is in service of a followup patch that merges ParsoidOutputAccess
and ParserOutputAccess. We want to eliminate all Parsoid-specific options
that aren't part of ParserOptions and aren't easily supportable via
html2html transforms.
* offsetType conversion relies on Parsoid code that is a bit entangled
with env, siteconfig (and extension configs), page source, etc. It
could all be refactored but once the html2html output transformation
framework lands, we could potentially use that to call Parsoid to do
these transforms by exposing such transforms to the framework.
* In this patch, outputContentVersion that isn't the default major HTML
version is no longer supported. It could potentially be supported via the
downgrade functionality in Parsoid in the future, or we might decide
to re-enable multiple outputContentVersion selection in the future
if such a use case arises. But, there are no plans to bump the major
HTML version in the near future while we work on read views.
* Rather than delete associated tests, I've marked them skipped so that
they can be re-enabled when this support is added back.
Bug: T347426
Change-Id: Ibede4acd68e944512f6d00763d29c6b1605d67eb
This class is used heavily basically everywhere, so moving it to Utils
wouldn't make much sense. Also, with this change, we can move
StatusValue to MediaWiki\Status as well.
Bug: T321882
Depends-On: I5f89ecf27ce1471a74f31c6018806461781213c3
Change-Id: I04c1dcf5129df437589149f0f3e284974d7c98fa
* Fix tests depending on $wgUsePigLatinVariant=true, which is in
DevelopmentSettings.php but not TestSetup::applyInitialConfig().
* Fix test depending on DNS resolution details.
Change-Id: I877dc3323bf4024caab7666a8820103de0b48d23
Several tests were marked skipped when the Parsoid extension isn't
loaded. But the extension is no longer needed to use Parsoid. So these
tests should not be skipped.
Change-Id: I9febdbd143237bf247c82bfa386bc2560ef411aa
Avoids any issue with not respecting the explicit version in accept
headers. However, it will effectively mean a cache miss on every
request after a Parsoid version bump.
Bug: T333606
Change-Id: Ia70f819df79fbb12a5b1dd6a98bfe0b968808d18
ParserOptions::setTargetLanguage will split the parser cache.
Only call it if the page language is different from the default.
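
Why an unconditional call splits the cache can be sketched like this. It
is a Python illustration under loose assumptions: the key format and the
names cache_key and build_options are hypothetical, and "default" is
modeled as a plain default_language argument.

```python
from typing import Dict


def cache_key(page_id: int, options: Dict[str, str]) -> str:
    """Build a cache key from the page and every explicitly set option."""
    parts = [f"page:{page_id}"]
    parts += [f"{name}={options[name]}" for name in sorted(options)]
    return "!".join(parts)


def build_options(page_language: str, default_language: str) -> Dict[str, str]:
    """Set the target language only when it differs from the default."""
    options: Dict[str, str] = {}
    if page_language != default_language:
        options["targetLanguage"] = page_language
    return options


if __name__ == "__main__":
    # Same key as a request that never set a target language: no split.
    print(cache_key(1, build_options("en", "en")))
    print(cache_key(1, build_options("de", "en")))
```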
Bug: T335183
Change-Id: I4cc21d6d83cb28abbd8e94b5448aa81802e0d88c
This only covers methods where adding "static" to the declaration was
enough. I didn't do anything with providers that used $this.
Initially by search and replace. There were many mistakes which I
found mostly by running the PHPStorm inspection which searches for
$this usage in a static method. Later I used the PHPStorm "make static"
action which avoids the more obvious mistakes.
Bug: T332865
Change-Id: I47ed6692945607dfa5c139d42edbd934fa4f3a36
It is very easy for developers and maintainers to mix up "internal
MediaWiki language codes" and "BCP-47 language codes"; the latter are
standards-compliant and used in web protocols like HTTP, HTML, and
SVG; but much of WMF production is very dependent on historical codes
used by MediaWiki which in some cases predate the IANA standardized
name for the language in question.
Phan and other static checking tools aren't much help distinguishing
BCP-47 from internal codes when both are represented with the PHP
string type, so the wikimedia/bcp-47-code package introduced a very
lightweight wrapper type in order to uniquely identify BCP-47 codes.
Language implements Bcp47Code, and LanguageFactory::getLanguage() is
an easy way to convert (or downcast) between Bcp47Code and Language
objects.
This patch updates the Parsoid integration code and the associated
REST handlers to use Bcp47Code in APIs so that the standalone Parsoid
library does not need to know anything about MediaWiki-internal codes.
The principle has been, first, to try to convert a string to a
Bcp47Code as soon as possible and as close to the original input as
possible, so it is easy to see *why* a given string is a BCP-47 code
(usually, because it is coming from HTTP/HTML/etc) and we're not stuck
deep inside some method trying to figure out where a string we're
given is coming from and therefore what sort of string code it might
be. Second, we've added explicit compatibility code to accept
MediaWiki internal codes and convert them to Bcp47Code for backward
compatibility with existing clients, using the @internal
LanguageCode::normalizeNonstandardCodeAndWarn() method. The intention
is to gradually remove these backward compatibility thunks and replace
them with HTTP 400 errors or wfDeprecated messages in order to
identify and repair callers who are incorrectly using
non-standard-compliant language codes in web standards
(HTTP/HTML/SVG/etc).
Finally, maintaining a code as a Bcp47Code and not immediately
converting to Language helps us delay or even avoid full loading of a
Language object in some cases, which is another reason to occasionally
push Bcp47Code (instead of Language) down the call stack.
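
The idea of a lightweight wrapper type, converted as close to the input
as possible, can be sketched as follows. This is a Python illustration,
not the wikimedia/bcp-47-code API: the class, the function names, and
the two-entry mapping table are assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Bcp47Code:
    """Wrapper marking a string as a standards-compliant BCP-47 code."""
    code: str


# Illustrative entries only (assumed for this sketch); the real table of
# MediaWiki-internal codes is larger and lives in MediaWiki itself.
INTERNAL_TO_BCP47 = {
    "sr-ec": "sr-Cyrl",
    "sr-el": "sr-Latn",
}


def from_http_header(value: str) -> Bcp47Code:
    """Wrap a code at the boundary, where we know it came from HTTP."""
    return Bcp47Code(value.strip())


def from_internal_code(internal: str) -> Bcp47Code:
    """Backward-compatibility path: accept an internal code and convert."""
    return Bcp47Code(INTERNAL_TO_BCP47.get(internal, internal))


if __name__ == "__main__":
    print(from_internal_code("sr-ec"))
    print(from_http_header(" sr-Latn "))
```

Functions that accept Bcp47Code rather than a plain string make it
visible to type checkers which kind of code is being passed around.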
Bug: T327379
Depends-On: I830867d58f8962d6a57be16ce3735e8384f9ac1c
Change-Id: I982e0df706a633b05dcc02b5220b737c19adc401
Mixing Handlers with Helpers doesn't look nice for consistency
reasons. Helpers should be in their own place (grouped) in the
Handlers directory as they're really "helpers for the handlers".
Change-Id: Ieeb7a0a706a4cb38778f312bfbfe781a1f366d14
2023-01-16 21:16:09 +01:00
Renamed from tests/phpunit/integration/includes/Rest/Handler/HtmlOutputRendererHelperTest.php