In order to replace the /api/rest_v1/page/title/{title} endpoints, we
need to have something in MediaWiki that generates a compatible
responser. The v1/page/{title}/bare and v1/revision/{id}/bare endpoints
are functionally equivalent, so the easiest approach seemed to be to
add a compatibility mode to them. The compatibility mode is triggered
using the x-restbase-compat header, which can be set via the gateway
when routing the request from /api/rest_v1/page/title/.
Bug: T374136
Change-Id: I4af7ff5325660ae30faebb24753b9dc1c3acb2b3
Since I72c5e6f86b7f081ab5ce7a56f5365d2f75067a78 it is part of the
contract of ContentRenderer::getParserOutput() that the render ID (and
other cache parameters) will be set when it returns.
(ContentHandler::getParserOutput() can set them even earlier if it has
custom content-based overrides.) We had a lot of temporary
backward-compatibility code "later" in the parse process to try to close
the barn door if some code path "forgot" to set them, but these are
unnecessary now.
This patch removes that backward-compatibility code in ParsoidParser;
there is similar remaining code in ParserCache etc. which can be
addressed in follow ups.
(For compatibility we do have to temporarily copy the render ID code
inside ParsoidOutputAccess::parseUncachable, but that class is
deprecated and will be removed.)
The HtmlOutputRendererHelper path which used to call
ParsoidParser::parseFakeRevision() is now replaced with a codepath that
goes through RevisionRenderer. In order to maintain the same behavior
of the ParsoidHandler, we have also added 'useParsoid' handling to the
JsonContentHandler. This support can perhaps be deprecated eventually.
Bug: T350538
Change-Id: I0853624cf785f72fd956c6c2336f979f4402a68f
Instead of creating a half-initialized helper and later calling ::init,
provide all the information necessary for the helper in the constructor.
This is facilitated by the fact that there already exists a factory
class, PageRestHelperFactory, which holds all the services required.
This affects:
* HtmlOutputRendererHelper::init()
* HtmlMessageOutputHelper::init()
* HtmlInputTransformHelper::init()
Change-Id: I1e1213597c6be012f2bc024c2b370c968ff3b472
This removes the last use of ParsoidOutputAccess in core, allowing it
to be deprecated and eventually removed.
Bug: T367074
Bug: T317018
Change-Id: Ica2c880e2e7c2b126aaea66a3e4be460b3f2234f
I believe this makes the code less brittle, and also makes it a bit
more obvious what these strings are meant to represent.
Change-Id: Ia39b5c80af4b495931d0a68fd091b783645dd709
And deprecated aliases for the the no namespaced classes.
ReplicatedBagOStuff that already is deprecated isn't moved.
Bug: T353458
Change-Id: Ie01962517e5b53e59b9721e9996d4f1ea95abb51
Mostly used find-and-replace:
Find:
/\*[\*\s]+@var (I?[A-Z](\w+)(?:Interface)?)[\s\*]+/\s*(private|protected|public) (\$[a-z]\w+;\n)((?=\s*/\*[\*\s]+@var (I?[A-Z](\w+)(?:Interface)?))\n|)
Replace with:
\3 \1 \4
More could be done, but to keep this patch reasonably sized, I only
changed the most obvious and unambiguously correct cases.
In some cases, I also removed redundant doc comments on the
constructor, and re-ordered the properties to match the constructor.
Change-Id: Ifa710fdf4d8d44a2d7244798b787a1b2a58c35a7
One more step in gradually replacing uses of ParsoidOutputAccess with
ParserOutputAccess. This is mostly just inlining the code from
ParsoidOutputAccess (which already uses ParserOutputAccess internally)
and then removing dead code.
Change-Id: I87d6568d9dc71bc11f406ea3c31ce86e01667e05
This allows HtmlOutputRendererHelper to function for all kinds of
content.
Bug: T311728
Bug: T311648
Bug: T359426
Change-Id: Ib32af7cf2a7ad989eb0b13ecca37c857fc9199ec
This makes it much easier for IDEs and tools like Phan to understand
what's going on. Note this syntax is perfectly valid even if a class
is undefined. Language features like `use`, `instanceof`, and
`class_exists` work perfectly fine. We do this a lot in existing
code.
Change-Id: I4d397621ebcc6a7e842150f7641c1b23d082b730
Pages that are fast to render can be omitted from the parser cache
to preserve disk space and cache write operations.
The threshold is configurable per namespace, so the tradeoff can
be evaluated based on different access patterns. For example, pages
that are accessed rarely, like file description pages on commons,
may have a high threshold configured, while pages that are read
frequently, like wikipedia articles, may be configured to be always
cached, using a 0 threshold.
Filtering is based on a time profile recorded in the ParserOutput.
A generic mechanism for capturing the timing profile is implemented
in the ContentHandler base class. Subclasses may implement a more
rigorous capture mechanism.
Bug: T346765
Change-Id: I38a6f3ef064f98f3ad6a7c60856b0248a94fe9ac
Per T310476, it looks like the Authority interface now supports
rate limiting, so we can just use that instead of a heavy full user
object.
Possible followups
==================
More can be done to make all consumers of the HTML helper's `init()`
method to inject Authority instead.
This patch is needed for the work happening in I08ebea5e8a601f161f.
NOTE: This is technically not a breaking change as the Authority
interface is implemented by both UserAuthority and User classes,
so passing either is fine so consumers passing a full user object
should still work even though we changed the signature of a public
method in HtmlOutputRendererHelper.
Change-Id: I025cd83cc81f73ded861fcab943ba3b942d7c390
* Updated ParserOutput to set Parsoid render ids that REST API
functionality expects in ParserOutput objects.
* CacheThresholdTime functionality no longer exists since it was
implemented in ParsoidOutputAccess and ParserOutputAccess doesn't
support it. This is tracked in T346765.
* Enforce the constraint that uncacheable parses are only for fake or
mutable revisions. Updated tests that violated this constraint to
use 'getParseOutput' instead of calling the parse method directly.
* Had to make some changes in ParsoidParser around use of preferredVariant
passed to Parsoid. I also left some TODO comments for future fixes.
T267067 is also relevant here.
PARSOID-SPECIFIC OPTIONS:
* logLinterData: linter data is always logged by default -- removed
support to disable it. Linter extension handles stale lints properly
and it is better to let it handle it rather than add special cases
to the API.
* offsetType: Moved this support to ParsoidHandler as a post-processing
of byte-offset output. This eliminates the need to support this
Parsoid-specific options in the ContentHandler hierarchies.
* body_only / wrapSections: Handled this in HtmlOutputRendererHelper
as a post-processing of regular output by removing sections and
returning the body content only. This does result in some useless
section-wrapping work with Parsoid, but the simplification is probably
worth it. If in the future, we support Parsoid-specific options in
the ContentHandler hierarchy, we could re-introduce this. But, in any
case, this "fragment" flavor options is likely to get moved out of
core into the VisualEditor extension code.
DEPLOYMENT:
* This patch changes the cache key by setting the useParsoid option
in ParserOptions. The parent patch handles this to ensure we don't
encounter a cold cache on deploy.
TESTS:
* Updated tests and mocks to reflect new reality.
* Do we need any new tests?
Bug: T332931
Change-Id: Ic9b7cc0fcf365e772b7d080d76a065e3fd585f80
* Fix tests depending on $wgUsePigLatinVariant=true, which is in
DevelopmentSettings.php but not TestSetup::applyInitialConfig().
* Fix test depending on DNS resolution details.
Change-Id: I877dc3323bf4024caab7666a8820103de0b48d23
Several tests were marked skipped when the Parsoid extension isn't
loaded. But the extension is no longer needed to use parsoid. So these
tests should not be skipped.
Change-Id: I9febdbd143237bf247c82bfa386bc2560ef411aa
Just methods where adding "static" to the declaration was enough, I
didn't do anything with providers that used $this.
Initially by search and replace. There were many mistakes which I
found mostly by running the PHPStorm inspection which searches for
$this usage in a static method. Later I used the PHPStorm "make static"
action which avoids the more obvious mistakes.
Bug: T332865
Change-Id: I47ed6692945607dfa5c139d42edbd934fa4f3a36
Mixing Handlers with Helpers doesn't look nice for consistency
reasons. Helpers should be in their own place (grouped) in the
Handlers directory as they're really "helpers for the handlers".
Change-Id: Ieeb7a0a706a4cb38778f312bfbfe781a1f366d14
The motivation is to restore parsoid support for the content models
defined in the Proofread extension.
Bug: T246403
Change-Id: I33d269e42fede28139f7c923504326a77d11ee13
REST helper objects should be geared towards accepting input directly
from an HTTP request. As such, they should offer setters that take
string values. And native representation of things like page titles,
languages, or content objects should be done implicitly by the helper.
Change-Id: I9b81cad4d5cc575e7c5283035e385ac0457e8059
This allows extensions like VisualEditor to safely instantiate REST
helper objects. It also reduces the number of services that need to be
injected into REST handlers from route definitions.
Change-Id: I10af85b2da96568cfffd03867d1cb299645fb371
* We will have several kinds of HTML transformations.
Rename HTMLTransform to indicate that its for converting HTML to Content
objects.
* Using Naming Convention 'Html' instead of 'HTML'
Change-Id: I506f3303ae8f9e4db17299211366bef1558f142c
Variant conversion is based on the Accept-Language header. Updated
the HtmlOutputRendererHelper to set the HTTP headers related to
variant conversion.
Bug: T317019
Change-Id: I5e11452f1c531a757e8d860f9c727b5810406bce
This is needed by VE when performing Wikitext -> HTML transformation
during editing.
Also, this patch introduces the new flavor: fragment, that is passed in
via $envOptions to activate VisualEditor's body only mode functionality.
NOTE: This patch also fixes a PHPUnit test that broke by correctly
injecting the appropriate parsoid instance for checking error handling.
Bug: T308743
Change-Id: I838a3b05d7d8523a469236cf112158349063283c
If a render ID is given via the use-cache parameter, but the key is not
found in the parsoid stash, look at the most recent known rendering of
the revision, and use it if it matches the render ID.
This patch moves the responsibility for looking up RevisionRecords and
PageRecords into ParsoidOutputAccess. This way, callers only need to
have a PageIdentity, and optionally a revision ID.
Bug: T318395
Change-Id: I1aa5b0fd9fb1acaa2544d5a58125fa3810a0eb39
This continues the work in the child patch to replace callers
of setMwGlobals() with the appropriate method. Directory this
patch covers is `tests/phpunit/integration/`.
Change-Id: I0a9abf0d2a43587f2ffa029b68024a1ba5165fc7
The main object cache is disabled during testing. Some integration tests
need it though. This provides a clean way to enable it, to replace the hacks
that were used so far.
Note that we may want to enable the main cache during testing soon. When
that happens, this method is still useful to disable the cache in certain
tests, and to set a specific cache instance.
Change-Id: I04ae1bf1b6b2c8f6310acd2edf89459d01a9c870
Parsoid currently only supports wikitext (and JSON), so don't give it anything else.
NOTE: ParsoidOutputAccess will fail on content that is unsupported by parsoid.
This will however not affect the /transform and /page endpoints in the
parsoid extension, since they use the ParsoidHandler base class, which doesn't
rely on ParsoidOutputAccess.
Bug: T301371
Change-Id: I6bc9b978947b31455a4bce6385b7bdf64ed4043c
This patch introduces a ParsoidOutputAccess service for
getting parsoid outputs and warms the cache with pregenerated
outputs.
It also introduces a config variable in ParsoidCacheConfig that
is turned off by default for controlling the cache warming.
Bug: T301371
Change-Id: I6152c42ea765d94093d8d62598b1b4278314adec
Cache the parsoid outputs only if a certain time is exceeded on
parse and consider the parse operation within this time limit as
not expensive per that wiki and not cache the parsoid output at all.
Bug: T308588
Change-Id: I7793b77feab13400ccd04343e7878ad701f5e6a7
When JSON support was introduced into ParserCache in 1.36, it was
controlled by a feature flag, $wgParserCacheUseJson. The feature flag
was "born deprecated" in 1.36. It can now be removed.
This means that ParserCache will always store entries as JSON.
Support for reading old non-JSON entries remains intact.
This is needed when updating wikis from a version older than 1.36
to the current version.
Change-Id: Id04e42bfb458d98414bac50e0d6c505e8878e5c0
As a means of understanding the usage of the stash FEAT for
/page/html & /revision/html endpoints used by VE extension,
this patch introduces the collection of stats using the
StatsDataFactory.
Bug: T309017
Change-Id: I4e17d50e79da263637bdd55ab62e993df441fe38
This patch enables the response from PageHTMLHandler and
RevisionHTMLHandler to have different eTags for different
output modes and varying flavors.
Before, the only difference we got was when the stashing
option is set or not, but we need more flavors.
Bug: T308744
Change-Id: I2e9679e46a31955a2106a52af4eb612b32799c8c
Add stash option to /page/html & /revision/html endpoints.
When this option is set, the PageBundle returned by Parsoid is
stashed and an etag is returned that can later be used to
make use of the stashed PageBundle.
The stash is for now backed by the BagOStuff returned by
ObjectCache::getLocalClusterInstance().
This patch adds additional data to the ParserOutput stored in ParserCache.
Old entries lacking that data will be ignored.
Bug: T267990
Co-Authored-by: Nikki <nnikkhoui@wikimedia.org>
Change-Id: Id35f1423a69e3ff63e4f9883b3f7e3f9521d81d5