36 commits

Author SHA1 Message Date
daniel
2ec1791d40 Introduce PageRestHelperFactory
This allows extensions like VisualEditor to safely instantiate REST
helper objects. It also reduces the number of services that need to be
injected into REST handlers from route definitions.

Change-Id: I10af85b2da96568cfffd03867d1cb299645fb371
2022-11-21 07:23:26 +00:00
msantos
46071e9c3d Follow redirects for page/{title} formats html/with_html
* Apply Legacy Temporary redirects (302) if page is a redirect in order
to have feature parity with RESTBase

* Check for normalization redirects and execute permanent redirects (301)

* Add/Update mocha tests for the redirects functionality

* Add query parameter 'redirect=no' check to bypass redirect logic

* Unit tests to check status code and location headers

Bug: T301372
Change-Id: I841c21d54a58e118617aaf5e2c604ea22914adaa
2022-11-16 17:50:03 +00:00
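
The redirect rules listed in 46071e9c3d can be sketched roughly as follows. This is an illustration in Python, not the handler's actual code: the function name plan_redirect, its parameters, the path template, and the order of the checks are all assumptions. The commit message only states that normalization redirects are permanent (301), wiki redirects are temporary (302), and 'redirect=no' bypasses the logic.

```python
from typing import Mapping, Optional, Tuple
from urllib.parse import quote


def plan_redirect(
    requested_title: str,
    normalized_title: str,
    redirect_target: Optional[str],
    query: Mapping[str, str],
    path_template: str = "/page/{title}/html",  # hypothetical route shape
) -> Optional[Tuple[int, str]]:
    """Return (status, location) if a redirect applies, otherwise None."""
    # 'redirect=no' bypasses the redirect logic (assumed here to cover both kinds).
    if query.get("redirect") == "no":
        return None
    # Title normalization: permanent redirect to the canonical form.
    if requested_title != normalized_title:
        return 301, path_template.format(title=quote(normalized_title, safe=""))
    # The page itself is a wiki redirect: temporary redirect to its target.
    if redirect_target is not None:
        return 302, path_template.format(title=quote(redirect_target, safe=""))
    return None
```

A test along the lines of the commit's "status code and location headers" checks would assert on both members of the returned tuple.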
daniel
f545d5efeb Rename HTMLTransform to HtmlToContentTransform
* We will have several kinds of HTML transformations.
Rename HTMLTransform to indicate that it's for converting HTML to Content
objects.

* Use the naming convention 'Html' instead of 'HTML'

Change-Id: I506f3303ae8f9e4db17299211366bef1558f142c
2022-11-03 16:47:36 +01:00
Abijeet
1b53f15e7f page/html endpoint: Support variant conversion
Variant conversion is based on the Accept-Language header. Updated
the HtmlOutputRendererHelper to set the HTTP headers related to
variant conversion.

Bug: T317019
Change-Id: I5e11452f1c531a757e8d860f9c727b5810406bce
2022-11-01 19:21:42 +05:30
jenkins-bot
46c8bb05bf Merge "Use markTestSkippedIfExtensionNotLoaded() shortcut in tests"
2022-10-14 15:43:13 +00:00
daniel
994e50d24f Fix passing the wikiId into ParsoidOutputAccess.
It's not clear if Parsoid still needs this, but let's err on the side of
caution.

Change-Id: I7cef2827da23af3c3466cb855de5f42e05375515
2022-10-07 17:50:38 +02:00
Derick Alangi
ab7849ed47 ParsoidOutputAccess: Add support for fragment flavor
This is needed by VE when performing Wikitext -> HTML transformation
during editing.

Also, this patch introduces the new flavor: fragment, that is passed in
via $envOptions to activate VisualEditor's body only mode functionality.

NOTE: This patch also fixes a broken PHPUnit test by injecting the
appropriate Parsoid instance for checking error handling.

Bug: T308743
Change-Id: I838a3b05d7d8523a469236cf112158349063283c
2022-10-06 20:41:48 +01:00
thiemowmde
178d6810f5 Use markTestSkippedIfExtensionNotLoaded() shortcut in tests
Change-Id: Ie7cc7b8c3aad0225f2f2d2d2241046756c03c0d5
2022-10-04 09:39:55 +00:00
daniel
a02be0b3f8 HtmlInputTransformHelper: Fall back to ParserCache
If a render ID is given via the use-cache parameter, but the key is not
found in the parsoid stash, look at the most recent known rendering of
the revision, and use it if it matches the render ID.

This patch moves the responsibility for looking up RevisionRecords and
PageRecords into ParsoidOutputAccess. This way, callers only need to
have a PageIdentity, and optionally a revision ID.

Bug: T318395
Change-Id: I1aa5b0fd9fb1acaa2544d5a58125fa3810a0eb39
2022-09-30 15:56:23 +00:00
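
The fallback described in a02be0b3f8 can be pictured with a minimal sketch. It assumes, for illustration only, that the stash is a mapping keyed by render ID and that the parser cache maps a revision ID to its most recent rendering. The name find_rendering and the dict-based data shapes are hypothetical and do not reflect the real HtmlInputTransformHelper or ParsoidOutputAccess interfaces.

```python
from typing import Dict, Optional


def find_rendering(
    render_id: str,
    rev_id: int,
    stash: Dict[str, dict],
    parser_cache: Dict[int, dict],
) -> Optional[dict]:
    """Look up a rendering by render ID, falling back to the parser cache."""
    # First choice: the rendering stashed under this render ID.
    stashed = stash.get(render_id)
    if stashed is not None:
        return stashed
    # Fallback: the most recent known rendering of the revision,
    # accepted only if it carries the same render ID.
    latest = parser_cache.get(rev_id)
    if latest is not None and latest.get("render_id") == render_id:
        return latest
    return None
```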
Tim Starling
8de63ae485 Make RateLimiter use WRStats
Bug: T261744
Change-Id: Ib947340cbac19fb26176257e1707e51426c7f76e
2022-07-16 11:00:22 +00:00
Derick Alangi
f88eab53a6 tests: Use overrideConfig(Value|Values) where needed
This continues the work in the child patch to replace callers
of setMwGlobals() with the appropriate method. Directory this
patch covers is `tests/phpunit/integration/`.

Change-Id: I0a9abf0d2a43587f2ffa029b68024a1ba5165fc7
2022-07-12 14:40:46 +01:00
daniel
bf092744c9 PHPUnit: introduce setMainCache
The main object cache is disabled during testing. Some integration tests
need it though. This provides a clean way to enable it, to replace the hacks
that were used so far.

Note that we may want to enable the main cache during testing soon. When
that happens, this method is still useful to disable the cache in certain
tests, and to set a specific cache instance.

Change-Id: I04ae1bf1b6b2c8f6310acd2edf89459d01a9c870
2022-07-07 16:25:59 +10:00
daniel
2ba27ab06e Protect against passing unsupported content models to Parsoid.
Parsoid currently only supports wikitext (and JSON), so don't give it anything else.

NOTE: ParsoidOutputAccess will fail on content that is unsupported by parsoid.
This will however not affect the /transform and /page endpoints in the
parsoid extension, since they use the ParsoidHandler base class, which doesn't
rely on ParsoidOutputAccess.

Bug: T301371
Change-Id: I6bc9b978947b31455a4bce6385b7bdf64ed4043c
2022-06-30 14:54:42 +00:00
Derick Alangi
1854fb02d9 Storage: Warm parsoid parser cache with parsoid outputs
This patch introduces a ParsoidOutputAccess service for
getting parsoid outputs and warms the cache with pregenerated
outputs.

It also introduces a config variable in ParsoidCacheConfig that
is turned off by default for controlling the cache warming.

Bug: T301371
Change-Id: I6152c42ea765d94093d8d62598b1b4278314adec
2022-06-28 09:05:41 +00:00
Derick Alangi
270699ec34 Configure caching parsoid output per wiki based on threshold
Cache the parsoid output only if parsing takes longer than a configured
threshold. A parse that finishes within this limit is considered
inexpensive for that wiki, and its output is not cached at all.

Bug: T308588
Change-Id: I7793b77feab13400ccd04343e7878ad701f5e6a7
2022-06-16 11:42:06 +01:00
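
The threshold rule from 270699ec34 amounts to timing the parse and writing to the cache only when it was slow. The sketch below assumes a plain dict as the cache and an injectable clock. The name parse_and_maybe_cache and its parameters are hypothetical and are not taken from the patch.

```python
import time
from typing import Callable, Dict


def parse_and_maybe_cache(
    parse: Callable[[], str],
    cache: Dict[str, str],
    key: str,
    threshold_seconds: float,
    clock: Callable[[], float] = time.monotonic,
) -> str:
    """Run the parse; store the output only if it exceeded the threshold."""
    start = clock()
    output = parse()
    elapsed = clock() - start
    # Cheap parses are not worth a cache entry; expensive ones are.
    if elapsed > threshold_seconds:
        cache[key] = output
    return output
```

Passing a fake clock makes the decision easy to test without real delays.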
daniel
6955380fbe Add rate limiting to ParsoidHTMLHelper
Bug: T267991
Change-Id: I52a83e7d3bdb0bcde59160e2d193f06908fda3d4
2022-06-15 13:40:56 +02:00
daniel
697f28df32 ParserCache: always use JSON
When JSON support was introduced into ParserCache in 1.36, it was
controlled by a feature flag, $wgParserCacheUseJson. The feature flag
was "born deprecated" in 1.36. It can now be removed.

This means that ParserCache will always store entries as JSON.
Support for reading old non-JSON entries remains intact.
This is needed when updating wikis from a version older than 1.36
to the current version.

Change-Id: Id04e42bfb458d98414bac50e0d6c505e8878e5c0
2022-06-07 15:19:45 +02:00
Derick Alangi
141b42c7ca Rest: Collect stats on Cache & Stash usage
As a means of understanding the usage of the stash feature for
/page/html & /revision/html endpoints used by VE extension,
this patch introduces the collection of stats using the
StatsDataFactory.

Bug: T309017
Change-Id: I4e17d50e79da263637bdd55ab62e993df441fe38
2022-05-30 09:51:55 +01:00
Derick Alangi
d62f97d5e0 Rest: Return different eTags for different output modes
This patch enables the response from PageHTMLHandler and
RevisionHTMLHandler to have different eTags for different
output modes and varying flavors.

Before, the only difference we got was when the stashing
option is set or not, but we need more flavors.

Bug: T308744
Change-Id: I2e9679e46a31955a2106a52af4eb612b32799c8c
2022-05-25 11:15:47 +00:00
Derick Alangi
13f6ec9e1b Rest: Migrate parsoid stashing logic from RESTbase
Add stash option to /page/html & /revision/html endpoints.
When this option is set, the PageBundle returned by Parsoid is
stashed and an etag is returned that can later be used to
make use of the stashed PageBundle.

The stash is for now backed by the BagOStuff returned by
ObjectCache::getLocalClusterInstance().

This patch adds additional data to the ParserOutput stored in ParserCache.
Old entries lacking that data will be ignored.

Bug: T267990
Co-Authored-by: Nikki <nnikkhoui@wikimedia.org>
Change-Id: Id35f1423a69e3ff63e4f9883b3f7e3f9521d81d5
2022-05-23 17:28:29 +01:00
Reedy
6e29611642 Remove or replace usages of "sanity"
Still some more to go...

Bug: T254646
Change-Id: Ia117f01e443c35b4765f3275cab4f2707e1be96f
2021-11-21 16:42:31 +00:00
Petr Pchelko
4ca16e8d08 Eliminate use of Title object in REST infrastructure
Change-Id: I585f0f23cac5f6dc2a4879f69f7b83828fda3dd3
2021-05-05 18:54:58 -07:00
Petr Pchelko
f642215aed Convert ParserCache to PageRecord
ParserOptions is not updated because it depends on the
Title::getLanguage implementation.

Tests converted to not require a DB anymore. Can't be proper unit
tests yet due to globals in ParserOptions and fake time hacks,
but exec time does go down from 70 seconds to 9 seconds.

Page content model is still emitted in the metrics since
it was considered useful. Should be removed when we get
something like a page type concept.

Change-Id: Ib16fd0b5b87ffc3cb4d21f4aa43d1203cb7206d2
2021-04-02 21:14:54 -06:00
Petr Pchelko
3a2e8883b4 Rest: use Authority in all core handlers
Bug: T239753
Change-Id: Idf2229255f49514dd8b68bf63573c5b619b4f2f1
2021-01-21 18:22:33 -06:00
Thiemo Kreuz
2f66b3754f tests: Remove @param docs from test code that just repeat the signature
These are not only 100% identical to the actual code, but also:
* It's error-prone. Some are already wrong.
* These test…() functions are not meant to be called from
  anywhere. What is the target audience for this documentation?
* There is a @dataProvider. What such @param tags actually do is
  document the provider, but in an odd place. Just looking at
  the provider should give the same information.
* The MediaWiki CodeSniffer allows skipping @param when there is
  a @dataProvider, for the reasons listed.

Change-Id: I0f6f42f9a15776df944a0da48a50f9d5a2fb6349
2021-01-21 03:41:23 +00:00
daniel
00a3439dce Introduce RevisionOutputCache
Bug: T267981
Change-Id: Ib1dc641ed10d786918362b25bd655780d5844ba1
2020-12-14 16:50:28 +00:00
Petr Pchelko
1162411d7f Make /page/{title}/html emit etags in RESTBase format
RESTBase used to emit ETags in the `"<rev_id>/<render_id>"` format.
For the benefit of the clients, preserve the format.

Render ID is a UUIDv1 uniquely identifying the ParserOutput.
In future it would be used as a stashing key for stash deduplication.
At this time I decided to just attach the render ID as extension data
to our fake ParserOutput. Once we integrate Parsoid more into core,
we will likely move it into a ParserOutput property, or even
replace CacheTime::mCacheTime with a UUIDv1, but it's too early for that.

Bug: T268234
Change-Id: Ie604e9c98021d59eb1a17ca65f227e8f234a45be
2020-12-09 16:36:07 -06:00
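
The ETag shape described in 1162411d7f, a quoted "<rev_id>/<render_id>" pair with a UUIDv1 render ID, can be illustrated with a small round trip. The helper names make_etag and parse_etag are hypothetical. The sketch covers only the format stated in the commit message and ignores any suffixes added by later patches.

```python
import uuid
from typing import Tuple


def make_etag(rev_id: int, render_id: uuid.UUID) -> str:
    """Build an ETag of the form "<rev_id>/<render_id>", quotes included."""
    return f'"{rev_id}/{render_id}"'


def parse_etag(etag: str) -> Tuple[int, uuid.UUID]:
    """Split an ETag back into its revision ID and render ID."""
    body = etag.strip().strip('"')
    rev, _, render = body.partition("/")
    return int(rev), uuid.UUID(render)


# A time-based UUIDv1 stands in for the render ID.
example_render_id = uuid.uuid1()
example_etag = make_etag(12345, example_render_id)
```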
Cindy Cicalese
808d841447 Moved page/{title}/bare to PageSourceHandler
Bug: T267981
Change-Id: Ie1a5ee9da5d8231bbf7ea2cbb419ab4bcec33c43
2020-12-09 22:02:11 +01:00
Daniel Kinzler
3bc61324b9 Re-Apply "Extract helper classes from PageHTMLHandler"
This reverts commit d51a697e13.

Reason for revert: Let's try this again...

Change-Id: Ie0218adff95576c972ff4c1d51cadd02f41eba3e
2020-12-07 16:59:29 +00:00
Subramanya Sastry
d51a697e13 Revert "Extract helper classes from PageHTMLHandler"
This reverts commit b98f7a6fc1.

Reason for revert: Breaks Parsoid CI but doesn't seem to run on core patches?

Change-Id: I1eaf1495dce6f6ba78093aacb9475a023a2aabfa
2020-12-02 23:32:27 +00:00
daniel
b98f7a6fc1 Extract helper classes from PageHTMLHandler
This extracts two helper classes from PageHTMLHandler:
* PageContentHelper for accessing page content. This replaces the
  LatestRevisionContentHandler base class.
* ParsoidHtmlHelper for generating HTML from wikitext using parsoid.

The idea is to decouple the functionality from the REST handlers, so we
can easily mix and match functionality to create a handler for the
new per-revision HTML endpoint.

Bug: T267981
Bug: T267982
Change-Id: I3226833d12e51c959712d642b0195de1fe1ef979
2020-12-02 18:08:12 +00:00
Ppchelko
d2565533c4 Re-Re-apply "Use parsoid directly in /page/html handler"
This reverts commit d4789dc29a.

Reason for revert: it's still good, resolving dependencies.

Change-Id: Ib5b75cf71b3d9ba2be21b1a369bf20db368c6968
2020-11-19 14:16:50 -07:00
Ppchelko
d4789dc29a Revert "Re-apply "Use parsoid directly in /page/html handler""
This reverts commit 38ca1b261e.

Reason for revert: Even though API appserver is ready, the REST API traffic is not routed to the correct MW cluster.

Change-Id: I00582e32c87e803c305930dd8de60c38b771b219
2020-11-17 17:05:19 +00:00
Ppchelko
38ca1b261e Re-apply "Use parsoid directly in /page/html handler"
This reverts commit 1157007658.

Reason for revert: can be reapplied after dependencies are resolved.

Change-Id: I1270853766fd5bf59ed191065b9e52b76e3d9fc9
2020-11-16 14:23:18 +00:00
Ppchelko
1157007658 Revert "Use parsoid directly in /page/html handler"
This reverts commit 4191c9fe31.

Reason for revert: This can not be released yet. It has slipped my mind that Parsoid extension is not enabled on the API MW cluster, thus releasing this will break the html endpoint. This code is good and can be re-reverted once https://gerrit.wikimedia.org/r/c/mediawiki/services/parsoid/+/635096 is resolved.

Change-Id: I808be187ae582995e6c1899044b2a7019bf02d32
2020-10-19 22:39:01 +00:00
Petr Pchelko
4191c9fe31 Use parsoid directly in /page/html handler
Bug: T265295
Change-Id: I6d9999b315def616e973daca0b7d544e502c7212
2020-10-16 15:21:39 -07:00