Commit graph

9 commits

Author SHA1 Message Date
daniel
f31cd9f1d3 REST: HtmlInputTransformHelper: Load original data from stash
Parsoid needs the original rendering in order to apply
selective serialization (selser). The page/{title}/html endpoint
can stash the rendering, and now the transform endpoint can make use
of the stashed rendering.

Bug: T310464
Change-Id: Ia58043ed3aa1eb12731d82aa87606c82ec63f663
2022-09-29 19:52:27 +02:00
jenkins-bot
e0e430c049 Merge "Add PageBundleParserOutputConverter" 2022-09-26 13:16:23 +00:00
msantos
d3a86cfc6f Fix parse() and getParserOutput() interfaces
In Ie87f823e721ed5ae9d49cf7ead8e77cbef254cd7, we changed the signature
of `parse()` to accept a PageIdentity instead of a PageRecord, which broke
tests elsewhere, specifically HtmlOutputRendererHelperTest.
This patch fixes the interfaces.

Change-Id: I35685412c52f7d4ae9e63960695e686fb2bb9b21
2022-09-26 11:40:19 +01:00
Abijeet
7400456b1a Add PageBundleParserOutputConverter
Move code to create ParserOutput from PageBundle and vice versa to a
separate final class. A final class was used instead of a trait
because traits do not support constants in PHP versions before 8.2.

The plan is to use this final class in various interfaces in order
to avoid exposing them to Parsoid concepts.

Bug: T317019
Change-Id: I33076c359ee45719c1c4ef63f77c1f1285951d0c
2022-09-26 15:11:47 +05:30
msantos
f29803e2d9 Support access to outputs of non-existent pages
* Introduce a method in ParsoidOutputAccess that parses and returns
  a parser output directly, bypassing the cache.

* Parse a non-existent page with the new method when the page object
  is only a PageIdentity, not a PageRecord.

Change-Id: Ie87f823e721ed5ae9d49cf7ead8e77cbef254cd7
2022-08-31 20:52:41 +01:00
daniel
2ba27ab06e Protect against passing unsupported content models to Parsoid.
Parsoid currently only supports wikitext (and JSON), so don't give it anything else.

NOTE: ParsoidOutputAccess will fail on content that is unsupported by Parsoid.
This will, however, not affect the /transform and /page endpoints in the
Parsoid extension, since they use the ParsoidHandler base class, which doesn't
rely on ParsoidOutputAccess.

Bug: T301371
Change-Id: I6bc9b978947b31455a4bce6385b7bdf64ed4043c
2022-06-30 14:54:42 +00:00
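
The guard described in the commit above can be illustrated with a minimal, language-neutral sketch. It is written in Python rather than the project's PHP, and every name in it (SUPPORTED_MODELS, UnsupportedContentModelError, render_with_parsoid) is hypothetical; only the fact that wikitext and JSON are the supported models comes from the commit message.

```python
# Hypothetical sketch: reject unsupported content models before they reach
# the renderer. Names here are illustrative, not MediaWiki's actual API.

# Per the commit message, only wikitext (and JSON) are supported.
SUPPORTED_MODELS = frozenset({"wikitext", "json"})


class UnsupportedContentModelError(Exception):
    """Raised when content of an unsupported model is passed in."""


def render_with_parsoid(content_model: str, text: str) -> str:
    """Check the content model first, then hand off to a stand-in renderer."""
    if content_model not in SUPPORTED_MODELS:
        raise UnsupportedContentModelError(
            f"content model {content_model!r} is not supported"
        )
    # Stand-in for the real rendering step, which is out of scope here.
    return f"<div>{text}</div>"
```

Failing early with a dedicated error type keeps unsupported content from reaching the renderer and lets callers decide how to report the failure.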
daniel
8ce08c0cbc Move knowledge about HTTP status out of ParsoidOutputAccess
This removes a cyclic dependency:
ParsoidHTMLHelper in the REST component uses ParsoidOutputAccess in the
parser component. So ParsoidOutputAccess cannot use LocalizedHttpException
from the REST component.

This also improves separation of concerns: the parsing component should
not be concerned with HTTP status codes.

Bug: T301371
Change-Id: I2e661fe3ce0824dbfd7579650972f9019c92ed59
2022-06-28 12:30:44 +02:00
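
The separation described in the commit above might look roughly like the following sketch. It is Python rather than PHP, and ParseStatus, get_output, HttpError, ERROR_TO_HTTP, and handle_request are all hypothetical names; the only point taken from the commit message is that the parsing layer reports failures without HTTP status codes and the REST layer does the translation.

```python
from dataclasses import dataclass
from typing import Optional


# --- "parser" layer: reports domain-level errors, knows nothing of HTTP ---

@dataclass
class ParseStatus:
    """Hypothetical result object: a value, or a domain-level error key."""
    value: Optional[str] = None
    error: Optional[str] = None

    @property
    def ok(self) -> bool:
        return self.error is None


def get_output(page_exists: bool) -> ParseStatus:
    """Stand-in for the parsing service; returns a status, never raises HTTP errors."""
    if not page_exists:
        return ParseStatus(error="missing-revision")
    return ParseStatus(value="<p>rendered</p>")


# --- "REST" layer: the only place that maps errors to HTTP status codes ---

class HttpError(Exception):
    def __init__(self, status_code: int, key: str):
        super().__init__(f"{status_code}: {key}")
        self.status_code = status_code
        self.key = key


# Assumed mapping, for illustration only.
ERROR_TO_HTTP = {"missing-revision": 404}


def handle_request(page_exists: bool) -> str:
    status = get_output(page_exists)
    if not status.ok:
        raise HttpError(ERROR_TO_HTTP.get(status.error, 500), status.error)
    return status.value
```

With this shape the dependency points one way only: the REST layer imports the parser layer, and the parser layer needs nothing from REST.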
daniel
1271faa381 Move access to the page bundle into ParsoidOutputAccess
This isolates ParsoidHTMLHelper from the internals of
ParsoidOutputAccess. The corresponding test cases were changed to use a
mock ParsoidOutputAccess, and to not test the behavior of
ParsoidOutputAccess.

Bug: T301371
Change-Id: Id693fae2264f15e5d35f28acc5adc4239b2ae24f
2022-06-28 11:49:36 +02:00
Derick Alangi
1854fb02d9 Storage: Warm parsoid parser cache with parsoid outputs
This patch introduces a ParsoidOutputAccess service for
getting Parsoid outputs and warms the cache with pregenerated
outputs.

It also introduces a config variable in ParsoidCacheConfig, turned off
by default, that controls the cache warming.

Bug: T301371
Change-Id: I6152c42ea765d94093d8d62598b1b4278314adec
2022-06-28 09:05:41 +00:00