* This lets post-cache transforms have access to the title.
* Specifically, DiscussionTools uses this to post-process the HTML.
Bug: T341010
Change-Id: I328f533e6cdb11c0c3a873d23bab1a113dfa39be
* Updated documentation around this point
* Adjust tests to reflect this change.
* While it initially appeared that this could affect the ParserCache,
'disableContentConversion' isn't part of the cache key, so the change
has no deployment impact.
Change-Id: I535cb21cc104a358aa70829b030ae3751b76ae00
Mock the needed services, or set fixed values to avoid DB lookups, when
possible. Add the test to the Database group otherwise, e.g. for things
like Skin and Parser that use global state all over the place.
Change-Id: I8d87013d89accaf04d0ac19cb4b7216290383eb5
* ParsoidParser hadn't registered a watcher on ParserOptions so far.
Because of this, you can see that the current parser cache key
(in deployed production code) doesn't have 'useParsoid=1' in it.
Ex: View source on enwiki:Hospet shows that the parser cache key
there is "enwiki:parsoid-pcache:idhash:2360619-0!canonical".
The only reason this doesn't conflict with legacy parser output
is that we use "parsoid-pcache", a different cache instance from the
"pcache" used for legacy parser output. But if/when we decide to use
the same parser cache instance, this could cause cache corruption.
With FlaggedRevisions, where a single "stable-pcache" parser cache
instance is used, in local testing, this was causing Parsoid HTML to be
saved without "useParsoid=1", and so Parsoid HTML was being returned
for legacy parser cache requests.
* In addition, fix the code in PageBundleParserOutputConverter to copy
over internal metadata (which includes used options). This ensures
that any tracked parser options aren't lost and the right parser cache
key is constructed later on.
* Added / updated a number of tests that verify that usedOptions
is tracked correctly in the useParsoid code paths. The tests fail
without the code changes in this patch.
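
To illustrate the failure mode described above, here is a minimal Python
sketch. It is not MediaWiki code: `cache_key` is a hypothetical helper and
the key format is simplified. It assumes only that the cache key varies on
the options that were recorded as used during the parse.

```python
def cache_key(page_id, options, used_options):
    """Build a key that varies only on the options recorded as used."""
    parts = [f"{name}={options[name]}" for name in sorted(used_options)]
    return "!".join([f"idhash:{page_id}-0", "canonical", *parts])


legacy_opts = {"useParsoid": 0}
parsoid_opts = {"useParsoid": 1}

# No watcher registered: 'useParsoid' is never recorded as used,
# so both parsers end up with the same key.
untracked_legacy = cache_key(2360619, legacy_opts, used_options=set())
untracked_parsoid = cache_key(2360619, parsoid_opts, used_options=set())

# Option recorded as used: the keys differ.
tracked_legacy = cache_key(2360619, legacy_opts, used_options={"useParsoid"})
tracked_parsoid = cache_key(2360619, parsoid_opts, used_options={"useParsoid"})
```

In this toy model the two untracked keys are identical, so a shared cache
instance could hand Parsoid HTML to a legacy request. Once the option is
tracked, the keys diverge.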
Bug: T340703
Bug: T335157
Needed-By: I0e954949768044eea6ec275a36d0d6d7ed457e8e
Change-Id: I076d5d362bdfd9d4b2ca8886bf6b30c1a746aee7
Initially used a new sniff with autofix (T333745),
but some providers are defined as non-static in the TestBase class
and need more work to make them static in a compatible way.
Bug: T332865
Change-Id: I889d33424f0c01fb26f2d86f8d4fc3de3e568843
This is an initial quick-and-dirty implementation. The
ParsoidParser class will eventually inherit from \Parser,
but this is an initial placeholder to unblock other Parsoid
read views work.
Currently Parsoid does not fully implement all the ParserOutput
metadata set by the legacy parser, but we're working on it.
This patch also addresses T300325 by ensuring the Page HTML
APIs use ParserOutput::getRawText(), which will return the entire
Parsoid HTML document without post-processing. This is what
the Parsoid team refers to as "edit mode" HTML. The
ParserOutput::getText() method returns only the <body> contents
of the HTML, and applies several transformations, including
inserting Table of Contents and style deduplication; this is
the "read views" flavor of the Parsoid HTML.
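
As a rough illustration of the two flavors, the following Python sketch
contrasts them. It is not the ParserOutput implementation: `get_raw_text`
and `get_text` are hypothetical stand-ins, and the Table of Contents
insertion and style deduplication steps are left out.

```python
import re

RAW_HTML = (
    "<!DOCTYPE html><html><head><title>Example</title></head>"
    "<body><p>Hello</p></body></html>"
)


def get_raw_text(document):
    """Return the full document untouched ("edit mode" HTML)."""
    return document


def get_text(document):
    """Return only the <body> contents ("read views" flavor).

    Further transformations applied by the real method are omitted.
    """
    match = re.search(
        r"<body[^>]*>(.*)</body>", document, re.DOTALL | re.IGNORECASE
    )
    return match.group(1) if match else document


raw = get_raw_text(RAW_HTML)   # whole document, including <head>
body = get_text(RAW_HTML)      # only what was inside <body>
```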
We need to be careful of the interaction of the `useParsoid` flag with
the ParserCacheMetadata. Effectively `useParsoid` should *always* be
marked as "used" or else the ParserCache will assume its value doesn't
matter and will serve legacy content for parsoid requests and
vice-versa. T330677 is a follow-up to address this more thoroughly by
splitting the parser cache in ParserOutputAccess; the stopgap in this
patch is fragile and, because it doesn't fork the ParserCacheMetadata
cache, may corrupt the ParserCacheMetadata in the case when Parsoid
and the legacy parser consult different sets of options to render a
page.
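
The metadata concern can be sketched with a toy two-tier cache in Python.
This is a simplified model under stated assumptions, not the real
ParserCache: the class, its methods, and the option sets are hypothetical,
and the real cache may merge or otherwise handle metadata differently. It
only shows what can go wrong when one shared metadata entry per page lists
the used options and two parsers record different sets.

```python
class SharedMetadataCache:
    """Toy cache: per-page metadata lists the used options, and the
    content key is derived from that list."""

    def __init__(self):
        self.metadata = {}  # page_id -> set of used option names
        self.content = {}   # derived key -> rendered output

    def _key(self, page_id, options, used):
        parts = [f"{name}={options[name]}" for name in sorted(used)]
        return "!".join([str(page_id), *parts])

    def save(self, page_id, options, used, output):
        # Assumption of this sketch: the last writer replaces the metadata.
        self.metadata[page_id] = set(used)
        self.content[self._key(page_id, options, used)] = output

    def get(self, page_id, options):
        used = self.metadata.get(page_id)
        if used is None:
            return None
        return self.content.get(self._key(page_id, options, used))


cache = SharedMetadataCache()
legacy = {"useParsoid": 0, "thumbsize": 2}
parsoid = {"useParsoid": 1, "thumbsize": 2}

# The legacy parser consults two options, Parsoid only one.
cache.save(1, legacy, {"useParsoid", "thumbsize"}, "legacy html")
cache.save(1, parsoid, {"useParsoid"}, "parsoid html")

legacy_hit = cache.get(1, legacy)    # key is rebuilt from Parsoid's metadata
parsoid_hit = cache.get(1, parsoid)
```

In this model the second save overwrites the shared metadata, so the
legacy lookup rebuilds a key that was never stored and misses, even though
its entry is still in the cache.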
Bug: T300191
Bug: T330677
Bug: T300325
Change-Id: Ica09a4284c00d7917f8b6249e946232b2fb38011