Commit graph

2 commits

Author SHA1 Message Date
Bartosz Dziewoński
6ba47296d9 Fix Phan suppressions related to Title::castFrom*() and friends
There is no way to express that Title::castFromPageIdentity(),
Title::castFromPageReference() and Title::castFromLinkTarget()
can only return null when the parameter is null. We need to add
Phan suppressions or explicit types almost everywhere that these
methods are used with parameters that are known to not be null.

Instead, introduce new methods Title::newFromPageIdentity() and
Title::newFromPageReference() (Title::newFromLinkTarget() already
exists), without the null-coalescing behavior, and use them when
the parameter is not null. This lets static analysis tools, and
humans, easily understand where nulls can't appear.

Do the same with the corresponding TitleFactory methods.

Change the obvious uses of castFrom*() to newFrom*() (if there is
a Phan suppression, a type check, or a method call on the result).

Change-Id: Ida4da75953cf3bca372a40dc88022443109ca0cb
2023-04-22 16:45:09 +02:00
C. Scott Ananian
cfd9c516e1 Allow setting a ParserOption to generate Parsoid HTML
This is an initial quick-and-dirty implementation.  The
ParsoidParser class will eventually inherit from \Parser,
but this is an initial placeholder to unblock other Parsoid
read views work.

Currently Parsoid does not fully implement all the ParserOutput
metadata set by the legacy parser, but we're working on it.

This patch also addresses T300325 by ensuring the the Page HTML
APIs use ParserOutput::getRawText(), which will return the entire
Parsoid HTML document without post-processing.  This is what
the Parsoid team refers to as "edit mode" HTML. The
ParserOutput::getText() method returns only the <body> contents
of the HTML, and applies several transformations, including
inserting Table of Contents and style deduplication; this is
the "read views" flavor of the Parsoid HTML.

We need to be careful of the interaction of the `useParsoid` flag with
the ParserCacheMetadata.  Effectively `useParsoid` should *always* be
marked as "used" or else the ParserCache will assume its value doesn't
matter and will serve legacy content for parsoid requests and
vice-versa.  T330677 is a follow up to address this more thoroughly by
splitting the parser cache in ParserOutputAccess; the stop gap in this
patch is fragile and, because it doesn't fork the ParserCacheMetadata
cache, may corrupt the ParserCacheMetadata in the case when Parsoid
and the legacy parser consult different sets of options to render a
page.

Bug: T300191
Bug: T330677
Bug: T300325
Change-Id: Ica09a4284c00d7917f8b6249e946232b2fb38011
2023-03-26 21:46:05 -04:00