Commit graph

3 commits

Author SHA1 Message Date
C. Scott Ananian
0955046ca5 Ensure that ToC is converted into the proper target language
This patch exports the necessary information from the Parser into the
ParserOutput to ensure that the Table of Contents can be properly
language-converted: both ensuring that the target language is correct
(in cases where it differs from the content language) and that various
conversion-suppression mechanisms are functional.  When the
ParserCache does not (yet) have the new properties from Parser, the
behavior is unchanged from before (the content language is used, and
its "preferred variant").

This is a follow-up to the "quick fix" deployed in
Ic14b3a49a8ee7ed600485d4f8a363a206035a847 to fix a UBN regression.

Parser tests have also been added to verify that ToC conversion
is correctly done (T299973).

Task T303329 has been opened to (eventually) rename the
'core:target-lang' and 'core:target-lang-variant' properties added to
the ParserOutput in this patch.

Bug: T303235
Bug: T295187
Bug: T299973
Followup-To: Ic14b3a49a8ee7ed600485d4f8a363a206035a847
Followup-To: Ib273f88531c340b561072ee9f616aa60725091e6
Change-Id: Ie0f1d7b6daffc8ff47228f6f086a257518f72717
2022-03-09 00:08:57 -05:00
C. Scott Ananian
7f849e965b Provide method to merge a ParserOutput into a ContentMetadataCollector
ContentMetadataCollector is a write-only interface defined by Parsoid
that performs the metadata collection functions of ParserOutput.  In
order to support asynchronous and out-of-order parses,
ContentMetadataCollector is write-only and merges of fragments are
defined to be independent of merge order.

This provides an initial implementation of ParserOutput::collectMetadata()
which transfers metadata from a ParserOutput to a ContentMetadataCollector.
It is intended that the flags and accumulators in ParserOutput will be
(incrementally) made more regular so that ::collectMetadata() grows
simpler over time.

An optional $strategy argument is added to ::appendExtensionData() and
::appendJsConfigVars() to allow future expansion of merge strategies,
although only `union` is supported for the moment.
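
As an illustration only (Python, not the PHP in this patch): a `union`
merge is order-independent because set union is commutative and
associative. Every name below (Collector, append, snapshot,
MERGE_STRATEGY_UNION) is a hypothetical stand-in, not the real
ParserOutput or ContentMetadataCollector API.

```python
from itertools import permutations

# Hypothetical stand-in for the merge-strategy constant described above.
MERGE_STRATEGY_UNION = "union"


class Collector:
    """Toy collector: callers may append values but only 'union' is supported."""

    def __init__(self):
        self._data = {}

    def append(self, key, value, strategy=MERGE_STRATEGY_UNION):
        if strategy != MERGE_STRATEGY_UNION:
            raise ValueError(f"unsupported merge strategy: {strategy}")
        # Union semantics: duplicates collapse, insertion order is irrelevant.
        self._data.setdefault(key, set()).add(value)

    def snapshot(self):
        # Exposed for this demonstration only; a write-only interface
        # would not offer a read path to its clients.
        return {k: frozenset(v) for k, v in self._data.items()}


def collect(fragments):
    """Merge every fragment into a fresh collector, in the order given."""
    collector = Collector()
    for fragment in fragments:
        for key, value in fragment:
            collector.append(key, value)
    return collector.snapshot()


fragments = [
    [("modules", "a"), ("modules", "b")],
    [("modules", "b"), ("styles", "x")],
    [("modules", "c")],
]

# Merge the same fragments in all six possible orders and compare results.
results = [collect(order) for order in permutations(fragments)]
all_equal = all(result == results[0] for result in results)
```

Here `all_equal` is true for every ordering of the fragments, which is
the property that asynchronous and out-of-order parses rely on.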

The MW_MERGE_STRATEGY_UNION constant will be upstreamed into Parsoid's
ContentMetadataCollector class as MERGE_STRATEGY_UNION; we've added a
prefix to ParserOutput's copy for now to avoid a conflict with the
constant which Parsoid will define.

Bug: T300979
Change-Id: I4e20b84eb590296fb3c011bb4d658d7a65082a11
2022-02-17 12:29:19 -05:00
C. Scott Ananian
06ab90f163 Add new ParserOutput::{get,set}OutputFlag() interface
This is a uniform mechanism to access a number of bespoke boolean
flags in ParserOutput.  It allows extensibility in core (by adding new
field names to ParserOutputFlags) without exposing new getter/setter
methods to Parsoid.  It replaces the ParserOutput::{get,set}Flag()
interface, which (a) doesn't allow access to certain flags, (b) is
typically called with a string rather than a constant, and (c) has a
very generic name.  (Note that Parser::setOutputFlag() already called
these "output flags".)
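
A rough sketch of the idea (Python, for illustration; not the PHP in
this patch): one getter/setter pair keyed by named constants, so adding
a flag means adding a constant rather than a new method. The flag names
and the choice to reject unknown names are assumptions of this sketch,
not taken from ParserOutputFlags.

```python
class OutputFlags:
    """Hypothetical flag names, invented for this sketch."""

    SHOW_SIDEBAR = "show-sidebar"
    NEEDS_REVIEW = "needs-review"
    ALL = frozenset({SHOW_SIDEBAR, NEEDS_REVIEW})


class Output:
    """Toy output object exposing a single uniform boolean-flag interface."""

    def __init__(self):
        # Representation detail hidden from callers: a set of enabled flags.
        self._flags = set()

    @staticmethod
    def _check(name):
        # Assumption of this sketch: unknown names are rejected, which
        # catches typos that free-form strings would let through.
        if name not in OutputFlags.ALL:
            raise ValueError(f"unknown output flag: {name!r}")

    def set_output_flag(self, name, value=True):
        self._check(name)
        if value:
            self._flags.add(name)
        else:
            self._flags.discard(name)

    def get_output_flag(self, name):
        self._check(name)
        return name in self._flags


out = Output()
out.set_output_flag(OutputFlags.SHOW_SIDEBAR)
sidebar = out.get_output_flag(OutputFlags.SHOW_SIDEBAR)
review = out.get_output_flag(OutputFlags.NEEDS_REVIEW)
```

Callers only ever pass constants, so the storage behind the flags can
change without touching them.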

In the future we might unify the representation so that we store
everything in $mFlags and don't have explicit properties in
ParserOutput, but those representation details should be invisible to
the clients of this API.  (We might also use a proper enumeration
for ParserOutputFlags, when PHP supports this.)

There is some overlap with ParserOutput::{get,set}ExtensionData(), but
I've left those methods as-is because (a) they allow for non-boolean
data, unlike the *Flag() methods, and (b) it seems worthwhile to
distinguish properties set by extensions from properties used by core.

Code search:
https://codesearch.wmcloud.org/search/?q=%5BOo%5Dut%28put%29%3F%28%5C%28%5C%29%29%3F-%3E%28g%7Cs%29etFlag%5C%28&i=nope&files=&excludeFiles=&repos=

Bug: T292868
Change-Id: I39bc58d207836df6f328c54be9e3330719cebbeb
2021-10-15 14:25:54 -04:00