This is the first patch of a series of patches to remove
ParserOutput::getText() calls from core. This series of patches should
be functionally equivalent to I2b4bcddb234f10fd8592570cb0496adf3271328e.
This first patch replaces the calls to getText to calls to
ParserOutput::getRawText when the pipeline has no reason to be executed.
Bug: T293512
Change-Id: I0ad53cd074ca9cf13e96e6b08179e13aeea180f4
And deprecated aliases for the the no namespaced classes.
ReplicatedBagOStuff that already is deprecated isn't moved.
Bug: T353458
Change-Id: Ie01962517e5b53e59b9721e9996d4f1ea95abb51
This is to make it clearer that they're related to converting serialized
content back into JSON, rather than stating that things are not
representable in JSON.
Change-Id: Ic440ac2d05b5ac238a1c0e4821d3f2d858bc3d76
I don't think these do anything with the documentation generators
we currently use. Especially not in tests. How are tests part of a
"package" when the code is not?
Note how most of these are simply identical to the namespace. They
are most probably auto-generated by some IDEs but don't actually
mean anything.
Change-Id: I771b5f2041a8e3b077865c79cbebddbe028543d1
This is a follow-up to: I0b683461212a357c7eb09ddec59c87539e323c65
and I40a8372a76f33c5f62ea73bb1180dd7c47412c89 which explicitly for
backward compatibility reasons supports IBufferingStatsdDataFactory.
Now that we've fully switched to StatsFactory together with the
`copyToStatsdAt()` method, we're fine to fully remove this `instanceof`
logic.
Bug: T356815
Change-Id: I164d82904b6d3fb575cb973c14f9454569bf09ac
ParserOutput::getText() is not a simple getter, but does
transformations on the "text" of the ParserOutput; the simple getter
is named ::getRawText().
To maintain consistency, rename ParserOutput::setText() to
::setRawText() and the property name ParserOutput::$mText to
::$mRawText so future readers are not confused.
The JSON property name as it appears in the serialized ParserCache
is left as 'Text' so that we don't have any forward- or backward-
rollback issues.
Change-Id: I3ef34814ab9473cc70d0a6806e8c5a4a02b73491
This avoids confusion with the "render timestamp" held by the cache,
and is consistent with ::get*RevisionId() etc.
The old ::getTimestamp() and ::setTimestamp() methods have been
deprecated.
Change-Id: Idb5e687709c98086c5d3075d31885c58a0723197
Set the render ID for each parse stored into cache so that we are able
to identify a specific parse when there are dependencies (for example
in an edit based on that parse). This is recorded as a property added
to the ParserOutput, not the parent CacheTime interface. Even though
the render ID is /related/ to the CacheTime interface, CacheTime is
also used directly as a parser cache key, and the UUID should not be
part of the lookup key.
In general we are trying to move the location where these cache
properties are set as early as possible, so we check at each location
to ensure we don't overwrite a previously-set value. Eventually we
can convert most of these checks into assertions that the cache
properties have already been set (T350538). The primary location for
setting cache properties is the ContentRenderer.
Moved setting the revision timestamp into ContentRenderer as well, as
it was set along the same code paths. An extra parameter was added to
ContentRenderer::getParserOutput() to support this.
Added merge code to ParserOutput::mergeInternalMetaDataFrom() which
should ensure that cache time, revision, timestamp, and render id are
all set properly when multiple slots are combined together in MCR.
In order to ensure the render ID is set on all codepaths we needed to
plumb the GlobalIdGenerator service into ContentRenderer, ParserCache,
ParserCacheFactory, and RevisionOutputCache. Eventually (T350538) it
should only be necessary in the ContentRenderer.
Bug: T350538
Bug: T349868
Followup-To: Ic9b7cc0fcf365e772b7d080d76a065e3fd585f80
Change-Id: I72c5e6f86b7f081ab5ce7a56f5365d2f75067a78
Pages that are fast to render can be omitted from the parser cache
to preserve disk space and cache write operations.
The threshold is configurable per namespace, so the tradeoff can
be evaluated based on different access patterns. For example, pages
that are accessed rarely, like file description pages on commons,
may have a high threshold configured, while pages that are read
frequently, like wikipedia articles, may be configured to be always
cached, using a 0 threshold.
Filtering is based on a time profile recorded in the ParserOutput.
A generic mechanism for capturing the timing profile is implemented
in the ContentHandler base class. Subclasses may implement a more
rigorous capture mechanism.
Bug: T346765
Change-Id: I38a6f3ef064f98f3ad6a7c60856b0248a94fe9ac
Mock the needed services, or set fixed values to avoid DB lookups, when
possible. Add the test to the Database group otherwise, e.g. for things
like Skin and Parser that use global state all over the place.
Change-Id: I8d87013d89accaf04d0ac19cb4b7216290383eb5
Just methods where adding "static" to the declaration was enough, I
didn't do anything with providers that used $this.
Initially by search and replace. There were many mistakes which I
found mostly by running the PHPStorm inspection which searches for
$this usage in a static method. Later I used the PHPStorm "make static"
action which avoids the more obvious mistakes.
Bug: T332865
Change-Id: I47ed6692945607dfa5c139d42edbd934fa4f3a36
Add a type annotation when encoding `stdClass` objects so that we can
be sure to decode them as objects instead of arrays.
This avoids issues such as that seen in the Graph extension (T312589)
where an extension data key is stored as a stdClass. If ParserOutput
was computed fresh, a subsequent getExtensionData(..) call will return
a stdClass object, but if the ParserOutput was cached, getExtensionData()
would return an array. After this change the return type is always
consistent.
Properly handle nested objects: encode all object values returned by
JsonSerializable::jsonSerialize() (so that client is not responsible
for implementing this correctly), and decode all object values *before*
calling JsonUnserializable::newFromJsonArray (again, so that the
client is not responsible for decoding its property values). The new
behavior matches how serialize/unserialize is handled in the 'naive'
JsonUnserializable{Sub,Super}Class test cases; ParserOutput (the only
users of JsonCodec in core) was doing an extra manual decode for
the ExtensionData array in ParserOutput::initFromJson that is no longer
necessary.
The GrowthExperiments and SemanticMediaWiki extensions were working
around the non-recursive nature of JsonCodec; this patch depends on
patches to GrowthExperiments to make it agnostic about whether object
unserialization occurs before or after ::newFromJsonArray() is called,
which can then be further cleaned up once this is released.
A pull request for SemanticMediaWiki has also been submitted.
Bug: T312589
Depends-On: I3413609251f056893d3921df23698aeed40754ed
Change-Id: Id7d0695af40b9801b42a9b82f41e46118da288dc
This is a quick find & replace of calls to the deprecated method
ParserOptions::newCanonical() when the context is the string literal
'canonical'. This can be safely replaced by called newFromAnon().
Change-Id: If7bb68459b11e0c5f5de188f10fdae85ad1a78bf
When JSON support was introduced into ParserCache in 1.36, it was
controlled by a feature flag, $wgParserCacheUseJson. The feature flag
was "born deprecated" in 1.36. It can now be removed.
This means that ParserCache will always store entries as JSON.
Support for reading old non-JSON entries remains intact.
This is needed when updating wikis from a version older than 1.36
to the current version.
Change-Id: Id04e42bfb458d98414bac50e0d6c505e8878e5c0
No need to manually reset MWTimestamp fake time,
MediaWikiTestCaseTrait::fakeTimestampTearDown() does
that after all tests anyway.
Change-Id: Ie45e837c693d7f21b6c7e25471cdaad6f60a3de9
CachedBagOStuff caches negatives, so it breaks PoolCounter.
We only need to cache metadata in-process, since it's commonly
used twice within the request.
Bug: T277829
Change-Id: I11a147c24b6cdb275b521b48802d6f3d0e1a4387
ParserOptions not updated cause they depend on Title::getLanguage
implementation.
Tests converted to not require a DB anymore. Can't be proper unit
tests yet due to globals in ParserOptions and fake time hacks,
but exec time does go down from 70 seconds to 9 seconds.
Page content model is still emitted in the metrics since
it was considered useful. Should be removed when we get
something like a page type concept.
Change-Id: Ib16fd0b5b87ffc3cb4d21f4aa43d1203cb7206d2
One major difference with what we've had before is that now we
actually write class names into the serialization - given that
this new mechanism is extencible, we can't establish any kind
of mapping of allowed classes. I do not think it's a problem
though.
Bug: T264394
Change-Id: Ia152f3b76b967aabde2d8a182e3aec7d3002e5ea
Without passing ALL_OK constant, json-encoding will \u-escape
all the unicode, which will blow the size of serialized data,
especially on Russian wiki out of proportion.
Bug: T263579
Change-Id: Ifaaf1cdfaeeb17c3a99ed742b64ae5cc3157500c
This adds JSON serialization and deserialization capabilities
to CacheTime and ParserOutput.
NOTE: JSON serialization is disabled for now. Merging this patch
should not change behavior in production.
Bug: T263579
Change-Id: I18187e8bce573d21f6f1bd29106e07c63a6d2f4d
This makes the parser cache resilient to encountering string values
where it is currently expecting to get a ParserOutput objerct from the
underlying cache.
This provides forward compatibility with a switch to JSON based caching:
If we have to switch back after writing JSON to the cache for a while,
ParserCache would simply ignore the respective entries, rather than
causing fatal errors.
Bug: T263579
Change-Id: Iaed582097ab2d05edb4b99a738ac39c530fd63c1