ParserCache::checkOutdated relies on ParserOutput::getCacheRevisionId() to determine
whether a revision is still current after loading it from the cache. If
the revision ID is 0 or null, this will result in false negatives, and
the revision will always be considered outdated.
It is better to detect and report this before writing the ParserOutput to the cache.
This also adds an assertion in DerivedPageDataUpdater that will trigger
an exception if we try to write to the parser cache before the revision
has been saved and the ID is known.
Change-Id: I242b769afbc7e1ae1e3f218d451f04945dfa8be4
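The guard described in the commit above (rejecting a missing revision ID
before the cache write) can be sketched as follows. This is an illustrative
Python sketch, not MediaWiki's PHP code: ParserOutput, save_to_cache and the
dict-backed cache are hypothetical stand-ins.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class ParserOutput:
    """Hypothetical stand-in for the object written to the parser cache."""
    text: str
    cache_revision_id: Optional[int] = None


def save_to_cache(cache: Dict[str, ParserOutput], key: str, output: ParserOutput) -> None:
    # A revision ID of 0 or None can never match the current revision when
    # the entry is read back, so the entry would always look outdated.
    # Report the problem before writing instead of storing a useless entry.
    if not output.cache_revision_id:
        raise ValueError("refusing to cache output without a revision ID")
    cache[key] = output
```

Failing at write time points at the caller that produced the bad output,
which is harder to trace once the entry is only seen on a later read.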
When JSON support was introduced into ParserCache in 1.36, it was
controlled by a feature flag, $wgParserCacheUseJson. The feature flag
was "born deprecated" in 1.36. It can now be removed.
This means that ParserCache will always store entries as JSON.
Support for reading old non-JSON entries remains intact.
This is needed when updating wikis from a version older than 1.36
to the current version.
Change-Id: Id04e42bfb458d98414bac50e0d6c505e8878e5c0
Make phan stricter about null types by setting null_casts_as_any_type to
false (the default in mediawiki-phan-config).
Remaining false positive issues are suppressed.
The suppression and the setting change can only be done together.
Bug: T242536
Bug: T301991
Change-Id: I0f295382b96fb3be8037a01c10487d9d591e7e01
A new option 'absoluteURLs' was added to the getText method
of the ParserOutput object. It replaces all links
in the page HTML with absolute URLs.
Removing the action=render special case from Title
seems safe because we will end up replacing the result
with an absolute URL if we're in a render action, no matter
where Title::getLocalUrl was called from.
This change is safely revertable from the perspective
of ParserCache.
Bug: T263581
Change-Id: Id660e1026192f40181587199d3418568f0fdb6d3
Per docs added in I18767cd809f67b, these don't need normalization
as they are only compared against predefined strings, and besides
are generally entered manually in a form, and even then would not
require the kinds of Unicode chars that have multiple/non-normalized
forms.
Also fix some trivial cases in nearby areas:
* getVal('title') obviously needs normalization.
Use getText() to make this more obvious.
* getVal() results compared against simple string literals within the code
obviously don't need normalization (e.g. printable === 'no').
* Change hot code in MediaWiki checking for whether 'diff' or 'oldid'
are set to getCheck (which uses getRawVal) instead of getVal.
As a bonus this means it now handles values like "0" correctly,
which could theoretically have caused bad behaviour before.
Change-Id: Ied721cfdf59c7ba11d1afa6f4cc59ede1381238e
Cache misses in metadata were miscounted as miss.unserialize.
Count them as miss.absent.metadata instead.
Change-Id: Idff062325a34445478a4543709a9f2b3cc365f60
CachedBagOStuff caches negatives, so it breaks PoolCounter.
We only need to cache metadata in-process, since it's commonly
used twice within the request.
Bug: T277829
Change-Id: I11a147c24b6cdb275b521b48802d6f3d0e1a4387
ParserOptions is not updated because it depends on the Title::getLanguage
implementation.
Tests converted to not require a DB anymore. Can't be proper unit
tests yet due to globals in ParserOptions and fake time hacks,
but exec time does go down from 70 seconds to 9 seconds.
Page content model is still emitted in the metrics since
it was considered useful. Should be removed when we get
something like a page type concept.
Change-Id: Ib16fd0b5b87ffc3cb4d21f4aa43d1203cb7206d2
The ParserOutput object wraps the revision ID and revision timestamp
of the parsed revision. Currently ParserCache sets these properties,
but that is not its job at all: whatever generates the ParserOutput
knows much better what revision it parsed. This also allows us to
simplify ParserCache and switch it to PageRecord more easily.
I've only removed setting the timestamp inside ParserCache
because it's a blocker for page record; I will do follow-ups
to remove the $revId parameter from ParserCache as well.
cacheRevisionId should also be renamed, but later.
Bug: T278284
Change-Id: I9a82e9fd154b29a81d1f7a3c4abb073c9a27314e
One major difference from what we've had before is that we now
actually write class names into the serialization. Given that
this new mechanism is extensible, we can't establish any kind
of mapping of allowed classes. I do not think it's a problem
though.
Bug: T264394
Change-Id: Ia152f3b76b967aabde2d8a182e3aec7d3002e5ea
Without passing the ALL_OK constant, JSON encoding will \u-escape
all non-ASCII characters, which blows up the size of the serialized data
out of proportion, especially on the Russian wiki.
Bug: T263579
Change-Id: Ifaaf1cdfaeeb17c3a99ed742b64ae5cc3157500c
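The size effect described in the commit above can be illustrated with a small
sketch. It uses Python's json module as a stand-in for the PHP encoder: the
ensure_ascii flag is only an analogue of the escaping behaviour discussed
here, and the sample string is an assumption chosen for illustration.

```python
import json

sample = "Привет"  # six Cyrillic characters

# Default behaviour: every non-ASCII character becomes a six-byte \uXXXX escape.
escaped = json.dumps(sample).encode("utf-8")

# Escaping disabled: each Cyrillic character is two bytes of UTF-8.
unescaped = json.dumps(sample, ensure_ascii=False).encode("utf-8")

escaped_size = len(escaped)
unescaped_size = len(unescaped)
```

For this sample the escaped form is 38 bytes and the unescaped form is 14
bytes, both including the two surrounding quotes, so text that is mostly
Cyrillic grows to nearly three times its size when escaped.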
This introduces $wgParserCacheUseJson for selectively enabling
JSON encoding in the parser cache. This is intended for testing only.
It should be removed before the release of 1.36.
Bug: T263579
Change-Id: I0d9cab3fafb984a3159e24f9e80f792429ff3c71
This adds JSON serialization and deserialization capabilities
to CacheTime and ParserOutput.
NOTE: JSON serialization is disabled for now. Merging this patch
should not change behavior in production.
Bug: T263579
Change-Id: I18187e8bce573d21f6f1bd29106e07c63a6d2f4d
This makes the parser cache resilient to encountering string values
where it currently expects to get a ParserOutput object from the
underlying cache.
This provides forward compatibility with a switch to JSON based caching:
If we have to switch back after writing JSON to the cache for a while,
ParserCache would simply ignore the respective entries, rather than
causing fatal errors.
Bug: T263579
Change-Id: Iaed582097ab2d05edb4b99a738ac39c530fd63c1
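The tolerant read described in the commit above can be sketched as follows.
This is an illustrative Python sketch, not the actual ParserCache code:
ParserOutput, get_from_cache and the dict-backed cache are hypothetical
stand-ins.

```python
from typing import Any, Dict, Optional


class ParserOutput:
    """Hypothetical stand-in for the object normally stored in the cache."""

    def __init__(self, text: str) -> None:
        self.text = text


def get_from_cache(cache: Dict[str, Any], key: str) -> Optional[ParserOutput]:
    value = cache.get(key)
    # Anything that is not the expected object, for example a JSON string
    # written by a newer version, is treated as a cache miss, not an error.
    if not isinstance(value, ParserOutput):
        return None
    return value
```

Treating an unexpected value as a miss means the entry is simply regenerated,
which is what makes rolling back after a format switch safe.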
These are causing thousands of errors from wmf.11-cached pages
since we rolled back to wmf.10.
Bug: T264257
Change-Id: Ia3357b2f593ca16fc12241d7ea22bbfd222f2536
(cherry picked from commit 71ee44aabba5c10187ad6d5cb26b5ef072cbf9b2)
Deprecated in 1.35. However, if you look closely,
the code emitting the deprecation warning was passing
the number 1.35 instead of the string '1.35', which caused
the deprecation function to throw an exception.
Thus, this code was not deprecated in 1.35, but
was accidentally broken. Instead of fixing the deprecation,
just remove the fallback.
Change-Id: I369f03d6b01053fc0396beb635c7b7d49bd249da
* Makes ParserCache take the root of the key
as a constructor argument
* Introduces a ParserCacheFactory
Next steps:
- convert FlaggedRevs to using this.
- cleanup
This assumes that we wouldn't want to differentiate
the parser cache settings per use-case, as it is now
for default vs flaggedrevs caches. There are only two settings:
$wgParserCacheType - name of the BagOStuff to use
$wgParserCacheExpireTime - the expiration time.
I think if we wanted to have different settings for different
caches, we could add that as a next step.
Bug: T263583
Change-Id: I188772da541a95c95a5ecece7c7dd748395506c2
This reverts commit a4dc6d82af.
I've reverted the merged patch since I didn't do enough testing
on serialized/reserialized ParserOutput and CacheTime. Now I'm
confident serialization/deserialization works.
Changes since original reverted version:
- Use __get/__set instead of DeprecationHelper in order to
avoid serializing the $deprecateProperties array.
- Add a test for old-format serialization with new-format deserialization.
Change-Id: Ic911c2724ad709931d3316e609781fb89b5b7b28
This reverts commit 799c10b7eb.
Reason for revert: Didn't test how this would work with deserializing stored ParserOutput.
Change-Id: I4221bc26282f3b4bd044f0ab50d00e77eb57ede0
* In preparation for ParserCache/Parsoid integration, it's nice to
do some cleanups. This will untie our hands a bit more.
* Verified no usages in extensions deployed at Wikimedia, other than
Flow, fixed in the dependent patch.
Change-Id: Idd78413a36887e2ff5c902d410e55691cafb736b
A terminating line break has not been required in wfDebug() since 2014,
however no migration was done. Some of these line breaks found their way
into LoggerInterface::debug() calls, where they mess up the formatting
of the debug log.
So, remove terminating line breaks from wfDebug() and
LoggerInterface::debug() calls.
Also:
* Fix the stripping of leading line breaks from the log header emitted
by Setup.php. This feature, accidentally broken in 2014, allows
requests to be distinguished in the log file.
* Avoid using the global variable $self.
* Move the logging of the client IP back to Setup.php. It was moved to
WebRequest in the hopes that it would not always be needed, however
$wgRequest->getIP() is now called unconditionally a few lines up in
Setup.php. This means that it is put in its proper place after the
"start request" message.
* Wrap the log header code in a closure so that variables like $name do
not leak into global scope.
* In Linker.php, remove a few instances of an unnecessary second
parameter to wfDebug().
Change-Id: I96651d3044a95b9d210b51cb8368edc76bebbb9e