While investigating T376433 I realized we were also generating metrics
for wikidatawiki for the various content models implemented by
Wikibase. Add the wiki ID and main content model to the metrics
labels so we can distinguish or filter these out.
While we're at it, fix the case of the `parsercache_` metrics prefix to match
the other `ParserCache_` metrics already being recorded.
Bug: T371713
Depends-On: I11386e307caaa9fce34870b08bd4dce4c5e6eb25
Change-Id: Iaf9d8cac1fe008f1441c46e5bc70e7d060358b27
Collect parse time statistics as a counter in order to determine both
the number of opportunities for selective update as well as the
proportion of cpu time spent on parses where selective update is
feasible.
Bug: T371713
Change-Id: I5b8c7ab48d5a1d6c1e311149fcac6abdc523aa13
Controlled by $wgParsoidSelectiveUpdateSampleRate (which defaults to off)
randomly sample 1 in N parses to collect statistics to inform the design
of Parsoid selective update:
* For both legacy parses and Parsoid, count how many times a previous
parse is in the cache when a new parse is requested. This needs to
sample the legacy parser as well as Parsoid because Parsoid is not
yet invoked from the RefreshLinksJob. We also count the relative
number of parses from the different
RevisionRenderer::getRenderedRevision() call sites to determine
which pathways might account for the most opportunities for
optimized selective update.
* For sampled parses using the Parsoid parser where a previous parse
result is available, also fetch the previous wikitext source from the
database.
Bug: T371713
Change-Id: I208aeac1b315a96bdb9669427cd03de461b914b4
This patch lays the groundwork for incremental/selective parsing in
Parsoid by ensuring that we can pass previous cached parses through
the parse pipeline to Parsoid. We do this by adding a new render
hint type, `previous-output`, and ensuring it is passed along.
Because revisions can contain a ParserOutput which is the combination
of separate ParserOutput objects for each of their slots, RenderedRevision
also contains a method to unsplit the combined ParserOutput to reconstruct
an original ParserOutput for use in incremental parsing. Currently this
is mostly a stub, but illustrates how slot combination and splitting can
work, assuming those transformations are reversible.
Extra calls to ParserCache::getDirty() are added to some code paths
in order to ensure that any previously-cached ParserOutput is available
for selective update. In order to mitigate any performance concerns,
these are only done for the Parsoid parser at the moment. Future
patches will add additional metrics to quantify the cost/benefit ratio
of the additional cache lookups on these paths.
Bug: T363421
Bug: T371713
Change-Id: I440884f1d7e09c1ff9806f848b7b53a636367690
Versions are changed in 8e940c4f21,
but that makes the version wrong
Follow-Up: I7f85d931d3b79da23e87b4e5692b2e14be8fcaa0
Change-Id: Iae43725b8e0fffc4d44bf57f6227334b41290bd9
PoolCounterWorkArticleView was not designed for use by all callers of
getParserOutput. It provides stampede protection but does not generally
prevent duplicate concurrent parsing, and it may result in stale cache
entries being returned to the caller. This is acceptable for page views,
but not other use cases like editing or updating secondary derived data.
Bug: T352837
Change-Id: Ie532c17e5b86e8e1adbb57ecd5c5c6405b83bf8f
Two of the classes in this directory have already namespaced to
MediaWiki\PoolCounter.
Bug: T353458
Change-Id: Ie41f8d935f7623bb40040a5eb78f99c6d7b7b75e
This class is used heavily basically everywhere, moving it to Utils
wouldn't make much sense. Also with this change, we can move
StatusValue to MediaWiki\Status as well.
Bug: T321882
Depends-On: I5f89ecf27ce1471a74f31c6018806461781213c3
Change-Id: I04c1dcf5129df437589149f0f3e284974d7c98fa
Follows similar commits to the objectcache, rdbms, profiler,
filerepo components and other areas [1].
* Remove duplicate descriptions from file blocks in favour of class
doc blocks. This reduces needless duplication and was often
incorrect or outdated, and helps (ironically) to make the file header
more consistently visually ignorable.
* Remove `ingroup` from file blocks in class files as otherwise
the file is indexed twice (e.g. in Doxygen) which makes navigation
more messy.
* Remove `throws` tag for an undescribed MWException that isn't
meant to be caught by callers.
[1] https://gerrit.wikimedia.org/r/q/message:ingroup+owner:Krinkle
Change-Id: I6cd0d2a4d3179668779813a97bb55142eadf8851
Small, non-functional changes to make the code more readable.
* No need to expose the subclassing in newPoolWorkArticleView(). All
the user needs to know is that PoolCounterWork::execute() can be
called.
* The doWork() method exists to be called from PoolCounterWork. Each
subclasses should do this independently from the others.
Another benefit is that we can have more strict type declarations.
Change-Id: I9418169e8937029f61d15ad54a1afeec0b343bb9
The base class should not need to know anything about caching.
The motivation for this patch is to loosen the strong dependencies
between these subclasses, to possibly turn them into proxies or
something with much looser coupling.
This patch doesn't change any behavior. The code is just moved to a
slightly different place, but executed in the same order.
Bug: T304813
Change-Id: Icd68538c85c193c3d17443154bfdf6d5bce7661c
All these methods have been written to return true, but that value was
never used for anything other than realizing that the method succeeded.
The ParserOutput object we are interested in was stuck in a property.
Why not return the ParserOutput object?
I wrapped it in a Status object to be able to pass warning messages
along with the actual result. There was even more specialized code to
do that via dedicated setters and getters. All this can be removed now.
Bug: T304813
Change-Id: I6fd3745835dfcaec94695469498a2662f8317c35
There is only a single subclass that ever does anything with these
two boolean properties. Only 3 states are possible. Pretty much all
of this belongs to the subclass. No other code should have to know
anything about this.
This patch doesn't fully solve the issue but moves code in the
described direction.
Bug: T304813
Change-Id: I70754546f065b03ff04a73307c10f22fbb040810
By analogy with slow-parse.log. Also, I fixed the log message so that it
has the full title in it.
Change-Id: Icaeb6f002c5c2a676467d4c760f99cb2676ad73b
Extracts a specialized subclass for rendering the current revision
from PoolWorlArticleView, which then no longer knowes about caching.
In the next step, we will add a subclass that implements caching for old
revisions.
Bug: T267832
Change-Id: I56fb365962951e6c723a01cf9243dbc0094b5581
PoolWorkArticleView needs some cleanup before we can make it
cache output for old revisiosn (T244058).
This patch does doe following:
* apply dependency injection
* remove backwards compatibility code for legacy constructor calls
* mark PoolWorkArticleView as @internal (unused in extensions)
* remove audience check (to be done by caller)
* no longer set $wgUseFileCache to false.
For $wgUseFileCache, it seems like this has had no effect for a long
time. It would be set to false only on a cache miss during a page view.
But the file cache is only updated via HtmlCacheUpdater on edit and
purge.
Bug: T244058
Change-Id: Ief467562af0aa2f88ff7b42469d0273d2a1dcf7a
Encapsulate logic for getting rendered page content, for any revision,
with caching and pooling hidden away.
Introducing such a service object will also give us a leverage point for
supporting output transformations. Output transformations are currently
implemented partially in ParserOutput, partially in Parser, and partially
duplicated in Parsoid.
Bug: T267234
Change-Id: I566d7a7936633823ba68b5aecbc8c2d88949b4f8
Currently, it just shows the time and the title, and
unless someone looks for which channel the message
came in there is no context for the text
Change-Id: Ib1edee250d044a28b1ed4c950da3a1ae85c44d06
If PoolCounter acquisition would block and a stale ParserCache entry is
available, deliver it immediately rather than waiting for the lock. This
should avoid PoolCounter contention on heavily edited pages.
* Add a fastStale pool option to toggle the feature. False by default
but I'll set the default to true in a followup commit.
* Add a $timeout parameter to PoolCounter::acquireForMe() and
acquireForAnyone(). This requires a simultaneous update to the
PoolCounter extension.
* In the Redis implementation, use the requested timeout for blPop()
but use the configured timeout for data structure cleanup and item
expiry.
* Add a boolean $fast parameter to fallback() which tells the subclass
whether it is being called in the fast or slow mode. No extensions
in CodeSearch extend PoolCounterWork directly so this should not
cause a fatal.
* Pass through the $fast parameter in PoolCounterWorkViaCallback
* In PoolWorkArticleView, use the $fast flag to decide whether to check
the ChronologyProtector touched timestamp.
* Add $wgCdnMaxageStale by analogy with $wgCdnMaxageLagged, which
controls the CC:s-maxage when sending a stale ParserOutput.
* Fix the documented type of the timeout. It really should be a float,
but locks.c will treat non-integers as zero.
A simultaneous update to the PoolCounter extension is required.
Bug: T250248
Change-Id: I1f410cd5d83588e584b6d27d2e106465f0fad23e
A terminating line break has not been required in wfDebug() since 2014,
however no migration was done. Some of these line breaks found their way
into LoggerInterface::debug() calls, where they mess up the formatting
of the debug log.
So, remove terminating line breaks from wfDebug() and
LoggerInterface::debug() calls.
Also:
* Fix the stripping of leading line breaks from the log header emitted
by Setup.php. This feature, accidentally broken in 2014, allows
requests to be distinguished in the log file.
* Avoid using the global variable $self.
* Move the logging of the client IP back to Setup.php. It was moved to
WebRequest in the hopes that it would not always be needed, however
$wgRequest->getIP() is now called unconditionally a few lines up in
Setup.php. This means that it is put in its proper place after the
"start request" message.
* Wrap the log header code in a closure so that variables like $name do
not leak into global scope.
* In Linker.php, remove a few instances of an unnecessary second
parameter to wfDebug().
Change-Id: I96651d3044a95b9d210b51cb8368edc76bebbb9e
This global function was deprecated in 1.30 and is replaced with
the use of `ObjectCache::getLocalClusterInstance()->makeKey()`.
Change-Id: Ic08b53111be4374a973e08c2ed68224bfa922fa8
During development a lot of classes were placed in MediaWiki\Storage\.
The precedent set would mean that every class relating to something
stored in a database table, plus all related value classes and such,
would go into that namespace.
Let's put them into MediaWiki\Revision\ instead. Then future classes
related to the 'page' table can go into MediaWiki\Page\, future classes
related to the 'user' table can go into MediaWiki\User\, and so on.
Note I didn't move DerivedPageDataUpdater, PageUpdateException,
PageUpdater, or RevisionSlotsUpdate in this patch. If these are kept
long-term, they probably belong in MediaWiki\Page\ or MediaWiki\Edit\
instead.
Bug: T204158
Change-Id: I16bea8927566a3c73c07e4f4afb3537e05aa04a5
The old behavior was that the audience was RAW if the revision object
parameter got passed in, otherwise PUBLIC. This was undocumented and
not used outside core; this patch gets rid of it in favor of an
explicit argument.
Bug: T205578
Change-Id: Ic7cdb38f658f6d85c48ff13c7f84c64a45c9b1ee
It's unreasonable to expect newbies to know that "bug 12345" means "Task T14345"
except where it doesn't, so let's just standardise on the real numbers.
Change-Id: I6f59febaf8fc96e80f8cfc11f4356283f461142a
To better indicate that these are only triggered by page views.
We don't currently have any slow-parse logging for the parser
invocation that happens during save (which means we're potentially
missing lots of them).
Once we add that, this will help distinguish them.
Bug: T110760
Change-Id: I22be5684ef93efd410d683637e223f770d6c768c
This way the time and title values are transmitted via Monolog
as separate JSON properties.
Keep the message the same for backwards-compatibility (except
for the space padding).
Change-Id: I0b79944bb9944dc6d09d16fe2ecc845e0e0e2afb
* Use special prioritized refreshLinksJobs instead, which triggers when
transcluded pages are changed
* Also added a triggerOpportunisticLinksUpdate() method to handle
dynamic transcludes
bug: T89389
Change-Id: Iea952d4d2e660b7957eafb5f73fc87fab347dbe7