51 commits

Author SHA1 Message Date
C. Scott Ananian
b41f95e3a3 Use MetricsInterface::setLabels() for parsercache_selective_* stats
Depends-On: Ifb51bca3b8762e97e349d2868e42789494f262cb
Change-Id: Ie915a2f5debf74c66c91ff256f3b1632bd078435
2024-10-16 17:01:20 -04:00
James D. Forrester
a5387c7c20 Namespace all remaining classes in includes/parser
Bug: T353458
Change-Id: If02cc9b1ff78e26c1cf8c91ee4695845eb133829
2024-10-15 23:54:32 +01:00
C. Scott Ananian
06d6d20b6e Parsoid selective update metrics: add labels for wiki id and content model
While investigating T376433 I realized we were also generating metrics
for wikidatawiki for the various content models implemented by
Wikibase.  Add the wiki ID and main content model to the metrics
labels so we can distinguish or filter these out.

While we're at it, fix the case of the `parsercache_` metrics prefix to match
the other `ParserCache_` metrics already being recorded.

Bug: T371713
Depends-On: I11386e307caaa9fce34870b08bd4dce4c5e6eb25
Change-Id: Iaf9d8cac1fe008f1441c46e5bc70e7d060358b27
2024-10-08 18:01:40 -04:00
C. Scott Ananian
9b260caeb5 stats: collect timing information for parsercache_selective_* sample
Collect parse time statistics as a counter in order to determine both
the number of opportunities for selective update as well as the
proportion of cpu time spent on parses where selective update is
feasible.

Bug: T371713
Change-Id: I5b8c7ab48d5a1d6c1e311149fcac6abdc523aa13
2024-09-19 15:01:09 -04:00
C. Scott Ananian
65ecdc0eea Fix names of parsercache_selective_* stats
Rename to use a unit type as a suffix to match the guidance in
 https://www.mediawiki.org/wiki/Manual:Stats#Metrics

Change-Id: Ied4c1c3a1ab7fa6148d10a7fc89094c46f568453
2024-09-19 14:41:19 -04:00
C. Scott Ananian
92ca7f68a4 Randomly sample statistics for Parsoid Selective Update
Controlled by $wgParsoidSelectiveUpdateSampleRate (which defaults to off)
randomly sample 1 in N parses to collect statistics to inform the design
of Parsoid selective update:

* For both legacy parses and Parsoid, count how many times a previous
  parse is in the cache when a new parse is requested.  This needs to
  sample the legacy parser as well as Parsoid because Parsoid is not
  yet invoked from the RefreshLinksJob.  We also count the relative
  number of parses from the different
  RevisionRenderer::getRenderedRevision() call sites to determine
  which pathways might account for the most opportunities for
  optimized selective update.

* For sampled parses using the Parsoid parser where a previous parse
  result is available, also fetch the previous wikitext source from the
  database.

Bug: T371713
Change-Id: I208aeac1b315a96bdb9669427cd03de461b914b4
2024-09-13 19:29:18 -04:00
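
The 1-in-N sampling described in the commit above can be sketched roughly as follows. This is a minimal illustration in Python, not the MediaWiki code: the function names are hypothetical, and treating a rate of 0 as "off" is an assumption based on the note that the setting defaults to off.

```python
import random


def should_sample(sample_rate: int, rng: random.Random) -> bool:
    """Return True for roughly 1 in `sample_rate` calls.

    Assumption: a rate of 0 (or less) disables sampling entirely.
    """
    if sample_rate <= 0:
        return False
    # randrange(n) is uniform over 0..n-1, so it hits 0 with probability 1/n.
    return rng.randrange(sample_rate) == 0


def count_samples(parses: int, sample_rate: int, seed: int = 0) -> int:
    """Count how many of `parses` simulated parses would be sampled."""
    rng = random.Random(seed)
    return sum(should_sample(sample_rate, rng) for _ in range(parses))
```

With a rate of 1 every parse is sampled, and with a rate of 100 about one percent of parses would contribute statistics.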
C. Scott Ananian
a0503debb0 Provide previous parse results to parser when rendering
This patch lays the groundwork for incremental/selective parsing in
Parsoid by ensuring that we can pass previous cached parses through
the parse pipeline to Parsoid.  We do this by adding a new render
hint type, `previous-output`, and ensuring it is passed along.

Because revisions can contain a ParserOutput which is the combination
of separate ParserOutput objects for each of their slots, RenderedRevision
also contains a method to unsplit the combined ParserOutput to reconstruct
an original ParserOutput for use in incremental parsing.  Currently this
is mostly a stub, but illustrates how slot combination and splitting can
work, assuming those transformations are reversible.

Extra calls to ParserCache::getDirty() are added to some code paths
in order to ensure that any previously-cached ParserOutput is available
for selective update.  In order to mitigate any performance concerns,
these are only done for the Parsoid parser at the moment.  Future
patches will add additional metrics to quantify the cost/benefit ratio
of the additional cache lookups on these paths.

Bug: T363421
Bug: T371713
Change-Id: I440884f1d7e09c1ff9806f848b7b53a636367690
2024-08-23 17:41:55 -04:00
Umherirrender
1951aea6b8 Fix various version mentions for class_alias
Versions were changed in 8e940c4f21,
but that made the versions wrong

Follow-Up: I7f85d931d3b79da23e87b4e5692b2e14be8fcaa0
Change-Id: Iae43725b8e0fffc4d44bf57f6227334b41290bd9
2024-07-05 18:39:49 +02:00
James D. Forrester
8e940c4f21 Standardise all our class alias deprecation comments for ease of grepping
Change-Id: I7f85d931d3b79da23e87b4e5692b2e14be8fcaa0
2024-03-19 20:11:29 +00:00
daniel
d3a9abe50b ParserOutputAccess: only use PoolCounter if the caller asks for it.
PoolCounterWorkArticleView was not designed for use by all callers of
getParserOutput. It provides stampede protection but does not generally
prevent duplicate concurrent parsing, and it may result in stale cache
entries being returned to the caller. This is acceptable for page views,
but not other use cases like editing or updating secondary derived data.

Bug: T352837
Change-Id: Ie532c17e5b86e8e1adbb57ecd5c5c6405b83bf8f
2024-02-29 10:49:41 +01:00
Amir Sarabadani
b7c5d05813 PoolCounter: Namespace classes
Two of the classes in this directory have already been namespaced to
MediaWiki\PoolCounter.

Bug: T353458
Change-Id: Ie41f8d935f7623bb40040a5eb78f99c6d7b7b75e
2023-12-20 09:32:19 +00:00
Subramanya Sastry
edb403f62f PoolWorkArticleView: Remove unsatisfiable check
* RevisionRecord::RAW suppresses audience checks
* So, we cannot get a null return value here.

Change-Id: I8714caf8fb9a0f665ba5fdedb50e898658d18b96
2023-09-29 15:24:47 -05:00
Subramanya Sastry
82f4b92ea0 PoolWorkArticleView: Separate slow-parsoid and slow-parse logs
* Avoids polluting slow-parse logs with Parsoid slow parses.

Change-Id: I4fd47a787e09c3bda1956bf7360131a8d7e372eb
2023-09-12 14:51:26 -05:00
Amir Sarabadani
f4e68e055f Reorg: Move Status to MediaWiki\Status\
This class is used heavily basically everywhere, so moving it to Utils
wouldn't make much sense. Also with this change, we can move
StatusValue to MediaWiki\Status as well.

Bug: T321882
Depends-On: I5f89ecf27ce1471a74f31c6018806461781213c3
Change-Id: I04c1dcf5129df437589149f0f3e284974d7c98fa
2023-08-25 15:44:17 +02:00
Timo Tijhof
fa7bb033df poolcounter: Clean up file headers and @ingroup
Follows similar commits to the objectcache, rdbms, profiler,
filerepo components and other areas [1].

* Remove duplicate descriptions from file blocks in favour of class
  doc blocks. This reduces needless duplication and was often
  incorrect or outdated, and helps (ironically) to make the file header
  more consistently visually ignorable.

* Remove `ingroup` from file blocks in class files as otherwise
  the file is indexed twice (e.g. in Doxygen) which makes navigation
  more messy.

* Remove `throws` tag for an undescribed MWException that isn't
  meant to be caught by callers.

[1] https://gerrit.wikimedia.org/r/q/message:ingroup+owner:Krinkle

Change-Id: I6cd0d2a4d3179668779813a97bb55142eadf8851
2022-09-29 19:45:09 +00:00
Thiemo Kreuz
76646313cb poolcounter: Avoid calling parent::doWork in PoolWorkArticleView classes
Small, non-functional changes to make the code more readable.
* No need to expose the subclassing in newPoolWorkArticleView(). All
  the user needs to know is that PoolCounterWork::execute() can be
  called.
* The doWork() method exists to be called from PoolCounterWork. Each
  subclass should do this independently from the others.

Another benefit is that we can have more strict type declarations.

Change-Id: I9418169e8937029f61d15ad54a1afeec0b343bb9
2022-05-13 20:36:25 +00:00
Thiemo Kreuz
d90a86e02b Inline trivial getter in PoolWorkArticleView
The goal is to reduce the complexity of the classes.

Change-Id: I12cca74dd86774f94061ef1eb36df78ba9c554eb
2022-04-26 16:18:30 +00:00
Thiemo Kreuz
236efcdc7b Untangle dependencies between PoolWorkArticleView subclasses
The base class should not need to know anything about caching.

The motivation for this patch is to loosen the strong dependencies
between these subclasses, to possibly turn them into proxies or
something with much looser coupling.

This patch doesn't change any behavior. The code is just moved to a
slightly different place, but executed in the same order.

Bug: T304813
Change-Id: Icd68538c85c193c3d17443154bfdf6d5bce7661c
2022-04-11 15:50:57 +02:00
Thiemo Kreuz
c5fdb1c8ba Change ParserOutputAccess workers to work with Status objects
All these methods have been written to return true, but that value was
never used for anything other than realizing that the method succeeded.
The ParserOutput object we are interested in was stuck in a property.
Why not return the ParserOutput object?

I wrapped it in a Status object to be able to pass warning messages
along with the actual result. There was even more specialized code to
do that via dedicated setters and getters. All this can be removed now.

Bug: T304813
Change-Id: I6fd3745835dfcaec94695469498a2662f8317c35
2022-04-08 15:47:59 +02:00
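
The idea in the commit above, returning the result wrapped in a status object that can also carry warnings, can be sketched as follows. This is an illustrative Python analogy only: the `Status` class, the `render` function, and the message keys are hypothetical and do not reproduce MediaWiki's Status API.

```python
from dataclasses import dataclass, field
from typing import Generic, List, Optional, TypeVar

T = TypeVar("T")


@dataclass
class Status(Generic[T]):
    """Hypothetical result wrapper: a value plus warnings and errors."""
    value: Optional[T] = None
    warnings: List[str] = field(default_factory=list)
    errors: List[str] = field(default_factory=list)

    def is_ok(self) -> bool:
        # Warnings do not make the result unusable; errors do.
        return not self.errors


def render(text: Optional[str], cache_is_stale: bool = False) -> Status[str]:
    """Return the rendered output itself instead of a bare True."""
    if text is None:
        return Status(errors=["missing-revision"])
    status = Status(value=f"<p>{text}</p>")
    if cache_is_stale:
        # The warning travels with the result; no separate setter is needed.
        status.warnings.append("stale-output")
    return status
```

A caller reads `status.value` when `status.is_ok()` holds and can surface `status.warnings` alongside it, which removes the need for a property that holds the output between calls.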
Thiemo Kreuz
dfbf5830b2 Move "dirty" logic to PoolWorkArticleView subclass that uses it
There is only a single subclass that ever does anything with these
two boolean properties. Only 3 states are possible. Pretty much all
of this belongs to the subclass. No other code should have to know
anything about this.

This patch doesn't fully solve the issue but moves code in the
described direction.

Bug: T304813
Change-Id: I70754546f065b03ff04a73307c10f22fbb040810
2022-04-08 13:47:36 +02:00
Tim Starling
4f41e2addd Add slow-parsoid log channel
By analogy with slow-parse.log. Also, I fixed the log message so that it
has the full title in it.

Change-Id: Icaeb6f002c5c2a676467d4c760f99cb2676ad73b
2021-09-15 15:48:11 +10:00
daniel
195bc9715d PoolWorkArticleView: inject logger
Bug: T267832
Change-Id: I7f4763d0e812d076188bb1a4ca2c333f50dffbee
2020-11-18 17:08:37 +01:00
daniel
ed41864370 Extract PoolWorkArticleViewCurrent
Extracts a specialized subclass for rendering the current revision
from PoolWorkArticleView, which then no longer knows about caching.

In the next step, we will add a subclass that implements caching for old
revisions.

Bug: T267832
Change-Id: I56fb365962951e6c723a01cf9243dbc0094b5581
2020-11-17 20:17:02 +01:00
daniel
175d548e61 Clean up PoolWorkArticleView
PoolWorkArticleView needs some cleanup before we can make it
cache output for old revisions (T244058).

This patch does the following:
* apply dependency injection
* remove backwards compatibility code for legacy constructor calls
* mark PoolWorkArticleView as @internal (unused in extensions)
* remove audience check (to be done by caller)
* no longer set $wgUseFileCache to false.

For $wgUseFileCache, it seems like this has had no effect for a long
time. It would be set to false only on a cache miss during a page view.
But the file cache is only updated via HtmlCacheUpdater on edit and
purge.

Bug: T244058
Change-Id: Ief467562af0aa2f88ff7b42469d0273d2a1dcf7a
2020-11-16 12:06:20 +01:00
daniel
67d0986211 Introduce ParserOutputAccess
Encapsulate logic for getting rendered page content, for any revision,
with caching and pooling hidden away.

Introducing such a service object will also give us a leverage point for
supporting output transformations. Output transformations are currently
implemented partially in ParserOutput, partially in Parser, and partially
duplicated in Parsoid.

Bug: T267234
Change-Id: I566d7a7936633823ba68b5aecbc8c2d88949b4f8
2020-11-10 15:12:12 +01:00
Umherirrender
d621adbcb6 build: Updating mediawiki/mediawiki-codesniffer to 32.0.0
Exclude failing sniffs, to be fixed in follow-ups
Includes some simple fixes; most are autofixes

Change-Id: I5bb4743f08618bb6226bc2a4cc7f4d73a7ad142d
2020-10-28 20:06:22 +00:00
Petr Pchelko
13574e8404 Deprecate ParserCache::getKey and replace it with getMetadata
Bug: T263689
Change-Id: I4a71e5a7eb1c25cd53b857c115883cd00160736b
2020-10-13 08:31:22 -07:00
DannyS712
551e2d50f1 Add message text to slow-parse log entries
Currently, an entry shows only the time and the title, so
unless someone checks which channel the message
came in on, there is no context for the text

Change-Id: Ib1edee250d044a28b1ed4c950da3a1ae85c44d06
2020-10-13 06:44:37 +00:00
Tim Starling
7f710a514a Fast stale ParserCache responses
If PoolCounter acquisition would block and a stale ParserCache entry is
available, deliver it immediately rather than waiting for the lock. This
should avoid PoolCounter contention on heavily edited pages.

* Add a fastStale pool option to toggle the feature. False by default
  but I'll set the default to true in a followup commit.
* Add a $timeout parameter to PoolCounter::acquireForMe() and
  acquireForAnyone(). This requires a simultaneous update to the
  PoolCounter extension.
* In the Redis implementation, use the requested timeout for blPop()
  but use the configured timeout for data structure cleanup and item
  expiry.
* Add a boolean $fast parameter to fallback() which tells the subclass
  whether it is being called in the fast or slow mode. No extensions
  in CodeSearch extend PoolCounterWork directly so this should not
  cause a fatal.
* Pass through the $fast parameter in PoolCounterWorkViaCallback
* In PoolWorkArticleView, use the $fast flag to decide whether to check
  the ChronologyProtector touched timestamp.
* Add $wgCdnMaxageStale by analogy with $wgCdnMaxageLagged, which
  controls the CC:s-maxage when sending a stale ParserOutput.
* Fix the documented type of the timeout. It really should be a float,
  but locks.c will treat non-integers as zero.

A simultaneous update to the PoolCounter extension is required.

Bug: T250248
Change-Id: I1f410cd5d83588e584b6d27d2e106465f0fad23e
2020-06-05 16:24:22 +10:00
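
The fast-stale behaviour described in the commit above can be sketched as follows. This is a simplified Python analogy that uses a local `threading.Lock` in place of PoolCounter; the function name, its parameters, and the return shape are assumptions made for illustration, not the MediaWiki implementation.

```python
import threading
from typing import Callable, Optional, Tuple


def render_with_fast_stale(
    lock: threading.Lock,
    stale: Optional[str],
    do_work: Callable[[], str],
    fast_stale: bool = True,
    timeout: float = 1.0,
) -> Tuple[str, bool]:
    """Return (output, is_stale)."""
    # Fast mode: try to take the lock without blocking.
    if lock.acquire(blocking=False):
        try:
            return do_work(), False
        finally:
            lock.release()
    # Acquisition would block: deliver the stale entry immediately if allowed.
    if fast_stale and stale is not None:
        return stale, True
    # Slow mode: wait for the lock, up to the requested timeout.
    if lock.acquire(timeout=timeout):
        try:
            return do_work(), False
        finally:
            lock.release()
    # Timed out: fall back to stale output if there is any.
    if stale is not None:
        return stale, True
    raise TimeoutError("lock not acquired and no stale output available")
```

When the lock is free the caller always gets fresh output. The stale entry is only used when another worker already holds the lock, which is the contention case on heavily edited pages that the commit targets.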
Tim Starling
47a1619027 Remove terminating line breaks from debug messages
A terminating line break has not been required in wfDebug() since 2014,
however no migration was done. Some of these line breaks found their way
into LoggerInterface::debug() calls, where they mess up the formatting
of the debug log.

So, remove terminating line breaks from wfDebug() and
LoggerInterface::debug() calls.

Also:
* Fix the stripping of leading line breaks from the log header emitted
  by Setup.php. This feature, accidentally broken in 2014, allows
  requests to be distinguished in the log file.
* Avoid using the global variable $self.
* Move the logging of the client IP back to Setup.php. It was moved to
  WebRequest in the hopes that it would not always be needed, however
  $wgRequest->getIP() is now called unconditionally a few lines up in
  Setup.php. This means that it is put in its proper place after the
  "start request" message.
* Wrap the log header code in a closure so that variables like $name do
  not leak into global scope.
* In Linker.php, remove a few instances of an unnecessary second
  parameter to wfDebug().

Change-Id: I96651d3044a95b9d210b51cb8368edc76bebbb9e
2020-06-03 12:01:16 +10:00
DannyS712
4721717527 Replace uses and hard deprecate Article:: and WikiPage::getRevision
Bug: T250532
Bug: T239975
Change-Id: Ic8f2baa0ac805d5196a7107bdc7a1abb36eba139
2020-04-20 23:06:48 +00:00
Derick Alangi
1b9ea4d1bf Avoid/Replace usage of deprecated wfMemcKey() function
This global function was deprecated in 1.30 and is replaced with
the use of `ObjectCache::getLocalClusterInstance()->makeKey()`.

Change-Id: Ic08b53111be4374a973e08c2ed68224bfa922fa8
2019-05-02 14:27:31 +00:00
Brad Jorsch
dff469a408 Re-namespace RevisionStore and RevisionRecord classes
During development a lot of classes were placed in MediaWiki\Storage\.
The precedent set would mean that every class relating to something
stored in a database table, plus all related value classes and such,
would go into that namespace.

Let's put them into MediaWiki\Revision\ instead. Then future classes
related to the 'page' table can go into MediaWiki\Page\, future classes
related to the 'user' table can go into MediaWiki\User\, and so on.

Note I didn't move DerivedPageDataUpdater, PageUpdateException,
PageUpdater, or RevisionSlotsUpdate in this patch. If these are kept
long-term, they probably belong in MediaWiki\Page\ or MediaWiki\Edit\
instead.

Bug: T204158
Change-Id: I16bea8927566a3c73c07e4f4afb3537e05aa04a5
2018-10-09 10:22:48 -04:00
Gergő Tisza
5174fa8364 Add audience parameter to PoolWorkArticleView
The old behavior was that the audience was RAW if the revision object
parameter got passed in, otherwise PUBLIC. This was undocumented and
not used outside core; this patch gets rid of it in favor of an
explicit argument.

Bug: T205578
Change-Id: Ic7cdb38f658f6d85c48ff13c7f84c64a45c9b1ee
2018-10-02 03:53:28 -07:00
Gergő Tisza
6e8d39c6e7 Add constant for the name of the 'main' slot for MCR
Bug: T202142
Change-Id: I97a74e5a029b014f3c2195188936d5c8233c1b7f
2018-09-24 16:52:12 -07:00
daniel
4835a75ec5 Use RevisionRenderer for rendering ParserOutput
Bug: T174035
Bug: T174036
Change-Id: I1085b05d635dd954c143c8a398fae909632ba0a9
2018-09-11 15:25:39 +00:00
Umherirrender
130ec2523d Fix PhanTypeMismatchDeclaredParam
Auto fix MediaWiki.Commenting.FunctionComment.DefaultNullTypeParam sniff

Change-Id: I865323fd0295aabd06f3e3c75e0e5043fb31069e
2018-07-07 00:34:30 +00:00
Thiemo Mättig
7d0d9815d3 poolcounter: Fix type hint for PoolWorkArticleView::getParserOutput
Change-Id: Ib6a71e198481cf2a0230b3f8721c019ef3c7288c
2018-01-24 18:41:52 +00:00
Kunal Mehta
1fd095ec1c Avoid using the deprecated ParserCache::singleton()
Change-Id: I0da6d9cbfad26c89bf5dab564071ef97acaf44f9
2017-09-09 14:20:10 -07:00
James D. Forrester
9635dda73a includes: Replace implicit Bugzilla bug numbers with Phab ones
It's unreasonable to expect newbies to know that "bug 12345" means "Task T14345"
except where it doesn't, so let's just standardise on the real numbers.

Change-Id: I6f59febaf8fc96e80f8cfc11f4356283f461142a
2017-02-21 18:13:24 +00:00
Kunal Mehta
77b2cf831b Include numerical namespace in slow-parse.log
This makes it easier to sort by namespace rather than trying to parse it
out of the title.

Change-Id: I946cb00548bcb69bd2be98c15a9f1e02e546fa24
2016-12-09 23:18:45 +00:00
Aaron Schulz
58bae669bc Clean up PoolWorkArticleView type hints and fix IDEA errors
All callers pass a WikiPage here already.

Change-Id: I6a17bf52fb2547729c6a1fa40704f1c9efe28b12
2016-05-03 01:47:36 -07:00
Kunal Mehta
6e9b4f0e9c Convert all array() syntax to []
Per wikitech-l consensus:
 https://lists.wikimedia.org/pipermail/wikitech-l/2016-February/084821.html

Notes:
* Disabled CallTimePassByReference due to false positives (T127163)

Change-Id: I2c8ce713ce6600a0bb7bf67537c87044c7a45c4b
2016-02-17 01:33:00 -08:00
Timo Tijhof
c45305bcd3 poolcounter: Add 'trigger' field to the slow-parse log
To better indicate that these are only triggered by page views.

We don't currently have any slow-parse logging for the parser
invocation that happens during save (which means we're potentially
missing lots of them).

Once we add that, this will help distinguish them.

Bug: T110760
Change-Id: I22be5684ef93efd410d683637e223f770d6c768c
2015-10-23 20:53:37 +00:00
Timo Tijhof
8c74b8a3e8 poolcounter: Convert slow-parse to LoggerFactory with data context
This way the time and title values are transmitted via Monolog
as separate JSON properties.

Keep the message the same for backwards-compatibility (except
for the space padding).

Change-Id: I0b79944bb9944dc6d09d16fe2ecc845e0e0e2afb
2015-08-26 20:41:13 +02:00
jenkins-bot
4e004124cd Merge "Removed obsolete "containsOldMagic" code"
2015-03-04 06:02:04 +00:00
Aaron Schulz
df5ef8b5d7 Removed doCascadeProtectionUpdates method to avoid DB writes on page views
* Use special prioritized refreshLinksJobs instead, which triggers when
  transcluded pages are changed
* Also added a triggerOpportunisticLinksUpdate() method to handle
  dynamic transcludes

Bug: T89389
Change-Id: Iea952d4d2e660b7957eafb5f73fc87fab347dbe7
2015-02-22 13:36:13 -08:00
Aaron Schulz
4111ff0dc3 Removed obsolete "containsOldMagic" code
Change-Id: Id225347e0599a6f79b30b0793cce7d97daed46f2
2015-02-15 14:41:49 -08:00
Chad Horohoe
e4ff67e0db Handle missing parser cache keys better in pool counter
Change-Id: I493fd1ee5e9ab6c3a49a7f478460cbfe54393ca0
2014-11-19 11:22:48 -08:00
umherirrender
21e0c1c533 Correct variable names in @param to match method declarations
Some @param have a typo in the variable name,
some @param's were in wrong order.

Change-Id: Ie25806831027112b398f6f4a909c59147ac3a5fa
2014-08-13 21:48:28 +02:00