55 commits

Author SHA1 Message Date
Bartosz Dziewoński
1321082c6e Use real type hints for services etc. in includes/page/
Mostly used find-and-replace:

Find:
/\*[\*\s]+@var (I?[A-Z](\w+)(?:Interface)?)[\s\*]+/\s*(private|protected|public) (\$[a-z]\w+;\n)((?=\s*/\*[\*\s]+@var (I?[A-Z](\w+)(?:Interface)?))\n|)
Replace with:
\3 \1 \4

More could be done, but to keep this patch reasonably sized, I only
changed the most obvious and unambiguously correct cases.

In some cases, I also removed redundant doc comments on the
constructor, and re-ordered the properties to match the constructor.

Change-Id: I7eb97640c0543ae10bf2431623a5f7efdc3349b7
2024-06-11 19:37:28 +02:00
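
To see what the find-and-replace above does, here is a minimal sketch that applies the same pattern and replacement with Python's `re` module. The PHP input is a made-up example rather than code from the patch, and the author's editor may use a different regex engine than Python does.

```python
import re

# Pattern and replacement copied from the commit message above.
PATTERN = re.compile(
    r"/\*[\*\s]+@var (I?[A-Z](\w+)(?:Interface)?)[\s\*]+/\s*"
    r"(private|protected|public) (\$[a-z]\w+;\n)"
    r"((?=\s*/\*[\*\s]+@var (I?[A-Z](\w+)(?:Interface)?))\n|)"
)
REPLACEMENT = r"\3 \1 \4"

# Hypothetical PHP source, invented for illustration only.
PHP_BEFORE = (
    "class Example {\n"
    "\t/** @var LoggerInterface */\n"
    "\tprivate $logger;\n"
    "\n"
    "\t/**\n"
    "\t * @var ParserCache\n"
    "\t */\n"
    "\tprivate $parserCache;\n"
    "\n"
    "\t/** @var string */\n"
    "\tprivate $name;\n"
    "}\n"
)


def add_type_hints(php_source: str) -> str:
    """Turn '@var ClassName' doc comments into typed property declarations."""
    return PATTERN.sub(REPLACEMENT, php_source)


if __name__ == "__main__":
    print(add_type_hints(PHP_BEFORE))
```

In this sketch the two class-typed properties become `private LoggerInterface $logger;` and `private ParserCache $parserCache;`, and the blank line between them is dropped. The `@var string` property is left alone, because the pattern only matches type names that start with a capital letter.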
Arlo Breault
adbcb24d40 Move parsoid version cache invalidation to lower level
Bug: T333606
Change-Id: I0af5d769507ec47c3bd34b697830a247f83cf818
2024-04-03 21:50:16 -04:00
daniel
36b0c8a048 REST: HTML endpoints should support all content models
This allows HtmlOutputRendererHelper to function for all kinds of
content.

Bug: T311728
Bug: T311648
Bug: T359426
Change-Id: Ib32af7cf2a7ad989eb0b13ecca37c857fc9199ec
2024-03-13 04:32:59 -05:00
daniel
d3a9abe50b ParserOutputAccess: only use PoolCounter if the caller asks for it.
PoolCounterWorkArticleView was not designed for use by all callers of
getParserOutput. It provides stampede protection but does not generally
prevent duplicate concurrent parsing, and it may result in stale cache
entries being returned to the caller. This is acceptable for page views,
but not other use cases like editing or updating secondary derived data.

Bug: T352837
Change-Id: Ie532c17e5b86e8e1adbb57ecd5c5c6405b83bf8f
2024-02-29 10:49:41 +01:00
Amir Sarabadani
b7c5d05813 PoolCounter: Namespace classes
Two of the classes in this directory have already been namespaced to
MediaWiki\PoolCounter.

Bug: T353458
Change-Id: Ie41f8d935f7623bb40040a5eb78f99c6d7b7b75e
2023-12-20 09:32:19 +00:00
James D. Forrester
9bfb75ff90 Namespace ParserOutput
Most used non-namespaced class!

Bug: T353458
Change-Id: I4c2cbb0a808b3881a4d6ca489eee5d8c8ebf26cf
2023-12-14 14:57:34 -05:00
Bartosz Dziewoński
f74f51f658 ParserOutputAccess: Change local cache from array to MapCacheLRU
Daniel made me do it.

Change-Id: I710b0341e240650d42110bc73a11acd0aa712397
2023-11-08 17:27:21 +01:00
Bartosz Dziewoński
8e0ab86cf8 ParserOutputAccess: Limit local cache size
Entries were never removed from this cache, causing memory leaks in
long-running scripts.

Bug: T315510
Change-Id: I5131f2ca19db9981218273bde0dd0f975157ccdb
2023-11-08 01:09:23 +01:00
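
The idea of bounding a process-local cache can be sketched with a small least-recently-used map. This is a generic Python illustration, not the implementation of MediaWiki's MapCacheLRU. The class name `LRUMap` and its methods are hypothetical.

```python
from collections import OrderedDict
from typing import Any, Hashable


class LRUMap:
    """Hypothetical bounded map that evicts the least recently used key."""

    def __init__(self, max_keys: int) -> None:
        if max_keys < 1:
            raise ValueError("max_keys must be at least 1")
        self._max_keys = max_keys
        self._items: "OrderedDict[Hashable, Any]" = OrderedDict()

    def get(self, key: Hashable, default: Any = None) -> Any:
        if key not in self._items:
            return default
        # Reading a key marks it as most recently used.
        self._items.move_to_end(key)
        return self._items[key]

    def set(self, key: Hashable, value: Any) -> None:
        self._items[key] = value
        self._items.move_to_end(key)
        # Evict from the least recently used end until within bounds.
        while len(self._items) > self._max_keys:
            self._items.popitem(last=False)

    def __len__(self) -> int:
        return len(self._items)
```

With a plain array, a long-running script adds an entry for every page it touches and never frees any. With a bounded map like this one, memory use stays proportional to `max_keys`.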
Subramanya Sastry
7514bf3921 Revert "Hacks to avoid cold cache misses after ParsoidOutputAccess changes"
* This reverts commit c1b82097.
* This reverts commit 56025174.
* This updates a test change from commit c8d0470f.

* Now that ParsoidOutputAccess has become a thin wrapper over
  ParserOutputAccess and the code has landed in production without
  needing to be reverted, we can revert the above hacks as soon as the
  hits from the 'parsoid' instance start to go down to a small number.
  At the time of creating this patch, of the combined hits to the
  'parsoid' and 'parsoid_pcache' instance, over 90% are now from the
  'parsoid_pcache' instance. We can wait for a couple more days to
  watch how this number changes.

* Note that once we deploy this patch, the accesses which would have
  hit in the 'parsoid' instance (with this hack) will instead result
  in a cache miss thus adding the full parse latency to REST API
  requests (whether by VisualEditor or by other clients). So, we need
  to figure out what the cutoff point is. While 3 weeks is a guaranteed
  switchover timeframe (because all entries in 'parsoid' cache will
  expire at that time and we'll get no more hits from there after that),
  note that we are at < 10% hits in this cache just 4 days after the
  train rollout. So, there is a good chance we could get beyond 95%
  by the end of this week.

Bug: T347632
Change-Id: Ibd741b92b860b4d4b03ca220863debaf53fab44a
2023-10-24 20:08:23 +00:00
Bartosz Dziewoński
6727d3cedf ParserOutputAccess: Fix local cache when page is edited within the process
When the latest revision of a page was requested from
ParserOutputAccess, then the page was edited within the same
PHP process, and then the new latest revision of a page was
requested, it would return the result for the old latest
revision. The $isOld checks did not prevent this, because at
the time of checking, the two revisions were both the latest.

Compare the latest revision ID (not just whether it's the
latest revision) to avoid this.

Bug: T349033
Change-Id: I386b4a9b791065fb39dcfb2cb6df9f321d540ae1
2023-10-17 14:31:38 +02:00
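
A minimal sketch of the fix described above, under the assumption that the local cache keeps one entry per page. The class `LocalOutputCache` and its method names are hypothetical and do not mirror the PHP code.

```python
from typing import Dict, Optional, Tuple


class LocalOutputCache:
    """Hypothetical per-process cache holding one rendered output per page."""

    def __init__(self) -> None:
        # page_id -> (revision ID the output was rendered from, output)
        self._entries: Dict[int, Tuple[int, str]] = {}

    def set(self, page_id: int, rev_id: int, output: str) -> None:
        self._entries[page_id] = (rev_id, output)

    def get(self, page_id: int, latest_rev_id: int) -> Optional[str]:
        entry = self._entries.get(page_id)
        if entry is None:
            return None
        cached_rev_id, output = entry
        # Compare revision IDs rather than an "is latest" flag: after an
        # edit in the same process, the cached revision was the latest when
        # stored, but its ID no longer matches the page's latest revision.
        return output if cached_rev_id == latest_rev_id else None
```

After an edit moves the latest revision from 10 to 11, `get(page_id, 11)` returns `None` instead of the output rendered from revision 10.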
C. Scott Ananian
c1b82097e4 ParserOutputAccess: fix the wrapper div fetched from the fallback cache
Followup-To: I7f933fd61bf358c6ea0e0c1202231cac618f9e8d
Change-Id: I502adc85cd2a75160f090b8979213b5d7563aefd
2023-10-06 16:49:52 -04:00
Subramanya Sastry
56025174a2 Hacks to avoid cold cache misses after ParsoidOutputAccess changes
* ParsoidOutputAccess used a 'parsoid' ParserCache instance and did not
  set the 'useParsoid' parser option for tier 2 ParserOutput cache key
  computations.

* ParserOutputAccess uses 'pcache' for legacy parser output and
  'parsoid-pcache' for Parsoid parser output objects based on whether
  'useParsoid' parser option is true or false.

* 'parsoid-pcache' is right now very sparsely populated since useParsoid
  is only used for testing.

* In Ic9b7cc0fcf36, where we make ParsoidOutputAccess a thin wrapper
  over ParserOutputAccess, all Parsoid parser output requests will go
  to ParserOutputAccess's 'parsoid-pcache' instance which is sparsely
  populated and hence will result in a lot of cold cache misses.

* To eliminate this scenario, this patch adds hardcoded hacks to both
  ParserOutputAccess and ParserCache to query the 'parsoid' PC instance
  on cache misses to the 'parsoid-pcache' instance. Over a 3-week
  period, as 'parsoid-pcache' fills up, there will be fewer and fewer
  accesses to the 'parsoid' PC instance which will also expire. At the
  end of that period, we can remove this hack.

  T347632 tracks removal of these hacks.

* Added a new PHP unit test verifying that the hacks work as intended.

Bug: T332931
Change-Id: I7f933fd61bf358c6ea0e0c1202231cac618f9e8d
2023-09-30 07:20:52 +00:00
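
The fallback lookup described above can be sketched as follows. Plain dictionaries stand in for the 'parsoid-pcache' and 'parsoid' cache instances, and `get_with_fallback` is a hypothetical name. Whether the real hack also copies fallback hits into the new cache is not stated in the message, so this sketch does not do so.

```python
from typing import Dict, Optional, Tuple


def get_with_fallback(
    key: str,
    primary: Dict[str, str],
    fallback: Dict[str, str],
) -> Tuple[Optional[str], str]:
    """Look up key in the new cache first, then in the legacy cache.

    Returns the value (or None) and a label saying where it came from,
    which is the kind of signal needed to watch the fallback hit rate drop.
    """
    value = primary.get(key)
    if value is not None:
        return value, "primary"
    value = fallback.get(key)
    if value is not None:
        return value, "fallback"
    return None, "miss"
```

Once the share of "fallback" results is small enough, the second lookup can be removed, which is what the later revert does.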
James D. Forrester
b16be7a36c Namespace TitleFormatter under \MediaWiki\Title
One of the big ones, so doing this alone.

Bug: T166010
Change-Id: Ic2d59eb6764b1a273ed7162ecabf641f638b8f66
2023-09-19 05:17:18 +00:00
Subramanya Sastry
8dbbbc790f s/NO_CACHE/OPT_NO_CACHE/g in ParserOutputAccess and tests
Change-Id: I9b853646b34bf212bab8602796f4258727827c50
2023-09-10 20:42:54 -05:00
Amir Sarabadani
70dcaba317 rdbms: Inject CP instead of relying on LBF
This patch makes two major changes:
 - In the PoolCounter chain, we simply inject CP and call it directly
   and as a result, there is no need for ILBF::getChronologyProtectorTouched
 - Instead of injecting CP callback to LB, just pass the object down the
   chain which leads to simpler and more stable code.

Bug: T275713
Change-Id: If78f4498d98e256015e54cc46561cb11b2947058
2023-09-04 12:29:05 +02:00
Amir Sarabadani
f4e68e055f Reorg: Move Status to MediaWiki\Status\
This class is used heavily basically everywhere, so moving it to Utils
wouldn't make much sense. Also with this change, we can move
StatusValue to MediaWiki\Status as well.

Bug: T321882
Depends-On: I5f89ecf27ce1471a74f31c6018806461781213c3
Change-Id: I04c1dcf5129df437589149f0f3e284974d7c98fa
2023-08-25 15:44:17 +02:00
Umherirrender
3838f7b9eb page: Reduce creation of primary cache in ParserOutputAccess
It is not always needed in newPoolWorkArticleView() when rendering oldids.

Change-Id: If6953d1cadf84f96cd95eac2cec4e866d853d2f9
2023-08-07 18:16:01 +02:00
David Causse
ce511406e0 ParserCache: add an option to explicitly trigger links update
Triggering an opportunistic LinksUpdate on every cache miss of the
current revision might not be appropriate in some cases.
Some functions like ContentHandler::getParserOutputForIndexing might
be called after all LinksUpdates but if these functions do explicitly
disallow populating the parser cache via OPT_NO_UPDATE_CACHE we might
enter a case where involved jobs would trigger themselves forever.

It is happening in the case of the CirrusSearch extension that listens
to LinksUpdate and is relying on
ContentHandler::getParserOutputForIndexing to fetch the parser output.

Introduce a new option ParserOutputAccess::OPT_LINKS_UPDATE to be
more intentional on whether such cascading LinksUpdate might occur
or not on cache misses.

Change the default to not trigger a LinksUpdate on every cache miss
and enable it only when rendering the article view (Article::view).
It does not seem ideal that this behavior is owned by the ParserCache
and further refactoring might be needed to separate these concerns.

Bug: T329842
Change-Id: Ib3c3ca935f316ea880ff6c6b393fa80166e42bd3
2023-05-16 11:32:55 +02:00
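
The opt-in behaviour described above can be sketched with bit flags. The flag names follow the options mentioned in this log, but the numeric values and the function `should_trigger_links_update` are assumptions made for illustration, not the real PHP constants or logic.

```python
from enum import IntFlag


class Opt(IntFlag):
    # Hypothetical bit values; the real constants may differ.
    NO_CACHE = 1
    NO_UPDATE_CACHE = 2
    LINKS_UPDATE = 4


def should_trigger_links_update(
    options: Opt, cache_hit: bool, is_latest: bool
) -> bool:
    """Only a cache miss on the current revision can trigger the update,
    and only when the caller explicitly asked for it."""
    return (not cache_hit) and is_latest and bool(options & Opt.LINKS_UPDATE)
```

A caller such as an indexing job would pass `Opt.NO_UPDATE_CACHE` alone and never trigger a LinksUpdate. An article view would add `Opt.LINKS_UPDATE` to opt in.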
C. Scott Ananian
2caf69797c ParserOutputAccess: Fork primary and secondary caches for parsoid
Uses a flag to detect which cache instance to use based on ParserOptions
and sets the primary and secondary caches accordingly.  This ensures
that the ParserCacheMetadata cache used by the ParserCache is also
appropriately forked for Parsoid, as Parsoid may consult different
options in the ParserCache than core does.

A follow up patch will attempt to refactor this to be less
parsoid-specific.

Bug: T327769
Bug: T330677
Co-authored-by: Alangi Derick <alangiderick@gmail.com>
Change-Id: Id580b97ad9a0b90bbe56d4de3c2f999274fe329b
2023-03-26 21:46:07 -04:00
Brian Wolff
024915bbcd Separate RevisionOutputCache::makeParserOutputKey from no revid case
If the revision id is null, the item should not be cachable.
However, you still need a poolcounter work key for such cases,
and this method was used for both. This is confusing and seems
dangerous, so split into two methods, one where a revision id
is required, and another one where it is optional that should
only be used when it is not for a cache key.

This also fixes a warning about null revId on php 8.1 in tests.

Bug: T313663
Change-Id: Id685caeecf21d058bfd8446d9b5e21f0f11e0177
2022-09-09 18:24:27 -07:00
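
The split described above can be sketched as two functions, one that refuses a missing revision ID and one that tolerates it. The function names and the key format here are hypothetical and do not reproduce the keys RevisionOutputCache builds.

```python
from typing import Optional


def make_cache_key(page_id: int, rev_id: Optional[int], options_hash: str) -> str:
    """Key for storing output. A revision ID is mandatory, because output
    without one must never be cached."""
    if rev_id is None:
        raise ValueError("cannot build a cache key without a revision ID")
    return f"cache:{page_id}:{rev_id}:{options_hash}"


def make_work_key(page_id: int, rev_id: Optional[int], options_hash: str) -> str:
    """Key for coordinating concurrent work only. The revision ID is
    optional, and the result must not be used as a cache key."""
    rev_part = "none" if rev_id is None else str(rev_id)
    return f"work:{page_id}:{rev_part}:{options_hash}"
```

Keeping the two apart means a missing revision ID fails loudly on the caching path instead of silently producing a shared key.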
Thiemo Kreuz
76646313cb poolcounter: Avoid calling parent::doWork in PoolWorkArticleView classes
Small, non-functional changes to make the code more readable.
* No need to expose the subclassing in newPoolWorkArticleView(). All
  the user needs to know is that PoolCounterWork::execute() can be
  called.
* The doWork() method exists to be called from PoolCounterWork. Each
  subclass should do this independently from the others.

Another benefit is that we can have more strict type declarations.

Change-Id: I9418169e8937029f61d15ad54a1afeec0b343bb9
2022-05-13 20:36:25 +00:00
Timo Tijhof
e3659dfef9 page: Improve class documentation briefs
* Indicate whether a class is a service (to be found via MediaWikiServices)
  or a lower-level class for certain backend logic.

* Indicate how to create / where to get instances of non-service classes,
  e.g. point to the relevant service.

* Remove copy-pasta text in file docblock that is unrelated,
  and incorporate any relevant text into the class docblock instead.

Change-Id: Ia3b9b8c22da4d7160c5e14ae6a6a7c9dca30e9db
2022-04-12 00:49:41 +00:00
Thiemo Kreuz
c5fdb1c8ba Change ParserOutputAccess workers to work with Status objects
All these methods have been written to return true, but that value was
never used for anything other than realizing that the method succeeded.
The ParserOutput object we are interested in was stuck in a property.
Why not return the ParserOutput object?

I wrapped it in a Status object to be able to pass warning messages
along with the actual result. There was even more specialized code to
do that via dedicated setters and getters. All this can be removed now.

Bug: T304813
Change-Id: I6fd3745835dfcaec94695469498a2662f8317c35
2022-04-08 15:47:59 +02:00
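
The pattern of returning a result wrapped together with its warnings, instead of returning `true` and stashing the result in a property, can be sketched like this. The `Status` class below is a simplified, hypothetical stand-in for MediaWiki's Status class, and `render` is an invented worker function.

```python
from dataclasses import dataclass, field
from typing import Generic, List, Optional, TypeVar

T = TypeVar("T")


@dataclass
class Status(Generic[T]):
    """Hypothetical result wrapper: a value plus warnings and errors."""

    value: Optional[T] = None
    warnings: List[str] = field(default_factory=list)
    errors: List[str] = field(default_factory=list)

    def is_ok(self) -> bool:
        return not self.errors


def render(text: str) -> "Status[str]":
    """Invented worker: returns its output inside a Status."""
    status: "Status[str]" = Status()
    if not text:
        status.errors.append("empty input")
        return status
    status.value = f"<p>{text}</p>"
    if len(text) > 20:
        # A warning travels with the result instead of via extra setters.
        status.warnings.append("long input")
    return status
```

The caller reads the output and any warnings from the one returned object, so dedicated getters and setters for them are no longer needed.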
Thiemo Kreuz
dfbf5830b2 Move "dirty" logic to PoolWorkArticleView subclass that uses it
There is only a single subclass that ever does anything with these
two boolean properties. Only 3 states are possible. Pretty much all
of this belongs to the subclass. No other code should have to know
anything about this.

This patch doesn't fully solve the issue but moves code in the
described direction.

Bug: T304813
Change-Id: I70754546f065b03ff04a73307c10f22fbb040810
2022-04-08 13:47:36 +02:00
Adam Wight
f21b52bec3 Remove test-only method
Usually we opt to break access control in a test, rather than expose
internals in production classes.

Change-Id: I7e393d2569e8784e2c8eb7ed29d60aab58b9bd83
2022-04-06 10:01:52 +02:00
jenkins-bot
b0a823afbd Merge "Clarify the return type of ContentHandler::getParserOutputForIndexing()" 2022-04-05 20:12:56 +00:00
David Causse
2eca69ca63 Clarify the return type of ContentHandler::getParserOutputForIndexing()
It may be null in some cases.

Bug: T305169
Change-Id: I00bf78e6d46392244cbf95344f782ffe3c55dbb6
2022-04-05 10:38:48 +02:00
Thiemo Kreuz
a166140e59 Re-arrange status/result building logic in ParserOutputAccess
The idea is to group code that belongs together:
1. The status is created and fix-ups applied.
   * The result is stored.
   * Fix an invalid status that claims to be ok but doesn't have a
     result.
   * Note we might have a non-null result at this point that's
     marked as erroneous.
2. Some additional things that are only done when it was a success.
3. Logging/stats stuff is something entirely separate.

Change-Id: I541eff1a64d113c48223c4ce76d79ae5fe08a018
2022-04-04 20:38:18 +02:00
Thiemo Kreuz
6604df6919 Don't cache old revisions in ParserOutputAccess' local cache
There is a chance a revision is given, but it's the latest one. This
will be cached, but never used because of the check in line #217.

Change-Id: Ic74a16f6647672d7619ce0d736ef931721362f67
2022-04-04 20:32:26 +02:00
jenkins-bot
b69dbbb125 Merge "Make old vs. latest revision more obvious in ParserOutputAccess" 2022-04-04 12:13:46 +00:00
jenkins-bot
45beffe07a Merge "De-obfuscate stats related code in ParserOutputAccess" 2022-04-04 12:03:12 +00:00
Thiemo Kreuz
2661070e18 Make old vs. latest revision more obvious in ParserOutputAccess
The same check is done in multiple places, just slightly different
every time, but always boiling down to the same idea: The latest
revision is requested either when it's specifically requested or
missing. Otherwise it's an old one.

The only purpose of this patch is to make the code easier to read.
I made sure there is zero functional change.

Bug: T304813
Change-Id: I0a5c84eb137dfbbefb1ef57eaf8711971b991911
2022-04-04 11:05:55 +00:00
Thiemo Kreuz
09f6995230 De-obfuscate stats related code in ParserOutputAccess
Remove some single-use variable names. This reduces mental load. No
need to check if a variable might be used for something else later.

The code duplication is minimal, only single lines. This makes it
very easy to parse mentally.

The reason why I touch this code in the first place will become more
obvious in the following patches.

Change-Id: I73fc5d45034ae6ce93b198b956f783f95900d155
2022-04-04 12:21:32 +02:00
Adam Wight
0f5bfd9e00 cleanup: drop unused parameter
This function hasn't used $parserOptions since I0fe275b4991f1bf8.

Change-Id: Idd57a431b6211a9e169cce8212d5e5989be69a09
2022-04-04 12:03:11 +02:00
Amir Sarabadani
a087f79319 ParserOutputAccess: Allow calling getPO with option of not saving in PC
This is needed to make sure CirrusSearch doesn't overwhelm parsercache.

Follows-up I23c053df4c (T302620).

Bug: T285993
Change-Id: Ia5fc3b063c45cb43fdee16f44da2270847773945
2022-04-01 14:02:07 +00:00
Matěj Suchánek
7f6323e355 Fix switch-case syntax in ParserOutputAccess
The equality check is unnecessary, the switch-case construct
does it. The code should work the same:
When $useCache is equal to CACHE_SECONDARY, the expression
evaluates to true and so does the loose comparison done by
the switch.
Otherwise it evaluates to false and the case is false, too.

Follow-Up: I0fe275b4991f1bf89c7bb587132bc4fb0ea862e2
Change-Id: I2ded074d4e31d5770c02b4d5ac4acc58b8542ad0
2022-03-14 20:08:44 +00:00
Amir Sarabadani
b6e2a124fb ParserOutputAccess: Check for latest revision when checking for cache
This is similar to the rest of the class, caching if revision is not
specified or it's the latest.

This helps in cases where a revision is passed to getParserOutput(),
as FlaggedRevs does.

Bug: T283029
Change-Id: Ia5c5c112a033944689259c2d2839faf4a8bd90e0
2022-02-17 18:08:03 +01:00
Amir Sarabadani
206f6cbd50 ParserOutputAccess: Add process cache within the service class
This guards against duplicate parses.

These happen when a page is parsed but an extension needs the
ParserOutput again in the same request when it hasn't made it into the
ParserCache yet, or if it is considered uncachable. In that case we
still want to allow re-use within the same process.

Bug: T301310
Change-Id: I1ddd967a40b760b1e53f1fd227cb0d0732f78ec1
2022-02-16 12:18:21 +00:00
daniel
c839155dc0 Remove @unstable tag from ParserOutputAccess
ParserOutputAccess was introduced in 1.36, time to call it stable in
1.38.

Change-Id: Idf8928b87e841bc8837df2e56b09841fc104e25f
2021-11-24 09:42:38 +00:00
Petr Pchelko
cd66d7c335 Convert ParserOutputAccess to PageRecord.
Still needs to downcast to WikiPage in 2 places:

1. To get a ContentHandler and check if the content model
is cacheable. We probably should just make all content models
cacheable.
2. To call WikiPage::triggerOpportunisticLinksUpdate. I have
an elaborate plan for this one, but it will be done separately.

Change-Id: Ifd9ab0155dc1fad0c1608dafea05d16292afd057
2021-04-05 07:46:34 -06:00
jenkins-bot
b3a8ea7076 Merge "Remove 'stubthreshold' from ParserCache key." 2020-12-15 20:21:14 +00:00
daniel
00a3439dce Introduce RevisionOutputCache
Bug: T267981
Change-Id: Ib1dc641ed10d786918362b25bd655780d5844ba1
2020-12-14 16:50:28 +00:00
Petr Pchelko
9872b627ff Article:view - always try using ParserCache for old revisions.
Bug: T268075
Change-Id: Ie318a6275c1fb5aedff830b72ee838def815e190
2020-12-07 15:18:08 -06:00
Petr Pchelko
235c56d649 Remove 'stubthreshold' from ParserCache key.
Stubthreshold option used to be a cache-varying option,
but in all places where we interact with the ParserCache
we are checking that it's 0 before using the cache.

Instead, we can just remove all the special cases for
stubthreshold option, remove it from cache key and rely
on ParserOptions::isSafeToCache to avoid caching non-default
stubthreshold outputs.

Bug: T264351
Change-Id: Ifaf69a3e651eef21c88da3aa3044b490059958ca
2020-12-07 14:47:05 -06:00
Petr Pchelko
3d6e6a5f70 ParserOutputAccess: don't compare RevisionRecord to int
Change-Id: I2c261328770b9a990fb7221d43dbfe974a7ecfe1
2020-12-03 11:03:45 -06:00
Petr Pchelko
4417b13d58 Make ParserCache respect ParserOptions::isSafeToCache
Bug: T269154
Change-Id: I8e9ecd2787aa8d172e708ba64ea936e63fbc6b36
2020-12-02 14:02:36 -06:00
daniel
2c7ba6f62b PoolWorkArticleViewOld: use WANObjectCache
Use WANObjectCache instead of the local cluster object cache.

Bug: T268278
Change-Id: Ic16feffecaf4b75c284c6ef34de42ac113e625f8
2020-11-30 16:38:56 +00:00
Petr Pchelko
dbdc2a3cd3 Introduce JsonCodec to help with serialization/deserialization
Change-Id: I5433090ae8e2b3f2a4590cc404baf838025546ce
2020-11-19 08:32:21 -07:00
jenkins-bot
6e5c7e97b4 Merge "PoolWorkArticleView: inject logger" 2020-11-18 20:47:06 +00:00
daniel
195bc9715d PoolWorkArticleView: inject logger
Bug: T267832
Change-Id: I7f4763d0e812d076188bb1a4ca2c333f50dffbee
2020-11-18 17:08:37 +01:00