148 commits

Author SHA1 Message Date
daniel
489e2826e0 ParserCache: fix stats for metadata cache misses
Cache misses in metadata were miscounted as miss.unserialize.
Count them as miss.absent.metadata instead.

Change-Id: Idff062325a34445478a4543709a9f2b3cc365f60
2021-04-08 17:54:01 +02:00
Petr Pchelko
d1f481f242 ParserCache: only use in-process caching for metadata
CachedBagOStuff caches negatives, so it breaks PoolCounter.
We only need to cache metadata in-process, since it's commonly
used twice within the request.

Bug: T277829
Change-Id: I11a147c24b6cdb275b521b48802d6f3d0e1a4387
2021-04-06 17:53:38 -06:00
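
The negative-caching problem described in d1f481f242 can be sketched in a few lines. This is a minimal illustration, not MediaWiki code: `Backend` and `NegativeCachingWrapper` are hypothetical stand-ins, and the scenario (a waiting request re-reading the cache after another worker has filled it) is an assumption about how the PoolCounter interaction goes wrong.

```python
class Backend:
    """Hypothetical stand-in for a shared cache such as memcached."""

    def __init__(self):
        self.data = {}

    def get(self, key):
        return self.data.get(key)  # None means "absent"

    def set(self, key, value):
        self.data[key] = value


class NegativeCachingWrapper:
    """Hypothetical in-process layer that remembers misses as well as hits."""

    def __init__(self, backend):
        self.backend = backend
        self.local = {}

    def get(self, key):
        if key in self.local:
            return self.local[key]
        value = self.backend.get(key)
        self.local[key] = value  # a miss (None) is remembered too
        return value


shared = Backend()
waiter = NegativeCachingWrapper(shared)

first = waiter.get("page:1")         # miss, now remembered in-process
shared.set("page:1", "<p>html</p>")  # another worker fills the shared cache
second = waiter.get("page:1")        # still a miss: the stale negative entry wins
direct = shared.get("page:1")        # the shared cache does have the value
```

Restricting the in-process layer to metadata, as the commit does, keeps the benefit for values read twice per request while the output lookup always reaches the shared cache.
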
Petr Pchelko
f642215aed Convert ParserCache to PageRecord
ParserOptions is not updated because it depends on the
Title::getLanguage implementation.

Tests converted to not require a DB anymore. Can't be proper unit
tests yet due to globals in ParserOptions and fake time hacks,
but exec time does go down from 70 seconds to 9 seconds.

Page content model is still emitted in the metrics since
it was considered useful. Should be removed when we get
something like a page type concept.

Change-Id: Ib16fd0b5b87ffc3cb4d21f4aa43d1203cb7206d2
2021-04-02 21:14:54 -06:00
Petr Pchelko
37030c04f0 RevisionRenderer should set revision ID/Timestamp in ParserOutput
The ParserOutput object wraps the revision ID and revision timestamp
of the parsed revision. Currently ParserCache sets these properties,
but that is not its job at all: whatever generates the ParserOutput
knows much better which revision it parsed. This also allows us to
simplify ParserCache and makes switching it to PageRecord easier.

I've only removed setting the timestamp inside ParserCache
because it's a blocker for PageRecord. I will do follow-ups
to remove the $revId parameter from ParserCache as well.

cacheRevisionId should also be renamed, but later.

Bug: T278284
Change-Id: I9a82e9fd154b29a81d1f7a3c4abb073c9a27314e
2021-03-24 10:25:56 -06:00
Timo Tijhof
eb7b9c8e7d ParserCache: Instrument CachedBagOStuff to understand dupe fetches
Follows-up 66cc685b45.

Bug: T269593
Change-Id: Iff5267689a17281330307575d618cfd531051e57
2021-03-13 01:43:10 +00:00
jenkins-bot
d491f23b90 Merge "Respect used options for ParserOptions::isSafeToCache"
2021-01-25 19:13:53 +00:00
Petr Pchelko
7e8d1a11c8 Restore accidentally removed ParserCache 'hit' metric
Change-Id: Ibd69e532a2f373f9d0129ac2a2c6ac70039c9bec
2021-01-05 14:44:19 -06:00
Petr Pchelko
46b66f093a Respect used options for ParserOptions::isSafeToCache
Bug: T269293
Change-Id: Ic3cf908265ad470815f0ac81442d33bde04a5665
2021-01-04 10:32:34 -06:00
Petr Pchelko
71bb51ed55 ParserCache: general code cleanup, abstracted expiration checks.
Change-Id: I7374f30d582064236b8f782e6a2528eb692e3010
2020-12-16 12:09:55 +00:00
Petr Pchelko
66cc685b45 Make ParserCache use CachedBagOStuff
Bug: T269593
Change-Id: I21e6e39eccad22b781252b142c1e5b079c1ee0b4
2020-12-07 10:28:30 -06:00
Petr Pchelko
4417b13d58 Make ParserCache respect ParserOptions::isSafeToCache
Bug: T269154
Change-Id: I8e9ecd2787aa8d172e708ba64ea936e63fbc6b36
2020-12-02 14:02:36 -06:00
Petr Pchelko
b956c77d27 Merge CacheTime and ParserOutput accessedOptions properties
Change-Id: I5785596d68e8923f8bcbd182ace0b1991bd75c9a
2020-11-19 10:12:39 -07:00
Petr Pchelko
dbdc2a3cd3 Introduce JsonCodec to help with serialization/deserialization
Change-Id: I5433090ae8e2b3f2a4590cc404baf838025546ce
2020-11-19 08:32:21 -07:00
Petr Pchelko
7c68ae9296 Safe ParserOutput extension data and JsonUnserializable helper.
One major difference from what we had before is that we now
actually write class names into the serialization. Given that
this new mechanism is extensible, we can't establish any kind
of mapping of allowed classes. I do not think it's a problem
though.

Bug: T264394
Change-Id: Ia152f3b76b967aabde2d8a182e3aec7d3002e5ea
2020-11-10 11:21:09 -07:00
Petr Pchelko
8cc6b7f99a ParserCache JSON - do not \u encode unicode and special characters.
Without the ALL_OK constant, JSON encoding \u-escapes all
non-ASCII characters, which inflates the serialized data out of
proportion, especially on the Russian wiki.

Bug: T263579
Change-Id: Ifaaf1cdfaeeb17c3a99ed742b64ae5cc3157500c
2020-10-22 18:26:59 -07:00
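
The size effect described in 8cc6b7f99a is easy to reproduce. The sketch below uses Python's `json` module as an analogy for the PHP encoder (it is not the ParserCache code): `ensure_ascii=True`, the default, corresponds to \u-escaping, and `ensure_ascii=False` to leaving Unicode as-is. The sample string is an assumption chosen for illustration.

```python
import json

text = "Привет, мир"  # sample Cyrillic string (illustrative only)

escaped = json.dumps(text)                        # non-ASCII becomes \uXXXX
unescaped = json.dumps(text, ensure_ascii=False)  # non-ASCII is kept as-is

# Compare the encoded size in bytes, as it would be stored in a cache.
escaped_size = len(escaped.encode("utf-8"))
unescaped_size = len(unescaped.encode("utf-8"))
print(escaped_size, unescaped_size)

# Both forms decode back to the same string.
round_trip_ok = json.loads(escaped) == json.loads(unescaped) == text
```

Each Cyrillic character takes six bytes when escaped and two bytes in UTF-8, so text-heavy output in a non-Latin script grows roughly threefold when escaped.
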
DannyS712
e2731a76ad Normalize error messages for non-serializable properties
Change-Id: If599082bd4acdc9df5b32aaabf2ba8d24e830914
2020-10-21 22:49:57 +00:00
Petr Pchelko
2bbf1dc97e ParserCache: add serialization format to HTML debug message.
Bug: T263579
Change-Id: I80f316ce78285cb245e05d01c7e1a8e314a2e732
2020-10-20 12:48:44 -07:00
Petr Pchelko
e269dd028b Hard-deprecate ParserCache::getETag.
Building ETags for output is not ParserCache's business.

See https://github.com/SemanticMediaWiki/SemanticMediaWiki/pull/4862
for removal of the only use.
Change-Id: Iceb6bd761acc7511ea7d9d14b9df2e9e1fa51648
2020-10-16 20:17:26 +00:00
jenkins-bot
ed57d5295f Merge "Move serializability validation from ParserOutput to ParserCache"
2020-10-16 13:19:59 +00:00
Petr Pchelko
0f16608e6d Add basic docs for ParserCache
Change-Id: I6290c2f064d6ddc4693a27f1d8bf933bcdb4293f
2020-10-15 13:51:25 -07:00
Petr Pchelko
09c14b9dd0 Move serializability validation from ParserOutput to ParserCache
Bug: T263579
Change-Id: Iac2dbc817c2e7af4a6d112f01bd380a04354db22
2020-10-15 13:15:30 -07:00
daniel
0c059b7381 ParserCache: introduce feature flag for enabling JSON encoding.
This introduces $wgParserCacheUseJson for selectively enabling
JSON encoding in the parser cache. This is intended for testing only.

It should be removed before the release of 1.36.

Bug: T263579
Change-Id: I0d9cab3fafb984a3159e24f9e80f792429ff3c71
2020-10-13 23:46:57 +00:00
daniel
600f64029f Use JSON for parser cache
This adds JSON serialization and deserialization capabilities
to CacheTime and ParserOutput.

NOTE: JSON serialization is disabled for now. Merging this patch
should not change behavior in production.

Bug: T263579
Change-Id: I18187e8bce573d21f6f1bd29106e07c63a6d2f4d
2020-10-13 16:28:52 -07:00
Petr Pchelko
bb39896603 Hard-deprecate ParserCache::getKey.
Bug: T263689
Depends-On: I20b5a3eece79afaac6a4fef733d7a60ea23c6ffe
Depends-On: I3ed1188e267f4eaab0ae46f2bc6f9a379dea58ce
Change-Id: I30d05ee5b217fce0521d14867309979e76f34760
2020-10-13 08:31:23 -07:00
Petr Pchelko
13574e8404 Deprecate ParserCache::getKey and replace it with getMetadata
Bug: T263689
Change-Id: I4a71e5a7eb1c25cd53b857c115883cd00160736b
2020-10-13 08:31:22 -07:00
jenkins-bot
f43007d3f1 Merge "HACK/ParserCache: Force cache-miss if mUsedOptions is undefined"
2020-10-05 13:58:14 +00:00
daniel
ff07253be5 ParserCache: be resilient to string values
This makes the parser cache resilient to encountering string values
where it currently expects to get a ParserOutput object from the
underlying cache.

This provides forward compatibility with a switch to JSON based caching:
If we have to switch back after writing JSON to the cache for a while,
ParserCache would simply ignore the respective entries, rather than
causing fatal errors.

Bug: T263579
Change-Id: Iaed582097ab2d05edb4b99a738ac39c530fd63c1
2020-10-01 14:53:00 -06:00
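
The behaviour described in ff07253be5 amounts to treating any value of an unexpected type as a cache miss. A minimal sketch, assuming a dict-like store; `ParserOutputStub` and `get_from_cache` are hypothetical names for illustration and do not mirror the real ParserCache API:

```python
class ParserOutputStub:
    """Hypothetical stand-in for a cached ParserOutput object."""

    def __init__(self, html):
        self.html = html


def get_from_cache(store, key):
    """Return the cached object, or None (a miss) for anything unexpected."""
    value = store.get(key)
    if isinstance(value, ParserOutputStub):
        return value
    # Absent entries and foreign values (for example a JSON string written
    # by a newer code version) are ignored rather than causing an error.
    return None


store = {
    "old-format": ParserOutputStub("<p>hello</p>"),
    "json-format": '{"html": "<p>hello</p>"}',
}

hit = get_from_cache(store, "old-format")
ignored = get_from_cache(store, "json-format")
absent = get_from_cache(store, "no-such-key")
```

The cost of ignoring a foreign entry is one extra parse, which is much cheaper than a fatal error after a rollback.
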
Petr Pchelko
e7ff3cbb6b Cover ParserCache with integration tests
Bug: T250500
Change-Id: I8c45e7c6706b532f1569d06330cc45e841f208b7
2020-10-01 13:56:22 -06:00
Timo Tijhof
b52660a1f1 HACK/ParserCache: Force cache-miss if mUsedOptions is undefined
These are causing thousands of errors from wmf.11-cached pages
since we rolled back to wmf.10.

Bug: T264257
Change-Id: Ia3357b2f593ca16fc12241d7ea22bbfd222f2536
(cherry picked from commit 71ee44aabba5c10187ad6d5cb26b5ef072cbf9b2)
2020-10-01 18:25:47 +00:00
Ppchelko
3254e41a4c Revert "Revert "Revert "Hard deprecate all public properties in CacheTime and ParserOutput"""
This reverts commit deacee9088.

Bug: T264257
Change-Id: Ie68d8081a42e7d8103e287b6d6857a30dc522f75
2020-10-01 12:03:41 -06:00
Petr Pchelko
f24125684c Clean up ParserCache construction and inject logger
Bug: T263583
Depends-On: Iceaa0e872c53aa79b7012711813895221fa62fa6
Change-Id: I6f131a078e9d6eb5da3533b0ac3730e24bd3f56f
2020-09-28 13:17:30 -07:00
jenkins-bot
17291773c1 Merge "Create ParserCacheFactory."
2020-09-28 16:13:37 +00:00
Petr Pchelko
6417f2c49f ParserCache::get - drop support for passing Article.
Deprecated in 1.35. However, if you look closely, the code
emitting the deprecation warning passed the number 1.35
instead of the string '1.35', which caused the deprecation
function to throw an exception.

Thus, this code has not been deprecated in 1.35, but
was accidentally broken. Instead of fixing the deprecation,
just remove the fallback.

Change-Id: I369f03d6b01053fc0396beb635c7b7d49bd249da
2020-09-27 15:46:34 -07:00
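
The failure mode described in 6417f2c49f can be sketched as follows. The `deprecated` helper is hypothetical and only assumes what the commit message states: the deprecation function rejects a version that is not a string.

```python
def deprecated(name, version):
    """Hypothetical deprecation helper that insists on a string version."""
    if not isinstance(version, str):
        raise TypeError(
            f"version must be a string, got {type(version).__name__}"
        )
    return f"Use of {name} was deprecated in {version}"


# Intended call: the version is passed as a string.
ok = deprecated("ParserCache::get with Article", "1.35")

# Accidental call: a float literal is passed, so the helper raises
# and the deprecated code path is effectively broken.
try:
    deprecated("ParserCache::get with Article", 1.35)
    failed = False
except TypeError:
    failed = True
```

Because the fallback raised instead of warning, callers could not have relied on it, which is why removing it outright was considered safe.
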
Petr Pchelko
fec48eb5a4 Create ParserCacheFactory.
* Makes ParserCache take the root of the key
  as a constructor argument
* Introduces a ParserCacheFactory

Next steps:
- convert FlaggedRevs to using this.
- cleanup

This assumes that we wouldn't want to differentiate
the parser cache settings per use-case, as it is now
for default vs flaggedrevs caches. There are only two settings:
$wgParserCacheType - name of the BagOStuff to use
$wgParserCacheExpireTime - the expiration time.
I think if we wanted to have different settings for different
caches, we could add that as a next step.

Bug: T263583
Change-Id: I188772da541a95c95a5ecece7c7dd748395506c2
2020-09-25 18:17:58 -07:00
Ppchelko
deacee9088 Revert "Revert "Hard deprecate all public properties in CacheTime and ParserOutput""
This reverts commit a4dc6d82af.

I've reverted the merged patch since I didn't do enough testing
on serialized/reserialized ParserOutput and CacheTime. Now I'm
confident serialization/deserialization works.

Changes since original reverted version:
 - Use __get/__set instead of DeprecationHelper to avoid
   serializing the $deprecateProperties array.
 - Add a test that data serialized in the old format can be
   deserialized by the new code.

Change-Id: Ic911c2724ad709931d3316e609781fb89b5b7b28
2020-09-24 07:55:18 -07:00
Ppchelko
a4dc6d82af Revert "Hard deprecate all public properties in CacheTime and ParserOutput"
This reverts commit 799c10b7eb.

Reason for revert: Didn't test how this would work with deserializing stored ParserOutput.

Change-Id: I4221bc26282f3b4bd044f0ab50d00e77eb57ede0
2020-09-23 22:46:33 +00:00
Petr Pchelko
799c10b7eb Hard deprecate all public properties in CacheTime and ParserOutput
* In preparation for ParserCache/Parsoid integration, it's nice to
  do some cleanups. Will untie our hands a bit more.
* Verified no usages in extensions deployed at Wikimedia, other than
  Flow, fixed in the dependent patch.

Change-Id: Idd78413a36887e2ff5c902d410e55691cafb736b
2020-09-23 07:17:13 -07:00
Tim Starling
6b05a27987 Require three parameters to ParserCache::__construct()
Change-Id: I8a74fdf016bafa2efd32ef81f3c51909bc1d8ec7
Depends-On: I8bc1b94c01d2e6e0b352a44bcb8e1d24a9fbe4ee
2020-09-18 08:14:15 +10:00
Umherirrender
381c934075 Use StatsdDataFactory service in ParserCache
The new argument is optional because extensions extend this class.

Change-Id: I710016c0ca9f8bb595d9f3ccd9452c76fdda3ef3
2020-06-21 21:15:17 +02:00
DannyS712
cbbd029cac Remove terminating line breaks from wfDebugLog calls
Change-Id: Iac61ba7924597d654df7bf0a9136eeb3adbe0eef
2020-06-03 02:48:36 +00:00
Tim Starling
47a1619027 Remove terminating line breaks from debug messages
A terminating line break has not been required in wfDebug() since 2014,
however no migration was done. Some of these line breaks found their way
into LoggerInterface::debug() calls, where they mess up the formatting
of the debug log.

So, remove terminating line breaks from wfDebug() and
LoggerInterface::debug() calls.

Also:
* Fix the stripping of leading line breaks from the log header emitted
  by Setup.php. This feature, accidentally broken in 2014, allows
  requests to be distinguished in the log file.
* Avoid using the global variable $self.
* Move the logging of the client IP back to Setup.php. It was moved to
  WebRequest in the hopes that it would not always be needed, however
  $wgRequest->getIP() is now called unconditionally a few lines up in
  Setup.php. This means that it is put in its proper place after the
  "start request" message.
* Wrap the log header code in a closure so that variables like $name do
  not leak into global scope.
* In Linker.php, remove a few instances of an unnecessary second
  parameter to wfDebug().

Change-Id: I96651d3044a95b9d210b51cb8368edc76bebbb9e
2020-06-03 12:01:16 +10:00
Tim Starling
68c433bd23 Hooks::run() call site migration
Migrate all callers of Hooks::run() to use the new
HookContainer/HookRunner system.

General principles:
* Use DI if it is already used. We're not changing the way state is
  managed in this patch.
* HookContainer is always injected, not HookRunner. HookContainer
  is a service, it's a more generic interface, it is the only
  thing that provides isRegistered() which is needed in some cases,
  and a HookRunner can be efficiently constructed from it
  (confirmed by benchmark). Because HookContainer is needed
  for object construction, it is also needed by all factories.
* "Ask your friendly local base class". Big hierarchies like
  SpecialPage and ApiBase have getHookContainer() and getHookRunner()
  methods in the base class, and classes that extend that base class
  are not expected to know or care where the base class gets its
  HookContainer from.
* ProtectedHookAccessorTrait provides protected getHookContainer() and
  getHookRunner() methods, getting them from the global service
  container. The point of this is to ease migration to DI by ensuring
  that call sites ask their local friendly base class rather than
  getting a HookRunner from the service container directly.
* Private $this->hookRunner. In some smaller classes where accessor
  methods did not seem warranted, there is a private HookRunner property
  which is accessed directly. Very rarely (two cases), there is a
  protected property, for consistency with code that conventionally
  assumes protected=private, but in cases where the class might actually
  be overridden, a protected accessor is preferred over a protected
  property.
* The last resort: Hooks::runner(). Mostly for static, file-scope and
  global code. In a few cases it was used for objects with broken
  construction schemes, out of horror or laziness.

Constructors with new required arguments:
* AuthManager
* BadFileLookup
* BlockManager
* ClassicInterwikiLookup
* ContentHandlerFactory
* ContentSecurityPolicy
* DefaultOptionsManager
* DerivedPageDataUpdater
* FullSearchResultWidget
* HtmlCacheUpdater
* LanguageFactory
* LanguageNameUtils
* LinkRenderer
* LinkRendererFactory
* LocalisationCache
* MagicWordFactory
* MessageCache
* NamespaceInfo
* PageEditStash
* PageHandlerFactory
* PageUpdater
* ParserFactory
* PermissionManager
* RevisionStore
* RevisionStoreFactory
* SearchEngineConfig
* SearchEngineFactory
* SearchFormWidget
* SearchNearMatcher
* SessionBackend
* SpecialPageFactory
* UserNameUtils
* UserOptionsManager
* WatchedItemQueryService
* WatchedItemStore

Constructors with new optional arguments:
* DefaultPreferencesFactory
* Language
* LinkHolderArray
* MovePage
* Parser
* ParserCache
* PasswordReset
* Router

setHookContainer() now required after construction:
* AuthenticationProvider
* ResourceLoaderModule
* SearchEngine

Change-Id: Id442b0dbe43aba84bd5cf801d86dedc768b082c7
2020-05-30 14:23:28 +00:00
Reedy
b038d6333a Fix even more PSR12.Properties.ConstantVisibility.NotFound
Change-Id: I6d98efcfac1f1c0ab6a442e0af6d5daa6ef7801a
2020-05-16 00:28:41 +00:00
DannyS712
4721717527 Replace uses and hard deprecate Article:: and WikiPage::getRevision
Bug: T250532
Bug: T239975
Change-Id: Ic8f2baa0ac805d5196a7107bdc7a1abb36eba139
2020-04-20 23:06:48 +00:00
ArtBaltai
13ae7b807f ParserCache::get use WikiPage only as argument
ParserCache works only with WikiPage; remove the Article and Page interfaces.
Rename WikiPage property names and type hints.

Bug: T248719
Change-Id: I08afded432b059f94538be574a4789e18e89bf03
2020-04-12 03:49:48 +03:00
C. Scott Ananian
8a1c656150 Hard deprecate ParserCache::singleton(), deprecated in 1.30
Code search:
https://codesearch.wmflabs.org/search/?q=ParserCache%5Cs*%3A%3A%5Cs*singleton&i=fosho&files=&repos=

Bug: T249032
Change-Id: I22308bb2530a4aaa6a29e42d50fd679b932a6e9f
2020-04-01 10:31:38 -04:00
addshore
13548c6c12 Remove old pcache metric compat from ParserCache.php
All usages have been removed from grafana.

Bug: T235724
Change-Id: If7f72706ee80ff41beebdc16b3df014ec3e9caca
2019-10-31 14:41:52 +01:00
Aaron Schulz
6c31ca3f25 parsercache: use WRITE_ALLOW_SEGMENTS for cached ParserOutput values
This lets large output entries fit into memcached via key segmentation.

Follows b09b3980f9 which applied the feature to PageEditStash.

Bug: T204742
Change-Id: I33a60f5d718cd9033ea12d1d16046d2bede87b5b
2019-08-25 00:53:28 +00:00
Reedy
9f2ffdfbd4 Remove "Squiz.WhiteSpace.FunctionSpacing" from phpcs exclusions
Change-Id: I78b3315f26ab91b6b443f5b028a635552f82f5a3
2019-05-11 02:44:26 +01:00
Aaron Schulz
f474fa4cef Avoid using outdated $casToken field for BagOStuff calls
Change-Id: Ic9bcb388e4f50e2ae16ae57aa16113e79b43350b
2019-03-11 23:39:29 -07:00