Commit graph

136 commits

Author SHA1 Message Date
Petr Pchelko
dbdc2a3cd3 Introduce JsonCodec to help with serialization/deserialization
Change-Id: I5433090ae8e2b3f2a4590cc404baf838025546ce
2020-11-19 08:32:21 -07:00
Petr Pchelko
7c68ae9296 Safe ParserOutput extension data and JsonUnserializable helper.
One major difference with what we've had before is that now we
actually write class names into the serialization - given that
this new mechanism is extensible, we can't establish any kind
of mapping of allowed classes. I do not think it's a problem
though.

Bug: T264394
Change-Id: Ia152f3b76b967aabde2d8a182e3aec7d3002e5ea
2020-11-10 11:21:09 -07:00
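
A minimal Python sketch of the idea described in this commit: the class name is written into the serialized data, and deserialization resolves the class by that name rather than through a fixed mapping of allowed classes. This is an analogue, not the MediaWiki implementation. The `LinkStats` class, the `_type_` key and the `encode`/`decode` helpers are all hypothetical names chosen for illustration.

```python
import json


class LinkStats:
    """Hypothetical extension-data class, used only for this illustration."""

    def __init__(self, internal, external):
        self.internal = internal
        self.external = external

    def to_json_dict(self):
        return {"internal": self.internal, "external": self.external}

    @classmethod
    def from_json_dict(cls, data):
        return cls(data["internal"], data["external"])


TYPE_KEY = "_type_"  # assumed marker key, not taken from the commit


def encode(obj):
    """Serialize obj to JSON, recording its class name next to its fields."""
    payload = obj.to_json_dict()
    payload[TYPE_KEY] = type(obj).__name__
    return json.dumps(payload)


def decode(text):
    """Rebuild the object by looking up the recorded class name."""
    payload = json.loads(text)
    # No allow-list: any class name visible in this module is accepted.
    cls = globals()[payload.pop(TYPE_KEY)]
    return cls.from_json_dict(payload)
```

Because the lookup accepts any class name found in the data, this design trades a closed list of types for extensibility, which is the trade-off the commit message mentions.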
Petr Pchelko
8cc6b7f99a ParserCache JSON - do not \u encode unicode and special characters.
Without passing the ALL_OK constant, json-encoding will \u-escape
all the unicode, which will blow up the size of serialized data
out of proportion, especially on Russian wiki.

Bug: T263579
Change-Id: Ifaaf1cdfaeeb17c3a99ed742b64ae5cc3157500c
2020-10-22 18:26:59 -07:00
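
The size effect described in this commit can be reproduced with any JSON encoder that escapes non-ASCII by default. The sketch below uses Python's `json` module as a stand-in; `ensure_ascii=False` plays the role that the ALL_OK constant plays in the commit, and the sample text is an assumption.

```python
import json

text = {"title": "Привет"}

# Default behaviour: every non-ASCII character becomes a six-byte \uXXXX escape.
escaped = json.dumps(text)

# With escaping turned off, each Cyrillic letter is two bytes of UTF-8.
unescaped = json.dumps(text, ensure_ascii=False)

escaped_size = len(escaped.encode("utf-8"))
unescaped_size = len(unescaped.encode("utf-8"))
```

Both strings decode to the same data; only the stored size differs, and the gap grows with the share of non-ASCII text.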
DannyS712
e2731a76ad Normalize error messages for non-serializable properties
Change-Id: If599082bd4acdc9df5b32aaabf2ba8d24e830914
2020-10-21 22:49:57 +00:00
Petr Pchelko
2bbf1dc97e ParserCache: add serialization format to HTML debug message.
Bug: T263579
Change-Id: I80f316ce78285cb245e05d01c7e1a8e314a2e732
2020-10-20 12:48:44 -07:00
Petr Pchelko
e269dd028b Hard-deprecate ParserCache::getETag.
It is not ParserCache's business to build ETags for output.

See https://github.com/SemanticMediaWiki/SemanticMediaWiki/pull/4862
for removal of the only use.
Change-Id: Iceb6bd761acc7511ea7d9d14b9df2e9e1fa51648
2020-10-16 20:17:26 +00:00
jenkins-bot
ed57d5295f Merge "Move serializability validation from ParserOutput to ParserCache" 2020-10-16 13:19:59 +00:00
Petr Pchelko
0f16608e6d Add basic docs for ParserCache
Change-Id: I6290c2f064d6ddc4693a27f1d8bf933bcdb4293f
2020-10-15 13:51:25 -07:00
Petr Pchelko
09c14b9dd0 Move serializability validation from ParserOutput to ParserCache
Bug: T263579
Change-Id: Iac2dbc817c2e7af4a6d112f01bd380a04354db22
2020-10-15 13:15:30 -07:00
daniel
0c059b7381 ParserCache: introduce feature flag for enabling JSON encoding.
This introduces $wgParserCacheUseJson for selectively enabling
JSON encoding in the parser cache. This is intended for testing only.

It should be removed before the release of 1.36.

Bug: T263579
Change-Id: I0d9cab3fafb984a3159e24f9e80f792429ff3c71
2020-10-13 23:46:57 +00:00
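
A rough sketch of how a flag of this kind can gate the write path while the read path accepts both formats. It is written in Python, with `pickle` standing in for the native serialization format; the `USE_JSON` name and the type-based format detection are assumptions made for this illustration, not details from the commit.

```python
import json
import pickle

USE_JSON = False  # stand-in for a flag such as $wgParserCacheUseJson


def serialize(data, use_json=USE_JSON):
    """Encode data as JSON text when the flag is on, else as native bytes."""
    if use_json:
        return json.dumps(data)
    return pickle.dumps(data)


def unserialize(blob):
    """Decode either format; here JSON is text and the native format is bytes."""
    if isinstance(blob, str):
        return json.loads(blob)
    return pickle.loads(blob)
```

Keeping the reader format-agnostic means the flag can be switched in either direction without invalidating entries already written.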
daniel
600f64029f Use JSON for parser cache
This adds JSON serialization and deserialization capabilities
to CacheTime and ParserOutput.

NOTE: JSON serialization is disabled for now. Merging this patch
should not change behavior in production.

Bug: T263579
Change-Id: I18187e8bce573d21f6f1bd29106e07c63a6d2f4d
2020-10-13 16:28:52 -07:00
Petr Pchelko
bb39896603 Hard-deprecate ParserCache::getKey.
Bug: T263689
Depends-On: I20b5a3eece79afaac6a4fef733d7a60ea23c6ffe
Depends-On: I3ed1188e267f4eaab0ae46f2bc6f9a379dea58ce
Change-Id: I30d05ee5b217fce0521d14867309979e76f34760
2020-10-13 08:31:23 -07:00
Petr Pchelko
13574e8404 Deprecate ParserCache::getKey and replace it with getMetadata
Bug: T263689
Change-Id: I4a71e5a7eb1c25cd53b857c115883cd00160736b
2020-10-13 08:31:22 -07:00
jenkins-bot
f43007d3f1 Merge "HACK/ParserCache: Force cache-miss if mUsedOptions is undefined" 2020-10-05 13:58:14 +00:00
daniel
ff07253be5 ParserCache: be resilient to string values
This makes the parser cache resilient to encountering string values
where it is currently expecting to get a ParserOutput object from the
underlying cache.

This provides forward compatibility with a switch to JSON based caching:
If we have to switch back after writing JSON to the cache for a while,
ParserCache would simply ignore the respective entries, rather than
causing fatal errors.

Bug: T263579
Change-Id: Iaed582097ab2d05edb4b99a738ac39c530fd63c1
2020-10-01 14:53:00 -06:00
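
The resilience described here amounts to a type check on whatever the underlying cache returns. A minimal Python sketch, assuming a dict-like `store` and a stand-in `ParserOutput` class (neither is the real MediaWiki code):

```python
class ParserOutput:
    """Stand-in for the cached object type; not the real MediaWiki class."""

    def __init__(self, html):
        self.html = html


def get_from_cache(store, key):
    """Return the cached ParserOutput, or None (a cache miss) for anything else."""
    value = store.get(key)
    if isinstance(value, ParserOutput):
        return value
    # Strings (for example JSON written by a newer version) and missing
    # entries are both treated as a miss instead of raising an error.
    return None
```

After a rollback, entries written as JSON strings are simply ignored and regenerated, which is the forward-compatibility property the commit aims for.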
Petr Pchelko
e7ff3cbb6b Cover ParserCache with integration tests
Bug: T250500
Change-Id: I8c45e7c6706b532f1569d06330cc45e841f208b7
2020-10-01 13:56:22 -06:00
Timo Tijhof
b52660a1f1 HACK/ParserCache: Force cache-miss if mUsedOptions is undefined
These are causing thousands of errors from wmf.11-cached pages
since we rolled back to wmf.10.

Bug: T264257
Change-Id: Ia3357b2f593ca16fc12241d7ea22bbfd222f2536
(cherry picked from commit 71ee44aabba5c10187ad6d5cb26b5ef072cbf9b2)
2020-10-01 18:25:47 +00:00
Ppchelko
3254e41a4c Revert "Revert "Revert "Hard deprecate all public properties in CacheTime and ParserOutput"""
This reverts commit deacee9088.

Bug: T264257
Change-Id: Ie68d8081a42e7d8103e287b6d6857a30dc522f75
2020-10-01 12:03:41 -06:00
Petr Pchelko
f24125684c Clean up ParserCache construction and inject logger
Bug: T263583
Depends-On: Iceaa0e872c53aa79b7012711813895221fa62fa6
Change-Id: I6f131a078e9d6eb5da3533b0ac3730e24bd3f56f
2020-09-28 13:17:30 -07:00
jenkins-bot
17291773c1 Merge "Create ParserCacheFactory." 2020-09-28 16:13:37 +00:00
Petr Pchelko
6417f2c49f ParserCache::get - drop support for passing Article.
Deprecated in 1.35. However, if you look closely,
the deprecation warning emitting code was passing
numeric 1.35 instead of a string '1.35' which caused
the deprecation function to throw an exception.

Thus, this code has not been deprecated in 1.35, but
was accidentally broken. Instead of fixing the deprecation,
just remove the fallback.

Change-Id: I369f03d6b01053fc0396beb635c7b7d49bd249da
2020-09-27 15:46:34 -07:00
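
The failure mode in this commit, a deprecation call that throws because it was given a number instead of a version string, can be illustrated with a small Python analogue. The `deprecated` helper and its type check are hypothetical; they only mirror the behaviour the commit message describes.

```python
import warnings


def deprecated(name, version):
    """Hypothetical helper: warn that `name` is deprecated since `version`."""
    if not isinstance(version, str):
        raise TypeError("version must be a string such as '1.35'")
    warnings.warn(
        f"{name} is deprecated since {version}",
        DeprecationWarning,
        stacklevel=2,
    )


def get_with_fallback(page):
    """Fallback path that was meant to warn but passes a number by mistake."""
    deprecated("get() with Article", 1.35)  # bug: float instead of '1.35'
    return page
```

Calling `get_with_fallback` raises instead of warning, so the fallback is effectively broken rather than deprecated. One general reason to insist on strings for versions is that numeric literals lose information: `str(1.30)` is `"1.3"`.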
Petr Pchelko
fec48eb5a4 Create ParserCacheFactory.
* Makes ParserCache take the root of the key
  as a constructor argument
* Introduces a ParserCacheFactory

Next steps:
- convert FlaggedRevs to using this.
- cleanup

This assumes that we wouldn't want to differentiate
the parser cache settings per use-case, as it is now
for default vs flaggedrevs caches. There are only two settings:
$wgParserCacheType - name of the BagOStuff to use
$wgParserCacheExpireTime - the expiration time.
I think if we wanted to have different settings for different
caches, we could add that as a next step.

Bug: T263583
Change-Id: I188772da541a95c95a5ecece7c7dd748395506c2
2020-09-25 18:17:58 -07:00
Ppchelko
deacee9088 Revert "Revert "Hard deprecate all public properties in CacheTime and ParserOutput""
This reverts commit a4dc6d82af.

I've reverted the merged patch since I didn't do enough testing
on serialized/reserialized ParserOutput and CacheTime. Now I'm
confident serialization/deserialization works.

Changes since original reverted version:
 - Use __get/__set instead of DeprecationHelper in order to
   avoid the $deprecateProperties array being serialized.
 - Add test for old format serialization new format deserialization.

Change-Id: Ic911c2724ad709931d3316e609781fb89b5b7b28
2020-09-24 07:55:18 -07:00
Ppchelko
a4dc6d82af Revert "Hard deprecate all public properties in CacheTime and ParserOutput"
This reverts commit 799c10b7eb.

Reason for revert: Didn't test how this would work with deserializing stored ParserOutput.

Change-Id: I4221bc26282f3b4bd044f0ab50d00e77eb57ede0
2020-09-23 22:46:33 +00:00
Petr Pchelko
799c10b7eb Hard deprecate all public properties in CacheTime and ParserOutput
* In preparation for ParserCache/Parsoid integration, it's nice to
  do some cleanups. Will untie our hands a bit more.
* Verified no usages in extensions deployed at Wikimedia, other than
  Flow, fixed in the dependent patch.

Change-Id: Idd78413a36887e2ff5c902d410e55691cafb736b
2020-09-23 07:17:13 -07:00
Tim Starling
6b05a27987 Require three parameters to ParserCache::__construct()
Change-Id: I8a74fdf016bafa2efd32ef81f3c51909bc1d8ec7
Depends-On: I8bc1b94c01d2e6e0b352a44bcb8e1d24a9fbe4ee
2020-09-18 08:14:15 +10:00
Umherirrender
381c934075 Use StatsdDataFactory service in ParserCache
New argument is optional, because extension extends this class

Change-Id: I710016c0ca9f8bb595d9f3ccd9452c76fdda3ef3
2020-06-21 21:15:17 +02:00
DannyS712
cbbd029cac Remove terminating line breaks from wfDebugLog calls
Change-Id: Iac61ba7924597d654df7bf0a9136eeb3adbe0eef
2020-06-03 02:48:36 +00:00
Tim Starling
47a1619027 Remove terminating line breaks from debug messages
A terminating line break has not been required in wfDebug() since 2014,
however no migration was done. Some of these line breaks found their way
into LoggerInterface::debug() calls, where they mess up the formatting
of the debug log.

So, remove terminating line breaks from wfDebug() and
LoggerInterface::debug() calls.

Also:
* Fix the stripping of leading line breaks from the log header emitted
  by Setup.php. This feature, accidentally broken in 2014, allows
  requests to be distinguished in the log file.
* Avoid using the global variable $self.
* Move the logging of the client IP back to Setup.php. It was moved to
  WebRequest in the hopes that it would not always be needed, however
  $wgRequest->getIP() is now called unconditionally a few lines up in
  Setup.php. This means that it is put in its proper place after the
  "start request" message.
* Wrap the log header code in a closure so that variables like $name do
  not leak into global scope.
* In Linker.php, remove a few instances of an unnecessary second
  parameter to wfDebug().

Change-Id: I96651d3044a95b9d210b51cb8368edc76bebbb9e
2020-06-03 12:01:16 +10:00
Tim Starling
68c433bd23 Hooks::run() call site migration
Migrate all callers of Hooks::run() to use the new
HookContainer/HookRunner system.

General principles:
* Use DI if it is already used. We're not changing the way state is
  managed in this patch.
* HookContainer is always injected, not HookRunner. HookContainer
  is a service, it's a more generic interface, it is the only
  thing that provides isRegistered() which is needed in some cases,
  and a HookRunner can be efficiently constructed from it
  (confirmed by benchmark). Because HookContainer is needed
  for object construction, it is also needed by all factories.
* "Ask your friendly local base class". Big hierarchies like
  SpecialPage and ApiBase have getHookContainer() and getHookRunner()
  methods in the base class, and classes that extend that base class
  are not expected to know or care where the base class gets its
  HookContainer from.
* ProtectedHookAccessorTrait provides protected getHookContainer() and
  getHookRunner() methods, getting them from the global service
  container. The point of this is to ease migration to DI by ensuring
  that call sites ask their local friendly base class rather than
  getting a HookRunner from the service container directly.
* Private $this->hookRunner. In some smaller classes where accessor
  methods did not seem warranted, there is a private HookRunner property
  which is accessed directly. Very rarely (two cases), there is a
  protected property, for consistency with code that conventionally
  assumes protected=private, but in cases where the class might actually
  be overridden, a protected accessor is preferred over a protected
  property.
* The last resort: Hooks::runner(). Mostly for static, file-scope and
  global code. In a few cases it was used for objects with broken
  construction schemes, out of horror or laziness.

Constructors with new required arguments:
* AuthManager
* BadFileLookup
* BlockManager
* ClassicInterwikiLookup
* ContentHandlerFactory
* ContentSecurityPolicy
* DefaultOptionsManager
* DerivedPageDataUpdater
* FullSearchResultWidget
* HtmlCacheUpdater
* LanguageFactory
* LanguageNameUtils
* LinkRenderer
* LinkRendererFactory
* LocalisationCache
* MagicWordFactory
* MessageCache
* NamespaceInfo
* PageEditStash
* PageHandlerFactory
* PageUpdater
* ParserFactory
* PermissionManager
* RevisionStore
* RevisionStoreFactory
* SearchEngineConfig
* SearchEngineFactory
* SearchFormWidget
* SearchNearMatcher
* SessionBackend
* SpecialPageFactory
* UserNameUtils
* UserOptionsManager
* WatchedItemQueryService
* WatchedItemStore

Constructors with new optional arguments:
* DefaultPreferencesFactory
* Language
* LinkHolderArray
* MovePage
* Parser
* ParserCache
* PasswordReset
* Router

setHookContainer() now required after construction:
* AuthenticationProvider
* ResourceLoaderModule
* SearchEngine

Change-Id: Id442b0dbe43aba84bd5cf801d86dedc768b082c7
2020-05-30 14:23:28 +00:00
Reedy
b038d6333a Fix even more PSR12.Properties.ConstantVisibility.NotFound
Change-Id: I6d98efcfac1f1c0ab6a442e0af6d5daa6ef7801a
2020-05-16 00:28:41 +00:00
DannyS712
4721717527 Replace uses and hard deprecate Article:: and WikiPage::getRevision
Bug: T250532
Bug: T239975
Change-Id: Ic8f2baa0ac805d5196a7107bdc7a1abb36eba139
2020-04-20 23:06:48 +00:00
ArtBaltai
13ae7b807f ParserCache::get use WikiPage only as argument
ParserCache works only with WikiPage; remove Article and Page interfaces
Rename WikiPage property names and type hintings

Bug: T248719
Change-Id: I08afded432b059f94538be574a4789e18e89bf03
2020-04-12 03:49:48 +03:00
C. Scott Ananian
8a1c656150 Hard deprecate ParserCache::singleton(), deprecated in 1.30
Code search:
https://codesearch.wmflabs.org/search/?q=ParserCache%5Cs*%3A%3A%5Cs*singleton&i=fosho&files=&repos=

Bug: T249032
Change-Id: I22308bb2530a4aaa6a29e42d50fd679b932a6e9f
2020-04-01 10:31:38 -04:00
addshore
13548c6c12 Remove old pcache metric compat from ParserCache.php
All usages have been removed from grafana.

Bug: T235724
Change-Id: If7f72706ee80ff41beebdc16b3df014ec3e9caca
2019-10-31 14:41:52 +01:00
Aaron Schulz
6c31ca3f25 parsercache: use WRITE_ALLOW_SEGMENTS for cached ParserOutput values
This lets large output entries fit into memcached via key segmentation.

Follows b09b3980f9 which applied the feature to PageEditStash.

Bug: T204742
Change-Id: I33a60f5d718cd9033ea12d1d16046d2bede87b5b
2019-08-25 00:53:28 +00:00
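
Key segmentation, as mentioned in this commit, can be pictured as splitting an oversized value across several derived keys and keeping a small index entry under the original key. The Python sketch below is only an illustration of that idea: the key layout, the `SEGMENT_SIZE` limit and both function names are assumptions, not the BagOStuff implementation.

```python
SEGMENT_SIZE = 8  # tiny limit for the demo; real stores allow far more


def set_segmented(store, key, value: bytes):
    """Store value under key, splitting it across extra keys when too large."""
    if len(value) <= SEGMENT_SIZE:
        store[key] = {"inline": value}
        return
    chunks = [value[i:i + SEGMENT_SIZE] for i in range(0, len(value), SEGMENT_SIZE)]
    for index, chunk in enumerate(chunks):
        store[f"{key}:segment:{index}"] = chunk
    # The main entry only records how many segments to fetch.
    store[key] = {"segments": len(chunks)}


def get_segmented(store, key):
    """Reassemble a value; return None if the entry or any segment is missing."""
    entry = store.get(key)
    if entry is None:
        return None
    if "inline" in entry:
        return entry["inline"]
    parts = [store.get(f"{key}:segment:{i}") for i in range(entry["segments"])]
    if any(part is None for part in parts):
        return None
    return b"".join(parts)
```

Treating a missing segment as a miss matters because a cache may evict segments independently of the index entry.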
Reedy
9f2ffdfbd4 Remove "Squiz.WhiteSpace.FunctionSpacing" from phpcs exclusions
Change-Id: I78b3315f26ab91b6b443f5b028a635552f82f5a3
2019-05-11 02:44:26 +01:00
Aaron Schulz
f474fa4cef Avoid using outdated $casToken field for BagOStuff calls
Change-Id: Ic9bcb388e4f50e2ae16ae57aa16113e79b43350b
2019-03-11 23:39:29 -07:00
addshore
bc86b698cd parser: Add new pcache metrics, split by page content model
Change-Id: I31c3c5b863309ffcc4424c43891b577b3fb7a753
2019-02-11 20:48:56 +00:00
daniel
d8c409dd16 Make HTML generation in RenderedRevision optional
This allows optimization for situations in which a caller
needs the meta-data of a ParserOutput, and the respective
ContentHandler can provide that meta-data without generating
HTML output.

Bug: T194048
Change-Id: I786d294d18a6a2e3cea61577313e21b578c44f1e
2018-08-31 10:48:41 +00:00
Fomafix
6a022c8d20 Add type hint for ParserOutput
EditPage::getPreviewLimitReport is called by EditPage::showEditForm
with $output = null. Specify this in the @param tag and allow this by a
default value.

Change-Id: Iec8905aab736a1f254a57853c7cab935d008653e
2018-07-30 09:23:59 +02:00
Thiemo Kreuz
e6b6920cff Fix PHPDoc type hints in CacheTime, ParserOptions, and related
I'm intentionally not touching any code in this patch, only
documentation.

Change-Id: I6975194c218760031789d5335dfbb330017dc6fc
2018-04-18 15:10:31 +00:00
Brad Jorsch
2791fb0861 Hard-deprecate ParserOutput stateful transform methods
This also removes all the in-core calls that had been kept for the
benefit of extensions, and causes them to not have any effect since
anything that had been calling them was already either a no-op or will
probably be broken now that nothing in core is setting or checking the
flags.

Change-Id: Id22c1a5a6d6a249debb14063ae3f8838d105b634
2018-02-13 12:28:36 -05:00
Kunal Mehta
399adec9ad Turn ParserCache into a service, deprecate $parserMemc
ParserCache is already a singleton, making it a good candidate for a
service. $parserMemc is an odd global (it lacks the "wg" prefix) and is
ripe for deprecation.

The following are now deprecated:
* $parserMemc global
* ParserCache::singleton()
* wfGetParserCacheStorage()

A ParserCache::getCacheStorage() method was added for cases where direct
access to the underlying BagOStuff object is necessary.

Usage of $parserMemc will emit deprecation warnings through the
DeprecatedGlobal class mechanism. All usage in core was migrated.

Also take this opportunity to inject the $wgCacheEpoch global value into
ParserCache. This will require an update to the FlaggedRevs extension.

Change-Id: I2ac7afff0d8522214329248c3d1cdccd0f72bbd4
2017-07-05 19:56:49 -07:00
Brad Jorsch
84694a9d59 Remove ParserOptions::legacyOptions() and cleanup related code
ParserOptions::legacyOptions() has been sitting around since 1.17.
Originally it seems to have been intended as a way to avoid a mass cache
invalidation (similar to optionsHashPre30() from I7fb9ffca9). That code
was mostly removed in 1.23, but legacyOptions() was left behind because
it was also being used in a few places as "all cache-varying options"
(despite it not being documented for that purpose) where we'd rather
have any key than no key at all.

This patch creates an actual ParserOptions::allCacheVaryingOptions()
method for those use cases and deprecates the long-obsolete
legacyOptions().

It also makes more explicit the use of the "all cache-varying options"
fallback in ParserCache::getKey(), and doesn't bother trying to use that
fallback in ParserCache::get() where it no longer makes sense.

Change-Id: Ife1e54744155136a570210c03fe907f18f8e8ece
2017-07-04 01:28:57 +00:00
Brad Jorsch
27fd0920a1 Remove ParserOptions::optionsHashPre30()
The pre-1.30 version of ParserOptions::optionsHash() was kept
temporarily as ParserOptions::optionsHashPre30() to prevent a cache
stampede on WMF sites when the hash format was changed in I7fb9ffca9.

Now that the cache has been rebuilt, it's no longer needed and we should
clean it up instead of leaving it forever to bitrot.

Change-Id: I037d8dfdefe72a295547bd331bc1454e69cb418d
2017-06-28 00:18:59 +00:00
Brad Jorsch
da43a0ae34 ParserCache: Delete old-style key when saving
It was noticed that disk usage on the parser cache machines was
increasing since shortly after wmf.4 was redeployed everywhere on the
9th. One theory is that I7fb9ffca9 causes this by making reparses for an
existing old-style cache entry start writing the new-style key where
they would previously have overwritten the old-style key. On that
theory, let's delete that old-style key (that should now be useless) on
save.

I'm assuming here that firing a blind delete for keys that probably
don't exist in the cache (i.e. every new edit) isn't going to hurt
anything. If that's not the case, we'd need to check existence before
deleting.

Bug: T167784
Change-Id: Ie5efb05722cb7da2a90da195a1f244468177175d
2017-06-14 13:42:36 +00:00
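
The save path described in this commit writes the new-style key and fires a blind delete for the old-style one. A minimal Python sketch, assuming a dict-like `store` and hypothetical key names:

```python
def save(store, new_key, old_key, value):
    """Write under the new-style key and blindly drop the old-style one."""
    store[new_key] = value
    # No existence check first: deleting a key that is absent is a no-op.
    store.pop(old_key, None)
```

For most saves the old key will not exist, so the delete is wasted work; the commit accepts that cost in exchange for not leaving stale duplicates on disk.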
Brad Jorsch
0facbe3e3d Try harder to avoid parser cache pollution
* ParserOptions is reorganized so it knows all the options and their
  defaults, and can report whether the non-key options are at their
  defaults.
* Definition of the "canonical" ParserOptions (which is unfortunately
  different from the "default" ParserOptions) is moved from
  ContentHandler to ParserOptions.
* WikiPage uses this to throw an exception if it's asked to cache
  with options that aren't used in the cache key.
* ParserCache gets some temporary code to try to avoid a massive cache
  stampede on upgrade.

Bug: T110269
Change-Id: I7fb9ffca96e6bd04db44d2d5f2509ec96ad9371f
Depends-On: I4070a8f51927121f690469716625db4a1064dea5
2017-06-05 14:17:28 +00:00
Kunal Mehta
ff8a0c788b parser: Avoid deprecated wfMemcKey()
Tested that parser cache keys stay the same, before and after this
change.

Also use the more obvious ObjectCache::getLocalClusterInstance() instead
of looking up the main cache type in config and using
ObjectCache::getInstance().

Change-Id: Icef646b3c05e732ef4079d6900e6bce111debf2b
2017-05-25 12:05:49 -07:00
James D. Forrester
9635dda73a includes: Replace implicit Bugzilla bug numbers with Phab ones
It's unreasonable to expect newbies to know that "bug 12345" means "Task T14345"
except where it doesn't, so let's just standardise on the real numbers.

Change-Id: I6f59febaf8fc96e80f8cfc11f4356283f461142a
2017-02-21 18:13:24 +00:00