ResourceLoader: Remove DependencyStore::renew

== Background

When file dependency information is lost, the startup module computes
a hash that is based on an incomplete summary of bundled resources.
This means it arrives at a "wrong" hash. Once a browser actually asks
for that version of the module, though, we rediscover the dependency
information, and subsequent startup responses will once again arrive
at the same correct hash. These 5-minute windows, during which the
browser cache of anyone visiting is churned over, are not great,
and so we try to avoid them.
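
To illustrate why lost dependency info yields a different hash, here is a
minimal sketch. It is not ResourceLoader's actual hashing code: the
`versionHash()` function and the path => content-hash maps are hypothetical,
and only show that a hash over an incomplete file summary differs from the
hash over the full one, and returns to the same value once the dependencies
are rediscovered.

```php
<?php
// Illustrative only: NOT ResourceLoader's real hashing code.
// versionHash() and the path => content-hash maps are hypothetical.

/**
 * Summarise a module's files into a short version hash.
 *
 * @param array $ownFiles Map of path => content hash for files named in the module definition
 * @param array $knownDeps Map of path => content hash for indirect dependencies from the store
 * @return string
 */
function versionHash( array $ownFiles, array $knownDeps ): string {
	$summary = array_merge( $ownFiles, $knownDeps );
	ksort( $summary ); // make the result independent of discovery order
	return substr( sha1( json_encode( $summary ) ), 0, 8 );
}

$own = [ 'styles.less' => 'aaa' ];
$deps = [ 'foo.css' => 'bbb', 'bar.svg' => 'ccc' ];

$correct = versionHash( $own, $deps );   // dependency info present
$wrong = versionHash( $own, [] );        // dependency info lost
$recovered = versionHash( $own, $deps ); // after rediscovery
```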

The status quo is the dedicated module_deps table in core with no
expiry. This means a potential concern is building up garbage over
time for modules and extensions that no longer exist or are no longer
deployed on that wiki. In practice this has not been much of an issue:
we haven't run the cleanupRemovedModules.php or purgeModuleDeps.php
scripts in years. Once in 2017 to fix corrupt rows (T158105), and
once in 2020 to estimate needed space if we had expiries
<https://phabricator.wikimedia.org/T113916#6142457>.

Hence we're moving to mainstash via KeyValueDepStore, and not to
memcached. But for that we might as well start using expiries.

To avoid regularly losing dep info and causing avoidable browser cache
churn for modules that are hot and very much still exist, we adopted
`renew()` in 5282a0296 when drafting KeyValueDepStore, so that we keep
moving the TTL of active rows forward and let the rest naturally
expire.

== Problem

The changeTTL writes are so heavy and undebounced that they fully
saturate the hardware disk, which is unable to keep up with the sheer
amount of streaming append-only writes.

https://phabricator.wikimedia.org/T312902

== Future

Perhaps we can make this work if SqlBagOStuff in "MainStash" mode
was more efficient and lenient around changeTTL. E.g. rather than
simultaneously ensuring presence of the row itself for perfect eventual
consistency, maybe it could just be a light "touch" to ensure that
any such row has a given minimum TTL.

Alternatively, if we don't make it part of the generalised
SqlBag/MainStash interface but something specific to KeyValueDepStore,
we could also do something several orders of magnitude more efficient,
such as only touching it once a day or once a week. Today, every read
is followed by a write, several hundred times a second, which amplifies
the read back into a full row write and thus produces a very large
and repetitive binlog.
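
A minimal sketch of such debouncing, under stated assumptions: the function
names and the one-day default are hypothetical and not part of MediaWiki. It
assumes the stored `asOf` timestamp can serve as the "last written" time, and
only selects an entry for renewal when that timestamp is at least one
interval old.

```php
<?php
// Hypothetical sketch; none of these names exist in MediaWiki.
const SECONDS_PER_DAY = 86400;

/**
 * Decide whether a read should be followed by a TTL-renewing write.
 *
 * @param int|null $asOf UNIX timestamp of the last write, or null if unknown
 * @param int $now Current UNIX timestamp
 * @param int $minInterval Minimum seconds between renewals of one entry
 * @return bool
 */
function shouldRenew( ?int $asOf, int $now, int $minInterval = SECONDS_PER_DAY ): bool {
	// Unknown age: renew once so that later reads have a timestamp to compare.
	if ( $asOf === null ) {
		return true;
	}
	return ( $now - $asOf ) >= $minInterval;
}

/**
 * Reduce a map of entity => asOf to the entities that are due for renewal.
 *
 * @param array $asOfByEntity Map of entity name => UNIX timestamp or null
 * @param int $now Current UNIX timestamp
 * @param int $minInterval Minimum seconds between renewals of one entry
 * @return string[]
 */
function entitiesDueForRenewal(
	array $asOfByEntity, int $now, int $minInterval = SECONDS_PER_DAY
): array {
	$due = [];
	foreach ( $asOfByEntity as $entity => $asOf ) {
		if ( shouldRenew( $asOf, $now, $minInterval ) ) {
			$due[] = $entity;
		}
	}
	return $due;
}
```

With a one-day interval, a hot module read several hundred times a second
would cause at most about one renewal write per day per entry, instead of
one per read.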

== This change

As an interim measure, I propose we remove renew() and instead increase
the TTL from 1 week to 1 year. This is still shorter than "indefinite",
which is what the module_deps table does in the status quo, and that
was never an issue in practice in terms of space. This is because
the list of modules is quite stable. It's limited to modules
that are both file-backed (so no gadgets) and also have non-trivial
file dependencies (such as styles.less -> foo.css -> bar.svg).

== Impact

The installer and update.php (DatabaseUpdater) already clear
`module_deps` and `objectcache` so this is a non-issue for third
parties.

For WMF, it means that the maintenance script we never ran can
be removed, as entries will now be cleaned up automatically a year
after they were last written, with a small cache churn cost to pay
at that time.

Bug: T113916
Bug: T312902
Change-Id: Ie11bdfdcf5e6724bc19ac24e4353aaea316029fd
Timo Tijhof 2022-07-11 14:20:22 -07:00
parent b10c2c984a
commit 1d66a22805
4 changed files with 21 additions and 54 deletions

@ -106,7 +106,7 @@ class ResourceLoader implements LoggerAwareInterface {
/** @var string */
private const RL_DEP_STORE_PREFIX = 'ResourceLoaderModule';
/** @var int How long to preserve indirect dependency metadata in our backend store. */
private const RL_MODULE_DEP_TTL = BagOStuff::TTL_WEEK;
private const RL_MODULE_DEP_TTL = BagOStuff::TTL_YEAR;
/** @var int */
private const MAXAGE_RECOVER = 60;
@ -525,22 +525,23 @@ class ResourceLoader implements LoggerAwareInterface {
} else {
$this->depStoreUpdateBuffer[$entity] = null;
}
} elseif ( $priorPaths ) {
// Dependency store needs to store the existing path list for longer
$this->depStoreUpdateBuffer[$entity] = '*';
}
// Use a DeferrableUpdate to flush the buffered dependency updates...
// If paths were unchanged, leave the dependency store unchanged also.
// The entry will eventually expire, after which we will briefly issue an incomplete
// version hash for a 5-min startup window, the module then recomputes and rediscovers
// the paths and arrive at the same module version hash once again. It will churn
// part of the browser cache once, for clients connecting during that window.
if ( !$hasPendingUpdate ) {
DeferredUpdates::addCallableUpdate( function () {
$updatesByEntity = $this->depStoreUpdateBuffer;
$this->depStoreUpdateBuffer = []; // consume
$this->depStoreUpdateBuffer = [];
$cache = ObjectCache::getLocalClusterInstance();
$scopeLocks = [];
$depsByEntity = [];
$entitiesUnreg = [];
$entitiesRenew = [];
foreach ( $updatesByEntity as $entity => $update ) {
$lockKey = $cache->makeKey( 'rl-deps', $entity );
$scopeLocks[$entity] = $cache->getScopedLock( $lockKey, 0 );
@ -551,8 +552,6 @@ class ResourceLoader implements LoggerAwareInterface {
}
if ( $update === null ) {
$entitiesUnreg[] = $entity;
} elseif ( $update === '*' ) {
$entitiesRenew[] = $entity;
} else {
$depsByEntity[$entity] = $update;
}
@ -561,7 +560,6 @@ class ResourceLoader implements LoggerAwareInterface {
$ttl = self::RL_MODULE_DEP_TTL;
$this->depStore->storeMulti( self::RL_DEP_STORE_PREFIX, $depsByEntity, $ttl );
$this->depStore->remove( self::RL_DEP_STORE_PREFIX, $entitiesUnreg );
$this->depStore->renew( self::RL_DEP_STORE_PREFIX, $entitiesRenew, $ttl );
} );
}
}

@ -21,9 +21,9 @@
namespace Wikimedia\DependencyStore;
/**
* Class for tracking per-entity dependency path lists that are expensive to mass compute
* Track per-module dependency file paths that are expensive to mass compute
*
* @internal This should not be used outside of ResourceLoader and ResourceLoader\Module
* @internal For use by ResourceLoader\Module only
*/
abstract class DependencyStore {
/** @var string */
@ -74,14 +74,6 @@ abstract class DependencyStore {
/**
* Set the currently tracked dependencies for an entity
*
* Dependency data should be set to persist as long as anything might rely on it existing
* in order to check the validity of some previously computed work. This can be achieved
* while minimizing storage space under the following scheme:
* - a) computed work has a TTL (time-to-live)
* - b) when work is computed, the dependency data is updated
* - c) the dependency data has a TTL higher enough to accounts for skew/latency
* - d) the TTL of tracked dependency data is renewed upon access
*
* @param string $type Entity type
* @param string $entity Entity name
* @param array $data Map of (paths: paths, asOf: UNIX timestamp or null)
@ -113,14 +105,4 @@ abstract class DependencyStore {
* @throws DependencyStoreException
*/
abstract public function remove( $type, $entities );
/**
* Set the expiry for the currently tracked dependencies for an entity or set of entities
*
* @param string $type Entity type
* @param string|string[] $entities Entity name(s)
* @param int $ttl New time-to-live in seconds
* @throws DependencyStoreException
*/
abstract public function renew( $type, $entities, $ttl );
}

@ -24,13 +24,10 @@ use BagOStuff;
use InvalidArgumentException;
/**
* Lightweight class for tracking path dependencies lists via an object cache instance
*
* This does not throw DependencyStoreException due to I/O errors since it is optimized for
* speed and availability. Read methods return empty placeholders on failure. Write methods
* might issue I/O in the background and return immediately. However, reads methods will at
* least block on the resolution (success/failure) of any such pending writes.
* Track per-module file dependencies in object cache via BagOStuff.
*
* @see $wgResourceLoaderUseObjectCacheForDeps
* @internal For use by ResourceLoader\Module only
* @since 1.35
*/
class KeyValueDependencyStore extends DependencyStore {
@ -96,17 +93,6 @@ class KeyValueDependencyStore extends DependencyStore {
}
}
public function renew( $type, $entities, $ttl ) {
$keys = [];
foreach ( (array)$entities as $entity ) {
$keys[] = $this->getStoreKey( $type, $entity );
}
if ( $keys ) {
$this->stash->changeTTLMulti( $keys, $ttl, BagOStuff::WRITE_BACKGROUND );
}
}
/**
* @param string $type
* @param string $entity

@ -27,11 +27,16 @@ use Wikimedia\Rdbms\IDatabase;
use Wikimedia\Rdbms\ILoadBalancer;
/**
* Class for tracking per-entity dependency path lists in the module_deps table
* Track per-module file dependencies in the core module_deps table
*
* This should not be used outside of ResourceLoader and ResourceLoader\Module
* Wiki farms that are too big for maintenance/update.php, can clean up
* unneeded data for modules that no longer exist after a MW upgrade,
* by running maintenance/cleanupRemovedModules.php.
*
* @internal For use with ResourceLoader/ResourceLoader\Module only
* To force a rebuild and incur a small penalty in browser cache churn,
* run maintenance/purgeModuleDeps.php instead.
*
* @internal For use by ResourceLoader\Module only
* @since 1.35
*/
class SqlModuleDependencyStore extends DependencyStore {
@ -151,10 +156,6 @@ class SqlModuleDependencyStore extends DependencyStore {
}
}
public function renew( $type, $entities, $ttl ) {
// no-op
}
/**
* @param string[] $entities
* @param IDatabase $db