wiki.techinc.nl/includes/ResourceLoader/dependencystore/KeyValueDependencyStore.php
Timo Tijhof 1d66a22805 ResourceLoader: Remove DependencyStore::renew
== Background

When file dependency information is lost, the startup module computes
a hash that is based on an incomplete summary of bundled resources.
This means it arrives at a "wrong" hash. Once a browser actually asks
for that version of the module, though, we rediscover the dependency
information, and subsequent startup responses will once again arrive
at the same correct hash. These 5-minute windows of time where
the browser cache of anyone visiting is churned over are not great,
and so we try to avoid them.
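
As a rough illustration of why lost dependency information leads to a
different hash, here is a minimal sketch. The `versionHash()` helper and
the in-memory file map are hypothetical and are not ResourceLoader's
actual hashing code; they only show that hashing an incomplete set of
inputs changes the result.

```php
<?php
// Hypothetical stand-in for file contents on disk.
$files = [
	'styles.less' => '@import "foo.css";',
	'foo.css' => '.a { background: url(bar.svg); }',
	'bar.svg' => '<svg/>',
];

/**
 * Hypothetical helper: hash the contents of all known paths.
 * This is an assumption for illustration, not the real algorithm.
 */
function versionHash( array $files, array $knownPaths ): string {
	sort( $knownPaths, SORT_STRING );
	$summary = '';
	foreach ( $knownPaths as $path ) {
		$summary .= $path . "\0" . $files[$path] . "\0";
	}
	return substr( sha1( $summary ), 0, 8 );
}

// With dependency info: all three files contribute to the hash.
$complete = versionHash( $files, [ 'styles.less', 'foo.css', 'bar.svg' ] );
// Dependency info lost: only the directly listed file is known.
$incomplete = versionHash( $files, [ 'styles.less' ] );

// The two hashes differ, so browsers see a "new" module version.
var_dump( $complete === $incomplete ); // bool(false)
```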

The status quo is the dedicated module_deps table in core with no
expiry. This means a potential concern is building up garbage over
time for modules and extensions that no longer exist or are no longer
deployed on that wiki. In practice this has not been much of an issue;
we have rarely run the cleanupRemovedModules.php or purgeModuleDeps.php
scripts in recent years: once in 2017 to fix corrupt rows (T158105), and
once in 2020 to estimate needed space if we had expiries
<https://phabricator.wikimedia.org/T113916#6142457>.

Hence we're moving to mainstash via KeyValueDepStore, and not to
memcached. But for that we might as well start using expiries.

To avoid regularly losing dep info and causing avoidable browser cache
churn for modules that are hot and very much still in existence,
we adopted `renew()` in 5282a0296 when drafting KeyValueDepStore, so that
we keep moving the TTL of active rows forward and let the rest naturally
expire.

== Problem

The changeTTL writes are so heavy and undebounced that they fully
saturate the hardware disk, which is unable to keep up with the sheer
amount of streaming append-only writes.

https://phabricator.wikimedia.org/T312902

== Future

Perhaps we can make this work if SqlBagOStuff in "MainStash" mode
was more efficient and lenient around changeTTL. E.g. rather than
simultaneously ensuring presence of the row itself for perfect eventual
consistency, maybe it could just be a light "touch" to ensure that
any such row has a given minimum TTL.

Alternatively, if we don't make it part of the generalised
SqlBag/MainStash interface but something specific to KeyValueDepStore,
we could also do something several orders of magnitude more efficient,
such as only touching it once a day or once a week, instead of several
hundred times a second, where every read performs a write that
amplifies the read back into a full row write, with thus a very large
and repetitive binlog.
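
To make the "touch once a day" idea concrete, here is a minimal sketch
of a debounced renewal. The `DebouncedRenewStore` class, its in-memory
rows and the `renewed` timestamp field are all hypothetical; nothing
like this exists in BagOStuff or KeyValueDependencyStore today.

```php
<?php
/**
 * Hypothetical in-memory store that debounces TTL renewals.
 * An assumption for illustration, not an existing MediaWiki class.
 */
class DebouncedRenewStore {
	/** @var array[] Rows keyed by store key: value, expiry, renewed */
	private $rows = [];
	/** @var int Number of writes performed, counted for illustration */
	public $writes = 0;
	/** @var int Minimum number of seconds between renewals of one key */
	private $renewInterval;

	public function __construct( int $renewInterval ) {
		$this->renewInterval = $renewInterval;
	}

	public function set( string $key, string $value, int $ttl, int $now ): void {
		$this->rows[$key] = [ 'value' => $value, 'expiry' => $now + $ttl, 'renewed' => $now ];
		$this->writes++;
	}

	/** Read a value and renew its TTL at most once per interval. */
	public function getAndRenew( string $key, int $ttl, int $now ): ?string {
		$row = $this->rows[$key] ?? null;
		if ( $row === null || $row['expiry'] <= $now ) {
			return null;
		}
		if ( $now - $row['renewed'] >= $this->renewInterval ) {
			// Only this branch costs a write; most reads skip it.
			$this->rows[$key]['expiry'] = $now + $ttl;
			$this->rows[$key]['renewed'] = $now;
			$this->writes++;
		}
		return $row['value'];
	}
}

$day = 86400;
$week = 7 * $day;
$store = new DebouncedRenewStore( $day );
$store->set( 'example-key', '{"paths":[]}', $week, 0 );
// 1000 reads within the first day: no renewal writes.
for ( $i = 1; $i <= 1000; $i++ ) {
	$store->getAndRenew( 'example-key', $week, $i );
}
// One read a day later triggers a single renewal.
$store->getAndRenew( 'example-key', $week, $day );
echo $store->writes, "\n"; // 2 (the initial set plus one renewal)
```

With this shape, the write rate is bounded by the number of active keys
per interval rather than by the read rate.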

== This change

As an interim measure, I propose we remove renew() and instead increase
the TTL from 1 week to 1 year. This is still shorter than "indefinite"
which is what the module_deps table does in the status quo, and that
was never an issue in practice in terms of space. This is because
the list of modules is quite stable. It's limited to modules
that are both file-backed (so no gadgets) and also have non-trivial
file dependencies (such as styles.less -> foo.css -> bar.svg).

== Impact

The installer and update.php (DatabaseUpdater) already clear
`module_deps` and `objectcache` so this is a non-issue for third
parties.

For WMF, it means that the maintenance script we never ran can
be removed, as stale entries will now expire automatically after
a year of inactivity, with a small cache churn cost to pay at that
time.

Bug: T113916
Bug: T312902
Change-Id: Ie11bdfdcf5e6724bc19ac24e4353aaea316029fd
2022-07-12 15:25:39 -07:00


<?php
/**
 * This program is free software; you can redistribute it and/or modify
 * it under the terms of the GNU General Public License as published by
 * the Free Software Foundation; either version 2 of the License, or
 * (at your option) any later version.
 *
 * This program is distributed in the hope that it will be useful,
 * but WITHOUT ANY WARRANTY; without even the implied warranty of
 * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
 * GNU General Public License for more details.
 *
 * You should have received a copy of the GNU General Public License along
 * with this program; if not, write to the Free Software Foundation, Inc.,
 * 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301, USA.
 * http://www.gnu.org/copyleft/gpl.html
 *
 * @file
 */

namespace Wikimedia\DependencyStore;

use BagOStuff;
use InvalidArgumentException;

/**
 * Track per-module file dependencies in object cache via BagOStuff.
 *
 * @see $wgResourceLoaderUseObjectCacheForDeps
 * @internal For use by ResourceLoader\Module only
 * @since 1.35
 */
class KeyValueDependencyStore extends DependencyStore {
	/** @var BagOStuff */
	private $stash;

	/**
	 * @param BagOStuff $stash Storage backend
	 */
	public function __construct( BagOStuff $stash ) {
		$this->stash = $stash;
	}

	public function retrieveMulti( $type, array $entities ) {
		// Index entities by store key so fetched blobs can be mapped back
		$entitiesByKey = [];
		foreach ( $entities as $entity ) {
			$entitiesByKey[$this->getStoreKey( $type, $entity )] = $entity;
		}

		$blobsByKey = $this->stash->getMulti( array_keys( $entitiesByKey ) );

		$results = [];
		foreach ( $entitiesByKey as $key => $entity ) {
			// A missing or non-string blob falls back to an empty path list
			// and a null timestamp
			$blob = $blobsByKey[$key] ?? null;
			$data = is_string( $blob ) ? json_decode( $blob, true ) : null;
			$results[$entity] = $this->newEntityDependencies(
				$data[self::KEY_PATHS] ?? [],
				$data[self::KEY_AS_OF] ?? null
			);
		}

		return $results;
	}

	public function storeMulti( $type, array $dataByEntity, $ttl ) {
		$blobsByKey = [];
		foreach ( $dataByEntity as $entity => $data ) {
			if ( !is_array( $data[self::KEY_PATHS] ) || !is_int( $data[self::KEY_AS_OF] ) ) {
				throw new InvalidArgumentException( "Invalid entry for '$entity'" );
			}

			// Normalize the list by removing duplicates and sorting
			$data[self::KEY_PATHS] = array_values( array_unique( $data[self::KEY_PATHS] ) );
			sort( $data[self::KEY_PATHS], SORT_STRING );

			$blob = json_encode( $data, JSON_UNESCAPED_SLASHES | JSON_UNESCAPED_UNICODE );
			$blobsByKey[$this->getStoreKey( $type, $entity )] = $blob;
		}

		if ( $blobsByKey ) {
			$this->stash->setMulti( $blobsByKey, $ttl, BagOStuff::WRITE_BACKGROUND );
		}
	}

	public function remove( $type, $entities ) {
		// Accept either a single entity name or a list of them
		$keys = [];
		foreach ( (array)$entities as $entity ) {
			$keys[] = $this->getStoreKey( $type, $entity );
		}

		if ( $keys ) {
			$this->stash->deleteMulti( $keys, BagOStuff::WRITE_BACKGROUND );
		}
	}

	/**
	 * @param string $type
	 * @param string $entity
	 * @return string
	 */
	private function getStoreKey( $type, $entity ) {
		return $this->stash->makeKey( "{$type}-dependencies", $entity );
	}
}
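
The normalisation step in storeMulti() above can be tried in isolation.
This is a minimal sketch: the `paths` and `asOf` array keys are assumed
for illustration (the class uses the KEY_PATHS and KEY_AS_OF constants,
whose values are not shown in this file), and the file paths and
timestamp are arbitrary.

```php
<?php
// Hypothetical entry with a duplicate path in unsorted order.
$data = [
	'paths' => [ 'resources/foo.css', 'resources/bar.svg', 'resources/foo.css' ],
	'asOf' => 1600000000,
];

// Same normalisation as storeMulti(): drop duplicates, then sort as strings.
$data['paths'] = array_values( array_unique( $data['paths'] ) );
sort( $data['paths'], SORT_STRING );

$blob = json_encode( $data, JSON_UNESCAPED_SLASHES | JSON_UNESCAPED_UNICODE );
echo $blob, "\n";
// {"paths":["resources/bar.svg","resources/foo.css"],"asOf":1600000000}
```

Because the list is deduplicated and sorted before encoding, the same
set of dependencies always produces the same blob regardless of the
order in which the paths were discovered.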