== Background
Most of this was introduced in commit 5c335f9d77 (I1eb897c2cea3f5b7).
The original motivation was:
* Ensure wrappers like MultiWriteBagOStuff naturally do the right
thing. In practice, makeKey() results are interchangeable, with
the most constrained one (Memcached) also generally used as the first
tier. However, this is not intuitive and may change in the future.
To make it more intuitive, the default implementation became known
as "generic", with proxyCall() responsible for decoding these,
and then re-encoding them with makeKey() from the respective
underlying BagOStuff. This meant that MultiWriteBag would no longer
use the result of the Memcached-formatted cache key and pass it
to SqlBagOStuff.
* Allow extraction of the key group from a given cache key,
for use in statistics.
Both motivations remain valid and addressed after this refactor.
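To illustrate the decode and re-encode step described above, here is a minimal sketch. The function names (makeGenericKey, decodeGenericKey, makeStrictKey), the percent-escaping scheme, and the "examplewiki" keyspace are assumptions made for this example; they are not taken from the BagOStuff code. The sketch also assumes the keyspace itself contains no colon. It shows that the key group is simply the first decoded component.

```php
<?php
// Hypothetical, simplified stand-ins; these are not the MediaWiki functions.

/** Build a "generic" key: keyspace, then components with '%' and ':' escaped. */
function makeGenericKey( string $keyspace, array $components ): string {
	$key = $keyspace;
	foreach ( $components as $component ) {
		$key .= ':' . strtr( (string)$component, [ '%' => '%25', ':' => '%3A' ] );
	}
	return $key;
}

/** Decode a generic key back into [ keyspace, components ]. */
function decodeGenericKey( string $key ): array {
	$parts = explode( ':', $key );
	// Assumption: the keyspace is the first segment and has no ':' in it
	$keyspace = array_shift( $parts );
	$components = array_map(
		static fn ( string $part ): string => strtr( $part, [ '%3A' => ':', '%25' => '%' ] ),
		$parts
	);
	return [ $keyspace, $components ];
}

/** A made-up stricter backend format that also percent-encodes spaces. */
function makeStrictKey( string $keyspace, array $components ): string {
	$encoded = array_map(
		static fn ( $component ): string => rawurlencode( (string)$component ),
		$components
	);
	return $keyspace . ':' . implode( ':', $encoded );
}

// A wrapper receives a generic key from its caller...
$generic = makeGenericKey( 'examplewiki', [ 'page-summary', 'Main Page', 'en:v2' ] );

// ...decodes it, and re-encodes it for the underlying store.
[ $keyspace, $components ] = decodeGenericKey( $generic );
$strict = makeStrictKey( $keyspace, $components );

// The key group (first component) is available for statistics.
$keyGroup = $components[0];
```

The point of the round trip is that the stricter store never sees a key that was formatted for a different store; it always formats the original components itself.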
== Change
* Remove boilerplate and indirection around makeKey from a dozen
classes. E.g. copy-paste stubs for makeKey, makeKeyInternal, and
convertGenericKey.
Instead, let BagOStuff::makeKey and ::makeKeyInternal hold the
defaults. I believe this makes the logic easier to find, understand,
and refer to.
The three non-default implementations (Memcached, WinCache, Sql)
now naturally reflect what they are in terms of business logic:
a method override.
Introduce a single boolean requireConvertGenericKey() to let the
three non-default implementations signal their need to convert
keys before use.
* Further improve internal consistency of BagOStuff::makeKeyInternal.
The logic of genericKeyFromComponents() was moved up into
BagOStuff::makeKeyInternal. As a result of calling this directly
from BagOStuff::makeKey(), this code now sees $keyspace and $components
as separate arguments. To keep the behaviour the same, we would
have to either unshift $keyspace into $components, or duplicate
the strtr() call to escape it.
Instead, exempt keyspace from escaping. This matches how the most
commonly used BagOStuff implementations (MemcachedBag and SqlBag)
already worked for 10+ years, thus this does not introduce any new
responsibility on callers. In particular, keyspace (not key group)
is set by MediaWiki core in service wiring to the wiki ID, and so
is not the concern of individual callers anyway.
* Docs: Explain in proxyCall() why this indirection and complexity
exists. It lets wrapping classes decode and re-encode keys.
* Docs: Explain the cross-wiki and local-wiki semantics of makeKey
and makeKeyGlobal, and centralise this and other important docs
about this method in the place with the most eyeballs, where it is
most likely seen and discovered, namely BagOStuff::makeKey.
Remove partial docs from other places in favour of references to this one.
Previously, there was no particular reason to follow `@see IStoreKeyEncoder`,
much less to know that it holds critical docs that communicate the
responsibility to limit the key group to 48 chars.
* Docs: Consistently refer to the first component as the "key group",
thus unifying what was known as "key class", "collection",
"key collection name", or "collection name".
The term "key group" seems to be what is used by developers in
conversations for this concept, matching WMF on-boarding docs and
WMF's Grafana dashboard for WANObjectCache.
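The class layout described in the list above can be sketched roughly as follows. This is an illustration under assumptions, not the actual code: SketchBag and SketchStrictBag are made-up class names, the escaping rules are invented for the example, and requireConvertGenericKey() is made public here only so the example can call it. The base class holds the defaults, a non-default store overrides makeKeyInternal, and the keyspace is passed through without escaping.

```php
<?php
// Hypothetical miniature of the layout; bodies are illustrative only.

class SketchBag {
	protected string $keyspace;

	public function __construct( string $keyspace ) {
		$this->keyspace = $keyspace;
	}

	/** Default: build a key for the local keyspace from a key group and components. */
	public function makeKey( string $keygroup, ...$components ): string {
		return $this->makeKeyInternal(
			$this->keyspace,
			array_merge( [ $keygroup ], $components )
		);
	}

	/** Default "generic" encoding: components are escaped, the keyspace is not. */
	protected function makeKeyInternal( string $keyspace, array $components ): string {
		$key = $keyspace;
		foreach ( $components as $component ) {
			$key .= ':' . strtr( (string)$component, [ '%' => '%25', ':' => '%3A' ] );
		}
		return $key;
	}

	/** Whether generic keys must be converted before this store uses them. */
	public function requireConvertGenericKey(): bool {
		return false;
	}
}

class SketchStrictBag extends SketchBag {
	/** Non-default encoding expressed as a plain method override. */
	protected function makeKeyInternal( string $keyspace, array $components ): string {
		$key = $keyspace;
		foreach ( $components as $component ) {
			// Assumed stricter rule: percent-encode spaces and other unsafe bytes
			$key .= ':' . rawurlencode( (string)$component );
		}
		return $key;
	}

	public function requireConvertGenericKey(): bool {
		return true;
	}
}

$plain = new SketchBag( 'examplewiki' );
$strict = new SketchStrictBag( 'examplewiki' );

$plainKey = $plain->makeKey( 'page-summary', 'Main Page', 'a:b' );
$strictKey = $strict->makeKey( 'page-summary', 'Main Page', 'a:b' );

// The keyspace is exempt from escaping, so a colon in it is kept as-is.
$unescaped = ( new SketchBag( 'wiki:beta' ) )->makeKey( 'stats' );
```

Because the keyspace is set once by whoever constructs the store, leaving it unescaped puts no extra burden on the code that calls makeKey().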
Change-Id: I6b3167cac824d8bd8773bc66c386f41e4d380021
379 lines
11 KiB
PHP
<?php
/**
 * Wrapper for object caching in different caches.
 *
 * This program is free software; you can redistribute it and/or modify
 * it under the terms of the GNU General Public License as published by
 * the Free Software Foundation; either version 2 of the License, or
 * (at your option) any later version.
 *
 * This program is distributed in the hope that it will be useful,
 * but WITHOUT ANY WARRANTY; without even the implied warranty of
 * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
 * GNU General Public License for more details.
 *
 * You should have received a copy of the GNU General Public License along
 * with this program; if not, write to the Free Software Foundation, Inc.,
 * 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301, USA.
 * http://www.gnu.org/copyleft/gpl.html
 *
 * @file
 * @ingroup Cache
 */
use Wikimedia\ObjectFactory\ObjectFactory;

/**
 * A cache class that replicates all writes to multiple child caches. Reads
 * are implemented by reading from the caches in the order they are given in
 * the configuration until a cache gives a positive result.
 *
 * Note that cache key construction will use the first cache backend in the list,
 * so make sure that the other backends can handle such keys (e.g. via encoding).
 *
 * @newable
 * @ingroup Cache
 */
class MultiWriteBagOStuff extends BagOStuff {
	/** @var BagOStuff[] Backing cache stores in order of highest to lowest tier */
	protected $caches;

	/** @var bool Use async secondary writes */
	protected $asyncWrites = false;
	/** @var int[] List of all backing cache indexes */
	protected $cacheIndexes = [];

	/** @var int TTL when a key is copied to a higher cache tier */
	private static $UPGRADE_TTL = 3600;

	/**
	 * @stable to call
	 * @param array $params
	 *   - caches: A numbered array of either ObjectFactory::getObjectFromSpec
	 *      arrays yielding BagOStuff objects or direct BagOStuff objects.
	 *      If using the former, the 'args' field *must* be set.
	 *      The first cache is the primary one, being the first to
	 *      be read in the fallback chain. Writes happen to all stores
	 *      in the order they are defined. However, lock()/unlock() calls
	 *      only use the primary store.
	 *   - replication: Either 'sync' or 'async'. This controls whether writes
	 *      to secondary stores are deferred when possible. To use 'async' writes
	 *      requires the 'asyncHandler' option to be set as well.
	 *      Async writes can increase the chance of some race conditions
	 *      or cause keys to expire seconds later than expected. It is
	 *      safe to use for modules when cached values: are immutable,
	 *      invalidation uses logical TTLs, invalidation uses etag/timestamp
	 *      validation against the DB, or merge() is used to handle races.
	 * @phan-param array{caches:array<int,array|BagOStuff>,replication:string} $params
	 * @throws InvalidArgumentException
	 */
	public function __construct( $params ) {
		parent::__construct( $params );

		if ( empty( $params['caches'] ) || !is_array( $params['caches'] ) ) {
			throw new InvalidArgumentException(
				__METHOD__ . ': "caches" parameter must be an array of caches'
			);
		}

		$this->caches = [];
		foreach ( $params['caches'] as $cacheInfo ) {
			if ( $cacheInfo instanceof BagOStuff ) {
				$this->caches[] = $cacheInfo;
			} else {
				$this->caches[] = ObjectFactory::getObjectFromSpec( $cacheInfo );
			}
		}

		$this->attrMap = $this->mergeFlagMaps( $this->caches );

		$this->asyncWrites = (
			isset( $params['replication'] ) &&
			$params['replication'] === 'async' &&
			is_callable( $this->asyncHandler )
		);

		$this->cacheIndexes = array_keys( $this->caches );
	}

	public function get( $key, $flags = 0 ) {
		$args = func_get_args();

		if ( $this->fieldHasFlags( $flags, self::READ_LATEST ) ) {
			// If the latest write was a delete(), we do NOT want to fallback
			// to the other tiers and possibly see the old value. Also, this
			// is used by merge(), which only needs to hit the primary.
			return $this->callKeyMethodOnTierCache(
				0,
				__FUNCTION__,
				self::ARG0_KEY,
				self::RES_NONKEY,
				$args
			);
		}

		$value = false;
		// backends checked
		$missIndexes = [];
		foreach ( $this->cacheIndexes as $i ) {
			$value = $this->callKeyMethodOnTierCache(
				$i,
				__FUNCTION__,
				self::ARG0_KEY,
				self::RES_NONKEY,
				$args
			);
			if ( $value !== false ) {
				break;
			}
			$missIndexes[] = $i;
		}

		if (
			$value !== false &&
			$this->fieldHasFlags( $flags, self::READ_VERIFIED ) &&
			$missIndexes
		) {
			// Backfill the value to the higher (and often faster/smaller) cache tiers
			$this->callKeyWriteMethodOnTierCaches(
				$missIndexes,
				'set',
				self::ARG0_KEY,
				self::RES_NONKEY,
				[ $key, $value, self::$UPGRADE_TTL ]
			);
		}

		return $value;
	}

	public function set( $key, $value, $exptime = 0, $flags = 0 ) {
		return $this->callKeyWriteMethodOnTierCaches(
			$this->cacheIndexes,
			__FUNCTION__,
			self::ARG0_KEY,
			self::RES_NONKEY,
			func_get_args()
		);
	}

	public function delete( $key, $flags = 0 ) {
		return $this->callKeyWriteMethodOnTierCaches(
			$this->cacheIndexes,
			__FUNCTION__,
			self::ARG0_KEY,
			self::RES_NONKEY,
			func_get_args()
		);
	}

	public function add( $key, $value, $exptime = 0, $flags = 0 ) {
		// Try the write to the top-tier cache
		$ok = $this->callKeyMethodOnTierCache(
			0,
			__FUNCTION__,
			self::ARG0_KEY,
			self::RES_NONKEY,
			func_get_args()
		);

		if ( $ok ) {
			// Relay the add() using set() if it succeeded. This is meant to handle certain
			// migration scenarios where the same store might get written to twice for certain
			// keys. In that case, it makes no sense to return false due to "self-conflicts".
			$okSecondaries = $this->callKeyWriteMethodOnTierCaches(
				array_slice( $this->cacheIndexes, 1 ),
				'set',
				self::ARG0_KEY,
				self::RES_NONKEY,
				[ $key, $value, $exptime, $flags ]
			);
			if ( $okSecondaries === false ) {
				$ok = false;
			}
		}

		return $ok;
	}

	public function merge( $key, callable $callback, $exptime = 0, $attempts = 10, $flags = 0 ) {
		return $this->callKeyWriteMethodOnTierCaches(
			$this->cacheIndexes,
			__FUNCTION__,
			self::ARG0_KEY,
			self::RES_NONKEY,
			func_get_args()
		);
	}

	public function changeTTL( $key, $exptime = 0, $flags = 0 ) {
		return $this->callKeyWriteMethodOnTierCaches(
			$this->cacheIndexes,
			__FUNCTION__,
			self::ARG0_KEY,
			self::RES_NONKEY,
			func_get_args()
		);
	}

	public function lock( $key, $timeout = 6, $exptime = 6, $rclass = '' ) {
		// Only need to lock the first cache; also avoids deadlocks
		return $this->callKeyMethodOnTierCache(
			0,
			__FUNCTION__,
			self::ARG0_KEY,
			self::RES_NONKEY,
			func_get_args()
		);
	}

	public function unlock( $key ) {
		// Only the first cache is locked
		return $this->callKeyMethodOnTierCache(
			0,
			__FUNCTION__,
			self::ARG0_KEY,
			self::RES_NONKEY,
			func_get_args()
		);
	}

	public function deleteObjectsExpiringBefore(
		$timestamp,
		callable $progress = null,
		$limit = INF,
		string $tag = null
	) {
		$ret = false;
		foreach ( $this->caches as $cache ) {
			if ( $cache->deleteObjectsExpiringBefore( $timestamp, $progress, $limit, $tag ) ) {
				$ret = true;
			}
		}

		return $ret;
	}

	public function getMulti( array $keys, $flags = 0 ) {
		// Just iterate over each key in order to handle all the backfill logic
		$res = [];
		foreach ( $keys as $key ) {
			$val = $this->get( $key, $flags );
			if ( $val !== false ) {
				$res[$key] = $val;
			}
		}

		return $res;
	}

	public function setMulti( array $valueByKey, $exptime = 0, $flags = 0 ) {
		return $this->callKeyWriteMethodOnTierCaches(
			$this->cacheIndexes,
			__FUNCTION__,
			self::ARG0_KEYMAP,
			self::RES_NONKEY,
			func_get_args()
		);
	}

	public function deleteMulti( array $keys, $flags = 0 ) {
		return $this->callKeyWriteMethodOnTierCaches(
			$this->cacheIndexes,
			__FUNCTION__,
			self::ARG0_KEYARR,
			self::RES_NONKEY,
			func_get_args()
		);
	}

	public function changeTTLMulti( array $keys, $exptime, $flags = 0 ) {
		return $this->callKeyWriteMethodOnTierCaches(
			$this->cacheIndexes,
			__FUNCTION__,
			self::ARG0_KEYARR,
			self::RES_NONKEY,
			func_get_args()
		);
	}

	public function incrWithInit( $key, $exptime, $step = 1, $init = null, $flags = 0 ) {
		return $this->callKeyWriteMethodOnTierCaches(
			$this->cacheIndexes,
			__FUNCTION__,
			self::ARG0_KEY,
			self::RES_NONKEY,
			func_get_args()
		);
	}

	public function setMockTime( &$time ) {
		parent::setMockTime( $time );
		foreach ( $this->caches as $cache ) {
			$cache->setMockTime( $time );
		}
	}

	/**
	 * Call a method on the cache instance for the given cache tier (index)
	 *
	 * @param int $index Cache tier
	 * @param string $method Method name
	 * @param int $arg0Sig BagOStuff::ARG0_* constant describing argument 0
	 * @param int $rvSig BagOStuff::RES_* constant describing the return value
	 * @param array $args Method arguments
	 * @return mixed The result of calling the given method
	 */
	private function callKeyMethodOnTierCache( $index, $method, $arg0Sig, $rvSig, array $args ) {
		return $this->caches[$index]->proxyCall( $method, $arg0Sig, $rvSig, $args, $this );
	}

	/**
	 * Call a write method on the cache instances, in order, for the given tiers (indexes)
	 *
	 * @param int[] $indexes List of cache tiers
	 * @param string $method Method name
	 * @param int $arg0Sig BagOStuff::ARG0_* constant describing argument 0
	 * @param int $resSig BagOStuff::RES_* constant describing the return value
	 * @param array $args Method arguments
	 * @return mixed First synchronous result or false if any failed; null if all asynchronous
	 */
	private function callKeyWriteMethodOnTierCaches(
		array $indexes,
		$method,
		$arg0Sig,
		$resSig,
		array $args
	) {
		$res = null;

		if ( $this->asyncWrites && array_diff( $indexes, [ 0 ] ) && $method !== 'merge' ) {
			// Deep-clone $args to prevent misbehavior when something writes an
			// object to the BagOStuff then modifies it afterwards, e.g. T168040.
			$args = unserialize( serialize( $args ) );
		}

		foreach ( $indexes as $i ) {
			$cache = $this->caches[$i];

			if ( $i == 0 || !$this->asyncWrites ) {
				// Tier 0 store or in sync mode: write synchronously and get result
				$storeRes = $cache->proxyCall( $method, $arg0Sig, $resSig, $args, $this );
				if ( $storeRes === false ) {
					$res = false;
				} elseif ( $res === null ) {
					// first synchronous result
					$res = $storeRes;
				}
			} else {
				// Secondary write in async mode: do not block this HTTP request
				( $this->asyncHandler )(
					function () use ( $cache, $method, $arg0Sig, $resSig, $args ) {
						$cache->proxyCall( $method, $arg0Sig, $resSig, $args, $this );
					}
				);
			}
		}

		return $res;
	}
}
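
The fallback-and-backfill read path of get() in the class above can be sketched in isolation. This is a simplified illustration under assumptions: tieredGet() is a made-up function, each tier is a plain PHP array rather than a BagOStuff, TTLs are ignored, and a boolean $verified stands in for the READ_VERIFIED flag.

```php
<?php
// Illustrative only: each tier is a plain array; index 0 is the primary tier.

/**
 * Read $key from the first tier that has it, then copy the value into
 * the tiers that missed (only when $verified is true).
 *
 * @param array[] &$tiers Hypothetical stand-in for the backing caches
 * @param string $key
 * @param bool $verified Stand-in for the READ_VERIFIED flag
 * @return mixed The value, or false if no tier has the key
 */
function tieredGet( array &$tiers, string $key, bool $verified ) {
	$value = false;
	$missIndexes = [];
	foreach ( $tiers as $i => $tier ) {
		$value = $tier[$key] ?? false;
		if ( $value !== false ) {
			break;
		}
		// Remember which tiers were checked without a hit
		$missIndexes[] = $i;
	}

	if ( $value !== false && $verified ) {
		// Backfill the tiers that were checked before the hit
		foreach ( $missIndexes as $i ) {
			$tiers[$i][$key] = $value;
		}
	}

	return $value;
}

$tiers = [
	// Tier 0: small and fast, currently empty
	[],
	// Tier 1: larger and slower
	[ 'examplewiki:greeting:en' => 'hello' ],
];

$first = tieredGet( $tiers, 'examplewiki:greeting:en', true );
$missing = tieredGet( $tiers, 'examplewiki:greeting:fr', true );
```

After the first call, tier 0 holds a copy of the value, so a later read would be served from the primary tier. A miss in every tier returns false and backfills nothing.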