wiki.techinc.nl/maintenance/purgeParserCache.php
Timo Tijhof e556a7ea99 purgeParserCache.php: Remove carriage return printer, minor cleanup
* There was a carriage return, as part of a self-overwriting progress
  bar. While nice for humans running the script by hand, this made
  the script difficult to understand in WMF production where the logs
  go to syslog/journalctl, which doesn't like \r and thus showed only
  `[2KB blob]` instead of the actual output. This can bypassed with
  `journalctl -u … --all`, but that still leaves one with a sea of
  misaligned progress bars, concatenated as one long "line".

  It seems most of our maintenance scripts nowadays just keep printing
  lines to notify of progress made, so change the script to that,
  by just printing a boring "... % done\n" line (at most) every 0.1%.

* Change hardcoded batch size of 100 to use $wgUpdateRowsPerQuery.
  This is generally a no-op as this configuration defaults to 100 in
  DefaultSettings.php, and is not overridden in WMF production.

* Add a "--dry-run" option to allow for last-minute verification of
  the age and date calculation which is rather non-trivial given that
  we don't index creation time and assume $wgParserCacheExpireTime
  is constant and unchanged, which in practice, when this script is
  considered for running, tends not to be the case, making it hard to
  be sure exactly what it will do.

Minor stuff:

* Make the ORDER BY clause explicitly use ascending order.

* Try to retroactively document why we use a --msleep parameter
  instead of waiting for replication.

* Revert some of the variable name changes of I723e6377c26750ff9
  which imho made the code harder to understand.

* Flip order of $overallRatio additions to match the order of
  $tablesDoneRatio (e.g. high level + current thing).

* Remove use of Language->timeanddate() in favour of native
  date formatting, mainly to include seconds and timezone,
  but also because its simpler not to bring in all of Language
  and LocalisationCache into the script.

* Pass unix time directly to SqlBagOStuff::deleteObjectsExpiringBefore,
  as other callers do, instead of TS_MW-formatting first. There is
  also additional casting to TS_UNIX within the method.

Before:

> Deleting objects expiring before 21:43, 7 May 2021
> [**************************************************] 100.00%
> Done

After:

> Deleting objects expiring before Fri, 07 May 2021 21:43:52 GMT
> ... 50.0% done
> ... 100.0% done
> Done

Bug: T280605
Bug: T282761
Change-Id: I563886be0b3aeb187c274c0de4549374e0ff18f0
2021-05-13 02:42:29 +00:00

117 lines
4.4 KiB
PHP

<?php
/**
* Remove old objects from the parser cache.
* This only works when the parser cache is in an SQL database.
*
* This program is free software; you can redistribute it and/or modify
* it under the terms of the GNU General Public License as published by
* the Free Software Foundation; either version 2 of the License, or
* (at your option) any later version.
*
* This program is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU General Public License for more details.
*
* You should have received a copy of the GNU General Public License along
* with this program; if not, write to the Free Software Foundation, Inc.,
* 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301, USA.
* http://www.gnu.org/copyleft/gpl.html
*
* @file
* @ingroup Maintenance
*/
require_once __DIR__ . '/Maintenance.php';
use MediaWiki\MediaWikiServices;
use Wikimedia\Timestamp\ConvertibleTimestamp;
/**
* Maintenance script to remove old objects from the parser cache.
*
* @ingroup Maintenance
*/
class PurgeParserCache extends Maintenance {
private $lastProgress;
private $usleep = 0;
public function __construct() {
parent::__construct();
$this->addDescription( "Remove old objects from the parser cache. " .
"This only works when the parser cache is in an SQL database." );
$this->addOption( 'expiredate', 'Delete objects expiring before this date.', false, true );
$this->addOption(
'age',
'Delete objects created more than this many seconds ago, assuming ' .
'$wgParserCacheExpireTime has remained consistent.',
false,
true );
$this->addOption( 'dry-run', 'Perform a dry run, to verify age and date calculation.' );
$this->addOption( 'msleep', 'Milliseconds to sleep between purge chunks of $wgUpdateRowsPerQuery.',
false,
true );
}
public function execute() {
global $wgParserCacheExpireTime;
$inputDate = $this->getOption( 'expiredate' );
$inputAge = $this->getOption( 'age' );
if ( $inputDate !== null ) {
$timestamp = strtotime( $inputDate );
} elseif ( $inputAge !== null ) {
$timestamp = time() + $wgParserCacheExpireTime - intval( $inputAge );
} else {
$this->fatalError( "Must specify either --expiredate or --age" );
return;
}
$this->usleep = 1e3 * $this->getOption( 'msleep', 0 );
$humanDate = ConvertibleTimestamp::convert( TS_RFC2822, $timestamp );
if ( $this->hasOption( 'dry-run' ) ) {
$this->fatalError( "\nDry run mode, would delete objects having an expiry before " . $humanDate . "\n" );
return;
}
$this->output( "Deleting objects expiring before " . $humanDate . "\n" );
$pc = MediaWikiServices::getInstance()->getParserCache()->getCacheStorage();
$success = $pc->deleteObjectsExpiringBefore( $timestamp, [ $this, 'showProgressAndWait' ] );
if ( !$success ) {
$this->fatalError( "\nCannot purge this kind of parser cache." );
}
$this->showProgressAndWait( 100 );
$this->output( "\nDone\n" );
}
public function showProgressAndWait( $percent ) {
// Parser caches involve mostly-unthrottled writes of large blobs. This is sometimes prone
// to replication lag. As such, while our purge queries are simple primary key deletes,
// we want to avoid adding significant load to the replication stream, by being
// proactively graceful with these sleeps between each batch.
// The reason we don't explicitly wait for replication is that that would require the script
// to be aware of cross-dc replicas, which we prefer not to, and waiting for replication
// and confirmation latency might actually be *too* graceful and take so long that the
// purge script would not be able to finish within 24 hours for large wiki farms.
// (T150124).
usleep( $this->usleep );
$percentString = sprintf( "%.1f", $percent );
if ( $percentString === $this->lastProgress ) {
// Only print a line if we've progressed >= 0.1% since the last printed line.
// This does not mean every 0.1% step is printed since we only run this callback
// once after a deletion batch. How often and how many lines we print depends on the
// batch size (SqlBagOStuff::deleteObjectsExpiringBefore, $wgUpdateRowsPerQuery),
// and on how many talbe rows there are.
return;
}
$this->lastProgress = $percentString;
$this->output( "... $percentString% done\n" );
}
}
$maintClass = PurgeParserCache::class;
require_once RUN_MAINTENANCE_IF_MAIN;