wiki.techinc.nl/includes/utils/FileContentsHasher.php
Timo Tijhof 6bf01cfaa3 resourceloader: Use FileContentsHasher batching in FileModule::getFileHashes
Instead of hashing each file separately, hash them in a batch.
Previous research on this area of code identified the suppressing
and restoring of warnings as having a measurable cost, which is why
this was optimised in 3621ad0f82. However, we never really made
use of it (aside from the 2:1 change in that commit itself), because
we always called it with a single item, turned it into an array, did
the hash, and then merged it again.

Instead, we now let it handle a single module's set of files all
at once. Given that this no longer exposes an array of hashes,
also update the (private) signature of getFileHashes to reflect this.
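
For illustration, a rough sketch (not the actual FileModule code) of
the difference, using a hypothetical $files list of file paths:

  // Before: one call per file, each paying the suppress/restore cost.
  foreach ( $files as $filePath ) {
      $hashes[] = FileContentsHasher::getFileContentsHash( $filePath );
  }

  // After: one call for the module's whole set of files.
  $hash = FileContentsHasher::getFileContentsHash( $files );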

This means all file modules will have their version bumped during the
next MediaWiki release. In general this happens for most releases
and weekly branches already (due to localisation update, general
maintenance, linting changes, and other internal changes). But,
noting here for future reference as this might not be obvious from
the diff.

Change-Id: I4e141a0f5a5c1c1972e2ba33d4b7be6e64ed6ab6
2019-12-17 11:56:22 +00:00


<?php
/**
 * Generate hash digests of file contents to help with cache invalidation.
 *
 * This program is free software; you can redistribute it and/or modify
 * it under the terms of the GNU General Public License as published by
 * the Free Software Foundation; either version 2 of the License, or
 * (at your option) any later version.
 *
 * This program is distributed in the hope that it will be useful,
 * but WITHOUT ANY WARRANTY; without even the implied warranty of
 * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
 * GNU General Public License for more details.
 *
 * You should have received a copy of the GNU General Public License along
 * with this program; if not, write to the Free Software Foundation, Inc.,
 * 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301, USA.
 * http://www.gnu.org/copyleft/gpl.html
 *
 * @file
 */
class FileContentsHasher {

	/** @var BagOStuff */
	protected $cache;

	/** @var FileContentsHasher */
	private static $instance;

	public function __construct() {
		$this->cache = ObjectCache::getLocalServerInstance( 'hash' );
	}

	/**
	 * Get the singleton instance of this class.
	 *
	 * @return FileContentsHasher
	 */
	public static function singleton() {
		if ( !self::$instance ) {
			self::$instance = new self;
		}

		return self::$instance;
	}

	/**
	 * Get a hash of a file's contents, either by retrieving a previously-
	 * computed hash from the cache, or by computing a hash from the file.
	 *
	 * @param string $filePath Full path to the file.
	 * @param string $algo Name of selected hashing algorithm.
	 * @return string|bool Hash of file contents, or false if the file could not be read.
	 */
	private function getFileContentsHashInternal( $filePath, $algo = 'md4' ) {
		$mtime = filemtime( $filePath );
		if ( $mtime === false ) {
			return false;
		}

		$cacheKey = $this->cache->makeGlobalKey( __CLASS__, $filePath, $mtime, $algo );
		$hash = $this->cache->get( $cacheKey );

		if ( $hash ) {
			return $hash;
		}

		$contents = file_get_contents( $filePath );
		if ( $contents === false ) {
			return false;
		}

		$hash = hash( $algo, $contents );
		$this->cache->set( $cacheKey, $hash, 60 * 60 * 24 ); // 24h

		return $hash;
	}

	/**
	 * Get a hash of the combined contents of one or more files, either by
	 * retrieving a previously-computed hash from the cache, or by computing
	 * a hash from the files.
	 *
	 * @param string|string[] $filePaths One or more file paths.
	 * @param string $algo Name of selected hashing algorithm.
	 * @return string|bool Hash of files' contents, or false if no file could be read.
	 */
	public static function getFileContentsHash( $filePaths, $algo = 'md4' ) {
		$instance = self::singleton();

		if ( !is_array( $filePaths ) ) {
			$filePaths = (array)$filePaths;
		}

		Wikimedia\suppressWarnings();

		if ( count( $filePaths ) === 1 ) {
			$hash = $instance->getFileContentsHashInternal( $filePaths[0], $algo );
			Wikimedia\restoreWarnings();
			return $hash;
		}

		sort( $filePaths );
		$hashes = [];

		foreach ( $filePaths as $filePath ) {
			$hashes[] = $instance->getFileContentsHashInternal( $filePath, $algo ) ?: '';
		}

		Wikimedia\restoreWarnings();

		$hashes = implode( '', $hashes );

		return $hashes ? hash( $algo, $hashes ) : false;
	}
}
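
As a usage sketch (the paths below are hypothetical), a caller can pass
either a single path or a batch. The batched form sorts the paths, hashes
each file's contents (with a per-file cache keyed on path, mtime and
algorithm), and then hashes the concatenated digests:

  // Single file: returns the file's md4 digest, or false if unreadable.
  $one = FileContentsHasher::getFileContentsHash( '/srv/mediawiki/resources/startup.js' );

  // Batch: returns one combined digest for the whole set of files,
  // or false if none of the files could be read.
  $all = FileContentsHasher::getFileContentsHash( [
      '/srv/mediawiki/resources/foo.js',
      '/srv/mediawiki/resources/foo.css',
  ] );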