wiki.techinc.nl/includes/historyblob/ConcatenatedGzipHistoryBlob.php
Tim Starling 20d06b34bb Safer autoloading with respect to file-scope code
Many files were in the autoloader despite having potentially harmful
file-scope code.

* Exclude all CommandLineInc maintenance scripts from the autoloader.
* Introduce a "NO_AUTOLOAD" tag which excludes the file containing it
  from the autoloader. Use it on CommandLineInc.php and a few
  suspicious-looking files without classes in case they are refactored
  to add classes in the future.
* Add a test which parses all non-PSR4 class files and confirms that
  they do not contain dangerous file-scope code. It's slow (15s) but
  its results were enlightening.
* Several maintenance scripts define constants in the file scope,
  intending to modify the behaviour of MediaWiki. Either move the
  define() to a later setup function, or protect with NO_AUTOLOAD.
* Use require_once consistently with Maintenance.php and
  doMaintenance.php, per the original convention which is supposed to
  allow one maintenance script to use the class of another maintenance
  script. Using require breaks autoloading of these maintenance class
  files.
* When Maintenance.php is included, check if MediaWiki has already
  started, and if so, return early. Revert the fix for T250003 which
  is incompatible with this safety measure. Hopefully it was superseded
  by splitting out the class file.
* In runScript.php add a redundant PHP_SAPI check since it does some
  things in file-scope code before any other check will be run.
* Change the if(false) class_alias(...) to something more hackish and
  more compatible with the new test.
* Some site-related scripts found Maintenance.php in a non-standard way.
  Use the standard way.
* fileOpPerfTest.php called error_reporting(). Probably debugging code
  left in; removed.
* Moved mediawiki.compress.7z registration from the class file to the
  caller.

Change-Id: I1b1be90343a5ab678df6f1b1bdd03319dcf6537f
2021-01-11 11:59:36 +11:00

<?php
/**
* Efficient concatenated text storage.
*
* This program is free software; you can redistribute it and/or modify
* it under the terms of the GNU General Public License as published by
* the Free Software Foundation; either version 2 of the License, or
* (at your option) any later version.
*
* This program is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU General Public License for more details.
*
* You should have received a copy of the GNU General Public License along
* with this program; if not, write to the Free Software Foundation, Inc.,
* 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301, USA.
* http://www.gnu.org/copyleft/gpl.html
*
* @file
*/
/**
* Concatenated gzip (CGZ) storage
* Improves compression ratio by concatenating like objects before gzipping
*/
class ConcatenatedGzipHistoryBlob implements HistoryBlob {
public $mVersion = 0;
public $mCompressed = false;
/**
* @var array|string
* @fixme Why are some methods treating it as an array, and others as a string, unconditionally?
*/
public $mItems = [];
public $mDefaultHash = '';
public $mSize = 0;
public $mMaxSize = 10000000;
public $mMaxCount = 100;
public function __construct() {
if ( !function_exists( 'gzdeflate' ) ) {
throw new MWException( "Need zlib support to read or write this "
. "kind of history object (ConcatenatedGzipHistoryBlob)\n" );
}
}
/**
* @param string $text
* @return string MD5 hash of $text, used as the item key
*/
public function addItem( $text ) {
$this->uncompress();
$hash = md5( $text );
if ( !isset( $this->mItems[$hash] ) ) {
$this->mItems[$hash] = $text;
$this->mSize += strlen( $text );
}
return $hash;
}
/**
* @param string $hash
* @return string|bool The stored text, or false if no item has this hash
*/
public function getItem( $hash ) {
$this->uncompress();
if ( array_key_exists( $hash, $this->mItems ) ) {
return $this->mItems[$hash];
} else {
return false;
}
}
/**
* @param string $text
* @return void
*/
public function setText( $text ) {
$this->uncompress();
$this->mDefaultHash = $this->addItem( $text );
}
/**
* @return string|bool The default item's text, or false if none has been set
*/
public function getText() {
$this->uncompress();
return $this->getItem( $this->mDefaultHash );
}
/**
* Remove an item
*
* @param string $hash
*/
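public function removeItem( $hash ) {
// mItems is a deflated string while compressed, so inflate it first.
$this->uncompress();
// Ignore unknown hashes so that mSize stays consistent.
if ( isset( $this->mItems[$hash] ) ) {
$this->mSize -= strlen( $this->mItems[$hash] );
unset( $this->mItems[$hash] );
}
}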
/**
* Compress the bulk data in the object
*/
public function compress() {
if ( !$this->mCompressed ) {
$this->mItems = gzdeflate( serialize( $this->mItems ) );
$this->mCompressed = true;
}
}
/**
* Uncompress bulk data
*/
public function uncompress() {
if ( $this->mCompressed ) {
$this->mItems = unserialize( gzinflate( $this->mItems ) );
$this->mCompressed = false;
}
}
/**
* @return array
*/
public function __sleep() {
$this->compress();
return [ 'mVersion', 'mCompressed', 'mItems', 'mDefaultHash' ];
}
public function __wakeup() {
$this->uncompress();
}
/**
* Helper function for compression jobs
* Returns true until the object is "full" and ready to be committed
*
* @return bool
*/
public function isHappy() {
return $this->mSize < $this->mMaxSize
&& count( $this->mItems ) < $this->mMaxCount;
}
}
// Blobs generated by MediaWiki < 1.5 on PHP 4 were serialized with the
// class name coerced to lowercase. We can improve efficiency by adding
// autoload entries for the lowercase variants of these classes (T166759).
// The code below is never executed, but it is picked up by the AutoloadGenerator
// parser, which scans for class_alias() calls.
/*
class_alias( ConcatenatedGzipHistoryBlob::class, 'concatenatedgziphistoryblob' );
*/
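// Usage sketch (illustrative only). It is kept inside a comment so that this
// file gains no executable file-scope code. It assumes zlib is available and
// that the HistoryBlob interface and MWException are already loaded; the
// sample strings and variable names are hypothetical.
/*
$blob = new ConcatenatedGzipHistoryBlob();
$hashA = $blob->addItem( 'Revision text A' );
$hashB = $blob->addItem( 'Revision text B' );
$blob->addItem( 'Revision text A' ); // same MD5 key, so it is stored only once

// serialize() calls __sleep(), which deflates all items as a single stream.
$stored = serialize( $blob );

// unserialize() calls __wakeup(), which inflates the items again.
$restored = unserialize( $stored );
$text = $restored->getItem( $hashB ); // 'Revision text B'
*/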