Modules now track their version via getVersionHash() instead of getModifiedTime().

== Background ==

While some resources have observable timestamps (e.g. files stored on disk), many other resources do not, e.g. config variables and module definitions.

For static file modules, one can e.g. revert one or more files in a module to a previous version and not affect the max timestamp.

Wiki modules include pages only if they exist. The user module supports common.js and skin.js. By default neither exists. If a user has both, and then the less-recently modified one is deleted, the max-timestamp remains unchanged.

For client-side caching, batch requests use "Math.max" on the relevant timestamps. Again, if a module changes but another module is more recent (e.g. out-of-order deployment, or out-of-order discovery), the change would not result in a cache miss.

More scenarios can be found in the associated Phabricator tasks.

== Version hash ==

Previously we virtually mapped these variables to a timestamp by storing the current time alongside a hash of the value in ObjectCache. Considering the number of possible request contexts (wikis * modules * users * skins * languages) this doesn't work well. It results in needless cache invalidation when the first time observation is purged due to LRU algorithms. It also has other minor bugs leading to fewer cache hits.

All modules automatically get the benefits of version hashing with this change.

The old getDefinitionMtime() and getHashMtime() have been replaced with dummies that return 1. These functions are often called from getModifiedTime() in subclasses. For backward-compatibility, their respective values (definition summary and hash) are now included in getVersionHash() directly.

As examples, the following modules have been updated to use getVersionHash() directly. Other modules still work fine and can be updated later.

* ResourceLoaderFileModule
* ResourceLoaderEditToolbarModule
* ResourceLoaderStartUpModule
* ResourceLoaderWikiModule

The presence of hashes in place of timestamps increases the startup module size on a default MediaWiki install from 4.4k to 5.8k (after gzip and minification).

== ETag ==

Since timestamps are no longer tracked, we need a different way to implement caching for cache proxies (e.g. Varnish) and web browsers. Previously we used the Last-Modified header (in combination with Cache-Control and Expires).

Instead of Last-Modified (and If-Modified-Since), we use ETag (and If-None-Match). Entity tags (new in HTTP/1.1) are much stricter than Last-Modified by default. They instruct browsers to allow usage of partial Range requests. Since our responses are dynamically generated, we need to use the Weak version of ETag.

While this sounds bad, it's no different than Last-Modified. As reassured by RFC 2616 <http://www.w3.org/Protocols/rfc2616/rfc2616-sec13.html#sec13.3.3> the specified behaviour behind Last-Modified follows the same "Weak" caching logic as Entity tags. It's just that entity tags are capable of a stricter mode (whereas Last-Modified is inherently weak).

== File cache ==

If $wgUseFileCache is enabled, ResourceLoader uses ResourceFileCache to cache load.php responses. While the blind TTL handling (during the allowed expiry period) is still maxage/timestamp based, tryRespondNotModified() now requires the caller to know the expected ETag.

For this to work, the FileCache handling had to be moved from the top of ResourceLoader::respond() to after the expected ETag is computed.

This also allows us to remove the duplicate tryRespondNotModified() handling, since that is already handled by ResourceLoader::respond().

== Misc ==

* Remove redundant modifiedTime cache in ResourceLoaderFileModule.
* Change bugzilla references to Phabricator.
* Centralised inclusion of wgCacheEpoch using getDefinitionSummary(). Previously this logic was duplicated in each place the modified timestamp was used.
* It's easy to forget calling the parent class in getDefinitionSummary(). Previously this method only tracked 'class' by default. As such, various extensions hardcoded that one value instead of calling the parent and extending the array. To better prevent this in the future, getVersionHash() now asserts that the '_cacheEpoch' property made it through.
* tests: Don't use getDefinitionSummary() as an API. Fix ResourceLoaderWikiModuleTest to call getPages() properly.
* In tests, the default timestamp used to be 1388534400000 (which is the unix time of 20140101000000; the unit tests' CacheEpoch). The new version hash of these modules is "XyCC+PSK", which is the base64 encoded prefix of the SHA1 digest of:
  '{"_class":"ResourceLoaderTestModule","_cacheEpoch":"20140101000000"}'
* Add sha1.js library for client-side hash generation.
  Compared various different implementations for code size (after minification/gzip), and speed (when used for short hexadecimal strings).
  https://jsperf.com/sha1-implementations
  - CryptoJS <https://code.google.com/p/crypto-js/#SHA-1> (min+gzip: 2.5k)
    http://crypto-js.googlecode.com/svn/tags/3.1.2/build/rollups/sha1.js
    Chrome: 45k, Firefox: 89k, Safari: 92k
  - jsSHA <https://github.com/Caligatio/jsSHA> (min+gzip: 1.8k)
    https://github.com/Caligatio/jsSHA/blob/3c1d4f2e/src/sha1.js
    Chrome: 65k, Firefox: 53k, Safari: 69k
  - phpjs-sha1 <https://github.com/kvz/phpjs> (RL min+gzip: 0.8k)
    https://github.com/kvz/phpjs/blob/1eaab15d/functions/strings/sha1.js
    Chrome: 200k, Firefox: 280k, Safari: 78k

  Modern browsers implement the HTML5 Crypto API. However, this API is asynchronous, only enabled when on HTTPS in Chromium, and is quite low-level. It requires boilerplate code to actually use with TextEncoder, ArrayBuffer and Uint32Array. Due to this being needed in the module loader, we'd have to load the fallback regardless. Considering this is not used in a critical path for performance, it's not worth shipping two implementations for this optimisation.

May also resolve:
* T44094
* T90411
* T94810

Bug: T94074
Change-Id: Ibb292d2416839327d1807a66c78fd96dac0637d0
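The Background and Version hash sections above can be illustrated with a minimal sketch. It is not the code from this change: sketchVersionHash() is a hypothetical helper, the page names and timestamps are made up, and the 8-character prefix length is an assumption inferred from the "XyCC+PSK" example. It shows that deleting the less recently modified page leaves the max timestamp unchanged while a hash over the summary changes, and how such a hash might be wrapped in a weak entity tag.

```php
<?php
// Hypothetical helper, not MediaWiki's API: hash a summary array and keep
// a short base64 prefix of the raw SHA-1 digest (prefix length assumed).
function sketchVersionHash( array $summary ) {
	return substr( base64_encode( sha1( json_encode( $summary ), true ) ), 0, 8 );
}

// Two user pages with illustrative UNIX timestamps.
$before = array(
	'User:Example/common.js' => 1400000000,
	'User:Example/skin.js' => 1300000000,
);
// The less recently modified page is deleted.
$after = array(
	'User:Example/common.js' => 1400000000,
);

// Max-timestamp versioning cannot see the deletion.
$sameTimestamp = max( $before ) === max( $after );

// Hash versioning is computed over a different summary after the deletion.
$sameHash = sketchVersionHash( $before ) === sketchVersionHash( $after );

// A weak entity tag built from the version hash, for use with If-None-Match.
$etag = 'W/"' . sketchVersionHash( $after ) . '"';

var_dump( $sameTimestamp ); // bool(true): the timestamp hides the change
var_dump( $sameHash );
echo $etag, "\n";
```

The second var_dump() is expected to report that the two hashes differ, which is the cache miss the timestamp approach could not produce.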
308 lines
8.8 KiB
PHP
<?php
/**
 * Abstraction for resource loader modules which pull from wiki pages.
 *
 * This program is free software; you can redistribute it and/or modify
 * it under the terms of the GNU General Public License as published by
 * the Free Software Foundation; either version 2 of the License, or
 * (at your option) any later version.
 *
 * This program is distributed in the hope that it will be useful,
 * but WITHOUT ANY WARRANTY; without even the implied warranty of
 * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
 * GNU General Public License for more details.
 *
 * You should have received a copy of the GNU General Public License along
 * with this program; if not, write to the Free Software Foundation, Inc.,
 * 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301, USA.
 * http://www.gnu.org/copyleft/gpl.html
 *
 * @file
 * @author Trevor Parscal
 * @author Roan Kattouw
 */

/**
 * Abstraction for resource loader modules which pull from wiki pages
 *
 * This can only be used for wiki pages in the MediaWiki and User namespaces,
 * because of its dependence on the functionality of
 * Title::isCssJsSubpage.
 */
class ResourceLoaderWikiModule extends ResourceLoaderModule {

	// Origin defaults to users with sitewide authority
	protected $origin = self::ORIGIN_USER_SITEWIDE;

	// In-object cache for title info
	protected $titleInfo = array();

	// List of page names that contain CSS
	protected $styles = array();

	// List of page names that contain JavaScript
	protected $scripts = array();

	// Group of module
	protected $group;

	/**
	 * @param array $options For back-compat, this can be omitted in favour of overwriting getPages.
	 */
	public function __construct( array $options = null ) {
		if ( isset( $options['styles'] ) ) {
			$this->styles = $options['styles'];
		}
		if ( isset( $options['scripts'] ) ) {
			$this->scripts = $options['scripts'];
		}
		if ( isset( $options['group'] ) ) {
			$this->group = $options['group'];
		}
	}

	/**
	 * Subclasses should return an associative array of resources in the module.
	 * Keys should be the title of a page in the MediaWiki or User namespace.
	 *
	 * Values should be a nested array of options. The supported keys are 'type' and
	 * (CSS only) 'media'.
	 *
	 * For scripts, 'type' should be 'script'.
	 *
	 * For stylesheets, 'type' should be 'style'.
	 * There is an optional media key, the value of which can be the
	 * medium ('screen', 'print', etc.) of the stylesheet.
	 *
	 * @param ResourceLoaderContext $context
	 * @return array
	 */
	protected function getPages( ResourceLoaderContext $context ) {
		$config = $this->getConfig();
		$pages = array();

		// Filter out pages from origins not allowed by the current wiki configuration.
		if ( $config->get( 'UseSiteJs' ) ) {
			foreach ( $this->scripts as $script ) {
				$pages[$script] = array( 'type' => 'script' );
			}
		}

		if ( $config->get( 'UseSiteCss' ) ) {
			foreach ( $this->styles as $style ) {
				$pages[$style] = array( 'type' => 'style' );
			}
		}

		return $pages;
	}

	/**
	 * Get group name
	 *
	 * @return string
	 */
	public function getGroup() {
		return $this->group;
	}

	/**
	 * Get the Database object used in getTitleInfo(). Defaults to the local slave DB
	 * but subclasses may want to override this to return a remote DB object, or to return
	 * null if getTitleInfo() shouldn't access the DB at all.
	 *
	 * NOTE: This ONLY works for getTitleInfo() and getModifiedTime(), NOT FOR ANYTHING ELSE.
	 * In particular, it doesn't work for getting the content of JS and CSS pages. That functionality
	 * will use the local DB irrespective of the return value of this method.
	 *
	 * @return IDatabase|null
	 */
	protected function getDB() {
		return wfGetDB( DB_SLAVE );
	}

	/**
	 * @param Title $title
	 * @return null|string
	 */
	protected function getContent( $title ) {
		$handler = ContentHandler::getForTitle( $title );
		if ( $handler->isSupportedFormat( CONTENT_FORMAT_CSS ) ) {
			$format = CONTENT_FORMAT_CSS;
		} elseif ( $handler->isSupportedFormat( CONTENT_FORMAT_JAVASCRIPT ) ) {
			$format = CONTENT_FORMAT_JAVASCRIPT;
		} else {
			return null;
		}

		$revision = Revision::newFromTitle( $title, false, Revision::READ_NORMAL );
		if ( !$revision ) {
			return null;
		}

		$content = $revision->getContent( Revision::RAW );

		if ( !$content ) {
			wfDebugLog( 'resourceloader', __METHOD__ . ': failed to load content of JS/CSS page!' );
			return null;
		}

		return $content->serialize( $format );
	}

	/**
	 * @param ResourceLoaderContext $context
	 * @return string
	 */
	public function getScript( ResourceLoaderContext $context ) {
		$scripts = '';
		foreach ( $this->getPages( $context ) as $titleText => $options ) {
			if ( $options['type'] !== 'script' ) {
				continue;
			}
			$title = Title::newFromText( $titleText );
			if ( !$title || $title->isRedirect() ) {
				continue;
			}
			$script = $this->getContent( $title );
			if ( strval( $script ) !== '' ) {
				$script = $this->validateScriptFile( $titleText, $script );
				$scripts .= ResourceLoader::makeComment( $titleText ) . $script . "\n";
			}
		}
		return $scripts;
	}

	/**
	 * @param ResourceLoaderContext $context
	 * @return array
	 */
	public function getStyles( ResourceLoaderContext $context ) {
		$styles = array();
		foreach ( $this->getPages( $context ) as $titleText => $options ) {
			if ( $options['type'] !== 'style' ) {
				continue;
			}
			$title = Title::newFromText( $titleText );
			if ( !$title || $title->isRedirect() ) {
				continue;
			}
			$media = isset( $options['media'] ) ? $options['media'] : 'all';
			$style = $this->getContent( $title );
			if ( strval( $style ) === '' ) {
				continue;
			}
			if ( $this->getFlip( $context ) ) {
				$style = CSSJanus::transform( $style, true, false );
			}
			$style = CSSMin::remap( $style, false, $this->getConfig()->get( 'ScriptPath' ), true );
			if ( !isset( $styles[$media] ) ) {
				$styles[$media] = array();
			}
			$style = ResourceLoader::makeComment( $titleText ) . $style;
			$styles[$media][] = $style;
		}
		return $styles;
	}

	/**
	 * @param ResourceLoaderContext $context
	 * @return int
	 */
	public function getModifiedTime( ResourceLoaderContext $context ) {
		$modifiedTime = 1;
		$titleInfo = $this->getTitleInfo( $context );
		if ( count( $titleInfo ) ) {
			$mtimes = array_map( function ( $value ) {
				return $value['timestamp'];
			}, $titleInfo );
			$modifiedTime = max( $modifiedTime, max( $mtimes ) );
		}
		$modifiedTime = max(
			$modifiedTime,
			$this->getMsgBlobMtime( $context->getLanguage() ),
			$this->getDefinitionMtime( $context )
		);
		return $modifiedTime;
	}

	/**
	 * @param ResourceLoaderContext $context
	 * @return array
	 */
	public function getDefinitionSummary( ResourceLoaderContext $context ) {
		$summary = parent::getDefinitionSummary( $context );
		$summary[] = array(
			'pages' => $this->getPages( $context ),
		);
		return $summary;
	}

	/**
	 * @param ResourceLoaderContext $context
	 * @return bool
	 */
	public function isKnownEmpty( ResourceLoaderContext $context ) {
		$titleInfo = $this->getTitleInfo( $context );
		// Bug 68488: For modules in the "user" group, we should actually
		// check that the pages are empty (page_len == 0), but for other
		// groups, just check the pages exist so that we don't end up
		// caching temporarily-blank pages without the appropriate
		// <script> or <link> tag.
		if ( $this->getGroup() !== 'user' ) {
			return count( $titleInfo ) === 0;
		}

		foreach ( $titleInfo as $info ) {
			if ( $info['length'] !== 0 ) {
				// At least one non-0-length page, not empty
				return false;
			}
		}

		// All pages are 0-length, so it's empty
		return true;
	}

	/**
	 * Get the modification time and length of all titles that would be
	 * loaded for a given context.
	 * @param ResourceLoaderContext $context Context object
	 * @return array Keyed by page dbkey. Value is an array with 'length' and 'timestamp'
	 *  keys, where the timestamp is a UNIX timestamp
	 */
	protected function getTitleInfo( ResourceLoaderContext $context ) {
		$dbr = $this->getDB();
		if ( !$dbr ) {
			// We're dealing with a subclass that doesn't have a DB
			return array();
		}

		$hash = $context->getHash();
		if ( isset( $this->titleInfo[$hash] ) ) {
			return $this->titleInfo[$hash];
		}

		$this->titleInfo[$hash] = array();
		$batch = new LinkBatch;
		foreach ( $this->getPages( $context ) as $titleText => $options ) {
			$batch->addObj( Title::newFromText( $titleText ) );
		}

		if ( !$batch->isEmpty() ) {
			$res = $dbr->select( 'page',
				array( 'page_namespace', 'page_title', 'page_touched', 'page_len' ),
				$batch->constructSet( 'page', $dbr ),
				__METHOD__
			);
			foreach ( $res as $row ) {
				$title = Title::makeTitle( $row->page_namespace, $row->page_title );
				$this->titleInfo[$hash][$title->getPrefixedDBkey()] = array(
					'timestamp' => wfTimestamp( TS_UNIX, $row->page_touched ),
					'length' => $row->page_len,
				);
			}
		}
		return $this->titleInfo[$hash];
	}
}