wiki.techinc.nl/includes/resourceloader/ResourceLoaderStartUpModule.php
Timo Tijhof f37cee996e resourceloader: Replace timestamp system with version hashing
Modules now track their version via getVersionHash() instead of getModifiedTime().

== Background ==

While some resources have observable timestamps (e.g. files stored on disk),
many other resources do not, e.g. config variables and module definitions.

For static file modules, one can, for example, revert one or more files in a module
to a previous version without affecting the max timestamp.

Wiki modules include pages only if they exist. The user module supports common.js
and skin.js. By default neither exists. If a user has both, and then the
less-recently modified one is deleted, the max-timestamp remains unchanged.

For client-side caching, batch requests use "Math.max" on the relevant timestamps.
Again, if a module changes but another module is more recent (e.g. out-of-order
deployment, or out-of-order discovery), the change would not result in a cache miss.

More scenarios can be found in the associated Phabricator tasks.
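
The batch-request scenario can be illustrated with a small sketch. Both helper
functions below are hypothetical and not part of ResourceLoader; they only contrast
"max of timestamps" with "hash over all versions" for a two-module batch.

```php
<?php
// Hypothetical illustration only; these helpers are not ResourceLoader API.

/** Old approach: a batch is versioned by its most recent timestamp. */
function combinedTimestamp( array $mtimes ) {
	return max( $mtimes );
}

/** New approach: a batch is versioned by a hash over every member's version. */
function combinedHash( array $versions ) {
	return sha1( json_encode( $versions ) );
}

// Module "a" changes from t=100 to t=150, but module "b" is already at t=200.
$before = array( 'a' => 100, 'b' => 200 );
$after = array( 'a' => 150, 'b' => 200 );

// bool(true): the max timestamp is 200 both times, so the change goes unnoticed.
var_dump( combinedTimestamp( $before ) === combinedTimestamp( $after ) );
// bool(false): the combined hash differs, so the change causes a cache miss.
var_dump( combinedHash( $before ) === combinedHash( $after ) );
```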

== Version hash ==

Previously we virtually mapped these variables to a timestamp by storing the current
time alongside a hash of the value in ObjectCache. Considering the number of
possible request contexts (wikis * modules * users * skins * languages) this doesn't
work well. It results in needless cache invalidation when the first time observation
is purged due to LRU algorithms. It also has other minor bugs leading to fewer
cache hits.

All modules automatically get the benefits of version hashing with this change.
The old getDefinitionMtime() and getHashMtime() have been replaced with dummies
that return 1. These functions are often called from getModifiedTime() in subclasses.

For backward-compatibility, their respective values (definition summary and hash)
are now included in getVersionHash directly.

As examples, the following modules have been updated to use getVersionHash directly.
Other modules still work fine and can be updated later.

* ResourceLoaderFileModule
* ResourceLoaderEditToolbarModule
* ResourceLoaderStartUpModule
* ResourceLoaderWikiModule

The presence of hashes in place of timestamps increases the startup module size on
a default MediaWiki install from 4.4k to 5.8k (after gzip and minification).

== ETag ==

Since timestamps are no longer tracked, we need a different way to implement caching
for cache proxies (e.g. Varnish) and web browsers. Previously we used the
Last-Modified header (in combination with Cache-Control and Expires).

Instead of Last-Modified (and If-Modified-Since), we use ETag (and If-None-Match).

Entity tags (new in HTTP/1.1) are much stricter than Last-Modified by default.
They instruct browsers to allow usage of partial Range requests. Since our responses
are dynamically generated, we need to use the Weak version of ETag.

While this sounds bad, it's no different than Last-Modified. As reassured by
RFC 2616 <http://www.w3.org/Protocols/rfc2616/rfc2616-sec13.html#sec13.3.3> the
specified behaviour behind Last-Modified follows the same "Weak" caching logic as
Entity tags. It's just that entity tags are capable of a stricter mode (whereas
Last-Modified is inherently weak).
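
As a rough sketch of the ETag / If-None-Match exchange described above: the two
helpers below are hypothetical (their names and signatures are assumptions, not the
actual ResourceLoader methods). They show how a version hash could be wrapped as a
weak entity tag and compared against the client's request header.

```php
<?php
// Hypothetical helpers; names are illustrative, not ResourceLoader API.

/** Wrap a version hash as a weak entity tag, e.g. W/"abc12345". */
function makeWeakETag( $versionHash ) {
	return 'W/"' . $versionHash . '"';
}

/**
 * Whether the client's If-None-Match header lists the expected ETag,
 * in which case a "304 Not Modified" response would be appropriate.
 */
function clientHasCurrentVersion( $ifNoneMatch, $expectedETag ) {
	if ( $ifNoneMatch === null || $ifNoneMatch === '' ) {
		return false;
	}
	// The header may carry several comma-separated entity tags.
	$clientTags = preg_split( '/\s*,\s*/', trim( $ifNoneMatch ) );
	return in_array( $expectedETag, $clientTags, true );
}

$expected = makeWeakETag( 'abc12345' );
var_dump( clientHasCurrentVersion( 'W/"old00000", W/"abc12345"', $expected ) );
var_dump( clientHasCurrentVersion( 'W/"old00000"', $expected ) );
```

Splitting on commas is safe here under the assumption that the tag body is a base64
hash prefix, which cannot itself contain a comma or a double quote.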

== File cache ==

If $wgUseFileCache is enabled, ResourceLoader uses ResourceFileCache to cache
load.php responses. While the blind TTL handling (during the allowed expiry period)
is still maxage/timestamp based, tryRespondNotModified() now requires the caller to
know the expected ETag.

For this to work, the FileCache handling had to be moved from the top of
ResourceLoader::respond() to after the expected ETag is computed.

This also allows us to remove the duplicate tryRespondNotModified() handling, since
that is already handled by ResourceLoader::respond().

== Misc ==

* Remove redundant modifiedTime cache in ResourceLoaderFileModule.

* Change bugzilla references to Phabricator.

* Centralised inclusion of wgCacheEpoch using getDefinitionSummary. Previously this
  logic was duplicated in each place the modified timestamp was used.

* It's easy to forget calling the parent class in getDefinitionSummary().
  Previously this method only tracked 'class' by default. As such, various
  extensions hardcoded that one value instead of calling the parent and extending
  the array. To better prevent this in the future, getVersionHash() now asserts
  that the '_cacheEpoch' property made it through.

* tests: Don't use getDefinitionSummary() as an API.
  Fix ResourceLoaderWikiModuleTest to call getPages properly.

* In tests, the default timestamp used to be 1388534400000 (which is the unix time
  of 20140101000000; the unit tests' CacheEpoch). The new version hash of these
  modules is "XyCC+PSK", which is the base64 encoded prefix of the SHA1 digest of:
  '{"_class":"ResourceLoaderTestModule","_cacheEpoch":"20140101000000"}'

* Add sha1.js library for client-side hash generation.
  Compared various implementations for code size (after minification/gzip)
  and speed (when used for short hexadecimal strings).
  https://jsperf.com/sha1-implementations
  - CryptoJS <https://code.google.com/p/crypto-js/#SHA-1> (min+gzip: 2.5k)
    http://crypto-js.googlecode.com/svn/tags/3.1.2/build/rollups/sha1.js
    Chrome: 45k, Firefox: 89k, Safari: 92k
  - jsSHA <https://github.com/Caligatio/jsSHA>
    https://github.com/Caligatio/jsSHA/blob/3c1d4f2e/src/sha1.js (min+gzip: 1.8k)
    Chrome: 65k, Firefox: 53k, Safari: 69k
  - phpjs-sha1 <https://github.com/kvz/phpjs> (RL min+gzip: 0.8k)
    https://github.com/kvz/phpjs/blob/1eaab15d/functions/strings/sha1.js
    Chrome: 200k, Firefox: 280k, Safari: 78k

  Modern browsers implement the HTML5 Crypto API. However, this API is asynchronous,
  only enabled when on HTTPS in Chromium, and is quite low-level. It requires boilerplate
  code to actually use with TextEncoder, ArrayBuffer and Uint32Array. Due to this being
  needed in the module loader, we'd have to load the fallback regardless. Considering
  this is not used in a critical path for performance, it's not worth shipping two
  implementations for this optimisation.
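
The hash format mentioned in the tests item above ("base64 encoded prefix of the SHA1
digest") can be sketched as follows. The function name is hypothetical, and the exact
body of ResourceLoader::makeHash is assumed rather than quoted. The 8-character length
is taken from the strlen() check in getModuleRegistrations().

```php
<?php
// Sketch only: sketchMakeHash() is a hypothetical stand-in for
// ResourceLoader::makeHash, whose exact body is assumed here.
function sketchMakeHash( $value ) {
	// sha1( ..., true ) returns the raw 20-byte digest instead of 40 hex chars.
	// Base64 packs 6 bits per character, so 8 characters cover 48 bits.
	return substr( base64_encode( sha1( $value, true ) ), 0, 8 );
}

// The definition summary quoted in the tests item above.
$summary = '{"_class":"ResourceLoaderTestModule","_cacheEpoch":"20140101000000"}';
echo sketchMakeHash( $summary ), "\n";
```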

May also resolve:
* T44094
* T90411
* T94810

Bug: T94074
Change-Id: Ibb292d2416839327d1807a66c78fd96dac0637d0
2015-05-19 22:28:17 +00:00

<?php
/**
 * Module for resource loader initialization.
 *
 * This program is free software; you can redistribute it and/or modify
 * it under the terms of the GNU General Public License as published by
 * the Free Software Foundation; either version 2 of the License, or
 * (at your option) any later version.
 *
 * This program is distributed in the hope that it will be useful,
 * but WITHOUT ANY WARRANTY; without even the implied warranty of
 * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
 * GNU General Public License for more details.
 *
 * You should have received a copy of the GNU General Public License along
 * with this program; if not, write to the Free Software Foundation, Inc.,
 * 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301, USA.
 * http://www.gnu.org/copyleft/gpl.html
 *
 * @file
 * @author Trevor Parscal
 * @author Roan Kattouw
 */
class ResourceLoaderStartUpModule extends ResourceLoaderModule {
	// Cache for getConfigSettings() as it's called by multiple methods
	protected $configVars = array();
	protected $targets = array( 'desktop', 'mobile' );
	/**
	 * @param ResourceLoaderContext $context
	 * @return array
	 */
	protected function getConfigSettings( $context ) {
		$hash = $context->getHash();
		if ( isset( $this->configVars[$hash] ) ) {
			return $this->configVars[$hash];
		}
		global $wgContLang;
		$mainPage = Title::newMainPage();
		/**
		 * Namespace related preparation
		 * - wgNamespaceIds: Key-value pairs of all localized, canonical and aliases for namespaces.
		 * - wgCaseSensitiveNamespaces: Array of namespaces that are case-sensitive.
		 */
		$namespaceIds = $wgContLang->getNamespaceIds();
		$caseSensitiveNamespaces = array();
		foreach ( MWNamespace::getCanonicalNamespaces() as $index => $name ) {
			$namespaceIds[$wgContLang->lc( $name )] = $index;
			if ( !MWNamespace::isCapitalized( $index ) ) {
				$caseSensitiveNamespaces[] = $index;
			}
		}
		$conf = $this->getConfig();
		// Build list of variables
		$vars = array(
			'wgLoadScript' => wfScript( 'load' ),
			'debug' => $context->getDebug(),
			'skin' => $context->getSkin(),
			'stylepath' => $conf->get( 'StylePath' ),
			'wgUrlProtocols' => wfUrlProtocols(),
			'wgArticlePath' => $conf->get( 'ArticlePath' ),
			'wgScriptPath' => $conf->get( 'ScriptPath' ),
			'wgScriptExtension' => $conf->get( 'ScriptExtension' ),
			'wgScript' => wfScript(),
			'wgSearchType' => $conf->get( 'SearchType' ),
			'wgVariantArticlePath' => $conf->get( 'VariantArticlePath' ),
			// Force object to avoid "empty" associative array from
			// becoming [] instead of {} in JS (bug 34604)
			'wgActionPaths' => (object)$conf->get( 'ActionPaths' ),
			'wgServer' => $conf->get( 'Server' ),
			'wgServerName' => $conf->get( 'ServerName' ),
			'wgUserLanguage' => $context->getLanguage(),
			'wgContentLanguage' => $wgContLang->getCode(),
			'wgVersion' => $conf->get( 'Version' ),
			'wgEnableAPI' => $conf->get( 'EnableAPI' ),
			'wgEnableWriteAPI' => $conf->get( 'EnableWriteAPI' ),
			'wgMainPageTitle' => $mainPage->getPrefixedText(),
			'wgFormattedNamespaces' => $wgContLang->getFormattedNamespaces(),
			'wgNamespaceIds' => $namespaceIds,
			'wgContentNamespaces' => MWNamespace::getContentNamespaces(),
			'wgSiteName' => $conf->get( 'Sitename' ),
			'wgDBname' => $conf->get( 'DBname' ),
			'wgAvailableSkins' => Skin::getSkinNames(),
			'wgExtensionAssetsPath' => $conf->get( 'ExtensionAssetsPath' ),
			// MediaWiki sets cookies to have this prefix by default
			'wgCookiePrefix' => $conf->get( 'CookiePrefix' ),
			'wgCookieDomain' => $conf->get( 'CookieDomain' ),
			'wgCookiePath' => $conf->get( 'CookiePath' ),
			'wgCookieExpiration' => $conf->get( 'CookieExpiration' ),
			'wgResourceLoaderMaxQueryLength' => $conf->get( 'ResourceLoaderMaxQueryLength' ),
			'wgCaseSensitiveNamespaces' => $caseSensitiveNamespaces,
			'wgLegalTitleChars' => Title::convertByteClassToUnicodeClass( Title::legalChars() ),
			'wgResourceLoaderStorageVersion' => $conf->get( 'ResourceLoaderStorageVersion' ),
			'wgResourceLoaderStorageEnabled' => $conf->get( 'ResourceLoaderStorageEnabled' ),
		);
		Hooks::run( 'ResourceLoaderGetConfigVars', array( &$vars ) );
		$this->configVars[$hash] = $vars;
		return $this->configVars[$hash];
	}
	/**
	 * Recursively get all explicit and implicit dependencies for the given module.
	 *
	 * @param array $registryData
	 * @param string $moduleName
	 * @return array
	 */
	protected static function getImplicitDependencies( array $registryData, $moduleName ) {
		static $dependencyCache = array();
		// The list of implicit dependencies won't be altered, so we can
		// cache them without having to worry.
		if ( !isset( $dependencyCache[$moduleName] ) ) {
			if ( !isset( $registryData[$moduleName] ) ) {
				// Dependencies may not exist
				$dependencyCache[$moduleName] = array();
			} else {
				$data = $registryData[$moduleName];
				$dependencyCache[$moduleName] = $data['dependencies'];
				foreach ( $data['dependencies'] as $dependency ) {
					// Recursively get the dependencies of the dependencies
					$dependencyCache[$moduleName] = array_merge(
						$dependencyCache[$moduleName],
						self::getImplicitDependencies( $registryData, $dependency )
					);
				}
			}
		}
		return $dependencyCache[$moduleName];
	}
	/**
	 * Optimize the dependency tree in $this->modules.
	 *
	 * The optimization basically works like this:
	 * Given we have module A with the dependencies B and C
	 * and module B with the dependency C.
	 * Now we don't have to tell the client to explicitly fetch module
	 * C as that's already included in module B.
	 *
	 * This way we can reasonably reduce the amount of module registration
	 * data sent to the client.
	 *
	 * @param array &$registryData Modules keyed by name with properties:
	 * - string 'version'
	 * - array 'dependencies'
	 * - string|null 'group'
	 * - string 'source'
	 * - string|false 'loader'
	 */
	public static function compileUnresolvedDependencies( array &$registryData ) {
		foreach ( $registryData as $name => &$data ) {
			if ( $data['loader'] !== false ) {
				continue;
			}
			$dependencies = $data['dependencies'];
			foreach ( $data['dependencies'] as $dependency ) {
				$implicitDependencies = self::getImplicitDependencies( $registryData, $dependency );
				$dependencies = array_diff( $dependencies, $implicitDependencies );
			}
			// Rebuild keys
			$data['dependencies'] = array_values( $dependencies );
		}
	}
	/**
	 * Get registration code for all modules.
	 *
	 * @param ResourceLoaderContext $context
	 * @return string JavaScript code for registering all modules with the client loader
	 */
	public function getModuleRegistrations( ResourceLoaderContext $context ) {
		$resourceLoader = $context->getResourceLoader();
		$target = $context->getRequest()->getVal( 'target', 'desktop' );
		$out = '';
		$registryData = array();
		// Get registry data
		foreach ( $resourceLoader->getModuleNames() as $name ) {
			$module = $resourceLoader->getModule( $name );
			$moduleTargets = $module->getTargets();
			if ( !in_array( $target, $moduleTargets ) ) {
				continue;
			}
			if ( $module->isRaw() ) {
				// Don't register "raw" modules (like 'jquery' and 'mediawiki') client-side because
				// depending on them is illegal anyway and would only lead to them being reloaded
				// causing any state to be lost (like jQuery plugins, mw.config etc.)
				continue;
			}
			$versionHash = $module->getVersionHash( $context );
			if ( strlen( $versionHash ) !== 8 ) {
				// Module implementation either broken or deviated from ResourceLoader::makeHash
				// Asserted by tests/phpunit/structure/ResourcesTest.
				$versionHash = ResourceLoader::makeHash( $versionHash );
			}
			$skipFunction = $module->getSkipFunction();
			if ( $skipFunction !== null && !ResourceLoader::inDebugMode() ) {
				$skipFunction = $resourceLoader->filter( 'minify-js',
					$skipFunction,
					// There will potentially be lots of these little strings in the registrations
					// manifest, we don't want to blow up the startup module with
					// "/* cache key: ... */" all over it in non-debug mode.
					/* cacheReport = */ false
				);
			}
			$registryData[$name] = array(
				'version' => $versionHash,
				'dependencies' => $module->getDependencies(),
				'group' => $module->getGroup(),
				'source' => $module->getSource(),
				'loader' => $module->getLoaderScript(),
				'skip' => $skipFunction,
			);
		}
		self::compileUnresolvedDependencies( $registryData );
		// Register sources
		$out .= ResourceLoader::makeLoaderSourcesScript( $resourceLoader->getSources() );
		// Concatenate module loader scripts and figure out the different call
		// signatures for mw.loader.register
		$registrations = array();
		foreach ( $registryData as $name => $data ) {
			if ( $data['loader'] !== false ) {
				$out .= ResourceLoader::makeCustomLoaderScript(
					$name,
					$data['version'],
					$data['dependencies'],
					$data['group'],
					$data['source'],
					$data['loader']
				);
				continue;
			}
			// Call mw.loader.register(name, version, dependencies, group, source, skip)
			$registrations[] = array(
				$name,
				$data['version'],
				$data['dependencies'],
				$data['group'],
				// Swap default (local) for null
				$data['source'] === 'local' ? null : $data['source'],
				$data['skip']
			);
		}
		// Register modules
		$out .= ResourceLoader::makeLoaderRegisterScript( $registrations );
		return $out;
	}
	/**
	 * @return bool
	 */
	public function isRaw() {
		return true;
	}
	/**
	 * Base modules required for the base environment of ResourceLoader
	 *
	 * @return array
	 */
	public static function getStartupModules() {
		return array( 'jquery', 'mediawiki' );
	}
	/**
	 * Get the load URL of the startup modules.
	 *
	 * This is a helper for getScript(), but can also be called standalone, such
	 * as when generating an AppCache manifest.
	 *
	 * @param ResourceLoaderContext $context
	 * @return string
	 */
	public static function getStartupModulesUrl( ResourceLoaderContext $context ) {
		$rl = $context->getResourceLoader();
		$moduleNames = self::getStartupModules();
		$query = array(
			'modules' => ResourceLoader::makePackedModulesString( $moduleNames ),
			'only' => 'scripts',
			'lang' => $context->getLanguage(),
			'skin' => $context->getSkin(),
			'debug' => $context->getDebug() ? 'true' : 'false',
			'version' => $rl->getCombinedVersion( $context, $moduleNames ),
		);
		// Ensure uniform query order
		ksort( $query );
		return wfAppendQuery( wfScript( 'load' ), $query );
	}
	/**
	 * @param ResourceLoaderContext $context
	 * @return string
	 */
	public function getScript( ResourceLoaderContext $context ) {
		global $IP;
		$out = file_get_contents( "$IP/resources/src/startup.js" );
		if ( $context->getOnly() === 'scripts' ) {
			// Startup function
			$configuration = $this->getConfigSettings( $context );
			$registrations = $this->getModuleRegistrations( $context );
			// Fix indentation
			$registrations = str_replace( "\n", "\n\t", trim( $registrations ) );
			$mwMapJsCall = Xml::encodeJsCall(
				'mw.Map',
				array( $this->getConfig()->get( 'LegacyJavaScriptGlobals' ) )
			);
			$mwConfigSetJsCall = Xml::encodeJsCall(
				'mw.config.set',
				array( $configuration ),
				ResourceLoader::inDebugMode()
			);
			$out .= "var startUp = function () {\n" .
				"\tmw.config = new " .
				$mwMapJsCall . "\n" .
				"\t$registrations\n" .
				"\t" . $mwConfigSetJsCall .
				"};\n";
			// Conditional script injection
			$scriptTag = Html::linkedScript( self::getStartupModulesUrl( $context ) );
			$out .= "if ( isCompatible() ) {\n" .
				"\t" . Xml::encodeJsCall( 'document.write', array( $scriptTag ) ) .
				"\n}";
		}
		return $out;
	}
	/**
	 * @return bool
	 */
	public function supportsURLLoading() {
		return false;
	}
	/**
	 * Get the definition summary for this module.
	 *
	 * @param ResourceLoaderContext $context
	 * @return array
	 */
	public function getDefinitionSummary( ResourceLoaderContext $context ) {
		global $IP;
		$summary = parent::getDefinitionSummary( $context );
		$summary[] = array(
			// Detect changes to variables exposed in mw.config (T30899).
			'vars' => $this->getConfigSettings( $context ),
			// Changes how getScript() creates mw.Map for mw.config
			'wgLegacyJavaScriptGlobals' => $this->getConfig()->get( 'LegacyJavaScriptGlobals' ),
			// Detect changes to the module registrations
			'moduleHashes' => $this->getAllModuleHashes( $context ),
			'fileMtimes' => array(
				filemtime( "$IP/resources/src/startup.js" ),
			),
		);
		return $summary;
	}
	/**
	 * Helper method for getDefinitionSummary().
	 *
	 * @param ResourceLoaderContext $context
	 * @return string SHA-1
	 */
	protected function getAllModuleHashes( ResourceLoaderContext $context ) {
		$rl = $context->getResourceLoader();
		// Preload for getCombinedVersion()
		$rl->preloadModuleInfo( $rl->getModuleNames(), $context );
		// ATTENTION: Because of the line below, this is not going to cause infinite recursion.
		// Think carefully before making changes to this code!
		// Pre-populate versionHash with something because the loop over all modules below includes
		// the startup module (this module).
		// See ResourceLoaderModule::getVersionHash() for usage of this cache.
		$this->versionHash[ $context->getHash() ] = null;
		return $rl->getCombinedVersion( $context, $rl->getModuleNames() );
	}
	/**
	 * @return string
	 */
	public function getGroup() {
		return 'startup';
	}
}