wiki.techinc.nl/includes/ResourceLoader/StartUpModule.php

<?php
/**
* This program is free software; you can redistribute it and/or modify
* it under the terms of the GNU General Public License as published by
* the Free Software Foundation; either version 2 of the License, or
* (at your option) any later version.
*
* This program is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU General Public License for more details.
*
* You should have received a copy of the GNU General Public License along
* with this program; if not, write to the Free Software Foundation, Inc.,
* 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301, USA.
* http://www.gnu.org/copyleft/gpl.html
*
* @file
* @author Trevor Parscal
* @author Roan Kattouw
*/
namespace MediaWiki\ResourceLoader;
use DomainException;
use Exception;
use MediaWiki\MainConfigNames;
use Wikimedia\RequestTimeout\TimeoutException;
/**
* Module for ResourceLoader initialization.
*
* See also <https://www.mediawiki.org/wiki/ResourceLoader/Features#Startup_Module>
*
* Because the startup module is only called from ClientHtml, it can
* vary based on extra query parameters, in addition to those
* from Context:
*
* - safemode: Only register modules that have ORIGIN_CORE as their origin.
* This disables ORIGIN_USER modules and mw.loader.store. (T185303, T145498)
* See also: OutputPage::disallowUserJs()
*
* @ingroup ResourceLoader
* @internal
*/
class StartUpModule extends Module {
/**
* Cache version for client-side ResourceLoader module storage.
* Like ResourceLoaderStorageVersion but not configurable.
*/
private const STORAGE_VERSION = '2';
private $groupIds = [
// These reserved numbers MUST start at 0 and not skip any. These are preset
// for forward compatibility so that they can be safely referenced by mediawiki.js,
// even when the code is cached and the order of registrations (and implicit
// group ids) changes between versions of the software.
self::GROUP_USER => 0,
self::GROUP_PRIVATE => 1,
];
/**
* Recursively get all explicit and implicit dependencies for the given module.
*
* @param array $registryData
* @param string $moduleName
* @param string[] $handled Internal parameter for recursion. (Optional)
* @return array
* @throws CircularDependencyError
*/
protected static function getImplicitDependencies(
array $registryData,
string $moduleName,
array $handled = []
): array {
static $dependencyCache = [];
// No modules will be added or changed server-side after this point,
// so we can safely cache parts of the tree for re-use.
if ( !isset( $dependencyCache[$moduleName] ) ) {
if ( !isset( $registryData[$moduleName] ) ) {
// Unknown module names are allowed here; this is only an optimisation.
// Checks for illegal and unknown dependencies happen as PHPUnit structure tests,
// and also client-side at run-time.
$dependencyCache[$moduleName] = [];
return [];
}
$data = $registryData[$moduleName];
$flat = $data['dependencies'];
// Prevent recursion
$handled[] = $moduleName;
foreach ( $data['dependencies'] as $dependency ) {
if ( in_array( $dependency, $handled, true ) ) {
// If we encounter a circular dependency, then stop the optimiser and leave the
// original dependencies array unmodified. Circular dependencies are not
// supported in ResourceLoader. Awareness of them exists here so that we can
// optimise the registry when it isn't broken, and otherwise transport the
// registry unchanged. The client will handle this further.
throw new CircularDependencyError();
}
// Recursively add the dependencies of the dependencies
$flat = array_merge(
$flat,
self::getImplicitDependencies( $registryData, $dependency, $handled )
);
}
$dependencyCache[$moduleName] = $flat;
}
return $dependencyCache[$moduleName];
}
/**
* Optimize the dependency tree in $this->modules.
*
* The optimization basically works like this:
* Given we have module A with the dependencies B and C
* and module B with the dependency C.
* Now we don't have to tell the client to explicitly fetch module
* C as that's already included in module B.
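*
* A sketch of the effect on a minimal registry (the module names are
* illustrative, and the other registry properties are omitted for brevity):
*
* @code
*     $registryData = [
*         'A' => [ 'dependencies' => [ 'B', 'C' ] ],
*         'B' => [ 'dependencies' => [ 'C' ] ],
*         'C' => [ 'dependencies' => [] ],
*     ];
*     StartUpModule::compileUnresolvedDependencies( $registryData );
*     // $registryData['A']['dependencies'] is now [ 'B' ];
*     // the entries for 'B' and 'C' are unchanged.
* @endcode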
*
* This way we can reasonably reduce the amount of module registration
* data sent to the client.
*
* @param array[] &$registryData Modules keyed by name with properties:
* - string 'version'
* - array 'dependencies'
* - string|null 'group'
* - string 'source'
* @phan-param array<string,array{version:string,dependencies:array,group:?string,source:string}> &$registryData
*/
public static function compileUnresolvedDependencies( array &$registryData ): void {
foreach ( $registryData as &$data ) {
$dependencies = $data['dependencies'];
try {
foreach ( $data['dependencies'] as $dependency ) {
$implicitDependencies = self::getImplicitDependencies( $registryData, $dependency );
$dependencies = array_diff( $dependencies, $implicitDependencies );
}
} catch ( CircularDependencyError $err ) {
// Leave unchanged
$dependencies = $data['dependencies'];
}
// Rebuild keys
$data['dependencies'] = array_values( $dependencies );
}
}
/**
* Get registration code for all modules.
*
* @param Context $context
* @return string JavaScript code for registering all modules with the client loader
*/
public function getModuleRegistrations( Context $context ): string {
$resourceLoader = $context->getResourceLoader();
// Future developers: Use WebRequest::getRawVal() instead of getVal().
// The getVal() method performs slow Language+UTF logic. (f303bb9360)
$safemode = $context->getRequest()->getRawVal( 'safemode' ) === '1';
$skin = $context->getSkin();
$moduleNames = $resourceLoader->getModuleNames();
// Preload with a batch so that the below calls to getVersionHash() for each module
// don't require on-demand loading of more information.
try {
$resourceLoader->preloadModuleInfo( $moduleNames, $context );
} catch ( TimeoutException $e ) {
throw $e;
} catch ( Exception $e ) {
// Don't fail the request (T152266)
// Also print the error in the main output
$resourceLoader->outputErrorAndLog( $e,
'Preloading module info from startup failed: {exception}',
[ 'exception' => $e ]
);
}
// Get registry data
$states = [];
$registryData = [];
foreach ( $moduleNames as $name ) {
$module = $resourceLoader->getModule( $name );
$moduleSkins = $module->getSkins();
if (
( $safemode && $module->getOrigin() > Module::ORIGIN_CORE_INDIVIDUAL )
|| ( $moduleSkins !== null && !in_array( $skin, $moduleSkins ) )
) {
continue;
}
if ( $module instanceof StartUpModule ) {
// Don't register 'startup' with the client: loading it lazily or depending
// on it doesn't make sense, because the startup module *is* the client.
// Registering it would waste bandwidth and memory, and risks somehow causing
// it to load a second time.
// ATTENTION: Because of the line below, this is not going to cause infinite recursion.
// Think carefully before making changes to this code!
// The below code is going to call Module::getVersionHash() for every module.
// For StartUpModule (this module) the hash is computed based on the manifest content,
// which is the very thing we are computing right here. As such, this must skip iterating
// over 'startup' itself.
continue;
}
// Optimization: Exclude modules in the `noscript` group. These are only ever used
// directly by HTML without use of JavaScript (T291735).
if ( $module->getGroup() === self::GROUP_NOSCRIPT ) {
continue;
}
try {
// The version should be formatted by ResourceLoader::makeHash and be of
// length ResourceLoader::HASH_LENGTH (or empty string).
// The getVersionHash method is final and is covered by tests, as is makeHash().
$versionHash = $module->getVersionHash( $context );
} catch ( TimeoutException $e ) {
throw $e;
} catch ( Exception $e ) {
// Don't fail the request (T152266)
// Also print the error in the main output
$resourceLoader->outputErrorAndLog( $e,
'Calculating version for "{module}" failed: {exception}',
[
'module' => $name,
'exception' => $e,
]
);
$versionHash = '';
$states[$name] = 'error';
}
$skipFunction = $module->getSkipFunction();
if ( $skipFunction !== null && !$context->getDebug() ) {
$skipFunction = ResourceLoader::filter( 'minify-js', $skipFunction );
}
$registryData[$name] = [
'version' => $versionHash,
'dependencies' => $module->getDependencies( $context ),
'group' => $this->getGroupId( $module->getGroup() ),
'source' => $module->getSource(),
'skip' => $skipFunction,
];
}
self::compileUnresolvedDependencies( $registryData );
// Register sources
$sources = $oldSources = $resourceLoader->getSources();
$this->getHookRunner()->onResourceLoaderModifyStartupSourceUrls( $sources, $context );
if ( array_keys( $sources ) !== array_keys( $oldSources ) ) {
throw new DomainException( 'ResourceLoaderModifyStartupSourceUrls hook must not add or remove sources' );
}
$out = ResourceLoader::makeLoaderSourcesScript( $context, $sources );
// Figure out the different call signatures for mw.loader.register
$registrations = [];
foreach ( $registryData as $name => $data ) {
// Call mw.loader.register(name, version, dependencies, group, source, skip)
$registrations[] = [
$name,
$data['version'],
$data['dependencies'],
$data['group'],
// Swap default (local) for null
$data['source'] === 'local' ? null : $data['source'],
$data['skip']
];
}
// Register modules
$out .= "\n" . ResourceLoader::makeLoaderRegisterScript( $context, $registrations );
if ( $states ) {
$out .= "\n" . ResourceLoader::makeLoaderStateScript( $context, $states );
}
return $out;
}
/**
* Get the numeric ID for a module group, assigning the next unused ID
* the first time a group name is seen.
*
* @param string|null $groupName
* @return int|null Group ID, or null if no group name was given
*/
private function getGroupId( $groupName ): ?int {
if ( $groupName === null ) {
return null;
}
if ( !array_key_exists( $groupName, $this->groupIds ) ) {
$this->groupIds[$groupName] = count( $this->groupIds );
}
return $this->groupIds[$groupName];
}
/**
* Base modules implicitly available to all modules.
*
* @return array
*/
private function getBaseModules(): array {
return [ 'jquery', 'mediawiki.base' ];
}
/**
* Get the localStorage key for the entire module store. The key references
* $wgDBname to prevent clashes between wikis under the same web domain.
*
* @return string localStorage item key for JavaScript
*/
private function getStoreKey(): string {
return 'MediaWikiModuleStore:' . $this->getConfig()->get( MainConfigNames::DBname );
}
/**
* @see $wgResourceLoaderMaxQueryLength
* @return int
*/
private function getMaxQueryLength(): int {
$len = $this->getConfig()->get( MainConfigNames::ResourceLoaderMaxQueryLength );
// - Ignore -1, which in MW 1.34 and earlier was used to mean "unlimited".
// - Ignore invalid values, e.g. non-int or other negative values.
if ( $len === false || $len < 0 ) {
// Default
$len = 2000;
}
return $len;
}
/**
* Get the key on which the JavaScript module cache (mw.loader.store) will vary.
*
* @param Context $context
* @return string String of concatenated vary conditions
*/
private function getStoreVary( Context $context ): string {
return implode( ':', [
$context->getSkin(),
self::STORAGE_VERSION,
$this->getConfig()->get( MainConfigNames::ResourceLoaderStorageVersion ),
$context->getLanguage(),
] );
}
/**
* @param Context $context
* @return string|array JavaScript code
*/
public function getScript( Context $context ) {
global $IP;
$conf = $this->getConfig();
if ( $context->getOnly() !== 'scripts' ) {
return '/* Requires only=scripts */';
}
$enableJsProfiler = $conf->get( MainConfigNames::ResourceLoaderEnableJSProfiler );
$startupCode = file_get_contents( "$IP/resources/src/startup/startup.js" );
$mwLoaderCode = file_get_contents( "$IP/resources/src/startup/mediawiki.js" ) .
file_get_contents( "$IP/resources/src/startup/mediawiki.loader.js" ) .
file_get_contents( "$IP/resources/src/startup/mediawiki.requestIdleCallback.js" );
if ( $enableJsProfiler ) {
$mwLoaderCode .= file_get_contents( "$IP/resources/src/startup/profiler.js" );
}
// Perform replacements for mediawiki.js
$mwLoaderPairs = [
// This should always be an object, even if the base vars are empty
// (such as when using the default lang/skin).
'$VARS.reqBase' => $context->encodeJson( (object)$context->getReqBase() ),
'$VARS.baseModules' => $context->encodeJson( $this->getBaseModules() ),
'$VARS.maxQueryLength' => $context->encodeJson(
// In debug mode (except legacy debug mode), let the client fetch each module in
// its own dedicated request (T85805).
// This is effectively the equivalent of ClientHtml::makeLoad,
// which does this for stylesheets.
( !$context->getDebug() || $context->getDebug() === $context::DEBUG_LEGACY ) ?
$this->getMaxQueryLength() :
0
),
'$VARS.storeEnabled' => $context->encodeJson(
$conf->get( MainConfigNames::ResourceLoaderStorageEnabled )
&& !$context->getDebug()
&& $context->getRequest()->getRawVal( 'safemode' ) !== '1'
),
'$VARS.storeKey' => $context->encodeJson( $this->getStoreKey() ),
'$VARS.storeVary' => $context->encodeJson( $this->getStoreVary( $context ) ),
'$VARS.groupUser' => $context->encodeJson( $this->getGroupId( self::GROUP_USER ) ),
'$VARS.groupPrivate' => $context->encodeJson( $this->getGroupId( self::GROUP_PRIVATE ) ),
'$VARS.sourceMapLinks' => $context->encodeJson(
$conf->get( MainConfigNames::ResourceLoaderEnableSourceMapLinks )
),
// When profiling is enabled, insert the calls.
// When disabled (the default), insert nothing.
'$CODE.profileExecuteStart();' => $enableJsProfiler
? 'mw.loader.profiler.onExecuteStart( module );'
: '',
'$CODE.profileExecuteEnd();' => $enableJsProfiler
? 'mw.loader.profiler.onExecuteEnd( module );'
: '',
'$CODE.profileScriptStart();' => $enableJsProfiler
? 'mw.loader.profiler.onScriptStart( module );'
: '',
'$CODE.profileScriptEnd();' => $enableJsProfiler
? 'mw.loader.profiler.onScriptEnd( module );'
: '',
// Debug stubs
'$CODE.consoleLog();' => $context->getDebug()
? 'console.log.apply( console, arguments );'
: '',
// As a paranoia measure, create a window.QUnit placeholder that shadows any
// DOM global (e.g. for <h2 id="QUnit">), to avoid test code in prod (T356768).
'$CODE.undefineQUnit();' => !$conf->get( MainConfigNames::EnableJavaScriptTest )
? 'window.QUnit = undefined;'
: '',
];
$mwLoaderCode = strtr( $mwLoaderCode, $mwLoaderPairs );
// Perform string replacements for startup.js
$pairs = [
// Raw JavaScript code (not JSON)
'$CODE.registrations();' => trim( $this->getModuleRegistrations( $context ) ),
'$CODE.defineLoader();' => $mwLoaderCode,
];
$startupCode = strtr( $startupCode, $pairs );
return [
'plainScripts' => [
[
'virtualFilePath' => new FilePath(
'resources/src/startup/startup.js',
MW_INSTALL_PATH,
$conf->get( MainConfigNames::ResourceBasePath )
),
'content' => $startupCode,
],
],
];
}
/**
* @return bool
*/
public function supportsURLLoading(): bool {
return false;
}
/**
* @return bool
*/
public function enableModuleContentVersion(): bool {
// Enabling this means that Module::getVersionHash will simply call getScript()
// and hash it to determine the version (as used by the ETag HTTP response header).
return true;
}
}