wiki.techinc.nl/includes/resourceloader/ResourceLoaderWikiModule.php

<?php
/**
* Abstraction for ResourceLoader modules that pull from wiki pages.
*
* This program is free software; you can redistribute it and/or modify
* it under the terms of the GNU General Public License as published by
* the Free Software Foundation; either version 2 of the License, or
* (at your option) any later version.
*
* This program is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU General Public License for more details.
*
* You should have received a copy of the GNU General Public License along
* with this program; if not, write to the Free Software Foundation, Inc.,
* 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301, USA.
* http://www.gnu.org/copyleft/gpl.html
*
* @file
* @author Trevor Parscal
* @author Roan Kattouw
*/
/**
* Abstraction for ResourceLoader modules which pull from wiki pages
*
* This can only be used for wiki pages in the MediaWiki and User namespaces,
* because of its dependence on the functionality of Title::isCssJsSubpage.
*
* This module supports being used as a placeholder for a module on a remote wiki.
* To do so, getDB() must be overridden to return a foreign database object that
* allows local wikis to query page metadata.
*
* The following methods are safe to call on local wikis:
* - Option getters:
* - getGroup()
* - getPosition()
* - getPages()
* - Basic methods that strictly involve the foreign database
* - getDB()
* - isKnownEmpty()
* - getTitleInfo()
*/
class ResourceLoaderWikiModule extends ResourceLoaderModule {
/** @var string Position on the page to load this module at */
protected $position = 'bottom';
// Origin defaults to users with sitewide authority
protected $origin = self::ORIGIN_USER_SITEWIDE;
// In-process cache for title info
protected $titleInfo = [];
// List of page names that contain CSS
protected $styles = [];
// List of page names that contain JavaScript
protected $scripts = [];
// Group of module
protected $group;
/**
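* A minimal usage sketch (the page names and group are illustrative assumptions;
* only the keys handled below are read, any others are ignored):
* @code
*     $module = new ResourceLoaderWikiModule( [
*         'scripts' => [ 'MediaWiki:Example.js' ],
*         'styles' => [ 'MediaWiki:Example.css' ],
*         'group' => 'site',
*         'position' => 'top',
*     ] );
* @endcode
*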
* @param array $options For back-compat, this can be omitted in favour of overriding getPages().
*/
public function __construct( array $options = null ) {
if ( is_null( $options ) ) {
return;
}
foreach ( $options as $member => $option ) {
switch ( $member ) {
case 'position':
case 'styles':
case 'scripts':
case 'group':
case 'targets':
$this->{$member} = $option;
break;
}
}
}
/**
* Subclasses should return an associative array of resources in the module.
* Keys should be the title of a page in the MediaWiki or User namespace.
*
* Values should be a nested array of options. The supported keys are 'type' and
* (CSS only) 'media'.
*
* For scripts, 'type' should be 'script'.
*
* For stylesheets, 'type' should be 'style'.
* There is an optional media key, the value of which can be the
* medium ('screen', 'print', etc.) of the stylesheet.
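*
* A hypothetical return value, as a sketch (the page names are illustrative
* assumptions, not pages this class requires):
* @code
*     return [
*         'MediaWiki:Example.js' => [ 'type' => 'script' ],
*         'MediaWiki:Example.css' => [ 'type' => 'style' ],
*         'MediaWiki:ExamplePrint.css' => [ 'type' => 'style', 'media' => 'print' ],
*     ];
* @endcode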
*
* @param ResourceLoaderContext $context
* @return array
*/
protected function getPages( ResourceLoaderContext $context ) {
$config = $this->getConfig();
$pages = [];
// Filter out pages from origins not allowed by the current wiki configuration.
if ( $config->get( 'UseSiteJs' ) ) {
foreach ( $this->scripts as $script ) {
$pages[$script] = [ 'type' => 'script' ];
}
}
if ( $config->get( 'UseSiteCss' ) ) {
foreach ( $this->styles as $style ) {
$pages[$style] = [ 'type' => 'style' ];
}
}
return $pages;
}
/**
* Get group name
*
* @return string
*/
public function getGroup() {
return $this->group;
}
/**
* Get the Database object used in getTitleInfo().
*
* Defaults to the local replica DB. Subclasses may want to override this to return a foreign
* database object, or null if getTitleInfo() shouldn't access the database.
*
* NOTE: This ONLY works for getTitleInfo() and isKnownEmpty(), NOT FOR ANYTHING ELSE.
* In particular, it doesn't work for getContent() or getScript() etc.
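*
* A minimal sketch of such an override, assuming a hypothetical subclass name
* and a hypothetical foreign wiki ID 'examplewiki':
* @code
*     class ExampleForeignWikiModule extends ResourceLoaderWikiModule {
*         protected function getDB() {
*             // The third argument of wfGetDB() selects the wiki to connect to.
*             return wfGetDB( DB_REPLICA, [], 'examplewiki' );
*         }
*     }
* @endcode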
*
* @return IDatabase|null
*/
protected function getDB() {
return wfGetDB( DB_REPLICA );
}
/**
* @param string $titleText
* @return null|string
*/
protected function getContent( $titleText ) {
$title = Title::newFromText( $titleText );
if ( !$title ) {
return null;
}
$handler = ContentHandler::getForTitle( $title );
if ( $handler->isSupportedFormat( CONTENT_FORMAT_CSS ) ) {
$format = CONTENT_FORMAT_CSS;
} elseif ( $handler->isSupportedFormat( CONTENT_FORMAT_JAVASCRIPT ) ) {
$format = CONTENT_FORMAT_JAVASCRIPT;
} else {
return null;
}
$revision = Revision::newFromTitle( $title, false, Revision::READ_NORMAL );
if ( !$revision ) {
return null;
}
$content = $revision->getContent( Revision::RAW );
if ( !$content ) {
wfDebugLog( 'resourceloader', __METHOD__ . ': failed to load content of JS/CSS page!' );
return null;
}
return $content->serialize( $format );
}
/**
* @param ResourceLoaderContext $context
* @return string
*/
public function getScript( ResourceLoaderContext $context ) {
$scripts = '';
foreach ( $this->getPages( $context ) as $titleText => $options ) {
if ( $options['type'] !== 'script' ) {
continue;
}
$script = $this->getContent( $titleText );
if ( strval( $script ) !== '' ) {
$script = $this->validateScriptFile( $titleText, $script );
$scripts .= ResourceLoader::makeComment( $titleText ) . $script . "\n";
}
}
return $scripts;
}
/**
* @param ResourceLoaderContext $context
* @return array
*/
public function getStyles( ResourceLoaderContext $context ) {
$styles = [];
foreach ( $this->getPages( $context ) as $titleText => $options ) {
if ( $options['type'] !== 'style' ) {
continue;
}
$media = isset( $options['media'] ) ? $options['media'] : 'all';
$style = $this->getContent( $titleText );
if ( strval( $style ) === '' ) {
continue;
}
if ( $this->getFlip( $context ) ) {
$style = CSSJanus::transform( $style, true, false );
}
$style = MemoizedCallable::call( 'CSSMin::remap',
[ $style, false, $this->getConfig()->get( 'ScriptPath' ), true ] );
if ( !isset( $styles[$media] ) ) {
$styles[$media] = [];
}
$style = ResourceLoader::makeComment( $titleText ) . $style;
$styles[$media][] = $style;
}
return $styles;
}
/**
* Disable module content versioning.
*
* This class does not support generating content outside of a module
* request due to foreign database support.
*
* See getDefinitionSummary() for meta-data versioning.
*
* @return bool
*/
public function enableModuleContentVersion() {
return false;
}
/**
* @param ResourceLoaderContext $context
* @return array
*/
public function getDefinitionSummary( ResourceLoaderContext $context ) {
2015-04-29 22:53:24 +00:00
$summary = parent::getDefinitionSummary( $context );
$summary[] = [
'pages' => $this->getPages( $context ),
// Includes meta data of current revisions
'titleInfo' => $this->getTitleInfo( $context ),
];
return $summary;
}
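/**
* Whether this module is known to have no content for the given context.
*
* User-group modules count as empty unless at least one of their pages is
* non-empty. Other modules count as empty only when none of their pages exist.
*
* @param ResourceLoaderContext $context
* @return bool
*/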
public function isKnownEmpty( ResourceLoaderContext $context ) {
$revisions = $this->getTitleInfo( $context );
// For user modules, don't needlessly load if there are no non-empty pages
if ( $this->getGroup() === 'user' ) {
foreach ( $revisions as $revision ) {
if ( $revision['page_len'] > 0 ) {
// At least one non-empty page, module should be loaded
return false;
}
}
return true;
}
// Bug 68488: For other modules (i.e. ones that are called in cached html output) only check
// page existence. This ensures that, if some pages in a module are temporarily blanked,
// we don't end up omitting the module's script or link tag on some pages.
return count( $revisions ) === 0;
}
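/**
* Store title info in the in-process cache.
*
* @param string $key Sorted page names joined by '|', as built in getTitleInfo()
* @param array $titleInfo Title info keyed by page name
*/
private function setTitleInfo( $key, array $titleInfo ) {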
$this->titleInfo[$key] = $titleInfo;
}
/**
* Get the information about the wiki pages for a given context.
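*
* The result is cached in-process under a key derived from the page list alone,
* not from the whole context. A minimal sketch of that key derivation, using
* hypothetical page names (illustration only, not part of the implementation):
* @code
* $pages = [
*     'MediaWiki:Common.js' => [ 'type' => 'script' ],
*     'MediaWiki:Common.css' => [ 'type' => 'style' ],
* ];
* $pageNames = array_keys( $pages );
* sort( $pageNames );
* $key = implode( '|', $pageNames );
* // $key === 'MediaWiki:Common.css|MediaWiki:Common.js', whatever the input order
* @endcode
*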
* @param ResourceLoaderContext $context
* @return array Keyed by page name
*/
protected function getTitleInfo( ResourceLoaderContext $context ) {
$dbr = $this->getDB();
if ( !$dbr ) {
// We're dealing with a subclass that doesn't have a DB
return [];
}
$pageNames = array_keys( $this->getPages( $context ) );
sort( $pageNames );
$key = implode( '|', $pageNames );
if ( !isset( $this->titleInfo[$key] ) ) {
$this->titleInfo[$key] = self::fetchTitleInfo( $dbr, $pageNames, __METHOD__ );
}
return $this->titleInfo[$key];
}
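/**
* Query page metadata for the given page names in one batch.
*
* @param IDatabase $db
* @param string[] $pages Page names (prefixed text)
* @param string $fname Name of the calling method, passed on to the query
* @return array Map of prefixed page name to page_len, page_latest and page_touched
*/
private static function fetchTitleInfo( IDatabase $db, array $pages, $fname = __METHOD__ ) {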
$titleInfo = [];
$batch = new LinkBatch;
foreach ( $pages as $titleText ) {
$batch->addObj( Title::newFromText( $titleText ) );
}
if ( !$batch->isEmpty() ) {
$res = $db->select( 'page',
// Include page_touched to allow purging if cache is poisoned (T117587, T113916)
[ 'page_namespace', 'page_title', 'page_touched', 'page_len', 'page_latest' ],
$batch->constructSet( 'page', $db ),
$fname
);
foreach ( $res as $row ) {
// Avoid including ids or timestamps of revision/page tables so
// that versions are not wasted
$title = Title::makeTitle( $row->page_namespace, $row->page_title );
$titleInfo[$title->getPrefixedText()] = [
'page_len' => $row->page_len,
'page_latest' => $row->page_latest,
'page_touched' => $row->page_touched,
];
}
}
return $titleInfo;
}
/**
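* Batch-load the title info of several wiki modules with a single query and
* store each module's share in its in-process cache.
*
* A rough sketch of how the combined result is split per module, with
* hypothetical page names and values (illustration only):
* @code
* $pagesA = [ 'MediaWiki:A.js' => [ 'type' => 'script' ] ];
* $pagesB = [
*     'MediaWiki:A.js' => [ 'type' => 'script' ],
*     'MediaWiki:B.css' => [ 'type' => 'style' ],
* ];
* $allPages = [];
* $allPages += $pagesA;
* $allPages += $pagesB; // union by key: 'MediaWiki:A.js' is listed once
* // Pretend the batch query found only one existing page
* $allInfo = [ 'MediaWiki:A.js' => [ 'page_len' => 12 ] ];
* $infoB = array_intersect_key( $allInfo, $pagesB );
* // $infoB holds only the 'MediaWiki:A.js' entry; the missing page has none
* @endcode
*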
* @since 1.28
* @param ResourceLoaderContext $context
* @param IDatabase $db
* @param string[] $moduleNames
* @return array Title info of all preloaded pages, keyed by page name
*/
public static function preloadTitleInfo(
ResourceLoaderContext $context, IDatabase $db, array $moduleNames
) {
$rl = $context->getResourceLoader();
// getDB() can be overridden to point to a foreign database.
// For now, only preload local. In the future, we could preload by wikiID.
$allPages = [];
$wikiModules = [];
foreach ( $moduleNames as $name ) {
$module = $rl->getModule( $name );
if ( $module instanceof self ) {
$mDB = $module->getDB();
// Subclasses may disable getDB and implement getTitleInfo differently
if ( $mDB && $mDB->getWikiID() === $db->getWikiID() ) {
$wikiModules[] = $module;
$allPages += $module->getPages( $context );
}
}
}
$allInfo = self::fetchTitleInfo( $db, array_keys( $allPages ), __METHOD__ );
foreach ( $wikiModules as $module ) {
$pages = $module->getPages( $context );
$info = array_intersect_key( $allInfo, $pages );
$pageNames = array_keys( $pages );
sort( $pageNames );
$key = implode( '|', $pageNames );
$module->setTitleInfo( $key, $info );
}
return $allInfo;
}
/**
* @return string
*/
public function getPosition() {
return $this->position;
}
/**
* @since 1.28
* @return string
*/
public function getType() {
// Check both because subclasses don't always pass pages via the constructor,
// they may also override getPages() instead, in which case we should keep
// defaulting to LOAD_GENERAL and allow them to override getType() separately.
return ( $this->styles && !$this->scripts ) ? self::LOAD_STYLES : self::LOAD_GENERAL;
}
}