CacheEpoch isn't meant for this (it's meant for page cache), I can't
imagine a scenario in which we'd want to bump that to invalidate
MessageBlobStore. It should be standalone and can be easily cleared
if needed by truncating the relevant table (it's automatically
repopulated).
This also removes wfTimestamp/DateTime overhead.
Change-Id: Iab06edbf71f20f3430207a80df90131c79dc03a7
These callers don't need to do purges, but can still benefit from
using this instance instead of a plain BagOStuff. Namely:
* Replication and snapshot lag awareness
* Preemptive regeneration
* Easy process cache support
The idea is for there to only be one caching class/factory
to use, instead of having rules for picking which one to use.
Change-Id: I8e362df451c0c28731fc853c044c4c4b8e097f01
LocalisationCache was added in 1.16, so in order to retain PHP 5.1
compatibility, array_fill_keys() was avoided. In 1.17, support for
PHP 5.1 was dropped, so we can use that function now.
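For illustration, the old workaround and its replacement look roughly like
this (a sketch; the actual variable names in LocalisationCache may differ):

  // PHP 5.1: build the key => value map by hand.
  $preload = array();
  foreach ( $keys as $key ) {
      $preload[$key] = true;
  }

  // PHP 5.2+: array_fill_keys() does exactly this.
  $preload = array_fill_keys( $keys, true );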
Change-Id: I435705639f1a470324a4ba46153351aadc0d40e2
* If replace() was called recently, then we know that this is the
master data center and that the messages are up-to-date. With this
change, replace() calls have nothing to contend with aside from other
replace() calls. Even if there were timeouts due to such contention,
caused by high MediaWiki-namespace page edit rates, the replace() updates would
pick up the prior changes in passing, since they call load().
* This also avoids the following scenario:
a) Someone edits a message page, triggering replace()
b) A page view causes load() to trigger loadFromDB(),
because the hash is seen as volatile due to the replace()
c) The loadFromDB() loads stale slave data and undoes
the message key update the replace() did
Change-Id: I9cdf7ad3d67f168fcba7f633af9e32a8d1fa928e
* Made the status key only act as a backoff key inside
loadFromDBWithLock(), instead of also being a lock key. The
getReentrantScopedLock() call is now non-blocking, which has
a similar effect to the add() call on the status key but much
better prioritizes replace(); that method already has a blocking
getReentrantScopedLock() call, so once it gets that lock, there
is no worry of load() failing due to contention (see the sketch
after this list).
* Bail out in loadFromDBWithLock() if the scoped lock was not acquired.
Previously, the status lock usually assured that the lock could be
obtained, unless a competing replace() came in; it would cause a bail
if not acquired. This carries over that behavior but also avoids
clobbering updates when replace() contention happens.
* Avoid calling saveToCaches() in replace() if the lock was not
acquired to avoid clobbering conflicting writes.
* Made the loadFromDB() call in replace() use DB_MASTER to avoid
seeing stale data if the cache is volatile.
* Lowered WAIT_SEC, as the default PHP timeout is 30.
We want this to be able to at least finish the other calls
in replace() even if the lock times out.
* Avoid checking the status key an extra time in load().
* Made a few small code and doc cleanups.
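A rough sketch of the intended loadFromDBWithLock() flow (hypothetical and
simplified; the real method names exist but their exact signatures and
return values may differ):

  // Back off if a recent rebuild failed, instead of piling on.
  if ( $this->mMemc->get( $statusKey ) === 'error' ) {
      return false;
  }
  // Non-blocking attempt; replace() uses a blocking call, so it wins any race.
  $scopedLock = $this->getReentrantScopedLock( $cacheKey, 0 );
  if ( !$scopedLock ) {
      return false; // bail rather than clobber a concurrent replace()
  }
  $cache = $this->loadFromDB( $code );
  return $this->saveToCaches( $cache, 'all', $code );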
Change-Id: Ibaf1f67618ec374c83c3135a71e90223dd2b1856
* The key salting of the individual keys via the WAN cache makes
it OK to use slaves here. The salting happens via the delete()
in the MessageCache::replace() call, which is hit when the
message page is edited.
Bug: T92357
Change-Id: Ic191f6c3fc49c4c58461d26468dd0fa94a52051e
* These keys are used for larger messages not in the main blob,
and they still use explicit purges as the pages change
Change-Id: I254ddcc9ffe7c49654788d6aabcca2ab7ed4f03f
* The localCache member will use CACHE_NONE if this is off,
so the set()/get() calls are already no-oped
* Also allow using the local cache if $hash is false
in "stale mode", which more gracefully handles the case
where the hash key fell out of memory for some reason
Change-Id: Ie12efcd4088a6dc4a4cdd2fd06646f2881df53d7
* This avoids constant churn when $wgUseLocalMessageCache is set to false
* Follow up to db464b8a84
Bug: T109183
Change-Id: I8da324c53527da32d09964be6c3a92176af4ee7b
Follow-up for I020617d, where I got this wrong. It is probable that multiple
application servers will try to build a local cache at the same time, in which
case they will cause the cache key to thrash. It is necessary that two
application servers building a local cache from the database generate the same
hash for a given set of messages.
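In other words, the hash has to be a pure function of the cache contents,
e.g. something along these lines (a sketch, not the exact code):

  // Deterministic: every app server building the local cache from the same
  // set of messages computes the same hash, so the shared key stops thrashing.
  $hash = md5( serialize( $cache ) );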
Change-Id: Ieeefc2094a83be9401c466bec859c1588ddfbcdf
* Use hash key volatility to propagate invalidations over DCs
in addition to memcached->APC instances.
* Make use of stale APC cache due to hash mismatch instead
of waiting around in some cases. This is similar to the
TTL expiry and volatility cases.
* Renamed $hashExpired and added some comments for clarity.
Change-Id: I8890fb174cae72c4ce8872df64f52f5cbd55183b
In addition to eliminating disk IO in a hot path, using APC spares us from
having to serialize and unserialize cache arrays. Since we're not serializing,
though, we don't have a string representation to hash, so use a random string
instead. (The code already treats the association of hash string to cache as
purely symbolic, so this is not problematic.)
Whereas the hash was previously stored as the first 32 bytes of each cache
file, we now store it as an array key instead (like VERSION and EXPIRY were
already). Because this changes the structure of cached data, we have to bump
MSG_CACHE_VERSION.
While we're here, make MessageCache::getLocalCache() and
MessageCache::saveToLocal() protected, and make their signatures more
consistent with other methods in this class. While they were (implicitly)
public before, there are absolutely no external callers in Core or
extensions[0][1], so we can skip the standard deprecation process.
[0]: https://github.com/search?q=%40wikimedia+getlocalcache&type=Code&utf8=%E2%9C%93
[1]: https://github.com/search?utf8=%E2%9C%93&q=%40wikimedia+savetolocal&type=Code
Change-Id: I020617d2df2a8f0f243b85f3383dc7b16f15aaad
Especially in long-running maintenance scripts, this can be problematic.
LinkCache is now LRU-based, and will store a maximum of 10,000 good titles,
and 10,000 bad ones.
LinkCache::getGoodLinks() and getBadLinks() are deprecated, since they
are problematic to support in this implementation and are unused.
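A minimal sketch of the bounded-cache idea using MapCacheLRU (whether
LinkCache uses this exact class is an assumption; the limits are the ones
quoted above):

  $goodLinks = new MapCacheLRU( 10000 ); // page id, length, redirect, ...
  $badLinks = new MapCacheLRU( 10000 );  // titles known not to exist

  $key = $title->getPrefixedDBkey();
  $goodLinks->set( $key, $row ); // least recently used entries get evicted
  if ( $goodLinks->has( $key ) ) {
      $row = $goodLinks->get( $key );
  }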
Bug: T106998
Change-Id: I1328149d65a5e75a5d6e10cb2686a099562a1847
Missing global declaration when using $wgContLang in
MessageCache::normalizeKey
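The fix amounts to adding the missing declaration (abbreviated sketch):

  public static function normalizeKey( $key ) {
      global $wgContLang; // was missing, so $wgContLang was undefined here

      $lckey = strtr( $key, ' ', '_' );
      // ...
      $lckey = $wgContLang->lcfirst( $lckey );
      return $lckey;
  }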
Change-Id: Ia1c1f41244fd5629527b99a5f2038f607cff42c4
Follow-up: 47e0f0c3
* Normalize the message name returned by allmessages
* Separate message key normalization into
MessageCache::normalizeKey()
Bug: T63894
Change-Id: I1d89fc73fea705243d390bab91255a635d8f9eee
The variable $row is left set to the last user from the first loop.
It is better to use $name, which is set correctly in the foreach.
This now adds all user pages to the LinkBatch, which avoids some extra
queries on at least Special:ListFiles.
Change-Id: Ied378b1596ec9d38eda41ce5ee413203c65eb21b
strtr() is marginally faster as it runs through the string only
once, and is a better fit for one-for-one character translation.
The strtr() function also supports an associative array as its second
parameter for entire-string replacements. This, too, has the same
performance and predictable behaviour (it starts with the longest key).
str_replace(), by contrast, applies its search/replace pairs in
sequence, so the output of an earlier replacement can be matched
again by a later pair.
The associative array form is arguably also easier to understand
and harder to mess up since the needle/replacement pairs are
explicitly connected instead of two separate arrays.
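For example (plain PHP, not code from this change):

  // One-for-one character translation, single pass:
  $dbKey = strtr( $text, ' ', '_' );

  // Associative form: pairs are explicit and the longest key matches first:
  $out = strtr( $text, array( '&nbsp;' => ' ', '&amp;' => '&' ) );

  // str_replace() applies each pair in sequence, so the output of an earlier
  // replacement can be matched again by a later pair:
  $out = str_replace( array( 'a', 'b' ), array( 'b', 'c' ), 'ab' ); // "cc"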
Also:
* Use getFormattedNsText() instead of strtr( getNsText(), .. ), which
expresses the intent more semantically and avoids duplicating that
transformation.
Change-Id: Ie23e4210a5b6908dd79eebc8a2b931d12fe31af6
Since MediaWiki 1.24 a constructor is defined in HTMLFileCache,
so the constructor of the parent class, FileCacheBase, is no longer
called, and $wgUseGzip is no longer used. This fix explicitly calls
the parent constructor.
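Sketch of the fix (the constructor parameters shown are illustrative):

  class HTMLFileCache extends FileCacheBase {
      public function __construct( $title, $action ) {
          parent::__construct(); // restores the $wgUseGzip handling
          // ...existing HTMLFileCache setup...
      }
  }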
Bug: T103237
Change-Id: I80c1871881049b9618c23aa76e6665867ecec2aa
LinkBatch::add() already handles the underscore/space conversion, so there
is no need to do it on the caller side when adding user names to a
LinkBatch.
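That is, callers can simply do (illustrative sketch):

  // LinkBatch::add() normalizes underscores/spaces itself, so no
  // strtr()/str_replace() is needed on $name first.
  $batch->add( NS_USER, $name );
  $batch->add( NS_USER_TALK, $name );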
Change-Id: I09e80712903a539164141cc0a88d321203114677
- Removed space after casts
- Removed spaces in array index
- Added spaces around string concat
- Added space after words: switch, foreach
- else if -> elseif
- Removed parentheses around require_once, because it is not a function
- Added newline at end of file
- Removed double spaces
- Added spaces around operators
- Removed repeated newlines
Bug: T102609
Change-Id: Ib860222b24f8ad8e9062cd4dc42ec88dc63fb49e
wfSuppressWarnings() and wfRestoreWarnings() were split out into a
separate library. All usages in core were replaced with the new
functions, and the wf* global functions are marked as deprecated.
Additionally, some uses of @ were replaced due to composer's autoloader
being loaded even earlier.
Ie1234f8c12693408de9b94bf6f84480a90bd4f8e adds the library to
mediawiki/vendor.
Bug: T100923
Change-Id: I5c35079a0a656180852be0ae6b1262d40f6534c4
Implementation written by Fred Emmott of Facebook. Quoting Fred:
As well as array access being faster, the main advantage is actually
that this significantly reduces the use of unserialize(), which does a
lot of memcpys when making the strings.
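The store/fetch pattern boils down to something like this (a simplified
sketch with placeholder variable names, not the actual LCStore code):

  // Write: dump the localisation data as a plain PHP array literal.
  file_put_contents(
      $fileName,
      "<?php\nreturn " . var_export( $data, true ) . ";\n"
  );

  // Read: a plain require; with APC/opcache the parsed array comes straight
  // from shared memory, with no unserialize() call at all.
  $data = require $fileName;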
Benchmarks compared to LCStoreCDB:
* HHVM (no repo-auth): ~7% improvement
* HHVM (with repo-auth): ~12% improvement
* PHP7: ~1% improvement
My (Legoktm) brief testing noted that the generated PHP files were
noticeably larger than the CDB ones:
* 1.5M en.l10n.php
* 932K l10n_cache-en.cdb
Bug: T99740
Change-Id: Ib2c5856d40cd928cab4a79cb935b3ce08c598300
* This should not be used, and load() does not see them.
If the content language is 'de', then message overrides
will not include MediaWiki:<key>/de pages.
Change-Id: Ie4b6b356bd309814dd4a88040a29a7ebd509712a
* The cache has to reload, and do so *after* locking, to avoid
losing any concurrent changes.
* Also fixed incorrect assumption in MessageCacheTest.
Message overrides for the content language do not use
the language suffix.
Change-Id: I98ff158a1575330bc59efe6badb27f8de8717951
This allows us to continue to use MessageBlobStore::get() as a way
to pre-fetch messages in batches for multiple modules while also
allowing individual modules to retrieve their messages in a convenient
way.
This is in preparation for implementing ResourceLoaderModule::buildContents
and using it in ResourceLoaderModule::getVersionHash.
ResourceLoader::respond() already fetches message blobs in batches for the
actual response, but it also needs getVersionHash(), which would otherwise
fetch the same messages a second time.
Change-Id: I7e4c8b65765b54807123e85cfbb7eb2e5b2f39bd
Modules now track their version via getVersionHash() instead of getModifiedTime().
== Background ==
While some resources have observable timestamps (e.g. files stored on disk),
many other resources do not, e.g. config variables and module definitions.
For static file modules, one can e.g. revert one or more files in a module to a
previous version and not affect the max timestamp.
Wiki modules include pages only if they exist. The user module supports common.js
and skin.js. By default neither exists. If a user has both, and then the
less-recently modified one is deleted, the max-timestamp remains unchanged.
For client-side caching, batch requests use "Math.max" on the relevant timestamps.
Again, if a module changes but another module is more recent (e.g. out-of-order
deployment, or out-of-order discovery), the change would not result in a cache miss.
More scenarios can be found in the associated Phabricator tasks.
== Version hash ==
Previously we virtually mapped these variables to a timestamp by storing the current
time alongside a hash of the value in ObjectCache. Considering the number of
possible request contexts (wikis * modules * users * skins * languages) this doesn't
work well. It results in needless cache invalidation when the original timestamp
observation is evicted by the cache's LRU algorithm. It also has other minor bugs
leading to fewer cache hits.
All modules automatically get the benefits of version hashing with this change.
The old getDefinitionMtime() and getHashMtime() have been replaced with dummies
that return 1. These functions are often called from getModifiedTime() in subclasses.
For backward-compatibility, their respective values (definition summary and hash)
are now included in getVersionHash directly.
As examples, the following modules have been updated to use getVersionHash directly.
Other modules still work fine and can be updated later.
* ResourceLoaderFileModule
* ResourceLoaderEditToolbarModule
* ResourceLoaderStartUpModule
* ResourceLoaderWikiModule
The presence of hashes in place of timestamps increases the startup module size on
a default MediaWiki install from 4.4k to 5.8k (after gzip and minification).
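Conceptually, the hash is now derived from the module's definition summary
rather than from timestamps, along these lines (illustrative sketch with
placeholder variables; the exact encoding/truncation is the one described
under "Misc" below):

  // Hash whatever affects the module's output (files, config, messages, ...).
  $summary = $module->getDefinitionSummary( $context );
  $summary['_cacheEpoch'] = $config->get( 'CacheEpoch' );

  $json = json_encode( $summary );
  // Short digest; any change to the summary changes the hash.
  $hash = substr( base64_encode( sha1( $json, true ) ), 0, 8 );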
== ETag ==
Since timestamps are no longer tracked, we need a different way to implement caching
for cache proxies (e.g. Varnish) and web browsers. Previously we used the
Last-Modified header (in combination with Cache-Control and Expires).
Instead of Last-Modified (and If-Modified-Since), we use ETag (and If-None-Match).
Entity tags (new in HTTP/1.1) are much stricter than Last-Modified by default.
They instruct browsers to allow usage of partial Range requests. Since our responses
are dynamically generated, we need to use the Weak version of ETag.
While this sounds bad, it's no different from Last-Modified. As RFC 2616
<http://www.w3.org/Protocols/rfc2616/rfc2616-sec13.html#sec13.3.3> confirms, the
behaviour specified for Last-Modified follows the same "weak" caching logic as
entity tags. It's just that entity tags are capable of a stricter mode (whereas
Last-Modified is inherently weak).
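In HTTP terms the exchange amounts to the following (a generic sketch, not
the ResourceLoader code; real If-None-Match handling must also cope with
multiple tags):

  // Weak validator: semantic equivalence rather than byte-for-byte identity,
  // which matches how Last-Modified already behaved.
  $etag = 'W/"' . $versionHash . '"';
  header( 'ETag: ' . $etag );

  $clientTag = isset( $_SERVER['HTTP_IF_NONE_MATCH'] )
      ? trim( $_SERVER['HTTP_IF_NONE_MATCH'] )
      : null;
  if ( $clientTag === $etag ) {
      header( 'HTTP/1.1 304 Not Modified' );
      exit;
  }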
== File cache ==
If $wgUseFileCache is enabled, ResourceLoader uses ResourceFileCache to cache
load.php responses. While the blind TTL handling (during the allowed expiry period)
is still maxage/timestamp based, tryRespondNotModified() now requires the caller to
know the expected ETag.
For this to work, the FileCache handling had to be moved from the top of
ResourceLoader::respond() to after the expected ETag is computed.
This also allows us to remove the duplicate tryRespondNotModified() handling,
since that is now already handled by ResourceLoader::respond().
== Misc ==
* Remove redundant modifiedTime cache in ResourceLoaderFileModule.
* Change bugzilla references to Phabricator.
* Centralised inclusion of wgCacheEpoch using getDefinitionSummary. Previously this
logic was duplicated in each place the modified timestamp was used.
* It's easy to forget calling the parent class in getDefinitionSummary().
Previously this method only tracked 'class' by default. As such, various
extensions hardcoded that one value instead of calling the parent and extending
the array. To better prevent this in the future, getVersionHash() now asserts
that the '_cacheEpoch' property made it through.
* tests: Don't use getDefinitionSummary() as an API.
Fix ResourceLoaderWikiModuleTest to call getPages properly.
* In tests, the default timestamp used to be 1388534400000 (which is the unix time
of 20140101000000; the unit tests' CacheEpoch). The new version hash of these
modules is "XyCC+PSK", which is the base64 encoded prefix of the SHA1 digest of:
'{"_class":"ResourceLoaderTestModule","_cacheEpoch":"20140101000000"}'
* Add sha1.js library for client-side hash generation.
Compared various implementations for code size (after minification/gzip),
and speed (when used for short hexadecimal strings).
https://jsperf.com/sha1-implementations
- CryptoJS <https://code.google.com/p/crypto-js/#SHA-1> (min+gzip: 2.5k)
http://crypto-js.googlecode.com/svn/tags/3.1.2/build/rollups/sha1.js
Chrome: 45k, Firefox: 89k, Safari: 92k
- jsSHA <https://github.com/Caligatio/jsSHA>
https://github.com/Caligatio/jsSHA/blob/3c1d4f2e/src/sha1.js (min+gzip: 1.8k)
Chrome: 65k, Firefox: 53k, Safari: 69k
- phpjs-sha1 <https://github.com/kvz/phpjs> (RL min+gzip: 0.8k)
https://github.com/kvz/phpjs/blob/1eaab15d/functions/strings/sha1.js
Chrome: 200k, Firefox: 280k, Safari: 78k
Modern browsers implement the HTML5 Crypto API. However, this API is asynchronous,
only enabled when on HTTPS in Chromium, and is quite low-level. It requires boilerplate
code to actually use with TextEncoder, ArrayBuffer and Uint32Array. Due to this
being needed in the module loader, we'd have to load the fallback regardless. Considering
this is not used in a critical path for performance, it's not worth shipping two
implementations for this optimisation.
May also resolve:
* T44094
* T90411
* T94810
Bug: T94074
Change-Id: Ibb292d2416839327d1807a66c78fd96dac0637d0
Seriously, the ops team spent some time trying to find that page during an outage,
while in fact it's an obscure Tolkien reference; better to be clear.
Also, set the other dummy titles to something very clearly explaining what's
going on and where.
Change-Id: I6f33a2ea5030f22a258830a33f7bcefa7f0acd85