Before c962b48056, the 'loadedLanguages' array was used to track
which languages were loaded and in the cache, with 'cache' being a
simple array. In that commit, the 'cache' array also started being used
for incomplete datasets, which didn't affect 'loadedLanguages'.
Then in 97e86d934b, the 'loadedLanguages' array was removed in favour
of checking keys on 'cache' directly, and 'cache' was converted to
MapCacheLRU.
This led to problem where partially loaded data was mistaken for being
full datasets (fatal error, T208897). This was fixed in a5c984cc59,
by bringing back the 'loadedLanguages' array, which fixed the issue from
the POV of partially loaded data.
However, this then exposed a new problem. The 'cache' data can be evicted
by MapCacheLRU, whereas 'loadedLanguages' is not aware of that. Thus it
claims languages are loaded that sometimes aren't. (This only affects web
requests where more than 5 language codes are involved, per MapCacheLRU.)
Fix this by re-removing the 'loadedLanguages' array, this time
strengthening the 'cache' key check to not just check that the root key
exists, but that it is in fact holding the full dataset as generated by
MessageCache::load(). The 'VERSION' key appears to be a good proxy for
that.
Bug: T230690
Change-Id: I1162a3857376aa37e5894ae3c8be84a2295782a3
When reporting exceptions that occur during initialization, wgUser may
be null. Don't die when that happens.
Change-Id: I65d5a17d80f9021e28a218c7a5a17e399bc7ce98
This variable has never been set to anything other than the default value of
24 hours as introduced in 2003 (r2203, r2204; or 036ff960ce, edf6b38626).
The variable has never changed in core, it's not overridden at WMF,
and MessageCache is not constructed anywhere other than ServiceWiring.php
anywhere in repos on Wikimedia Gerrit, indexed by MediaWiki Codesearch,
or any GitHub-hosted repository (incl Wikia repos and WikiHow mirrors).
I've also checked all GitHub-hosted repos for boilerplates and/or public
settings files from devs or prod, and couldn't find any example of
this being overridden (after filtering out copies of the core files
themselves). Rather than having to support potentially hard-to-predict
interactions betweeen caching layers by checking its state, make it
a constant so we can code reason about it more easily.
Change-Id: Ie2e139001aae3ac54b509d94a3d917bb408eaca0
For some unknown reason, when the `actor` table has few enough NS8 rows
compared to `page` MariaDB 10.1.37 decides it makes more sense to fetch
everything from `actor` then join `revision` then `page` rather than
fetching the rows from `page` in the first place.
We can work around it by telling it to not reorder the query, but then
we also have to reorder it ourselves to put `page` first instead of
`revision`.
Bug: T231196
Change-Id: I2b2fb209e648d1e407c5c2d32d3ac9e574e361d5
This was removed in 97e86d934b in 2018 in favour of using
`$this->cache->has($code)`. This is a problem because there
are cases where only a narrow subset of that structure is
populated (by MessageCache->replace) without things like
$this->overridable (or anything else that MessageCache->load does)
having ocurred yet.
The assumption that keys are only added to $this->cache by
MessageCache->load (or after that method has been called) was
actually true at some point. But, this changed in 2017 when
commit c962b48056 optimised MessageCache->replace to not call
MessageCache->load.
Bug: T208897
Change-Id: Ie8bb4a4793675e5f1454e65c427f3100035c8b4d
The way isMainCacheable() was used, it always returned false in
non-content languages, because it would try to find strings like
'hidetoc/fr' in the array of message keys (which contains strings like
'hidetoc').
The consequence of this was that MessageCache would check the database
for a MediaWiki:hidetoc/fr page even if it already knew that that page
didn't exist. This is a substantial performance hit when requesting lots
of messages, like when building version hashes for ResourceLoader's
startup module.
Follows-up 4fc5ba8bf8.
Bug: T228555
Change-Id: I20433175ca919acc1c995f4a9cd50ca53afcdd02
Use MediaWikiServices (not OutputPage) to obtain a ResourceLoader
instance, then call ->getMessageBlobStore() to obtain its
MessageBlobStore instance (don't construct a new one).
Change-Id: I6b8bacac9888b5807328eece01134a6c5747dc72
In Id044b8dcd7c, we lost a condition that ensured that the cache would
be populated with the latest revision. Now it was being populated with
all revisions, with a random one winning.
Bug: T218918
Change-Id: I1a47356ea35f0abf35bb1a3489d0d3442a3400a5
Remove many references to database fields rev_text_id and ar_text_id,
which are being retired as part of MCR Schema Migration.
Some references remain, and will be removed under other patchsets
or other tasks.
Bug: T198341
Change-Id: Id044b8dcd7c9d09d5d6037eb732f6a105933f516
We are incrementally removing places where the parser is used with
tidy disabled, since future parsers will not support such operation.
Bug: T198214
Change-Id: I0f417f75a49dfea873e9a2f44d81796a48b9f428
Follow up to a3d6c1411d.
This avoids extra queries for messages that have a software defined value.
Bug: T193271
Change-Id: I25aa0e27200a0b417721cf1fbd34a82095405b89
Prior to I462554b30, MessageCache::replace() did just that: it took the
existing cache and updated the one entry.
In I462554b30, the rearrangement of work into a DeferredUpdate
introduced a bug: the in-process cache was updated, but when the shared
cache was loaded later the entry was never updated in there so the
shared caches kept the old value. This was found in code review and
worked around by reloading all the messages from the database instead of
updating the existing cache.
But all that extra work reloading everything from the database causes
major slowness saving any MediaWiki-namespace page when the wiki has
many such small pages. Let's go back and fix the bug so replace() can
again replace instead of reloading everything.
Bug: T203925
Change-Id: Ife8e1bd6f143f480eb8da09b817c85aadf33a923
Clean up individual message cache handling when the master key is
volatile. The goal is not to treat the message as gone but to re-fetch
it when accessed rather than cache the value in the main/process cache.
Bug: T193271
Change-Id: I4bcaf10c1516e7c96c2d0963722affeaf80272e0
In some functions MediaWikiServices::getInstance() was called twices or
in loops. Extract the variable to reduce calls.
Change-Id: I2705db11d7a9ea73efb9b5a5c40747ab0b3ea36f
In cases where we're operating on text data (and not binary data),
use e.g. "\u{00A0}" to refer directly to the Unicode character
'NO-BREAK SPACE' instead of "\xc2\xa0" to specify the bytes C2h A0h
(which correspond to the UTF-8 encoding of that character). This
makes it easier to look up those mysterious sequences, as not all
are as recognizable as the no-break space.
This is not enforced by PHP, but I think we should write those in
uppercase and zero-padded to at least four characters, like the
Unicode standard does.
Note that not all "\xNN" escapes can be automatically replaced:
* We can't use Unicode escapes for binary data that is not UTF-8
(e.g. in code converting from legacy encodings or testing the
handling of invalid UTF-8 byte sequences).
* '\xNN' escapes in regular expressions in single-quoted strings
are actually handled by PCRE and have to be dealt with carefully
(those regexps should probably be changed to use the /u modifier).
* "\xNN" referring to ASCII characters ("\x7F" and lower) should
probably be left as-is.
The replacements in this commit were done semi-manually by piping
the existing "\xNN" escapes through the following terrible Ruby
script I devised:
chars = eval('"' + ARGV[0] + '"').force_encoding('utf-8')
puts chars.split('').map{|char|
'\\u{' + char.ord.to_s(16).upcase.rjust(4, '0') + '}'
}.join('')
Change-Id: Idc3dee3a7fb5ebfaef395754d8859b18f1f8769a
Find: /isset\(\s*([^()]+?)\s*\)\s*\?\s*\1\s*:\s*/
Replace with: '\1 ?? '
(Everywhere except includes/PHPVersionCheck.php)
(Then, manually fix some line length and indentation issues)
Then manually reviewed the replacements for cases where confusing
operator precedence would result in incorrect results
(fixing those in I478db046a1cc162c6767003ce45c9b56270f3372).
Change-Id: I33b421c8cb11cdd4ce896488c9ff5313f03a38cf
The avoids the long delete() loop in MessageCache::replace()
and has better separation of concern.
Change-Id: I0acb0119058fa92fcafb52a5850f5dad4aaa94d2
This also removes all the in-core calls that had been kept for the
benefit of extensions, and causes them to not have any effect since
anything that had been calling them was already either a no-op or will
probably be broken now that nothing in core is setting or checking the
flags.
Change-Id: Id22c1a5a6d6a249debb14063ae3f8838d105b634
And deprecate passing false for ParserOptions::setWrapOutputClass().
There are three cases for the Parser wrapper: the default
mw-parser-output, a custom wrapper, or no wrapper. As things currently
stand, we have to fragment the parser cache on each of these options,
which uses a nontrival amount of storage space (T167784).
Ideally we'd do all the wrapping as a post-cache transform, but
TemplateStyles needs to know the wrapper in use in order to properly
prefix its CSS rules (that's why we added the wrapper in the first
place). So, second best option is to make *un*wrapping be a post-cache
transform and make "custom wrapper" be uncacheable.
This patch does the first bit (unwrapping as a post-cache transform),
and a followup will do the second part once the deprecation process is
satisfied.
Bug: T181846
Change-Id: Iba16e78c41be992467101e7d83e9c3134765b101
These comments do not add anything. I argue they are worse than having
no comments, because I have to read them first to understand they
actually don't explain anything. Removing them makes room for actual
improvements in the future (if needed).
Change-Id: Iee70aad681b3385e9af282d5581c10addbb91ac4
This is a re-submission of I4f24e7fbb68.
As a first major step towards Multi-Content-Revisions (MCR),
this patch turns the Revision class into a legacy proxy for
the new RevisionRecord and RevisionStore classes.
Backwards compatibility is maintained for all but some
rare edge cases, like constructing a completely empty
Revision object.
For more information on MCR, see
<https://www.mediawiki.org/wiki/Requests_for_comment/Multi-Content_Revisions>.
NOTE: once this is merged, verify create/delete/restore cycle on beta,
ideally with emulated replication lag.
Bug: T174025
Change-Id: Ia4c20a91e98df0b9b14b138eb4825c55e5200384
This reverts commit 9dcc56b3c9.
With this patch applied, newly created revisions are sometimes not found
just after submitting an edit, until replicas have caught up.
Our best theory is that it somehow interfere with ChronologyProtector,
but we don't have a good idea how.
Also, as legoktm mentioned, the commit message is terrible and needs fixing.
Change-Id: Idf3404f3fa8f8d08a7fb2ab8268726e2c1edecfe