I believe the more recent syntax is quite a bit more readable. The
most obvious benefit is that it allows for much less duplication.
Note this patch is intentionally only touching tests, so it can't
have any effect on production code.
Change-Id: Ibdde9e37edf80f0d3bb3cb9056bee5f7df8010ee
* It's not very clean to import Wikimedia\Stats in Parsoid
* MediaWiki depends on Parsoid
* As a workaround we can extract the 2 methods we need into SiteConfig
Bug: T354908
Change-Id: I696131cfba6ccc26ae1f705f216e221a7c3db175
Add deprecated aliases for the non-namespaced classes.
ReplicatedBagOStuff, which is already deprecated, isn't moved.
Bug: T353458
Change-Id: Ie01962517e5b53e59b9721e9996d4f1ea95abb51
Changes to the use statements done automatically via script
Addition of missing use statement done manually
Change-Id: I4ff4d0c10820dc2a3b8419b4115fadf81a76f7a2
Page bundle headers should not contain objects, as they are supposed
to represent plaintext HTTP headers.
Change-Id: I2a87a8233b9e42cbafdba63bdf513abe00d826ce
The `supportsContentModel` method is really querying Parsoid for the
set of content models it supports, so it makes sense to put it in the
Parsoid-specific SiteConfig service.
This is part of the work to deprecate and remove ParsoidOutputAccess.
Change-Id: I81eb2df8cef93ede95361a4e03185b3d58e5b84b
I don't think these do anything with the documentation generators
we currently use. Especially not in tests. How are tests part of a
"package" when the code is not?
Note how most of these are simply identical to the namespace. They
are most probably auto-generated by some IDEs but don't actually
mean anything.
Change-Id: I771b5f2041a8e3b077865c79cbebddbe028543d1
I keep running into this whenever I use createNoOpMock. I think it's
Xdebug that's calling this method, and then PHPUnit floods the
console with extremely long stack traces.
We pretty much never do anything custom with this method:
https://codesearch.wmcloud.org/search/?q=__debugInfo&files=%5C.php%24
Change-Id: Ib2ab86fb243555f5e4449ed72cb032cb465e415d
This is a follow-up to I0b683461212a357c7eb09ddec59c87539e323c65
and I40a8372a76f33c5f62ea73bb1180dd7c47412c89, which explicitly
supported IBufferingStatsdDataFactory for backward-compatibility
reasons. Now that we've fully switched to StatsFactory together with
the `copyToStatsdAt()` method, we can remove this `instanceof` logic
entirely.
Bug: T356815
Change-Id: I164d82904b6d3fb575cb973c14f9454569bf09ac
Covered:
- Constructor initialization with correct dependencies.
- Retrieve roles assigned to page content.
- Check if the specified role exists in the page content slots.
- Retrieve model name for specified role in page content.
- Handle exception for non-existent role when retrieving model.
- Retrieve content format for specified role in page content.
- Retrieve serialized content for specified role in page content.
- Handle exception for non-existent role when retrieving content.
Change-Id: Ia2129e37b15bb8c09c0b26e487a9e311e66b932f
HtmlOutputRendererHelper should not crash hard if the ParserOutput has
no language set. ParserOutput may come from a variety of places, so we
should be lenient about it not having a language.
However, we should try harder to actually set a language on ParserOutput
if we have one available. So this also updates
PageBundleParserOutputConverter to keep the ParserOutput's language in
sync with the language header in the PageBundle.
Bug: T349868
Bug: T353689
Bug: T359426
Change-Id: I2edf20dc3b199e22cda2f32bc858c21ca7d8f4bd
ParserOutput::getText() is not a simple getter, but does
transformations on the "text" of the ParserOutput; the simple getter
is named ::getRawText().
To maintain consistency, rename ParserOutput::setText() to
::setRawText() and the property name ParserOutput::$mText to
::$mRawText so future readers are not confused.
The JSON property name as it appears in the serialized ParserCache
is left as 'Text' so that we don't have any forward- or backward-
rollback issues.
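The split between the in-memory name and the serialized name can be sketched as follows. This is a language-neutral illustration in Python, not MediaWiki's PHP code; `ParserOutputSketch`, `to_json_array` and `from_json_array` are hypothetical names chosen for the example.

```python
class ParserOutputSketch:
    """Hypothetical stand-in for ParserOutput, not the real class."""

    def __init__(self, raw_text=None):
        # The in-memory attribute uses the new "raw text" naming.
        self.raw_text = raw_text

    def to_json_array(self):
        # The serialized key stays 'Text', so entries written before and
        # after the rename remain readable by either version of the code.
        return {"Text": self.raw_text}

    @classmethod
    def from_json_array(cls, data):
        # Reading also uses the unchanged 'Text' key.
        return cls(raw_text=data.get("Text"))
```

Because only the in-memory name changes, a cache entry written by old code deserializes unchanged under new code, and a rollback can still read entries written after the rename.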
Change-Id: I3ef34814ab9473cc70d0a6806e8c5a4a02b73491
This was banned because it could be used to load other files,
including potentially local files, in IE9 and earlier. This
browser is no longer relevant. Wikimedia sites stopped supporting
the needed TLS versions for that browser 4 years ago.
Modern browsers have redefined filter to mean something different.
Generally the new filter is perfectly safe as long as we ban the url()
function, which we do.
For context on why it was originally banned, see
https://static-codereview.wikimedia.org/MediaWiki/66990.html
Bug: T308160
Change-Id: Ic94f499dfe66e3cce12496893d0ecbee006bd243
This class belongs with the rest of the Parsoid output stash code.
This class has been marked @unstable since 1.39 and thus the move
does not need release notes.
Change-Id: I16061c0c28b1549fbe90ea082cc717fee4a09a6e
Set the render ID for each parse stored into cache so that we are able
to identify a specific parse when there are dependencies (for example
in an edit based on that parse). This is recorded as a property added
to the ParserOutput, not the parent CacheTime interface. Even though
the render ID is /related/ to the CacheTime interface, CacheTime is
also used directly as a parser cache key, and the UUID should not be
part of the lookup key.
In general we are trying to move the location where these cache
properties are set as early as possible, so we check at each location
to ensure we don't overwrite a previously-set value. Eventually we
can convert most of these checks into assertions that the cache
properties have already been set (T350538). The primary location for
setting cache properties is the ContentRenderer.
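The "set only if not already set" check described above might look roughly like this. It is a sketch in Python rather than the actual PHP; the dictionary keys, the function name `ensure_cache_properties`, and the use of `uuid.uuid4()` in place of the injected ID generator are all assumptions made for illustration.

```python
import uuid


def ensure_cache_properties(output, revision_timestamp=None):
    """Fill in cache properties only where an earlier stage left them unset."""
    if output.get("render_id") is None:
        # uuid4 is a stand-in for whatever ID generator is injected.
        output["render_id"] = str(uuid.uuid4())
    if output.get("revision_timestamp") is None and revision_timestamp is not None:
        output["revision_timestamp"] = revision_timestamp
    return output
```

Calling the helper a second time at a later location leaves the earlier values untouched, which is the property that would later allow the checks to become assertions.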
Moved setting the revision timestamp into ContentRenderer as well, as
it was set along the same code paths. An extra parameter was added to
ContentRenderer::getParserOutput() to support this.
Added merge code to ParserOutput::mergeInternalMetaDataFrom() which
should ensure that cache time, revision, timestamp, and render id are
all set properly when multiple slots are combined together in MCR.
In order to ensure the render ID is set on all codepaths we needed to
plumb the GlobalIdGenerator service into ContentRenderer, ParserCache,
ParserCacheFactory, and RevisionOutputCache. Eventually (T350538) it
should only be necessary in the ContentRenderer.
Bug: T350538
Bug: T349868
Followup-To: Ic9b7cc0fcf365e772b7d080d76a065e3fd585f80
Change-Id: I72c5e6f86b7f081ab5ce7a56f5365d2f75067a78
Changes to the use statements done automatically via script
Addition of missing use statements and changes to docs done manually
Change-Id: Ib326ae1e5c8409a98398c721e8b8ce42c73bd012
Pages that are fast to render can be omitted from the parser cache
to preserve disk space and cache write operations.
The threshold is configurable per namespace, so the tradeoff can
be evaluated based on different access patterns. For example, pages
that are accessed rarely, like file description pages on Commons,
may have a high threshold configured, while pages that are read
frequently, like Wikipedia articles, may be configured to be always
cached, using a 0 threshold.
Filtering is based on a time profile recorded in the ParserOutput.
A generic mechanism for capturing the timing profile is implemented
in the ContentHandler base class. Subclasses may implement a more
rigorous capture mechanism.
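The filtering rule can be sketched as below. This is an illustrative Python sketch, not the PHP implementation; the function name `should_cache`, the millisecond unit, and the example threshold values are assumptions. The namespace IDs 0 (main) and 6 (File) are used only as familiar examples.

```python
def should_cache(namespace, render_time_ms, thresholds, default_threshold_ms=0):
    """Return True if a parse that took render_time_ms should be cached.

    thresholds maps a namespace ID to the minimum render time (assumed to
    be in milliseconds) that makes a page worth caching.
    """
    threshold = thresholds.get(namespace, default_threshold_ms)
    # A threshold of 0 means "always cache"; anything faster than a
    # positive threshold is skipped to save space and cache writes.
    return render_time_ms >= threshold


# Hypothetical configuration: always cache main-namespace pages, but
# only cache File pages that took at least one second to render.
example_thresholds = {0: 0, 6: 1000}
```

With this configuration, `should_cache(0, 5, example_thresholds)` is true, while `should_cache(6, 200, example_thresholds)` is false.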
Bug: T346765
Change-Id: I38a6f3ef064f98f3ad6a7c60856b0248a94fe9ac
There are already a couple of classes related to user options,
and the T321527 work on dynamic defaults is going to add
even more. Let's move them into a separate namespace
to make core a bit more organized.
Old name is kept as an alias for compatibility purposes.
Bug: T321527
Bug: T352284
Change-Id: I9822eb1553870b876d0b8a927e4e86c27d83bd52
The main motivation is to further reduce the complexity of the class:
* There is no code that ever writes to $this->mSubstIDs. It's
effectively a constant.
* According to CodeSearch the getSubstIDs() method is not used
anywhere. It's @internal to the parser.
* I find it weird that the parser needs to call 2 factory methods to
do 1 thing.
* I still find it a good idea to keep the knowledge encapsulated in
the factory and not have the [ 'subst', 'safesubst' ] array in the
parser. That's why I propose the new method.
Change-Id: I5c147c75200c3c34a410d93a0328b56ea00a050f
Garbage in, garbage out. When the wikitext is broken, it's still
helpful if the user can see the broken wikitext. Even if it's not
fully parsed. It's not the job of this class to fix broken UTF-8.
The worst thing that can happen is that the wikitext contains some
unparsed magic words. However, this is really only relevant for
very old revisions (20 years old, see T321234). It's very normal
that old revisions can't be 100% parsed any more, most notably
because of deleted templates. This here is not much different.
Bug: T321234
Change-Id: I0ce40f6575668847ef309599ee32de52190ab212
The extra code that scans for duplicates and throws an exception was
added via I95dea67 in 2017. I'm not entirely sure why. This should
be impossible in all relevant real-world scenarios. Maybe it happened
in a local dev scenario?
Even if it did, duplicates are harmless. Let me explain:
The only way a duplicate can end up here is when the same magic word is
added twice to the $this->names array. The only thing that happens
then is that the resulting regex contains one of the sub-patterns
twice. It doesn't matter which one matches. We know these subpatterns
are identical. Unfortunately the PCRE compiler doesn't know and
assumes duplicate names are a problem. We have two options to fix
this: Strip duplicates in $this->names with array_unique() or tell
the PCRE compiler that duplicates are ok with the /J modifier.
I would like to avoid the extra, potentially expensive array_unique()
because, as said, duplicates never happen in real-world scenarios.
The /J modifier has been supported since PHP 7.2.
Change-Id: I5f113abdbb44354fcc01be7f36fbc7d07f75876c
* MagicWord::getId was added in r24808 (164bb322f2) but never used.
At the time, access modifiers like 'private' were not yet in use.
Deprecate the method with warnings, for removal in a future release.
* Fix zero coverage for MagicWord. Because the constructor is
internal, instances are only intended to be created via the array
and factory classes. Let their tests cover this class.
* Remove redundant file-level description and ensure the class desc
and ingroup tag are on the class block instead.
Ref https://gerrit.wikimedia.org/r/q/owner:Krinkle+message:ingroup
* Mark constructor `@internal` (was already implied by
stable interface policy), and explain where to get the object
instead.
* Mark load() `@internal`. Method was introduced in 1.1 when the
class (and PHP) did not yet use visibility modifiers for private
methods. The only way to get an instance of MagicWord
(MagicWordFactory::get) already calls load(), the method is not
a no-op if called a second time, and (fortunately) there exist no
callers to this outside this class that I could find.
* MagicWordArray::getBaseRegex was marked as internal
in change I17f1b7207db8d2203c904508f3ab8a64b68736a8.
Change-Id: I4084f858bb356029c142fbdb699f91cf0d6ec56f
The tests we added before create only MagicWordArray objects with a
single magic word. Here we are testing actual arrays of magic words.
Change-Id: I5880cca2a1e1ecf7018edd22c11229da5d5baffd
I think this code is effectively covered by the parser tests that use
magic words. Still it worried me more and more to make changes to
this code without dedicated unit tests.
Change-Id: Id72e1d7ef4736e4d0672798d720465648d91b3ba
This nominally takes a string-valued language code conforming to the
BCP-47 standard, but this is often generated from a Bcp47Code object.
Since the MediaWiki Language class implements Bcp47Code, we may have
the case where we have a Language object in hand (but typed as a
Bcp47Code, not Language) and call Language::toBcp47Code() only to pass
the result to LanguageCode::bcp47ToInternal() to convert it back to a
MediaWiki-internal code.
We can save steps and be more efficient if we allow the parameter to be a
Bcp47Code object, and write a fast path for the special case where
that Bcp47Code happens to be a Language object and we can simply call
Language::getCode() to obtain the internal code.
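The fast path can be sketched as follows. This is a Python illustration of the control flow only, not the PHP code; the classes below are toy stand-ins, and lowercasing as the string conversion is a placeholder assumption rather than the real mapping.

```python
class Bcp47Code:
    """Hypothetical interface: anything that can report a BCP-47 code."""

    def to_bcp47_code(self) -> str:
        raise NotImplementedError


class Language(Bcp47Code):
    """Toy stand-in that also knows its internal code."""

    def __init__(self, internal_code: str, bcp47_code: str):
        self._internal_code = internal_code
        self._bcp47_code = bcp47_code

    def get_code(self) -> str:
        return self._internal_code

    def to_bcp47_code(self) -> str:
        return self._bcp47_code


def bcp47_to_internal(code) -> str:
    # Fast path: a Language object already holds the internal code, so
    # there is no need to convert to BCP-47 and back again.
    if isinstance(code, Language):
        return code.get_code()
    # Other Bcp47Code objects are reduced to their string form first.
    if isinstance(code, Bcp47Code):
        code = code.to_bcp47_code()
    # Placeholder for the real string conversion (an assumption).
    return code.lower()
```

Callers that hold a Language typed only as Bcp47Code can then pass the object directly instead of converting it to a string first.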
Change-Id: I24932449b8c40e3a5072748d87667184f4befa67