Following up on the comment I made at Ibbc1423166f4804a5122, make Parser
instance management a ParserFactory responsibility. It is weird for
Parser to have a ParserFactory proxy aspect.
* Add ParserFactory::getMainInstance(), which is equivalent to the old
MediaWikiServices::getParser() and $wgParser.
* Add ParserFactory::getInstance(), which is equivalent to
$wgParser->getFreshParser(), returning the main instance if it is
free, or a new instance otherwise. The naming is supposed to encourage
it as the default way to get a parser, which will help with the linked
bug.
* Deprecate Parser::getFreshParser() and migrate all core callers.
I left the entry in ServiceWiring.php so that it's not immediately
necessary to migrate ObjectFactory specs that ask for Parser.
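A minimal sketch of the intended getInstance() behaviour, assuming a
simple "busy" flag marks a parser that is in the middle of a parse. The
SketchParser and SketchParserFactory classes are stand-ins invented for
illustration and are not the real Parser and ParserFactory:

```php
<?php
// Stand-in classes for illustration only; not MediaWiki's Parser/ParserFactory.
final class SketchParser {
	/** @var bool True while a parse is in progress on this instance (assumed flag). */
	public $busy = false;
}

final class SketchParserFactory {
	/** @var SketchParser|null */
	private $mainInstance = null;

	/** Always returns the shared main instance, creating it on first use. */
	public function getMainInstance(): SketchParser {
		if ( $this->mainInstance === null ) {
			$this->mainInstance = new SketchParser();
		}
		return $this->mainInstance;
	}

	/** Returns the main instance if it is free, or a new instance otherwise. */
	public function getInstance(): SketchParser {
		$main = $this->getMainInstance();
		return $main->busy ? new SketchParser() : $main;
	}
}

$factory = new SketchParserFactory();
$first = $factory->getInstance();   // the main instance, since it is free
$first->busy = true;                // simulate a parse in progress
$second = $factory->getInstance();  // a separate instance
```

Callers that do not care which instance they get would use getInstance();
only code that needs the shared state would ask for the main instance.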
Bug: T310948
Change-Id: I762b191e978c2d1bbc9f332c9cfa047888ce2e67
Parsoid currently only supports wikitext (and JSON), so don't give it anything else.
NOTE: ParsoidOutputAccess will fail on content that is unsupported by Parsoid.
However, this will not affect the /transform and /page endpoints in the
Parsoid extension, since they use the ParsoidHandler base class, which doesn't
rely on ParsoidOutputAccess.
Bug: T301371
Change-Id: I6bc9b978947b31455a4bce6385b7bdf64ed4043c
This removes a cyclic dependency:
ParsoidHTMLHelper in the REST component uses ParsoidOutputAccess in the
parser component. So ParsoidOutputAccess cannot use LocalizedHttpException
from the REST component.
This also improves separation of concerns: the parsing component should
not be concerned with HTTP status codes.
Bug: T301371
Change-Id: I2e661fe3ce0824dbfd7579650972f9019c92ed59
This isolates ParsoidHTMLHelper from the internals of
ParsoidOutputAccess. The corresponding test cases were changed to use a
mock ParsoidOutputAccess, and to not test the behavior of
ParsoidOutputAccess.
Bug: T301371
Change-Id: Id693fae2264f15e5d35f28acc5adc4239b2ae24f
This patch introduces a ParsoidOutputAccess service for
getting Parsoid output, and warms the cache with pregenerated
output.
It also introduces a config variable in ParsoidCacheConfig,
turned off by default, that controls the cache warming.
Bug: T301371
Change-Id: I6152c42ea765d94093d8d62598b1b4278314adec
ParserCache::checkOutdated relies on ParserOutput::getCacheRevisionId() to determine
whether a revision is still current after loading it from the cache. If
the revision ID is 0 or null, this will result in false negatives, and
the revision will always be considered outdated.
It is better to detect and report this before writing the ParserOutput to the cache.
This also adds an assertion in DerivedPageDataUpdater that will trigger
an exception if we try to write to the parser cache before the revision
has been saved and the ID is known.
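A simplified model of the problem, as a sketch only. Both helper
functions are hypothetical and do not reproduce the actual comparison in
ParserCache or the actual assertion in DerivedPageDataUpdater:

```php
<?php
// Simplified model for illustration; these helpers are hypothetical and do
// not reproduce the real ParserCache or DerivedPageDataUpdater code.

/** Treats a cached entry as outdated unless its revision ID matches. */
function sketchIsOutdated( ?int $cachedRevId, int $currentRevId ): bool {
	// 0 and null never equal a real revision ID, so such entries always look outdated.
	return $cachedRevId !== $currentRevId;
}

/** Rejects a missing revision ID before anything is written to the cache. */
function sketchAssertCacheable( ?int $cachedRevId ): void {
	if ( $cachedRevId === null || $cachedRevId === 0 ) {
		throw new InvalidArgumentException( 'Cannot cache output without a revision ID' );
	}
}
```

Checking at write time surfaces the mistake where it is made, instead of
as a cache entry that silently never hits.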
Change-Id: I242b769afbc7e1ae1e3f218d451f04945dfa8be4
* Export nofollow and target settings in siteinfo API so that Parsoid's
developer mode of ApiSiteConfig works.
* Implement SiteConfig::getNoFollowConfig and
SiteConfig::getExternalLinkTarget, which are defined as abstract
in the parent class in Parsoid.
Bug: T186241
Change-Id: I6a1f12335be19509d4c5a17e2cae96ecdb677103
The Parsoid repo's use of this method still relies on the $wikitextOverride
argument being handled properly, and the $pageId may be a LinkTarget rather
than a PageIdentity.
This patch contains some "unnecessary" renames just to make the diff
with 5f5b4cbbb4a6214229a23062787c25acd4192ff7^ easier to read.
Change-Id: I79a2773bf6d2366593b31555afd0b548b66d222a
* This lets us pass mocked revisions in the parser test runner while
running in Parsoid mode.
* This improves wt2html test results where a revision ID is queried.
I've verified this in the Cite extension repo as well as on the main
parserTests.txt file, but I cannot enable Parsoid integrated testing
on the main parser tests file without doing a sweep over all parser
tests and adding appropriate test sections.
* Currently, PageConfigFactory doesn't have unit tests. I will look
into adding them separately in a follow-up.
* Moved the setupParsoidTransform function to a more suitable place
in the ParserTestRunner.php file.
Bug: T270310
Change-Id: I94d68c8528bb2f7b367c68d80d14ebc1ab904a7f
When JSON support was introduced into ParserCache in 1.36, it was
controlled by a feature flag, $wgParserCacheUseJson. The feature flag
was "born deprecated" in 1.36. It can now be removed.
This means that ParserCache will always store entries as JSON.
Support for reading old non-JSON entries remains intact.
This is needed when updating wikis from a version older than 1.36
to the current version.
Change-Id: Id04e42bfb458d98414bac50e0d6c505e8878e5c0
Due to quirks of the bootstrapping process in PHPUnit, enableMagicLinks can
end up as null instead of an array with keys expected by ParserOptions.
As a workaround, set a fallback for each expected key.
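A minimal sketch of the per-key fallback, assuming the expected keys are
ISBN, PMID and RFC and that a missing key should fall back to false. The
helper name and the fallback value are assumptions for illustration:

```php
<?php
// Hypothetical helper; key names and the false fallback are assumptions.
function sketchMagicLinkOptions( ?array $enableMagicLinks ): array {
	// The ?? operator tolerates both a null array and a missing key.
	return [
		'ISBN' => $enableMagicLinks['ISBN'] ?? false,
		'PMID' => $enableMagicLinks['PMID'] ?? false,
		'RFC' => $enableMagicLinks['RFC'] ?? false,
	];
}

$fromNull = sketchMagicLinkOptions( null );
$partial = sketchMagicLinkOptions( [ 'ISBN' => true ] );
```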
Change-Id: I1511503937f8ac4fcd2f2c8b98bfd7dba17385ec
Add a new trait for the PageBundle class that serializes a PageBundle
object to JSON before stashing and deserializes it after unstashing.
Change-Id: I486fab5b3d01bcef2b535af579cd9672403b2102
Add stash option to /page/html & /revision/html endpoints.
When this option is set, the PageBundle returned by Parsoid is
stashed and an etag is returned that can later be used to
make use of the stashed PageBundle.
The stash is for now backed by the BagOStuff returned by
ObjectCache::getLocalClusterInstance().
This patch adds additional data to the ParserOutput stored in ParserCache.
Old entries lacking that data will be ignored.
Bug: T267990
Co-Authored-by: Nikki <nnikkhoui@wikimedia.org>
Change-Id: Id35f1423a69e3ff63e4f9883b3f7e3f9521d81d5
This is a replay of 4bc0dc348a, which
was reverted in 9bd4fc0ae9 due to unexpected
use on Dutch Wiktionary. In 1.36 deprecation warnings and a tracking
category were added if a wiki defined [[Template:=]] to expand to
anything other than `=` (see aeb3f45c20).
This patch follows up that deprecation by finally defining `{{=}}` as
a built-in, since the last usage on deployed wikis was cleaned up
sometime around February 2021 (list at
https://meta.wikimedia.org/wiki/Equals_sign_parser_function_template_conflicts
).
We've left the tracking category defined for now, so that any remaining
pages left in the tracking category on third-party wikis still retain
localized category documentation. But it is expected that the next MW
release will also remove the tracking category.
Bug: T91154
Change-Id: I4717172f1d74d326212d51015a6cd87c3758f30d
Follow-up to I9d1f0f6bab1305552a0350667d6142a24bc04049. That patch was
not collecting data at all (not even overwriting them over and over
again) - the assignment operation was, in practice, a NOP. This patch
fixes this.
Bug: T303014
Bug: T303015
Change-Id: I7d09b532f3270edf4327c16e032d665353d992f6
DefaultSettings.php has been replaced by MainConfigSchema.
Loading DefaultSettings.php is deprecated.
Code that needs to have access to configuration defaults should use the
ConfigSchema service object.
Bug: T300129
Change-Id: I7b2c0ca95a78990be1cdb9dd9ace92f6dcf1af15
NOTE: This changes the HTML returned by the endpoint!
It will now include the id="mwXYZ" attributes needed to
later map to data-parsoid entries.
Bug: T268205
Change-Id: I0a29434b996cc289eb67083e62bd6f1ad750cb4d
Remove the code that outputs self-closing <mw:editsection ... /> tags
in Parser, previously used for transcluded sections.
Remove the ability to handle them in ParserOutput. We don't need
backwards-compatibility with cached content, because that feature did
not work correctly for several years: Remex-Tidy always expanded them
to normal open and close tags.
Remove handling for this case in skin code (and fix documentation).
These are backwards-compatible changes.
Depends-On: Idbf0b95a3c0b04caa056b71dd08f46659920114a
Bug: T306299
Change-Id: I3fac0f34d134d8eec46c7eefa3ad2b67abb957da
This ensures that nofollow is added where necessary and sets the
title attribute on links where appropriate, matching file links outside
of galleries.
Change-Id: I999a294bbcb921fe53823503e8009ab678411bb2
Previously:
* It was unclear that generate-html is an optional optimization
* Most of MediaWiki core was doing $parserOutput->setText('') if
html wasn't generated. However, this is wrong: it causes
$parserOutput->hasText() to return true and can also cause cache
pollution if a content handler both does that and supports the
parser cache (like MassMessage; see T299896).
* The default value of mText in the constructor was '', and most
of the time MW used that default. This doesn't seem right. If
setText() is never called, the ParserOutput should not be considered
to have text
* It was impossible to set mText to null, as $parserOutput->setText(null)
was a no-op. Docs implied you were supposed to do this, so it was very
confusing.
This patch clarifies docs, changes the default value for ParserOutput::$mText
from '' to null, and makes $parserOutput->setText(null) do what you
expect it to. The last two are arguably breaking changes, although
the previous behaviours were unexpected, mostly undocumented and
based on a code search do not appear to be relied on.
It seems like the main reason this only broke MassMessage is most
content handlers either don't support generateHtml, or they don't
support parser cache.
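A minimal sketch of the new semantics, assuming hasText() simply checks
for null. SketchParserOutput is a stand-in that models only the
text-related behaviour described above, not the real ParserOutput:

```php
<?php
// Stand-in class for illustration; not the real ParserOutput.
final class SketchParserOutput {
	/** @var string|null Null means "no HTML was generated". */
	private $text;

	public function __construct( ?string $text = null ) {
		$this->text = $text;
	}

	/** Setting null clears the text instead of being a no-op. */
	public function setText( ?string $text ): void {
		$this->text = $text;
	}

	/** An empty string still counts as text; only null does not. */
	public function hasText(): bool {
		return $this->text !== null;
	}
}

$output = new SketchParserOutput();
$beforeSet = $output->hasText();   // no setText() call yet
$output->setText( '' );
$afterEmpty = $output->hasText();  // '' is treated as real (empty) text
$output->setText( null );
$afterNull = $output->hasText();   // text has been cleared again
```

Under this model, a content handler that skips HTML generation should
leave the text unset rather than call setText('').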
Bug: T306591
Change-Id: I49cdf21411c6b02ac9a221a13393bebe17c7871e
Depends-On: I68ad491735b2df13951399312a4f9c37b63a08fa
All the other ways of doing it were ridiculous and much harder to read,
and usually required repeating the needle expression (to get its
length). I found these occurrences by grepping for various expressions,
but I undoubtedly missed some.
I didn't try replacing the many instances of strpos(...) === 0 with
str_starts_with(...), because I think they're readable enough as-is
(although less efficient). Likewise I didn't try porting strpos(...) !==
false to str_contains(...). For case-insensitive comparisons, Tim
Starling requested that we stick with substr_compare() because it's more
efficient than calling strtolower().
On PHP < 8 these functions will be included with a polyfill via
vendor/autoload.php. This is included at the beginning of
includes/AutoLoader.php, so if our autoloader has been included the
polyfill will be available. This means it should be safe to call these
functions from any code that would not be usable without our autoloader.
Three uses that Tim Starling identified as being performance-sensitive
have been split out to a separate commit for porting after the switch to
PHP 8.
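For illustration, a short comparison of the idioms discussed above,
assuming the replaced patterns are suffix checks ported to
str_ends_with() (the text does not name the function explicitly). The
sample strings are made up, and the snippet needs PHP 8 or a loaded
polyfill:

```php
<?php
// Sample values are invented for illustration.
$haystack = 'Example.php';
$needle = '.php';

// Older idiom: the needle expression appears twice (once for its length).
$old = substr( $haystack, -strlen( $needle ) ) === $needle;

// Replacement: native on PHP 8, otherwise provided by a polyfill.
$new = str_ends_with( $haystack, $needle );

// Case-insensitive suffix check kept on substr_compare(), avoiding strtolower().
$ci = substr_compare( 'EXAMPLE.PHP', $needle, -strlen( $needle ), strlen( $needle ), true ) === 0;
```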
Change-Id: I113a8d052b6845852c15969a2f0e6fbbe3e9f8d9
Now largely automated:
VARS=$(grep -o "'[A-Za-z0-9_]*'" includes/MainConfigNames.php | \
tr "\n" '|' | sed "s/|$/\n/;s/'//g")
sed -i -E "s/'($VARS)'/MainConfigNames::\1/g" \
$(grep -ERIl "'($VARS)'" includes/)
Then git add -p with lots of error-prone manual checking. Then
semi-manually add all the necessary "use" lines:
vim $(grep -L 'use MediaWiki\\MainConfigNames;' \
$(git diff --cached --name-only --diff-filter=M HEAD^))
I didn't bother fixing lines that were over 100 characters unless they
were over 120 and triggered phpcs.
Bug: T305805
Change-Id: I74e0ab511abecb276717ad4276a124760a268147
* Allow EditPage to create a user on page save. This has to be enabled
in config and then activated by the UI/API caller.
* Add an autocreate source for temporary users.
* Allow editing by anonymous users via automatic account creation when
$wgGroupPermissions['*']['edit'] = false. On an edit GET request, use
an unsaved placeholder user to stand in for post-create permissions.
* On preview or aborted save, the username to be created is stashed in a
session and restored on subsequent requests.
* On a (likely) successful page save, create the account.
* Put regular non-temporary users in a "named" group so that they can be
given additional permissions.
* Use a different "~~~" signature for temporary users.
* Show account creation warnings on edit and preview.
Change-Id: I67b23abf73cc371280bfb2b6c43b3ce0e077bfe5