Commit graph

2596 commits

Author SHA1 Message Date
Tim Starling
f270881ca2 Deprecate Parser::getFreshParser()
Following up on the comment I made at Ibbc1423166f4804a5122, make Parser
instance management a ParserFactory responsibility. It is weird for
Parser to have a ParserFactory proxy aspect.

* Add ParserFactory::getMainInstance(), which is equivalent to the old
  MediaWikiServices::getParser() and $wgParser.
* Add ParserFactory::getInstance(), which is equivalent to
  $wgParser->getFreshParser(), returning the main instance if it is
  free, or a new instance otherwise. The naming is supposed to encourage
  it as the default way to get a parser, which will help with the linked
  bug.
* Deprecate Parser::getFreshParser() and migrate all core callers.

I left the entry in ServiceWiring.php so that it's not immediately
necessary to migrate ObjectFactory specs that ask for Parser.

Bug: T310948
Change-Id: I762b191e978c2d1bbc9f332c9cfa047888ce2e67
2022-07-05 14:09:36 +10:00
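
A minimal sketch of the instance handling described in f270881ca2, assuming a parser exposes a flag that says whether it is mid-parse. SketchParser, SketchParserFactory and the $inParse property are hypothetical stand-ins for illustration, not MediaWiki's actual classes.

```php
<?php
// Hypothetical stand-ins; not MediaWiki's actual Parser or ParserFactory.
class SketchParser {
    /** True while this parser is in the middle of a parse (assumed flag). */
    public bool $inParse = false;
}

class SketchParserFactory {
    private ?SketchParser $mainInstance = null;

    /** Always create a brand-new parser. */
    public function create(): SketchParser {
        return new SketchParser();
    }

    /** Return the shared main parser, creating it lazily on first use. */
    public function getMainInstance(): SketchParser {
        if ( $this->mainInstance === null ) {
            $this->mainInstance = $this->create();
        }
        return $this->mainInstance;
    }

    /** Return the main parser if it is free, or a new parser otherwise. */
    public function getInstance(): SketchParser {
        $main = $this->getMainInstance();
        return $main->inParse ? $this->create() : $main;
    }
}

$factory = new SketchParserFactory();
$main = $factory->getMainInstance();
// The main parser is free, so getInstance() reuses it.
$sameAsMain = $factory->getInstance() === $main;
// Once the main parser is busy, getInstance() hands out a new one.
$main->inParse = true;
$freshWhileBusy = $factory->getInstance() !== $main;
```

Callers that only need "a parser" would use getInstance() and never have to ask whether the main instance is busy.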
daniel
2ba27ab06e Protect against passing unsupported content models to Parsoid.
Parsoid currently only supports wikitext (and JSON), so don't give it anything else.

NOTE: ParsoidOutputAccess will fail on content that is unsupported by parsoid.
This will however not affect the /transform and /page endpoints in the
parsoid extension, since they use the ParsoidHandler base class, which doesn't
rely on ParsoidOutputAccess.

Bug: T301371
Change-Id: I6bc9b978947b31455a4bce6385b7bdf64ed4043c
2022-06-30 14:54:42 +00:00
jenkins-bot
25517baeb5 Merge "Move knowledge about HTTP status out of ParsoidOutputAccess" 2022-06-28 21:15:45 +00:00
jenkins-bot
dd598ef412 Merge "Move access to the page bundle into ParsoidOutputAccess" 2022-06-28 21:11:44 +00:00
daniel
8ce08c0cbc Move knowledge about HTTP status out of ParsoidOutputAccess
This removes a cyclic dependency:
ParsoidHTMLHelper in the REST component uses ParsoidOutputAccess in the
parser component. So ParsoidOutputAccess cannot use LocalizedHttpException
from the REST component.

This also improves separation of concerns: the parsing component should
not be concerned with HTTP status codes.

Bug: T301371
Change-Id: I2e661fe3ce0824dbfd7579650972f9019c92ed59
2022-06-28 12:30:44 +02:00
jenkins-bot
6b57c06bf7 Merge "Storage: Warm parsoid parser cache with parsoid outputs" 2022-06-28 10:00:45 +00:00
daniel
1271faa381 Move access to the page bundle into ParsoidOutputAccess
This isolates ParsoidHTMLHelper from the internals of
ParsoidOutputAccess. The corresponding test cases were changed to use a
mock ParsoidOutputAccess, and to not test the behavior of
ParsoidOutputAccess.

Bug: T301371
Change-Id: Id693fae2264f15e5d35f28acc5adc4239b2ae24f
2022-06-28 11:49:36 +02:00
Derick Alangi
1854fb02d9 Storage: Warm parsoid parser cache with parsoid outputs
This patch introduces a ParsoidOutputAccess service for
getting parsoid outputs and warms the cache with pregenerated
outputs.

It also introduces a config variable in ParsoidCacheConfig that
is turned off by default for controlling the cache warming.

Bug: T301371
Change-Id: I6152c42ea765d94093d8d62598b1b4278314adec
2022-06-28 09:05:41 +00:00
daniel
65dee01426 ParserCache: ensure we know a revision ID
ParserCache::checkOutdated relies on ParserOutput::getCacheRevisionId() to determine
whether a revision is still current after loading it from the cache. If
the revision ID is 0 or null, this will result in false negatives, and
the revision will always be considered outdated.

It is better to detect and report this before writing the ParserOutput to the cache.

This also adds an assertion in DerivedPageDataUpdater that will trigger
an exception if we try to write to the parser cache before the revision
has been saved and the ID is known.

Change-Id: I242b769afbc7e1ae1e3f218d451f04945dfa8be4
2022-06-27 13:29:25 +00:00
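
The guard described in 65dee01426 could look roughly like the sketch below. The function name and exception type are hypothetical; only the rule itself (a revision ID of 0 or null must not reach the cache) comes from the commit message.

```php
<?php
// Hypothetical write guard; not the actual ParserCache code.
function assertCacheableRevisionId( ?int $revId ): int {
    // 0 and null both mean "no known revision". Such an entry could never
    // match the current revision when read back, so it would always be
    // treated as outdated.
    if ( $revId === null || $revId === 0 ) {
        throw new InvalidArgumentException(
            'Cannot cache parser output without a revision ID'
        );
    }
    return $revId;
}

$accepted = assertCacheableRevisionId( 12345 );

$rejected = false;
try {
    assertCacheableRevisionId( 0 );
} catch ( InvalidArgumentException $e ) {
    $rejected = true;
}
```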
jenkins-bot
56fab95e0b Merge "Do minor code cleanup" 2022-06-26 09:34:04 +00:00
jenkins-bot
cb91e78024 Merge "Ensure core compatibility with Parsoid external link attributes support" 2022-06-24 23:44:13 +00:00
Isabelle Hurbain-Palatin
61c14054f5 Ensure core compatibility with Parsoid external link attributes support
* Export nofollow and target settings in siteinfo API so that Parsoid's
  developer mode of ApiSiteConfig works.
* Implement SiteConfig::getNoFollowConfig and
  SiteConfig::getExternalLinkTarget, which are defined as abstract
  in the parent class in Parsoid.

Bug: T186241
Change-Id: I6a1f12335be19509d4c5a17e2cae96ecdb677103
2022-06-24 19:12:43 -04:00
Matěj Suchánek
1865180ae7 Do minor code cleanup
Remove dead code and fix typos. Should cause no change in behavior.

Change-Id: I5d293b842bc93a28b8bcd799a31b5e6e30fe692e
2022-06-24 13:52:42 +02:00
jenkins-bot
3ed9d3a6f9 Merge "Use the same tooltip for transcluded sections as normal ones" 2022-06-22 18:31:43 +00:00
Matěj Suchánek
3c1f8dadba Clean up LinkHolderArray::__construct
The class has been marked internal since 1.35.

Change-Id: I90bdf9d0637ffd770276bad3dd81c71b0a746cad
2022-06-18 10:33:34 +02:00
Subramanya Sastry
368e955d73 Followup to 5f5b4cbb: Unbreak Parsoid CI
The Parsoid repo's use of this method still relies on the
$wikitextOverride argument being handled properly, and $pageId may be a
LinkTarget rather than a PageIdentity.

This patch contains some "unnecessary" renames just to make the diff
with 5f5b4cbbb4a6214229a23062787c25acd4192ff7^ easier to read.

Change-Id: I79a2773bf6d2366593b31555afd0b548b66d222a
2022-06-17 16:18:44 -04:00
Subramanya Sastry
5f5b4cbbb4 Have Parsoid\Config\PageConfigFactory take a rev instead of wikitext
* This lets us pass mocked revisions in the parser test runner while
  running in Parsoid mode.

* This improves wt2html test results where a revision ID is queried.
  I've verified this in the Cite extension repo as well as the main
  parserTests.txt file, but I cannot enable Parsoid integrated testing
  on the main parser tests file without doing a sweep over all parser
  tests and adding appropriate test sections.

* Currently, PageConfigFactory doesn't have unit tests. Will look
  into adding them separately in a followup.

* Moved the setupParsoidTransform function to a more suitable place
  in the ParserTestRunner.php file.

Bug: T270310
Change-Id: I94d68c8528bb2f7b367c68d80d14ebc1ab904a7f
2022-06-15 22:55:28 -05:00
jenkins-bot
e3f5fdd8b1 Merge "Set alt in galleries, despite caption being visible" 2022-06-08 16:06:15 +00:00
Arlo Breault
0a82cbf301 Set alt in galleries, despite caption being visible
Similar to e7004a6 for tooltips.

Matches Parsoid commit I9d82c8003fd67cd984a8f4523e4993ed1f22b5d2

Bug: T297443
Bug: T162360
Bug: T63566
Change-Id: Ia3b13424d9f77586d334a477822e6e918d3b65c3
2022-06-07 18:20:26 -04:00
jenkins-bot
b494330aa7 Merge "ParserCache: always use JSON" 2022-06-07 14:12:29 +00:00
daniel
697f28df32 ParserCache: always use JSON
When JSON support was introduced into ParserCache in 1.36, it was
controlled by a feature flag, $wgParserCacheUseJson. The feature flag
was "born deprecated" in 1.36. It can now be removed.

This means that ParserCache will always store entries as JSON.
Support for reading old non-JSON entries remains intact.
This is needed when updating wikis from a version older than 1.36
to the current version.

Change-Id: Id04e42bfb458d98414bac50e0d6c505e8878e5c0
2022-06-07 15:19:45 +02:00
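
A rough sketch of "always write JSON, still read old entries", as described in 697f28df32. It assumes legacy entries were written with PHP's serialize() and that a failed JSON decode is enough to tell the two formats apart. The function names are hypothetical and the real ParserCache may detect the format differently.

```php
<?php
// Hypothetical encode/decode pair; not the actual ParserCache storage code.
function encodeEntry( array $entry ): string {
    // New entries are always written as JSON.
    return json_encode( $entry, JSON_THROW_ON_ERROR );
}

function decodeEntry( string $blob ): ?array {
    $decoded = json_decode( $blob, true );
    if ( is_array( $decoded ) ) {
        return $decoded;
    }
    // Legacy path (assumed): entries written by serialize().
    // Objects are refused so that untrusted blobs cannot instantiate classes.
    $legacy = @unserialize( $blob, [ 'allowed_classes' => false ] );
    return is_array( $legacy ) ? $legacy : null;
}

$fromJson = decodeEntry( encodeEntry( [ 'text' => 'hi' ] ) );
$fromLegacy = decodeEntry( serialize( [ 'text' => 'legacy' ] ) );
$fromGarbage = decodeEntry( 'not a cache entry' );
```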
Kosta Harlan
a9ee442f7f ParserOptions: Add fallback to enableMagicLinks
Due to quirks of bootstrapping process in PHPUnit, enableMagicLinks can
end up as null instead of an array with keys expected by ParserOptions.
As a workaround, set a fallback for each expected key.

Change-Id: I1511503937f8ac4fcd2f2c8b98bfd7dba17385ec
2022-06-03 14:43:16 +00:00
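
The per-key fallback from a9ee442f7f might be sketched as follows. The helper name is hypothetical, and defaulting every missing key to false is an assumption made for illustration; the key names follow the documented $wgEnableMagicLinks setting.

```php
<?php
// Hypothetical helper; not the actual ParserOptions code.
function normalizeMagicLinks( ?array $configured ): array {
    // Treat a null setting (e.g. from an incomplete bootstrap) as empty.
    $configured = $configured ?? [];
    // Fall back to false (assumed default) for every expected key.
    return [
        'ISBN' => $configured['ISBN'] ?? false,
        'PMID' => $configured['PMID'] ?? false,
        'RFC' => $configured['RFC'] ?? false,
    ];
}

$fromNull = normalizeMagicLinks( null );
$partial = normalizeMagicLinks( [ 'RFC' => true ] );
```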
jenkins-bot
5351d2cd7d Merge "Add {{=}} as a built-in magic word" 2022-05-26 17:32:32 +00:00
Nikki Nikkhoui
b5fe60a7e1 Introduce PageBundleJsonTrait for serialization
New trait for PageBundle class to serialize & deserialize
PageBundle object into json before stashing and after unstashing.

Change-Id: I486fab5b3d01bcef2b535af579cd9672403b2102
2022-05-23 17:54:48 +01:00
Derick Alangi
13f6ec9e1b Rest: Migrate parsoid stashing logic from RESTbase
Add stash option to /page/html & /revision/html endpoints.
When this option is set, the PageBundle returned by Parsoid is
stashed and an etag is returned that can later be used to
make use of the stashed PageBundle.

The stash is for now backed by the BagOStuff returned by
ObjectCache::getLocalClusterInstance().

This patch adds additional data to the ParserOutput stored in ParserCache.
Old entries lacking that data will be ignored.

Bug: T267990
Co-Authored-by: Nikki <nnikkhoui@wikimedia.org>
Change-Id: Id35f1423a69e3ff63e4f9883b3f7e3f9521d81d5
2022-05-23 17:28:29 +01:00
C. Scott Ananian
cf52f646bb Add {{=}} as a built-in magic word
This is a replay of 4bc0dc348a, which
was reverted in 9bd4fc0ae9 due to unexpected
use on Dutch Wiktionary.  In 1.36 deprecation warnings and a tracking
category were added if a wiki defined [[Template:=]] to expand to
anything other than `=` (see aeb3f45c20).
This patch follows up that deprecation by finally defining `{{=}}` as
a built-in, since the last usage on deployed wikis was cleaned up
sometime around February 2021 (list at
https://meta.wikimedia.org/wiki/Equals_sign_parser_function_template_conflicts
).

We've left the tracking category defined for now, so that any remaining
pages left in the tracking category on third-party wikis still retain
localized category documentation.  But it is expected that the next MW
release will also remove the tracking category.

Bug: T91154
Change-Id: I4717172f1d74d326212d51015a6cd87c3758f30d
2022-05-20 13:08:20 -04:00
Amir Sarabadani
d71b75ac7a parser: Avoid pushing the whole content to ParserObserver debug log
Bug: T305218
Change-Id: I9b59a5fc4b70c509ee121476dcc74301f9bcaa0b
2022-05-18 22:07:57 +00:00
jenkins-bot
df116c5002 Merge "Turn DefaultSettings.php into a deprecated stub" 2022-05-18 03:28:56 +00:00
jenkins-bot
8a97c6eefb Merge "Set tooltips in galleries, despite caption being visible" 2022-05-17 18:23:00 +00:00
jenkins-bot
543d09e9d8 Merge "Clarify tooltips are set if captions aren't visible" 2022-05-17 17:52:49 +00:00
Isabelle Hurbain-Palatin
1277d9f154 Still collect metadata on multiple writes
Follow-up to I9d1f0f6bab1305552a0350667d6142a24bc04049. That patch was
not collecting data at all (not even overwriting them over and over
again) - the assignment operation was, in practice, a NOP. This patch
fixes this.

Bug: T303014
Bug: T303015
Change-Id: I7d09b532f3270edf4327c16e032d665353d992f6
2022-05-17 11:14:51 -04:00
C. Scott Ananian
db492e204f ParserOutput: Ensure that array elements are always terminated with a comma
Change-Id: I47263fef3b6ad10ffcaa128ee415e560a3ed86c3
2022-05-17 11:14:43 -04:00
daniel
237bbf089f Turn DefaultSettings.php into a deprecated stub
DefaultSettings.php has been replaced by MainConfigSchema.
Loading DefaultSettings.php is deprecated.

Code that needs to have access to configuration defaults should use the
ConfigSchema service object.

Bug: T300129
Change-Id: I7b2c0ca95a78990be1cdb9dd9ace92f6dcf1af15
2022-05-17 16:50:56 +02:00
Arlo Breault
e7004a62d4 Set tooltips in galleries, despite caption being visible
Matches Parsoid commit Icbc36b6e9aa1b9f4f27c23f4833c626a725cc154

Bug: T297443
Bug: T108380
Change-Id: Ib6606dcbe8cf9476aaf694c4f86d5dc4768b9dd2
2022-05-16 19:25:49 -04:00
Arlo Breault
e2752a0dcf Clarify tooltips are set if captions aren't visible
Matches Parsoid commit Icbc36b6e9aa1b9f4f27c23f4833c626a725cc154

Bug: T297443
Bug: T108380
Depends-On: I896e2af2e8a712a36eb23a25cad08f53574fc044
Change-Id: I30eba0fb226971ddeda4eb240929e89ef7e5f45f
2022-05-16 19:25:27 -04:00
Derick Alangi
1618bbd671 Add data-parsoid data to ParserOutput for caching
NOTE: This changes the HTML returned by the endpoint!
It will now include the id="mwXYZ" attributes needed to
later map to data-parsoid entries.

Bug: T268205
Change-Id: I0a29434b996cc289eb67083e62bd6f1ad750cb4d
2022-05-16 15:06:15 +00:00
Bartosz Dziewoński
b19fcb64bf Use the same tooltip for transcluded sections as normal ones
Remove the code that outputs self-closing <mw:editsection ... /> tags
in Parser, previously used for transcluded sections.

Remove the ability to handle them in ParserOutput. We don't need
backwards-compatibility with cached content, because that feature did
not work correctly for several years: Remex-Tidy always expanded them
to normal open and close tags.

Remove handling for this case in skin code (and fix documentation).
These are backwards-compatible changes.

Depends-On: Idbf0b95a3c0b04caa056b71dd08f46659920114a
Bug: T306299
Change-Id: I3fac0f34d134d8eec46c7eefa3ad2b67abb957da
2022-05-14 02:44:46 +02:00
Bartosz Dziewoński
f7705d976a ParserObserver: Only report duplicate parse if the content is the same
Bug: T303596
Change-Id: Ib3b00a8cfabeb12723ac6a441495d72fd0c0ca92
2022-05-14 02:13:25 +02:00
Arlo Breault
2cff882c79 Support unlinked media |link=| in gallery
Change-Id: Ie142752f9af9af622b9ba8175fe4c8f155199bcb
2022-05-11 12:11:44 -04:00
Arlo Breault
2cd940e1f3 Remove redundant calls to add(External)Link
parseLinkParameter already calls these.

Follow up to 1c9664d

Change-Id: If4cbcc4608ac732710f8d9a7b833d900064f1bf9
2022-05-11 12:11:44 -04:00
Arlo Breault
6289e1919d Use Linker::getImageLinkMTOParams() for galleries
This ensures that nofollow is added where necessary and sets the
title attribute on links where appropriate, matching file links outside
of galleries.

Change-Id: I999a294bbcb921fe53823503e8009ab678411bb2
2022-05-11 12:11:43 -04:00
Matěj Suchánek
e47c441078 Fix many typos in comments
Found using IntelliJ's "Typo" code inspection.

Change-Id: I746220ebe6e1e39f6cb503390ec9053e6518cf16
2022-05-10 12:46:11 +00:00
Brian Wolff
bec8dada48 Clarify generate-html and make ParserOutput behave as expected
Previously:
* It was unclear that generate-html is an optional optimization
* Most of MediaWiki core was doing $parserOutput->setText('') if
  html wasn't generated. However this is wrong and will cause
  $parserOutput->hasText() to return true and also potentially cause
  cache pollution if a content handler both does that and supports
  parser cache (Like MassMessage; see T299896)
* The default value of mText in the constructor was '', and most
  of the time MW used that default. This doesn't seem right. If
  setText() is never called, the ParserOutput should not be considered
  to have text
* It was impossible to set mText to null, as $parserOutput->setText(null)
  was a no-op. Docs implied you were supposed to do this, so it was very
  confusing.

This patch clarifies docs, changes the default value for ParserOutput::$mText
from '' to null, and makes $parserOutput->setText(null) do what you
expect it to. The last two are arguably breaking changes, although
the previous behaviours were unexpected, mostly undocumented and
based on a code search do not appear to be relied on.

It seems like the main reason this only broke MassMessage is most
content handlers either don't support generateHtml, or they don't
support parser cache.

Bug: T306591
Change-Id: I49cdf21411c6b02ac9a221a13393bebe17c7871e
Depends-On: I68ad491735b2df13951399312a4f9c37b63a08fa
2022-05-03 11:23:08 +02:00
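
The text semantics that bec8dada48 aims for can be illustrated with a small stand-in class. SketchParserOutput is hypothetical and only models the null-versus-empty-string distinction; it is not the real ParserOutput.

```php
<?php
// Hypothetical stand-in; models only the text handling described above.
class SketchParserOutput {
    private ?string $text;

    public function __construct( ?string $text = null ) {
        // The default is null, meaning "no text was generated".
        $this->text = $text;
    }

    public function setText( ?string $text ): void {
        // null is stored as-is, so it clears any previously set text.
        $this->text = $text;
    }

    public function hasText(): bool {
        // An empty string still counts as text; only null does not.
        return $this->text !== null;
    }
}

$untouched = new SketchParserOutput();

$emptied = new SketchParserOutput();
$emptied->setText( '' );

$cleared = new SketchParserOutput( '<p>Hello</p>' );
$cleared->setText( null );
```

Under this model, a content handler that skips HTML generation leaves the text unset instead of calling setText(''), so hasText() stays false.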
jenkins-bot
4a8ee34f09 Merge "Use str_starts_with/str_ends_with" 2022-05-03 02:37:12 +00:00
Aryeh Gregor
7b4b0135b9 Use str_starts_with/str_ends_with
All the other ways of doing it were ridiculous and much harder to read,
and usually required repeating the needle expression (to get its
length). I found these occurrences by grepping for various expressions,
but I undoubtedly missed some.

I didn't try replacing the many instances of strpos(...) === 0 with
str_starts_with(...), because I think they're readable enough as-is
(although less efficient). Likewise I didn't try porting strpos(...) !==
false to str_contains(...). For case-insensitive comparisons, Tim
Starling requested that we stick with substr_compare() because it's more
efficient than calling strtolower().

On PHP < 8 these functions will be included with a polyfill via
vendor/autoload.php. This is included at the beginning of
includes/AutoLoader.php, so if our autoloader has been included the
polyfill will be available. This means it should be safe to call these
functions from any code that would not be usable without our autoloader.

Three uses that Tim Starling identified as being performance-sensitive
have been split out to a separate commit for porting after the switch to
PHP 8.

Change-Id: I113a8d052b6845852c15969a2f0e6fbbe3e9f8d9
2022-05-02 10:59:58 +03:00
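
For illustration, the kind of replacement made in 7b4b0135b9 looks roughly like this. The sample strings are made up, and the snippet assumes PHP 8 or a polyfill for the two functions.

```php
<?php
// Requires PHP 8, or a polyfill for str_starts_with()/str_ends_with().
$haystack = 'Template:Infobox';
$needle = 'Template:';

// Older idiom: the needle has to be repeated to compute its length.
$oldStyle = substr( $haystack, 0, strlen( $needle ) ) === $needle;

// Newer idiom: the intent is stated directly.
$newStyle = str_starts_with( $haystack, $needle );
$endsWith = str_ends_with( $haystack, 'Infobox' );
```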
Aryeh Gregor
b85391120b Use UrlUtils in Parser
Change-Id: I65f851ea29efe482ee225565a200d623fa85bc20
2022-04-28 17:14:51 +03:00
jenkins-bot
b978acab5a Merge "Fix output encoding of language converter display title" 2022-04-27 15:43:34 +00:00
Aryeh Gregor
7b791474a5 Use MainConfigNames instead of string literals, #4
Now largely automated:

VARS=$(grep -o "'[A-Za-z0-9_]*'" includes/MainConfigNames.php | \
  tr "\n" '|' | sed "s/|$/\n/;s/'//g")
sed -i -E "s/'($VARS)'/MainConfigNames::\1/g" \
  $(grep -ERIl "'($VARS)'" includes/)

Then git add -p with lots of error-prone manual checking. Then
semi-manually add all the necessary "use" lines:

vim $(grep -L 'use MediaWiki\\MainConfigNames;' \
  $(git diff --cached --name-only --diff-filter=M HEAD^))

I didn't bother fixing lines that were over 100 characters unless they
were over 120 and triggered phpcs.

Bug: T305805
Change-Id: I74e0ab511abecb276717ad4276a124760a268147
2022-04-26 19:03:37 +03:00
Tim Starling
d6a3b6cfa8 TempUser EditPage and permissions
* Allow EditPage to create a user on page save. This has to be enabled
  in config and then activated by the UI/API caller.
* Add an autocreate source for temporary users.
* Allow editing by anonymous users via automatic account creation when
  $wgGroupPermissions['*']['edit'] = false. On an edit GET request, use
  an unsaved placeholder user to stand in for post-create permissions.
* On preview or aborted save, the username to be created is stashed in a
  session and restored on subsequent requests.
* On a (likely) successful page save, create the account.
* Put regular non-temporary users in a "named" group so that they can be
  given additional permissions.
* Use a different "~~~" signature for temporary users
* Show account creation warnings on edit and preview.

Change-Id: I67b23abf73cc371280bfb2b6c43b3ce0e077bfe5
2022-04-26 14:10:53 +10:00
Alexander Vorwerk
ad0867dab9 parser: Emit deprecation warnings from ParsoidServices
ParsoidServices is no longer used in deployed code.

Change-Id: I68d146f577fb2aed1dca6f597d81148c3c6ade43
2022-04-21 18:23:13 +00:00