35 commits

Author SHA1 Message Date
Ed Sanders
4b9ccab9b9 ESLint: Enforce prefer-arrow-callback and autofix
Change-Id: Iddfa574e42e569ac5e2a2b098ad2f11ca80c5955
2024-06-11 19:03:54 +01:00
Lucas Werkmeister
a5c1fc67ee api-testing: Further increase ETag number in transform tests
Change I5521b7652f (commit 682a19e9f6) increased the number by roughly
one order of magnitude, which leaves me nervous that the tests might
start failing again if enough other tests are run in the same wiki to
let the increased revision IDs exist. Let’s bump it further to make that
much less likely (though not impossible).

1219844647 is an arbitrary large number; my intention in choosing a
random-looking number (rather than, say, 1234567890) is that it’s easier
to search for, both in codesearch and in commit messages, if it should
ever pop up in an error message.

Bug: T366142
Change-Id: I8186d9d46bc2a3f5ec04b38aab5cbe85e609835a
Follows-Up: I5521b7652faca9821fa08a9987a9452a4c555203
2024-06-10 16:50:41 +02:00
Arlo Breault
009edac867 Don't ignore offsetType attribute on lint API paths
The default used to be 'ucs2' when linting but
Ic9b7cc0fcf365e772b7d080d76a065e3fd585f80 stopped setting the offsetType
in the environment options, which changed the output to Parsoid's
default, 'byte'.

For the parsing paths, pagebundles are now post-processed and
non-pagebundles were left as a TODO.

However, for the lint paths, it was always ok to continue setting the
environment option because Parsoid is called directly and doesn't go
through the ParserOutputAccess.  The above patch was trying to limit
Parsoid specific options to that.
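
The difference between the two offset types can be illustrated with a
minimal sketch. The helper name ucs2ToByteOffset is hypothetical and is not
part of Parsoid; it only assumes that 'ucs2' offsets count UTF-16 code units
and 'byte' offsets count UTF-8 bytes, and that the offset does not fall
inside a surrogate pair.

```javascript
// Hypothetical helper (not part of Parsoid): convert an offset counted in
// UTF-16 code units ('ucs2') into an offset counted in UTF-8 bytes ('byte').
function ucs2ToByteOffset(text, ucs2Offset) {
	// Encode only the prefix before the offset and measure it in bytes.
	return Buffer.byteLength(text.slice(0, ucs2Offset), 'utf8');
}

// "\u00e9" (e with acute accent) is one UTF-16 code unit but two UTF-8 bytes.
const source = 'h\u00e9llo [[Foo]]';
const ucs2Start = source.indexOf('[[');
const byteStart = ucs2ToByteOffset(source, ucs2Start);

console.log({ ucs2Start, byteStart }); // { ucs2Start: 6, byteStart: 7 }
```

A client that expects 'ucs2' offsets but receives 'byte' offsets would point
one position too far into this string, and the gap grows with every
non-ASCII character that precedes the offset.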

Bug: T365284
Change-Id: I8389c2f53b399b39a9f1d908a38aecb3abcb15ef
2024-06-03 19:59:28 -04:00
Jakob Warkotsch
682a19e9f6 api-testing: Increase ETag number in transform tests
These tests fail for wikis with many revisions, and even in CI when
running after a suite that generates many edits such as Wikibase.

Bug: T366142
Change-Id: I5521b7652faca9821fa08a9987a9452a4c555203
2024-05-30 10:05:06 +02:00
Wendy Quarshie
2ef78ee34b Improve error localization in REST handlers
Bug: T358745
Change-Id: Icb804560c827ee3e5df56d9d8d9565b8157fa9e1
2024-04-02 16:49:40 +00:00
James D. Forrester
c2aa05102d build: Upgrade eslint-config-wikimedia from 0.25.0 to 0.26.0 and make pass
Mostly this has a bunch of whitespace changes from the
template-curly-spacing and brace-style rules being set
to align with other spacing rules.

Change-Id: I4609c52a4ef426ad1f35fb4bfe447bb08323a8e8
2023-11-22 13:25:32 -05:00
daniel
7b2d7abecb REST: enable parsoid transform endpoints
Bug: T350661
Change-Id: I7fe681b591eb1414cab6ff33cdc25410000b7cd6
2023-11-07 09:50:27 +01:00
daniel
7d0fb8d247 Parsoid API should return latest version instead of redirecting to it
Requests that only specify a title, no revision, should get the page
content directly.

Bug: T350359
Change-Id: Ia461cce0df63c05c6f8f94275f6b94f81323d171
2023-11-03 16:30:03 +00:00
daniel
e9aaa47b96 ParsoidHandler: emit relative URLs in redirects
This was missed when fixing a similar issue for core endpoints.

I also deleted some dead code in ParsoidHandler, instead of fixing
it to also generate relative redirects.

Bug: T350219
Bug: T349001
Change-Id: If3fe901723dcae9a22806650a21a79778354d8c5
2023-11-01 16:59:15 +00:00
Subramanya Sastry
1aa71cf51b Disable Parsoid support for non-default output versions and offset types
* This is in service of a followup patch that merges ParsoidOutputAccess
  and ParserOutputAccess. We want to eliminate all Parsoid-specific options
  that aren't part of ParserOptions and aren't easily supportable via
  html2html transforms.

* offsetType conversion relies on Parsoid code that is a bit entangled
  with env, siteconfig (and extension configs), page source, etc. It
  could all be refactored but once the html2html output transformation
  framework lands, we could potentially use that to call Parsoid to do
  these transforms by exposing such transforms to the framework.

* In this patch, outputContentVersion that isn't the default major HTML
  version is no longer supported. It could potentially be supported via the
  downgrade functionality in Parsoid in the future, or we might decide
  to re-enable multiple outputContentVersion selection in the future
  if such a use case arises. But, there are no plans to bump the major
  HTML version in the near future while we work on read views.

* Rather than delete associated tests, I've marked them skipped so that
  they can be re-enabled when this support is added back.

Bug: T347426
Change-Id: Ibede4acd68e944512f6d00763d29c6b1605d67eb
2023-09-27 15:03:41 -05:00
Umherirrender
e53a94a369 build: Fix or suppress eslint/stylelint warnings
Change-Id: If37e9b9d998660749402c173898eebd3da6ec105
2023-08-06 01:05:07 +00:00
jenkins-bot
5434c71393 Merge "Use Bcp47Code when interfacing with Parsoid" 2023-03-13 19:11:03 +00:00
C. Scott Ananian
5ad8dea80a Use Bcp47Code when interfacing with Parsoid
It is very easy for developers and maintainers to mix up "internal
MediaWiki language codes" and "BCP-47 language codes"; the latter are
standards-compliant and used in web protocols like HTTP, HTML, and
SVG; but much of WMF production is very dependent on historical codes
used by MediaWiki which in some cases predate the IANA standardized
name for the language in question.

Phan and other static checking tools aren't much help distinguishing
BCP-47 from internal codes when both are represented with the PHP
string type, so the wikimedia/bcp-47-code package introduced a very
lightweight wrapper type in order to uniquely identify BCP-47 codes.
Language implements Bcp47Code, and LanguageFactory::getLanguage() is
an easy way to convert (or downcast) between Bcp47Code and Language
objects.

This patch updates the Parsoid integration code and the associated
REST handlers to use Bcp47Code in APIs so that the standalone Parsoid
library does not need to know anything about MediaWiki-internal codes.
The principle has been, first, to try to convert a string to a
Bcp47Code as soon as possible and as close to the original input as
possible, so it is easy to see *why* a given string is a BCP-47 code
(usually, because it is coming from HTTP/HTML/etc) and we're not stuck
deep inside some method trying to figure out where a string we're
given is coming from and therefore what sort of string code it might
be.  Second, we've added explicit compatibility code to accept
MediaWiki internal codes and convert them to Bcp47Code for backward
compatibility with existing clients, using the @internal
LanguageCode::normalizeNonstandardCodeAndWarn() method.  The intention
is to gradually remove these backward compatibility thunks and replace
them with HTTP 400 errors or wfDeprecated messages in order to
identify and repair callers who are incorrectly using
non-standard-compliant language codes in web standards
(HTTP/HTML/SVG/etc).

Finally, maintaining a code as a Bcp47Code and not immediately
converting to Language helps us delay or even avoid full loading of a
Language object in some cases, which is another reason to occasionally
push Bcp47Code (instead of Language) down the call stack.

Bug: T327379
Depends-On: I830867d58f8962d6a57be16ce3735e8384f9ac1c
Change-Id: I982e0df706a633b05dcc02b5220b737c19adc401
2023-03-13 13:25:09 -04:00
Subramanya Sastry
071d368495 Revert "Revert "TransformHandler: Load stashed page bundle based on ETag.""
This reverts commit c4f40bd107.

Change-Id: Iff0f9859a83506059f100ddd60b74cfdd1279071
2023-03-10 13:44:46 -06:00
Subramanya Sastry
c4f40bd107 Revert "TransformHandler: Load stashed page bundle based on ETag."
This reverts commit ee8dd055c8.

Reason for revert: breaks officewiki

Bug: T331629
Depends-On: I46f16eae9c137d43aad22bfd4be460cfb635614b
Change-Id: Ieb0dedfb5ae3168749a9ab6d930be527337348e8
2023-03-09 15:36:54 +00:00
daniel
ee8dd055c8 TransformHandler: Load stashed page bundle based on ETag.
Allow clients to use an If-Match header with the
transform/html/to/wikitext endpoint.

Bug: T310464
Needed-By: Ifb1c40a0044f04fb339b00630fbca9190a1bce51
Change-Id: Ida81a314f015e205f2081c68a82d486145097c92
2023-01-24 20:26:05 +01:00
Daniel Kinzler
5cb388455b [Re-apply] ParsoidHandler: use HtmlOutputRendererHelper in wt2html
This restores change Ie430acd0753880d88370bb9f22bb40a0f9ded917.
This reverts commit ab6baad1a5.

NOTE: Also needs the patch that fixes the original reason for the
revert: Ief721c23ed9a57d781cfdac625a62113f22f87a5

Change-Id: Ic48db1b5fdff1dfd4f2d2643d64252e5fc721e79
2022-12-05 18:43:51 +00:00
Daniel Kinzler
ab6baad1a5 Revert "ParsoidHandler: use HtmlOutputRendererHelper in wt2html"
This reverts commit e82f11c246.

Reason for revert: Breaks parsoid CI

1) Parsoid round-trip e2e testing with MW REST endpoints
     rt-testing e2e:
     AssertionError: expected 1 to equal 0
     + expected - actual
     -1
     +0

     at Context.<anonymous> (tests/api-testing/RoundTrip.js:59:10)
     at processTicksAndRejections (internal/process/task_queues.js:95:5)

Change-Id: Ib94f964c2717885f777c1fe0c9c443cd6a5ed3ae
2022-12-01 21:17:34 +00:00
daniel
e82f11c246 ParsoidHandler: use HtmlOutputRendererHelper in wt2html
NOTE: This causes Parsoid output to be written to the parser cache.
This should be unconditional in the future, but for now it is
controlled by wgTemporaryParsoidHandlerParserCacheWriteRatio.

This change affects the following endpoints that use the wt2html method:
* /coredev/v0/transform/wikitext/to/html in core
* /{domain}/v3/transform/wikitext/to/html from parsoid
* /{domain}/v3/page/html/{title} from parsoid

The /v1/page/{title}/html endpoint is not affected, since it
doesn't use wt2html, but has always been using HtmlOutputRendererHelper
directly.

Bug: T322672
Depends-On: Ic37f606bb51504c8164d005af55ca9a65f595041
Change-Id: Ie430acd0753880d88370bb9f22bb40a0f9ded917
2022-12-01 10:14:49 +00:00
jenkins-bot
91a758ab85 Merge "Follow redirects for page/{title} formats html/with_html" 2022-11-17 00:58:46 +00:00
msantos
46071e9c3d Follow redirects for page/{title} formats html/with_html
* Apply Legacy Temporary redirects (302) if page is a redirect in order
to have feature parity with RESTBase

* Check for normalization redirects and execute permanent redirects (301)

* Add/Update mocha tests for the redirects functionality

* Add query parameter 'redirect=no' check to bypass redirect logic

* Unit tests to check status code and location headers
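
The redirect rules listed above can be sketched as a single decision
function. This is an illustration only: the function name, the shape of its
argument, the order of the checks, and the assumption that 'redirect=no'
suppresses both kinds of redirect are not taken from the handler code.

```javascript
// Hypothetical sketch of the redirect decision; not the actual handler logic.
function chooseRedirect({ requestedTitle, normalizedTitle, redirectTarget, query }) {
	// Assumption: 'redirect=no' bypasses all redirect logic.
	if (query.redirect === 'no') {
		return null;
	}
	// Normalization redirect: the title was not in its canonical form.
	if (requestedTitle !== normalizedTitle) {
		return { status: 301, location: normalizedTitle };
	}
	// Wiki redirect: the page itself redirects to another page.
	if (redirectTarget) {
		return { status: 302, location: redirectTarget };
	}
	return null;
}

console.log(chooseRedirect({
	requestedTitle: 'foo bar',
	normalizedTitle: 'Foo_bar',
	redirectTarget: null,
	query: {}
})); // { status: 301, location: 'Foo_bar' }
```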

Bug: T301372
Change-Id: I841c21d54a58e118617aaf5e2c604ea22914adaa
2022-11-16 17:50:03 +00:00
Abijeet
ad5b43f8a2 Api Testing: Enable some variant tests in Transform.js
Change-Id: Idf1f736eb958b093eafce4ee1cd138f52cfae986
2022-11-11 13:48:40 +05:30
daniel
c25380ca37 Parsoid: Fix e2e tests for size limits.
The tests for size limits did not catch an issue introduced by
If09afc4b933e, which caused resource limits to trigger early, since they
were now being compared to the size in bytes, rather than characters.

The reason the tests didn't protect us is threefold:
- They only check the error returned when the resource size is one over the limit.
  They don't check that the error is NOT returned when the size is one under the limit.
- They did not test with a multi-byte character.
- They were disabled, because the limits are quite high, and the e2e test cannot change them.

This patch is an attempt to fix all three issues.
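
A minimal sketch of the kind of boundary check described above, assuming
that "characters" means Unicode code points. The limit value and the function
names are made up for illustration and do not come from the test suite.

```javascript
// Hypothetical limit, far smaller than any real configuration value.
const MAX_CHARS = 10;

// Counts Unicode code points, so a multi-byte character counts once.
function exceedsCharLimit(text, maxChars) {
	return Array.from(text).length > maxChars;
}

// The regression described above: comparing the UTF-8 byte size instead.
function exceedsByteLimit(text, maxBytes) {
	return Buffer.byteLength(text, 'utf8') > maxBytes;
}

const multiByte = '\u00e9'; // one character, two bytes in UTF-8
const oneUnder = multiByte.repeat(MAX_CHARS - 1);
const oneOver = multiByte.repeat(MAX_CHARS + 1);

console.log(exceedsCharLimit(oneUnder, MAX_CHARS)); // false
console.log(exceedsCharLimit(oneOver, MAX_CHARS)); // true
console.log(exceedsByteLimit(oneUnder, MAX_CHARS)); // true: 18 bytes > 10
```

Only the combination of an under-the-limit input and a multi-byte character
exposes the byte-based comparison: with ASCII input both checks agree.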

Depends-On: I40901a1204b3c698895a836bf3b605239878d1fe
Change-Id: I2aead24cb7f47eb1267fdd2954a7c7e45dd4ed51
2022-10-14 13:22:12 +00:00
daniel
68ccdf26f9 html2wt: fall back to re-rendering if needed.
If we don't have a render id given, but we do have a revision id, we can
fall back to the current rendering of that revision to provide a baseline for
selser. This is better than no selser. On wikis that do not heavily rely
on templates, or where templates rarely change while an edit is in
progress, this will produce a clean diff.
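
The fallback order can be sketched as follows. The function name and the
returned object shape are hypothetical and only illustrate the decision; they
are not the handler's actual interface.

```javascript
// Hypothetical sketch: pick the baseline that selser should diff against.
function chooseSelserBaseline({ renderId, revisionId }) {
	if (renderId) {
		// Preferred: the exact rendering the client started editing from.
		return { source: 'stash', renderId };
	}
	if (revisionId) {
		// Fallback: re-render the revision and use that as the baseline.
		return { source: 'rerender', revisionId };
	}
	// No baseline available, so selser cannot be applied.
	return { source: 'none' };
}

console.log(chooseSelserBaseline({ revisionId: 1234 }));
// { source: 'rerender', revisionId: 1234 }
```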

Bug: T318398
Change-Id: If7612cc6e64d1f1243289b7d6ba96c71f09fe15d
2022-10-05 14:52:03 +00:00
daniel
f31cd9f1d3 REST: HtmlInputTransformHelper: Load original data from stash
Parsoid needs the original rendering in order to apply
selective serialization (selser). The page/{title}/html endpoint
can stash the rendering, and now the transform endpoint can make use
of the stashed rendering.

Bug: T310464
Change-Id: Ia58043ed3aa1eb12731d82aa87606c82ec63f663
2022-09-29 19:52:27 +02:00
daniel
d6140952ed HTMLTransform: do not presume wikitext
Parsoid supports other source formats besides wikitext.
This patch improves support for non-wikitext content by removing
assumptions about the source type.

Change-Id: I5480ff200a93026cea7f1542e12834b06ac6f730
2022-09-22 17:41:48 +01:00
daniel
6236b1f2f4 Move knowledge about the attribs array out of TransformContext
This renames TransformContext to HTMLTransformInput. It is becoming a
wrapper around the input HTML, with a bunch of optional context data
attached.

This introduces a factory method for HTMLTransformInput, so we can
extract knowledge about the structure of the $attribs array from
HTMLTransformInput.

This also allows us to inject Document objects and perhaps PageBundle
objects, instead of just arrays.

Change-Id: I66f9c5dbb50c6bf1f582adad7766422216482402
2022-07-22 12:12:55 +00:00
daniel
3f1cf31740 phpunit tests for ParsoidHandler::html2wt
The test cases were mostly ported from tests/api-testing/REST/Transform.js

Change-Id: Ie6b9f28b6e49e44c64f1fa73ca11e21c2b451474
2022-07-18 13:51:49 +02:00
daniel
0cdc7a73eb Transform e2e test: Move data into separate files
Move HTML and other complex input and output data from giant strings into files.
This makes them easier to read, modify, and compare.

Change-Id: Iafe2638e4eda903e1064b05adaa80a39ff5028f9
2022-07-08 20:08:16 +02:00
Derick Alangi
c1c5f3f7f3 Rest: Make transformation endpoints configurable
Page bundles should not be part of the publicly accessible
API and if we really need it, we can expose it and consume
internally. Hence, remove support for this transform from the
endpoint.

This patch falls back to the previous behavior for backwards
compatibility until we fully switch over to using the transforms
in core. We still need to support both core & restbase for now.

NOTE: API integration tests have been updated to status code
404 since we're removing support for it now. If we need it,
we'll expose via a different endpoint and introduce tests.

Bug: T311477
Change-Id: I5577f61f7ae7da2a4d3a78d9ce962997466d550c
2022-06-30 13:28:57 +01:00
Arlo Breault
2dbef8d9f0 Transform.js tests: soften one more schema version check.
Follow up to I6d7db6a05c48de8a57f83e4c8af38ab50271297a

Change-Id: I317ce587e62f9e94bbafbdabac64156237c4f1e3
2022-06-14 15:09:47 -04:00
jenkins-bot
65572907d2 Merge "Transform.js tests: soften schema version check." 2022-06-14 16:31:48 +00:00
daniel
dbd5506a4a Transform.js tests: soften schema version check.
This changes the Transform.js e2e test so that it does not break when the
Parsoid schema version increases. We are only guaranteeing stability
according to the semver caret semantics to clients, so that is what
the e2e test should assert.

This allows tests to pass when raising the version of the parsoid lib in
composer.json. This also allows core tests to pass when run against the
development branch of the parsoid repo.
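
A minimal sketch of a caret-style compatibility check, assuming plain
"major.minor.patch" version strings. The function names and the version
numbers are made up for illustration; this is not the assertion used in
Transform.js.

```javascript
// Parse "major.minor.patch" into three numbers; rejects anything else.
function parseVersion(version) {
	const match = /^(\d+)\.(\d+)\.(\d+)$/.exec(version);
	if (!match) {
		throw new Error(`Not a plain semver version: ${version}`);
	}
	return match.slice(1).map(Number);
}

// Returns -1, 0 or 1, comparing component by component.
function compareVersions(a, b) {
	for (let i = 0; i < 3; i++) {
		if (a[i] !== b[i]) {
			return a[i] < b[i] ? -1 : 1;
		}
	}
	return 0;
}

// True if `actual` is within the caret range of `base` (like ^base):
// at least `base`, with the leftmost non-zero component unchanged.
function satisfiesCaret(actual, base) {
	const a = parseVersion(actual);
	const b = parseVersion(base);
	if (compareVersions(a, b) < 0) {
		return false;
	}
	if (b[0] > 0) {
		return a[0] === b[0];
	}
	if (b[1] > 0) {
		return a[0] === 0 && a[1] === b[1];
	}
	return a[0] === 0 && a[1] === 0 && a[2] === b[2];
}

console.log(satisfiesCaret('2.4.0', '2.2.0')); // true: newer minor version
console.log(satisfiesCaret('2.1.0', '2.2.0')); // false: older than requested
console.log(satisfiesCaret('3.0.0', '2.2.0')); // false: major version changed
```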

Change-Id: I6d7db6a05c48de8a57f83e4c8af38ab50271297a
2022-06-14 17:38:41 +02:00
Derick Alangi
b0d08dcbfc Rest: Remove {domain} param from TransformHandler
RESTBase used this to check if the domain sent by the client
matches the domain in the restbase config under the hood. Since
this code is moving to core, this assertion is no longer needed.

Bug: T301370
Change-Id: I01a4f35b81c31d106671e5c829d317a41687fd7a
2022-06-02 19:37:51 +01:00
Derick Alangi
80ce1fe28f Rest: Move TransformHandler to core (part 1)
Begin moving the transform endpoints and handler class to
MediaWiki core.

Bug: T301370
Change-Id: I94e9d2e8d497c1992c542001afe333fa7537e553
2022-06-02 15:55:06 +01:00