If variant conversion is not supported by Parsoid, fall back to using
the old LanguageConverter.
We still call Parsoid to perform variant conversion in order to add
metadata that is missing when the core language converter is used.
Bug: T318401
Change-Id: I0499c853b4e301f135339fc137054bd760ee237d
Depends-On: Ie94aaa11963ec1e9e99136af469a05fa4005710d
This restores change Ie430acd0753880d88370bb9f22bb40a0f9ded917.
This reverts commit ab6baad1a5.
NOTE: Also needs the patch that fixes the original reason for the
revert: Ief721c23ed9a57d781cfdac625a62113f22f87a5
Change-Id: Ic48db1b5fdff1dfd4f2d2643d64252e5fc721e79
* Share logic previously implemented for html/with formats through
a trait class
* source/bare formats don't execute a temporary redirect. Instead, the
JSON body will contain a key "redirect_target" if a wiki
redirect is found
* Introduce PageRedirectHandlerTest to test redirect logic shared
between multiple handlers
* Move Handler instantiation to HandlerTestTrait
* Update api-testing tests in Update.js
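
The difference between the formats can be sketched as follows. This is a
hypothetical Python illustration, not the PHP handler code: the `Response`
shape, the `build_response` helper, and the use of a URL string as the
value of "redirect_target" are assumptions. Only the key name and the
source/bare versus redirecting formats come from the notes above.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Response:
    """Hypothetical response shape, for illustration only."""
    status: int
    headers: dict = field(default_factory=dict)
    body: dict = field(default_factory=dict)


def build_response(fmt: str, redirect_url: Optional[str]) -> Response:
    """Decide how a wiki redirect is surfaced for a given output format."""
    if redirect_url is None:
        # The page is not a wiki redirect: serve it normally.
        return Response(status=200)
    if fmt in ("source", "bare"):
        # source/bare: no HTTP redirect, report the target in the JSON body.
        return Response(status=200, body={"redirect_target": redirect_url})
    # Other formats: temporary HTTP redirect (assumed to be a 302 here).
    return Response(status=302, headers={"Location": redirect_url})
```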
Change-Id: Id66e33e19adabdb3c9621eaea4a5d441f23edafd
NOTE: This causes Parsoid output to be written to the parser cache.
This should be unconditional in the future, but for now it is
controlled by wgTemporaryParsoidHandlerParserCacheWriteRatio.
This change affects the following endpoints that use the wt2html method:
* /coredev/v0/transform/wikitext/to/html in core
* /{domain}/v3/transform/wikitext/to/html from parsoid
* /{domain}/v3/page/html/{title} from parsoid
The /v1/page/{title}/html endpoint is not affected, since it
doesn't use wt2html, but has always been using HtmlOutputRendererHelper
directly.
Bug: T322672
Depends-On: Ic37f606bb51504c8164d005af55ca9a65f595041
Change-Id: Ie430acd0753880d88370bb9f22bb40a0f9ded917
* Apply Legacy Temporary redirects (302) if page is a redirect in order
to have feature parity with RESTBase
* Check for normalization redirects and execute permanent redirects (301)
* Add/Update mocha tests for the redirects functionality
* Add query parameter 'redirect=no' check to bypass redirect logic
* Unit tests to check status code and location headers
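
The rules listed above can be sketched as a small decision function. This
is a hypothetical Python illustration, not the actual handler: the function
name and parameters are invented, and it assumes that 'redirect=no' bypasses
both kinds of redirect and that normalization is checked before wiki
redirects. Neither ordering nor scope is stated above.

```python
from typing import Mapping, Optional, Tuple


def redirect_decision(
    requested_title: str,
    normalized_title: str,
    wiki_redirect_target: Optional[str],
    query: Mapping[str, str],
) -> Optional[Tuple[int, str]]:
    """Return (status, target) if a redirect should be sent, else None."""
    if query.get("redirect") == "no":
        # Assumption: the bypass covers all redirect logic.
        return None
    if requested_title != normalized_title:
        # Normalization redirect: permanent.
        return (301, normalized_title)
    if wiki_redirect_target is not None:
        # The page is a wiki redirect: temporary, as RESTBase did.
        return (302, wiki_redirect_target)
    return None
```

For example, `redirect_decision("foo bar", "Foo_bar", None, {})` yields a
301 to the normalized title, while passing `{"redirect": "no"}` yields no
redirect at all.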
Bug: T301372
Change-Id: I841c21d54a58e118617aaf5e2c604ea22914adaa
The test case was too-strictly enforcing a case-sensitive match against
the BCP 47 language tag, which caused a spurious failure with the
latest version of Parsoid.
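
As a minimal sketch of the relaxed check (in Python for illustration; the
real test is not shown here and the helper name is hypothetical): BCP 47
tags are case-insensitive, so two tags that differ only in case should be
treated as equal.

```python
def same_language_tag(a: str, b: str) -> bool:
    """Compare two BCP 47 language tags, ignoring case.

    BCP 47 tags are ASCII, so lowercasing both sides is sufficient.
    """
    return a.lower() == b.lower()
```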
Change-Id: I2915e3bc288f4293e4ebf11ab68ccd6d020f2b8e
Variant conversion is based on the Accept-Language header. Updated
the HtmlOutputRendererHelper to set the HTTP headers related to
variant conversion.
Bug: T317019
Change-Id: I5e11452f1c531a757e8d860f9c727b5810406bce
The tests for size limits did not catch an issue introduced by
If09afc4b933e, which caused resource limits to trigger early, since they
were now being compared to the size in bytes, rather than characters.
The reason the tests didn't protect us is threefold:
- They only check the error returned when the resource size is one over the limit.
They don't check that the error is NOT returned when the size is one under the limit.
- They did not test with a multi-byte character.
- They were disabled, because the limits are quite high, and the e2e test cannot change them.
This patch is an attempt to fix all three issues.
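
The byte-versus-character mismatch can be illustrated with a short sketch.
This is Python rather than the PHP under test, and the function names and
the limit of 10 are made up for the example.

```python
def exceeds_limit(text: str, max_chars: int) -> bool:
    """Intended check: compare the length in characters."""
    return len(text) > max_chars


def exceeds_limit_bytes(text: str, max_chars: int) -> bool:
    """Faulty variant: compares the UTF-8 byte length against the limit."""
    return len(text.encode("utf-8")) > max_chars


LIMIT = 10  # illustrative value only

# "ü" is one character but two bytes in UTF-8.
under = "ü" * (LIMIT - 1)
over = "ü" * (LIMIT + 1)
```

With these inputs, `exceeds_limit(under, LIMIT)` is false, but
`exceeds_limit_bytes(under, LIMIT)` is true: the limit triggers early. A
test that only checks the "one over" case with single-byte input cannot
tell the two variants apart.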
Depends-On: I40901a1204b3c698895a836bf3b605239878d1fe
Change-Id: I2aead24cb7f47eb1267fdd2954a7c7e45dd4ed51
If we don't have a render id given, but we do have a revision id, we can
fall back to the current rendering of that revision to provide a baseline for
selser. This is better than no selser. On wikis that do not heavily rely
on templates, or where templates rarely change while an edit is in
progress, this will produce a clean diff.
Bug: T318398
Change-Id: If7612cc6e64d1f1243289b7d6ba96c71f09fe15d
Parsoid needs the original rendering in order to apply
selective serialization (selser). The page/{title}/html endpoint
can stash the rendering, and now the transform endpoint can make use
of the stashed rendering.
Bug: T310464
Change-Id: Ia58043ed3aa1eb12731d82aa87606c82ec63f663
Parsoid supports other source formats besides wikitext.
This patch improves support for non-wikitext content by removing
assumptions about the source type.
Change-Id: I5480ff200a93026cea7f1542e12834b06ac6f730
This renames TransformContext to HTMLTransformInput. It is becoming a
wrapper around the input HTML, with a bunch of optional context data
attached.
This introduces a factory method for HTMLTransformInput, so we can
extract knowledge about the structure of the $attribs array from
HTMLTransformInput.
This also allows us to inject Document objects and perhaps PageBundle
objects, instead of just arrays.
Change-Id: I66f9c5dbb50c6bf1f582adad7766422216482402
Move HTML and other complex input and output data from giant strings into files.
This makes them easier to read, modify, and compare.
Change-Id: Iafe2638e4eda903e1064b05adaa80a39ff5028f9
Page bundles should not be part of the publicly accessible
API. If we really need them, we can expose them and consume them
internally. Hence, remove support for this transform from the
endpoint.
For backwards compatibility, this patch falls back to the previous
behavior until we fully switch over to using the transforms
in core. We still need to support both core & restbase for now.
NOTE: API integration tests have been updated to expect status code
404 since we're removing support for it now. If we need it,
we'll expose it via a different endpoint and introduce tests.
Bug: T311477
Change-Id: I5577f61f7ae7da2a4d3a78d9ce962997466d550c
This patch changes the ETag emitted for HTML in GET requests to be
marked as a "strong" (byte-by-byte) ETag. This matches the behavior of
RESTBase.
This also adds checks for strong ETags to the e2e tests for the v1/page
and v1/revision endpoints. This is done for completeness: these endpoints
do not use the code changed here, so the tests would have passed without
the change to ParsoidHandler.
NOTE: This addresses the issue of weak ETags coming from parsoid
endpoints only for setups that do NOT use RESTBase and/or Varnish.
RESTBase never had the issue. Varnish apparently weakens any ETag when
applying compression.
Context: The ETags returned in response to GET requests to parsoid endpoints
are later used in an If-Match header in requests to the transform endpoint,
when transforming HTML to wikitext. The intent is for the request to
fail if the server doesn't have the "stash" entry identified by the
ETag, which would be needed to get a clean conversion using SelSer.
For this to work, the ETag must be "strong", that is, it must not have
the W/ prefix that marks an ETag as weak. The HTTP spec mandates that
If-Match headers using a weak ETag must always cause the request to
fail.
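
The strong comparison that If-Match requires can be sketched like this.
It is a simplified Python illustration of the rule from the HTTP spec, not
code from ParsoidHandler; the function names are hypothetical and the
comma-splitting ignores the corner case of commas inside an entity tag.

```python
def is_weak(etag: str) -> bool:
    """An entity tag is weak if it carries the W/ prefix."""
    return etag.startswith("W/")


def if_match_passes(if_match_header: str, current_etag: str) -> bool:
    """Evaluate an If-Match precondition using strong comparison.

    Strong comparison: both tags must be strong and identical
    character by character.
    """
    if if_match_header.strip() == "*":
        # "*" matches any existing representation.
        return True
    if is_weak(current_etag):
        # A weak server-side tag can never satisfy If-Match.
        return False
    # Simplification: assumes no commas inside the quoted tags.
    candidates = [tag.strip() for tag in if_match_header.split(",")]
    return any(not is_weak(tag) and tag == current_etag for tag in candidates)
```

Under this rule a client that echoes back `W/"abc"` always gets a failed
precondition, even if the server's own tag is also `W/"abc"`.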
Bug: T310710
Change-Id: I075b20f0937b51b7fcde2e9fa9ac1cec8c7eec87
This changes the Transform.js e2e test to not break when the parsoid
schema version increases. We are only guaranteeing stability
according to the semver caret semantics to clients, so that is what
the e2e test should assert.
This allows tests to pass when raising the version of the parsoid lib in
composer.json. This also allows core tests to pass when run against the
development branch of the parsoid repo.
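
A caret-range check of the kind described above might look like the
following. This is a minimal Python sketch, not the JavaScript used in
Transform.js; the function names are hypothetical, the version numbers in
the usage note are invented, and it assumes plain three-part numeric
versions without pre-release or build suffixes.

```python
from typing import Tuple


def parse_version(version: str) -> Tuple[int, ...]:
    """Split 'MAJOR.MINOR.PATCH' into a tuple of integers."""
    return tuple(int(part) for part in version.split("."))


def satisfies_caret(version: str, base: str) -> bool:
    """Return True if `version` falls within the caret range ^base.

    Caret semantics: the version must be at least `base`, and must not
    change the leftmost non-zero component of `base`.
    """
    v, b = parse_version(version), parse_version(base)
    if v < b:
        return False
    for i, part in enumerate(b):
        if part != 0:
            # Everything up to and including the leftmost non-zero
            # component must be unchanged.
            return v[: i + 1] == b[: i + 1]
    # base is 0.0.0: only an exact match is allowed.
    return v == b
```

For instance, `satisfies_caret("2.7.1", "2.6.0")` holds, while
`satisfies_caret("3.0.0", "2.6.0")` does not.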
Change-Id: I6d7db6a05c48de8a57f83e4c8af38ab50271297a
RESTBase used this to check if the domain sent by the client
matches the domain in the restbase config under the hood. Since
this code is moving to core, this assertion is no longer needed.
Bug: T301370
Change-Id: I01a4f35b81c31d106671e5c829d317a41687fd7a
These three endpoints have been experimental for many months:
revision/{id}
revision/{id}/html
revision/{id}/with_html
Promote them to officially released. This completes the
basic "revision" endpoint support, and helps clear out
the coreDevelopmentRoutes.json file for unrelated
experiments.
This also modifies the existing revision/{id}/bare endpoint
response, which previously pointed callers to the experimental
endpoint for html. It now points callers to the official one.
Bug: T305506
Change-Id: Iee8d1723e98dd3e3e389a0514dde28799914b2fd
SQLite and MySQL are returning results in different
orders for the REST Search handler, which affects how it handles
de-duplicating results with redirects.
If the Redirect Target is processed first, the matched_title
field is not populated because that page is not a redirect.
If the Redirect Source is processed first, the matched_title
field IS populated because that page is a redirect.
Either way, we don't have duplicate results (which is the most
important part). Until the logic is consistent, remove the
matched_title check.
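
The order dependence can be reproduced with a toy model. This is a
hypothetical Python sketch of a first-wins de-duplication, not the Search
handler's actual code; the row shape, the `dedupe` helper, and the page
titles are all assumptions made for illustration.

```python
from typing import Dict, List, Optional


def dedupe(rows: List[Dict[str, Optional[str]]]) -> List[Dict[str, Optional[str]]]:
    """Collapse a redirect and its target into one result, first row wins."""
    seen: Dict[str, Dict[str, Optional[str]]] = {}
    for row in rows:
        # A redirect is keyed by its target, a normal page by its own title.
        key = row["redirect_to"] or row["title"]
        if key in seen:
            continue
        seen[key] = {
            "title": key,
            # Only populated when the row itself is a redirect.
            "matched_title": row["title"] if row["redirect_to"] else None,
        }
    return list(seen.values())


target = {"title": "Target", "redirect_to": None}
source = {"title": "Source", "redirect_to": "Target"}

target_first = dedupe([target, source])
source_first = dedupe([source, target])
```

Both orderings yield a single result for "Target", but `matched_title` is
only set in `source_first`, which is why an assertion on that field passes
on one database backend and fails on the other.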
Bug: T302706
Change-Id: Ic3977655565aa9f6d6c184749706273b0315b7be
Partial revert of I0e30fdb6acba85cec4bb1499f7063ba6bfb0ffb2;
re-application of I3e1fe5e8112e3b1d487c46bc7fd8f924d65ce7fa and
I8f7b5e9257c7f283ded1a61e41d9e344f5cea67d
Bug: T301100
Change-Id: I90bea3fda48efdc7e915908d00535df220c4cc69
Enabling this setting will cause post-send deferred updates to be run
before a response is sent to the client, so the client can observe all
effects of their last request immediately.
This resolves a problem with some end-to-end tests that were failing
because the updates caused by one request had not landed in the database
by the time the subsequent request was made.
This patch re-enables some e2e tests that were disabled because of this
problem. If $wgForceDeferredUpdatesPreSend works as intended, the tests
should again pass reliably.
Bug: T230211
Bug: T301100
Change-Id: I0e30fdb6acba85cec4bb1499f7063ba6bfb0ffb2
Add a field to the response object of the REST
endpoint /search/page to display the title of the page
that the given page is a redirect to, or null if
the page is not a redirect.
Bug: T296671
Change-Id: I6673d50e8eae822455972403c82ec33e6ffce5dd