Commit graph

556 commits

Author SHA1 Message Date
jenkins-bot
c268687d46 Merge "Hard deprecate Sanitizer::removeHTMLtags()" 2022-03-08 19:29:55 +00:00
jenkins-bot
d1cfc0317d Merge "Add explicit casts between scalar types" 2022-03-08 17:32:26 +00:00
Umherirrender
6ea3d6ac2c Add explicit casts between scalar types
php internal functions like floor/round/ceil documented to return
float, most cases the result is used as int, added casts

Found by phan strict checks

Change-Id: I92daeb0f7be8a0566fd9258f66ed3aced9a7b792
2022-03-08 16:59:01 +00:00
C. Scott Ananian
d6576e5dc6 Hard deprecate Sanitizer::removeHTMLtags()
Rename Sanitizer::removeHTMLtags() into an @internal method named
::internalRemoveHtmlTags() so that we can deprecate external use.

Code search:
https://codesearch.wmcloud.org/deployed/?q=removeHTMLtags&i=nope&files=&excludeFiles=&repos=

Followup-To: Ic864c01471c292f11799c4fbdac4d7d30b8bc50f
Depends-On: Iaca83ed06e9c61d8366579cd2283cba653c82319
Depends-On: I1963bfe9a99198ea02ca482a5769467ce806cd58
Depends-On: I83923d8b38d33f3638cd53958dd10f257ec21f7c
Depends-On: I018b34bb5f6e113056da9b04cc72d4318422adce
Change-Id: I202826f8b27519f7be89643e24eda47a6e3fc9f6
2022-03-07 22:04:56 -05:00
C. Scott Ananian
9f14fbd002 Add Sanitizer::removeSomeTags() which uses Remex to tokenize
The existing Sanitizer::removeHTMLtags() method, in addition to having
dodgy capitalization, uses regular expressions to parse the HTML.
That produces corner cases like T298401 and T67747 and is not guaranteed
to yield balanced or well-formed HTML.

Instead, introduce and use a new Sanitizer::removeSomeTags() method
which is guaranteed to always return balanced and well-formed HTML.

Note that Sanitizer::removeHTMLtags()/::removeSomeTags() take a callback
argument which (as far as I can tell) is never used outside core. Mark
that argument as @internal, and clean up the version used by
::removeSomeTags().

Use the new ::removeSomeTags() method in the two places where
DISPLAYTITLE is handled (following up on T67747).  The use by the
legacy parser is more difficult to replace (and would have a
performace cost), so leave the old ::removeHTMLtags() method in place
for that call site for now: when the legacy parser is replaced by
Parsoid the need for the old ::removeHTMLtags() will go away.  In a
follow-up patch we'll rename ::removeHTMLtags() and mark it @internal
so that we can deprecate ::removeHTMLtags() for external use.

Some benchmarking code added.  On my machine, with PHP 7.4, the new
method tidies short 30-character title strings at a rate of about
6764/s while the tidy-based method being replaced here managed 6384/s.
Sanitizer::removeHTMLtags blazes through short strings 20x faster
(120,915/s); some of this difference is due to the set up cost of
creating the tag whitelist and the Remex pipeline, so further
optimizations could doubtless be done if Sanitizer::removeSomeTags()
is more widely used.

Bug: T299722
Bug: T67747
Change-Id: Ic864c01471c292f11799c4fbdac4d7d30b8bc50f
2022-03-04 14:06:02 -05:00
jenkins-bot
24aa34d06c Merge "phpcs: Disable Generic.Files.LineLength for test files" 2022-02-21 15:51:29 +00:00
jenkins-bot
99dee6855a Merge "Change return value of ParserOutput::getPageProperty() when property is missing" 2022-02-19 00:49:48 +00:00
C. Scott Ananian
c39ef6c6c9 Change return value of ParserOutput::getPageProperty() when property is missing
The old ParserOutput::getProperty() method returned `false` when a property
was missing.  This requires callers to use the `?:` syntax to supply default
values, which then causes any falsey value to be treated as missing.
So, for example, setting the defaultsort to '0' will cause the default
sort to be ignored.

Modern php convention is to use `null` for missing values, and the `??`
syntax is a better/more restrictive alternative to `?:`.

We renamed `ParserOutput::getProperty()` to `::getPageProperty()` in
1.38 (Ie963eea5aa0f0e984ced7c4dfa0fd65d57313cfa/T287216) but kept the
return value convention.  Before this actually makes it into a 1.38
release, take the opportunity to fix the return value for the new
`ParserOutput::getPageProperty()` method to return `null` when the
property is missing.

We need to do some temporary workarounds to the places we'd
already swapped over to use the new `::getPageProperty()` method
to allow them to handle either `false` or `null` as a return value;
we'll clean that up once this is merged.

Code search:
https://codesearch.wmcloud.org/deployed/?q=-%3EgetPageProperty%5C%28|T301915&i=nope&files=&excludeFiles=&repos=

Bug: T301915
Depends-On: I3f11ce604970e47b41fc1c123792df8c3045626f
Depends-On: Ie7533f49fe4cad01ebfda29760d23c61e9867b10
Depends-On: Ic5c09f5caa4c897bc553c614fbae9cee159566a2
Depends-On: I0278b2eafd90e77e4fee41c45a1165fb79ddf47e
Depends-On: I383abb6b7dc5e96c0061af13957609f6e31a1065
Depends-On: I79f9f4078e415284af29b15047bafd1c823d7f5b
Depends-On: I02276c48c49f5d2d241a69eb0a6cdf439b572d8b
Depends-On: I71628661b4539a4e35ae32846e719f92bcf782e0
Depends-On: I7e215cb43de0ce150a6bcc00f92481dcdcfed383
Change-Id: Iaa25c390118d2db2b6578cdd558f2defd5351d15
2022-02-18 21:15:58 +00:00
Timo Tijhof
8d406bbcd6 phpcs: Disable Generic.Files.LineLength for test files
There is a common and reasonable need for longer lines in tests.
The nudge for shorter lines doesn't seem valuable here. The natural
breaks will likely still fall in 80-100 given the enforced practice
for non-test code, e.g. whether through habit, or 80-100 column markers
in text editors, or the finite width of diff and code review
interfaces.

Change-Id: I879479e13551789a67624ce66f0946d2f185e6ee
2022-02-18 18:32:05 +00:00
C. Scott Ananian
3c211fdb3c Update ParserCache serialization test cases to use valid category keys
Category keys are supposed to be non-null strings.

The test cases use bogus integer values, which causes issues when
refactoring more strictly enforces validity checks on category sort
key values.

Change-Id: If2937a694ba6bd4c522336f33aa58d40949b5a54
2022-02-17 12:12:53 -05:00
C. Scott Ananian
b46dfcc351 Update ParserCache serialization test cases to use a valid index policy
Valid values for ParserCache::$mIndexPolicy are '', 'noindex', and 'index'.

The test cases use the bogus value 'policy1', which causes issues when
refactoring more strictly enforces validity checks on index policy
values.

Change-Id: I2d00ff4e3ded043d18942c8482a39fac14ec60bc
2022-02-09 12:47:27 -05:00
Func
7f74a2e50c Clean up tests that misused the parameters of assertSame/Equals
Expected value is the first parameter to assertSame() or assertEquals().
And turn to use assertCount() for some assertions aginst count of array.

Based on code search `assert(?:Same|Equals)\(.+,.+expected` and I look
through files roughly, so some assertions that don't contains 'expected'
are also fixed. In the meantime, some assertions that I am not clear
about are not touched.

Change-Id: I75798b60d29fd19b33f4fdf34ed3c788db420d01
2022-02-08 07:21:10 +00:00
jenkins-bot
44c1145dff Merge "Add ParserOutput::appendExtensionData()" 2022-02-04 22:20:47 +00:00
jenkins-bot
fe5f09812d Merge "Add ParserOutput::{set,append}JsConfigVar()" 2022-02-04 21:15:56 +00:00
C. Scott Ananian
baaee141e4 Add ParserOutput::appendExtensionData()
Soft-deprecate the use of ::setExtensionData() to destructively update
the value stored under a single key.  Add the new
::appendExtensionData() method to use where multiple values are
desired.  This accomodates the asynchronous and incremental parsing
goals on the Parsoid roadmap.

Bug: T300981
Change-Id: I2dea4ba71ea506428854a9983c1abd906b2efd5f
2022-02-04 13:43:22 -05:00
C. Scott Ananian
0f5dc718ce Add ParserOutput::{set,append}JsConfigVar()
Deprecate ParserOutput::addJsConfigVars() and add setter methods which
better ensure that the ParserOutput contents are independent of parse
order.  This accomodates the asynchronous and incremental parsing goals
on the Parsoid roadmap.

Bug: T300307
Change-Id: I4f08d1098da211f7bf5c43c08c620de224cbf37f
2022-02-04 13:42:59 -05:00
daniel
026133bb05 remove access to config globals from includes/parser
Loops ServiceOptions through to CoreParserFunctions and CoreTagHooks to
avoid access to the main config from static methods.

Bug: T294739
Change-Id: Ia6c97f2d0952964c2ad6189f8053ad127589b37c
2022-02-01 07:48:57 -08:00
Alexander Vorwerk
decbaf4f38 phpunit: use ->getServiceContainer() in integration tests
Change-Id: I38299cb65eeaadfdc0eb05db4e8c0b0119cfb37d
2022-01-27 22:04:16 +01:00
Alexander Vorwerk
1f78d6a249 ParserCacheSerializationTestCases: call ::addModule(Style)?s with an array
Bug: T299865
Change-Id: Ifb0dd97c7023154ba1d834e574a913cfe9ff0f1f
2022-01-23 17:26:32 +01:00
Reedy
044259b8d5 ParserOutputTest: Call ParserOutput::addModule(Style)?s with an array
Bug: T299747
Change-Id: I14bb2b12f515369a3890e70e8effeef4c501ecbd
2022-01-21 10:40:46 +00:00
Kosta Harlan
0c2cc804e1 phpunit: Use is_file/is_dir instead of file_exists
Yes, it's a micro-optimization. See https://bugs.php.net/bug.php?id=78285
and https://thephp.cc/articles/caching-makes-everything-faster-right
for more info.

Change-Id: Ib8e8e9794e15066476f35cdb1236df8b983274d6
2022-01-03 21:47:56 +01:00
Thiemo Kreuz
b4c63c64ae Remove some more comments that literally repeat the code
Nothing to learn from these.

You can find a longer explanation in the comments in I93751e6.

Change-Id: I195aae70fc282b58be5b18160783f27d38605d15
2021-12-09 19:01:36 +01:00
Ppchelko
643fc535c3 Reapply "Move limit report rendering to ParserOutput"
This reverts commit 2bcb3fe567.

Reason for revert: this is a good change,
just needed more work to not break CI

Change-Id: I23768bee242e3cf81b1493a740cf070e7ad1e224
2021-11-09 11:08:08 -08:00
Ppchelko
2bcb3fe567 Revert "Move limit report rendering to ParserOutput"
This reverts commit 89028e0b8e.

Reason for revert: Temporary until we deal with T295357

Change-Id: I556de18dbf900a9bc58d5ae22d1bf194682d0840
2021-11-09 15:57:18 +00:00
Petr Pchelko
89028e0b8e Move limit report rendering to ParserOutput
This does not move the actual limit report data into
ParserOptions yet, that should be done separately
given that it will require serialization changes.
Let's get this change settled first before messing
with serialization.

This unifies canonical and non-canonical ParserOptions,
so ParserCache can now be used with both. It is hard
to say how this will affect the ParserCache capacity,
so we should monitor it after releasing this.

Change-Id: I154c0a77a5b0287b5572614d56339fb57ac56c33
2021-11-08 12:45:41 -08:00
jdlrobson
24949480eb Give skins more flexibility over table of contents render
* Do not store table of contents in parser output
* Instead inject table of contents via strpos where needed
  inside Article based on Skin "toc" option
* Use <mw:tocplace> as a TOC placeholder; for Parsoid compatibility
  this will be replaced with a <meta> tag in a followup patch.

Bug: T287767
Change-Id: I44045b3b9e78e7ab793da3f37e3c0dbc91cd7d39
2021-10-25 22:26:41 +00:00
Umherirrender
b581e6d40e Hide deprecation on tests for ParserOutput::addWarning
Change-Id: Ifc5f3ef93720a944f0e0fffc1666047b46f1683b
2021-10-20 21:32:49 +02:00
C. Scott Ananian
4834340ec0 Deprecate ParserOutput::addWarning() in favor of ::addWarningMsg()
Encourage localization and factor out common code by taking a message
key as the first argument to ::addWarningMsg() instead of a wikitext
string.  This also plays nicer with Parsoid by separating out the
localization code from the parse.

Bug: T293515
Change-Id: I6a7c04c67ac586ab00d4edcbb3d09485a7794e23
2021-10-15 16:06:13 -04:00
C. Scott Ananian
6bf78153bf Hard deprecate ParserOutput::addTrackingCategory()
Non-WMF-deployed patches:
I403b646852064fed22e8e9bbfadd221990363881 (CollaborationKit)

Code search:
https://codesearch.wmcloud.org/search/?q=%5BOo%5Dutput%28%5C%28%5C%29%29%3F-%3EaddTrackingCategory%5C%28&i=nope&files=&excludeFiles=&repos=

Depends-On: I339a14907c8059c52ffce5d8a554e97f1f0b58d6
Depends-On: Ie6a09796b908dd105ec17d5a1f0f9586eed9bd8c
Depends-On: I983d370a03413086acafe7e67a2b812400e1a38d
Depends-On: Ie41ad70f1fb5de202355314a96a1243234fb267c
Depends-On: I715f788c5026a6c07d4995980a4fcc6c8b84e2b4
Depends-On: I1330a806febf898b3a0a2998fac857f0f737a0a3
Change-Id: I340c138f26e25ce971a46470a4da02d011b5a638
2021-10-15 14:18:26 -04:00
C. Scott Ananian
9632dfb041 Move ::addTrackingCategory() implementation to TrackingCategories
This moves the implementation of ParserOutput::addTrackingCategory()
to the TrackingCategories class as a non-static method. This makes
invocation from ParserOutput awkward, but when invoking as
Parser::addTrackingCategory() all the necessary services are
available.  As a result, we've also soft-deprecated
ParserOutput::addTrackingCategory(); new users should use the
TrackingCategories::addTrackingCategory() method, or else
Parser::addTrackingCategory() if the parser object is available.

The Parser class is already kind of bloated as it is (alas), but there
aren't too many callsites which invoke
ParserOutput::addTrackingCategory() and don't have the corresponding
Parser object handy; see:

https://codesearch.wmcloud.org/search/?q=%5BOo%5Dutput%28%5C%28%5C%29%29%3F-%3EaddTrackingCategory%5C%28&i=nope&files=&excludeFiles=&repos=

Change-Id: I697ce188a912e445a6a748121575548e79aabac6
2021-10-15 14:17:58 -04:00
C. Scott Ananian
5ae946d3a6 Rename ParserOutput::setCategoryLinks() and ::getCategoryLinks()
Make ::setCategory() consistent with the corresponding singular method,
which is ::addCategory(), not ::addCategoryLink().  Also, don't return
a value.

This renaming is in preparation for factoring out a write-only base
class from ParserOutput suitable to be used by Parsoid.

Note that OutputPage does distinguish a 'category link' from a
'category list', and there are separate OutputPage::getCategories()
and OutputPage::getCategoryLinks() methods.  However, the category
map in ParserOutput isn't exactly the same as either of these:
it's actually a map (or list of pairs) of category name to sort key.

Rename ParserOutput::getCategoryLinks() to ::getCategoryNames()
in order to clarify that the concept involved is not the same as
the OutputPage "category links" methods.

Code search:
https://codesearch.wmcloud.org/deployed/?q=-%3E(get%7Cset)CategoryLinks%5C(&i=nope&files=&excludeFiles=&repos=

(Note that many of the code search matches are for the methods in
OutputPage, which we are trying to disambiguate here.)

Bug: T287216
Change-Id: Idb383d3d9ef7b76f8a0208a057a3cb8c639465c9
2021-10-15 09:45:36 -07:00
jenkins-bot
8ab400e920 Merge "Sanitizer: Replace RFC 3454 by RFC 8264 for clearUrl" 2021-10-14 16:15:44 +00:00
jenkins-bot
388e5f200c Merge "Sanitizer: Use \u{xxxx} syntax in cleanUrl" 2021-10-14 16:07:43 +00:00
jenkins-bot
2504a3bb68 Merge "ParserOutput: remove Title from public interface" 2021-10-14 16:02:28 +00:00
daniel
845b36c7bb ParserOutput: remove Title from public interface
Usages of Title in the public interface of ParsrOutput are being
replaced with LinkTarget or PageReference, as appropriate.

Change-Id: I571cf18fd7e666a59ef3947b09c2e75357bcac04
2021-10-14 10:46:53 +00:00
C. Scott Ananian
3056737420 Make Parser::$mStripState private
This property was deprecated in 1.35.  The replacement function
Parser::getStripState() was introduced in MediaWiki 1.34.

Code search:
https://codesearch.wmcloud.org/search/?q=-%3EmStripState&i=nope&files=&excludeFiles=&repos=

Depends-On clauses below are for WMF-deployed code.  Other uses in
non-WMF-deployed code have been patched in:
* https://github.com/SemanticMediaWiki/SemanticMediaWiki/pull/4936
* Idc2fadf5105d6eb30777a16dff0035bceff17174 (BlueSpiceSocial)
* I130fd61a8fe2d28e6b116a3fcc767b8abd466cea (ContributionScores)
* I3676fe9882ce9de5732cb7230528134df544ff98 (HierarchyBuilder)
* Ic392afd1e93ae0003fd0ab65114ec1ff38bb2927 (Mpdf)
* I4b01017da752def982777c4fea5fad5e21e4c7ea (MsLinks)
* I09726078ee62eb99e032b8faa5f938e20107f48c (Negref)
* Ibfd6b7064a8e650c3492e0d2764d4f7afc4937bf (PageForms)
* Ia865435688d36178508f21cffae79538c919035c (PageInCat)
* Ib94db0e6d365e4cb3f51121340a04d31b88add62 (ParserFun)
* I8660c0691b7e9842106d7dcb224ff5ecf374e4bc (PhpTags)
* I1ad5f78e5a937767123400ceca4967941e256e5e (RandomImageByCategory)
* I4539e7cea597f71b2a2d9a6cae137bc25085ed6b (ReplaceSet)
* If8ff2e21952b3f08d3a8950d42e2afb56973fb89 (SemanticDrilldown)
* I4a5bd64760cdde5b614a7d4e2b09e8d0634b2056 (SemanticPageSeries)
* Ia04f1aac1d8ae4ea16c98cfbbe72195fffe653b6 (SemanticRating)
* Id2a2e2d024922e3babf756ebae1a4f59b4358146 (Spark)
* I4a979024b18ec4834dc06b51ee0f018d749c6dab (Tooltip)
* Iaf179914863998b32bfecc16c874c3cffd6c26e9 (VIKI)
* I2de0e7a6c133c2e1f3cb7502a81d809c4489db4c (XSL)
* https://gitlab.com/hydrawiki/extensions/characterescapes/-/merge_requests/1 (characterescapes)
* https://github.com/JeroenDeDauw/Validator/pull/38 (Validator)
* https://github.com/lingua-libre/CustomSubtitle/pull/3 (CustomSubtitle)
* https://github.com/mkroetzsch/AutoCreatePage/pull/12 (AutoCreatePage)
* https://gitlab.com/nonsensopedia/extensions/advancedbacklinks/-/merge_requests/95 (advancedbacklinks)
* https://github.com/vonloxley/Shariff-Mediawiki/pull/16 (Shariff-Mediawiki)

Bug: T275160
Depends-On: I062ac8b69756a7ad35d8cc744b4735fd2e70f13e
Depends-On: Ic4be2bad176f2c59a1104219be10045cd5929261
Depends-On: I3cb117a91c8c57331b6b513f64ddb68d6ae2758c
Depends-On: I67b5926f2f851b3dc709d044eec5dd3df5065482
Depends-On: I7806068e1cd6e4da66adfe7bb75095d4bfb5d6bc
Depends-On: I429da35ca4e276c852b8d6ee102ff19f742c22c0
Change-Id: I4af85a46cfcafba15aa5ee50fda9f7b04681d6e6
2021-10-13 19:58:50 -04:00
Petr Pchelko
a1aa3e0827 Hard-deprecate all public property access on CacheTime and ParserOutput.
- Added a test where ParserOutput objects with CacheTime
properties set are unserialized from previous versions.
- Generate new serialization tests for 1.38

Now all serialization in production is JSON, so changing
property visibility shouldn't affect ParserCache.

Bug: T263851
Depends-On: I283340ff559420ceee8f286ba3ef202c01206a23
Change-Id: I70d6feb1c995a0a0f763b21261141ae8ee6dc570
2021-10-13 13:27:16 -04:00
C. Scott Ananian
af5d13c5de Rename ParserOutput::{get,set,unset}Property to {get,set,unset}PageProperty
The ::getProperty() naming is too generic and doesn't clearly indicate
that these are "page properties" (which have their own table in the DB).
As part of refactoring a clean API out of ParserOutput which can be used
by Parsoid, clean up the naming here.

Soft-deprecation in this patch, there are a handful of external users
which need to be cleaned up before we hard-deprecate.

Bug: T287216
Change-Id: Ie963eea5aa0f0e984ced7c4dfa0fd65d57313cfa
2021-10-08 10:07:17 -04:00
jenkins-bot
e5b8997536 Merge "Remove "auto-number headings" preference" 2021-10-05 19:18:40 +00:00
Amir Sarabadani
649bbdd6c5 Remove "auto-number headings" preference
Bug: T284921
Change-Id: Ic9ed88f419419cf4cc5cc32010539eea8b76314b
2021-10-03 00:47:08 +02:00
C. Scott Ananian
df3cc40fac Rename ParserOutput::{allow,prevent}Clickjacking() -> ::{get,set}PreventClickjacking()
This name is consist with the rest of the setter and getter methods
in ParserOutput.  Renamed the methods in OutputPage, ImageHistoryList,
ImageHistoryPseudoPager, and ContribsPager as well for consistency;
it also makes chasing down lingering references in codesearch easier.

Soft-deprecated the old name for 1.38.  Hard-deprecation will follow,
but there are a number of users in production that should be chased
down first.

Code search:

https://codesearch.https://codesearch.wmcloud.org/deployed/?q=(allow%7Cprevent)Clickjacking&i=nope&files=&excludeFiles=&repos=

Bug: T287216
Change-Id: I9822c60c180d204bd30cb4447a1120155d456da4
2021-10-01 14:13:47 -04:00
C. Scott Ananian
db81b56adf Rename ParserOutput::hideNewSection() -> ::setHideNewSection()
This name is consist with the rest of the setter and getter methods
in ParserOutput (note that ParserOutput::getHideNewSection() already
exists and is consistently named).

Hard deprecated the old name for 1.38.  Rarely used outside core, and
a pull request already created for the one outside user:
https://github.com/SkizNet/mediawiki-WikiMirror/pull/15

Code search:
https://codesearch.wmcloud.org/search/?q=hideNewSection&i=nope&files=&excludeFiles=&repos=

Bug: T287216
Change-Id: Ia553373eef78f875a83ad0eebfe2e465ce33272f
2021-09-29 17:47:54 -04:00
jenkins-bot
eaf10f600b Merge "Expand local URLs to absolute URLs in ParserOutput" 2021-09-27 14:48:02 +00:00
Alexander Vorwerk
04dfdc3653 Hard deprecate User::setOption()
deprecated since 1.35

Bug: T277818
Change-Id: Ic251d624e5d6fa857aa92f9c5dd3df44714ac610
2021-09-26 17:18:54 +02:00
Petr Pchelko
d334de960a Expand local URLs to absolute URLs in ParserOutput
New option 'absoluteURLs' was added to getText method
of the ParserOutput object that replaces all links
in the page HTML with absolute URLs.

Removing the action=render special case from Title
seems safe cause we will end up replacing the result
with absolute URL if we're in a render action no matter
where Title::getLocalUrl was called from.

This change is safely revertable from the perspective
of ParserCache.

Bug: T263581
Change-Id: Id660e1026192f40181587199d3418568f0fdb6d3
2021-09-23 11:48:51 -07:00
Umherirrender
2e4ee47c3d Cleanup mixed space/tab line indent
Change-Id: I833052a656b1ce419c0929f6f0514f2a33c2c4cc
2021-09-04 00:52:31 +02:00
Fomafix
4706b9bfb5 Sanitizer: Replace RFC 3454 by RFC 8264 for clearUrl
RFC 3454 is obsoleted by RFC 7564 obsoleted by RFC 8264.

The mapping section 3.1 "Commonly mapped to nothing"
https://datatracker.ietf.org/doc/html/rfc3454#section-3.1
was replaced by section 9.13. "PrecisIgnorableProperties (M)"
https://datatracker.ietf.org/doc/html/rfc8264#section-9.13
which uses the Default_Ignorable_Code_Point from
https://www.unicode.org/Public/UCD/latest/ucd/DerivedCoreProperties.txt

U+1806 (MONGOLIAN TODO SOFT HYPHEN) is not included anymore.

Change-Id: Ib86e619ba9181cdc7f731a96c684a119527e3b76
2021-08-26 21:02:21 +00:00
Fomafix
c2a598d5eb Sanitizer: Use \u{xxxx} syntax in cleanUrl
Also add a simple testcase to verify the handling of these Unicode
characters.

Change-Id: I611a7cf816bb6f6e61b834e48be28943f73e0908
2021-08-26 21:02:21 +00:00
vladshapik
1091f7753f Hard-deprecate Parser::mUser public access, Parser::getUser and ParserOptions::getUser
Bug: T285713
Depends-On: Ie75c9cd66d296ce7cf15432e2093817e18004443
Change-Id: I4297aea3489bb66c98c664da2332584c27793bfa
2021-08-17 15:42:05 +00:00
Petr Pchelko
79475042a5 ParserOptions: support setting a default for lazy options
If the default is not set for a lazy loaded option,
canonical parser output key ends up being 'lazy_option=default',
because when computing the keys we exclude defaults.

If on the other hand we register a default for a lazy option,
it does not get loaded, since in lazyLoadOption we believe
the default is already the loaded value.

Change-Id: I92b3e18fabef4eecac2ec2a4844f1be2716e5d89
Needed-By: I3bce04684070ad306685dabbc51267def25773cd
2021-08-10 10:22:05 -07:00