Commit graph

100 commits

Author SHA1 Message Date
Umherirrender
e662614f95 Use explicit nullable type on parameter arguments
Implicitly marking parameter $... as nullable is deprecated in PHP 8.4,
the explicit nullable type must be used instead

Created with autofix from Ide15839e98a6229c22584d1c1c88c690982e1d7a

Break one long line in SpecialPage.php

Bug: T376276
Change-Id: I807257b2ba1ab2744ab74d9572c9c3d3ac2a968e
2024-10-16 20:58:33 +02:00
C. Scott Ananian
8409b9ebd3 JsonCodec: fix ${var} deprecation notice in error message
Change-Id: I55c86098f4fae8a2e686e48cb74c4fbd3530ce51
2024-10-16 01:51:35 -04:00
C. Scott Ananian
3bc172d0e4 [JsonCodec] Use wikimedia/json-codec to implement JsonCodec
This adds support for serializing/deserializing objects which
implement the JsonCodecable interface from the wikimedia/json-codec
library used by Parsoid.  JsonCodecable allows customizing the encoding
of objects of a given class using a class-specific codec object, and
JsonCodecable is an interface which is defined and can be used outside
mediawiki core.

In addition json-codec supports deserialization in the presence of
aliased class names, fixing T353883.

Backward and forward compatibility established via the mechanism
described in
https://www.mediawiki.org/wiki/Manual:Parser_cache/Serialization_compatibility

Test data generated by this patch was added in
I109640b510cef9b3b870a8c188f3b4f086d75d06 to ensure forward
compatibility with the output after this patch is merged.

Benchmarks:
                        PHP 7.4.33          PHP 8.2.19          PHP 8.3.6
                      BEFORE    AFTER     BEFORE    AFTER     BEFORE    AFTER
Serialize:            926.7/s  1424.8/s   978.5/s  1542.4/s  1023.5/s  1488.6/s
Serialize (assoc):    930.2/s  1378.6/s   974.6/s  1541.9/s  1022.4/s  1463.4/s
Deserialize:         1942.7/s  1961.3/s  2118.8/s  2175.9/s  2129.8/s  2063.5/s
Deserialize (assoc): 1952.0/s  1905.7/s  2107.5/s  2192.1/s  2153.3/s  2011.1/s

These numbers definitely do not have as many significant digits as
written here.  But they should be sufficient to demonstrate that
performance is not impaired by this patch and in fact serialization
speed improves slightly.

Bug: T273540
Bug: T327439
Bug: T346829
Bug: T353883
Depends-On: If1d70ba18712839615c1f4fea236843ffebc8645
Change-Id: Ia1017dcef462f3ac1ff5112106f7df81f5cc384f
2024-10-15 20:09:51 -04:00
Bartosz Dziewoński
df4cbf5ac6 Replace gettype() with get_debug_type() in debug/log/test output
get_debug_type() does the same thing but better (spelling type names
in the same way as in type declarations, and including names of
object classes and resource types). It was added in PHP 8, but the
symfony/polyfill-php80 package provides it while we still support 7.4.

Also remove uses of get_class() and get_resource_type() where the new
method already provides the same information.

For reference:
https://www.php.net/manual/en/function.get-debug-type.php
https://www.php.net/manual/en/function.gettype.php

In this commit I'm only changing code where it looks like the result
is used only for some kind of debug, log, or test output. This
probably won't break anything important, but I'm not sure whether
anything might depend on the exact values.

Change-Id: I7c1f0a8f669228643e86f8e511c0e26a2edb2948
2024-07-31 19:33:57 +02:00
Bartosz Dziewoński
c045fa0291 Replace gettype() with get_debug_type() in exception messages
get_debug_type() does the same thing but better (spelling type names
in the same way as in type declarations, and including names of
object classes and resource types). It was added in PHP 8, but the
symfony/polyfill-php80 package provides it while we still support 7.4.

Also remove uses of get_class() and get_resource_type() where the new
method already provides the same information.

For reference:
https://www.php.net/manual/en/function.get-debug-type.php
https://www.php.net/manual/en/function.gettype.php

To keep this safe and simple to review, I'm only changing cases where
the type is immediately used in an exception message.

Change-Id: I325efcddcb58be63b1592b9c20ac0845393c15e2
2024-07-31 19:24:39 +02:00
James D. Forrester
19f4e6945a Rename JsonUnserial… to JsonDeserial…
This is to make it clearer that they're related to converting serialized
content back into JSON, rather than stating that things are not
representable in JSON.

Change-Id: Ic440ac2d05b5ac238a1c0e4821d3f2d858bc3d76
2024-06-12 14:50:58 -04:00
C. Scott Ananian
f66bda6a2e [JsonCodec] Hide TYPE_ANNOTATION from the unserialization methods
Change-Id: Ia32f95a6bdf342262b4ef044140527f0676402b9
2024-05-22 10:41:23 -04:00
C. Scott Ananian
c5cc43348a [JsonCodec, ParserCache] Improve debugging of serializability failures
Bug: T365036
Change-Id: I6c4c2a6a48d3bca4ade76a05bbd81cb4968872a3
2024-05-16 14:49:21 -04:00
Ebrahim Byagowi
a717db8e60 Add namespace and deprecation alias to FormatJson
This patch adds the MediaWiki\Json namespace declaration to
FormatJson and establishes a class alias marked as
deprecated since version 1.43.

Bug: T353458
Change-Id: I5e1311e4eb7a878a7db319b725ae262f40671c32
2024-05-16 16:28:01 +03:30
Umherirrender
8d97313f81 Fix some line indent
Change-Id: I8f82724197d20f9289d80e138d80310f1eab29f2
2024-04-20 00:25:15 +02:00
C. Scott Ananian
dbc75831fe [JsonCodec] throw JsonException now that we require PHP >= 7.4
Also fixes JsonCodeTest::testInvalidJsonData() which was misusing the
data provided by ::provideSimpleTypes().

Change-Id: Ia654359e0fdec3ad546e8bea2e9133c142f0f144
2024-01-08 20:03:12 +00:00
C. Scott Ananian
057ea0fcd9 Protect against ParserOutput re-namespacing
Follow-up to 9bfb75ff90.

Bug: T353835
Change-Id: I230ff033fb7b52d542f9f76f88704007d9ef5b4b
2023-12-20 20:29:13 +00:00
Amir Sarabadani
f4e68e055f Reorg: Move Status to MediaWiki\Status\
This class is used heavily basically everywhere, moving it to Utils
wouldn't make much sense. Also with this change, we can move
StatusValue to MediaWiki\Status as well.

Bug: T321882
Depends-On: I5f89ecf27ce1471a74f31c6018806461781213c3
Change-Id: I04c1dcf5129df437589149f0f3e284974d7c98fa
2023-08-25 15:44:17 +02:00
Umherirrender
ee73e6ac1b Remove unused local variable assignment
Dead code found by phan

Change-Id: I9fc404d546a4fb1c61394cb6359eb774fd94383a
2023-02-04 22:16:31 +01:00
thiemowmde
d4d070c6e9 json: Don't try to deserialize actual user-land instances
A valid JSON serialization is an instance of PHP's stdClass. A check
with is_object() is not sufficient in this case because it includes
everything else that's also a class in PHP.

This should help to uncover programming errors like the one in
I969d8c4.

Bug: T312589
Change-Id: I917d49944497b19909a9a1d1e2861e86e7a0aca8
2023-01-26 21:17:21 +00:00
C. Scott Ananian
96e4f5d840 JsonCodec: fix en/decoding of nested objects and stdClass objects
Add a type annotation when encoding `stdClass` objects so that we can
be sure to decode them as objects instead of arrays.

This avoids issues such as that seen in the Graph extension (T312589)
where an extension data key is stored as a stdClass.  If ParserOutput
was computed fresh, a subsequent getExtensionData(..) call will return
a stdClass object, but if the ParserOutput was cached, getExtensionData()
would return an array.  After this change the return type is always
consistent.

Properly handle nested objects: encode all object values returned by
JsonSerializable::jsonSerialize() (so that client is not responsible
for implementing this correctly), and decode all object values *before*
calling JsonUnserializable::newFromJsonArray (again, so that the
client is not responsible for decoding its property values).  The new
behavior matches how serialize/unserialize is handled in the 'naive'
JsonUnserializable{Sub,Super}Class test cases; ParserOutput (the only
user of JsonCodec in core) was doing an extra manual decode for
the ExtensionData array in ParserOutput::initFromJson that is no longer
necessary.

The GrowthExperiments and SemanticMediaWiki extensions were working
around the non-recursive nature of JsonCodec; this patch depends on
patches to GrowthExperiments to make it agnostic about whether object
unserialization occurs before or after ::newFromJsonArray() is called,
which can then be further cleaned up once this is released.
A pull request for SemanticMediaWiki has also been submitted.

Bug: T312589
Depends-On: I3413609251f056893d3921df23698aeed40754ed
Change-Id: Id7d0695af40b9801b42a9b82f41e46118da288dc
2023-01-12 14:12:32 -05:00
Reedy
a3095fbb94 Add return type to jsonSerialize()
Bug: T311919
Change-Id: I469deae973ab58ef41aac6a56cea0653a988c05c
2022-07-02 15:34:02 +00:00
Umherirrender
69b6c4983d Pass array to Assert::parameterType when asserting multiple types
Change-Id: I6db78db18b2d8982ce5158f44c03bfdb8d48f97c
2022-06-18 09:34:36 +02:00
Aryeh Gregor
1560b98225 Type hints for ArrayAccess and JsonSerializable
These two interfaces' methods have tentative return types in PHP 8.1,
which causes code without the type hints to raise warnings. Where the
type hint is "mixed", we need to use the special declaration
#[\ReturnTypeWillChange] in a comment to suppress the warning as long as
we still support PHP < 8.0, which doesn't have a "mixed" type hint.

Bug: T289879
Change-Id: I1a126e602e92b8d13c7795eb6d790effd5ddc986
2022-04-11 15:06:27 +03:00
Kevin Israel
210a34369a FormatJson: Optimize encode() for supported PHP versions
- Removed the str_replace() call to replace unescaped line terminators
  if UTF8_OK is set. PHP 7.1 and later escape these by default.

  The speedup isn't much at all (about 1% in my testing when encoding an
  API siteinfo result taken from enwiki). Perhaps it's not surprising
  given the way str_replace() works[1]. Still, it's better not to spend
  CPU time looking for characters that will not occur.

- Changed the algorithm for the optional spaces-to-tabs conversion when
  pretty printing. Instead of replacing one indent level throughout the
  entire string before replacing the next level, use a regex to replace
  in one pass. This is usually faster now that PHP 7 enables PCRE's JIT
  compiler by default. Without JIT, the regex was often slower.

  The speedup can be large for deeply nested data. For example, in my
  testing the languages/i18n data took about 8% less time to encode as
  tab-indented JSON, yet the API site info result took about 45% less.
  (This, of course, isn't actually relevant to the API even when pretty
  printed output is requested, because ApiFormatJson uses the default
  indent string of four spaces, which will always be faster unless
  support for tab indentation is added to PHP's json extension.)

- Set options using if statements instead of the ternary operator. This
  is the clearer way, and maybe the slightly faster one, skipping the
  assignment when the flags do not need to be set.

[1]: https://github.com/php/php-src/blob/PHP-8.0.10/ext/standard/string.c#L2969
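
The one-pass indent conversion described in the second bullet can be sketched as follows. This is a minimal Python illustration, not the FormatJson code: the function name encode_with_tabs is hypothetical, and it assumes the pretty printer emits four-space indents and escapes newlines inside strings, so every physical line starts with structural indentation only.

```python
import json
import re


def encode_with_tabs(value):
    """Pretty-print JSON with four-space indents, then convert indents to tabs."""
    text = json.dumps(value, indent=4)
    # Single pass: match the whole run of leading four-space groups on each
    # line and replace it with the same number of tab characters.
    return re.sub(
        r"^(?: {4})+",
        lambda m: "\t" * (len(m.group(0)) // 4),
        text,
        flags=re.MULTILINE,
    )


if __name__ == "__main__":
    print(encode_with_tabs({"a": [1, {"b": 2}]}))
```

Because the pattern is anchored at the start of each line, runs of spaces inside string values are left untouched.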

Change-Id: Iebb1df0264e335a1819956710eeacf6d6b8f1471
2021-08-20 08:03:11 -04:00
Kevin Israel
b084f499db FormatJson: Add message for JSON_ERROR_INVALID_PROPERTY_NAME
The comment added in b9461e3f1c is incorrect. This is actually a
decode error, so is relevant to FormatJson::parse().

Change-Id: I3cc33f0f260c0ba4fe96fb75565f52d089b9a975
2021-08-16 10:57:14 -04:00
libraryupgrader
5357695270 build: Updating dependencies
composer:
* mediawiki/mediawiki-codesniffer: 36.0.0 → 37.0.0
  The following sniffs now pass and were enabled:
  * Generic.ControlStructures.InlineControlStructure
  * MediaWiki.PHPUnit.AssertCount.NotUsed

npm:
* svgo: 2.3.0 → 2.3.1
  * https://npmjs.com/advisories/1754 (CVE-2021-33587)

Change-Id: I2a9bbee2fecbf7259876d335f565ece4b3622426
2021-07-22 03:36:05 +00:00
Petr Pchelko
7c87764400 JsonCodec: verify expected class before attempting to unserialize it
Change-Id: I69ea087e5014931176e05924026e96bb7c893bea
2021-06-08 21:20:05 -07:00
Umherirrender
2579ca623a build: Updating mediawiki/mediawiki-codesniffer to 34.0.0
Change-Id: I2fb18ddd4c144655a665792901e59f88bcd906dc
2020-12-07 14:55:24 +01:00
Petr Pchelko
dbdc2a3cd3 Introduce JsonCodec to help with serialization/deserialization
Change-Id: I5433090ae8e2b3f2a4590cc404baf838025546ce
2020-11-19 08:32:21 -07:00
daniel
a2ae4192c0 ParserOutputAccess: cache output for old revisions
DEPLOY: Set $wgOldRevisionParserCacheExpireTime = 0 in production first!

Bug: T267832
Depends-On: I3c73f5d9f6a54e2736600e8f9506659a3fb0e7f6
Change-Id: I0fe275b4991f1bf89c7bb587132bc4fb0ea862e2
2020-11-17 20:52:35 +00:00
Petr Pchelko
7c68ae9296 Safe ParserOutput extension data and JsonUnserializable helper.
One major difference with what we've had before is that now we
actually write class names into the serialization - given that
this new mechanism is extensible, we can't establish any kind
of mapping of allowed classes. I do not think it's a problem
though.

Bug: T264394
Change-Id: Ia152f3b76b967aabde2d8a182e3aec7d3002e5ea
2020-11-10 11:21:09 -07:00
Petr Pchelko
1c70cca3ee Check if non-JSON-serializable data passed to ParserOutput
Bug: T264394
Change-Id: I6eedd03a81b95f6f55d25c00b31e01cbd8658d43
2020-10-05 10:54:08 -06:00
Ed Sanders
0cf40a4f7a Flip Yoda conditionals
Change-Id: Id3495b6f15c267123c89f3a0ace496e6ecbeb58e
2020-07-22 17:49:12 +01:00
Reedy
4cd8d9cff5 Fix numerous PSR12.Properties.ConstantVisibility.NotFound
Change-Id: I2ec09c02c2e4ed399d993cb1871e67df02167ca8
2020-05-11 01:36:36 +01:00
Kunal Mehta
b9461e3f1c FormatJson: Improve parse() error code handling and tests
Three of the errors are encode errors that won't be emitted when we're
trying to decode JSON, so we can ignore those lines of code.

JSON_ERROR_UTF16 is a new error code in PHP 7.0, so add that in.

Improve test coverage while we're at it. The UTF16 test case was
copied from php-src/ext/json/tests/bug62010.phpt.

Change-Id: I79aa0db3d967d512611f8521bb052af36c3cda8e
2020-01-01 02:34:44 -08:00
Max Semenik
8a98dd9d59 Convert some private static arrays to constants
Remove @since for some private ones as we don't guarantee anything
about private class members.

Change-Id: Ifb898353c02082e9ef69d67f69339345c6cd154d
2019-10-16 01:30:54 +00:00
Bill Pirkle
5a166f00d8 Comments, tests, and tweaks for JSON decoding quirks
PHP JSON decoding has surprising behavior on some edge cases.
Documented this via comments, added related tests, and tweaked
related CommentStore code.

Bug: T206411
Change-Id: I6927fdaf616b37a04d81a638a0ed257afac9b844
2018-11-07 13:04:21 -06:00
RazeSoldier
24ffbd9bd1 Use "break" instead of "continue"
Inside a switch, "continue" is equivalent to "break". PHP 7.3 generates a warning for this.

Bug: T200595
Change-Id: I244ecb2e1ce5a76295f014fb1becd8d263196846
2018-08-24 00:18:07 +08:00
Kevin Israel
381858ab52 FormatJson: cleanup after PHP 5.5 support removal
* Use PHP 5.6 constant expression support in definition of ALL_OK.
* Remove one level of nesting in encode(). Follows up I801eaffc.
* Update HTML5 section number in doc comment for XMLMETA_OK.
* Made other minor doc comment fixes, such as capitalizing "JSON".
* Not done: changing $badChars and $badCharsEscaped to constants.
  This will have to wait until HHVM 3.18 support is dropped.

Change-Id: I06413dfe0fedddfd20d3e375eadd9daad6d6230e
2018-06-09 09:06:02 -04:00
Bartosz Dziewoński
0313128b10 Use PHP 7 "\u{NNNN}" Unicode codepoint escapes in string literals
In cases where we're operating on text data (and not binary data),
use e.g. "\u{00A0}" to refer directly to the Unicode character
'NO-BREAK SPACE' instead of "\xc2\xa0" to specify the bytes C2h A0h
(which correspond to the UTF-8 encoding of that character). This
makes it easier to look up those mysterious sequences, as not all
are as recognizable as the no-break space.

This is not enforced by PHP, but I think we should write those in
uppercase and zero-padded to at least four characters, like the
Unicode standard does.

Note that not all "\xNN" escapes can be automatically replaced:
* We can't use Unicode escapes for binary data that is not UTF-8
  (e.g. in code converting from legacy encodings or testing the
  handling of invalid UTF-8 byte sequences).
* '\xNN' escapes in regular expressions in single-quoted strings
  are actually handled by PCRE and have to be dealt with carefully
  (those regexps should probably be changed to use the /u modifier).
* "\xNN" referring to ASCII characters ("\x7F" and lower) should
  probably be left as-is.

The replacements in this commit were done semi-manually by piping
the existing "\xNN" escapes through the following terrible Ruby
script I devised:

  chars = eval('"' + ARGV[0] + '"').force_encoding('utf-8')
  puts chars.split('').map{|char|
    '\\u{' + char.ord.to_s(16).upcase.rjust(4, '0') + '}'
  }.join('')
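
For readers who do not use Ruby, roughly the same conversion can be sketched in Python. This is an illustrative equivalent under stated assumptions, not the script that was actually run: the function name is hypothetical, and it only looks at "\xNN" escapes in the input, ignoring any other characters.

```python
import re


def hex_escapes_to_unicode_escapes(escaped):
    # Example: the text \xc2\xa0 (two escaped bytes) becomes \u{00A0}.
    # Collect the bytes named by each \xNN escape in the input text.
    raw = bytes(int(h, 16) for h in re.findall(r"\\x([0-9A-Fa-f]{2})", escaped))
    # Interpret the bytes as UTF-8; this raises UnicodeDecodeError for
    # binary data that is not valid UTF-8, which must not be converted.
    chars = raw.decode("utf-8")
    # Uppercase hex, zero-padded to at least four digits.
    return "".join("\\u{%04X}" % ord(c) for c in chars)
```

Code points above U+FFFF simply come out with more than four digits.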

Change-Id: Idc3dee3a7fb5ebfaef395754d8859b18f1f8769a
2018-06-04 16:20:13 +00:00
Fomafix
bb52950fee Remove workaround for PHP bug 66021 (PHP < 5.5.12)
The PHP bug 66021 <https://bugs.php.net/bug.php?id=66021> was fixed by
https://github.com/php/php-src/pull/518 and is included in PHP 5.4.28+
and PHP 5.5.12+.
This workaround is not necessary anymore because the minimum PHP
version for MediaWiki is 7.0.0+.

Change-Id: I801eaffc253fd88e0d3c87cfe97777837bd3902d
2018-05-31 01:14:58 +00:00
Bartosz Dziewoński
0cccd68dc8 Code style: no space after unary minus operator
Searched for /([^\d\w\s\)\]]\s*)- \d/ to find potential issues.
It seems there's no PHPCS check for this, huh.

Also fixed typo in a comment in LoginSignupSpecialPage.

Change-Id: Iaab1a1f5a9f234971e550e7909aa5c3e0c02a983
2017-01-05 14:38:32 +01:00
Sam Wilson
66e215baee Remove spaces after cast operators
This fixes the outstanding mis-spaced cast operators to bring them
into line with the coding standards on mediawiki.org (and with the
more common usage within this codebase).

Bug: T149545
Change-Id: Ib7bcf95bbee83d20c05f6d621ce7b4e1fb58a347
2016-10-31 13:57:39 +00:00
Kunal Mehta
6e9b4f0e9c Convert all array() syntax to []
Per wikitech-l consensus:
 https://lists.wikimedia.org/pipermail/wikitech-l/2016-February/084821.html

Notes:
* Disabled CallTimePassByReference due to false positives (T127163)

Change-Id: I2c8ce713ce6600a0bb7bf67537c87044c7a45c4b
2016-02-17 01:33:00 -08:00
Kevin Israel
a508f5daee FormatJson: Remove PHP 5.3 compatibility code
MediaWiki now only works with PHP versions that are new enough
to have the encoding options required by encode54(). So fold
that into encode() and remove encode53() and prettyPrint().

Change-Id: I6b22daf8fa01ef608efbde9c6aecdbb5ce03e2b9
2016-02-12 18:49:01 -05:00
Vivek Ghaisas
9f5b6f5aeb Fix whitespace issues around parentheses
Fix issues found by MediaWiki.WhiteSpace.SpaceyParenthesis sniff.

Bug: T102617
Change-Id: Iec7f71e64081659fba373ec20d9d2006306a98f4
2015-06-16 22:14:02 +03:00
Timo Tijhof
532337e6ff Use "string|false" as @return instead of "string|bool" where appropriate
This makes sure static analyzers don't warn for supposedly unsafe
code accessing variables as strings when they could be boolean after
having only checked against false.

https://github.com/scrutinizer-ci/php-analyzer/issues/605

Change-Id: Idb676de7587f1eccb46c12de0131bea4489a0785
2015-04-01 09:48:30 +01:00
Kunal Mehta
c91fd8043b Fix phpcs errors and warnings in includes/json
Change-Id: Id5ae1cabe87f73f7458a744834ebb6a1a7c3dbf8
2015-03-15 02:35:26 +00:00
Bryan Davis
8fea9c619d FormatJson::stripComments
Add stripComments method that can be used to remove single line and
multiline comments from an otherwise valid JSON string. Inspired by the
comment removal code in redisJobRunnerService and discussions on irc
about the Extension registration RFC.
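
As a rough illustration of the idea (not the FormatJson::stripComments implementation), a comment stripper has to track string literals so that sequences like "//" inside a URL are preserved. The Python function below is a hypothetical sketch of that approach.

```python
def strip_comments(text):
    out = []
    i = 0
    n = len(text)
    while i < n:
        ch = text[i]
        if ch == '"':
            # Copy a whole string literal verbatim, honoring backslash escapes.
            j = i + 1
            while j < n:
                if text[j] == "\\":
                    j += 2
                    continue
                if text[j] == '"':
                    break
                j += 1
            out.append(text[i:j + 1])
            i = j + 1
        elif text.startswith("//", i):
            # Single-line comment: skip to the end of the line, keep the newline.
            end = text.find("\n", i)
            i = n if end == -1 else end
        elif text.startswith("/*", i):
            # Multiline comment: skip past the closing marker.
            end = text.find("*/", i + 2)
            i = n if end == -1 else end + 2
        else:
            out.append(ch)
            i += 1
    return "".join(out)
```

An unterminated multiline comment swallows the rest of the input in this sketch; a real implementation may prefer to report that case.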

Change-Id: Ie743957bfbb7b1fca8cb78ad48c1efd953362fde
2014-10-12 12:34:22 -06:00
Yuri Astrakhan
c361cb74fa Added missing FormatJson::parse() RELEASE NOTES, fixed docs
Constant values were changed to be above 0xFF - this way
we can easily decide to allow depth-parsing-limit to be OR-able:

  FormatJson::parse( $value, 30 | FormatJson::FORCE_ASSOC )

Follows-up Ic0eb0a7 and I1c4f37a.
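
Why values above 0xFF make this possible can be shown with a short Python sketch. The constant values and the split_options helper are assumptions for illustration; the only property that matters is that every flag sits above the low byte, which leaves that byte free for a depth limit.

```python
# Hypothetical flag values, each above 0xFF.
FORCE_ASSOC = 0x100
TRY_FIXING = 0x200


def split_options(options):
    """Separate a combined options integer into (depth, flags)."""
    depth = options & 0xFF   # low byte: optional depth limit, 0 if unset
    flags = options & ~0xFF  # everything above the low byte: option flags
    return depth, flags


depth, flags = split_options(30 | FORCE_ASSOC)
print(depth, bool(flags & FORCE_ASSOC), bool(flags & TRY_FIXING))
```

With this layout a depth limit of up to 255 can share one integer with the flags without ambiguity.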

Change-Id: I9bfd67a5ca4ea1d399821549c7e63ffdecd56ad1
2014-09-30 01:45:35 +00:00
Yuri Astrakhan
289d3e4f00 FormatJson::parse( TRY_FIXING ) - remove trailing commas
Removes trailing commas from JSON text when parsing.
Solves very common cases like [1,2,3,].

The resulting status will be set to OK but not Good, to warn the caller.
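
A minimal Python sketch of trailing-comma removal follows. It is an illustration under assumptions, not the TRY_FIXING code: the function name is hypothetical, and the scanner tracks string literals so that a comma inside a string value is never removed.

```python
import json


def remove_trailing_commas(text):
    out = []
    in_string = False
    escaped = False
    for ch in text:
        if in_string:
            # Inside a string literal: copy everything, track escapes.
            out.append(ch)
            if escaped:
                escaped = False
            elif ch == "\\":
                escaped = True
            elif ch == '"':
                in_string = False
            continue
        if ch == '"':
            in_string = True
        elif ch in "]}":
            # Look back past whitespace; drop a comma sitting right
            # before the closing bracket or brace.
            i = len(out) - 1
            while i >= 0 and out[i] in " \t\r\n":
                i -= 1
            if i >= 0 and out[i] == ",":
                del out[i]
        out.append(ch)
    return "".join(out)


if __name__ == "__main__":
    print(json.loads(remove_trailing_commas("[1,2,3,]")))
```

A caller that wants to mirror the "OK but not Good" behavior could compare the input with the output and record a warning when they differ.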

Change-Id: Ic0eb0a711da3ae578d6bb58d7474279d6845a4a7
2014-09-27 06:20:36 -04:00
Yuri Astrakhan
9a380626bc Added FormatJson::parse( $value, $options = 0 ) returning Status
* Returns Status object that will contain decoded value on success
* Adds i18n messages for all available PHP JSON errors

ATTN Translation team: please copy these messages:

gwtoolset-json-error-depth => json-error-depth
gwtoolset-json-error-state-mismatch => json-error-state-mismatch
gwtoolset-json-error-ctrl-char => json-error-ctrl-char
gwtoolset-json-error-syntax => json-error-syntax
gwtoolset-json-error-utf8 => json-error-utf8

Change-Id: I1c4f37aaabad369b75a1fbd223fad27ebcfe1c3c
2014-09-26 18:55:09 +00:00
Kevin Israel
b9a12b0b33 FormatJson: Remove speculative comment
Follows-up bec7e8287c. The comment "Can be removed once we require
PHP >= 5.4.28, 5.5.12, 5.6.0" relies on some assumptions that might
later prove to be incorrect:

* That the fix won't be reverted from any of those PHP versions
  (e.g. if deemed to break BC)

* That the bug will be fixed in PECL jsonc and jsond, as well as in
  HHVM

* That we don't need to support older versions of those once we
  require one of the mentioned PHP versions

Change-Id: I67034c561d54d37dee961ada8c9cf5ccfd113da1
2014-04-25 15:16:18 -04:00
Kevin Israel
bec7e8287c FormatJson: Skip whitespace cleanup when unnecessary
The patch[1] for PHP bug 66021[2], which removes the same undesirable
whitespace that WS_CLEANUP_REGEX does, has been merged into php-src.
Subsequent PHP versions having the patch shouldn't have to take the
10-20% performance hit from that workaround.

[1]: https://github.com/php/php-src/commit/82a4f1a1a287
[2]: https://bugs.php.net/bug.php?id=66021

Change-Id: I717a0e164952cc6ace104f13f6236e86c4ab8b58
2014-04-24 20:54:44 -04:00