This adds support for serializing/deserializing objects which
implement the JsonCodecable interface from the wikimedia/json-codec
library used by Parsoid. JsonCodecable allows customizing the encoding
of objects of a given class using a class-specific codec object.
Because the interface is defined outside MediaWiki core, it can also
be used outside core.
In addition, json-codec supports deserialization in the presence of
aliased class names, fixing T353883.
Backward and forward compatibility is established via the mechanism
described in
https://www.mediawiki.org/wiki/Manual:Parser_cache/Serialization_compatibility
Test data generated by this patch was added in
I109640b510cef9b3b870a8c188f3b4f086d75d06 to ensure forward
compatibility with the output after this patch is merged.
Benchmarks:
                        PHP 7.4.33          PHP 8.2.19          PHP 8.3.6
                       BEFORE    AFTER     BEFORE    AFTER     BEFORE    AFTER
Serialize:            926.7/s 1424.8/s    978.5/s 1542.4/s   1023.5/s 1488.6/s
Serialize (assoc):    930.2/s 1378.6/s    974.6/s 1541.9/s   1022.4/s 1463.4/s
Deserialize:         1942.7/s 1961.3/s   2118.8/s 2175.9/s   2129.8/s 2063.5/s
Deserialize (assoc): 1952.0/s 1905.7/s   2107.5/s 2192.1/s   2153.3/s 2011.1/s
These numbers definitely do not have as many significant digits as
written here. But they should be sufficient to demonstrate that
performance is not impaired by this patch and in fact serialization
speed improves slightly.
Bug: T273540
Bug: T327439
Bug: T346829
Bug: T353883
Depends-On: If1d70ba18712839615c1f4fea236843ffebc8645
Change-Id: Ia1017dcef462f3ac1ff5112106f7df81f5cc384f
This is to make it clearer that they're related to converting serialized
content back into JSON, rather than stating that things are not
representable in JSON.
Change-Id: Ic440ac2d05b5ac238a1c0e4821d3f2d858bc3d76
This patch moves FormatJson into the MediaWiki\Json namespace and
establishes a class alias for the old name, marked as deprecated
since version 1.43.
Bug: T353458
Change-Id: I5e1311e4eb7a878a7db319b725ae262f40671c32
'string|int|float|bool' (in any order) can be replaced by 'scalar'.
'string|int|float|bool|null' (likewise) can be replaced by '?scalar'.
This is convenient for functions that can accept any primitive value,
which comes up sometimes when serializing things as SQL, JSON etc.
Change-Id: I4a711ee59611d76d6745f3640e4aa6bebec02918
I don't think these do anything with the documentation generators
we currently use. Especially not in tests. How are tests part of a
"package" when the code is not?
Note how most of these are simply identical to the namespace. They
are most probably auto-generated by some IDEs but don't actually
mean anything.
Change-Id: I771b5f2041a8e3b077865c79cbebddbe028543d1
Also fixes JsonCodecTest::testInvalidJsonData() which was misusing the
data provided by ::provideSimpleTypes().
Change-Id: Ia654359e0fdec3ad546e8bea2e9133c142f0f144
This class is used heavily almost everywhere, so moving it to Utils
wouldn't make much sense. Also with this change, we can move
StatusValue to MediaWiki\Status as well.
Bug: T321882
Depends-On: I5f89ecf27ce1471a74f31c6018806461781213c3
Change-Id: I04c1dcf5129df437589149f0f3e284974d7c98fa
This covers only methods where adding "static" to the declaration was
enough; I didn't do anything with providers that used $this.
Initially by search and replace. There were many mistakes which I
found mostly by running the PHPStorm inspection which searches for
$this usage in a static method. Later I used the PHPStorm "make static"
action which avoids the more obvious mistakes.
Bug: T332865
Change-Id: I47ed6692945607dfa5c139d42edbd934fa4f3a36
A valid JSON serialization is an instance of PHP's stdClass. A check
with is_object() is not sufficient in this case because it includes
everything else that's also a class in PHP.
This should help to uncover programming errors like the one in
I969d8c4.
Bug: T312589
Change-Id: I917d49944497b19909a9a1d1e2861e86e7a0aca8
Add a type annotation when encoding `stdClass` objects so that we can
be sure to decode them as objects instead of arrays.
This avoids issues such as that seen in the Graph extension (T312589)
where an extension data key is stored as a stdClass. If ParserOutput
was computed fresh, a subsequent getExtensionData(..) call will return
a stdClass object, but if the ParserOutput was cached, getExtensionData()
would return an array. After this change the return type is always
consistent.
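The inconsistency described above can be illustrated with a small
language-neutral sketch, written here in Python rather than PHP. It is
not the JsonCodec implementation: the "_type_" marker key, the
encode()/decode() helpers, and the use of SimpleNamespace as a stand-in
for stdClass are all assumptions made for illustration.

```python
import json
from types import SimpleNamespace

# Hypothetical marker key; the annotation used by the real codec may differ.
TYPE_KEY = "_type_"


def encode(value):
    """Convert a value to JSON-safe data, annotating plain objects."""
    if isinstance(value, SimpleNamespace):
        data = {k: encode(v) for k, v in vars(value).items()}
        data[TYPE_KEY] = "object"  # remember that this was an object
        return data
    if isinstance(value, dict):
        return {k: encode(v) for k, v in value.items()}
    if isinstance(value, list):
        return [encode(v) for v in value]
    return value


def decode(data):
    """Rebuild objects from annotated data; unannotated maps stay dicts."""
    if isinstance(data, dict):
        fields = {k: decode(v) for k, v in data.items() if k != TYPE_KEY}
        if data.get(TYPE_KEY) == "object":
            return SimpleNamespace(**fields)
        return fields
    if isinstance(data, list):
        return [decode(v) for v in data]
    return data


def round_trip(value):
    """Simulate storing a value in a cache as JSON and reading it back."""
    return decode(json.loads(json.dumps(encode(value))))
```

Without the marker, both the object and the associative array would
come back from the cache as the same kind of value; with it, each keeps
its original type.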
Properly handle nested objects: encode all object values returned by
JsonSerializable::jsonSerialize() (so that client is not responsible
for implementing this correctly), and decode all object values *before*
calling JsonUnserializable::newFromJsonArray (again, so that the
client is not responsible for decoding its property values). The new
behavior matches how serialize/unserialize is handled in the 'naive'
JsonUnserializable{Sub,Super}Class test cases; ParserOutput (the only
user of JsonCodec in core) was doing an extra manual decode for
the ExtensionData array in ParserOutput::initFromJson that is no longer
necessary.
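The ordering described above (the codec encodes nested values on the
way out, and decodes them before the class-specific factory runs) can
be sketched as follows. This is a Python illustration under stated
assumptions, not the PHP code: REGISTRY, the "_type_" key, and the
to_json_array()/new_from_json_array() method names are hypothetical
stand-ins for the real interfaces.

```python
import json

REGISTRY = {}  # hypothetical map from class name to class


def register(cls):
    REGISTRY[cls.__name__] = cls
    return cls


@register
class Point:
    def __init__(self, x, y):
        self.x, self.y = x, y

    def to_json_array(self):
        return {"x": self.x, "y": self.y}

    @classmethod
    def new_from_json_array(cls, data):
        return cls(data["x"], data["y"])


@register
class Line:
    def __init__(self, start, end):
        self.start, self.end = start, end

    def to_json_array(self):
        # Returns raw Point objects; encoding them is the codec's job.
        return {"start": self.start, "end": self.end}

    @classmethod
    def new_from_json_array(cls, data):
        # Values arrive already decoded, so no manual decode is needed.
        return cls(data["start"], data["end"])


def encode(value):
    if hasattr(value, "to_json_array"):
        data = {k: encode(v) for k, v in value.to_json_array().items()}
        data["_type_"] = type(value).__name__
        return data
    if isinstance(value, dict):
        return {k: encode(v) for k, v in value.items()}
    if isinstance(value, list):
        return [encode(v) for v in value]
    return value


def decode(data):
    if isinstance(data, dict):
        # Decode children first, then hand them to the class factory.
        fields = {k: decode(v) for k, v in data.items() if k != "_type_"}
        cls = REGISTRY.get(data.get("_type_"))
        return cls.new_from_json_array(fields) if cls else fields
    if isinstance(data, list):
        return [decode(v) for v in data]
    return data
```

With this ordering, a class such as Line never has to know whether its
properties were serialized as nested objects.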
The GrowthExperiments and SemanticMediaWiki extensions were working
around the non-recursive nature of JsonCodec; this patch depends on
patches to GrowthExperiments to make it agnostic about whether object
unserialization occurs before or after ::newFromJsonArray() is called,
which can then be further cleaned up once this is released.
A pull request for SemanticMediaWiki has also been submitted.
Bug: T312589
Depends-On: I3413609251f056893d3921df23698aeed40754ed
Change-Id: Id7d0695af40b9801b42a9b82f41e46118da288dc
These two interfaces' methods have tentative return types in PHP 8.1,
which causes code without the type hints to raise warnings. Where the
type hint is "mixed", we need to use the special attribute
declaration #[\ReturnTypeWillChange] (which older PHP versions parse
as a comment) to suppress the warning as long as we still support
PHP < 8.0, which doesn't have a "mixed" type hint.
Bug: T289879
Change-Id: I1a126e602e92b8d13c7795eb6d790effd5ddc986
This ensures that assertions work in a uniform way,
and provides meaningful messages in case of failure.
Change-Id: Ic01715b9a55444d3df6b5d4097e78cb8ac082b3e
Doesn't make any functional difference but is less confusing.
Also, clarify why this testcase is expected to fail.
Change-Id: I56f03d5c02cf624a4eba73d9d546cf6c2ebf6a77
Let PHP do the UTF-8 encoding of Unicode characters in PHP strings.
Also use faster str_replace instead of preg_replace.
Change-Id: I4e99de694a607e2b5df52c6efcd3d863bb42f76e
- Removed the str_replace() call to replace unescaped line terminators
if UTF8_OK is set. PHP 7.1 and later escape these by default.
The speedup isn't much at all (about 1% in my testing when encoding an
API siteinfo result taken from enwiki). Perhaps it's not surprising
given the way str_replace() works[1]. Still, it's better not to spend
CPU time looking for characters that will not occur.
- Changed the algorithm for the optional spaces-to-tabs conversion when
pretty printing. Instead of replacing one indent level throughout the
entire string before replacing the next level, use a regex to replace
in one pass. This is usually faster now that PHP 7 enables PCRE's JIT
compiler by default. Without JIT, the regex was often slower.
The speedup can be large for deeply nested data. For example, in my
testing the languages/i18n data took about 8% less time to encode as
tab-indented JSON, yet the API site info result took about 45% less.
(This, of course, isn't actually relevant to the API even when pretty
printed output is requested, because ApiFormatJson uses the default
indent string of four spaces, which will always be faster unless
support for tab indentation is added to PHP's json extension.)
- Set options using if statements instead of the ternary operator. This
is the clearer way, and maybe the slightly faster one, skipping the
assignment when the flags do not need to be set.
[1]: https://github.com/php/php-src/blob/PHP-8.0.10/ext/standard/string.c#L2969
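The single-pass indent conversion described in the second item can be
sketched as below. This is a Python illustration of the technique, not
the FormatJson code: the function names and the four-space width
parameter are assumptions, and json.dumps() stands in for PHP's
pretty-printing encoder.

```python
import json
import re


def indent_with_tabs(pretty: str, width: int = 4) -> str:
    """Turn each leading run of `width`-space indents into tabs in one pass."""
    # Match only whole indent units at the start of a line.
    pattern = re.compile(r"^(?: {%d})+" % width, flags=re.MULTILINE)
    return pattern.sub(lambda m: "\t" * (len(m.group(0)) // width), pretty)


def encode_tabbed(value) -> str:
    """Pretty-print with four-space indents, then convert them to tabs."""
    return indent_with_tabs(json.dumps(value, indent=4))
```

Because the pattern is anchored to the start of each line, runs of
spaces inside string values are left alone; encoded JSON strings never
contain raw newlines, so a string's content cannot begin a line.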
Change-Id: Iebb1df0264e335a1819956710eeacf6d6b8f1471
The comment added in b9461e3f1c is incorrect. This is actually a
decode error, so is relevant to FormatJson::parse().
Change-Id: I3cc33f0f260c0ba4fe96fb75565f52d089b9a975
FormatJsonUnitTest was split off back when the
rest still needed to be integration tests[1]. After [2],
with global functions being loaded for unit tests,
the rest of FormatJsonTest no longer required integration
and was moved to a unit test.
[1] I86dfe17f794c615048b3c20487b0e84d38d13b93
[2] Ib42c56a67926ebcdba53f4c6c54a5bff98cb77a3
Change-Id: I92bb7a6cafd82d8b2186f92e0953bc18f40b0ee4
My personal best practice is to not document @params when there
is a @dataProvider. I mean, these test…() functions are not
meant to be called from anywhere. They do not really need
documentation. @param tags don't do much but duplicate what the
@dataProvider does. This is error-prone, as demonstrated by the
examples in this patch.
This patch also removes @throws tags from tests. A test…() can
never throw an exception. Otherwise the test would fail.
Most of these are found by the not yet released I10559d8.
Change-Id: I3782bca43f875687cd2be972144a7ab6b298454e
Most of these are found by the not yet released I10559d8.
I remove the type MockObject in some cases when the calling
code really does not need to know whether it gets a mock or the
real thing. However, I do this only in places that are very
closely related to the fixes.
Change-Id: I26a4c3c5a8ae141bf56161b52b54bce7e68f2e30
* parent::setUp() should be first, and ::tearDown()
should be last
* Move tests that directly extend PHPUnit\Framework\TestCase
to /unit
Change-Id: I1172855c58f4f52a8f624e6d596ec43beb8c93ff
One major difference with what we've had before is that now we
actually write class names into the serialization. Given that
this new mechanism is extensible, we can't establish any kind
of mapping of allowed classes. I do not think it's a problem
though.
Bug: T264394
Change-Id: Ia152f3b76b967aabde2d8a182e3aec7d3002e5ea
This should behave exactly the same. It's more of a style change than anything.
So why do it then?
* I believe this is much less confusing than code mentioning a weird
"standard class". Barely anybody knows what this is, and what the
difference between "object" and "stdClass" is.
* The code is shorter.
* It's even faster. In my micro benchmark it's twice as fast.
Change-Id: I7ee0e8ae6d9264a89b6cd1dd861f0466ae620ccc
Three of the errors are encode errors that won't be emitted when we're
trying to decode JSON, so we can ignore those lines of code.
JSON_ERROR_UTF16 is a new error code in PHP 7.0, so add that in.
Improve test coverage while we're at it. The UTF16 test case was
copied from php-src/ext/json/tests/bug62010.phpt.
Change-Id: I79aa0db3d967d512611f8521bb052af36c3cda8e
Done automatically using the master version of MW codesniffer and
running composer fix.
Bug: T192167
Change-Id: If6b40f515fde32ab5eff074a90e821c30c791827
Out of the 140 tests in this file, 131 are pure unit tests.
Let's keep the other 9 in the original file and move the rest.
Bug: T87781
Change-Id: I86dfe17f794c615048b3c20487b0e84d38d13b93
This changeset implements T89432 and related tickets and is based on exploration
done at the Prague Hackathon. The goal is to identify tests in MediaWiki core
that can be run without having to install & configure MediaWiki and its dependencies,
and provide a way to execute these tests via the standard phpunit entry point,
allowing for faster development and integration with existing tooling like IDEs.
The initial set of tests that met these criteria was identified using the work Amir did in
I88822667693d9e00ac3d4639c87bc24e5083e5e8. These tests were then moved into a new subdirectory
under phpunit/ and organized into a separate test suite. The environment for this suite
is set up via a PHPUnit bootstrap file without a custom entry point.
You can execute these tests by running:
$ vendor/bin/phpunit -d memory_limit=512M -c tests/phpunit/unit-tests.xml
Bug: T89432
Bug: T87781
Bug: T84948
Change-Id: Iad01033a0548afd4d2a6f2c1ef6fcc9debf72c0d