38 commits

Author SHA1 Message Date
Umherirrender
28dd7fd9cd tests: Fix deprecation filter in ParserCacheSerializationTestCases
The warning is shown before the PHPUnit tests start; the class was namespaced in 9bfb75ff.

PHP Deprecated:  Use of MediaWiki\Parser\ParserOutput::setTOCHTML was
deprecated in MediaWiki 1.40. [Called from
MediaWiki\Tests\Parser\ParserCacheSerializationTestCases::getParserOutputTestCases
in /workspace/src/tests/phpunit/includes/parser/ParserCacheSerializationTestCases.php
at line 236] in /workspace/src/includes/debug/MWDebug.php on line 378

Bug: T355952
Follow-Up: I4c2cbb0a808b3881a4d6ca489eee5d8c8ebf26cf
Change-Id: I3d8e2beaf68dc55b93297b23e450c3bc89c5b222
2024-01-26 17:06:38 +00:00
James D. Forrester
9bfb75ff90 Namespace ParserOutput
Most used non-namespaced class!

Bug: T353458
Change-Id: I4c2cbb0a808b3881a4d6ca489eee5d8c8ebf26cf
2023-12-14 14:57:34 -05:00
C. Scott Ananian
4b83285954 ParserOutput: Allow passing LinkTarget to title-related methods
Broadened the argument type to allow passing LinkTarget to:
* ParserOutput::addCategory()
* ParserOutput::addLanguageLink()
* ParserOutput::addLink()
* ParserOutput::addImage()
* ParserOutput::addTemplate()

This allows for a tighter interface with Parsoid's
ContentMetadataCollector class and avoids errors caused by passing the
wrong form of string title ("text" with spaces versus "dbkey" with
underscores).
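
The "text" versus "dbkey" mix-up described above can be sketched as follows. This is a minimal illustration, not the real API: `TargetSketch`, `TargetValue`, and `CategorySketch` are hypothetical stand-ins, and the real LinkTarget interface has more methods than the single one assumed here.

```php
<?php
declare(strict_types=1);

// Hypothetical minimal interface; the real LinkTarget has more methods.
interface TargetSketch {
    public function getDBkey(): string;
}

// Hypothetical value object that normalises the title exactly once.
final class TargetValue implements TargetSketch {
    private string $dbkey;

    public function __construct(string $title) {
        // Spaces become underscores, so both spellings map to one key.
        $this->dbkey = str_replace(' ', '_', $title);
    }

    public function getDBkey(): string {
        return $this->dbkey;
    }
}

// Hypothetical collector that accepts either a raw string or a target object.
final class CategorySketch {
    private array $categories = [];

    /** @param TargetSketch|string $category */
    public function addCategory($category, string $sortKey = ''): void {
        $key = $category instanceof TargetSketch
            ? $category->getDBkey()
            : $category; // raw strings are trusted as-is, which is the risk
        $this->categories[$key] = $sortKey;
    }

    public function getNames(): array {
        return array_keys($this->categories);
    }
}

// Raw strings in two forms are recorded as two different categories.
$byString = new CategorySketch();
$byString->addCategory('Living people');
$byString->addCategory('Living_people');

// Target objects collapse to a single, normalised key.
$byTarget = new CategorySketch();
$byTarget->addCategory(new TargetValue('Living people'));
$byTarget->addCategory(new TargetValue('Living_people'));

var_dump(count($byString->getNames()), count($byTarget->getNames()));
```

Passing an object moves the normalisation into one place, so callers no longer have to know which string form a method expects.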

There are a few performance problems remaining after this patch, which
only apply to use by Parsoid (not the legacy parser):

1. ::addLink() does inefficient db requests to fetch the page id for
each link if the optional $id parameter is not passed.  These lookups
should be deferred and a LinkBatch used.  (The legacy parser always
passes $id.)

2. ::addTemplate() similarly requires $page_id (and $rev_id) to be
passed, so is not currently usable by Parsoid.

3. ::addLanguageLink() uses Title::getFullText() which is not present
in LinkTarget and is currently implemented as a full Title lookup.
This is not an issue for the legacy parser, because it already has a
Title object so the lookup is a no-op, but could be improved for
Parsoid's use.

Bug: T296023
Change-Id: If21ec8563c8a619bdde7c0cb6534bb9009480a21
2023-12-08 17:50:29 -05:00
daniel
e3fb964439 Only cache expensive renderings
Pages that are fast to render can be omitted from the parser cache
to preserve disk space and cache write operations.

The threshold is configurable per namespace, so the tradeoff can
be evaluated based on different access patterns. For example, pages
that are accessed rarely, like file description pages on commons,
may have a high threshold configured, while pages that are read
frequently, like wikipedia articles, may be configured to be always
cached, using a 0 threshold.

Filtering is based on a time profile recorded in the ParserOutput.
A generic mechanism for capturing the timing profile is implemented
in the ContentHandler base class. Subclasses may implement a more
rigorous capture mechanism.
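
A rough sketch of the per-namespace filtering idea follows. The constant `MIN_RENDER_SECONDS`, its layout, and the function `shouldCache()` are hypothetical names chosen for illustration; the real configuration format is not given in this message.

```php
<?php
declare(strict_types=1);

// Hypothetical thresholds in seconds, keyed by namespace number.
// The 'default' entry of 0.0 means "always cache".
const MIN_RENDER_SECONDS = [
    'default' => 0.0,
    6 => 0.5, // 6 is the File namespace in MediaWiki: cache only slow renders
];

/**
 * Decide whether a rendering is expensive enough to be cached.
 *
 * @param int $namespace Namespace number of the page
 * @param float|null $renderSeconds Recorded render time, or null if unknown
 */
function shouldCache(int $namespace, ?float $renderSeconds): bool {
    $threshold = MIN_RENDER_SECONDS[$namespace] ?? MIN_RENDER_SECONDS['default'];
    if ($renderSeconds === null) {
        // Assumption made for this sketch only: with no timing
        // profile available, fall back to caching the output.
        return true;
    }
    return $renderSeconds >= $threshold;
}

var_dump(shouldCache(0, 0.01)); // article namespace, threshold 0: cached
var_dump(shouldCache(6, 0.1));  // fast file page: skipped
var_dump(shouldCache(6, 0.8));  // slow file page: cached
```

A threshold of 0 keeps the previous behaviour for that namespace, which is why frequently read pages can simply be left at the default.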

Bug: T346765
Change-Id: I38a6f3ef064f98f3ad6a7c60856b0248a94fe9ac
2023-11-30 20:56:12 +00:00
C. Scott Ananian
d20663259f Hard-deprecate ParserOutput::getCategories(), deprecated in 1.40
It is difficult to distinguish this method from OutputPage::getCategories()
in code search:

   https://codesearch.wmcloud.org/deployed/?q=%5BOo%5Dut%28put%29%3F%28%5C%28%5C%29%29%3F-%3EgetCategories%5C%28&files=&excludeFiles=&repos=

We generally try to replace $output with $parserOutput or $pOutput
as we touch code to improve the ability of codesearch to dig up
deprecated ParserOutput methods.

Bug: T305161
Depends-On: I02dd4f61c43c225b0ef6dc51c3e4f9d967a0a272
Depends-On: I61d2d77591579d825ad9d37f902e40366be55dd6
Depends-On: I91155106b7a9e10d3334f95ba4936d02851bfb11
Depends-On: Iaca745c79d9587571af03b23b21d76a6cba0ebf1
Depends-On: Id10a171c44411b1233ee4d6cf8fbd3dc57744eef
Depends-On: I47a25c011d9bd4b1a15dda4e673e32c25eb64f2b
Depends-On: I683fc768aba50b801f46467fcfa1668fa8731ea6
Change-Id: I5a2ac1c99b8b199102e12f0d32dd6ec5cdc24054
2023-09-29 15:25:50 -04:00
C. Scott Ananian
d421ab57f8 Remove ParserOutput::addOutputHook() and related code
ParserOutput::addOutputHook() has been deprecated since 1.38, and without
any calls to ::addOutputHook() the associated ::getOutputHooks() and
$wgParserOutputHooks configuration do nothing.

Bug: T292321
Bug: T305161
Change-Id: Ib770c680d5e0697980e7e36a323ec56ba1d806b8
2023-09-18 11:34:02 -04:00
Reedy
a1144dc7c5 mark various anonymous functions as static
Change-Id: Iefe896769359f0d32e52bf20aa03e1c3715d5074
2023-08-22 19:38:38 +00:00
Amir Sarabadani
15a278189f Reorg: Move MWTimestamp to MediaWiki\Utils
Bug: T321882
Change-Id: I48c10343295c4eb3d9ef8037343b0070e928f040
2023-08-19 05:53:40 +02:00
C. Scott Ananian
7a8dd531b2 Remove ParserOutput::addWarning, deprecated since 1.38
Replaced with ParserOutput::addWarningMsg()

Bug: T305161
Change-Id: I137b35a2e8250ea7c10059d04071a98a4f968038
2023-08-07 11:57:07 -04:00
C. Scott Ananian
e22d93a6bb Hard-deprecate ParserOutput::{get,set}Flag()
These were deprecated in 1.38; users are expected to use
ParserOutput::{get,set}OutputFlag() instead, which helps eliminate a
confusing aliasing of many MW methods named "flag".

Original deprecation: 06ab90f163

Code search:
    https://codesearch.wmcloud.org/search/?q=%5BOo%5Dut%28put%29%3F%28%5C%28%5C%29%29%3F-%3E%28g%7Cs%29etFlag%5C%28&i=nope&files=&excludeFiles=&repos=

Patches for non-production extensions:
 PageProperties: I592d43e2c912df635cd9162180ed20a6136535f1
 CIForms: I238a6c557891bb6d271d2641261ef69542b7957e

Bug: T292868
Bug: T305161
Change-Id: I4525443ab0932241b0cf64ab606f7ab7d6d70b6e
2023-07-28 13:51:02 -04:00
C. Scott Ananian
29853113f7 Deprecate ParserOutput::{get,set}TOCHTML()
No uses in deployed code outside mediawiki-core:

 https://codesearch.wmcloud.org/deployed/?q=%5Bgs%5DetTOCHTML%5C%28&i=nope&files=&excludeFiles=&repos=

Bug: T293513
Change-Id: I3fd82150ac581afbeb94f401672702063586fff0
2023-03-10 20:34:33 -05:00
James D. Forrester
ad06527fb4 Reorg: Namespace the Title class
This is moderately messy.

Process was principally:

* xargs rg --files-with-matches '^use Title;' | grep 'php$' | \
  xargs -P 1 -n 1 sed -i -z 's/use Title;/use MediaWiki\\Title\\Title;/1'
* rg --files-without-match 'MediaWiki\\Title\\Title;' . | grep 'php$' | \
  xargs rg --files-with-matches 'Title\b' | \
  xargs -P 1 -n 1 sed -i -z 's/\nuse /\nuse MediaWiki\\Title\\Title;\nuse /1'
* composer fix

Then manual fix-ups for a few files that don't have any use statements.

Bug: T166010
Follows-Up: Ia5d8cb759dc3bc9e9bbe217d0fb109e2f8c4101a
Change-Id: If8fc9d0d95fc1a114021e282a706fc3e7da3524b
2023-03-02 08:46:53 -05:00
Subramanya Sastry
bcb7009c41 Use real section metadata in tests
* Most of the files were generated from the validate* script.
* Post-processing of these generated files to fix problems:
  - Some of the files were binary-edited via "vi -b" to fix some
    issues with bad property names used in the prior step.
    1.36, 1.38, 1.39 files were all fixed up this way.
  - In addition, the 1.36 file had bad data (not sure if the wrong
    php version was used) but I fixed this by splicing in data
    from the 1.38 file to revert incorrect changes to "Categories"
    and "IndexPolicy" properties.
  - The 1.35 data file was binary edited by splicing data from the
    now 1.36 version.

Change-Id: I4e22b94ce30c2ad9b1f544c15e1c3cd0dd0bce6b
2022-11-23 12:45:27 -05:00
Subramanya Sastry
623625e8f2 Followup to fb747bc0: Fix bad property names
Change-Id: I362b0cf8feca13a91fd91961d400579f2e4ea97e
2022-11-18 16:12:06 -06:00
Subramanya Sastry
fb747bc038 Add section metadata parsercache serialization tests for MW 1.40
* Generate data files for 1.40 only since the new formats only
  showed up in 1.40 and won't be present in the parser cache
  for older MW versions.

Change-Id: I6f297e3091ec2faab7c2203c138800551b01e32a
2022-11-17 15:48:15 -06:00
Brian Wolff
bec8dada48 Clarify generate-html and make ParserOutput behave as expected
Previously:
* It was unclear that generate-html is an optional optimization
* Most of MediaWiki core was doing $parserOutput->setText('') if
html wasn't generated. However this is wrong and will cause
$parserOutput->hasText() to return true and also potentially cause
cache pollution if a content handler both does that and supports
parser cache (Like MassMessage; see T299896)
* The default value of mText in the constructor was '', and most
of the time MW used that default. This doesn't seem right. If
setText() is never called, the ParserOutput should not be considered
to have text
* It was impossible to set mText to null, as $parserOutput->setText(null)
was a no-op. Docs implied you were supposed to do this, so it was very
confusing.

This patch clarifies docs, changes the default value for ParserOutput::$mText
from '' to null, and makes $parserOutput->setText(null) do what you
expect it to. The last two are arguably breaking changes, although
the previous behaviours were unexpected, mostly undocumented and
based on a code search do not appear to be relied on.
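
The intended behaviour can be sketched with a small stand-in class. `OutputSketch` is hypothetical and only mirrors the three points above (null default, `setText(null)` taking effect, and `hasText()` reflecting both); it is not the real ParserOutput.

```php
<?php
declare(strict_types=1);

// Hypothetical stand-in, not the real ParserOutput class.
final class OutputSketch {
    /** @var string|null null means "no HTML was generated" */
    private ?string $text;

    public function __construct(?string $text = null) {
        $this->text = $text;
    }

    public function setText(?string $text): void {
        // null is stored rather than silently ignored.
        $this->text = $text;
    }

    public function hasText(): bool {
        return $this->text !== null;
    }
}

$skipped = new OutputSketch();            // HTML generation was skipped
$empty = new OutputSketch();
$empty->setText('');                      // the old workaround
$cleared = new OutputSketch('<p>x</p>');
$cleared->setText(null);                  // explicitly drop the text

var_dump($skipped->hasText()); // false
var_dump($empty->hasText());   // true: an empty string still counts as text
var_dump($cleared->hasText()); // false
```

The middle case is the one that caused trouble: an empty string is indistinguishable from a real, empty rendering, so it can end up in a cache.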

It seems like the main reason this only broke MassMessage is that most
content handlers either don't support generateHtml, or they don't
support parser cache.

Bug: T306591
Change-Id: I49cdf21411c6b02ac9a221a13393bebe17c7871e
Depends-On: I68ad491735b2df13951399312a4f9c37b63a08fa
2022-05-03 11:23:08 +02:00
Tim Starling
0d94c44743 Fix notice from ParserCacheSerializationTestCases
Change-Id: I6e65952367dd6de30916bfc574d1e4a5db84b998
2022-04-08 10:57:46 +10:00
C. Scott Ananian
05eda60400 Emit deprecation warnings for ParserOutput::addOutputHook()
Once no one is calling ::addOutputHook() we can stub out ::getOutputHooks()
to just return an empty array.

Code search:
 https://codesearch.wmcloud.org/deployed/?q=-%3E%28addOutputHook%7CgetOutputHooks%29%5C%28&i=nope&files=&excludeFiles=&repos=

Bug: T292321
Change-Id: I1081696c4cc2e67c3c38b8f6e53054e62ac71502
2022-04-07 02:48:57 +00:00
jenkins-bot
99dee6855a Merge "Change return value of ParserOutput::getPageProperty() when property is missing"
2022-02-19 00:49:48 +00:00
C. Scott Ananian
c39ef6c6c9 Change return value of ParserOutput::getPageProperty() when property is missing
The old ParserOutput::getProperty() method returned `false` when a property
was missing.  This requires callers to use the `?:` syntax to supply default
values, which then causes any falsey value to be treated as missing.
So, for example, setting the defaultsort to '0' will cause the default
sort to be ignored.

Modern php convention is to use `null` for missing values, and the `??`
syntax is a better/more restrictive alternative to `?:`.
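
The difference between the two conventions can be shown in a few lines. The functions `getPropertyOld()` and `getPropertyNew()` are hypothetical stand-ins for the old and new return conventions, not the real ParserOutput methods.

```php
<?php
declare(strict_types=1);

// Hypothetical stand-in for the old convention: missing => false.
function getPropertyOld(array $props, string $name) {
    return $props[$name] ?? false;
}

// Hypothetical stand-in for the new convention: missing => null.
function getPropertyNew(array $props, string $name) {
    return $props[$name] ?? null;
}

$props = ['defaultsort' => '0'];

// Old convention: `?:` treats the legitimate value '0' as missing,
// because '0' is falsey in PHP.
$old = getPropertyOld($props, 'defaultsort') ?: 'fallback';

// New convention: `??` only falls back when the value is null.
$new = getPropertyNew($props, 'defaultsort') ?? 'fallback';

// A property that really is missing still gets the fallback.
$missing = getPropertyNew($props, 'displaytitle') ?? 'fallback';

var_dump($old);     // string(8) "fallback"
var_dump($new);     // string(1) "0"
var_dump($missing); // string(8) "fallback"
```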

We renamed `ParserOutput::getProperty()` to `::getPageProperty()` in
1.38 (Ie963eea5aa0f0e984ced7c4dfa0fd65d57313cfa/T287216) but kept the
return value convention.  Before this actually makes it into a 1.38
release, take the opportunity to fix the return value for the new
`ParserOutput::getPageProperty()` method to return `null` when the
property is missing.

We need to do some temporary workarounds to the places we'd
already swapped over to use the new `::getPageProperty()` method
to allow them to handle either `false` or `null` as a return value;
we'll clean that up once this is merged.

Code search:
https://codesearch.wmcloud.org/deployed/?q=-%3EgetPageProperty%5C%28|T301915&i=nope&files=&excludeFiles=&repos=

Bug: T301915
Depends-On: I3f11ce604970e47b41fc1c123792df8c3045626f
Depends-On: Ie7533f49fe4cad01ebfda29760d23c61e9867b10
Depends-On: Ic5c09f5caa4c897bc553c614fbae9cee159566a2
Depends-On: I0278b2eafd90e77e4fee41c45a1165fb79ddf47e
Depends-On: I383abb6b7dc5e96c0061af13957609f6e31a1065
Depends-On: I79f9f4078e415284af29b15047bafd1c823d7f5b
Depends-On: I02276c48c49f5d2d241a69eb0a6cdf439b572d8b
Depends-On: I71628661b4539a4e35ae32846e719f92bcf782e0
Depends-On: I7e215cb43de0ce150a6bcc00f92481dcdcfed383
Change-Id: Iaa25c390118d2db2b6578cdd558f2defd5351d15
2022-02-18 21:15:58 +00:00
C. Scott Ananian
3c211fdb3c Update ParserCache serialization test cases to use valid category keys
Category keys are supposed to be non-null strings.

The test cases use bogus integer values, which causes issues when
refactoring more strictly enforces validity checks on category sort
key values.

Change-Id: If2937a694ba6bd4c522336f33aa58d40949b5a54
2022-02-17 12:12:53 -05:00
C. Scott Ananian
b46dfcc351 Update ParserCache serialization test cases to use a valid index policy
Valid values for ParserOutput::$mIndexPolicy are '', 'noindex', and 'index'.

The test cases use the bogus value 'policy1', which causes issues when
refactoring more strictly enforces validity checks on index policy
values.

Change-Id: I2d00ff4e3ded043d18942c8482a39fac14ec60bc
2022-02-09 12:47:27 -05:00
C. Scott Ananian
0f5dc718ce Add ParserOutput::{set,append}JsConfigVar()
Deprecate ParserOutput::addJsConfigVars() and add setter methods which
better ensure that the ParserOutput contents are independent of parse
order.  This accommodates the asynchronous and incremental parsing goals
on the Parsoid roadmap.
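
One way to read "independent of parse order" is sketched below. `ConfigVarSketch` and its methods are hypothetical and only loosely modelled on the idea in this message: a scalar may be set once, and appended values form a set. How conflicts are handled is an assumption made for the sketch.

```php
<?php
declare(strict_types=1);

// Hypothetical collector; not the real ParserOutput implementation.
final class ConfigVarSketch {
    /** @var array<string,string> */
    private array $scalars = [];

    /** @var array<string,array<string,bool>> */
    private array $sets = [];

    /** Set a scalar; a second, different value is rejected (sketch assumption). */
    public function setVar(string $key, string $value): void {
        if (isset($this->scalars[$key]) && $this->scalars[$key] !== $value) {
            throw new InvalidArgumentException("Conflicting values for $key");
        }
        $this->scalars[$key] = $value;
    }

    /** Add to a set; values are stored as array keys, so repeats collapse. */
    public function appendVar(string $key, string $value): void {
        $this->sets[$key][$value] = true;
    }

    /** Return the set in a canonical (sorted) order. */
    public function getSet(string $key): array {
        $values = array_keys($this->sets[$key] ?? []);
        sort($values);
        return $values;
    }
}

// Two "parses" that encounter the same values in a different order.
$first = new ConfigVarSketch();
$first->appendVar('wgFeatures', 'a');
$first->appendVar('wgFeatures', 'b');

$second = new ConfigVarSketch();
$second->appendVar('wgFeatures', 'b');
$second->appendVar('wgFeatures', 'a');
$second->appendVar('wgFeatures', 'b'); // a repeat has no effect

var_dump($first->getSet('wgFeatures') === $second->getSet('wgFeatures')); // true
```

A bulk "add these vars" call that overwrites earlier values cannot give this guarantee, because the last writer wins and the result then depends on which fragment was parsed last.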

Bug: T300307
Change-Id: I4f08d1098da211f7bf5c43c08c620de224cbf37f
2022-02-04 13:42:59 -05:00
Alexander Vorwerk
1f78d6a249 ParserCacheSerializationTestCases: call ::addModule(Style)?s with an array
Bug: T299865
Change-Id: Ifb0dd97c7023154ba1d834e574a913cfe9ff0f1f
2022-01-23 17:26:32 +01:00
C. Scott Ananian
5ae946d3a6 Rename ParserOutput::setCategoryLinks() and ::getCategoryLinks()
Make ::setCategories() consistent with the corresponding singular method,
which is ::addCategory(), not ::addCategoryLink().  Also, don't return
a value.

This renaming is in preparation for factoring out a write-only base
class from ParserOutput suitable to be used by Parsoid.

Note that OutputPage does distinguish a 'category link' from a
'category list', and there are separate OutputPage::getCategories()
and OutputPage::getCategoryLinks() methods.  However, the category
map in ParserOutput isn't exactly the same as either of these:
it's actually a map (or list of pairs) of category name to sort key.

Rename ParserOutput::getCategoryLinks() to ::getCategoryNames()
in order to clarify that the concept involved is not the same as
the OutputPage "category links" methods.

Code search:
https://codesearch.wmcloud.org/deployed/?q=-%3E(get%7Cset)CategoryLinks%5C(&i=nope&files=&excludeFiles=&repos=

(Note that many of the code search matches are for the methods in
OutputPage, which we are trying to disambiguate here.)

Bug: T287216
Change-Id: Idb383d3d9ef7b76f8a0208a057a3cb8c639465c9
2021-10-15 09:45:36 -07:00
Petr Pchelko
a1aa3e0827 Hard-deprecate all public property access on CacheTime and ParserOutput.
- Added a test where ParserOutput objects with CacheTime
properties set are unserialized from previous versions.
- Generate new serialization tests for 1.38

Now all serialization in production is JSON, so changing
property visibility shouldn't affect ParserCache.

Bug: T263851
Depends-On: I283340ff559420ceee8f286ba3ef202c01206a23
Change-Id: I70d6feb1c995a0a0f763b21261141ae8ee6dc570
2021-10-13 13:27:16 -04:00
C. Scott Ananian
af5d13c5de Rename ParserOutput::{get,set,unset}Property to {get,set,unset}PageProperty
The ::getProperty() naming is too generic and doesn't clearly indicate
that these are "page properties" (which have their own table in the DB).
As part of refactoring a clean API out of ParserOutput which can be used
by Parsoid, clean up the naming here.

This patch only soft-deprecates the old names; a handful of external
users need to be cleaned up before we hard-deprecate.

Bug: T287216
Change-Id: Ie963eea5aa0f0e984ced7c4dfa0fd65d57313cfa
2021-10-08 10:07:17 -04:00
C. Scott Ananian
db81b56adf Rename ParserOutput::hideNewSection() -> ::setHideNewSection()
This name is consistent with the rest of the setter and getter methods
in ParserOutput (note that ParserOutput::getHideNewSection() already
exists and is consistently named).

Hard deprecated the old name for 1.38.  Rarely used outside core, and
a pull request already created for the one outside user:
https://github.com/SkizNet/mediawiki-WikiMirror/pull/15

Code search:
https://codesearch.wmcloud.org/search/?q=hideNewSection&i=nope&files=&excludeFiles=&repos=

Bug: T287216
Change-Id: Ia553373eef78f875a83ad0eebfe2e465ce33272f
2021-09-29 17:47:54 -04:00
Umherirrender
d01d47683c Fix spacing after yield and use statements
Change-Id: Iacb93e96168ec0cd895130c5c8f66b6b44317e34
2021-03-26 23:55:58 +01:00
Umherirrender
8de3b7d324 Use static closures where safe to use
This is micro-optimization of closure code to avoid binding the closure
to $this where it is not needed.
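
The effect of the `static` keyword on a closure can be observed with reflection. The class `ClosureSketch` below is a made-up example, not code from this change.

```php
<?php
declare(strict_types=1);

final class ClosureSketch {
    public function bound(): Closure {
        // A non-static closure created in a method is bound to $this,
        // even though the body never uses it.
        return function (int $x): int {
            return $x * 2;
        };
    }

    public function unbound(): Closure {
        // `static` prevents the binding to $this.
        return static function (int $x): int {
            return $x * 2;
        };
    }
}

$obj = new ClosureSketch();
$boundThis = (new ReflectionFunction($obj->bound()))->getClosureThis();
$unboundThis = (new ReflectionFunction($obj->unbound()))->getClosureThis();

var_dump($boundThis === $obj);    // true: the closure holds a reference to $obj
var_dump($unboundThis === null);  // true: nothing is bound
```

Both closures compute the same result; the static one simply avoids holding a reference to the object that created it.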

Created by I25a17fb22b6b669e817317a0f45051ae9c608208

Change-Id: I0ffc6200f6c6693d78a3151cb8cea7dce7c21653
2021-02-11 00:13:52 +00:00
Umherirrender
a1de8b8700 Tests: Mark more closures as static
Result of a new sniff I25a17fb22b6b669e817317a0f45051ae9c608208

Bug: T274036
Change-Id: I695873737167a75f0d94901fa40383a33984ca55
2021-02-09 02:55:57 +00:00
Petr Pchelko
b956c77d27 Merge CacheTime and ParserOutput accessedOptions properties
Change-Id: I5785596d68e8923f8bcbd182ace0b1991bd75c9a
2020-11-19 10:12:39 -07:00
Petr Pchelko
dbdc2a3cd3 Introduce JsonCodec to help with serialization/deserialization
Change-Id: I5433090ae8e2b3f2a4590cc404baf838025546ce
2020-11-19 08:32:21 -07:00
Petr Pchelko
7c68ae9296 Safe ParserOutput extension data and JsonUnserializable helper.
One major difference with what we've had before is that now we
actually write class names into the serialization - given that
this new mechanism is extensible, we can't establish any kind
of mapping of allowed classes. I do not think it's a problem
though.
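
Writing the class name into the serialized form might look roughly like the sketch below. The interface `JsonRoundTrip`, the `_type_` key, and `decodeObject()` are hypothetical names used for illustration only; restricting decoding to classes that implement the interface is one possible safeguard, assumed here rather than taken from the patch.

```php
<?php
declare(strict_types=1);

// Hypothetical opt-in interface for classes that can be rebuilt from JSON.
interface JsonRoundTrip extends JsonSerializable {
    public static function fromJsonArray(array $data): JsonRoundTrip;
}

final class Point implements JsonRoundTrip {
    public int $x;
    public int $y;

    public function __construct(int $x, int $y) {
        $this->x = $x;
        $this->y = $y;
    }

    public function jsonSerialize(): array {
        // The class name travels with the data (hypothetical '_type_' key).
        return ['_type_' => self::class, 'x' => $this->x, 'y' => $this->y];
    }

    public static function fromJsonArray(array $data): JsonRoundTrip {
        return new self($data['x'], $data['y']);
    }
}

function decodeObject(string $json): object {
    $data = json_decode($json, true);
    $class = $data['_type_'] ?? '';
    // Only instantiate classes that opted in via the interface.
    if (!is_string($class) || !is_subclass_of($class, JsonRoundTrip::class)) {
        throw new InvalidArgumentException('Unexpected type in JSON');
    }
    return $class::fromJsonArray($data);
}

$copy = decodeObject(json_encode(new Point(1, 2)));
var_dump($copy instanceof Point, $copy->x, $copy->y);
```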

Bug: T264394
Change-Id: Ia152f3b76b967aabde2d8a182e3aec7d3002e5ea
2020-11-10 11:21:09 -07:00
Petr Pchelko
017cfcf016 Forward-compat for merging CacheTime and ParserOutput mOptions
CacheTime::mUsedOptions and ParserOutput::mAccessedOptions
do exactly the same thing and have to be merged into a single property.
This patch adds forward-compatibility and needs to be deployed
at least one train before the patch which actually merges the properties.

Change-Id: Ic9d71a443994e2545ebf2a826b9155c82961cb88
2020-11-10 07:09:41 -07:00
daniel
cac89b547c ParserOutput: add support for binary properties in JSON.
This introduces a mechanism for encoding binary data in
strings set via setProperty(). This is needed to accommodate compressed
data as used by TemplateData, which uses gzip compression to make the
data fit into the page_props table.
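
JSON strings must be valid UTF-8, which compressed data generally is not, so some wrapping is needed. A minimal sketch of such a mechanism follows; the wrapper keys `_encoding_` and `_data_` and both function names are hypothetical, since the actual marker format is not described in this message.

```php
<?php
declare(strict_types=1);

// Wrap values that are not valid UTF-8 (hypothetical wrapper keys).
function encodeProperty(string $value) {
    // preg_match() with the /u modifier fails on invalid UTF-8 input.
    if (preg_match('//u', $value) === 1) {
        return $value; // safe to store as a plain JSON string
    }
    return ['_encoding_' => 'base64', '_data_' => base64_encode($value)];
}

// Reverse of encodeProperty(): unwrap base64 payloads, pass strings through.
function decodeProperty($stored): string {
    if (is_array($stored) && ($stored['_encoding_'] ?? null) === 'base64') {
        return base64_decode($stored['_data_'], true);
    }
    return $stored;
}

// A few bytes resembling a gzip header; not valid UTF-8.
$binary = "\x1f\x8b\x08\x00\xff\xfe";

var_dump(json_encode($binary)); // bool(false): raw binary cannot be encoded

$json = json_encode(encodeProperty($binary));
$restored = decodeProperty(json_decode($json, true));
var_dump($restored === $binary); // bool(true)
```

Plain text values pass through unchanged, so only properties that actually contain binary data pay the base64 size overhead.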

Bug: T266200
Change-Id: I19fa0dea8c25d93fcdec9dc5ddd6f3c9c162b621
2020-11-04 18:52:09 +01:00
Petr Pchelko
8a879605d9 Add deserialization acceptance tests for ParserOutput
Bug: T264397
Change-Id: I6476fd9b8eff0e1b61ce5f43280d1cd9b7aaa77c
2020-10-12 08:55:32 +00:00
daniel
6eea7d7ed5 Add test infra for ParserCache serialization/deserialization
Based on Daniel's work at Ia6e70179b7ee5ce4e93888585ccc30d92da165c3;
however, it was changed enough to move into a separate changeset.

More acceptance tests and data will be added in a followup commit.

Bug: T264397
Change-Id: I135187e83cbfa02b97c5656f0752f8bf1ceb58d0
2020-10-09 08:14:57 -06:00