Commit graph

332 commits

Author SHA1 Message Date
James D. Forrester
70d240cfba Drop CoreParserFunctions::mwnamespace(), deprecated in 1.39
Change-Id: Ib1e3f34724e247a6d9bbe4c9415fcff073bea9e8
2023-12-01 13:57:36 -05:00
James D. Forrester
468e69bccc Namespace Sanitizer under \MediaWiki\Parser
Bug: T166010
Change-Id: Id13dcbf7a0372017495958dbc4f601f40c122508
2023-09-21 05:39:23 +00:00
James D. Forrester
1d0b7ae1e2 Namespace User under \MediaWiki\User
Bug: T166010
Change-Id: I7257302b485588af31384d4f7fc8e30551f161f1
2023-09-19 19:18:16 +00:00
James D. Forrester
459cbb0494 Namespace remaining 'specialpage' files under \MediaWiki\SpecialPage
SpecialPageFactory is already here, but none of the others were yet.

Bug: T166010
Change-Id: I9689bf0a1ab329625e23669b99f019b96295fffd
2023-09-18 18:23:13 +01:00
Tim Starling
6790bf9910 Remove $wgLang usage from Title
StubUserLang was meant to avoid the cost of looking up the user
preferences on requests which don't need it. There's no point in using
it if you are going to unconditionally call a method on the resulting
object.

StubUserLang proxies to RequestContext::getLanguage() via __call(),
which has a cost. Originally this cost was avoided on subsequent calls
by overwriting $wgLang, but this mechanism is not effective if you retain
a reference to the StubUserLang.

Removing the potential for Title::getPageLanguage() to return
StubUserLang simplifies the type declarations for methods that call it.

Bug: T160814
Change-Id: I12ad75c2496ca727580aac55e860178d15febb6e
2023-07-11 11:15:02 +10:00
Daimona Eaytoy
518a5da533 Replace deprecated MWException
Bug: T328220
Change-Id: I0408575ee71e58d1c9e9ebedabab35bd3813f515
2023-06-12 12:27:49 +00:00
Amir Sarabadani
e182010622 Reorg: Move SiteStats*.php to SiteStats/
It's going to be a bit small and narrow, but it's better than sitting in
the root of includes/. I also hope we can hollow out the SiteStatsUpdate
class into the third one and/or move this under a better directory in
the future.

Bug: T321882
Change-Id: Ia503b53b31ca00600f8c18b61a2652c3e146494e
2023-04-27 01:16:29 +02:00
James D. Forrester
ad06527fb4 Reorg: Namespace the Title class
This is moderately messy.

Process was principally:

* xargs rg --files-with-matches '^use Title;' | grep 'php$' | \
  xargs -P 1 -n 1 sed -i -z 's/use Title;/use MediaWiki\\Title\\Title;/1'
* rg --files-without-match 'MediaWiki\\Title\\Title;' . | grep 'php$' | \
  xargs rg --files-with-matches 'Title\b' | \
  xargs -P 1 -n 1 sed -i -z 's/\nuse /\nuse MediaWiki\\Title\\Title;\nuse /1'
* composer fix

Then manual fix-ups for a few files that don't have any use statements.

Bug: T166010
Follows-Up: Ia5d8cb759dc3bc9e9bbe217d0fb109e2f8c4101a
Change-Id: If8fc9d0d95fc1a114021e282a706fc3e7da3524b
2023-03-02 08:46:53 -05:00
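The two sed rewrites listed in ad06527fb4 can be tried on scratch files. This is only a sketch: the file names `a.php` and `b.php` and their contents are invented for illustration, the `rg` file-selection steps are left out, and GNU sed is assumed because `-z` and suffix-less `-i` are GNU extensions.

```shell
#!/bin/sh
# Hedged demo: a.php, b.php and their contents are made up for illustration.
# Assumes GNU sed (-z and in-place -i without a suffix are GNU extensions).
set -eu
tmp=$(mktemp -d)

# Case 1: the file already imports the global Title class.
printf '<?php\nuse Title;\nuse User;\n' > "$tmp/a.php"
# -z reads the whole file as a single record, so the trailing "1" flag
# replaces only the first match in the file.
sed -i -z 's/use Title;/use MediaWiki\\Title\\Title;/1' "$tmp/a.php"

# Case 2: the file uses Title but has no import for it.
printf '<?php\n\nuse User;\nuse Wiki;\n\nTitle::newFromText( "x" );\n' > "$tmp/b.php"
# Put the new import in front of the first existing "use" line only.
sed -i -z 's/\nuse /\nuse MediaWiki\\Title\\Title;\nuse /1' "$tmp/b.php"

cat "$tmp/a.php" "$tmp/b.php"
```

In the second case the new import is simply placed before the first existing `use` line, so it may land out of alphabetical order; that is presumably what the `composer fix` step tidies up. A file with no `use` statements at all has nothing for the second pattern to match, which fits the manual fix-ups the message mentions.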
jenkins-bot
2bf924b451 Merge "parser: Use the actual revision timestamp of the page for messages"
2023-02-24 04:58:16 +00:00
jenkins-bot
548ede7d7b Merge "CoreMagicVariables/CoreParserFunction: unify revisionid"
2023-02-24 04:58:07 +00:00
jenkins-bot
7d697a6dea Merge "CoreMagicVariables/CoreParserFunction: unify revisionuser"
2023-02-24 04:25:22 +00:00
Func
4549aa6c5f parser: Use the actual revision timestamp of the page for messages
This patch aims to fix T320338#8334752: any message parsing that
happens when (pre)viewing a page should use the actual timestamp of
the last saved revision. The title is the same as the current page
only for context.

For the case reported in T320338#8334752, edit notice should indicate
the time since the last edit properly.

Bug: T320338
Change-Id: I2a5d44087e74efa4a7e2bb1b75036506e8e3823a
2023-02-24 10:42:14 +08:00
jenkins-bot
9f2e36641c Merge "Reorg: Move category-related classes from includes/ to Category/"
2023-02-09 23:20:40 +00:00
Amir Sarabadani
c8116223b4 Reorg: Move category-related classes from includes/ to Category/
Bug: T321882
Change-Id: I0b86acfdeaa3a2a0a14b7763fd088122820bafdc
2023-02-09 20:18:54 +01:00
Umherirrender
ed169d991e Remove unused arguments to private functions
Found by phan dead detection

Change-Id: I93379b7b9a733206d0e53add04fcdb9478c58755
2023-02-08 19:00:47 +00:00
jenkins-bot
6b66390f82 Merge "Add Parser::msg() helper for messages from extensions or parser functions"
2023-01-24 20:37:39 +00:00
C. Scott Ananian
8593d538a2 Add Parser::msg() helper for messages from extensions or parser functions
Using a standard wfMessage(...)->escaped() in the context of a parser
function or tag hook is wrong, because it'll use the current user's
language and pollute the cache. In reviewing MediaWiki extensions a
lot of them do this wrong.

This provides a helper to make it more likely that folks will get this
right.

Bug: T202481
Change-Id: Ic6dfd111c173b97a959c27e18fa83cc572326fa6
2023-01-23 14:04:27 -05:00
jenkins-bot
c903124b77 Merge "Make use of ?:, ?? and ??= operators in mostly trivial cases"
2022-12-16 02:51:26 +00:00
Amir Sarabadani
a1b4699fea Reorg: Move MagicWord related files to under parser/
This is approved as part of T166010 RFC.

Bug: T321882
Change-Id: Ia4498c0a20e38a6a288dc14065ea8242c84fbc49
2022-12-09 13:48:35 +01:00
thiemowmde
70aa9c8e35 Make use of ?:, ?? and ??= operators in mostly trivial cases
The motivation is to make the code less confusing. I hope this is the
case.

?? is an older PHP 7.0 feature.
??= was added in PHP 7.4, which we can finally use.

Change-Id: Id807affa52bd1151a74c064623b41d950a389560
2022-12-05 21:37:13 +01:00
C. Scott Ananian
24d69ef952 Deprecate Parser::getFunctionLang()
This is identical to Parser::getTargetLanguage() in modern MediaWiki,
since 7df3473cfe in MW 1.19 (2012).

Bug: T318860
Depends-On: If5fa696e27e84a3aa1343551d7482c933da0a9b6
Depends-On: I87a7ceedce173f6de4bb6722ffe594273c7b0359
Change-Id: Ieed03003095656e69b8e64ed307c6bd67c45c1e7
2022-11-16 16:47:16 -05:00
Amir Sarabadani
0fff5089ba Reorg: Move StubObject classes in includes to its own directory
Bug: T166010
Change-Id: Idcf0e9dc6e0841e4f132207bce0f96774dad898c
2022-10-25 16:04:48 -04:00
Tim Starling
0077c5da15 Use short array destructuring instead of list()
Introduced in PHP 7.1. Because it's shorter and looks nice.

I used regex replacement.

Change-Id: I0555e199d126cd44501f859cb4589f8bd49694da
2022-10-21 15:33:37 +11:00
Func
89cb5c3e5a parser: Make the behavior of REVISIONTIMESTAMP consistent
The second parameter to userAdjust() should be supplied with empty
string, which tells it to use the site-default timezone offset
instead of the user settings.
See also the $parser->getRevisionTimestamp() and another use of
userAdjust() above.

Bug: T320338
Change-Id: I05fc1e00f2b954ad04141072967b09ebec146f3e
2022-10-09 17:54:41 +08:00
Daimona Eaytoy
947ff7c0f5 build: Update mediawiki/mediawiki-phan-config to 0.12.0
This patch only adds and removes suppressions, which must be done in the
same patch as the version bump.

Bug: T298571
Change-Id: I4044d4d9ce82b3dae7ba0af85bf04f22cb1dd347
2022-10-08 15:45:42 +02:00
C. Scott Ananian
d8e519987d CoreMagicVariables/CoreParserFunction: unify revisionid
Reduce code duplication by having the "magic variable" implementation
of `revisionid` invoke the corresponding "parser function"
implementation with no arguments.  This reduces code duplication and
also supports consistent results from direct invocation of the parser
function with no arguments in the future.

Bug: T204370
Change-Id: I2dc4799559f440511b4584a73513c17d5b0f1ff0
2022-09-22 13:05:19 -04:00
C. Scott Ananian
2da6deaba4 CoreMagicVariables/CoreParserFunction: unify revisionuser
Reduce code duplication by having the "magic variable" implementation
of `revisionuser` invoke the corresponding "parser function"
implementation with no arguments.  This reduces code duplication and
also supports consistent results from direct invocation of the parser
function with no arguments in the future.

Bug: T204370
Change-Id: I4200e4f05a8987b509349832d6d0b164acbe0dd8
2022-09-22 13:05:19 -04:00
C. Scott Ananian
5cb4693772 Unify no-arg and 1-arg forms of {{REVISIONTIMESTAMP}} and friends
Eliminate a difference between the magic variable (no-arg) and
parser function (1-arg) forms; aka the difference between
{{REVISIONTIMESTAMP}} and {{REVISIONTIMESTAMP:{{PAGENAME}}}}.

This is a follow up to I8d25755e4d92bd91988cfb706d85bdb170abb207.

The magic variable contains a MAX_TTS optimisation which reduces
the use of vary-revision-timestamp, since it has severe performance
implications; this patch applies the same optimisation to the
parser function.

The ParserTestRunner has a small issue with test setup:
ParserOptions::setTimestamp() was called with a unix-format timestamp,
where it expected a TS_MW format timestamp.  This issue was
fixed, along with tweaking the test timestamps so that a timestamp
coming from ParserOptions would still be distinguishable from one
coming from the revision.

Bug: T204370
Change-Id: I883d42d67013b6fb0da57c61e715b51d3a807879
2022-09-21 17:02:22 -04:00
C. Scott Ananian
d08e0cdf20 CoreMagicVariables/CoreParserFunctions: unify revisiontimestamp & etc
Reduce code duplication by having the "magic variable" implementations
of:

  revisionday, revisionday2, revisionmonth, revisionmonth1,
  revisionyear, revisiontimestamp

invoke the corresponding "parser function" implementations with no
arguments.  This reduces code duplication and also supports consistent
results from direct invocation of the parser function with no
arguments in the future.

Bug: T204370
Change-Id: I8d25755e4d92bd91988cfb706d85bdb170abb207
2022-09-21 16:58:02 -04:00
C. Scott Ananian
9a37dbda6d Unify the "magic variable" and "parser function" form of several built-ins
The following magic variables also have "parser function" forms, where
the first argument is a user-supplied title:

  pagename, pagenamee, fullpagename, fullpagenamee, subpagename,
  subpagenamee, rootpagename, rootpagenamee, basepagename,
  basepagenamee, talkpagename, talkpagenamee, subjectpagename,
  subjectpagenamee, pageid, cascadingsources, namespace, namespacee,
  namespacenumber, talkspace, talkspacee, subjectspace, subjectspacee

Refactor the code so that the magic variable form invokes the parser
function form with no arguments to reduce code duplication.  We also
tweak the behavior of parser function when invoked with no arguments,
although this change will not be directly visible because the parser
always prefers magic variables over parser functions, so the parser
function is never actually invoked with no arguments.

A future patch may allow the parser function to be invoked with a
hash prefix (I895087c546dc820c77c0dda596dfeb72586b87cc) in which case
consistency will be more important.

Note that `revisionuser`, `revisionid`, `revisionday`, `revisionday2`,
`revisionmonth`, `revisionmonth1`, `revisionyear` and
`revisiontimestamp` are also of a similar form and could be included
in this list, but their magic variable and parser function
implementations do not appear to be consistent.  This will be
addressed in future patches.

In addition, the following magic variables have a "parser function" form
where the presence or absence of the first argument selects "raw" output:

  numberofarticles, numberoffiles, numberofusers, numberofactiveusers,
  numberofpages, numberofadmins, numberofedits

Similar to above, refactor the code so that the magic variable form
invokes the parser function form with no arguments to reduce code
duplication (and to support future direct invocation of the parser
function with no arguments).

Bug: T204370
Change-Id: Iaec33fb40a2d9884daf2852ed6a6a3b53c9d3863
2022-09-19 22:11:58 -04:00
Umherirrender
8509316946 parser: Remove Title::canHaveTalkPage check from fullpagename
The check is only relevant when calling Title::getTalkPage/getTalkNsText

The check has existed since the addition of the functions in a4fafb0

Bug: T317582
Change-Id: I2e36fd963b2f943ed67a93c2573008b2d1fb094b
2022-09-16 22:48:53 +02:00
Subramanya Sastry
98b3ddd7c7 Added Parsoid support for nowiki stripping in args of {{#tag:ext|...}}
* This patch relies on extensions setting a flag in their Parsoid ext.
  config indicating that a specific tag handler needs nowikis stripped
  from #tag arguments.

  In the #tag parser function implementation, Parsoid's SiteConfig is
  looked up to see if nowiki needs to be stripped.

* This need not be limited to nowikis, but to support extension use in
  {{#tag:ext|...}} more generally, we would need to either
  (a) implement the #tag parser function in Parsoid natively; OR
  (b) find a way to call Parsoid from extensionSubstitution

  Soln (a) needs Parsoid to support parser functions natively.

  If this general support becomes necessary, a later patch can
  generalize this appropriately.

Bug: T272939
Bug: T299103
Depends-On: I6a653889afd42fefb61daefd8ac842107dce8759
Depends-On: I56043e0cb7d355a3f0d08e429bb1dbba6acb4fba
Change-Id: I614153af67b5a14f33b7dfc04bd00dd9e03557d0
2022-08-20 20:56:54 -05:00
C. Scott Ananian
8bdb5d4a05 Rename CoreParserFunctions::mwnamespace function to ::namespace
This corresponds to the `namespace` parser function, but between PHP 5.3
and PHP 7, `namespace` was a reserved name that couldn't be used as a
function name.  It was made "semi-reserved" by the PHP 7 context-sensitive
lexer, and MW currently requires PHP >= 7.3.19.

Change-Id: If8a1401c38b9140bb40a3381845a0d115546422a
2022-08-04 14:38:03 -04:00
Brian Wolff
be0ad61b10 Make default value for optional args {{PAGESINCAT:..}} be '' not null
PHP 8.1 doesn't like passing nulls around, and in context the empty
string makes more sense anyways as the default value for unspecified
options.

Bug: T313663
Bug: T313662
Change-Id: Ica9460716129481f9cb1ebff3b660d2d1bb15f55
2022-07-28 10:35:44 -07:00
Reedy
9d61b6de78 parser: Fix CoreParserFunctions::urlencode() null coalescence $arg
Bug: T312680
Change-Id: I6fc3a08390f0c6d2d2c4c4dd79df52ab91e6c1f3
2022-07-10 01:08:07 +00:00
Matěj Suchánek
e47c441078 Fix many typos in comments
Found using IntelliJ's "Typo" code inspection.

Change-Id: I746220ebe6e1e39f6cb503390ec9053e6518cf16
2022-05-10 12:46:11 +00:00
Aryeh Gregor
7b791474a5 Use MainConfigNames instead of string literals, #4
Now largely automated:

VARS=$(grep -o "'[A-Za-z0-9_]*'" includes/MainConfigNames.php | \
  tr "\n" '|' | sed "s/|$/\n/;s/'//g")
sed -i -E "s/'($VARS)'/MainConfigNames::\1/g" \
  $(grep -ERIl "'($VARS)'" includes/)

Then git add -p with lots of error-prone manual checking. Then
semi-manually add all the necessary "use" lines:

vim $(grep -L 'use MediaWiki\\MainConfigNames;' \
  $(git diff --cached --name-only --diff-filter=M HEAD^))

I didn't bother fixing lines that were over 100 characters unless they
were over 120 and triggered phpcs.

Bug: T305805
Change-Id: I74e0ab511abecb276717ad4276a124760a268147
2022-04-26 19:03:37 +03:00
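The pipeline in 7b791474a5 can be followed step by step on scratch files. This is a sketch under stated assumptions: the two constants, the `Foo.php` file and its contents are invented, GNU grep and GNU sed are assumed, and the stand-in definitions file is kept outside the searched directory so the rewrite does not touch the definitions themselves (the original relied on `git add -p` review for that kind of filtering).

```shell
#!/bin/sh
# Hedged demo on scratch files; the two constants and Foo.php are made up.
# Assumes GNU grep and GNU sed.
set -eu
tmp=$(mktemp -d)
mkdir "$tmp/includes"

# Stand-in for includes/MainConfigNames.php, kept outside the searched
# directory so the rewrite below does not touch the definitions themselves.
cat > "$tmp/MainConfigNames.php" <<'EOF'
public const Sitename = 'Sitename';
public const Server = 'Server';
EOF

cat > "$tmp/includes/Foo.php" <<'EOF'
$name = $config->get( 'Sitename' );
$other = 'Unrelated';
EOF

# Build a regex alternation of every quoted name: Sitename|Server
VARS=$(grep -o "'[A-Za-z0-9_]*'" "$tmp/MainConfigNames.php" | \
  tr "\n" '|' | sed "s/|$/\n/;s/'//g")

# Swap each matching string literal for the class constant.
sed -i -E "s/'($VARS)'/MainConfigNames::\1/g" \
  $(grep -ERIl "'($VARS)'" "$tmp/includes/")

# List rewritten files that still lack the import (grep -L prints
# files without a match; its exit status is ignored here).
missing=$(grep -L 'use MediaWiki\\MainConfigNames;' "$tmp/includes/Foo.php" || true)

cat "$tmp/includes/Foo.php"
echo "needs import: $missing"
```

Only string literals whose whole content equals a known config name are rewritten, so `'Unrelated'` is left alone. A literal that happens to equal a config name but means something else would still be rewritten, which is why the message stresses manual checking.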
Umherirrender
e3a741661a Remove usage of protection related deprecated Title function
Bug: T306131
Change-Id: I487a12a88ae82c367d1cbb2e52083fe20b27d4ce
2022-04-14 21:42:55 +00:00
Func
0d95d817f8 CoreParserFunctions: Use Parser::getTargetLanguageConverter()
Follow-up 468721ab, the getTargetLanguageConverter() method is made
public in that commit.

Change-Id: I94d472111d8f554c30ee396081745eb1c5c09f98
2022-03-31 22:46:19 +08:00
Umherirrender
6dd8a2bb32 phan: Disable scalar_implicit_cast setting
Make phan stricter about scalar types by setting scalar_implicit_cast to
false (the default in mediawiki-phan-config)

Bug: T242536
Bug: T301991
Change-Id: Ia2fe30b17804186571722e728578121c8b75d455
2022-03-18 18:52:24 +00:00
Umherirrender
3fdf000bbf parser: Simplify CoreParserFunctions::formatRaw for phan
Allow the static code analyzer to understand that the factory is always
set by using only one outer if for $raw

Found by phan strict checks

Change-Id: I644f03f08fdb1b23a3074c603d00e2aa863ae8c0
2022-03-09 20:21:53 +01:00
C. Scott Ananian
822020da6f Deprecate Parser::getDefaultSort(), ::setDefaultSort(), ::getCustomDefaultSort()
In modern MediaWiki these methods are just wrappers for the 'defaultsort'
page property, and don't need a parser property of their own.

Change-Id: I18bdffd4d6565733fb52cbff409cc25d49a76b65
2022-03-08 15:08:02 -05:00
C. Scott Ananian
a5e96d9d66 Add inline taint information for Sanitizer::remove*Tags()
This moves the taint information to be directly on the method,
moving it out of the SecurityCheckPlugin.  See discussion on
Ieb202ef92bd9888ce767f8dd4d97f19eeb10a073.

We also fix a legit "double-escape" issue flagged by the phan
SecurityCheckPlugin once the correct taint information has been
added.

Followup-To: Ic864c01471c292f11799c4fbdac4d7d30b8bc50f
Change-Id: I0f873618d43cb6daf9c43394a669125469462223
2022-03-07 16:50:58 -05:00
C. Scott Ananian
9f14fbd002 Add Sanitizer::removeSomeTags() which uses Remex to tokenize
The existing Sanitizer::removeHTMLtags() method, in addition to having
dodgy capitalization, uses regular expressions to parse the HTML.
That produces corner cases like T298401 and T67747 and is not guaranteed
to yield balanced or well-formed HTML.

Instead, introduce and use a new Sanitizer::removeSomeTags() method
which is guaranteed to always return balanced and well-formed HTML.

Note that Sanitizer::removeHTMLtags()/::removeSomeTags() take a callback
argument which (as far as I can tell) is never used outside core. Mark
that argument as @internal, and clean up the version used by
::removeSomeTags().

Use the new ::removeSomeTags() method in the two places where
DISPLAYTITLE is handled (following up on T67747).  The use by the
legacy parser is more difficult to replace (and would have a
performance cost), so leave the old ::removeHTMLtags() method in place
for that call site for now: when the legacy parser is replaced by
Parsoid the need for the old ::removeHTMLtags() will go away.  In a
follow-up patch we'll rename ::removeHTMLtags() and mark it @internal
so that we can deprecate ::removeHTMLtags() for external use.

Some benchmarking code added.  On my machine, with PHP 7.4, the new
method tidies short 30-character title strings at a rate of about
6764/s while the tidy-based method being replaced here managed 6384/s.
Sanitizer::removeHTMLtags blazes through short strings 20x faster
(120,915/s); some of this difference is due to the set up cost of
creating the tag whitelist and the Remex pipeline, so further
optimizations could doubtless be done if Sanitizer::removeSomeTags()
is more widely used.

Bug: T299722
Bug: T67747
Change-Id: Ic864c01471c292f11799c4fbdac4d7d30b8bc50f
2022-03-04 14:06:02 -05:00
Umherirrender
9efd9ca45e Add explicit casts between scalar types
* Some functions accept only string, cast ints and floats to string
* Cast numbers obtained from preg_match() or explode() to int to do maths
* Cast unix timestamps to int to do maths
* Cast return values from timestamp format function to int
* Cast bitwise operator to bool when needed as bool

* php internal functions like floor/round/ceil documented to return
  float, most cases the result is used as int, added casts

Found by phan strict checks

Change-Id: Icb2de32107f43817acc45fe296fb77acf65c1786
2022-03-01 18:19:33 +01:00
C. Scott Ananian
c39ef6c6c9 Change return value of ParserOutput::getPageProperty() when property is missing
The old ParserOutput::getProperty() method returned `false` when a property
was missing.  This requires callers to use the `?:` syntax to supply default
values, which then causes any falsey value to be treated as missing.
So, for example, setting the defaultsort to '0' will cause the default
sort to be ignored.

Modern php convention is to use `null` for missing values, and the `??`
syntax is a better/more restrictive alternative to `?:`.

We renamed `ParserOutput::getProperty()` to `::getPageProperty()` in
1.38 (Ie963eea5aa0f0e984ced7c4dfa0fd65d57313cfa/T287216) but kept the
return value convention.  Before this actually makes it into a 1.38
release, take the opportunity to fix the return value for the new
`ParserOutput::getPageProperty()` method to return `null` when the
property is missing.

We need to do some temporary workarounds to the places we'd
already swapped over to use the new `::getPageProperty()` method
to allow them to handle either `false` or `null` as a return value;
we'll clean that up once this is merged.

Code search:
https://codesearch.wmcloud.org/deployed/?q=-%3EgetPageProperty%5C%28|T301915&i=nope&files=&excludeFiles=&repos=

Bug: T301915
Depends-On: I3f11ce604970e47b41fc1c123792df8c3045626f
Depends-On: Ie7533f49fe4cad01ebfda29760d23c61e9867b10
Depends-On: Ic5c09f5caa4c897bc553c614fbae9cee159566a2
Depends-On: I0278b2eafd90e77e4fee41c45a1165fb79ddf47e
Depends-On: I383abb6b7dc5e96c0061af13957609f6e31a1065
Depends-On: I79f9f4078e415284af29b15047bafd1c823d7f5b
Depends-On: I02276c48c49f5d2d241a69eb0a6cdf439b572d8b
Depends-On: I71628661b4539a4e35ae32846e719f92bcf782e0
Depends-On: I7e215cb43de0ce150a6bcc00f92481dcdcfed383
Change-Id: Iaa25c390118d2db2b6578cdd558f2defd5351d15
2022-02-18 21:15:58 +00:00
jenkins-bot
66a9e60776 Merge "Add getMemberCount() to Category to supersede getPageCount()"
2022-02-02 03:47:38 +00:00
daniel
026133bb05 remove access to config globals from includes/parser
Loops ServiceOptions through to CoreParserFunctions and CoreTagHooks to
avoid access to the main config from static methods.

Bug: T294739
Change-Id: Ia6c97f2d0952964c2ad6189f8053ad127589b37c
2022-02-01 07:48:57 -08:00
Ammarpad
c7734cbb9f Add getMemberCount() to Category to supersede getPageCount()
The getPageCount() method returns `cat_pages`, a value that makes sense
in the database table but is currently non-intuitive in object context
where there's a value that better deserves the name. This makes it
necessary for callers to repeat same logic to get the content pages
count and a comment to explain the behavior.

In this patch, getMemberCount() is added. It returns the total
member count as getPageCount(), by default, does now.

getPageCount() now takes a parameter and two public constants are
provided for that; Category::COUNT_ALL_MEMBERS returns the count of all
members to retain existing behavior, Category::COUNT_CONTENT_PAGES
will return only content pages.

In future there'd be no need for the parameter. Content pages will
be returned always. Total member count is already accessible with
getMemberCount().

Also improve return type doc of getId() and getName()

Bug: T299350
Change-Id: I63c711ebc697c1a131a50910c854f956d4021254
2022-01-31 13:59:34 +01:00
Tim Starling
c5ef6e3091 PHP 8.1: add ENT_COMPAT to some htmlspecialchars() calls
In PHP 8.1 the default $flags argument to htmlspecialchars() has changed
from ENT_COMPAT to ENT_QUOTES | ENT_SUBSTITUTE | ENT_HTML401. This
breaks some tests.

I changed all the calls that break unit tests, and some others
based on a quick code review. A lot of callers just use the default for
convenience, and were already over-quoting, so the default should still
be good enough for them.

Change-Id: Ie9fbeae6f0417c6cf29dceaf429243a135f9fecb
2022-01-25 16:30:44 +11:00