2004-02-26 13:37:26 +00:00
|
|
|
|
<?php
|
2004-09-02 23:28:24 +00:00
|
|
|
|
/**
|
2012-04-30 09:22:16 +00:00
|
|
|
|
* PHP parser that converts wiki markup to HTML.
|
|
|
|
|
|
*
|
|
|
|
|
|
* This program is free software; you can redistribute it and/or modify
|
|
|
|
|
|
* it under the terms of the GNU General Public License as published by
|
|
|
|
|
|
* the Free Software Foundation; either version 2 of the License, or
|
|
|
|
|
|
* (at your option) any later version.
|
|
|
|
|
|
*
|
|
|
|
|
|
* This program is distributed in the hope that it will be useful,
|
|
|
|
|
|
* but WITHOUT ANY WARRANTY; without even the implied warranty of
|
|
|
|
|
|
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
|
|
|
|
|
|
* GNU General Public License for more details.
|
|
|
|
|
|
*
|
|
|
|
|
|
* You should have received a copy of the GNU General Public License along
|
|
|
|
|
|
* with this program; if not, write to the Free Software Foundation, Inc.,
|
|
|
|
|
|
* 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301, USA.
|
|
|
|
|
|
* http://www.gnu.org/copyleft/gpl.html
|
2007-05-01 23:41:44 +00:00
|
|
|
|
*
|
WARNING: HUGE COMMIT
Doxygen documentation update:
* Changed all @addtogroup to @ingroup. @addtogroup adds the comment to the group description, but doesn't add the file, class, function, ... to the group like @ingroup does. See for example http://svn.wikimedia.org/doc/group__SpecialPage.html where it's impossible to see related files, classes, ... that should belong to that group.
* Added @file to file description, it seems that it should be explicitly declared for file descriptions, otherwise doxygen will think that the comment documents the first class, variable, function, ... that is in that file.
* Removed some empty comments
* Removed some ?>
Added following groups:
* ExternalStorage
* JobQueue
* MaintenanceLanguage
One more thing: there are still a lot of warnings when generating the doc.
2008-05-20 17:13:28 +00:00
|
|
|
|
* @file
|
|
|
|
|
|
* @ingroup Parser
|
2004-09-02 23:28:24 +00:00
|
|
|
|
*/
|
2021-08-01 15:11:23 +00:00
|
|
|
|
|
2019-08-18 18:19:05 +00:00
|
|
|
|
use MediaWiki\BadFileLookup;
|
2021-04-25 17:29:33 +00:00
|
|
|
|
use MediaWiki\Cache\CacheKeyHelper;
|
2019-04-12 09:50:30 +00:00
|
|
|
|
use MediaWiki\Config\ServiceOptions;
|
Hooks::run() call site migration
Migrate all callers of Hooks::run() to use the new
HookContainer/HookRunner system.
General principles:
* Use DI if it is already used. We're not changing the way state is
managed in this patch.
* HookContainer is always injected, not HookRunner. HookContainer
is a service, it's a more generic interface, it is the only
thing that provides isRegistered() which is needed in some cases,
and a HookRunner can be efficiently constructed from it
(confirmed by benchmark). Because HookContainer is needed
for object construction, it is also needed by all factories.
* "Ask your friendly local base class". Big hierarchies like
SpecialPage and ApiBase have getHookContainer() and getHookRunner()
methods in the base class, and classes that extend that base class
are not expected to know or care where the base class gets its
HookContainer from.
* ProtectedHookAccessorTrait provides protected getHookContainer() and
getHookRunner() methods, getting them from the global service
container. The point of this is to ease migration to DI by ensuring
that call sites ask their local friendly base class rather than
getting a HookRunner from the service container directly.
* Private $this->hookRunner. In some smaller classes where accessor
methods did not seem warranted, there is a private HookRunner property
which is accessed directly. Very rarely (two cases), there is a
protected property, for consistency with code that conventionally
assumes protected=private, but in cases where the class might actually
be overridden, a protected accessor is preferred over a protected
property.
* The last resort: Hooks::runner(). Mostly for static, file-scope and
global code. In a few cases it was used for objects with broken
construction schemes, out of horror or laziness.
Constructors with new required arguments:
* AuthManager
* BadFileLookup
* BlockManager
* ClassicInterwikiLookup
* ContentHandlerFactory
* ContentSecurityPolicy
* DefaultOptionsManager
* DerivedPageDataUpdater
* FullSearchResultWidget
* HtmlCacheUpdater
* LanguageFactory
* LanguageNameUtils
* LinkRenderer
* LinkRendererFactory
* LocalisationCache
* MagicWordFactory
* MessageCache
* NamespaceInfo
* PageEditStash
* PageHandlerFactory
* PageUpdater
* ParserFactory
* PermissionManager
* RevisionStore
* RevisionStoreFactory
* SearchEngineConfig
* SearchEngineFactory
* SearchFormWidget
* SearchNearMatcher
* SessionBackend
* SpecialPageFactory
* UserNameUtils
* UserOptionsManager
* WatchedItemQueryService
* WatchedItemStore
Constructors with new optional arguments:
* DefaultPreferencesFactory
* Language
* LinkHolderArray
* MovePage
* Parser
* ParserCache
* PasswordReset
* Router
setHookContainer() now required after construction:
* AuthenticationProvider
* ResourceLoaderModule
* SearchEngine
Change-Id: Id442b0dbe43aba84bd5cf801d86dedc768b082c7
2020-03-19 02:42:09 +00:00
|
|
|
|
use MediaWiki\HookContainer\HookContainer;
|
|
|
|
|
|
use MediaWiki\HookContainer\HookRunner;
|
2021-08-04 12:56:30 +00:00
|
|
|
|
use MediaWiki\Http\HttpRequestFactory;
|
2022-12-12 02:10:13 +00:00
|
|
|
|
use MediaWiki\Language\RawMessage;
|
2020-02-04 12:42:03 +00:00
|
|
|
|
use MediaWiki\Languages\LanguageConverterFactory;
|
2020-01-03 23:03:14 +00:00
|
|
|
|
use MediaWiki\Languages\LanguageNameUtils;
|
2022-12-05 11:29:37 +00:00
|
|
|
|
use MediaWiki\Linker\Linker;
|
2016-05-13 00:37:17 +00:00
|
|
|
|
use MediaWiki\Linker\LinkRenderer;
|
2018-08-08 14:57:31 +00:00
|
|
|
|
use MediaWiki\Linker\LinkRendererFactory;
|
2019-04-15 12:47:32 +00:00
|
|
|
|
use MediaWiki\Linker\LinkTarget;
|
2022-04-10 15:34:45 +00:00
|
|
|
|
use MediaWiki\MainConfigNames;
|
2016-05-13 00:37:17 +00:00
|
|
|
|
use MediaWiki\MediaWikiServices;
|
2021-04-25 17:29:33 +00:00
|
|
|
|
use MediaWiki\Page\PageIdentity;
|
|
|
|
|
|
use MediaWiki\Page\PageReference;
|
2022-12-09 12:28:41 +00:00
|
|
|
|
use MediaWiki\Parser\MagicWordArray;
|
|
|
|
|
|
use MediaWiki\Parser\MagicWordFactory;
|
Add new ParserOutput::{get,set}OutputFlag() interface
This is a uniform mechanism to access a number of bespoke boolean
flags in ParserOutput. It allows extensibility in core (by adding new
field names to ParserOutputFlags) without exposing new getter/setter
methods to Parsoid. It replaces the ParserOutput::{get,set}Flag()
interface which (a) doesn't allow access to certain flags, and (b) is
typically called with a string rather than a constant, and (c) has a
very generic name. (Note that Parser::setOutputFlag() already called
these "output flags".)
In the future we might unify the representation so that we store
everything in $mFlags and don't have explicit properties in
ParserOutput, but those representation details should be invisible to
the clients of this API. (We might also use a proper enumeration
for ParserOutputFlags, when PHP supports this.)
There is some overlap with ParserOutput::{get,set}ExtensionData(), but
I've left those methods as-is because (a) they allow for non-boolean
data, unlike the *Flag() methods, and (b) it seems worthwhile to
distinguish properties set by extensions from properties used by core.
Code search:
https://codesearch.wmcloud.org/search/?q=%5BOo%5Dut%28put%29%3F%28%5C%28%5C%29%29%3F-%3E%28g%7Cs%29etFlag%5C%28&i=nope&files=&excludeFiles=&repos=
Bug: T292868
Change-Id: I39bc58d207836df6f328c54be9e3330719cebbeb
2021-10-08 20:04:37 +00:00
|
|
|
|
use MediaWiki\Parser\ParserOutputFlags;
|
2022-04-07 23:52:05 +00:00
|
|
|
|
use MediaWiki\Preferences\SignatureValidatorFactory;
|
2022-10-28 10:04:25 +00:00
|
|
|
|
use MediaWiki\Request\FauxRequest;
|
2020-04-29 05:37:47 +00:00
|
|
|
|
use MediaWiki\Revision\RevisionAccessException;
|
2020-04-09 03:36:39 +00:00
|
|
|
|
use MediaWiki\Revision\RevisionRecord;
|
2020-06-03 03:48:42 +00:00
|
|
|
|
use MediaWiki\Revision\SlotRecord;
|
2020-02-21 00:01:43 +00:00
|
|
|
|
use MediaWiki\SpecialPage\SpecialPageFactory;
|
2022-10-25 16:58:49 +00:00
|
|
|
|
use MediaWiki\StubObject\StubUserLang;
|
2021-02-18 16:51:12 +00:00
|
|
|
|
use MediaWiki\Tidy\TidyDriverBase;
|
2021-03-16 18:31:27 +00:00
|
|
|
|
use MediaWiki\User\UserFactory;
|
|
|
|
|
|
use MediaWiki\User\UserIdentity;
|
2022-04-11 01:26:51 +00:00
|
|
|
|
use MediaWiki\User\UserNameUtils;
|
2021-03-16 18:31:27 +00:00
|
|
|
|
use MediaWiki\User\UserOptionsLookup;
|
2022-04-28 13:33:39 +00:00
|
|
|
|
use MediaWiki\Utils\UrlUtils;
|
2020-01-10 00:00:51 +00:00
|
|
|
|
use Psr\Log\LoggerInterface;
|
2019-06-25 18:53:15 +00:00
|
|
|
|
use Wikimedia\IPUtils;
|
2016-10-12 05:36:03 +00:00
|
|
|
|
use Wikimedia\ScopedCallback;
|
2004-09-02 23:28:24 +00:00
|
|
|
|
|
2012-04-30 09:22:16 +00:00
|
|
|
|
/**
|
|
|
|
|
|
* @defgroup Parser Parser
|
|
|
|
|
|
*/
|
2006-06-02 20:54:34 +00:00
|
|
|
|
|
2004-09-02 23:28:24 +00:00
|
|
|
|
/**
|
2008-04-14 07:45:50 +00:00
|
|
|
|
* PHP Parser - Processes wiki markup (which uses a more user-friendly
|
2007-04-04 05:22:37 +00:00
|
|
|
|
* syntax, such as "[[link]]" for making links), and provides a one-way
|
2013-05-10 04:04:33 +00:00
|
|
|
|
* transformation of that wiki markup into (X)HTML output / markup
|
2007-04-04 05:22:37 +00:00
|
|
|
|
* (which in turn the browser understands, and can display).
|
2004-09-21 05:49:12 +00:00
|
|
|
|
*
|
2012-07-10 15:18:20 +00:00
|
|
|
|
* There are seven main entry points into the Parser class:
|
|
|
|
|
|
*
|
|
|
|
|
|
* - Parser::parse()
|
2010-03-30 21:53:56 +00:00
|
|
|
|
* produces HTML output
|
2014-11-08 19:07:19 +00:00
|
|
|
|
* - Parser::preSaveTransform()
|
|
|
|
|
|
* produces altered wiki markup
|
2012-07-10 15:18:20 +00:00
|
|
|
|
* - Parser::preprocess()
|
2010-03-30 21:53:56 +00:00
|
|
|
|
* removes HTML comments and expands templates
|
2012-07-10 15:18:20 +00:00
|
|
|
|
* - Parser::cleanSig() and Parser::cleanSigInSig()
|
2014-11-08 19:07:19 +00:00
|
|
|
|
* cleans a signature before saving it to preferences
|
2012-07-10 15:18:20 +00:00
|
|
|
|
* - Parser::getSection()
|
2014-11-08 19:07:19 +00:00
|
|
|
|
* returns the content of a section from an article for section editing
|
2012-07-10 15:18:20 +00:00
|
|
|
|
* - Parser::replaceSection()
|
2014-11-08 19:07:19 +00:00
|
|
|
|
* replaces a section by number inside an article
|
2012-07-10 15:18:20 +00:00
|
|
|
|
* - Parser::getPreloadText()
|
2014-11-08 19:07:19 +00:00
|
|
|
|
* removes <noinclude> sections and <includeonly> tags
|
2004-09-21 05:49:12 +00:00
|
|
|
|
*
|
2012-07-10 15:18:20 +00:00
|
|
|
|
* @warning $wgUser or $wgTitle or $wgRequest or $wgLang. Keep them away!
|
2004-09-21 05:49:12 +00:00
|
|
|
|
*
|
2012-07-10 15:18:20 +00:00
|
|
|
|
* @par Settings:
|
|
|
|
|
|
* $wgNamespacesWithSubpages
|
2004-09-21 05:49:12 +00:00
|
|
|
|
*
|
2012-07-10 15:18:20 +00:00
|
|
|
|
* @par Settings only within ParserOptions:
|
|
|
|
|
|
* $wgAllowExternalImages
|
|
|
|
|
|
* $wgAllowSpecialInclusion
|
|
|
|
|
|
* $wgInterwikiMagic
|
|
|
|
|
|
* $wgMaxArticleSize
|
2004-09-21 05:49:12 +00:00
|
|
|
|
*
|
2008-05-20 17:13:28 +00:00
|
|
|
|
* @ingroup Parser
|
2004-09-02 23:28:24 +00:00
|
|
|
|
*/
|
2022-12-10 18:49:27 +00:00
|
|
|
|
#[AllowDynamicProperties]
|
2010-03-30 21:20:05 +00:00
|
|
|
|
class Parser {
|
2021-07-28 14:08:59 +00:00
|
|
|
|
|
2007-11-20 10:55:08 +00:00
|
|
|
|
# Flags for Parser::setFunctionHook
|
2020-05-15 22:40:17 +00:00
|
|
|
|
public const SFH_NO_HASH = 1;
|
|
|
|
|
|
public const SFH_OBJECT_ARGS = 2;
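The two SFH_* values above are distinct bits, so they can be combined with bitwise OR and tested with bitwise AND. The following is a minimal standalone sketch of that pattern; the constants are local copies of the values shown above, and `describeFunctionHookFlags()` is a hypothetical helper, not part of MediaWiki.

```php
<?php
// Local copies of the flag values shown above; in MediaWiki they are
// class constants on Parser.
const SFH_NO_HASH = 1;
const SFH_OBJECT_ARGS = 2;

/**
 * Hypothetical helper: report which options a flags bitmask enables.
 *
 * @param int $flags Bitwise OR of SFH_* values
 * @return array<string,bool>
 */
function describeFunctionHookFlags( int $flags ): array {
	return [
		'noHash' => (bool)( $flags & SFH_NO_HASH ),
		'objectArgs' => (bool)( $flags & SFH_OBJECT_ARGS ),
	];
}
```

A caller would pass something like `SFH_NO_HASH | SFH_OBJECT_ARGS` as the flags argument.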
|
2007-11-20 10:55:08 +00:00
|
|
|
|
|
|
|
|
|
|
# Constants needed for external link processing
|
|
|
|
|
|
# Everything except bracket, space, or control characters
|
2011-07-27 18:03:01 +00:00
|
|
|
|
# \p{Zs} is unicode 'separator, space' category. It covers the space 0x20
|
2016-12-11 22:45:07 +00:00
|
|
|
|
# as well as U+3000 (IDEOGRAPHIC SPACE), for T21052
|
Remove Preprocessor_DOM, deprecated in 1.34
Remove the deprecated Preprocessor_DOM class, which was hard-deprecated
in 1.34. This begins to simplify parser configuration and reduce redundant
code paths, but I've left two things for cleanup in a future patch:
1. The `preprocessorClass` configuration option to the parser, exposed
in `$wgParserConf`, ServiceWiring, ParserFactory, etc. There is no reason
for this to be exposed as configurable, but I've left this clean up to a
future patch.
2. The `$wgMaxGeneratedPPNodeCount` configuration, exposed also in
ParserOptions. Only Preprocessor_DOM calculated this count, and since
we are only using Preprocessor_Hash now, this configuration has no effect.
But since this value was exposed in ParserOptions and elsewhere, I've
deprecated where needed but left this clean up to a future patch.
Bug: T204945
Change-Id: I727f003f9a42d0c92bcbcce8a8289d5af6cd1298
2020-01-24 21:23:46 +00:00
|
|
|
|
# \x{FFFD} is the Unicode replacement character, which the HTML5 spec
|
2017-02-27 21:27:15 +00:00
|
|
|
|
# uses to replace invalid HTML characters.
|
2020-05-15 22:40:17 +00:00
|
|
|
|
public const EXT_LINK_URL_CLASS = '[^][<>"\\x00-\\x20\\x7F\p{Zs}\x{FFFD}]';
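To illustrate what the character class above accepts, here is a small standalone sketch. It assumes the class is used inside a pattern with the `/u` modifier, which `\p{Zs}` and `\x{FFFD}` require; `leadingUrlChars()` is a hypothetical helper written for this example only.

```php
<?php
// Copy of the character class shown above.
const EXT_LINK_URL_CLASS = '[^][<>"\\x00-\\x20\\x7F\p{Zs}\x{FFFD}]';

/**
 * Hypothetical helper: return the leading run of characters accepted by
 * the class, or '' if the first character is rejected.
 */
function leadingUrlChars( string $text ): string {
	// The /u modifier enables Unicode properties and code points above 0xFF.
	if ( preg_match( '/^' . EXT_LINK_URL_CLASS . '+/u', $text, $m ) ) {
		return $m[0];
	}
	return '';
}
```

An ASCII space, a bracket, or an ideographic space (U+3000) each end the run.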
|
2015-01-08 22:00:54 +00:00
|
|
|
|
# Simplified expression to match an IPv4 or IPv6 address, or
|
|
|
|
|
|
# at least one character of a host name (embeds EXT_LINK_URL_CLASS)
|
2020-03-30 17:37:25 +00:00
|
|
|
|
// phpcs:ignore Generic.Files.LineLength
|
|
|
|
|
|
private const EXT_LINK_ADDR = '(?:[0-9.]+|\\[(?i:[0-9a-f:.]+)\\]|[^][<>"\\x00-\\x20\\x7F\p{Zs}\x{FFFD}])';
|
2015-01-08 22:00:54 +00:00
|
|
|
|
# RegExp to make image URLs (embeds IPv6 part of EXT_LINK_ADDR)
|
2018-01-01 13:10:16 +00:00
|
|
|
|
// phpcs:ignore Generic.Files.LineLength
|
2020-03-30 17:37:25 +00:00
|
|
|
|
private const EXT_IMAGE_REGEX = '/^(http:\/\/|https:\/\/)((?:\\[(?i:[0-9a-f:.]+)\\])?[^][<>"\\x00-\\x20\\x7F\p{Zs}\x{FFFD}]+)
|
2011-07-27 18:03:01 +00:00
|
|
|
|
\\/([A-Za-z0-9_.,~%\\-+&;#*?!=()@\\x80-\\xFF]+)\\.((?i)gif|png|jpg|jpeg)$/Sxu';
|
2007-11-20 10:55:08 +00:00
|
|
|
|
|
2014-05-16 00:35:59 +00:00
|
|
|
|
# Regular expression for a non-newline space
|
2020-03-30 17:37:25 +00:00
|
|
|
|
private const SPACE_NOT_NL = '(?:\t| |&\#0*160;|&\#[Xx]0*[Aa]0;|\p{Zs})';
|
2014-05-16 00:35:59 +00:00
|
|
|
|
|
2020-04-07 23:52:41 +00:00
|
|
|
|
/**
|
|
|
|
|
|
* @var int Preprocess wikitext in transclusion mode
|
|
|
|
|
|
* @deprecated Since 1.36
|
|
|
|
|
|
*/
|
|
|
|
|
|
public const PTD_FOR_INCLUSION = Preprocessor::DOM_FOR_INCLUSION;
|
2008-01-22 10:47:44 +00:00
|
|
|
|
|
2010-03-30 21:53:56 +00:00
|
|
|
|
# Allowed values for $this->mOutputType
|
|
|
|
|
|
# Parameter to startExternalParse().
|
2020-05-15 22:40:17 +00:00
|
|
|
|
public const OT_HTML = 1; # like parse()
|
|
|
|
|
|
public const OT_WIKI = 2; # like preSaveTransform()
|
|
|
|
|
|
public const OT_PREPROCESS = 3; # like preprocess()
|
|
|
|
|
|
public const OT_MSG = 3;
|
|
|
|
|
|
# like extractSections() - portions of the original are returned unchanged.
|
|
|
|
|
|
public const OT_PLAIN = 4;
|
2008-03-27 00:00:25 +00:00
|
|
|
|
|
Use a fixed marker prefix string in the Parser and MWTidy
Generating one-time, unique strip markers hurts us in multiple ways:
* The strip marker regexes don't benefit from JIT compilation, so they are
slower to execute than they could be.
* Although the regexes don't benefit from JIT compilation, they are still
compiled, because HHVM bets on regexes getting reused. This extra work is
fairly costly (1-2% of CPU usage on the app servers) and doesn't pay off.
* The size of the PCRE JIT cache is finite, and the caching of one-off regexes
displaces from the cache regexes which are in fact reused.
Tim's preferred solution (per his review comment on
https://gerrit.wikimedia.org/r/167530/) is to use fixed strip markers.
So:
* Replace usage of $parser->mUniqPrefix with Parser::MARKER_PREFIX, which
complements the existing Parser::MARKER_SUFFIX.
* Deprecate Parser::mUniqPrefix and its accessor, Parser::uniqPrefix().
* Deprecate Parser::getRandomString(), since it is no longer useful.
* In Preprocessor_*:preprocessToObj() and Parser::fetchTemplateAndTitle,
replace any occurrences of \x7f with '?', to prevent strip marker forgery.
\x7f is not valid input anyway.
* Deprecate the $prefix parameter for StripState::__construct, since a custom
prefix may no longer be specified.
Change-Id: I31d4556bbb07acb72c33fda335fa5a230379a03f
2015-05-26 20:48:33 +00:00
|
|
|
|
/**
|
|
|
|
|
|
* @var string Prefix and suffix for temporary replacement strings
|
|
|
|
|
|
* for the multipass parser.
|
|
|
|
|
|
*
|
|
|
|
|
|
* \x7f should never appear in input as it's disallowed in XML.
|
|
|
|
|
|
* Using it at the front also gives us a little extra robustness
|
|
|
|
|
|
* since it shouldn't match when butted up against identifier-like
|
|
|
|
|
|
* string constructs.
|
|
|
|
|
|
*
|
|
|
|
|
|
* Must not consist of all title characters, or else it will change
|
|
|
|
|
|
* the behavior of <nowiki> in a link.
|
2015-12-04 02:39:16 +00:00
|
|
|
|
*
|
|
|
|
|
|
* Must have a character that needs escaping in attributes, otherwise
|
|
|
|
|
|
* someone could put a strip marker in an attribute, to get around
|
|
|
|
|
|
* escaping quote marks, and break out of the attribute. Thus we add
|
|
|
|
|
|
* `'".
|
2015-05-26 20:48:33 +00:00
|
|
|
|
*/
|
2020-05-15 22:40:17 +00:00
|
|
|
|
public const MARKER_SUFFIX = "-QINU`\"'\x7f";
|
|
|
|
|
|
public const MARKER_PREFIX = "\x7f'\"`UNIQ-";
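The sketch below shows how a fixed prefix and suffix allow one constant regex to find every marker. The constants are copies of the two values above. The marker body format (`name-HEXINDEX`) and both helper functions are assumptions made for this example, not the parser's actual code.

```php
<?php
// Copies of the two constants shown above.
const MARKER_PREFIX = "\x7f'\"`UNIQ-";
const MARKER_SUFFIX = "-QINU`\"'\x7f";

/**
 * Hypothetical helper: build a marker from a tag name and a counter.
 * The "name-HEX" body format is an assumption for illustration.
 */
function makeMarker( string $tag, int $index ): string {
	return MARKER_PREFIX . $tag . '-' . sprintf( '%08X', $index ) . MARKER_SUFFIX;
}

/**
 * Hypothetical helper: return the bodies of all markers found in $text.
 * Because prefix and suffix never change, the regex is the same on every call.
 */
function findMarkers( string $text ): array {
	$regex = '/' . preg_quote( MARKER_PREFIX, '/' )
		. '([^\x7f<>&\'"]+)'
		. preg_quote( MARKER_SUFFIX, '/' ) . '/';
	preg_match_all( $regex, $text, $m );
	return $m[1];
}
```

Because the marker contains quote characters, passing it through `htmlspecialchars( $marker, ENT_QUOTES )` changes it, which is the attribute-escaping property the comment above asks for.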
|
2008-03-27 00:00:25 +00:00
|
|
|
|
|
2021-09-15 01:00:06 +00:00
|
|
|
|
/**
|
|
|
|
|
|
* Internal Markers used for wrapping the table of contents.
|
|
|
|
|
|
*
|
|
|
|
|
|
* The use of the `mw:` prefix makes sure that the table of contents is
|
|
|
|
|
|
* identified as a block element, and prevents the introduction of `p` tags
|
|
|
|
|
|
* wrapping the table of contents; see BlockLevelPass.
|
|
|
|
|
|
*
|
|
|
|
|
|
* @var string
|
|
|
|
|
|
* @deprecated since 1.38. These markers are used in old cached
|
|
|
|
|
|
* content but not generated from the current parser (or from Parsoid).
|
|
|
|
|
|
* The constants will be removed in a future MediaWiki release.
|
|
|
|
|
|
*/
|
2020-05-15 22:40:17 +00:00
|
|
|
|
public const TOC_START = '<mw:toc>';
|
2021-09-15 01:00:06 +00:00
|
|
|
|
|
|
|
|
|
|
/**
|
|
|
|
|
|
* See ::TOC_START
|
|
|
|
|
|
* @var string
|
|
|
|
|
|
* @deprecated since 1.38. See ::TOC_START
|
|
|
|
|
|
*/
|
2020-05-15 22:40:17 +00:00
|
|
|
|
public const TOC_END = '</mw:toc>';
|
2018-03-24 13:32:58 +00:00
|
|
|
|
|
2021-09-15 01:00:06 +00:00
|
|
|
|
/**
|
|
|
|
|
|
* Internal marker used by parser to track where the table of
|
|
|
|
|
|
* contents should be. Various magic words can change the position
|
|
|
|
|
|
* during the parse. The table of contents is generated during
|
|
|
|
|
|
* the parse, however skins have the final decision on whether the
|
|
|
|
|
|
* table of contents is injected. This placeholder element
|
|
|
|
|
|
* identifies where in the page the table of contents should be
|
|
|
|
|
|
* injected, if at all.
|
|
|
|
|
|
* @var string
|
|
|
|
|
|
* @see Keep this in sync with BlockLevelPass::execute() and
|
|
|
|
|
|
* RemexCompatMunger::isTableOfContentsMarker()
|
|
|
|
|
|
* @internal This will be made private as soon as old content
|
|
|
|
|
|
* has expired from the cache (at the moment it is needed in
|
|
|
|
|
|
* ParserOutput for a compatibility fallback). Skins should
|
|
|
|
|
|
* *not* directly reference TOC_PLACEHOLDER but instead use
|
|
|
|
|
|
* Parser::replaceTableOfContentsMarker().
|
|
|
|
|
|
*/
|
2021-12-21 03:26:38 +00:00
|
|
|
|
public const TOC_PLACEHOLDER = '<meta property="mw:PageProp/toc" />';
|
2021-09-15 01:00:06 +00:00
|
|
|
|
|
2022-09-15 16:03:38 +00:00
|
|
|
|
/**
|
|
|
|
|
|
* Permissive regexp matching TOC_PLACEHOLDER. This allows for some
|
|
|
|
|
|
* minor modifications to the placeholder to be made by extensions
|
|
|
|
|
|
* without breaking the TOC (T317857); note also that Parsoid's version
|
|
|
|
|
|
* of the placeholder might include additional attributes.
|
|
|
|
|
|
* @var string
|
|
|
|
|
|
*/
|
|
|
|
|
|
private const TOC_PLACEHOLDER_REGEX = '/<meta\\b[^>]*\\bproperty\\s*=\\s*"mw:PageProp\\/toc"[^>]*\\/>/';
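As a standalone sketch of why the regex is permissive, the code below replaces the placeholder even when extra attributes have been added to the `<meta>` element. The constants are copies of the values above; `injectToc()` is a hypothetical helper and is not the implementation of Parser::replaceTableOfContentsMarker().

```php
<?php
// Copies of the constants shown above.
const TOC_PLACEHOLDER = '<meta property="mw:PageProp/toc" />';
const TOC_PLACEHOLDER_REGEX = '/<meta\\b[^>]*\\bproperty\\s*=\\s*"mw:PageProp\\/toc"[^>]*\\/>/';

/**
 * Hypothetical helper: replace the first placeholder with $toc.
 * Pass '' to drop the placeholder without injecting anything.
 */
function injectToc( string $html, string $toc ): string {
	// A callback avoids any special handling of "$" or "\" in $toc.
	return preg_replace_callback(
		TOC_PLACEHOLDER_REGEX,
		static function () use ( $toc ) {
			return $toc;
		},
		$html,
		1
	);
}
```

Input that contains no placeholder is returned unchanged.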
|
|
|
|
|
|
|
2014-05-16 00:48:01 +00:00
|
|
|
|
# Persistent:
|
2021-02-18 22:07:33 +00:00
|
|
|
|
private $mTagHooks = [];
|
2021-02-18 22:36:51 +00:00
|
|
|
|
private $mFunctionHooks = [];
|
2020-03-31 15:44:04 +00:00
|
|
|
|
private $mFunctionSynonyms = [ 0 => [], 1 => [] ];
|
|
|
|
|
|
private $mStripList = [];
|
|
|
|
|
|
private $mVarCache = [];
|
|
|
|
|
|
private $mImageParams = [];
|
|
|
|
|
|
private $mImageParamsMagicArray = [];
|
2020-01-28 22:02:08 +00:00
|
|
|
|
/** @deprecated since 1.35 */
|
2014-08-11 20:24:54 +00:00
|
|
|
|
public $mMarkerIndex = 0;
|
2011-05-26 19:52:56 +00:00
|
|
|
|
|
Deprecate Parser implementation methods (will be private in next release)
The following public methods were renamed and made private; the old name
is hard-deprecated and calls the new renamed private method:
Parser::doMagicLinks() => handleMagicLinks()
Parser::doDoubleUnderscore() => handleDoubleUnderscore()
Parser::doHeadings() => handleHeadings()
Parser::doAllQuotes() => handleAllQuotes()
Parser::replaceExternalLinks() => handleExternalLinks()
Parser::replaceInternalLinks() => handleInternalLinks()
Parser::replaceInternalLinks2() => handleInternalLinks2()
Parser::getVariableValue() => expandMagicVariable()
Parser::initialiseVariables() => initializeVariables()
Parser::formatHeadings() => finalizeHeadings()
Parser::test{Pst,Preprocess,Srvus}() => fuzzTest{Pst,Preprocess,Srvus}()
Additionally, the following methods are not used externally, but are
used outside the Parser class by core code. They have been marked
@internal:
Parser::doQuotes() (used by {{#displaytitle}}),
Parser::getExternalLink{Rel,Attribs}() (used by Linker),
Parser::normalizeLinkUrl() (used by Special:LinkSearch and elsewhere).
Parser::{brace,arg,extension}Substitution() (used by PPFrame)
Code search query:
https://codesearch.wmflabs.org/deployed/?q=do%28MagicLinks%7CDoubleUnderscore%7CHeadings%7CAllQuotes%29%7Creplace%28ExternalLinks%7CInternalLinks%28%7C2%29%29%7CgetVariableValue%7CinitialiseVariables%7CformatHeadings%7Ctest%28Pst%7CPreprocess%7CSrvus%29%7CdoQuotes%7CgetExternalLink%28Rel%7CAttribs%29%7CnormalizeLinkUrl%7C%28brace%2Carg%2Cextension%29Substitution&i=nope&files=&repos=
Bug: T236810
Change-Id: I19a43ffc5dcfdd2981b51079c33422c964acb076
2019-10-28 19:52:50 +00:00
|
|
|
|
# Initialised by initializeVariables()
|
2011-05-26 19:52:56 +00:00
|
|
|
|
|
|
|
|
|
|
/**
|
2014-05-16 00:48:01 +00:00
|
|
|
|
* @var MagicWordArray
|
2011-05-26 19:52:56 +00:00
|
|
|
|
*/
|
2020-04-16 22:45:56 +00:00
|
|
|
|
private $mVariables;
|
2014-05-10 22:52:21 +00:00
|
|
|
|
|
2014-05-16 00:48:01 +00:00
|
|
|
|
/**
|
|
|
|
|
|
* @var MagicWordArray
|
|
|
|
|
|
*/
|
2020-03-31 15:44:04 +00:00
|
|
|
|
private $mSubstWords;
|
2019-04-12 09:50:30 +00:00
|
|
|
|
|
2014-12-28 20:16:05 +00:00
|
|
|
|
# Initialised in constructor
|
2022-04-28 13:33:39 +00:00
|
|
|
|
private $mExtLinkBracketedRegex;
|
|
|
|
|
|
|
|
|
|
|
|
/**
|
|
|
|
|
|
* Initialized in constructor
|
|
|
|
|
|
*
|
|
|
|
|
|
* @var UrlUtils
|
|
|
|
|
|
*/
|
|
|
|
|
|
private $urlUtils;
|
2015-04-01 00:13:47 +00:00
|
|
|
|
|
2021-02-19 17:26:39 +00:00
|
|
|
|
# Initialized in constructor
|
2020-03-31 15:42:29 +00:00
|
|
|
|
/**
|
|
|
|
|
|
* @var Preprocessor
|
|
|
|
|
|
*/
|
2021-02-19 17:40:27 +00:00
|
|
|
|
private $mPreprocessor;
|
2008-01-22 10:10:21 +00:00
|
|
|
|
|
2004-02-29 08:43:29 +00:00
|
|
|
|
# Cleared with clearState():
|
2011-02-24 11:59:51 +00:00
|
|
|
|
/**
|
2014-05-16 00:48:01 +00:00
|
|
|
|
* @var ParserOutput
|
2011-02-24 11:59:51 +00:00
|
|
|
|
*/
|
2021-02-18 23:44:14 +00:00
|
|
|
|
private $mOutput;
|
2020-03-31 15:44:04 +00:00
|
|
|
|
private $mAutonumber;
|
2011-02-24 11:59:51 +00:00
|
|
|
|
|
2011-05-26 19:52:56 +00:00
|
|
|
|
/**
|
2014-05-16 00:48:01 +00:00
|
|
|
|
* @var StripState
|
2011-05-26 19:52:56 +00:00
|
|
|
|
*/
|
2021-02-19 22:01:19 +00:00
|
|
|
|
private $mStripState;
|
2014-05-10 22:52:21 +00:00
|
|
|
|
|
2014-05-16 00:48:01 +00:00
|
|
|
|
/**
|
|
|
|
|
|
* @var LinkHolderArray
|
|
|
|
|
|
*/
|
2020-03-31 15:44:04 +00:00
|
|
|
|
private $mLinkHolders;
|
2014-05-10 22:52:21 +00:00
|
|
|
|
|
2020-03-31 15:42:29 +00:00
|
|
|
|
/**
|
|
|
|
|
|
* @var int
|
|
|
|
|
|
*/
|
2022-03-31 16:13:08 +00:00
|
|
|
|
private $mLinkID;
|
|
|
|
|
|
private $mIncludeSizes;
|
2020-03-31 15:44:04 +00:00
|
|
|
|
/** @deprecated since 1.35 */
|
2020-03-31 15:42:29 +00:00
|
|
|
|
public $mPPNodeCount;
|
2020-03-31 15:44:04 +00:00
|
|
|
|
/** @deprecated since 1.35 */
|
2020-01-24 21:23:46 +00:00
|
|
|
|
public $mHighestExpansionDepth;
|
2020-03-31 15:44:04 +00:00
|
|
|
|
private $mTplRedirCache;
|
2020-03-31 15:42:29 +00:00
|
|
|
|
/** @internal */
|
|
|
|
|
|
public $mHeadings;
|
2022-03-31 16:13:08 +00:00
|
|
|
|
/**
|
|
|
|
|
|
* @var array<string,string>
|
|
|
|
|
|
*/
|
|
|
|
|
|
private $mDoubleUnderscores;
|
2020-03-31 15:44:04 +00:00
|
|
|
|
/** @deprecated since 1.35 */
|
2014-08-11 20:24:54 +00:00
|
|
|
|
public $mExpensiveFunctionCount; # number of expensive parser function calls
|
2022-03-31 16:13:08 +00:00
|
|
|
|
private $mShowToc;
|
2020-03-31 15:44:04 +00:00
|
|
|
|
private $mForceTocPosition;
|
2019-06-03 16:08:04 +00:00
|
|
|
|
/** @var array */
|
2020-03-31 15:44:04 +00:00
|
|
|
|
private $mTplDomCache;
|
2014-05-10 22:52:21 +00:00
|
|
|
|
|
2014-05-16 00:48:01 +00:00
|
|
|
|
/**
|
2022-03-01 21:42:03 +00:00
|
|
|
|
* @var UserIdentity|null
|
2014-05-16 00:48:01 +00:00
|
|
|
|
*/
|
2021-09-23 01:08:02 +00:00
|
|
|
|
private $mUser;
|
2004-02-29 08:43:29 +00:00
|
|
|
|
|
2006-02-02 13:42:50 +00:00
|
|
|
|
# Temporary
|
|
|
|
|
|
# These are variables reset at least once per parse regardless of $clearState
|
2011-02-19 01:02:56 +00:00
|
|
|
|
|
2011-05-28 17:18:50 +00:00
|
|
|
|
/**
|
2020-03-04 10:08:00 +00:00
|
|
|
|
* @var ParserOptions|null
|
2020-03-31 15:44:04 +00:00
|
|
|
|
* @deprecated since 1.35, use Parser::getOptions()
|
2011-05-28 17:18:50 +00:00
|
|
|
|
*/
|
2014-08-11 20:24:54 +00:00
|
|
|
|
public $mOptions;
|
2014-05-10 22:52:21 +00:00
|
|
|
|
|
2014-05-16 00:48:01 +00:00
|
|
|
|
/**
|
2019-10-17 16:59:04 +00:00
|
|
|
|
* Since 1.34, leaving `mTitle` uninitialized or setting `mTitle` to
|
|
|
|
|
|
* `null` is deprecated.
|
|
|
|
|
|
*
|
|
|
|
|
|
* @var Title|null
|
2020-03-31 15:44:04 +00:00
|
|
|
|
* @deprecated since 1.35, use Parser::getTitle()
|
2014-05-16 00:48:01 +00:00
|
|
|
|
*/
|
2014-08-11 20:24:54 +00:00
|
|
|
|
public $mTitle; # Title context, used for self-link rendering and similar things
|
2020-03-31 15:44:04 +00:00
|
|
|
|
private $mOutputType; # Output type, one of the OT_xxx constants
|
|
|
|
|
|
/** @deprecated since 1.35 */
|
2014-08-11 20:24:54 +00:00
|
|
|
|
public $ot; # Shortcut alias, see setOutputType()
|
2022-03-31 16:13:08 +00:00
|
|
|
|
private $mRevisionId; # ID to display in {{REVISIONID}} tags
|
|
|
|
|
|
private $mRevisionTimestamp; # The timestamp of the specified revision ID
|
|
|
|
|
|
private $mRevisionUser; # User to display in {{REVISIONUSER}} tag
|
|
|
|
|
|
private $mRevisionSize; # Size to display in {{REVISIONSIZE}} variable
|
|
|
|
|
|
private $mInputSize = false; # For {{PAGESIZE}} on current page.
|
2014-05-10 22:52:21 +00:00
|
|
|
|
|
2020-04-09 03:36:39 +00:00
|
|
|
|
/** @var RevisionRecord|null */
|
|
|
|
|
|
private $mRevisionRecordObject;
|
|
|
|
|
|
|
2012-09-26 07:42:17 +00:00
|
|
|
|
/**
|
2014-04-21 23:38:39 +00:00
|
|
|
|
* @var array Array with the language name of each language link (i.e. the
|
2012-09-26 07:42:17 +00:00
|
|
|
|
* interwiki prefix) in the key, value arbitrary. Used to avoid sending
|
|
|
|
|
|
* duplicate language links to the ParserOutput.
|
|
|
|
|
|
*/
|
2020-03-31 15:44:04 +00:00
|
|
|
|
private $mLangLinkLanguages;
|
2012-09-26 07:42:17 +00:00
|
|
|
|
|
2014-09-16 00:07:52 +00:00
|
|
|
|
/**
|
|
|
|
|
|
* @var MapCacheLRU|null
|
|
|
|
|
|
* @since 1.24
|
|
|
|
|
|
*
|
|
|
|
|
|
* A cache of the current revisions of titles. Keys are $title->getPrefixedDbKey()
|
|
|
|
|
|
*/
|
2020-03-31 15:44:04 +00:00
|
|
|
|
private $currentRevisionCache;
|
2014-09-16 00:07:52 +00:00
|
|
|
|
|
2013-10-27 20:18:06 +00:00
|
|
|
|
/**
|
2017-05-23 12:48:32 +00:00
|
|
|
|
* @var bool|string Recursive call protection.
|
2020-03-31 15:42:29 +00:00
|
|
|
|
* @internal
|
2013-10-27 20:18:06 +00:00
|
|
|
|
*/
|
2022-03-31 16:13:08 +00:00
|
|
|
|
private $mInParse = false;
|
2013-10-27 20:18:06 +00:00
|
|
|
|
|
2014-11-12 20:28:32 +00:00
|
|
|
|
/** @var SectionProfiler */
|
2020-03-31 15:44:04 +00:00
|
|
|
|
private $mProfiler;
|
2014-11-12 20:28:32 +00:00
|
|
|
|
|
2016-05-13 00:37:17 +00:00
|
|
|
|
/**
|
2016-09-26 23:25:17 +00:00
|
|
|
|
* @var LinkRenderer
|
2016-05-13 00:37:17 +00:00
|
|
|
|
*/
|
2020-03-31 15:44:04 +00:00
|
|
|
|
private $mLinkRenderer;
|
2016-05-13 00:37:17 +00:00
|
|
|
|
|
2018-07-25 11:55:18 +00:00
|
|
|
|
/** @var MagicWordFactory */
|
|
|
|
|
|
private $magicWordFactory;
|
|
|
|
|
|
|
2018-08-03 08:25:15 +00:00
|
|
|
|
/** @var Language */
|
|
|
|
|
|
private $contLang;
|
|
|
|
|
|
|
2020-02-04 12:42:03 +00:00
|
|
|
|
/** @var LanguageConverterFactory */
|
|
|
|
|
|
private $languageConverterFactory;
|
|
|
|
|
|
|
2018-08-03 08:43:00 +00:00
|
|
|
|
/** @var ParserFactory */
|
|
|
|
|
|
private $factory;
|
|
|
|
|
|
|
2018-08-15 01:11:59 +00:00
|
|
|
|
/** @var SpecialPageFactory */
|
|
|
|
|
|
private $specialPageFactory;
|
|
|
|
|
|
|
2021-04-25 17:29:33 +00:00
|
|
|
|
/** @var TitleFormatter */
|
|
|
|
|
|
private $titleFormatter;
|
|
|
|
|
|
|
2019-04-12 09:50:30 +00:00
|
|
|
|
/**
|
|
|
|
|
|
* This is called $svcOptions instead of $options like elsewhere to avoid confusion with
|
|
|
|
|
|
* $mOptions, which is public and widely used, and also with the local variable $options used
|
|
|
|
|
|
* for ParserOptions throughout this file.
|
|
|
|
|
|
*
|
|
|
|
|
|
* @var ServiceOptions
|
|
|
|
|
|
*/
|
|
|
|
|
|
private $svcOptions;
|
2018-08-08 14:49:46 +00:00
|
|
|
|
|
2018-08-08 14:57:31 +00:00
|
|
|
|
/** @var LinkRendererFactory */
|
|
|
|
|
|
private $linkRendererFactory;
|
|
|
|
|
|
|
2018-08-05 12:50:01 +00:00
|
|
|
|
/** @var NamespaceInfo */
|
|
|
|
|
|
private $nsInfo;
|
|
|
|
|
|
|
2019-06-27 03:35:50 +00:00
|
|
|
|
/** @var LoggerInterface */
|
|
|
|
|
|
private $logger;
|
|
|
|
|
|
|
2019-08-18 18:19:05 +00:00
|
|
|
|
/** @var BadFileLookup */
|
|
|
|
|
|
private $badFileLookup;
|
|
|
|
|
|
|
Hooks::run() call site migration
Migrate all callers of Hooks::run() to use the new
HookContainer/HookRunner system.
General principles:
* Use DI if it is already used. We're not changing the way state is
managed in this patch.
* HookContainer is always injected, not HookRunner. HookContainer
is a service, it's a more generic interface, it is the only
thing that provides isRegistered() which is needed in some cases,
and a HookRunner can be efficiently constructed from it
(confirmed by benchmark). Because HookContainer is needed
for object construction, it is also needed by all factories.
* "Ask your friendly local base class". Big hierarchies like
SpecialPage and ApiBase have getHookContainer() and getHookRunner()
methods in the base class, and classes that extend that base class
are not expected to know or care where the base class gets its
HookContainer from.
* ProtectedHookAccessorTrait provides protected getHookContainer() and
getHookRunner() methods, getting them from the global service
container. The point of this is to ease migration to DI by ensuring
that call sites ask their local friendly base class rather than
getting a HookRunner from the service container directly.
* Private $this->hookRunner. In some smaller classes where accessor
methods did not seem warranted, there is a private HookRunner property
which is accessed directly. Very rarely (two cases), there is a
protected property, for consistency with code that conventionally
assumes protected=private, but in cases where the class might actually
be overridden, a protected accessor is preferred over a protected
property.
* The last resort: Hooks::runner(). Mostly for static, file-scope and
global code. In a few cases it was used for objects with broken
construction schemes, out of horror or laziness.
Constructors with new required arguments:
* AuthManager
* BadFileLookup
* BlockManager
* ClassicInterwikiLookup
* ContentHandlerFactory
* ContentSecurityPolicy
* DefaultOptionsManager
* DerivedPageDataUpdater
* FullSearchResultWidget
* HtmlCacheUpdater
* LanguageFactory
* LanguageNameUtils
* LinkRenderer
* LinkRendererFactory
* LocalisationCache
* MagicWordFactory
* MessageCache
* NamespaceInfo
* PageEditStash
* PageHandlerFactory
* PageUpdater
* ParserFactory
* PermissionManager
* RevisionStore
* RevisionStoreFactory
* SearchEngineConfig
* SearchEngineFactory
* SearchFormWidget
* SearchNearMatcher
* SessionBackend
* SpecialPageFactory
* UserNameUtils
* UserOptionsManager
* WatchedItemQueryService
* WatchedItemStore
Constructors with new optional arguments:
* DefaultPreferencesFactory
* Language
* LinkHolderArray
* MovePage
* Parser
* ParserCache
* PasswordReset
* Router
setHookContainer() now required after construction:
* AuthenticationProvider
* ResourceLoaderModule
* SearchEngine
Change-Id: Id442b0dbe43aba84bd5cf801d86dedc768b082c7
2020-03-19 02:42:09 +00:00
|
|
|
|
/** @var HookContainer */
|
|
|
|
|
|
private $hookContainer;
|
|
|
|
|
|
|
|
|
|
|
|
/** @var HookRunner */
|
|
|
|
|
|
private $hookRunner;
|
|
|
|
|
|
|
2021-02-18 16:51:12 +00:00
|
|
|
|
/** @var TidyDriverBase */
|
|
|
|
|
|
private $tidy;
|
2021-02-10 15:42:26 +00:00
|
|
|
|
|
2021-03-16 18:31:27 +00:00
|
|
|
|
/** @var UserOptionsLookup */
|
|
|
|
|
|
private $userOptionsLookup;
|
|
|
|
|
|
|
|
|
|
|
|
/** @var UserFactory */
|
|
|
|
|
|
private $userFactory;
|
|
|
|
|
|
|
2021-08-04 12:56:30 +00:00
|
|
|
|
/** @var HttpRequestFactory */
|
|
|
|
|
|
private $httpRequestFactory;
|
|
|
|
|
|
|
2021-10-08 16:37:26 +00:00
|
|
|
|
/** @var TrackingCategories */
|
|
|
|
|
|
private $trackingCategories;
|
|
|
|
|
|
|
2022-04-07 23:52:05 +00:00
|
|
|
|
/** @var SignatureValidatorFactory */
|
|
|
|
|
|
private $signatureValidatorFactory;
|
|
|
|
|
|
|
2022-04-11 01:26:51 +00:00
|
|
|
|
/** @var UserNameUtils */
|
|
|
|
|
|
private $userNameUtils;
|
|
|
|
|
|
|
2004-09-21 05:49:12 +00:00
|
|
|
|
/**
|
2019-10-25 08:07:22 +00:00
|
|
|
|
* @internal For use by ServiceWiring
|
2019-04-12 09:50:30 +00:00
|
|
|
|
*/
|
2019-10-08 18:30:42 +00:00
|
|
|
|
public const CONSTRUCTOR_OPTIONS = [
|
2019-04-12 09:50:30 +00:00
|
|
|
|
// See documentation for the corresponding config options
|
2022-03-11 16:21:33 +00:00
|
|
|
|
// Many of these are only used in (e.g.) CoreMagicVariables
|
2022-04-26 15:48:03 +00:00
|
|
|
|
MainConfigNames::AllowDisplayTitle,
|
|
|
|
|
|
MainConfigNames::AllowSlowParserFunctions,
|
|
|
|
|
|
MainConfigNames::ArticlePath,
|
|
|
|
|
|
MainConfigNames::EnableScaryTranscluding,
|
|
|
|
|
|
MainConfigNames::ExtraInterlanguageLinkPrefixes,
|
|
|
|
|
|
MainConfigNames::FragmentMode,
|
|
|
|
|
|
MainConfigNames::Localtimezone,
|
|
|
|
|
|
MainConfigNames::MaxSigChars,
|
|
|
|
|
|
MainConfigNames::MaxTocLevel,
|
|
|
|
|
|
MainConfigNames::MiserMode,
|
|
|
|
|
|
MainConfigNames::RawHtml,
|
|
|
|
|
|
MainConfigNames::ScriptPath,
|
|
|
|
|
|
MainConfigNames::Server,
|
|
|
|
|
|
MainConfigNames::ServerName,
|
|
|
|
|
|
MainConfigNames::ShowHostnames,
|
|
|
|
|
|
MainConfigNames::SignatureValidation,
|
2022-04-10 15:34:45 +00:00
|
|
|
|
MainConfigNames::Sitename,
|
2022-04-26 15:48:03 +00:00
|
|
|
|
MainConfigNames::StylePath,
|
|
|
|
|
|
MainConfigNames::TranscludeCacheExpiry,
|
|
|
|
|
|
MainConfigNames::PreprocessorCacheThreshold,
|
2019-04-12 09:50:30 +00:00
|
|
|
|
];
|
|
|
|
|
|
|
|
|
|
|
|
/**
|
2021-02-18 18:09:17 +00:00
|
|
|
|
* Constructing parsers directly is not allowed! Use a ParserFactory.
|
2020-02-04 12:42:03 +00:00
|
|
|
|
* @internal
|
2019-04-12 09:50:30 +00:00
|
|
|
|
*
|
2021-02-18 18:09:17 +00:00
|
|
|
|
* @param ServiceOptions $svcOptions
|
|
|
|
|
|
* @param MagicWordFactory $magicWordFactory
|
|
|
|
|
|
* @param Language $contLang Content language
|
|
|
|
|
|
* @param ParserFactory $factory
|
2022-04-28 13:33:39 +00:00
|
|
|
|
* @param UrlUtils $urlUtils
|
2021-02-18 18:09:17 +00:00
|
|
|
|
* @param SpecialPageFactory $spFactory
|
|
|
|
|
|
* @param LinkRendererFactory $linkRendererFactory
|
|
|
|
|
|
* @param NamespaceInfo $nsInfo
|
|
|
|
|
|
* @param LoggerInterface $logger
|
|
|
|
|
|
* @param BadFileLookup $badFileLookup
|
|
|
|
|
|
* @param LanguageConverterFactory $languageConverterFactory
|
|
|
|
|
|
* @param HookContainer $hookContainer
|
|
|
|
|
|
* @param TidyDriverBase $tidy
|
2021-02-19 17:26:39 +00:00
|
|
|
|
* @param WANObjectCache $wanCache
|
2021-03-16 18:31:27 +00:00
|
|
|
|
* @param UserOptionsLookup $userOptionsLookup
|
|
|
|
|
|
* @param UserFactory $userFactory
|
2021-04-25 17:29:33 +00:00
|
|
|
|
* @param TitleFormatter $titleFormatter
|
2021-08-04 12:56:30 +00:00
|
|
|
|
* @param HttpRequestFactory $httpRequestFactory
|
2021-10-08 16:37:26 +00:00
|
|
|
|
* @param TrackingCategories $trackingCategories
|
2022-04-07 23:52:05 +00:00
|
|
|
|
* @param SignatureValidatorFactory $signatureValidatorFactory
|
2022-04-11 01:26:51 +00:00
|
|
|
|
* @param UserNameUtils $userNameUtils
|
2004-09-21 05:49:12 +00:00
|
|
|
|
*/
|
2018-08-03 08:25:15 +00:00
|
|
|
|
public function __construct(
|
2021-02-18 18:09:17 +00:00
|
|
|
|
ServiceOptions $svcOptions,
|
|
|
|
|
|
MagicWordFactory $magicWordFactory,
|
|
|
|
|
|
Language $contLang,
|
|
|
|
|
|
ParserFactory $factory,
|
2022-04-28 13:33:39 +00:00
|
|
|
|
UrlUtils $urlUtils,
|
2021-02-18 18:09:17 +00:00
|
|
|
|
SpecialPageFactory $spFactory,
|
|
|
|
|
|
LinkRendererFactory $linkRendererFactory,
|
|
|
|
|
|
NamespaceInfo $nsInfo,
|
|
|
|
|
|
LoggerInterface $logger,
|
|
|
|
|
|
BadFileLookup $badFileLookup,
|
|
|
|
|
|
LanguageConverterFactory $languageConverterFactory,
|
|
|
|
|
|
HookContainer $hookContainer,
|
2021-02-19 17:26:39 +00:00
|
|
|
|
TidyDriverBase $tidy,
|
2021-03-16 18:31:27 +00:00
|
|
|
|
WANObjectCache $wanCache,
|
|
|
|
|
|
UserOptionsLookup $userOptionsLookup,
|
2021-04-25 17:29:33 +00:00
|
|
|
|
UserFactory $userFactory,
|
2021-08-04 12:56:30 +00:00
|
|
|
|
TitleFormatter $titleFormatter,
|
2021-10-08 16:37:26 +00:00
|
|
|
|
HttpRequestFactory $httpRequestFactory,
|
2022-04-07 23:52:05 +00:00
|
|
|
|
TrackingCategories $trackingCategories,
|
2022-04-11 01:26:51 +00:00
|
|
|
|
SignatureValidatorFactory $signatureValidatorFactory,
|
|
|
|
|
|
UserNameUtils $userNameUtils
|
2018-08-03 08:25:15 +00:00
|
|
|
|
) {
|
2020-04-16 14:46:00 +00:00
|
|
|
|
if ( ParserFactory::$inParserFactory === 0 ) {
|
2021-02-18 18:09:17 +00:00
|
|
|
|
// Direct construction of Parser was deprecated in 1.34 and
|
|
|
|
|
|
// removed in 1.36; use a ParserFactory instead.
|
|
|
|
|
|
throw new MWException( 'Direct construction of Parser not allowed' );
|
2020-04-16 14:46:00 +00:00
|
|
|
|
}
|
2021-02-18 18:09:17 +00:00
|
|
|
|
$svcOptions->assertRequiredOptions( self::CONSTRUCTOR_OPTIONS );
|
|
|
|
|
|
$this->svcOptions = $svcOptions;
|
|
|
|
|
|
|
2022-04-28 13:33:39 +00:00
|
|
|
|
$this->urlUtils = $urlUtils;
|
|
|
|
|
|
$this->mExtLinkBracketedRegex = '/\[(((?i)' . $this->urlUtils->validProtocols() . ')' .
|
2015-01-08 22:00:54 +00:00
|
|
|
|
self::EXT_LINK_ADDR .
|
Parser: Fix quadratic regexp edge case
In external link syntax like `[http://example.org Example]`,
the space between link target and label is technically optional
when the label starts with characters not allowed in the URL,
such as `[http://example.org<b>Example</b>]`.
This is done with a regexp that matches a required opening bracket,
a required URL, optional spaces, optional label, and a required
closing bracket. The real regexp is messy to handle various characters
allowed in each part, but for illustration purposes, it's basically
the same as `\[([^\]\s\<]+) *([^\]\s]*?)\]`.
When given input that looks like a link, but doesn't have the closing
bracket, the regexp engine (PCRE) would therefore attempt matching
every possible combination of target and label lengths before failing:
Input: [http://example.org
| Target | Label |
| ------------------ | ----- |
| http://example.or | g |
| http://example.o | r |
| http://example.o | rg |
| http://example. | o |
| http://example. | or |
| http://example. | org |
| http://example | . |
| http://example | .o |
| http://example | .or |
| http://example | .org |
| http://exampl | e |
| http://exampl | e. |
| http://exampl | e.o |
| http://exampl | e.or |
| http://exampl | e.org |
…and so on. This would take (1 + 2 + 3 + … + 18) = 171 steps to fail
in this example, or `N * (N+1) / 2` steps in general. For sufficiently
large inputs this hits a limit designed to protect against exactly
this situation, and the whole wikitext parser crashes.
(To hit the pathological case, it's also required for a `]` to appear
somewhere later in the input, otherwise PCRE would detect that a match
is never possible and exit before doing any of the above.)
Live example: https://regex101.com/debugger/?regex=%5C%5B%28%5B%5E%5C%5D%5Cs%5C%3C%5D%2B%29%20%2A%28%5B%5E%5C%5D%5Cs%5D%2A%3F%29%5C%5D&testString=%5Bhttp%3A%2F%2Fexample.org%0A%5D
We can fix it by changing the lazy quantifier `*?` to the greedy `*`.
This is correct for this regexp only because the label isn't allowed
to contain ']' (otherwise, the first external link on the page would
consume all of the content until the last external link as its label).
This allows PCRE to only consider the cases where the label has the
maximum possible length:
| Target | Label |
| ------------------ | ----- |
| http://example.or | g |
| http://example.o | rg |
| http://example. | org |
| http://example | .org |
| http://exampl | e.org |
…and so on. Only 18 steps, or `N` steps in general.
Live example: https://regex101.com/debugger/?regex=%5C%5B%28%5B%5E%5C%5D%5Cs%5C%3C%5D%2B%29%20%2A%28%5B%5E%5C%5D%5Cs%5D%2A%29%5C%5D&testString=%5Bhttp%3A%2F%2Fexample.org%0A%5D
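As a rough illustration of why swapping the quantifier does not change results (a sketch, not the parser's real pattern): the simplified regexp from this message, in its lazy and greedy forms, should capture the same target and label for well-formed links and should both reject a link whose closing bracket is missing from the line. The variable names and sample inputs below are made up for the example.

```php
<?php
// Simplified patterns taken from the message above; the real regexp in the
// parser is more involved. All names and inputs here are illustrative only.
$lazy   = '/\[([^\]\s\<]+) *([^\]\s]*?)\]/';
$greedy = '/\[([^\]\s\<]+) *([^\]\s]*)\]/';

$inputs = [
	'[http://example.org Example]',
	'[http://example.org<b>Example</b>]',
	// The "]" only appears after a newline, which neither the target nor the
	// label may contain, so neither pattern matches.
	"[http://example.org\n]",
];

foreach ( $inputs as $input ) {
	$lazyMatched = preg_match( $lazy, $input, $lazyGroups );
	$greedyMatched = preg_match( $greedy, $input, $greedyGroups );
	// Both variants agree on whether there is a match and on the captured groups.
	var_dump( $lazyMatched === $greedyMatched && $lazyGroups === $greedyGroups );
}
```

This only checks that the two forms give the same results. It does not measure the backtracking cost, which is what the regex101 debugger links show.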
I think this bug has been present since 2004, when external link
parsing was rewritten in badf11ffe6 (SVN r4579).
Bug: T321467
Change-Id: I993a10d9a90ab28cce61eba6beabee8a06a2d562
2022-11-04 23:23:56 +00:00
|
|
|
|
self::EXT_LINK_URL_CLASS . '*)\p{Zs}*([^\]\\x00-\\x08\\x0a-\\x1F\\x{FFFD}]*)\]/Su';
|
2018-07-25 11:55:18 +00:00
|
|
|
|
|
2021-02-18 18:09:17 +00:00
|
|
|
|
$this->magicWordFactory = $magicWordFactory;
|
2018-08-03 08:25:15 +00:00
|
|
|
|
|
2021-02-18 18:09:17 +00:00
|
|
|
|
$this->contLang = $contLang;
|
2018-08-03 08:43:00 +00:00
|
|
|
|
|
2021-02-18 18:09:17 +00:00
|
|
|
|
$this->factory = $factory;
|
|
|
|
|
|
$this->specialPageFactory = $spFactory;
|
|
|
|
|
|
$this->linkRendererFactory = $linkRendererFactory;
|
|
|
|
|
|
$this->nsInfo = $nsInfo;
|
|
|
|
|
|
$this->logger = $logger;
|
|
|
|
|
|
$this->badFileLookup = $badFileLookup;
|
2020-02-04 12:42:03 +00:00
|
|
|
|
|
2021-02-18 18:09:17 +00:00
|
|
|
|
$this->languageConverterFactory = $languageConverterFactory;
|
2020-04-16 21:53:05 +00:00
|
|
|
|
|
2021-02-18 18:09:17 +00:00
|
|
|
|
$this->hookContainer = $hookContainer;
|
|
|
|
|
|
$this->hookRunner = new HookRunner( $hookContainer );
|
2020-03-19 02:42:09 +00:00
|
|
|
|
|
2021-02-18 18:09:17 +00:00
|
|
|
|
$this->tidy = $tidy;
|
2021-02-10 15:42:26 +00:00
|
|
|
|
|
2021-02-19 17:26:39 +00:00
|
|
|
|
$this->mPreprocessor = new Preprocessor_Hash(
|
|
|
|
|
|
$this,
|
|
|
|
|
|
$wanCache,
|
|
|
|
|
|
[
|
2022-04-26 15:48:03 +00:00
|
|
|
|
'cacheThreshold' => $svcOptions->get( MainConfigNames::PreprocessorCacheThreshold ),
|
2021-10-26 22:29:20 +00:00
|
|
|
|
'disableLangConversion' => $languageConverterFactory->isConversionDisabled(),
|
2021-02-19 17:26:39 +00:00
|
|
|
|
]
|
|
|
|
|
|
);
|
|
|
|
|
|
|
2021-03-16 18:31:27 +00:00
|
|
|
|
$this->userOptionsLookup = $userOptionsLookup;
|
|
|
|
|
|
$this->userFactory = $userFactory;
|
2021-04-25 17:29:33 +00:00
|
|
|
|
$this->titleFormatter = $titleFormatter;
|
2021-08-04 12:56:30 +00:00
|
|
|
|
$this->httpRequestFactory = $httpRequestFactory;
|
2021-10-08 16:37:26 +00:00
|
|
|
|
$this->trackingCategories = $trackingCategories;
|
2022-04-07 23:52:05 +00:00
|
|
|
|
$this->signatureValidatorFactory = $signatureValidatorFactory;
|
2022-04-11 01:26:51 +00:00
|
|
|
|
$this->userNameUtils = $userNameUtils;
|
2021-03-16 18:31:27 +00:00
|
|
|
|
|
2021-02-18 18:45:57 +00:00
|
|
|
|
// These steps used to be done in "::firstCallInit()"
|
|
|
|
|
|
// (if you're chasing a reference from some old code)
|
2022-01-28 19:39:24 +00:00
|
|
|
|
CoreParserFunctions::register(
|
|
|
|
|
|
$this,
|
|
|
|
|
|
new ServiceOptions( CoreParserFunctions::REGISTER_OPTIONS, $svcOptions )
|
|
|
|
|
|
);
|
|
|
|
|
|
CoreTagHooks::register(
|
|
|
|
|
|
$this,
|
|
|
|
|
|
new ServiceOptions( CoreTagHooks::REGISTER_OPTIONS, $svcOptions )
|
|
|
|
|
|
);
|
2021-02-18 18:45:57 +00:00
|
|
|
|
$this->initializeVariables();
|
|
|
|
|
|
|
|
|
|
|
|
$this->hookRunner->onParserFirstCallInit( $this );
|
2004-02-26 13:37:26 +00:00
|
|
|
|
}
|
2008-04-14 07:45:50 +00:00
|
|
|
|
|
2008-08-26 06:48:24 +00:00
|
|
|
|
/**
|
|
|
|
|
|
* Reduce memory usage to reduce the impact of circular references
|
|
|
|
|
|
*/
|
2014-08-11 20:24:54 +00:00
|
|
|
|
public function __destruct() {
|
2008-08-26 14:37:15 +00:00
|
|
|
|
if ( isset( $this->mLinkHolders ) ) {
|
2019-08-29 13:19:39 +00:00
|
|
|
|
// @phan-suppress-next-line PhanTypeObjectUnsetDeclaredProperty
|
2011-04-27 20:05:39 +00:00
|
|
|
|
unset( $this->mLinkHolders );
|
2008-08-26 14:37:15 +00:00
|
|
|
|
}
|
2019-08-29 09:59:59 +00:00
|
|
|
|
// @phan-suppress-next-line PhanTypeSuspiciousNonTraversableForeach
|
2008-08-26 06:48:24 +00:00
|
|
|
|
foreach ( $this as $name => $value ) {
|
|
|
|
|
|
unset( $this->$name );
|
|
|
|
|
|
}
|
|
|
|
|
|
}
|
|
|
|
|
|
|
2012-11-15 00:05:24 +00:00
|
|
|
|
/**
|
|
|
|
|
|
* Allow extensions to clean up when the parser is cloned
|
|
|
|
|
|
*/
|
2014-08-11 20:24:54 +00:00
|
|
|
|
public function __clone() {
|
2013-10-27 20:18:06 +00:00
|
|
|
|
$this->mInParse = false;
|
2014-09-22 16:49:28 +00:00
|
|
|
|
|
2016-12-11 22:45:07 +00:00
|
|
|
|
// T58226: When you create a reference "to" an object field, that
|
2014-09-22 16:49:28 +00:00
|
|
|
|
// makes the object field itself be a reference too (until the other
|
|
|
|
|
|
// reference goes out of scope). When cloning, any field that's a
|
|
|
|
|
|
// reference is copied as a reference in the new object. Both of these
|
|
|
|
|
|
// are defined PHP5 behaviors, as inconvenient as it is for us when old
|
|
|
|
|
|
// hooks from PHP4 days are passing fields by reference.
|
2016-02-17 09:09:32 +00:00
|
|
|
|
foreach ( [ 'mStripState', 'mVarCache' ] as $k ) {
|
2014-09-22 16:49:28 +00:00
|
|
|
|
// Make a non-reference copy of the field, then rebind the field to
|
|
|
|
|
|
// reference the new copy.
|
|
|
|
|
|
$tmp = $this->$k;
|
|
|
|
|
|
$this->$k =& $tmp;
|
|
|
|
|
|
unset( $tmp );
|
|
|
|
|
|
}
|
|
|
|
|
|
|
2021-02-19 17:26:39 +00:00
|
|
|
|
$this->mPreprocessor = clone $this->mPreprocessor;
|
|
|
|
|
|
$this->mPreprocessor->resetParser( $this );
|
|
|
|
|
|
|
2020-03-19 02:42:09 +00:00
|
|
|
|
$this->hookRunner->onParserCloned( $this );
|
2012-11-15 00:05:24 +00:00
|
|
|
|
}
|
|
|
|
|
|
|
2006-07-03 11:07:00 +00:00
|
|
|
|
/**
|
2021-02-18 18:45:57 +00:00
|
|
|
|
* Used to do various kinds of initialisation on the first call of the
|
|
|
|
|
|
* parser.
|
2020-04-16 21:53:05 +00:00
|
|
|
|
* @deprecated since 1.35, this initialization is done in the constructor
|
|
|
|
|
|
* and manual calls to ::firstCallInit() have no effect.
|
2021-02-19 22:49:35 +00:00
|
|
|
|
* @since 1.7
|
2006-07-03 11:07:00 +00:00
|
|
|
|
*/
|
2014-08-11 20:24:54 +00:00
|
|
|
|
public function firstCallInit() {
|
2021-02-18 18:45:57 +00:00
|
|
|
|
/*
|
|
|
|
|
|
* This method should be hard-deprecated once remaining calls are
|
|
|
|
|
|
* removed; it no longer does anything.
|
|
|
|
|
|
*/
|
2006-10-17 08:49:27 +00:00
|
|
|
|
}
|
2006-07-03 11:07:00 +00:00
|
|
|
|
|
2004-09-21 05:49:12 +00:00
|
|
|
|
/**
|
|
|
|
|
|
* Clear Parser state
|
|
|
|
|
|
*
|
2020-06-26 12:14:23 +00:00
|
|
|
|
* @internal
|
2004-09-21 05:49:12 +00:00
|
|
|
|
*/
|
2014-08-11 20:24:54 +00:00
|
|
|
|
public function clearState() {
|
2019-08-12 06:10:22 +00:00
|
|
|
|
$this->resetOutput();
|
2004-02-26 13:37:26 +00:00
|
|
|
|
$this->mAutonumber = 0;
|
2020-03-19 02:42:09 +00:00
|
|
|
|
$this->mLinkHolders = new LinkHolderArray(
|
|
|
|
|
|
$this,
|
|
|
|
|
|
$this->getContentLanguageConverter(),
|
|
|
|
|
|
$this->getHookContainer()
|
|
|
|
|
|
);
|
2008-08-26 14:37:15 +00:00
|
|
|
|
$this->mLinkID = 0;
|
2021-02-25 20:19:54 +00:00
|
|
|
|
$this->mRevisionTimestamp = null;
|
|
|
|
|
|
$this->mRevisionId = null;
|
|
|
|
|
|
$this->mRevisionUser = null;
|
|
|
|
|
|
$this->mRevisionSize = null;
|
2020-04-09 03:36:39 +00:00
|
|
|
|
$this->mRevisionRecordObject = null;
|
2016-02-17 09:09:32 +00:00
|
|
|
|
$this->mVarCache = [];
|
2010-12-10 18:17:20 +00:00
|
|
|
|
$this->mUser = null;
|
2016-02-17 09:09:32 +00:00
|
|
|
|
$this->mLangLinkLanguages = [];
|
2014-09-16 00:07:52 +00:00
|
|
|
|
$this->currentRevisionCache = null;
|
2007-01-17 19:48:48 +00:00
|
|
|
|
|
2018-02-28 02:11:56 +00:00
|
|
|
|
$this->mStripState = new StripState( $this );
|
2005-12-24 23:05:18 +00:00
|
|
|
|
|
2016-12-11 22:45:07 +00:00
|
|
|
|
# Clear these on every parse, T6549
|
2021-02-25 20:19:54 +00:00
|
|
|
|
$this->mTplRedirCache = [];
|
|
|
|
|
|
$this->mTplDomCache = [];
|
2006-02-02 13:42:50 +00:00
|
|
|
|
|
2006-05-23 07:19:01 +00:00
|
|
|
|
$this->mShowToc = true;
|
|
|
|
|
|
$this->mForceTocPosition = false;
|
2016-02-17 09:09:32 +00:00
|
|
|
|
$this->mIncludeSizes = [
|
2006-08-10 21:28:49 +00:00
|
|
|
|
'post-expand' => 0,
|
2007-11-20 10:55:08 +00:00
|
|
|
|
'arg' => 0,
|
2016-02-17 09:09:32 +00:00
|
|
|
|
];
|
2007-11-20 10:55:08 +00:00
|
|
|
|
$this->mPPNodeCount = 0;
|
2012-05-04 20:44:14 +00:00
|
|
|
|
$this->mHighestExpansionDepth = 0;
|
2016-02-17 09:09:32 +00:00
|
|
|
|
$this->mHeadings = [];
|
|
|
|
|
|
$this->mDoubleUnderscores = [];
|
2008-04-07 22:11:31 +00:00
|
|
|
|
$this->mExpensiveFunctionCount = 0;
|
2006-07-02 17:43:32 +00:00
|
|
|
|
|
2014-11-12 20:28:32 +00:00
|
|
|
|
$this->mProfiler = new SectionProfiler();
|
|
|
|
|
|
|
2020-03-19 02:42:09 +00:00
|
|
|
|
$this->hookRunner->onParserClearState( $this );
|
2004-02-26 13:37:26 +00:00
|
|
|
|
}
|
2004-04-12 23:59:37 +00:00
|
|
|
|
|
2019-08-12 06:10:22 +00:00
|
|
|
|
/**
|
|
|
|
|
|
* Reset the ParserOutput
|
2021-02-19 22:49:35 +00:00
|
|
|
|
* @since 1.34
|
2019-08-12 06:10:22 +00:00
|
|
|
|
*/
|
|
|
|
|
|
public function resetOutput() {
|
|
|
|
|
|
$this->mOutput = new ParserOutput;
|
|
|
|
|
|
$this->mOptions->registerWatcher( [ $this->mOutput, 'recordOption' ] );
|
|
|
|
|
|
}
|
|
|
|
|
|
|
2006-01-07 13:09:30 +00:00
|
|
|
|
/**
|
2005-12-30 09:33:11 +00:00
|
|
|
|
* Convert wikitext to HTML
|
|
|
|
|
|
* Do not call this function recursively.
|
2004-09-21 05:49:12 +00:00
|
|
|
|
*
|
2014-07-24 17:43:25 +00:00
|
|
|
|
* @param string $text Text we want to parse
|
2018-08-31 15:55:44 +00:00
|
|
|
|
* @param-taint $text escapes_htmlnoent
|
2021-04-25 17:29:33 +00:00
|
|
|
|
* @param PageReference $page
|
2014-04-21 23:38:39 +00:00
|
|
|
|
* @param ParserOptions $options
|
|
|
|
|
|
* @param bool $linestart
|
|
|
|
|
|
* @param bool $clearState
|
2019-06-27 07:17:06 +00:00
|
|
|
|
* @param int|null $revid ID of the revision being rendered. This is used to render
|
|
|
|
|
|
* REVISION* magic words. 0 means that any current revision will be used. Null means
|
|
|
|
|
|
* that {{REVISIONID}}/{{REVISIONUSER}} will be empty and {{REVISIONTIMESTAMP}} will
|
|
|
|
|
|
* use the current timestamp.
|
2021-06-17 14:32:05 +00:00
|
|
|
|
* @return ParserOutput
|
2018-08-31 15:55:44 +00:00
|
|
|
|
* @return-taint escaped
|
2021-02-19 22:49:35 +00:00
|
|
|
|
* @since 1.10 method is public
|
2004-09-21 05:49:12 +00:00
|
|
|
|
*/
|
2016-08-30 19:35:08 +00:00
|
|
|
|
public function parse(
|
2021-04-25 17:29:33 +00:00
|
|
|
|
$text, PageReference $page, ParserOptions $options,
|
2014-05-10 23:03:45 +00:00
|
|
|
|
$linestart = true, $clearState = true, $revid = null
|
|
|
|
|
|
) {
|
2013-10-27 20:18:06 +00:00
|
|
|
|
if ( $clearState ) {
|
Use a fixed marker prefix string in the Parser and MWTidy
Generating one-time, unique strip markers hurts us in multiple ways:
* The strip marker regexes don't benefit from JIT compilation, so they are
slower to execute than they could be.
* Although the regexes don't benefit from JIT compilation, they are still
compiled, because HHVM bets on regexes getting reused. This extra work is
fairly costly (1-2% of CPU usage on the app servers) and doesn't pay off.
* The size of the PCRE JIT cache is finite, and the caching of one-off regexes
displaces from the cache regexes which are in fact reused.
Tim's preferred solution (per his review comment on
https://gerrit.wikimedia.org/r/167530/) is to use fixed strip markers.
So:
* Replace usage of $parser->mUniqPrefix with Parser::MARKER_PREFIX, which
complements the existing Parser::MARKER_SUFFIX.
* Deprecate Parser::mUniqPrefix and its accessor, Parser::uniqPrefix().
* Deprecate Parser::getRandomString(), since it is no longer useful.
* In Preprocessor_*:preprocessToObj() and Parser::fetchTemplateAndTitle,
replace any occurrences of \x7f with '?', to prevent strip marker forgery.
\x7f is not valid input anyway.
* Deprecate the $prefix parameter for StripState::__construct, since a custom
prefix may no longer be specified.
Change-Id: I31d4556bbb07acb72c33fda335fa5a230379a03f
2015-05-26 20:48:33 +00:00
|
|
|
|
// We use U+007F DELETE to construct strip markers, so we have to make
|
|
|
|
|
|
// sure that this character does not occur in the input text.
|
|
|
|
|
|
$text = strtr( $text, "\x7f", "?" );
|
2013-10-27 20:18:06 +00:00
|
|
|
|
$magicScopeVariable = $this->lock();
|
|
|
|
|
|
}
|
2017-02-27 21:27:15 +00:00
|
|
|
|
// Strip U+0000 NULL (T159174)
|
|
|
|
|
|
$text = str_replace( "\000", '', $text );
|
2013-10-27 20:18:06 +00:00
|
|
|
|
|
2021-04-25 17:29:33 +00:00
|
|
|
|
$this->startParse( $page, $options, self::OT_HTML, $clearState );
|
2010-01-07 04:13:14 +00:00
|
|
|
|
|
2014-09-16 00:07:52 +00:00
|
|
|
|
$this->currentRevisionCache = null;
|
2013-01-15 21:38:29 +00:00
|
|
|
|
$this->mInputSize = strlen( $text );
|
2021-11-09 16:31:27 +00:00
|
|
|
|
$this->mOutput->resetParseStartTime();
|
2013-01-15 21:38:29 +00:00
|
|
|
|
|
2006-07-10 18:25:56 +00:00
|
|
|
|
$oldRevisionId = $this->mRevisionId;
|
2020-04-09 03:36:39 +00:00
|
|
|
|
$oldRevisionRecordObject = $this->mRevisionRecordObject;
|
2006-11-21 09:53:45 +00:00
|
|
|
|
$oldRevisionTimestamp = $this->mRevisionTimestamp;
|
2010-12-10 18:17:20 +00:00
|
|
|
|
$oldRevisionUser = $this->mRevisionUser;
|
2013-09-04 19:09:36 +00:00
|
|
|
|
$oldRevisionSize = $this->mRevisionSize;
|
2010-03-30 21:20:05 +00:00
|
|
|
|
if ( $revid !== null ) {
|
2006-07-10 18:25:56 +00:00
|
|
|
|
$this->mRevisionId = $revid;
|
2020-04-09 03:36:39 +00:00
|
|
|
|
$this->mRevisionRecordObject = null;
|
2006-11-21 09:53:45 +00:00
|
|
|
|
$this->mRevisionTimestamp = null;
|
2010-12-10 18:17:20 +00:00
|
|
|
|
$this->mRevisionUser = null;
|
2013-09-04 19:09:36 +00:00
|
|
|
|
$this->mRevisionSize = null;
|
2006-07-10 18:25:56 +00:00
|
|
|
|
}
|
2011-01-23 16:07:13 +00:00
|
|
|
|
|
2005-04-21 06:30:48 +00:00
|
|
|
|
$text = $this->internalParse( $text );
|
2020-03-19 02:42:09 +00:00
|
|
|
|
$this->hookRunner->onParserAfterParse( $this, $text, $this->mStripState );
|
2010-01-07 04:13:14 +00:00
|
|
|
|
|
2014-11-02 18:14:53 +00:00
|
|
|
|
$text = $this->internalParseHalfParsed( $text, true, $linestart );
|
2005-04-26 20:50:16 +00:00
|
|
|
|
|
2010-04-09 19:02:04 +00:00
|
|
|
|
/**
|
2011-01-07 01:38:06 +00:00
|
|
|
|
* A converted title will be provided in the output object if title and
|
2011-02-19 19:18:02 +00:00
|
|
|
|
* content conversion are enabled, the article text does not contain
|
|
|
|
|
|
* a conversion-suppressing double-underscore tag, and no
|
2011-01-07 01:38:06 +00:00
|
|
|
|
* {{DISPLAYTITLE:...}} is present. DISPLAYTITLE takes precedence over
|
|
|
|
|
|
* automatic link conversion.
|
2010-04-09 19:02:04 +00:00
|
|
|
|
*/
|
2021-10-04 18:40:24 +00:00
|
|
|
|
if ( !$options->getDisableTitleConversion()
|
|
|
|
|
|
&& !isset( $this->mDoubleUnderscores['nocontentconvert'] )
|
|
|
|
|
|
&& !isset( $this->mDoubleUnderscores['notitleconvert'] )
|
|
|
|
|
|
&& $this->mOutput->getDisplayTitle() === false
|
2013-12-01 20:39:00 +00:00
|
|
|
|
) {
|
2021-10-04 18:40:24 +00:00
|
|
|
|
$titleText = $this->getTargetLanguageConverter()->getConvRuleTitle();
|
2021-10-29 14:07:50 +00:00
|
|
|
|
if ( $titleText !== false ) {
|
|
|
|
|
|
$titleText = Sanitizer::removeSomeTags( $titleText );
|
|
|
|
|
|
} else {
|
2022-08-09 02:52:53 +00:00
|
|
|
|
[ $nsText, $nsSeparator, $mainText ] = $this->getTargetLanguageConverter()->convertSplitTitle( $page );
|
|
|
|
|
|
// In the future, those three pieces could be stored separately rather than joined into $titleText,
|
|
|
|
|
|
// and OutputPage would format them and join them together, to resolve T314399.
|
|
|
|
|
|
$titleText = self::formatPageTitle( $nsText, $nsSeparator, $mainText );
|
2010-04-09 19:02:04 +00:00
|
|
|
|
}
|
2021-10-29 14:07:50 +00:00
|
|
|
|
$this->mOutput->setTitleText( $titleText );
|
2010-01-08 08:22:19 +00:00
|
|
|
|
}
|
|
|
|
|
|
|
2017-12-08 05:33:05 +00:00
|
|
|
|
# Compute runtime adaptive expiry if set
|
2016-08-30 19:35:08 +00:00
|
|
|
|
$this->mOutput->finalizeAdaptiveCacheExpiry();
|
|
|
|
|
|
|
|
|
|
|
|
# Warn if too many heavyweight parser functions were used
|
2012-05-04 18:56:28 +00:00
|
|
|
|
if ( $this->mExpensiveFunctionCount > $this->mOptions->getExpensiveParserFunctionLimit() ) {
|
|
|
|
|
|
$this->limitationWarn( 'expensive-parserfunction',
|
|
|
|
|
|
$this->mExpensiveFunctionCount,
|
|
|
|
|
|
$this->mOptions->getExpensiveParserFunctionLimit()
|
|
|
|
|
|
);
|
2008-04-07 22:11:31 +00:00
|
|
|
|
}
|
2004-10-15 17:39:10 +00:00
|
|
|
|
|
2017-12-08 05:33:05 +00:00
|
|
|
|
# Information on limits, for the benefit of users who try to skirt them
|
2022-04-10 15:34:45 +00:00
|
|
|
|
if ( MediaWikiServices::getInstance()->getMainConfig()->get(
|
|
|
|
|
|
MainConfigNames::EnableParserLimitReporting ) ) {
|
2021-11-09 16:31:27 +00:00
|
|
|
|
$this->makeLimitReport();
|
2006-08-10 21:28:49 +00:00
|
|
|
|
}
|
2017-04-27 16:58:17 +00:00
|
|
|
|
|
|
|
|
|
|
# Wrap non-interface parser output in a <div> so it can be targeted
|
|
|
|
|
|
# with CSS (T37247)
|
|
|
|
|
|
$class = $this->mOptions->getWrapOutputClass();
|
|
|
|
|
|
if ( $class !== false && !$this->mOptions->getInterfaceMessage() ) {
|
2018-08-28 16:48:10 +00:00
|
|
|
|
$this->mOutput->addWrapperDivClass( $class );
|
2017-04-27 16:58:17 +00:00
|
|
|
|
}
|
|
|
|
|
|
|
2004-03-06 01:49:16 +00:00
|
|
|
|
$this->mOutput->setText( $text );
|
2010-01-15 19:14:23 +00:00
|
|
|
|
|
2006-07-10 18:25:56 +00:00
|
|
|
|
$this->mRevisionId = $oldRevisionId;
|
2020-04-09 03:36:39 +00:00
|
|
|
|
$this->mRevisionRecordObject = $oldRevisionRecordObject;
|
2006-11-21 09:53:45 +00:00
|
|
|
|
$this->mRevisionTimestamp = $oldRevisionTimestamp;
|
2010-12-10 18:17:20 +00:00
|
|
|
|
$this->mRevisionUser = $oldRevisionUser;
|
2013-09-04 19:09:36 +00:00
|
|
|
|
$this->mRevisionSize = $oldRevisionSize;
|
2013-01-15 21:38:29 +00:00
|
|
|
|
$this->mInputSize = false;
|
2014-09-16 00:07:52 +00:00
|
|
|
|
$this->currentRevisionCache = null;
|
2005-12-30 09:33:11 +00:00
|
|
|
|
|
2004-03-06 01:49:16 +00:00
|
|
|
|
return $this->mOutput;
|
|
|
|
|
|
}
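/*
 * Usage sketch for parse() (illustrative only, not part of the original
 * source). It assumes a caller outside the parser that obtains a fresh
 * Parser from the service container; the page name 'Sandbox' and the
 * variable names are hypothetical.
 *
 *     $services = MediaWikiServices::getInstance();
 *     $parser = $services->getParserFactory()->create();
 *     // Title::newFromText() may return null for invalid names; a real
 *     // caller should check for that before parsing.
 *     $title = Title::newFromText( 'Sandbox' );
 *     $output = $parser->parse(
 *         "''Hello'' [[World]]",
 *         $title,
 *         ParserOptions::newFromAnon()
 *     );
 *     // With the default $revid = null, {{REVISIONID}} renders empty.
 *     $html = $output->getText();
 *
 * Because parse() takes the parser lock when $clearState is true, code
 * that already runs inside a parse (for example a tag hook) should use
 * recursiveTagParse() or recursiveTagParseFully() instead.
 */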
|
|
|
|
|
|
|
2017-12-08 05:33:05 +00:00
|
|
|
|
/**
|
2021-11-09 16:31:27 +00:00
|
|
|
|
* Set the limit report data in the current ParserOutput.
|
2017-12-08 05:33:05 +00:00
|
|
|
|
*/
|
|
|
|
|
|
protected function makeLimitReport() {
|
|
|
|
|
|
$maxIncludeSize = $this->mOptions->getMaxIncludeSize();
|
|
|
|
|
|
|
|
|
|
|
|
$cpuTime = $this->mOutput->getTimeSinceStart( 'cpu' );
|
|
|
|
|
|
if ( $cpuTime !== null ) {
|
|
|
|
|
|
$this->mOutput->setLimitReportData( 'limitreport-cputime',
|
|
|
|
|
|
sprintf( "%.3f", $cpuTime )
|
|
|
|
|
|
);
|
|
|
|
|
|
}
|
|
|
|
|
|
|
|
|
|
|
|
$wallTime = $this->mOutput->getTimeSinceStart( 'wall' );
|
|
|
|
|
|
$this->mOutput->setLimitReportData( 'limitreport-walltime',
|
|
|
|
|
|
sprintf( "%.3f", $wallTime )
|
|
|
|
|
|
);
|
|
|
|
|
|
|
|
|
|
|
|
$this->mOutput->setLimitReportData( 'limitreport-ppvisitednodes',
|
|
|
|
|
|
[ $this->mPPNodeCount, $this->mOptions->getMaxPPNodeCount() ]
|
|
|
|
|
|
);
|
|
|
|
|
|
$this->mOutput->setLimitReportData( 'limitreport-postexpandincludesize',
|
|
|
|
|
|
[ $this->mIncludeSizes['post-expand'], $maxIncludeSize ]
|
|
|
|
|
|
);
|
|
|
|
|
|
$this->mOutput->setLimitReportData( 'limitreport-templateargumentsize',
|
|
|
|
|
|
[ $this->mIncludeSizes['arg'], $maxIncludeSize ]
|
|
|
|
|
|
);
|
|
|
|
|
|
$this->mOutput->setLimitReportData( 'limitreport-expansiondepth',
|
|
|
|
|
|
[ $this->mHighestExpansionDepth, $this->mOptions->getMaxPPExpandDepth() ]
|
|
|
|
|
|
);
|
|
|
|
|
|
$this->mOutput->setLimitReportData( 'limitreport-expensivefunctioncount',
|
|
|
|
|
|
[ $this->mExpensiveFunctionCount, $this->mOptions->getExpensiveParserFunctionLimit() ]
|
|
|
|
|
|
);
|
2018-02-28 02:11:56 +00:00
|
|
|
|
|
2022-10-21 04:32:38 +00:00
|
|
|
|
foreach ( $this->mStripState->getLimitReport() as [ $key, $value ] ) {
|
2018-02-28 02:11:56 +00:00
|
|
|
|
$this->mOutput->setLimitReportData( $key, $value );
|
|
|
|
|
|
}
|
|
|
|
|
|
|
2020-03-19 02:42:09 +00:00
|
|
|
|
$this->hookRunner->onParserLimitReportPrepare( $this, $this->mOutput );
|
2017-12-08 05:33:05 +00:00
|
|
|
|
|
|
|
|
|
|
// Add on template profiling data in human/machine readable way
|
|
|
|
|
|
$dataByFunc = $this->mProfiler->getFunctionStats();
|
2021-02-10 22:31:02 +00:00
|
|
|
|
uasort( $dataByFunc, static function ( $a, $b ) {
|
Use PHP 7 '<=>' operator in 'sort()' callbacks
`$a <=> $b` returns `-1` if `$a` is lesser, `1` if `$b` is lesser,
and `0` if they are equal, which are exactly the values 'sort()'
callbacks are supposed to return.
It also enables the neat idiom `$a[x] <=> $b[x] ?: $a[y] <=> $b[y]`
to sort arrays of objects first by 'x', and by 'y' if they are equal.
* Replace a common pattern like `return $a < $b ? -1 : 1` with the
new operator (and similar patterns with the variables, the numbers
or the comparison inverted). Some of the uses were previously not
correctly handling the variables being equal; this is now
automatically fixed.
* Also replace `return $a - $b`, which is equivalent to `return
$a <=> $b` if both variables are integers but less intuitive.
* (Do not replace `return strcmp( $a, $b )`. It is also equivalent
when both variables are strings, but if any of the variables is not,
'strcmp()' converts it to a string before comparison, which could
give different results than '<=>', so changing this would require
careful review and isn't worth it.)
* Also replace `return $a > $b`, which presumably sort of works most
of the time (returns `1` if `$b` is lesser, and `0` if they are
equal or `$a` is lesser) but is erroneous.
Change-Id: I19a3d2fc8fcdb208c10330bd7a42c4e05d7f5cf3
2017-10-06 20:39:13 +00:00
|
|
|
|
return $b['real'] <=> $a['real']; // descending order
|
2017-12-08 05:33:05 +00:00
|
|
|
|
} );
|
|
|
|
|
|
$profileReport = [];
|
|
|
|
|
|
foreach ( array_slice( $dataByFunc, 0, 10 ) as $item ) {
|
|
|
|
|
|
$profileReport[] = sprintf( "%6.2f%% %8.3f %6d %s",
|
|
|
|
|
|
$item['%real'], $item['real'], $item['calls'],
|
|
|
|
|
|
htmlspecialchars( $item['name'] ) );
|
|
|
|
|
|
}
|
|
|
|
|
|
|
|
|
|
|
|
$this->mOutput->setLimitReportData( 'limitreport-timingprofile', $profileReport );
|
|
|
|
|
|
|
|
|
|
|
|
// Add other cache related metadata
|
2022-04-26 15:48:03 +00:00
|
|
|
|
if ( $this->svcOptions->get( MainConfigNames::ShowHostnames ) ) {
|
2017-12-08 05:33:05 +00:00
|
|
|
|
$this->mOutput->setLimitReportData( 'cachereport-origin', wfHostname() );
|
|
|
|
|
|
}
|
|
|
|
|
|
$this->mOutput->setLimitReportData( 'cachereport-timestamp',
|
|
|
|
|
|
$this->mOutput->getCacheTime() );
|
|
|
|
|
|
$this->mOutput->setLimitReportData( 'cachereport-ttl',
|
|
|
|
|
|
$this->mOutput->getCacheExpiry() );
|
|
|
|
|
|
$this->mOutput->setLimitReportData( 'cachereport-transientcontent',
|
2021-11-23 20:35:49 +00:00
|
|
|
|
$this->mOutput->hasReducedExpiry() );
|
2017-12-08 05:33:05 +00:00
|
|
|
|
}
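/*
 * Sketch of the timing-profile ordering used in makeLimitReport()
 * (illustrative only; the sample data below is made up). uasort() with a
 * reversed spaceship comparison sorts by 'real' in descending order while
 * keeping the array keys, and array_slice() then takes the top entries.
 *
 *     $stats = [
 *         'Template:A' => [ 'name' => 'Template:A', 'real' => 0.2 ],
 *         'Template:B' => [ 'name' => 'Template:B', 'real' => 1.5 ],
 *     ];
 *     uasort( $stats, static function ( $a, $b ) {
 *         return $b['real'] <=> $a['real']; // larger 'real' first
 *     } );
 *     // After sorting, 'Template:B' precedes 'Template:A'.
 *     $top = array_slice( $stats, 0, 10 );
 */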
|
|
|
|
|
|
|
2006-08-06 14:01:47 +00:00
|
|
|
|
/**
|
2014-11-02 18:14:53 +00:00
|
|
|
|
* Half-parse wikitext to half-parsed HTML. This recursive parser entry point
|
|
|
|
|
|
* can be called from an extension tag hook.
|
|
|
|
|
|
*
|
|
|
|
|
|
* The output of this function IS NOT SAFE PARSED HTML; it is "half-parsed"
|
|
|
|
|
|
* instead, which means that lists and links have not been fully parsed yet,
|
|
|
|
|
|
* and strip markers are still present.
|
|
|
|
|
|
*
|
|
|
|
|
|
* Use recursiveTagParseFully() to fully parse wikitext to output-safe HTML.
|
2009-08-30 06:37:10 +00:00
|
|
|
|
*
|
2014-11-02 18:14:53 +00:00
|
|
|
|
* Use this function if you're a parser tag hook and you want to parse
|
|
|
|
|
|
* wikitext before or after applying additional transformations, and you
|
|
|
|
|
|
* intend to *return the result as hook output*, which will cause it to go
|
|
|
|
|
|
* through the rest of parsing process automatically.
|
|
|
|
|
|
*
|
|
|
|
|
|
* If $frame is not provided, then template variables (e.g., {{{1}}}) within
|
|
|
|
|
|
* $text are not expanded
|
2009-08-30 06:37:10 +00:00
|
|
|
|
*
|
2014-04-21 23:38:39 +00:00
|
|
|
|
* @param string $text Text extension wants to have parsed
|
2018-08-31 15:55:44 +00:00
|
|
|
|
* @param-taint $text escapes_htmlnoent
|
2022-07-31 00:02:18 +00:00
|
|
|
|
* @param PPFrame|false $frame The frame to use for expanding any template variables
|
2014-11-02 18:14:53 +00:00
|
|
|
|
* @return string UNSAFE half-parsed HTML
|
2018-08-31 15:55:44 +00:00
|
|
|
|
* @return-taint escaped
|
2021-02-19 22:49:35 +00:00
|
|
|
|
* @since 1.8
|
2006-08-06 14:01:47 +00:00
|
|
|
|
*/
|
2014-08-11 20:24:54 +00:00
|
|
|
|
public function recursiveTagParse( $text, $frame = false ) {
|
2009-08-30 06:37:10 +00:00
|
|
|
|
$text = $this->internalParse( $text, false, $frame );
|
2006-08-06 14:01:47 +00:00
|
|
|
|
return $text;
|
|
|
|
|
|
}
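/*
 * Usage sketch for recursiveTagParse() (illustrative only, not part of the
 * original source). The class name SampleTagHooks and the <sample> tag are
 * hypothetical; the callback signature is the one used for tag hooks
 * registered through Parser::setHook().
 *
 *     class SampleTagHooks {
 *         // Handler for the ParserFirstCallInit hook.
 *         public static function onParserFirstCallInit( Parser $parser ) {
 *             $parser->setHook( 'sample', [ self::class, 'render' ] );
 *         }
 *
 *         public static function render(
 *             $input, array $args, Parser $parser, PPFrame $frame
 *         ) {
 *             // The result is only half-parsed and still contains strip
 *             // markers. Returning it as hook output lets the parser
 *             // finish the remaining stages; it must not be sent to the
 *             // browser directly.
 *             return $parser->recursiveTagParse( $input ?? '', $frame );
 *         }
 *     }
 *
 * Passing $frame matters when the tag is used inside a template: without
 * it, arguments such as {{{1}}} in $input are left unexpanded.
 */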
|
|
|
|
|
|
|
2014-11-02 18:14:53 +00:00
|
|
|
|
/**
|
|
|
|
|
|
* Fully parse wikitext to fully parsed HTML. This recursive parser entry
|
|
|
|
|
|
* point can be called from an extension tag hook.
|
|
|
|
|
|
*
|
|
|
|
|
|
* The output of this function is fully-parsed HTML that is safe for output.
|
|
|
|
|
|
* If you're a parser tag hook, you might want to use recursiveTagParse()
|
|
|
|
|
|
* instead.
|
|
|
|
|
|
*
|
|
|
|
|
|
* If $frame is not provided, then template variables (e.g., {{{1}}}) within
|
|
|
|
|
|
* $text are not expanded
|
|
|
|
|
|
*
|
|
|
|
|
|
* @since 1.25
|
|
|
|
|
|
*
|
|
|
|
|
|
* @param string $text Text extension wants to have parsed
|
2018-08-31 15:55:44 +00:00
|
|
|
|
* @param-taint $text escapes_htmlnoent
|
2022-07-31 00:02:18 +00:00
|
|
|
|
* @param PPFrame|false $frame The frame to use for expanding any template variables
|
2014-11-02 18:14:53 +00:00
|
|
|
|
* @return string Fully parsed HTML
|
2018-08-31 15:55:44 +00:00
|
|
|
|
* @return-taint escaped
|
2014-11-02 18:14:53 +00:00
|
|
|
|
*/
|
|
|
|
|
|
public function recursiveTagParseFully( $text, $frame = false ) {
|
|
|
|
|
|
$text = $this->recursiveTagParse( $text, $frame );
|
|
|
|
|
|
$text = $this->internalParseHalfParsed( $text, false );
|
|
|
|
|
|
return $text;
|
|
|
|
|
|
}
|
|
|
|
|
|
|
2019-10-16 21:43:44 +00:00
|
|
|
|
/**
|
|
|
|
|
|
* Needed by Parsoid/PHP to ensure all the hooks for extensions
|
|
|
|
|
|
* are run in the right order. The primary differences between this
|
|
|
|
|
|
* and recursiveTagParseFully are:
|
|
|
|
|
|
* (a) absence of $frame
|
|
|
|
|
|
* (b) passing true to internalParseHalfParsed so all hooks are run
|
|
|
|
|
|
* (c) running 'ParserAfterParse' hook at the same point in the parsing
|
|
|
|
|
|
* pipeline where parse() does it. This roughly mimics Parsoid/JS behavior,
|
|
|
|
|
|
* where extension tags are processed by the MediaWiki API.
|
|
|
|
|
|
*
|
|
|
|
|
|
* This is a temporary convenience method and will go away as we proceed
|
|
|
|
|
|
* further with Parsoid <-> Parser.php integration.
|
|
|
|
|
|
*
|
|
|
|
|
|
* @internal
|
|
|
|
|
|
* @deprecated
|
|
|
|
|
|
* @param string $text Wikitext source of the extension
|
2019-11-04 01:10:23 +00:00
|
|
|
|
* @return string
|
2019-10-16 21:43:44 +00:00
|
|
|
|
* @return-taint escaped
|
|
|
|
|
|
*/
|
|
|
|
|
|
public function parseExtensionTagAsTopLevelDoc( $text ) {
|
|
|
|
|
|
$text = $this->recursiveTagParse( $text );
|
2020-03-19 02:42:09 +00:00
|
|
|
|
$this->hookRunner->onParserAfterParse( $this, $text, $this->mStripState );
|
2019-10-16 21:43:44 +00:00
|
|
|
|
$text = $this->internalParseHalfParsed( $text, true );
|
|
|
|
|
|
return $text;
|
|
|
|
|
|
}
|
|
|
|
|
|
|
2006-08-14 07:10:31 +00:00
|
|
|
|
/**
|
2006-08-15 02:24:59 +00:00
|
|
|
|
* Expand templates and variables in the text, producing valid, static wikitext.
|
|
|
|
|
|
* Also removes comments.
|
2013-12-06 23:28:09 +00:00
|
|
|
|
* Do not call this function recursively.
|
2014-04-21 23:38:39 +00:00
|
|
|
|
* @param string $text
|
2021-04-25 17:29:33 +00:00
|
|
|
|
* @param ?PageReference $page
|
2014-04-21 23:38:39 +00:00
|
|
|
|
* @param ParserOptions $options
|
|
|
|
|
|
* @param int|null $revid
|
2022-07-31 00:02:18 +00:00
|
|
|
|
* @param PPFrame|false $frame
|
2012-02-09 21:35:05 +00:00
|
|
|
|
* @return mixed|string
|
2021-02-19 22:49:35 +00:00
|
|
|
|
* @since 1.8
|
2006-08-14 07:10:31 +00:00
|
|
|
|
*/
|
2021-07-11 19:11:37 +00:00
|
|
|
|
public function preprocess(
|
|
|
|
|
|
$text,
|
|
|
|
|
|
?PageReference $page,
|
|
|
|
|
|
ParserOptions $options,
|
|
|
|
|
|
$revid = null,
|
|
|
|
|
|
$frame = false
|
2014-12-28 20:16:05 +00:00
|
|
|
|
) {
|
2013-10-27 20:18:06 +00:00
|
|
|
|
$magicScopeVariable = $this->lock();
|
2021-04-25 17:29:33 +00:00
|
|
|
|
$this->startParse( $page, $options, self::OT_PREPROCESS, true );
|
2010-03-30 21:20:05 +00:00
|
|
|
|
if ( $revid !== null ) {
|
2007-05-31 16:01:26 +00:00
|
|
|
|
$this->mRevisionId = $revid;
|
|
|
|
|
|
}
|
Hooks::run() call site migration
Migrate all callers of Hooks::run() to use the new
HookContainer/HookRunner system.
General principles:
* Use DI if it is already used. We're not changing the way state is
managed in this patch.
* HookContainer is always injected, not HookRunner. HookContainer
is a service, it's a more generic interface, it is the only
thing that provides isRegistered() which is needed in some cases,
and a HookRunner can be efficiently constructed from it
(confirmed by benchmark). Because HookContainer is needed
for object construction, it is also needed by all factories.
* "Ask your friendly local base class". Big hierarchies like
SpecialPage and ApiBase have getHookContainer() and getHookRunner()
methods in the base class, and classes that extend that base class
are not expected to know or care where the base class gets its
HookContainer from.
* ProtectedHookAccessorTrait provides protected getHookContainer() and
getHookRunner() methods, getting them from the global service
container. The point of this is to ease migration to DI by ensuring
that call sites ask their local friendly base class rather than
getting a HookRunner from the service container directly.
* Private $this->hookRunner. In some smaller classes where accessor
methods did not seem warranted, there is a private HookRunner property
which is accessed directly. Very rarely (two cases), there is a
protected property, for consistency with code that conventionally
assumes protected=private, but in cases where the class might actually
be overridden, a protected accessor is preferred over a protected
property.
* The last resort: Hooks::runner(). Mostly for static, file-scope and
global code. In a few cases it was used for objects with broken
construction schemes, out of horror or laziness.
Constructors with new required arguments:
* AuthManager
* BadFileLookup
* BlockManager
* ClassicInterwikiLookup
* ContentHandlerFactory
* ContentSecurityPolicy
* DefaultOptionsManager
* DerivedPageDataUpdater
* FullSearchResultWidget
* HtmlCacheUpdater
* LanguageFactory
* LanguageNameUtils
* LinkRenderer
* LinkRendererFactory
* LocalisationCache
* MagicWordFactory
* MessageCache
* NamespaceInfo
* PageEditStash
* PageHandlerFactory
* PageUpdater
* ParserFactory
* PermissionManager
* RevisionStore
* RevisionStoreFactory
* SearchEngineConfig
* SearchEngineFactory
* SearchFormWidget
* SearchNearMatcher
* SessionBackend
* SpecialPageFactory
* UserNameUtils
* UserOptionsManager
* WatchedItemQueryService
* WatchedItemStore
Constructors with new optional arguments:
* DefaultPreferencesFactory
* Language
* LinkHolderArray
* MovePage
* Parser
* ParserCache
* PasswordReset
* Router
setHookContainer() now required after construction:
* AuthenticationProvider
* ResourceLoaderModule
* SearchEngine
Change-Id: Id442b0dbe43aba84bd5cf801d86dedc768b082c7
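
The access patterns described in this commit message can be sketched roughly as below. This is only an illustration under stated assumptions, not MediaWiki's code: `SketchHookContainer`, `SketchHookRunner`, `SketchBase`, `SketchPage`, `SketchSmallService` and the `SketchEvent` hook are hypothetical stand-ins invented for the example. It shows a container injected through the constructor, the "ask your friendly local base class" accessor, and the private `$this->hookRunner` property.

```php
<?php
// Hypothetical stand-in for a hook container: maps hook names to handlers.
class SketchHookContainer {
	/** @var array<string,callable[]> */
	private array $handlers = [];

	public function register( string $hook, callable $handler ): void {
		$this->handlers[$hook][] = $handler;
	}

	public function run( string $hook, array $args = [] ): bool {
		foreach ( $this->handlers[$hook] ?? [] as $handler ) {
			if ( $handler( ...$args ) === false ) {
				return false; // a handler aborted the hook
			}
		}
		return true;
	}
}

// Hypothetical stand-in for a hook runner: one typed method per hook.
class SketchHookRunner {
	private SketchHookContainer $container;

	public function __construct( SketchHookContainer $container ) {
		$this->container = $container;
	}

	public function onSketchEvent( string $text ): bool {
		return $this->container->run( 'SketchEvent', [ $text ] );
	}
}

// Pattern: the base class owns the container; subclasses only call accessors.
abstract class SketchBase {
	private SketchHookContainer $hookContainer;

	public function __construct( SketchHookContainer $hookContainer ) {
		$this->hookContainer = $hookContainer;
	}

	protected function getHookContainer(): SketchHookContainer {
		return $this->hookContainer;
	}

	protected function getHookRunner(): SketchHookRunner {
		return new SketchHookRunner( $this->hookContainer );
	}
}

class SketchPage extends SketchBase {
	public function execute( string $text ): bool {
		return $this->getHookRunner()->onSketchEvent( $text );
	}
}

// Pattern: a small class keeps a private runner and uses it directly.
class SketchSmallService {
	private SketchHookRunner $hookRunner;

	public function __construct( SketchHookContainer $hookContainer ) {
		$this->hookRunner = new SketchHookRunner( $hookContainer );
	}

	public function doWork( string $text ): bool {
		return $this->hookRunner->onSketchEvent( $text );
	}
}

$container = new SketchHookContainer();
$seen = [];
$container->register( 'SketchEvent', static function ( string $text ) use ( &$seen ) {
	$seen[] = $text;
} );
( new SketchPage( $container ) )->execute( 'from base class' );
( new SketchSmallService( $container ) )->doWork( 'from private runner' );
```

In both patterns the calling code never reaches for a global service container, which is the property the commit message is trying to preserve.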
2020-03-19 02:42:09 +00:00
|
|
|
|
$this->hookRunner->onParserBeforePreprocess( $this, $text, $this->mStripState );
|
2013-12-06 23:28:09 +00:00
|
|
|
|
$text = $this->replaceVariables( $text, $frame );
|
2006-11-21 09:53:45 +00:00
|
|
|
|
$text = $this->mStripState->unstripBoth( $text );
|
2006-08-14 07:10:31 +00:00
|
|
|
|
return $text;
|
|
|
|
|
|
}
|
|
|
|
|
|
|
2011-08-14 20:22:52 +00:00
|
|
|
|
/**
|
|
|
|
|
|
* Recursive parser entry point that can be called from an extension tag
|
|
|
|
|
|
* hook.
|
|
|
|
|
|
*
|
2014-04-21 23:38:39 +00:00
|
|
|
|
* @param string $text Text to be expanded
|
2022-07-31 00:02:18 +00:00
|
|
|
|
* @param PPFrame|false $frame The frame to use for expanding any template variables
|
2014-04-08 15:29:17 +00:00
|
|
|
|
* @return string
|
2012-01-09 19:11:55 +00:00
|
|
|
|
* @since 1.19
|
2011-08-14 20:22:52 +00:00
|
|
|
|
*/
|
|
|
|
|
|
public function recursivePreprocess( $text, $frame = false ) {
|
|
|
|
|
|
$text = $this->replaceVariables( $text, $frame );
|
|
|
|
|
|
$text = $this->mStripState->unstripBoth( $text );
|
|
|
|
|
|
return $text;
|
|
|
|
|
|
}
|
|
|
|
|
|
|
2010-03-03 02:41:14 +00:00
|
|
|
|
/**
|
2016-12-11 22:45:07 +00:00
|
|
|
|
* Process the wikitext for the "?preload=" feature. (T7210)
|
2010-03-03 02:41:14 +00:00
|
|
|
|
*
|
2012-07-10 12:48:06 +00:00
|
|
|
|
* "<noinclude>", "<includeonly>" etc. are parsed as for template
|
|
|
|
|
|
* transclusion; comments, templates, arguments, tag hooks and parser
|
|
|
|
|
|
* functions are untouched.
|
2011-09-11 21:07:17 +00:00
|
|
|
|
*
|
2014-04-21 23:38:39 +00:00
|
|
|
|
* @param string $text
|
2021-04-25 17:29:33 +00:00
|
|
|
|
* @param PageReference $page
|
2014-04-21 23:38:39 +00:00
|
|
|
|
* @param ParserOptions $options
|
|
|
|
|
|
* @param array $params
|
|
|
|
|
|
* @return string
|
2021-02-19 22:49:35 +00:00
|
|
|
|
* @since 1.17
|
2010-03-03 02:41:14 +00:00
|
|
|
|
*/
|
2021-04-25 17:29:33 +00:00
|
|
|
|
public function getPreloadText( $text, PageReference $page, ParserOptions $options, $params = [] ) {
|
2014-04-07 01:03:02 +00:00
|
|
|
|
$msg = new RawMessage( $text );
|
|
|
|
|
|
$text = $msg->params( $params )->plain();
|
|
|
|
|
|
|
2010-03-30 21:53:56 +00:00
|
|
|
|
# Parser (re)initialisation
|
2013-10-27 20:18:06 +00:00
|
|
|
|
$magicScopeVariable = $this->lock();
|
2021-04-25 17:29:33 +00:00
|
|
|
|
$this->startParse( $page, $options, self::OT_PLAIN, true );
|
2010-03-03 02:41:14 +00:00
|
|
|
|
|
|
|
|
|
|
$flags = PPFrame::NO_ARGS | PPFrame::NO_TEMPLATES;
|
2020-04-07 23:52:41 +00:00
|
|
|
|
$dom = $this->preprocessToDom( $text, Preprocessor::DOM_FOR_INCLUSION );
|
2011-02-19 19:18:02 +00:00
|
|
|
|
$text = $this->getPreprocessor()->newFrame()->expand( $dom, $flags );
|
|
|
|
|
|
$text = $this->mStripState->unstripBoth( $text );
|
|
|
|
|
|
return $text;
|
2010-03-03 02:41:14 +00:00
|
|
|
|
}
|
|
|
|
|
|
|
2010-12-10 18:17:20 +00:00
|
|
|
|
/**
|
|
|
|
|
|
* Set the current user.
|
|
|
|
|
|
* Should only be used when doing pre-save transform.
|
|
|
|
|
|
*
|
2021-03-16 18:31:27 +00:00
|
|
|
|
* @param UserIdentity|null $user user identity or null (to reset)
|
2021-02-19 22:49:35 +00:00
|
|
|
|
* @since 1.17
|
2010-12-10 18:17:20 +00:00
|
|
|
|
*/
|
2021-03-16 18:31:27 +00:00
|
|
|
|
public function setUser( ?UserIdentity $user ) {
|
2021-09-23 01:08:02 +00:00
|
|
|
|
$this->mUser = $user;
|
2010-12-10 18:17:20 +00:00
|
|
|
|
}
|
|
|
|
|
|
|
2010-06-10 21:05:58 +00:00
|
|
|
|
/**
|
|
|
|
|
|
* Set the context title
|
2011-05-28 17:18:50 +00:00
|
|
|
|
*
|
2021-04-25 17:29:33 +00:00
|
|
|
|
* @deprecated since 1.37, use setPage() instead.
|
2019-08-28 12:39:25 +00:00
|
|
|
|
* @param Title|null $t
|
2021-02-19 22:49:35 +00:00
|
|
|
|
* @since 1.12
|
2010-06-10 21:05:58 +00:00
|
|
|
|
*/
|
2019-08-28 12:39:25 +00:00
|
|
|
|
public function setTitle( Title $t = null ) {
|
2021-04-25 17:29:33 +00:00
|
|
|
|
$this->setPage( $t );
|
|
|
|
|
|
}
|
|
|
|
|
|
|
|
|
|
|
|
/**
|
2021-02-19 22:49:35 +00:00
|
|
|
|
* @since 1.6
|
2021-04-25 17:29:33 +00:00
|
|
|
|
* @deprecated since 1.37, use getPage() instead.
|
|
|
|
|
|
* @return Title
|
|
|
|
|
|
*/
|
2021-07-22 03:11:47 +00:00
|
|
|
|
public function getTitle(): Title {
|
2021-04-25 17:29:33 +00:00
|
|
|
|
if ( !$this->mTitle ) {
|
|
|
|
|
|
$this->mTitle = Title::makeTitle( NS_SPECIAL, 'Badtitle/Parser' );
|
|
|
|
|
|
}
|
|
|
|
|
|
return $this->mTitle;
|
|
|
|
|
|
}
|
|
|
|
|
|
|
|
|
|
|
|
/**
|
|
|
|
|
|
* Set the page used as context for parsing, e.g. when resolving relative subpage links.
|
|
|
|
|
|
*
|
|
|
|
|
|
* @since 1.37
|
|
|
|
|
|
* @param ?PageReference $t
|
|
|
|
|
|
*/
|
|
|
|
|
|
public function setPage( ?PageReference $t = null ) {
|
2014-04-16 18:10:37 +00:00
|
|
|
|
if ( !$t ) {
|
2019-08-30 09:43:35 +00:00
|
|
|
|
$t = Title::makeTitle( NS_SPECIAL, 'Badtitle/Parser' );
|
2021-04-25 17:29:33 +00:00
|
|
|
|
} else {
|
|
|
|
|
|
// For now (early 1.37 alpha), always convert to Title, so we don't have to do it over
|
|
|
|
|
|
// and over again in other methods. Eventually, we will no longer need to have a Title
|
|
|
|
|
|
// instance internally.
|
|
|
|
|
|
$t = Title::castFromPageReference( $t );
|
2010-12-11 03:52:35 +00:00
|
|
|
|
}
|
2010-06-10 21:05:58 +00:00
|
|
|
|
|
2014-01-02 11:16:21 +00:00
|
|
|
|
if ( $t->hasFragment() ) {
|
2010-06-10 21:05:58 +00:00
|
|
|
|
# Strip the fragment to avoid various odd effects
|
2016-04-24 06:27:00 +00:00
|
|
|
|
$this->mTitle = $t->createFragmentTarget( '' );
|
2010-06-10 21:05:58 +00:00
|
|
|
|
} else {
|
|
|
|
|
|
$this->mTitle = $t;
|
|
|
|
|
|
}
|
|
|
|
|
|
}
|
|
|
|
|
|
|
|
|
|
|
|
/**
|
2021-04-25 17:29:33 +00:00
|
|
|
|
* Returns the page used as context for parsing, e.g. when resolving relative subpage links.
|
|
|
|
|
|
* @since 1.37
|
|
|
|
|
|
* @return ?PageReference
|
2010-06-10 21:05:58 +00:00
|
|
|
|
*/
|
2021-04-25 17:29:33 +00:00
|
|
|
|
public function getPage(): ?PageReference {
|
2010-06-10 21:05:58 +00:00
|
|
|
|
return $this->mTitle;
|
|
|
|
|
|
}
|
|
|
|
|
|
|
|
|
|
|
|
/**
|
2020-04-16 15:17:12 +00:00
|
|
|
|
* Accessor for the output type.
|
|
|
|
|
|
* @return int One of the Parser::OT_... constants
|
|
|
|
|
|
* @since 1.35
|
|
|
|
|
|
*/
|
|
|
|
|
|
public function getOutputType(): int {
|
|
|
|
|
|
return $this->mOutputType;
|
|
|
|
|
|
}
|
|
|
|
|
|
|
|
|
|
|
|
/**
|
|
|
|
|
|
* Mutator for the output type.
|
2020-03-04 10:08:00 +00:00
|
|
|
|
* @param int $ot One of the Parser::OT_… constants
|
2021-02-19 22:49:35 +00:00
|
|
|
|
* @since 1.8
|
2010-06-10 21:05:58 +00:00
|
|
|
|
*/
|
2020-04-16 15:17:12 +00:00
|
|
|
|
public function setOutputType( $ot ): void {
|
2010-06-10 21:05:58 +00:00
|
|
|
|
$this->mOutputType = $ot;
|
|
|
|
|
|
# Shortcut alias
|
2016-02-17 09:09:32 +00:00
|
|
|
|
$this->ot = [
|
2010-06-10 21:05:58 +00:00
|
|
|
|
'html' => $ot == self::OT_HTML,
|
|
|
|
|
|
'wiki' => $ot == self::OT_WIKI,
|
|
|
|
|
|
'pre' => $ot == self::OT_PREPROCESS,
|
|
|
|
|
|
'plain' => $ot == self::OT_PLAIN,
|
2016-02-17 09:09:32 +00:00
|
|
|
|
];
|
2010-06-10 21:05:58 +00:00
|
|
|
|
}
|
|
|
|
|
|
|
|
|
|
|
|
/**
|
|
|
|
|
|
* Accessor/mutator for the output type
|
|
|
|
|
|
*
|
2013-03-11 17:15:01 +00:00
|
|
|
|
* @param int|null $x New value or null to just get the current one
|
2014-04-21 23:38:39 +00:00
|
|
|
|
* @return int
|
2020-04-16 15:17:12 +00:00
|
|
|
|
* @deprecated since 1.35, use getOutputType()/setOutputType()
|
2010-06-10 21:05:58 +00:00
|
|
|
|
*/
|
2014-08-11 20:24:54 +00:00
|
|
|
|
public function OutputType( $x = null ) {
|
2020-04-16 16:48:29 +00:00
|
|
|
|
wfDeprecated( __METHOD__, '1.35' );
|
2010-06-10 21:05:58 +00:00
|
|
|
|
return wfSetVar( $this->mOutputType, $x );
|
|
|
|
|
|
}
|
|
|
|
|
|
|
|
|
|
|
|
/**
|
2014-04-21 23:38:39 +00:00
|
|
|
|
* @return ParserOutput
|
2021-02-19 22:49:35 +00:00
|
|
|
|
* @since 1.14
|
2010-06-10 21:05:58 +00:00
|
|
|
|
*/
|
2014-08-11 20:24:54 +00:00
|
|
|
|
public function getOutput() {
|
2010-06-10 21:05:58 +00:00
|
|
|
|
return $this->mOutput;
|
|
|
|
|
|
}
|
|
|
|
|
|
|
|
|
|
|
|
/**
|
2020-03-04 10:08:00 +00:00
|
|
|
|
* @return ParserOptions|null
|
2021-02-19 22:49:35 +00:00
|
|
|
|
* @since 1.6
|
2010-06-10 21:05:58 +00:00
|
|
|
|
*/
|
2014-08-11 20:24:54 +00:00
|
|
|
|
public function getOptions() {
|
2010-06-10 21:05:58 +00:00
|
|
|
|
return $this->mOptions;
|
|
|
|
|
|
}
|
|
|
|
|
|
|
2020-04-16 15:17:12 +00:00
|
|
|
|
/**
|
|
|
|
|
|
* Mutator for the ParserOptions object
|
|
|
|
|
|
* @param ParserOptions $options The new parser options
|
|
|
|
|
|
* @since 1.35
|
|
|
|
|
|
*/
|
|
|
|
|
|
public function setOptions( ParserOptions $options ): void {
|
|
|
|
|
|
$this->mOptions = $options;
|
|
|
|
|
|
}
|
|
|
|
|
|
|
2010-06-10 21:05:58 +00:00
|
|
|
|
/**
|
|
|
|
|
|
* Accessor/mutator for the ParserOptions object
|
|
|
|
|
|
*
|
2018-06-26 21:14:43 +00:00
|
|
|
|
* @param ParserOptions|null $x New value or null to just get the current one
|
2012-02-09 19:29:36 +00:00
|
|
|
|
* @return ParserOptions Current ParserOptions object
|
2020-04-16 15:17:12 +00:00
|
|
|
|
* @deprecated since 1.35, use getOptions() / setOptions()
|
2010-06-10 21:05:58 +00:00
|
|
|
|
*/
|
2014-08-11 20:24:54 +00:00
|
|
|
|
public function Options( $x = null ) {
|
2020-04-16 16:48:29 +00:00
|
|
|
|
wfDeprecated( __METHOD__, '1.35' );
|
2010-06-10 21:05:58 +00:00
|
|
|
|
return wfSetVar( $this->mOptions, $x );
|
|
|
|
|
|
}
|
|
|
|
|
|
|
2011-05-28 17:18:50 +00:00
|
|
|
|
/**
|
|
|
|
|
|
* @return int
|
2021-02-19 22:49:35 +00:00
|
|
|
|
* @since 1.14
|
2011-05-28 17:18:50 +00:00
|
|
|
|
*/
|
2014-08-11 20:24:54 +00:00
|
|
|
|
public function nextLinkID() {
|
2010-06-10 21:05:58 +00:00
|
|
|
|
return $this->mLinkID++;
|
|
|
|
|
|
}
|
2006-02-28 05:18:36 +00:00
|
|
|
|
|
2011-05-28 17:18:50 +00:00
|
|
|
|
/**
|
2014-04-21 23:38:39 +00:00
|
|
|
|
* @param int $id
|
2021-02-19 22:49:35 +00:00
|
|
|
|
* @since 1.8
|
2011-05-28 17:18:50 +00:00
|
|
|
|
*/
|
2014-08-11 20:24:54 +00:00
|
|
|
|
public function setLinkID( $id ) {
|
2011-02-23 06:58:15 +00:00
|
|
|
|
$this->mLinkID = $id;
|
|
|
|
|
|
}
|
|
|
|
|
|
|
2011-02-19 01:02:56 +00:00
|
|
|
|
/**
|
2012-03-05 05:53:12 +00:00
|
|
|
|
* Get a language object for use in parser functions such as {{FORMATNUM:}}
|
2011-02-19 01:02:56 +00:00
|
|
|
|
* @return Language
|
2021-02-19 22:49:35 +00:00
|
|
|
|
* @since 1.7
|
2022-09-28 19:52:56 +00:00
|
|
|
|
* @deprecated since 1.40; use ::getTargetLanguage() instead.
|
2011-02-19 01:02:56 +00:00
|
|
|
|
*/
|
2014-08-11 20:24:54 +00:00
|
|
|
|
public function getFunctionLang() {
|
2022-09-28 19:52:56 +00:00
|
|
|
|
wfDeprecated( __METHOD__, '1.40' );
|
2012-03-05 05:53:12 +00:00
|
|
|
|
return $this->getTargetLanguage();
|
|
|
|
|
|
}
|
|
|
|
|
|
|
|
|
|
|
|
/**
|
2012-05-03 20:02:27 +00:00
|
|
|
|
* Get the target language for the content being parsed. This is usually the
|
|
|
|
|
|
* language that the content is in.
|
2012-08-20 14:55:28 +00:00
|
|
|
|
*
|
|
|
|
|
|
* @since 1.19
|
|
|
|
|
|
*
|
2021-03-25 20:37:28 +00:00
|
|
|
|
* @return Language|StubUserLang
|
2012-03-05 05:53:12 +00:00
|
|
|
|
*/
|
2012-08-20 14:55:28 +00:00
|
|
|
|
public function getTargetLanguage() {
|
2008-03-07 14:02:12 +00:00
|
|
|
|
$target = $this->mOptions->getTargetLanguage();
|
2012-08-20 14:55:28 +00:00
|
|
|
|
|
2008-03-07 14:02:12 +00:00
|
|
|
|
if ( $target !== null ) {
|
|
|
|
|
|
return $target;
|
2013-04-20 15:38:24 +00:00
|
|
|
|
} elseif ( $this->mOptions->getInterfaceMessage() ) {
|
2011-10-19 14:16:01 +00:00
|
|
|
|
return $this->mOptions->getUserLangObj();
|
2008-03-07 14:02:12 +00:00
|
|
|
|
}
|
2012-08-20 14:55:28 +00:00
|
|
|
|
|
2019-10-18 19:50:58 +00:00
|
|
|
|
return $this->getTitle()->getPageLanguage();
|
2006-07-03 11:07:00 +00:00
|
|
|
|
}
|
|
|
|
|
|
|
2010-12-10 18:17:20 +00:00
|
|
|
|
/**
|
2021-09-23 01:08:02 +00:00
|
|
|
|
* Get a user either from the user set on Parser if it's set,
|
|
|
|
|
|
* or from the ParserOptions object otherwise.
|
|
|
|
|
|
*
|
2021-10-03 19:37:42 +00:00
|
|
|
|
* @since 1.36
|
2021-03-16 18:31:27 +00:00
|
|
|
|
* @return UserIdentity
|
|
|
|
|
|
*/
|
|
|
|
|
|
public function getUserIdentity(): UserIdentity {
|
2021-09-23 01:08:02 +00:00
|
|
|
|
return $this->mUser ?? $this->getOptions()->getUserIdentity();
|
2021-03-16 18:31:27 +00:00
|
|
|
|
}
|
|
|
|
|
|
|
2008-01-21 16:36:08 +00:00
|
|
|
|
/**
|
|
|
|
|
|
* Get a preprocessor object
|
2010-06-10 21:05:58 +00:00
|
|
|
|
*
|
2014-04-21 23:38:39 +00:00
|
|
|
|
* @return Preprocessor
|
2021-02-19 17:40:27 +00:00
|
|
|
|
* @since 1.12.0
|
2008-01-21 16:36:08 +00:00
|
|
|
|
*/
|
2014-08-11 20:24:54 +00:00
|
|
|
|
public function getPreprocessor() {
|
2008-01-21 16:36:08 +00:00
|
|
|
|
return $this->mPreprocessor;
|
|
|
|
|
|
}
|
|
|
|
|
|
|
2016-05-13 00:37:17 +00:00
|
|
|
|
/**
|
2016-09-26 23:25:17 +00:00
|
|
|
|
* Get a LinkRenderer instance to make links with
|
2016-05-13 00:37:17 +00:00
|
|
|
|
*
|
|
|
|
|
|
* @since 1.28
|
2016-09-26 23:25:17 +00:00
|
|
|
|
* @return LinkRenderer
|
2016-05-13 00:37:17 +00:00
|
|
|
|
*/
|
|
|
|
|
|
public function getLinkRenderer() {
|
2018-08-08 14:57:31 +00:00
|
|
|
|
// XXX We make the LinkRenderer with current options and then cache it forever
|
2016-05-13 00:37:17 +00:00
|
|
|
|
if ( !$this->mLinkRenderer ) {
|
2018-08-08 14:57:31 +00:00
|
|
|
|
$this->mLinkRenderer = $this->linkRendererFactory->create();
|
2016-05-13 00:37:17 +00:00
|
|
|
|
}
|
|
|
|
|
|
|
|
|
|
|
|
return $this->mLinkRenderer;
|
|
|
|
|
|
}
|
|
|
|
|
|
|
2018-07-25 12:14:13 +00:00
|
|
|
|
/**
|
|
|
|
|
|
* Get the MagicWordFactory that this Parser is using
|
|
|
|
|
|
*
|
|
|
|
|
|
* @since 1.32
|
|
|
|
|
|
* @return MagicWordFactory
|
|
|
|
|
|
*/
|
|
|
|
|
|
public function getMagicWordFactory() {
|
|
|
|
|
|
return $this->magicWordFactory;
|
|
|
|
|
|
}
|
|
|
|
|
|
|
2018-07-26 12:37:13 +00:00
|
|
|
|
/**
|
|
|
|
|
|
* Get the content language that this Parser is using
|
|
|
|
|
|
*
|
|
|
|
|
|
* @since 1.32
|
|
|
|
|
|
* @return Language
|
|
|
|
|
|
*/
|
|
|
|
|
|
public function getContentLanguage() {
|
2018-08-03 08:25:15 +00:00
|
|
|
|
return $this->contLang;
|
2018-07-26 12:37:13 +00:00
|
|
|
|
}
|
|
|
|
|
|
|
2020-02-23 00:42:34 +00:00
|
|
|
|
/**
|
|
|
|
|
|
* Get the BadFileLookup instance that this Parser is using
|
|
|
|
|
|
*
|
|
|
|
|
|
* @since 1.35
|
|
|
|
|
|
* @return BadFileLookup
|
|
|
|
|
|
*/
|
|
|
|
|
|
public function getBadFileLookup() {
|
|
|
|
|
|
return $this->badFileLookup;
|
|
|
|
|
|
}
|
|
|
|
|
|
|
2005-07-03 07:15:53 +00:00
|
|
|
|
/**
|
2006-06-01 19:38:14 +00:00
|
|
|
|
* Replaces all occurrences of HTML-style comments and the given tags
|
2008-02-09 21:48:41 +00:00
|
|
|
|
* in the text with a unique marker and returns the new text. The output
|
2006-06-01 19:38:14 +00:00
|
|
|
|
* parameter $matches will be an associative array filled with data in
|
|
|
|
|
|
* the form:
|
2012-07-10 12:48:06 +00:00
|
|
|
|
*
|
|
|
|
|
|
* @code
|
2016-08-26 11:36:58 +00:00
|
|
|
|
* 'UNIQ-xxxxx' => [
|
2006-06-01 19:38:14 +00:00
|
|
|
|
* 'element',
|
|
|
|
|
|
* 'tag content',
|
2016-08-26 11:36:58 +00:00
|
|
|
|
* [ 'param' => 'x' ],
|
|
|
|
|
|
* '<element param="x">tag content</element>' ]
|
2012-07-10 12:48:06 +00:00
|
|
|
|
* @endcode
|
2005-07-03 07:15:53 +00:00
|
|
|
|
*
|
2021-06-17 11:34:09 +00:00
|
|
|
|
* @param string[] $elements List of element names. Comments are always extracted.
|
2013-03-11 17:15:01 +00:00
|
|
|
|
* @param string $text Source text string.
|
2021-06-17 11:34:09 +00:00
|
|
|
|
* @param array[] &$matches Out parameter, Array: extracted tags
|
2014-04-08 15:29:17 +00:00
|
|
|
|
* @return string Stripped text
|
2004-09-21 05:49:12 +00:00
|
|
|
|
*/
|
2019-08-27 09:23:52 +00:00
|
|
|
|
public static function extractTagsAndParams( array $elements, $text, &$matches ) {
|
2006-08-06 14:01:47 +00:00
|
|
|
|
static $n = 1;
|
2004-06-08 18:11:28 +00:00
|
|
|
|
$stripped = '';
|
2016-02-17 09:09:32 +00:00
|
|
|
|
$matches = [];
|
2005-07-03 07:15:53 +00:00
|
|
|
|
|
2006-06-01 19:38:14 +00:00
|
|
|
|
$taglist = implode( '|', $elements );
|
2019-03-01 21:15:22 +00:00
|
|
|
|
$start = "/<($taglist)(\\s+[^>]*?|\\s*?)(\/?>)|<(!--)/i";
|
2004-03-26 17:14:23 +00:00
|
|
|
|
|
2010-01-27 02:41:22 +00:00
|
|
|
|
while ( $text != '' ) {
|
2005-06-03 08:12:48 +00:00
|
|
|
|
$p = preg_split( $start, $text, 2, PREG_SPLIT_DELIM_CAPTURE );
|
2004-03-26 17:14:23 +00:00
|
|
|
|
$stripped .= $p[0];
|
2010-03-30 21:20:05 +00:00
|
|
|
|
if ( count( $p ) < 5 ) {
|
2005-06-03 08:12:48 +00:00
|
|
|
|
break;
|
|
|
|
|
|
}
|
2010-03-30 21:20:05 +00:00
|
|
|
|
if ( count( $p ) > 5 ) {
|
2010-03-30 21:53:56 +00:00
|
|
|
|
# comment
|
2013-03-07 16:50:43 +00:00
|
|
|
|
$element = $p[4];
|
2006-06-01 19:38:14 +00:00
|
|
|
|
$attributes = '';
|
2013-03-07 16:50:43 +00:00
|
|
|
|
$close = '';
|
|
|
|
|
|
$inside = $p[5];
|
2006-04-08 04:40:09 +00:00
|
|
|
|
} else {
|
2010-03-30 21:53:56 +00:00
|
|
|
|
# tag
|
2022-10-21 04:32:38 +00:00
|
|
|
|
[ , $element, $attributes, $close, $inside ] = $p;
|
2006-04-08 04:40:09 +00:00
|
|
|
|
}
|
2005-07-03 07:15:53 +00:00
|
|
|
|
|
Use a fixed marker prefix string in the Parser and MWTidy
Generating one-time, unique strip markers hurts us in multiple ways:
* The strip marker regexes don't benefit from JIT compilation, so they are
slower to execute than they could be.
* Although the regexes don't benefit from JIT compilation, they are still
compiled, because HHVM bets on regexes getting reused. This extra work is
fairly costly (1-2% of CPU usage on the app servers) and doesn't pay off.
* The size of the PCRE JIT cache is finite, and the caching of one-off regexes
displaces from the cache regexes which are in fact reused.
Tim's preferred solution (per his review comment on
https://gerrit.wikimedia.org/r/167530/) is to use fixed strip markers.
So:
* Replace usage of $parser->mUniqPrefix with Parser::MARKER_PREFIX, which
complements the existing Parser::MARKER_SUFFIX.
* Deprecate Parser::mUniqPrefix and its accessor, Parser::uniqPrefix().
* Deprecate Parser::getRandomString(), since it is no longer useful.
* In Preprocessor_*::preprocessToObj() and Parser::fetchTemplateAndTitle,
replace any occurrences of \x7f with '?', to prevent strip marker forgery.
\x7f is not valid input anyway.
* Deprecate the $prefix parameter for StripState::__construct, since a custom
prefix may no longer be specified.
Change-Id: I31d4556bbb07acb72c33fda335fa5a230379a03f
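
The idea in this commit message can be sketched as follows. Because the marker prefix and suffix are constants, the regex that finds markers is the same string on every call and can be compiled once and reused. The names `SKETCH_MARKER_PREFIX`, `SKETCH_MARKER_SUFFIX` and the `sketch*` functions are hypothetical, and the constant values are assumed for illustration rather than taken from this file.

```php
<?php
// Assumed stand-ins for Parser::MARKER_PREFIX and Parser::MARKER_SUFFIX.
const SKETCH_MARKER_PREFIX = "\x7f'\"`UNIQ-";
const SKETCH_MARKER_SUFFIX = "-QINU`\"'\x7f";

// The pattern never changes, so it is built once and reused.
function sketchMarkerRegex(): string {
	static $regex = null;
	if ( $regex === null ) {
		$regex = '/' . preg_quote( SKETCH_MARKER_PREFIX, '/' )
			. '([^\x7f]+)'
			. preg_quote( SKETCH_MARKER_SUFFIX, '/' ) . '/';
	}
	return $regex;
}

// Store content and return the marker that stands in for it.
function sketchStrip( string $content, array &$store ): string {
	$marker = SKETCH_MARKER_PREFIX . '-item-' . count( $store ) . '-' . SKETCH_MARKER_SUFFIX;
	$store[$marker] = $content;
	return $marker;
}

// Replace every known marker with its stored content.
function sketchUnstrip( string $text, array $store ): string {
	return preg_replace_callback(
		sketchMarkerRegex(),
		static function ( array $m ) use ( $store ) {
			return $store[$m[0]] ?? $m[0];
		},
		$text
	);
}

// Untrusted input must not contain \x7f, or it could forge a marker.
function sketchSanitizeInput( string $text ): string {
	return str_replace( "\x7f", '?', $text );
}

$store = [];
$text = 'before ' . sketchStrip( '<b>raw</b>', $store ) . ' after';
$restored = sketchUnstrip( $text, $store );
$safe = sketchSanitizeInput( "\x7fUNIQ" );
```

With per-request random prefixes, `sketchMarkerRegex()` would return a different pattern each time, which is the one-off compilation cost the commit message describes.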
2015-05-26 20:48:33 +00:00
|
|
|
|
$marker = self::MARKER_PREFIX . "-$element-" . sprintf( '%08X', $n++ ) . self::MARKER_SUFFIX;
|
2005-06-03 08:12:48 +00:00
|
|
|
|
$stripped .= $marker;
|
2005-07-03 07:15:53 +00:00
|
|
|
|
|
2006-06-01 19:38:14 +00:00
|
|
|
|
if ( $close === '/>' ) {
|
2010-03-30 21:53:56 +00:00
|
|
|
|
# Empty element tag, <tag />
|
2006-06-01 06:16:55 +00:00
|
|
|
|
$content = null;
|
2005-11-13 04:47:03 +00:00
|
|
|
|
$text = $inside;
|
2006-06-01 19:38:14 +00:00
|
|
|
|
$tail = null;
|
2004-03-26 17:14:23 +00:00
|
|
|
|
} else {
|
2010-03-30 21:20:05 +00:00
|
|
|
|
if ( $element === '!--' ) {
|
2006-06-01 19:38:14 +00:00
|
|
|
|
$end = '/(-->)/';
|
2006-06-01 06:16:55 +00:00
|
|
|
|
} else {
|
2006-06-01 19:38:14 +00:00
|
|
|
|
$end = "/(<\\/$element\\s*>)/i";
|
2006-06-01 06:16:55 +00:00
|
|
|
|
}
|
2006-06-01 19:38:14 +00:00
|
|
|
|
$q = preg_split( $end, $inside, 2, PREG_SPLIT_DELIM_CAPTURE );
|
2006-06-01 06:16:55 +00:00
|
|
|
|
$content = $q[0];
|
2010-03-30 21:20:05 +00:00
|
|
|
|
if ( count( $q ) < 3 ) {
|
2005-11-13 04:47:03 +00:00
|
|
|
|
# No end tag -- let it run out to the end of the text.
|
2006-06-01 19:38:14 +00:00
|
|
|
|
$tail = '';
|
2006-06-01 08:24:22 +00:00
|
|
|
|
$text = '';
|
2005-11-13 04:47:03 +00:00
|
|
|
|
} else {
|
2022-10-21 04:32:38 +00:00
|
|
|
|
[ , $tail, $text ] = $q;
|
2005-11-13 04:47:03 +00:00
|
|
|
|
}
|
2004-03-26 17:14:23 +00:00
|
|
|
|
}
|
2006-10-17 08:49:27 +00:00
|
|
|
|
|
2016-02-17 09:09:32 +00:00
|
|
|
|
$matches[$marker] = [ $element,
|
2006-06-01 06:16:55 +00:00
|
|
|
|
$content,
|
|
|
|
|
|
Sanitizer::decodeTagAttributes( $attributes ),
|
2016-02-17 09:09:32 +00:00
|
|
|
|
"<$element$attributes$close$content$tail" ];
|
2004-03-26 17:14:23 +00:00
|
|
|
|
}
|
|
|
|
|
|
return $stripped;
|
2004-04-12 23:59:37 +00:00
|
|
|
|
}
|
2004-03-26 17:14:23 +00:00
|
|
|
|
|
2004-09-21 05:49:12 +00:00
|
|
|
|
/**
|
2007-11-20 10:55:08 +00:00
|
|
|
|
* Get a list of strippable XML-like elements
|
2011-05-28 17:18:50 +00:00
|
|
|
|
*
|
|
|
|
|
|
* @return array
|
2004-09-21 05:49:12 +00:00
|
|
|
|
*/
|
2014-08-11 20:24:54 +00:00
|
|
|
|
public function getStripList() {
|
2010-02-03 07:10:58 +00:00
|
|
|
|
return $this->mStripList;
|
2007-11-20 10:55:08 +00:00
|
|
|
|
}
|
2006-10-17 08:49:27 +00:00
|
|
|
|
|
2019-08-12 06:10:22 +00:00
|
|
|
|
/**
|
|
|
|
|
|
* @return StripState
|
2021-02-19 22:49:35 +00:00
|
|
|
|
* @since 1.34
|
2019-08-12 06:10:22 +00:00
|
|
|
|
*/
|
|
|
|
|
|
public function getStripState() {
|
|
|
|
|
|
return $this->mStripState;
|
|
|
|
|
|
}
|
|
|
|
|
|
|
2004-09-21 05:49:12 +00:00
|
|
|
|
/**
|
|
|
|
|
|
* Add an item to the strip state.
|
|
|
|
|
|
* Returns the unique tag which must be inserted into the stripped text.
|
|
|
|
|
|
* The tag will be replaced with the original text in unstrip().
|
2011-08-05 00:33:03 +00:00
|
|
|
|
*
|
2014-04-21 23:38:39 +00:00
|
|
|
|
* @param string $text
|
2011-08-05 00:33:03 +00:00
|
|
|
|
*
|
|
|
|
|
|
* @return string
|
2004-09-21 05:49:12 +00:00
|
|
|
|
*/
|
2014-08-11 20:24:54 +00:00
|
|
|
|
public function insertStripItem( $text ) {
|
2015-05-26 20:48:33 +00:00
|
|
|
|
$marker = self::MARKER_PREFIX . "-item-{$this->mMarkerIndex}-" . self::MARKER_SUFFIX;
|
2008-02-05 08:23:58 +00:00
|
|
|
|
$this->mMarkerIndex++;
|
2015-05-26 20:48:33 +00:00
|
|
|
|
$this->mStripState->addGeneral( $marker, $text );
|
|
|
|
|
|
return $marker;
|
2004-04-09 15:29:33 +00:00
|
|
|
|
}
|
2010-01-07 04:13:14 +00:00
|
|
|
|
|
Deprecate Parser implementation methods (will be private in next release)
The following public methods were renamed and made private; the old name
is hard-deprecated and calls the new renamed private method:
Parser::doMagicLinks() => handleMagicLinks()
Parser::doDoubleUnderscore() => handleDoubleUnderscore()
Parser::doHeadings() => handleHeadings()
Parser::doAllQuotes() => handleAllQuotes()
Parser::replaceExternalLinks() => handleExternalLinks()
Parser::replaceInternalLinks() => handleInternalLinks()
Parser::replaceInternalLinks2() => handleInternalLinks2()
Parser::getVariableValue() => expandMagicVariable()
Parser::initialiseVariables() => initializeVariables()
Parser::formatHeadings() => finalizeHeadings()
Parser::test{Pst,Preprocess,Srvus}() => fuzzTest{Pst,Preprocess,Srvus}()
Additionally, the following methods are not used externally, but are
used outside the Parser class by core code. They have been marked
@internal:
Parser::doQuotes() (used by {{#displaytitle}}),
Parser::getExternalLink{Rel,Attribs}() (used by Linker),
Parser::normalizeLinkUrl() (used by Special:LinkSearch and elsewhere).
Parser::{brace,arg,extension}Substitution() (used by PPFrame)
Code search query:
https://codesearch.wmflabs.org/deployed/?q=do%28MagicLinks%7CDoubleUnderscore%7CHeadings%7CAllQuotes%29%7Creplace%28ExternalLinks%7CInternalLinks%28%7C2%29%29%7CgetVariableValue%7CinitialiseVariables%7CformatHeadings%7Ctest%28Pst%7CPreprocess%7CSrvus%29%7CdoQuotes%7CgetExternalLink%28Rel%7CAttribs%29%7CnormalizeLinkUrl%7C%28brace%2Carg%2Cextension%29Substitution&i=nope&files=&repos=
Bug: T236810
Change-Id: I19a43ffc5dcfdd2981b51079c33422c964acb076
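
The rename-and-deprecate pattern described in this commit message can be sketched as below. The old public name stays as a thin wrapper that emits a deprecation notice and forwards to the renamed private method. `SketchParser` and its toy heading conversion are hypothetical, and plain `trigger_error()` is used here as an assumed substitute for MediaWiki's own deprecation helper so that the example is self-contained.

```php
<?php
class SketchParser {
	/**
	 * Old public entry point, kept for one release.
	 * @deprecated use of this name triggers a notice
	 */
	public function doHeadings( string $text ): string {
		trigger_error( __METHOD__ . ' is deprecated', E_USER_DEPRECATED );
		return $this->handleHeadings( $text );
	}

	// New private implementation (toy version: "== X ==" becomes <h2>X</h2>).
	private function handleHeadings( string $text ): string {
		return preg_replace( '/^==\s*(.+?)\s*==\s*$/m', '<h2>$1</h2>', $text );
	}
}

// Collect deprecation notices instead of printing them.
$warnings = [];
set_error_handler(
	static function ( int $errno, string $errstr ) use ( &$warnings ) {
		$warnings[] = $errstr;
		return true;
	},
	E_USER_DEPRECATED
);

$html = ( new SketchParser() )->doHeadings( '== Title ==' );
restore_error_handler();
```

External callers keep working for one release but see the notice, while the class itself is free to call only the private method.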
2019-10-28 19:52:50 +00:00
|
|
|
|
/**
|
|
|
|
|
|
* Parse the wiki syntax used to render tables.
|
|
|
|
|
|
*
|
|
|
|
|
|
* @param string $text
|
|
|
|
|
|
* @return string
|
|
|
|
|
|
*/
|
|
|
|
|
|
private function handleTables( $text ) {
|
2008-08-26 14:37:15 +00:00
|
|
|
|
$lines = StringUtils::explode( "\n", $text );
|
|
|
|
|
|
$out = '';
|
2016-02-17 09:09:32 +00:00
|
|
|
|
$td_history = []; # Is a td tag currently open?
|
|
|
|
|
|
$last_tag_history = []; # Save history of last tag activated (td, th or caption)
|
|
|
|
|
|
$tr_history = []; # Is a tr tag currently open?
|
|
|
|
|
|
$tr_attributes = []; # history of tr attributes
|
|
|
|
|
|
$has_opened_tr = []; # Did this table open a <tr> element?
|
2011-09-15 12:10:53 +00:00
|
|
|
|
$indent_level = 0; # indent level of the table
|
2008-08-26 14:37:15 +00:00
|
|
|
|
|
2011-01-26 01:16:18 +00:00
|
|
|
|
foreach ( $lines as $outLine ) {
|
2008-08-26 14:37:15 +00:00
|
|
|
|
$line = trim( $outLine );
|
2006-12-04 20:08:53 +00:00
|
|
|
|
|
2011-09-15 12:10:53 +00:00
|
|
|
|
if ( $line === '' ) { # empty line, go to next line
|
2013-01-26 21:11:09 +00:00
|
|
|
|
$out .= $outLine . "\n";
|
2006-12-04 20:08:53 +00:00
|
|
|
|
continue;
|
|
|
|
|
|
}
|
2011-09-15 12:10:53 +00:00
|
|
|
|
|
|
|
|
|
|
$first_character = $line[0];
|
2015-07-03 18:53:06 +00:00
|
|
|
|
$first_two = substr( $line, 0, 2 );
|
2016-02-17 09:09:32 +00:00
|
|
|
|
$matches = [];
|
2006-12-04 20:08:53 +00:00
|
|
|
|
|
2015-07-08 21:17:59 +00:00
|
|
|
|
if ( preg_match( '/^(:*)\s*\{\|(.*)$/', $line, $matches ) ) {
|
2011-09-15 12:10:53 +00:00
|
|
|
|
# First check if we are starting a new table
|
|
|
|
|
|
$indent_level = strlen( $matches[1] );
|
2006-01-07 13:31:29 +00:00
|
|
|
|
|
2006-11-21 09:53:45 +00:00
|
|
|
|
$attributes = $this->mStripState->unstripBoth( $matches[2] );
|
2013-01-26 21:11:09 +00:00
|
|
|
|
$attributes = Sanitizer::fixTagAttributes( $attributes, 'table' );
|
|
|
|
|
|
|
|
|
|
|
|
$outLine = str_repeat( '<dl><dd>', $indent_level ) . "<table{$attributes}>";
|
|
|
|
|
|
array_push( $td_history, false );
|
|
|
|
|
|
array_push( $last_tag_history, '' );
|
|
|
|
|
|
array_push( $tr_history, false );
|
|
|
|
|
|
array_push( $tr_attributes, '' );
|
|
|
|
|
|
array_push( $has_opened_tr, false );
|
2011-09-15 12:10:53 +00:00
|
|
|
|
} elseif ( count( $td_history ) == 0 ) {
|
|
|
|
|
|
# Don't do any of the following
|
2013-01-26 21:11:09 +00:00
|
|
|
|
$out .= $outLine . "\n";
|
2011-09-15 12:10:53 +00:00
|
|
|
|
continue;
|
2015-07-03 18:53:06 +00:00
|
|
|
|
} elseif ( $first_two === '|}' ) {
|
2011-09-15 12:10:53 +00:00
|
|
|
|
# We are ending a table
|
2013-01-26 21:11:09 +00:00
|
|
|
|
$line = '</table>' . substr( $line, 2 );
|
2011-09-15 12:10:53 +00:00
|
|
|
|
$last_tag = array_pop( $last_tag_history );
|
2006-12-04 20:08:53 +00:00
|
|
|
|
|
2011-09-15 12:10:53 +00:00
|
|
|
|
if ( !array_pop( $has_opened_tr ) ) {
|
|
|
|
|
|
$line = "<tr><td></td></tr>{$line}";
|
2006-12-04 20:08:53 +00:00
|
|
|
|
}
|
2011-04-15 22:36:09 +00:00
|
|
|
|
|
2011-09-15 12:10:53 +00:00
|
|
|
|
if ( array_pop( $tr_history ) ) {
|
|
|
|
|
|
$line = "</tr>{$line}";
|
2011-04-14 10:02:51 +00:00
|
|
|
|
}
|
2006-12-04 20:08:53 +00:00
|
|
|
|
|
2011-09-15 12:10:53 +00:00
|
|
|
|
if ( array_pop( $td_history ) ) {
|
|
|
|
|
|
$line = "</{$last_tag}>{$line}";
|
2006-12-04 20:08:53 +00:00
|
|
|
|
}
|
2011-09-15 12:10:53 +00:00
|
|
|
|
array_pop( $tr_attributes );
|
2018-03-01 23:02:54 +00:00
|
|
|
|
if ( $indent_level > 0 ) {
|
|
|
|
|
|
$outLine = rtrim( $line ) . str_repeat( '</dd></dl>', $indent_level );
|
|
|
|
|
|
} else {
|
|
|
|
|
|
$outLine = $line;
|
|
|
|
|
|
}
|
2015-07-03 18:53:06 +00:00
|
|
|
|
} elseif ( $first_two === '|-' ) {
|
2011-09-15 12:10:53 +00:00
|
|
|
|
# Now we have a table row
|
|
|
|
|
|
$line = preg_replace( '#^\|-+#', '', $line );
|
2006-12-04 20:08:53 +00:00
|
|
|
|
|
2011-09-15 12:10:53 +00:00
|
|
|
|
# What's after the tag is now only attributes
|
2011-04-12 21:27:24 +00:00
|
|
|
|
$attributes = $this->mStripState->unstripBoth( $line );
|
|
|
|
|
|
$attributes = Sanitizer::fixTagAttributes( $attributes, 'tr' );
|
2011-09-15 12:10:53 +00:00
|
|
|
|
array_pop( $tr_attributes );
|
|
|
|
|
|
array_push( $tr_attributes, $attributes );
|
2011-04-13 11:24:38 +00:00
|
|
|
|
|
2011-09-15 12:10:53 +00:00
|
|
|
|
$line = '';
|
|
|
|
|
|
$last_tag = array_pop( $last_tag_history );
|
|
|
|
|
|
array_pop( $has_opened_tr );
|
2013-01-26 21:11:09 +00:00
|
|
|
|
array_push( $has_opened_tr, true );
|
2006-07-11 17:40:11 +00:00
|
|
|
|
|
2011-09-15 12:10:53 +00:00
|
|
|
|
if ( array_pop( $tr_history ) ) {
|
|
|
|
|
|
$line = '</tr>';
|
2011-04-12 21:27:24 +00:00
|
|
|
|
}
|
2006-12-04 20:08:53 +00:00
|
|
|
|
|
2011-09-15 12:10:53 +00:00
|
|
|
|
if ( array_pop( $td_history ) ) {
|
|
|
|
|
|
$line = "</{$last_tag}>{$line}";
|
2011-04-12 21:27:24 +00:00
|
|
|
|
}
|
2011-04-13 11:24:38 +00:00
|
|
|
|
|
2011-09-15 12:10:53 +00:00
|
|
|
|
$outLine = $line;
|
2013-01-26 21:11:09 +00:00
|
|
|
|
array_push( $tr_history, false );
|
|
|
|
|
|
array_push( $td_history, false );
|
|
|
|
|
|
array_push( $last_tag_history, '' );
|
2014-05-10 23:03:45 +00:00
|
|
|
|
} elseif ( $first_character === '|'
|
|
|
|
|
|
|| $first_character === '!'
|
2015-07-03 18:53:06 +00:00
|
|
|
|
|| $first_two === '|+'
|
2014-05-10 23:03:45 +00:00
|
|
|
|
) {
|
2011-09-15 12:10:53 +00:00
|
|
|
|
# This might be cell elements, td, th or captions
|
2015-07-03 18:53:06 +00:00
|
|
|
|
if ( $first_two === '|+' ) {
|
2011-09-15 12:10:53 +00:00
|
|
|
|
$first_character = '+';
|
2015-07-03 18:53:06 +00:00
|
|
|
|
$line = substr( $line, 2 );
|
|
|
|
|
|
} else {
|
2013-01-26 21:11:09 +00:00
|
|
|
|
$line = substr( $line, 1 );
|
2011-09-15 12:10:53 +00:00
|
|
|
|
}
|
2011-04-13 11:24:38 +00:00
|
|
|
|
|
2016-01-28 04:21:35 +00:00
|
|
|
|
// In heading lines, convert '!!' to '||' so that both are valid cell separators.
|
2011-09-15 12:10:53 +00:00
|
|
|
|
if ( $first_character === '!' ) {
|
2016-02-05 16:00:56 +00:00
|
|
|
|
$line = StringUtils::replaceMarkup( '!!', '||', $line );
|
2011-04-14 10:02:51 +00:00
|
|
|
|
}
|
2011-04-13 11:24:38 +00:00
|
|
|
|
|
2011-09-15 12:10:53 +00:00
|
|
|
|
# Split up multiple cells on the same line.
|
|
|
|
|
|
# FIXME : This can result in improper nesting of tags processed
|
2016-01-28 04:21:35 +00:00
|
|
|
|
# by earlier parser steps.
|
|
|
|
|
|
$cells = explode( '||', $line );
|
2006-12-04 20:08:53 +00:00
|
|
|
|
|
2011-09-15 12:10:53 +00:00
|
|
|
|
$outLine = '';
|
2006-12-04 20:08:53 +00:00
|
|
|
|
|
2011-09-15 12:10:53 +00:00
|
|
|
|
# Loop through each table cell
|
|
|
|
|
|
foreach ( $cells as $cell ) {
|
|
|
|
|
|
$previous = '';
|
|
|
|
|
|
if ( $first_character !== '+' ) {
|
|
|
|
|
|
$tr_after = array_pop( $tr_attributes );
|
|
|
|
|
|
if ( !array_pop( $tr_history ) ) {
|
|
|
|
|
|
$previous = "<tr{$tr_after}>\n";
|
|
|
|
|
|
}
|
2013-01-26 21:11:09 +00:00
|
|
|
|
array_push( $tr_history, true );
|
|
|
|
|
|
array_push( $tr_attributes, '' );
|
2011-09-15 12:10:53 +00:00
|
|
|
|
array_pop( $has_opened_tr );
|
2013-01-26 21:11:09 +00:00
|
|
|
|
array_push( $has_opened_tr, true );
|
2011-09-15 12:10:53 +00:00
|
|
|
|
}
|
2011-01-26 01:16:18 +00:00
|
|
|
|
|
2011-09-15 12:10:53 +00:00
|
|
|
|
$last_tag = array_pop( $last_tag_history );
|
2011-04-12 21:27:24 +00:00
|
|
|
|
|
2011-09-15 12:10:53 +00:00
|
|
|
|
if ( array_pop( $td_history ) ) {
|
|
|
|
|
|
$previous = "</{$last_tag}>\n{$previous}";
|
|
|
|
|
|
}
|
2004-02-29 08:43:29 +00:00
|
|
|
|
|
2011-09-15 12:10:53 +00:00
|
|
|
|
if ( $first_character === '|' ) {
|
|
|
|
|
|
$last_tag = 'td';
|
|
|
|
|
|
} elseif ( $first_character === '!' ) {
|
|
|
|
|
|
$last_tag = 'th';
|
|
|
|
|
|
} elseif ( $first_character === '+' ) {
|
|
|
|
|
|
$last_tag = 'caption';
|
|
|
|
|
|
} else {
|
|
|
|
|
|
$last_tag = '';
|
|
|
|
|
|
}
|
2011-04-12 21:27:24 +00:00
|
|
|
|
|
2013-01-26 21:11:09 +00:00
|
|
|
|
array_push( $last_tag_history, $last_tag );
|
2011-09-15 12:10:53 +00:00
|
|
|
|
|
|
|
|
|
|
# A cell could contain both parameters and data
|
2013-01-26 21:11:09 +00:00
|
|
|
|
$cell_data = explode( '|', $cell, 2 );
|
2011-09-15 12:10:53 +00:00
|
|
|
|
|
2016-12-11 22:45:07 +00:00
|
|
|
|
# T2553: Note that a '|' inside an invalid link should not
|
2011-09-15 12:10:53 +00:00
|
|
|
|
# be mistaken as delimiting cell parameters
|
2016-12-13 22:22:25 +00:00
|
|
|
|
# Bug T153140: Neither should language converter markup.
|
|
|
|
|
|
if ( preg_match( '/\[\[|-\{/', $cell_data[0] ) === 1 ) {
|
2018-03-01 23:02:54 +00:00
|
|
|
|
$cell = "{$previous}<{$last_tag}>" . trim( $cell );
|
2011-09-15 12:10:53 +00:00
|
|
|
|
} elseif ( count( $cell_data ) == 1 ) {
|
2018-03-01 23:02:54 +00:00
|
|
|
|
// Whitespace in cells is trimmed
|
|
|
|
|
|
$cell = "{$previous}<{$last_tag}>" . trim( $cell_data[0] );
|
2011-09-15 12:10:53 +00:00
|
|
|
|
} else {
|
|
|
|
|
|
$attributes = $this->mStripState->unstripBoth( $cell_data[0] );
|
2013-01-26 21:11:09 +00:00
|
|
|
|
$attributes = Sanitizer::fixTagAttributes( $attributes, $last_tag );
|
2018-03-01 23:02:54 +00:00
|
|
|
|
// Whitespace in cells is trimmed
|
|
|
|
|
|
$cell = "{$previous}<{$last_tag}{$attributes}>" . trim( $cell_data[1] );
|
2011-09-15 12:10:53 +00:00
|
|
|
|
}
|
2011-04-12 21:27:24 +00:00
|
|
|
|
|
2011-09-15 12:10:53 +00:00
|
|
|
|
$outLine .= $cell;
|
2013-01-26 21:11:09 +00:00
|
|
|
|
array_push( $td_history, true );
|
2011-09-15 12:10:53 +00:00
|
|
|
|
}
|
2011-04-13 19:46:09 +00:00
|
|
|
|
}
|
2011-09-15 12:10:53 +00:00
|
|
|
|
$out .= $outLine . "\n";
|
2011-04-12 21:27:24 +00:00
|
|
|
|
}
|
2011-06-24 20:25:16 +00:00
|
|
|
|
|
2011-09-15 12:10:53 +00:00
|
|
|
|
# Close any open td, tr and table elements
|
|
|
|
|
|
while ( count( $td_history ) > 0 ) {
|
|
|
|
|
|
if ( array_pop( $td_history ) ) {
|
|
|
|
|
|
$out .= "</td>\n";
|
2006-12-04 20:08:53 +00:00
|
|
|
|
}
|
2011-09-15 12:10:53 +00:00
|
|
|
|
if ( array_pop( $tr_history ) ) {
|
|
|
|
|
|
$out .= "</tr>\n";
|
2011-04-12 21:27:24 +00:00
|
|
|
|
}
|
2011-09-15 12:10:53 +00:00
|
|
|
|
# No row was ever opened for this table: emit an empty one
if ( !array_pop( $has_opened_tr ) ) {
|
2013-02-09 21:44:24 +00:00
|
|
|
|
$out .= "<tr><td></td></tr>\n";
|
2011-04-12 21:27:24 +00:00
|
|
|
|
}
|
2011-09-15 12:10:53 +00:00
|
|
|
|
|
|
|
|
|
|
$out .= "</table>\n";
|
2004-02-28 05:55:13 +00:00
|
|
|
|
}
|
2011-09-15 12:10:53 +00:00
|
|
|
|
|
|
|
|
|
|
# Remove trailing line-ending (for backward compatibility)
|
|
|
|
|
|
if ( substr( $out, -1 ) === "\n" ) {
|
|
|
|
|
|
$out = substr( $out, 0, -1 );
|
2011-04-12 21:27:24 +00:00
|
|
|
|
}
|
2004-02-26 13:37:26 +00:00
|
|
|
|
|
2011-09-15 12:10:53 +00:00
|
|
|
|
# Special case: don't return an empty table
|
|
|
|
|
|
if ( $out === "<table>\n<tr><td></td></tr>\n</table>" ) {
|
|
|
|
|
|
$out = '';
|
|
|
|
|
|
}
|
2006-12-04 20:08:53 +00:00
|
|
|
|
|
2011-09-15 12:10:53 +00:00
|
|
|
|
return $out;
|
2004-02-28 05:55:13 +00:00
|
|
|
|
}
|
2004-02-26 13:37:26 +00:00
|
|
|
|
|
2004-09-21 05:49:12 +00:00
|
|
|
|
/**
|
2014-11-02 18:14:53 +00:00
|
|
|
|
* Helper function for parse() that transforms wiki markup into half-parsed
|
2008-01-22 10:47:44 +00:00
|
|
|
|
* HTML. Only called for $mOutputType == self::OT_HTML.
|
2004-09-21 05:49:12 +00:00
|
|
|
|
*
|
2020-06-26 12:14:23 +00:00
|
|
|
|
* @internal
|
2011-08-05 00:33:03 +00:00
|
|
|
|
*
|
2016-04-14 20:01:08 +00:00
|
|
|
|
* @param string $text The text to parse
|
2018-08-31 04:46:10 +00:00
|
|
|
|
* @param-taint $text escapes_html
|
2016-04-14 20:01:08 +00:00
|
|
|
|
* @param bool $isMain Whether this is being called from the main parse() function
|
2022-07-31 00:02:18 +00:00
|
|
|
|
* @param PPFrame|false $frame A pre-processor frame
|
2011-08-05 00:33:03 +00:00
|
|
|
|
*
|
|
|
|
|
|
* @return string
|
2004-09-21 05:49:12 +00:00
|
|
|
|
*/
|
2014-08-11 20:24:54 +00:00
|
|
|
|
public function internalParse( $text, $isMain = true, $frame = false ) {
|
2009-06-20 18:25:30 +00:00
|
|
|
|
$origText = $text;
|
2004-04-11 16:46:06 +00:00
|
|
|
|
|
2006-08-06 14:01:47 +00:00
|
|
|
|
# Hook to suspend the parser in this state
|
Hooks::run() call site migration
Migrate all callers of Hooks::run() to use the new
HookContainer/HookRunner system.
General principles:
* Use DI if it is already used. We're not changing the way state is
managed in this patch.
* HookContainer is always injected, not HookRunner. HookContainer
is a service, it's a more generic interface, it is the only
thing that provides isRegistered() which is needed in some cases,
and a HookRunner can be efficiently constructed from it
(confirmed by benchmark). Because HookContainer is needed
for object construction, it is also needed by all factories.
* "Ask your friendly local base class". Big hierarchies like
SpecialPage and ApiBase have getHookContainer() and getHookRunner()
methods in the base class, and classes that extend that base class
are not expected to know or care where the base class gets its
HookContainer from.
* ProtectedHookAccessorTrait provides protected getHookContainer() and
getHookRunner() methods, getting them from the global service
container. The point of this is to ease migration to DI by ensuring
that call sites ask their local friendly base class rather than
getting a HookRunner from the service container directly.
* Private $this->hookRunner. In some smaller classes where accessor
methods did not seem warranted, there is a private HookRunner property
which is accessed directly. Very rarely (two cases), there is a
protected property, for consistency with code that conventionally
assumes protected=private, but in cases where the class might actually
be overridden, a protected accessor is preferred over a protected
property.
* The last resort: Hooks::runner(). Mostly for static, file-scope and
global code. In a few cases it was used for objects with broken
construction schemes, out of horror or laziness.
Constructors with new required arguments:
* AuthManager
* BadFileLookup
* BlockManager
* ClassicInterwikiLookup
* ContentHandlerFactory
* ContentSecurityPolicy
* DefaultOptionsManager
* DerivedPageDataUpdater
* FullSearchResultWidget
* HtmlCacheUpdater
* LanguageFactory
* LanguageNameUtils
* LinkRenderer
* LinkRendererFactory
* LocalisationCache
* MagicWordFactory
* MessageCache
* NamespaceInfo
* PageEditStash
* PageHandlerFactory
* PageUpdater
* ParserFactory
* PermissionManager
* RevisionStore
* RevisionStoreFactory
* SearchEngineConfig
* SearchEngineFactory
* SearchFormWidget
* SearchNearMatcher
* SessionBackend
* SpecialPageFactory
* UserNameUtils
* UserOptionsManager
* WatchedItemQueryService
* WatchedItemStore
Constructors with new optional arguments:
* DefaultPreferencesFactory
* Language
* LinkHolderArray
* MovePage
* Parser
* ParserCache
* PasswordReset
* Router
setHookContainer() now required after construction:
* AuthenticationProvider
* ResourceLoaderModule
* SearchEngine
Change-Id: Id442b0dbe43aba84bd5cf801d86dedc768b082c7
2020-03-19 02:42:09 +00:00
|
|
|
|
if ( !$this->hookRunner->onParserBeforeInternalParse( $this, $text, $this->mStripState ) ) {
|
2013-02-09 21:44:24 +00:00
|
|
|
|
return $text;
|
2006-08-06 14:01:47 +00:00
|
|
|
|
}
|
|
|
|
|
|
|
2010-03-30 21:53:56 +00:00
|
|
|
|
# if $frame is provided, then use $frame for replacing any variables
|
2010-03-30 21:20:05 +00:00
|
|
|
|
if ( $frame ) {
|
2010-03-30 21:53:56 +00:00
|
|
|
|
# use frame depth to infer how include/noinclude tags should be handled
|
|
|
|
|
|
# depth=0 means this is the top-level document; otherwise it's an included document
|
2010-03-30 21:20:05 +00:00
|
|
|
|
if ( !$frame->depth ) {
|
2009-08-30 06:37:10 +00:00
|
|
|
|
$flag = 0;
|
2010-03-30 21:20:05 +00:00
|
|
|
|
} else {
|
2020-04-07 23:52:41 +00:00
|
|
|
|
$flag = Preprocessor::DOM_FOR_INCLUSION;
|
2010-04-10 16:44:10 +00:00
|
|
|
|
}
|
2009-08-30 06:37:10 +00:00
|
|
|
|
$dom = $this->preprocessToDom( $text, $flag );
|
|
|
|
|
|
$text = $frame->expand( $dom );
|
2010-03-30 21:20:05 +00:00
|
|
|
|
} else {
|
2010-03-30 21:53:56 +00:00
|
|
|
|
# if $frame is not provided, then use old-style replaceVariables
|
2009-08-30 06:37:10 +00:00
|
|
|
|
$text = $this->replaceVariables( $text );
|
|
|
|
|
|
}
|
|
|
|
|
|
|
2022-03-04 19:05:41 +00:00
|
|
|
|
$text = Sanitizer::internalRemoveHtmlTags(
|
2014-05-10 23:03:45 +00:00
|
|
|
|
$text,
|
2020-01-25 15:45:18 +00:00
|
|
|
|
// Callback from the Sanitizer for expanding items found in
|
|
|
|
|
|
// HTML attribute values, so they can be safely tested and escaped.
|
|
|
|
|
|
function ( &$text, $frame = false ) {
|
|
|
|
|
|
$text = $this->replaceVariables( $text, $frame );
|
|
|
|
|
|
$text = $this->mStripState->unstripBoth( $text );
|
|
|
|
|
|
},
|
2014-05-10 23:03:45 +00:00
|
|
|
|
false,
|
2020-01-25 15:47:28 +00:00
|
|
|
|
[],
|
2020-04-02 15:17:41 +00:00
|
|
|
|
[]
|
2014-05-10 23:03:45 +00:00
|
|
|
|
);
|
2020-03-19 02:42:09 +00:00
|
|
|
|
$this->hookRunner->onInternalParseBeforeLinks( $this, $text, $this->mStripState );
|
2004-02-26 13:37:26 +00:00
|
|
|
|
|
2010-03-30 21:53:56 +00:00
|
|
|
|
# Tables need to come after variable replacement for things to work
|
|
|
|
|
|
# properly; putting them before other transformations should keep
|
|
|
|
|
|
# exciting things like link expansions from showing up in surprising
|
|
|
|
|
|
# places.
|
Deprecate Parser implementation methods (will be private in next release)
The following public methods were renamed and made private; the old name
is hard-deprecated and calls the new renamed private method:
Parser::doMagicLinks() => handleMagicLinks()
Parser::doDoubleUnderscore() => handleMagicLinks()
Parser::doHeadings() => handleHeadings()
Parser::doAllQuotes() => handleAllQuotes()
Parser::replaceExternalLinks() => handleExternalLinks()
Parser::replaceInternalLinks() => handleInternalLinks()
Parser::replaceInternalLinks2() => handleInternalLinks2()
Parser::getVariableValue() => expandMagicVariable()
Parser::initialiseVariables() => initializeVariables()
Parser::formatHeadings() => finalizeHeadings()
Parser::test{Pst,Preprocess,Srvus}() => fuzzTest{Pst,Preprocess,Srvus}()
Additionally, the following methods are not used externally, but are
used outside the Parser class by core code. They have been marked
@internal:
Parser::doQuotes() (used by {{#displaytitle}}),
Parser::getExternalLink{Rel,Attribs}() (used by Linker),
Parser::normalizeLinkUrl() (used by Special:LinkSearch and elsewhere).
Parser::{brace,arg,extension}Substitution() (used by PPFrame)
Code search query:
https://codesearch.wmflabs.org/deployed/?q=do%28MagicLinks%7CDoubleUnderscore%7CHeadings%7CAllQuotes%29%7Creplace%28ExternalLinks%7CInternalLinks%28%7C2%29%29%7CgetVariableValue%7CinitialiseVariables%7CformatHeadings%7Ctest%28Pst%7CPreprocess%7CSrvus%29%7CdoQuotes%7CgetExternalLink%28Rel%7CAttribs%29%7CnormalizeLinkUrl%7C%28brace%2Carg%2Cextension%29Substitution&i=nope&files=&repos=
Bug: T236810
Change-Id: I19a43ffc5dcfdd2981b51079c33422c964acb076
2019-10-28 19:52:50 +00:00
|
|
|
|
$text = $this->handleTables( $text );
|
2006-05-23 07:19:01 +00:00
|
|
|
|
|
2004-06-08 18:11:28 +00:00
|
|
|
|
# Four or more dashes at the start of a line become a horizontal rule
$text = preg_replace( '/(^|\n)-----*/', '\\1<hr />', $text );
|
2004-02-26 13:37:26 +00:00
|
|
|
|
|
2019-10-28 19:52:50 +00:00
|
|
|
|
$text = $this->handleDoubleUnderscore( $text );
|
2010-01-07 04:13:14 +00:00
|
|
|
|
|
2019-10-28 19:52:50 +00:00
|
|
|
|
$text = $this->handleHeadings( $text );
|
|
|
|
|
|
$text = $this->handleInternalLinks( $text );
|
|
|
|
|
|
$text = $this->handleAllQuotes( $text );
|
|
|
|
|
|
$text = $this->handleExternalLinks( $text );
|
2005-07-03 07:15:53 +00:00
|
|
|
|
|
2019-10-28 19:52:50 +00:00
|
|
|
|
# handleInternalLinks may sometimes leave behind
|
|
|
|
|
|
# absolute URLs, which have to be masked to hide them from handleExternalLinks
|
Use a fixed marker prefix string in the Parser and MWTidy
Generating one-time, unique strip markers hurts us in multiple ways:
* The strip marker regexes don't benefit from JIT compilation, so they are
slower to execute than they could be.
* Although the regexes don't benefit from JIT compilation, they are still
compiled, because HHVM bets on regexes getting reused. This extra work is
fairly costly (1-2% of CPU usage on the app servers) and doesn't pay off.
* The size of the PCRE JIT cache is finite, and the caching of one-off regexes
displaces from the cache regexes which are in fact reused.
Tim's preferred solution (per his review comment on
https://gerrit.wikimedia.org/r/167530/) is to use fixed strip markers.
So:
* Replace usage of $parser->mUniqPrefix with Parser::MARKER_PREFIX, which
complements the existing Parser::MARKER_SUFFIX.
* Deprecate Parser::mUniqPrefix and its accessor, Parser::uniqPrefix().
* Deprecate Parser::getRandomString(), since it is no longer useful.
* In Preprocessor_*:preprocessToObj() and Parser::fetchTemplateAndTitle,
replace any occurrences of \x7f with '?', to prevent strip marker forgery.
\x7f is not valid input anyway.
* Deprecate the $prefix parameter for StripState::__construct, since a custom
prefix may no longer be specified.
Change-Id: I31d4556bbb07acb72c33fda335fa5a230379a03f
2015-05-26 20:48:33 +00:00
|
|
|
|
$text = str_replace( self::MARKER_PREFIX . 'NOPARSE', '', $text );
|
2005-07-03 07:15:53 +00:00
|
|
|
|
|
2019-10-28 19:52:50 +00:00
|
|
|
|
$text = $this->handleMagicLinks( $text );
|
|
|
|
|
|
$text = $this->finalizeHeadings( $text, $origText, $isMain );
|
2004-04-12 16:10:17 +00:00
|
|
|
|
|
2004-02-26 13:37:26 +00:00
|
|
|
|
return $text;
|
|
|
|
|
|
}
|
2004-08-07 12:35:59 +00:00
|
|
|
|
|
2020-01-23 18:39:23 +00:00
|
|
|
|
/**
|
|
|
|
|
|
* Shorthand for getting a language converter for the target language
|
|
|
|
|
|
*
|
2021-08-04 01:42:47 +00:00
|
|
|
|
* @since public since 1.38
|
2020-01-23 18:39:23 +00:00
|
|
|
|
* @return ILanguageConverter
|
|
|
|
|
|
*/
|
2021-08-04 01:42:47 +00:00
|
|
|
|
public function getTargetLanguageConverter(): ILanguageConverter {
|
2020-02-04 12:42:03 +00:00
|
|
|
|
return $this->languageConverterFactory->getLanguageConverter(
|
|
|
|
|
|
$this->getTargetLanguage()
|
|
|
|
|
|
);
|
2020-01-23 18:39:23 +00:00
|
|
|
|
}
|
|
|
|
|
|
|
|
|
|
|
|
/**
|
|
|
|
|
|
* Shorthand for getting a language converter for the content language
|
|
|
|
|
|
*
|
|
|
|
|
|
* @return ILanguageConverter
|
|
|
|
|
|
*/
|
2021-07-22 03:11:47 +00:00
|
|
|
|
private function getContentLanguageConverter(): ILanguageConverter {
|
2020-02-04 12:42:03 +00:00
|
|
|
|
return $this->languageConverterFactory->getLanguageConverter(
|
|
|
|
|
|
$this->getContentLanguage()
|
|
|
|
|
|
);
|
2020-01-23 18:39:23 +00:00
|
|
|
|
}
|
|
|
|
|
|
|
2020-03-19 02:42:09 +00:00
|
|
|
|
/**
|
|
|
|
|
|
* Get a HookContainer capable of returning metadata about hooks or running
|
|
|
|
|
|
* extension hooks.
|
|
|
|
|
|
*
|
|
|
|
|
|
* @since 1.35
|
|
|
|
|
|
* @return HookContainer
|
|
|
|
|
|
*/
|
|
|
|
|
|
protected function getHookContainer() {
|
|
|
|
|
|
return $this->hookContainer;
|
|
|
|
|
|
}
|
|
|
|
|
|
|
|
|
|
|
|
/**
|
|
|
|
|
|
* Get a HookRunner for calling core hooks
|
|
|
|
|
|
*
|
|
|
|
|
|
* @internal This is for use by core only. Hook interfaces may be removed
|
|
|
|
|
|
* without notice.
|
|
|
|
|
|
* @since 1.35
|
|
|
|
|
|
* @return HookRunner
|
|
|
|
|
|
*/
|
|
|
|
|
|
protected function getHookRunner() {
|
|
|
|
|
|
return $this->hookRunner;
|
|
|
|
|
|
}
|
|
|
|
|
|
|
2014-11-02 18:14:53 +00:00
|
|
|
|
/**
|
|
|
|
|
|
* Helper function for parse() that transforms half-parsed HTML into fully
|
|
|
|
|
|
* parsed HTML.
|
|
|
|
|
|
*
|
|
|
|
|
|
* @param string $text
|
|
|
|
|
|
* @param bool $isMain
|
|
|
|
|
|
* @param bool $linestart
|
|
|
|
|
|
* @return string
|
|
|
|
|
|
*/
|
|
|
|
|
|
private function internalParseHalfParsed( $text, $isMain = true, $linestart = true ) {
|
|
|
|
|
|
$text = $this->mStripState->unstripGeneral( $text );
|
|
|
|
|
|
|
2020-01-25 15:45:36 +00:00
|
|
|
|
$text = BlockLevelPass::doBlockLevels( $text, $linestart );
|
2014-11-02 18:14:53 +00:00
|
|
|
|
|
2019-10-29 16:52:47 +00:00
|
|
|
|
$this->replaceLinkHoldersPrivate( $text );
|
2014-11-02 18:14:53 +00:00
|
|
|
|
|
|
|
|
|
|
/**
|
|
|
|
|
|
* The input doesn't get language converted if
|
|
|
|
|
|
* a) It's disabled
|
|
|
|
|
|
* b) Content isn't converted
|
|
|
|
|
|
* c) It's a conversion table
|
|
|
|
|
|
* d) It is an interface message (which is in the user language)
|
|
|
|
|
|
*/
|
|
|
|
|
|
if ( !( $this->mOptions->getDisableContentConversion()
|
|
|
|
|
|
|| isset( $this->mDoubleUnderscores['nocontentconvert'] ) )
|
2019-03-29 20:12:24 +00:00
|
|
|
|
&& !$this->mOptions->getInterfaceMessage()
|
2014-11-02 18:14:53 +00:00
|
|
|
|
) {
|
2019-03-29 20:12:24 +00:00
|
|
|
|
# The position of the convert() call should not be changed. It
|
|
|
|
|
|
# assumes that the links are all replaced and the only thing left
|
|
|
|
|
|
# is the <nowiki> mark.
|
2020-01-23 18:39:23 +00:00
|
|
|
|
$text = $this->getTargetLanguageConverter()->convert( $text );
|
2022-03-08 20:53:14 +00:00
|
|
|
|
// Record information necessary for language conversion of TOC.
|
|
|
|
|
|
$this->mOutput->setExtensionData(
|
|
|
|
|
|
// T303329: this should migrate out of extension data
|
|
|
|
|
|
'core:target-lang',
|
|
|
|
|
|
$this->getTargetLanguage()->getCode()
|
|
|
|
|
|
);
|
|
|
|
|
|
$this->mOutput->setExtensionData(
|
|
|
|
|
|
// T303329: this should migrate out of extension data
|
|
|
|
|
|
'core:target-lang-variant',
|
|
|
|
|
|
$this->getTargetLanguageConverter()->getPreferredVariant()
|
|
|
|
|
|
);
|
|
|
|
|
|
} else {
|
|
|
|
|
|
$this->mOutput->setOutputFlag( ParserOutputFlags::NO_TOC_CONVERSION );
|
2014-11-02 18:14:53 +00:00
|
|
|
|
}
|
|
|
|
|
|
|
|
|
|
|
|
$text = $this->mStripState->unstripNoWiki( $text );
|
|
|
|
|
|
|
|
|
|
|
|
$text = $this->mStripState->unstripGeneral( $text );
|
|
|
|
|
|
|
2021-02-18 16:51:12 +00:00
|
|
|
|
$text = $this->tidy->tidy( $text, [ Sanitizer::class, 'armorFrenchSpaces' ] );
|
2014-11-02 18:14:53 +00:00
|
|
|
|
|
|
|
|
|
|
if ( $isMain ) {
|
Hooks::run() call site migration
Migrate all callers of Hooks::run() to use the new
HookContainer/HookRunner system.
General principles:
* Use DI if it is already used. We're not changing the way state is
managed in this patch.
* HookContainer is always injected, not HookRunner. HookContainer
is a service, it's a more generic interface, it is the only
thing that provides isRegistered() which is needed in some cases,
and a HookRunner can be efficiently constructed from it
(confirmed by benchmark). Because HookContainer is needed
for object construction, it is also needed by all factories.
* "Ask your friendly local base class". Big hierarchies like
SpecialPage and ApiBase have getHookContainer() and getHookRunner()
methods in the base class, and classes that extend that base class
are not expected to know or care where the base class gets its
HookContainer from.
* ProtectedHookAccessorTrait provides protected getHookContainer() and
getHookRunner() methods, getting them from the global service
container. The point of this is to ease migration to DI by ensuring
that call sites ask their local friendly base class rather than
getting a HookRunner from the service container directly.
* Private $this->hookRunner. In some smaller classes where accessor
methods did not seem warranted, there is a private HookRunner property
which is accessed directly. Very rarely (two cases), there is a
protected property, for consistency with code that conventionally
assumes protected=private, but in cases where the class might actually
be overridden, a protected accessor is preferred over a protected
property.
* The last resort: Hooks::runner(). Mostly for static, file-scope and
global code. In a few cases it was used for objects with broken
construction schemes, out of horror or laziness.
Constructors with new required arguments:
* AuthManager
* BadFileLookup
* BlockManager
* ClassicInterwikiLookup
* ContentHandlerFactory
* ContentSecurityPolicy
* DefaultOptionsManager
* DerivedPageDataUpdater
* FullSearchResultWidget
* HtmlCacheUpdater
* LanguageFactory
* LanguageNameUtils
* LinkRenderer
* LinkRendererFactory
* LocalisationCache
* MagicWordFactory
* MessageCache
* NamespaceInfo
* PageEditStash
* PageHandlerFactory
* PageUpdater
* ParserFactory
* PermissionManager
* RevisionStore
* RevisionStoreFactory
* SearchEngineConfig
* SearchEngineFactory
* SearchFormWidget
* SearchNearMatcher
* SessionBackend
* SpecialPageFactory
* UserNameUtils
* UserOptionsManager
* WatchedItemQueryService
* WatchedItemStore
Constructors with new optional arguments:
* DefaultPreferencesFactory
* Language
* LinkHolderArray
* MovePage
* Parser
* ParserCache
* PasswordReset
* Router
setHookContainer() now required after construction:
* AuthenticationProvider
* ResourceLoaderModule
* SearchEngine
Change-Id: Id442b0dbe43aba84bd5cf801d86dedc768b082c7
2020-03-19 02:42:09 +00:00
|
|
|
|
$this->hookRunner->onParserAfterTidy( $this, $text );
|
2014-11-02 18:14:53 +00:00
|
|
|
|
}
|
|
|
|
|
|
|
|
|
|
|
|
return $text;
|
|
|
|
|
|
}
|
|
|
|
|
|
|
Deprecate Parser implementation methods (will be private in next release)
The following public methods were renamed and made private; the old name
is hard-deprecated and calls the new renamed private method:
Parser::doMagicLinks() => handleMagicLinks()
Parser::doDoubleUnderscore() => handleDoubleUnderscore()
Parser::doHeadings() => handleHeadings()
Parser::doAllQuotes() => handleAllQuotes()
Parser::replaceExternalLinks() => handleExternalLinks()
Parser::replaceInternalLinks() => handleInternalLinks()
Parser::replaceInternalLinks2() => handleInternalLinks2()
Parser::getVariableValue() => expandMagicVariable()
Parser::initialiseVariables() => initializeVariables()
Parser::formatHeadings() => finalizeHeadings()
Parser::test{Pst,Preprocess,Srvus}() => fuzzTest{Pst,Preprocess,Srvus}()
Additionally, the following methods are not used externally, but are
used outside the Parser class by core code. They have been marked
@internal:
Parser::doQuotes() (used by {{#displaytitle}}),
Parser::getExternalLink{Rel,Attribs}() (used by Linker),
Parser::normalizeLinkUrl() (used by Special:LinkSearch and elsewhere).
Parser::{brace,arg,extension}Substitution() (used by PPFrame)
Code search query:
https://codesearch.wmflabs.org/deployed/?q=do%28MagicLinks%7CDoubleUnderscore%7CHeadings%7CAllQuotes%29%7Creplace%28ExternalLinks%7CInternalLinks%28%7C2%29%29%7CgetVariableValue%7CinitialiseVariables%7CformatHeadings%7Ctest%28Pst%7CPreprocess%7CSrvus%29%7CdoQuotes%7CgetExternalLink%28Rel%7CAttribs%29%7CnormalizeLinkUrl%7C%28brace%2Carg%2Cextension%29Substitution&i=nope&files=&repos=
Bug: T236810
Change-Id: I19a43ffc5dcfdd2981b51079c33422c964acb076
2019-10-28 19:52:50 +00:00
|
|
|
|
/**
|
|
|
|
|
|
* Replace special strings like "ISBN xxx" and "RFC xxx" with
|
|
|
|
|
|
* magic external links.
|
|
|
|
|
|
*
|
|
|
|
|
|
* DML
|
2011-08-05 00:33:03 +00:00
|
|
|
|
*
|
2014-04-21 23:38:39 +00:00
|
|
|
|
* @param string $text
|
2011-08-05 00:33:03 +00:00
|
|
|
|
*
|
|
|
|
|
|
* @return string
|
2004-09-21 05:49:12 +00:00
|
|
|
|
*/
|
2019-10-28 19:52:50 +00:00
|
|
|
|
private function handleMagicLinks( $text ) {
|
2022-04-28 13:33:39 +00:00
|
|
|
|
$prots = $this->urlUtils->validAbsoluteProtocols();
|
2008-08-26 14:37:15 +00:00
|
|
|
|
$urlChar = self::EXT_LINK_URL_CLASS;
|
2015-01-08 22:00:54 +00:00
|
|
|
|
$addr = self::EXT_LINK_ADDR;
|
2014-05-16 00:35:59 +00:00
|
|
|
|
$space = self::SPACE_NOT_NL; # non-newline space
|
|
|
|
|
|
$spdash = "(?:-|$space)"; # a dash or a non-newline space
|
|
|
|
|
|
$spaces = "$space++"; # possessive match of 1 or more spaces
|
2006-10-17 08:49:27 +00:00
|
|
|
|
$text = preg_replace_callback(
|
2017-02-25 21:53:36 +00:00
|
|
|
|
'!(?: # Start cases
|
|
|
|
|
|
(<a[ \t\r\n>].*?</a>) | # m[1]: Skip link text
|
|
|
|
|
|
(<.*?>) | # m[2]: Skip stuff inside HTML elements' . "
|
|
|
|
|
|
(\b # m[3]: Free external links
|
|
|
|
|
|
(?i:$prots)
|
|
|
|
|
|
($addr$urlChar*) # m[4]: Post-protocol path
|
|
|
|
|
|
) |
|
|
|
|
|
|
\b(?:RFC|PMID) $spaces # m[5]: RFC or PMID, capture number
|
2014-05-16 00:35:59 +00:00
|
|
|
|
([0-9]+)\b |
|
2017-02-25 21:53:36 +00:00
|
|
|
|
\bISBN $spaces ( # m[6]: ISBN, capture number
|
2015-07-13 15:31:52 +00:00
|
|
|
|
(?: 97[89] $spdash? )? # optional 13-digit ISBN prefix
|
|
|
|
|
|
(?: [0-9] $spdash? ){9} # 9 digits with opt. delimiters
|
|
|
|
|
|
[0-9Xx] # check digit
|
2014-05-16 00:35:59 +00:00
|
|
|
|
)\b
|
2021-02-25 20:19:54 +00:00
|
|
|
|
)!xu",
|
|
|
|
|
|
[ $this, 'magicLinkCallback' ],
|
|
|
|
|
|
$text
|
|
|
|
|
|
);
|
2006-10-17 08:49:27 +00:00
|
|
|
|
return $text;
|
2006-08-06 14:01:47 +00:00
|
|
|
|
}
|
|
|
|
|
|
|
2011-05-28 17:18:50 +00:00
|
|
|
|
/**
|
|
|
|
|
|
* @throws MWException
|
2014-04-21 23:38:39 +00:00
|
|
|
|
* @param array $m
|
2018-04-06 11:07:01 +00:00
|
|
|
|
* @return string HTML
|
2011-05-28 17:18:50 +00:00
|
|
|
|
*/
|
2019-08-27 09:23:52 +00:00
|
|
|
|
private function magicLinkCallback( array $m ) {
|
2009-02-27 17:56:00 +00:00
|
|
|
|
if ( isset( $m[1] ) && $m[1] !== '' ) {
|
2008-08-26 14:37:15 +00:00
|
|
|
|
# Skip anchor
|
|
|
|
|
|
return $m[0];
|
2009-02-27 17:56:00 +00:00
|
|
|
|
} elseif ( isset( $m[2] ) && $m[2] !== '' ) {
|
2006-08-06 14:01:47 +00:00
|
|
|
|
# Skip HTML element
|
|
|
|
|
|
return $m[0];
|
2009-02-27 17:56:00 +00:00
|
|
|
|
} elseif ( isset( $m[3] ) && $m[3] !== '' ) {
|
2008-08-26 14:37:15 +00:00
|
|
|
|
# Free external link
|
2015-07-13 15:31:52 +00:00
|
|
|
|
return $this->makeFreeExternalLink( $m[0], strlen( $m[4] ) );
|
|
|
|
|
|
} elseif ( isset( $m[5] ) && $m[5] !== '' ) {
|
2008-08-26 14:37:15 +00:00
|
|
|
|
# RFC or PMID
|
|
|
|
|
|
if ( substr( $m[0], 0, 3 ) === 'RFC' ) {
|
2016-09-09 07:28:49 +00:00
|
|
|
|
if ( !$this->mOptions->getMagicRFCLinks() ) {
|
|
|
|
|
|
return $m[0];
|
|
|
|
|
|
}
|
2006-08-06 14:01:47 +00:00
|
|
|
|
$keyword = 'RFC';
|
|
|
|
|
|
$urlmsg = 'rfcurl';
|
2014-03-15 11:32:44 +00:00
|
|
|
|
$cssClass = 'mw-magiclink-rfc';
|
2016-11-03 18:06:20 +00:00
|
|
|
|
$trackingCat = 'magiclink-tracking-rfc';
|
2015-07-13 15:31:52 +00:00
|
|
|
|
$id = $m[5];
|
2008-08-26 14:37:15 +00:00
|
|
|
|
} elseif ( substr( $m[0], 0, 4 ) === 'PMID' ) {
|
2016-09-09 07:28:49 +00:00
|
|
|
|
if ( !$this->mOptions->getMagicPMIDLinks() ) {
|
|
|
|
|
|
return $m[0];
|
|
|
|
|
|
}
|
2006-08-06 14:01:47 +00:00
|
|
|
|
$keyword = 'PMID';
|
|
|
|
|
|
$urlmsg = 'pubmedurl';
|
2014-03-15 11:32:44 +00:00
|
|
|
|
$cssClass = 'mw-magiclink-pmid';
|
2016-11-03 18:06:20 +00:00
|
|
|
|
$trackingCat = 'magiclink-tracking-pmid';
|
2015-07-13 15:31:52 +00:00
|
|
|
|
$id = $m[5];
|
2006-08-06 14:01:47 +00:00
|
|
|
|
} else {
|
2021-02-25 20:19:54 +00:00
|
|
|
|
// Should never happen
|
2013-01-26 21:11:09 +00:00
|
|
|
|
throw new MWException( __METHOD__ . ': unrecognised match type "' .
|
2010-03-30 21:20:05 +00:00
|
|
|
|
substr( $m[0], 0, 20 ) . '"' );
|
2006-08-06 14:01:47 +00:00
|
|
|
|
}
|
2012-08-29 08:07:10 +00:00
|
|
|
|
$url = wfMessage( $urlmsg, $id )->inContentLanguage()->text();
|
2016-11-03 18:06:20 +00:00
|
|
|
|
$this->addTrackingCategory( $trackingCat );
|
2019-10-18 19:50:58 +00:00
|
|
|
|
return Linker::makeExternalLink(
|
|
|
|
|
|
$url,
|
|
|
|
|
|
"{$keyword} {$id}",
|
|
|
|
|
|
true,
|
|
|
|
|
|
$cssClass,
|
|
|
|
|
|
[],
|
|
|
|
|
|
$this->getTitle()
|
|
|
|
|
|
);
|
2016-09-09 07:28:49 +00:00
|
|
|
|
} elseif ( isset( $m[6] ) && $m[6] !== ''
|
|
|
|
|
|
&& $this->mOptions->getMagicISBNLinks()
|
|
|
|
|
|
) {
|
2008-08-26 14:37:15 +00:00
|
|
|
|
# ISBN
|
2015-07-13 15:31:52 +00:00
|
|
|
|
$isbn = $m[6];
|
2014-05-16 00:35:59 +00:00
|
|
|
|
$space = self::SPACE_NOT_NL; # non-newline space
|
|
|
|
|
|
$isbn = preg_replace( "/$space/", ' ', $isbn );
|
2016-02-17 09:09:32 +00:00
|
|
|
|
$num = strtr( $isbn, [
|
2008-08-26 14:37:15 +00:00
|
|
|
|
'-' => '',
|
|
|
|
|
|
' ' => '',
|
|
|
|
|
|
'x' => 'X',
|
2016-02-17 09:09:32 +00:00
|
|
|
|
] );
|
2016-11-03 18:06:20 +00:00
|
|
|
|
$this->addTrackingCategory( 'magiclink-tracking-isbn' );
|
2016-05-24 07:37:36 +00:00
|
|
|
|
return $this->getLinkRenderer()->makeKnownLink(
|
|
|
|
|
|
SpecialPage::getTitleFor( 'Booksources', $num ),
|
|
|
|
|
|
"ISBN $isbn",
|
|
|
|
|
|
[
|
|
|
|
|
|
'class' => 'internal mw-magiclink-isbn',
|
|
|
|
|
|
'title' => false // suppress title attribute
|
|
|
|
|
|
]
|
|
|
|
|
|
);
|
2008-08-26 14:37:15 +00:00
|
|
|
|
} else {
|
|
|
|
|
|
return $m[0];
|
2006-08-06 14:01:47 +00:00
|
|
|
|
}
|
2004-08-04 01:53:29 +00:00
|
|
|
|
}
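// Illustrative sketch, not part of the Parser class: the ISBN branch of
// magicLinkCallback() builds the Special:BookSources target by stripping
// dashes and spaces and upper-casing a trailing "x". The function name
// demoIsbnKey() is hypothetical, and the sketch assumes that non-newline
// spaces have already been folded to plain spaces, as the callback does first.

```php
<?php
// Hypothetical helper, for experimentation only; it does not exist in MediaWiki.
// It mirrors the strtr() call used to turn a displayed ISBN into a lookup key.
function demoIsbnKey( string $isbn ): string {
	return strtr( $isbn, [
		'-' => '', // drop dash delimiters
		' ' => '', // drop space delimiters
		'x' => 'X', // normalise the check digit
	] );
}

echo demoIsbnKey( '0-8044-2957-x' ), "\n"; // prints 080442957X
```

// The displayed link text keeps the delimiters the author typed; only the
// page-title argument is normalised.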
|
2004-07-12 19:49:20 +00:00
|
|
|
|
|
2008-08-26 14:37:15 +00:00
|
|
|
|
/**
|
|
|
|
|
|
* Make a free external link, given a user-supplied URL
|
2011-08-05 00:33:03 +00:00
|
|
|
|
*
|
2014-04-21 23:38:39 +00:00
|
|
|
|
* @param string $url
|
2015-07-13 15:31:52 +00:00
|
|
|
|
* @param int $numPostProto
|
|
|
|
|
|
* The number of characters after the protocol.
|
2011-08-05 00:33:03 +00:00
|
|
|
|
* @return string HTML
|
2020-06-26 12:14:23 +00:00
|
|
|
|
* @internal
|
2008-08-26 14:37:15 +00:00
|
|
|
|
*/
|
2020-01-25 15:45:18 +00:00
|
|
|
|
private function makeFreeExternalLink( $url, $numPostProto ) {
|
2008-08-26 14:37:15 +00:00
|
|
|
|
$trail = '';
|
|
|
|
|
|
|
|
|
|
|
|
# The characters '<' and '>' (which were escaped by
|
2022-03-04 19:05:41 +00:00
|
|
|
|
# internalRemoveHtmlTags()) should not be included in
|
2008-08-26 14:37:15 +00:00
|
|
|
|
# URLs, per RFC 2396.
|
2015-09-23 19:16:24 +00:00
|
|
|
|
# Make &nbsp; terminate a URL as well (bug T84937)
|
2016-02-17 09:09:32 +00:00
|
|
|
|
$m2 = [];
|
2015-09-26 17:17:49 +00:00
|
|
|
|
if ( preg_match(
|
|
|
|
|
|
'/&(lt|gt|nbsp|#x0*(3[CcEe]|[Aa]0)|#0*(60|62|160));/',
|
|
|
|
|
|
$url,
|
|
|
|
|
|
$m2,
|
|
|
|
|
|
PREG_OFFSET_CAPTURE
|
|
|
|
|
|
) ) {
|
2010-03-30 21:20:05 +00:00
|
|
|
|
$trail = substr( $url, $m2[0][1] ) . $trail;
|
|
|
|
|
|
$url = substr( $url, 0, $m2[0][1] );
|
2008-08-26 14:37:15 +00:00
|
|
|
|
}
|
|
|
|
|
|
|
|
|
|
|
|
# Move trailing punctuation to $trail
|
|
|
|
|
|
$sep = ',;\.:!?';
|
|
|
|
|
|
# If there is no left bracket, then consider right brackets fair game too
|
|
|
|
|
|
if ( strpos( $url, '(' ) === false ) {
|
|
|
|
|
|
$sep .= ')';
|
|
|
|
|
|
}
|
|
|
|
|
|
|
2014-12-11 20:15:28 +00:00
|
|
|
|
$urlRev = strrev( $url );
|
|
|
|
|
|
$numSepChars = strspn( $urlRev, $sep );
|
|
|
|
|
|
# Don't break a trailing HTML entity by moving the ; into $trail
|
|
|
|
|
|
# This is in hot code, so use substr_compare to avoid having to
|
|
|
|
|
|
# create a new string object for the comparison
|
2015-06-16 19:06:19 +00:00
|
|
|
|
if ( $numSepChars && substr_compare( $url, ";", -$numSepChars, 1 ) === 0 ) {
|
2014-12-11 20:15:28 +00:00
|
|
|
|
# more optimization: instead of running preg_match with a $
|
|
|
|
|
|
# anchor, which can be slow, do the match on the reversed
|
|
|
|
|
|
# string starting at the desired offset.
|
|
|
|
|
|
# un-reversed regexp is: /&([a-z]+|#x[\da-f]+|#\d+)$/i
|
|
|
|
|
|
if ( preg_match( '/\G([a-z]+|[\da-f]+x#|\d+#)&/i', $urlRev, $m2, 0, $numSepChars ) ) {
|
|
|
|
|
|
$numSepChars--;
|
|
|
|
|
|
}
|
|
|
|
|
|
}
|
2008-08-26 14:37:15 +00:00
|
|
|
|
if ( $numSepChars ) {
|
|
|
|
|
|
$trail = substr( $url, -$numSepChars ) . $trail;
|
|
|
|
|
|
$url = substr( $url, 0, -$numSepChars );
|
|
|
|
|
|
}
|
|
|
|
|
|
|
2015-07-13 15:31:52 +00:00
|
|
|
|
# Verify that we still have a real URL after trail removal, and
|
|
|
|
|
|
# not just a lone protocol
|
|
|
|
|
|
if ( strlen( $trail ) >= $numPostProto ) {
|
|
|
|
|
|
return $url . $trail;
|
|
|
|
|
|
}
|
|
|
|
|
|
|
2015-01-08 22:00:54 +00:00
|
|
|
|
$url = Sanitizer::cleanUrl( $url );
|
|
|
|
|
|
|
2008-08-26 14:37:15 +00:00
|
|
|
|
# Is this an external image?
|
|
|
|
|
|
$text = $this->maybeMakeExternalImage( $url );
|
|
|
|
|
|
if ( $text === false ) {
|
|
|
|
|
|
# Not an image, make a link
|
2021-02-25 20:19:54 +00:00
|
|
|
|
$text = Linker::makeExternalLink(
|
|
|
|
|
|
$url,
|
2020-01-23 18:39:23 +00:00
|
|
|
|
$this->getTargetLanguageConverter()->markNoConversion( $url ),
|
2021-02-25 20:19:54 +00:00
|
|
|
|
true,
|
|
|
|
|
|
'free',
|
|
|
|
|
|
$this->getExternalLinkAttribs( $url ),
|
|
|
|
|
|
$this->getTitle()
|
|
|
|
|
|
);
|
2008-08-26 14:37:15 +00:00
|
|
|
|
# Register it in the output object...
|
2016-03-11 01:08:06 +00:00
|
|
|
|
$this->mOutput->addExternalLink( $url );
|
2008-08-26 14:37:15 +00:00
|
|
|
|
}
|
|
|
|
|
|
return $text . $trail;
|
|
|
|
|
|
}
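// Illustrative sketch, not part of the Parser class: the trailing-punctuation
// step of makeFreeExternalLink() can be tried in isolation. demoSplitTrail()
// is a hypothetical name; it copies only the separator logic and skips the
// entity cut-off, URL cleaning and link building done by the real method.

```php
<?php
// Hypothetical helper, for experimentation only. Returns [ url, trail ].
function demoSplitTrail( string $url ): array {
	// Same mask as the method above. In a single-quoted string the backslash
	// is literal, so a trailing backslash is treated as a separator too.
	$sep = ',;\.:!?';
	// Without a "(" in the URL, a closing bracket is probably prose.
	if ( strpos( $url, '(' ) === false ) {
		$sep .= ')';
	}

	$urlRev = strrev( $url );
	$numSepChars = strspn( $urlRev, $sep );

	// If the first separator is a ";" that ends an HTML entity, keep it.
	if ( $numSepChars && substr_compare( $url, ';', -$numSepChars, 1 ) === 0 ) {
		// Match the reversed entity body right after the separators.
		if ( preg_match( '/\G([a-z]+|[\da-f]+x#|\d+#)&/i', $urlRev, $m, 0, $numSepChars ) ) {
			$numSepChars--;
		}
	}

	if ( $numSepChars ) {
		return [ substr( $url, 0, -$numSepChars ), substr( $url, -$numSepChars ) ];
	}
	return [ $url, '' ];
}

print_r( demoSplitTrail( 'http://example.com/x&amp;.' ) );
```

// For the input above the full stop moves to the trail while the ";" that
// closes "&amp;" stays with the URL.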
|
|
|
|
|
|
|
2019-10-28 19:52:50 +00:00
|
|
|
|
/**
|
|
|
|
|
|
 * Parse headings and return HTML
|
|
|
|
|
|
*
|
|
|
|
|
|
* @param string $text
|
|
|
|
|
|
* @return string
|
|
|
|
|
|
*/
|
|
|
|
|
|
private function handleHeadings( $text ) {
|
2004-02-26 13:37:26 +00:00
|
|
|
|
for ( $i = 6; $i >= 1; --$i ) {
|
2005-11-29 09:55:50 +00:00
|
|
|
|
$h = str_repeat( '=', $i );
|
2018-03-21 01:01:55 +00:00
|
|
|
|
// Trim non-newline whitespace from headings
|
|
|
|
|
|
// Using \s* will break for: "==\n===\n" and parse as <h2>=</h2>
|
|
|
|
|
|
$text = preg_replace( "/^(?:$h)[ \\t]*(.+?)[ \\t]*(?:$h)\\s*$/m", "<h$i>\\1</h$i>", $text );
|
2004-02-26 13:37:26 +00:00
|
|
|
|
}
|
|
|
|
|
|
return $text;
|
|
|
|
|
|
}
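// Illustrative sketch, not part of the Parser class: the heading pass can be
// run on its own to see why the loop counts down from six. demoHeadings() is
// a hypothetical name; the loop and regex are copied from handleHeadings().

```php
<?php
// Hypothetical helper, for experimentation only; it does not exist in MediaWiki.
function demoHeadings( string $text ): string {
	// Longest markers first, so "=== x ===" becomes <h3> before the
	// shorter "==" pattern gets a chance to match the same line.
	for ( $i = 6; $i >= 1; --$i ) {
		$h = str_repeat( '=', $i );
		// [ \t]* trims only non-newline whitespace around the heading text.
		$text = preg_replace(
			"/^(?:$h)[ \\t]*(.+?)[ \\t]*(?:$h)\\s*$/m",
			"<h$i>\\1</h$i>",
			$text
		);
	}
	return $text;
}

echo demoHeadings( "== Title ==\n=== Sub ===\nplain" ), "\n";
```

// The first line becomes an <h2>, the second an <h3>, and the third is left
// untouched because it carries no "=" markers.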
|
|
|
|
|
|
|
2019-10-28 19:52:50 +00:00
|
|
|
|
/**
|
|
|
|
|
|
* Replace single quotes with HTML markup
|
|
|
|
|
|
*
|
|
|
|
|
|
* @param string $text
|
|
|
|
|
|
*
|
|
|
|
|
|
* @return string The altered text
|
|
|
|
|
|
*/
|
|
|
|
|
|
private function handleAllQuotes( $text ) {
|
2004-06-08 18:11:28 +00:00
|
|
|
|
$outtext = '';
|
2008-08-26 14:37:15 +00:00
|
|
|
|
$lines = StringUtils::explode( "\n", $text );
|
2004-05-26 16:29:04 +00:00
|
|
|
|
foreach ( $lines as $line ) {
|
2008-08-26 14:37:15 +00:00
|
|
|
|
$outtext .= $this->doQuotes( $line ) . "\n";
|
2004-05-26 16:29:04 +00:00
|
|
|
|
}
|
2013-01-26 21:11:09 +00:00
|
|
|
|
$outtext = substr( $outtext, 0, -1 );
|
2004-06-05 04:51:24 +00:00
|
|
|
|
return $outtext;
|
2004-05-26 16:29:04 +00:00
|
|
|
|
}
|
|
|
|
|
|
|
2004-09-21 05:49:12 +00:00
|
|
|
|
/**
|
2019-11-04 19:23:34 +00:00
|
|
|
|
* Helper function for handleAllQuotes()
|
2011-08-05 00:33:03 +00:00
|
|
|
|
*
|
2014-04-21 23:38:39 +00:00
|
|
|
|
* @param string $text
|
2011-08-05 00:33:03 +00:00
|
|
|
|
*
|
|
|
|
|
|
* @return string
|
2019-10-28 19:52:50 +00:00
|
|
|
|
* @internal
|
2004-09-21 05:49:12 +00:00
|
|
|
|
*/
|
2008-08-25 04:27:40 +00:00
|
|
|
|
public function doQuotes( $text ) {
|
2010-01-27 02:41:22 +00:00
|
|
|
|
$arr = preg_split( "/(''+)/", $text, -1, PREG_SPLIT_DELIM_CAPTURE );
|
2013-08-24 17:25:54 +00:00
|
|
|
|
$countarr = count( $arr );
|
|
|
|
|
|
if ( $countarr == 1 ) {
|
2004-08-06 20:47:21 +00:00
|
|
|
|
return $text;
|
2013-08-16 21:08:00 +00:00
|
|
|
|
}
|
|
|
|
|
|
|
|
|
|
|
|
// First, do some preliminary work. This may shift some apostrophes from
|
|
|
|
|
|
// being mark-up to being text. It also counts the number of occurrences
|
|
|
|
|
|
// of bold and italics mark-ups.
|
|
|
|
|
|
$numbold = 0;
|
|
|
|
|
|
$numitalics = 0;
|
2013-08-24 17:25:54 +00:00
|
|
|
|
for ( $i = 1; $i < $countarr; $i += 2 ) {
|
|
|
|
|
|
$thislen = strlen( $arr[$i] );
|
2013-08-16 21:08:00 +00:00
|
|
|
|
// If there are ever four apostrophes, assume the first is supposed to
|
|
|
|
|
|
// be text, and the remaining three constitute mark-up for bold text.
|
2016-12-11 22:45:07 +00:00
|
|
|
|
// (T15227: ''''foo'''' turns into ' ''' foo ' ''')
|
2013-08-24 17:25:54 +00:00
|
|
|
|
if ( $thislen == 4 ) {
|
2013-08-16 21:08:00 +00:00
|
|
|
|
$arr[$i - 1] .= "'";
|
|
|
|
|
|
$arr[$i] = "'''";
|
2013-08-24 17:25:54 +00:00
|
|
|
|
$thislen = 3;
|
|
|
|
|
|
} elseif ( $thislen > 5 ) {
|
2013-08-16 21:08:00 +00:00
|
|
|
|
// If there are more than 5 apostrophes in a row, assume they're all
|
|
|
|
|
|
// text except for the last 5.
|
2016-12-11 22:45:07 +00:00
|
|
|
|
// (T15227: ''''''foo'''''' turns into ' ''''' foo ' ''''')
|
2013-08-24 17:25:54 +00:00
|
|
|
|
$arr[$i - 1] .= str_repeat( "'", $thislen - 5 );
|
2013-08-16 21:08:00 +00:00
|
|
|
|
$arr[$i] = "'''''";
|
2013-08-24 17:25:54 +00:00
|
|
|
|
$thislen = 5;
|
2010-01-27 02:41:22 +00:00
|
|
|
|
}
|
2013-08-16 21:08:00 +00:00
|
|
|
|
// Count the number of occurrences of bold and italics mark-ups.
|
2013-08-24 17:25:54 +00:00
|
|
|
|
if ( $thislen == 2 ) {
|
2013-08-16 21:08:00 +00:00
|
|
|
|
$numitalics++;
|
2013-08-24 17:25:54 +00:00
|
|
|
|
} elseif ( $thislen == 3 ) {
|
2013-08-16 21:08:00 +00:00
|
|
|
|
$numbold++;
|
2013-08-24 17:25:54 +00:00
|
|
|
|
} elseif ( $thislen == 5 ) {
|
2013-08-16 21:08:00 +00:00
|
|
|
|
$numitalics++;
|
|
|
|
|
|
$numbold++;
|
|
|
|
|
|
}
|
|
|
|
|
|
}
|
2010-01-27 02:41:22 +00:00
|
|
|
|
|
2013-08-16 21:08:00 +00:00
|
|
|
|
// If there is an odd number of both bold and italics, it is likely
|
|
|
|
|
|
// that one of the bold ones was meant to be an apostrophe followed
|
|
|
|
|
|
// by italics. Which one we cannot know for certain, but it is more
|
|
|
|
|
|
// likely to be one that has a single-letter word before it.
|
|
|
|
|
|
if ( ( $numbold % 2 == 1 ) && ( $numitalics % 2 == 1 ) ) {
|
|
|
|
|
|
$firstsingleletterword = -1;
|
|
|
|
|
|
$firstmultiletterword = -1;
|
|
|
|
|
|
$firstspace = -1;
|
2013-08-24 17:25:54 +00:00
|
|
|
|
for ( $i = 1; $i < $countarr; $i += 2 ) {
|
2013-08-16 21:08:00 +00:00
|
|
|
|
if ( strlen( $arr[$i] ) == 3 ) {
|
|
|
|
|
|
$x1 = substr( $arr[$i - 1], -1 );
|
|
|
|
|
|
$x2 = substr( $arr[$i - 1], -2, 1 );
|
|
|
|
|
|
if ( $x1 === ' ' ) {
|
|
|
|
|
|
if ( $firstspace == -1 ) {
|
|
|
|
|
|
$firstspace = $i;
|
|
|
|
|
|
}
|
|
|
|
|
|
} elseif ( $x2 === ' ' ) {
|
2015-08-17 11:13:27 +00:00
|
|
|
|
$firstsingleletterword = $i;
|
|
|
|
|
|
// if $firstsingleletterword is set, we don't
|
|
|
|
|
|
// look at the other options, so we can bail early.
|
|
|
|
|
|
break;
|
2019-03-29 20:12:24 +00:00
|
|
|
|
} elseif ( $firstmultiletterword == -1 ) {
|
|
|
|
|
|
$firstmultiletterword = $i;
|
2010-01-27 02:41:22 +00:00
|
|
|
|
}
|
|
|
|
|
|
}
|
2013-08-16 21:08:00 +00:00
|
|
|
|
}
|
2004-08-07 12:35:59 +00:00
|
|
|
|
|
2013-08-16 21:08:00 +00:00
|
|
|
|
// If there is a single-letter word, use it!
|
|
|
|
|
|
if ( $firstsingleletterword > -1 ) {
|
|
|
|
|
|
$arr[$firstsingleletterword] = "''";
|
|
|
|
|
|
$arr[$firstsingleletterword - 1] .= "'";
|
|
|
|
|
|
} elseif ( $firstmultiletterword > -1 ) {
|
|
|
|
|
|
// If not, but there's a multi-letter word, use that one.
|
|
|
|
|
|
$arr[$firstmultiletterword] = "''";
|
|
|
|
|
|
$arr[$firstmultiletterword - 1] .= "'";
|
|
|
|
|
|
} elseif ( $firstspace > -1 ) {
|
|
|
|
|
|
// ... otherwise use the first one that has neither.
|
|
|
|
|
|
// (notice that it is possible for all three to be -1 if, for example,
|
|
|
|
|
|
// there is only one quintuple-apostrophe in the line)
|
|
|
|
|
|
$arr[$firstspace] = "''";
|
|
|
|
|
|
$arr[$firstspace - 1] .= "'";
|
2004-08-06 20:47:21 +00:00
|
|
|
|
}
|
2013-08-16 21:08:00 +00:00
|
|
|
|
}
|
2004-08-07 12:35:59 +00:00
|
|
|
|
|
2013-08-16 21:08:00 +00:00
|
|
|
|
// Now let's actually convert our apostrophic mush to HTML!
|
|
|
|
|
|
$output = '';
|
|
|
|
|
|
$buffer = '';
|
|
|
|
|
|
$state = '';
|
|
|
|
|
|
$i = 0;
|
|
|
|
|
|
foreach ( $arr as $r ) {
|
|
|
|
|
|
if ( ( $i % 2 ) == 0 ) {
|
|
|
|
|
|
if ( $state === 'both' ) {
|
|
|
|
|
|
$buffer .= $r;
|
2010-03-30 21:20:05 +00:00
|
|
|
|
} else {
|
2013-08-16 21:08:00 +00:00
|
|
|
|
$output .= $r;
|
|
|
|
|
|
}
|
|
|
|
|
|
} else {
|
2013-08-24 17:25:54 +00:00
|
|
|
|
$thislen = strlen( $r );
|
|
|
|
|
|
if ( $thislen == 2 ) {
|
2021-02-25 20:19:54 +00:00
|
|
|
|
// two quotes - open or close italics
|
2013-08-16 21:08:00 +00:00
|
|
|
|
if ( $state === 'i' ) {
|
|
|
|
|
|
$output .= '</i>';
|
|
|
|
|
|
$state = '';
|
|
|
|
|
|
} elseif ( $state === 'bi' ) {
|
|
|
|
|
|
$output .= '</i>';
|
|
|
|
|
|
$state = 'b';
|
|
|
|
|
|
} elseif ( $state === 'ib' ) {
|
|
|
|
|
|
$output .= '</b></i><b>';
|
|
|
|
|
|
$state = 'b';
|
|
|
|
|
|
} elseif ( $state === 'both' ) {
|
|
|
|
|
|
$output .= '<b><i>' . $buffer . '</i>';
|
|
|
|
|
|
$state = 'b';
|
|
|
|
|
|
} else { // $state can be 'b' or ''
|
|
|
|
|
|
$output .= '<i>';
|
|
|
|
|
|
$state .= 'i';
|
|
|
|
|
|
}
|
2013-08-24 17:25:54 +00:00
|
|
|
|
} elseif ( $thislen == 3 ) {
|
2021-02-25 20:19:54 +00:00
|
|
|
|
// three quotes - open or close bold
|
2013-08-16 21:08:00 +00:00
|
|
|
|
if ( $state === 'b' ) {
|
|
|
|
|
|
$output .= '</b>';
|
|
|
|
|
|
$state = '';
|
|
|
|
|
|
} elseif ( $state === 'bi' ) {
|
|
|
|
|
|
$output .= '</i></b><i>';
|
|
|
|
|
|
$state = 'i';
|
|
|
|
|
|
} elseif ( $state === 'ib' ) {
|
|
|
|
|
|
$output .= '</b>';
|
|
|
|
|
|
$state = 'i';
|
|
|
|
|
|
} elseif ( $state === 'both' ) {
|
|
|
|
|
|
$output .= '<i><b>' . $buffer . '</b>';
|
|
|
|
|
|
$state = 'i';
|
|
|
|
|
|
} else { // $state can be 'i' or ''
|
|
|
|
|
|
$output .= '<b>';
|
|
|
|
|
|
$state .= 'b';
|
|
|
|
|
|
}
|
2013-08-24 17:25:54 +00:00
|
|
|
|
} elseif ( $thislen == 5 ) {
|
2021-02-25 20:19:54 +00:00
|
|
|
|
// five quotes - open or close both separately
|
2013-08-16 21:08:00 +00:00
|
|
|
|
if ( $state === 'b' ) {
|
|
|
|
|
|
$output .= '</b><i>';
|
|
|
|
|
|
$state = 'i';
|
|
|
|
|
|
} elseif ( $state === 'i' ) {
|
|
|
|
|
|
$output .= '</i><b>';
|
|
|
|
|
|
$state = 'b';
|
|
|
|
|
|
} elseif ( $state === 'bi' ) {
|
|
|
|
|
|
$output .= '</i></b>';
|
|
|
|
|
|
$state = '';
|
|
|
|
|
|
} elseif ( $state === 'ib' ) {
|
|
|
|
|
|
$output .= '</b></i>';
|
|
|
|
|
|
$state = '';
|
|
|
|
|
|
} elseif ( $state === 'both' ) {
|
|
|
|
|
|
$output .= '<i><b>' . $buffer . '</b></i>';
|
|
|
|
|
|
$state = '';
|
|
|
|
|
|
} else { // ($state == '')
|
|
|
|
|
|
$buffer = '';
|
|
|
|
|
|
$state = 'both';
|
2004-08-06 20:47:21 +00:00
|
|
|
|
}
|
|
|
|
|
|
}
|
2010-03-30 21:20:05 +00:00
|
|
|
|
}
|
2013-08-16 21:08:00 +00:00
|
|
|
|
$i++;
|
|
|
|
|
|
}
|
|
|
|
|
|
// Now close all remaining tags. Notice that the order is important.
|
|
|
|
|
|
if ( $state === 'b' || $state === 'ib' ) {
|
|
|
|
|
|
$output .= '</b>';
|
|
|
|
|
|
}
|
|
|
|
|
|
if ( $state === 'i' || $state === 'bi' || $state === 'ib' ) {
|
|
|
|
|
|
$output .= '</i>';
|
|
|
|
|
|
}
|
|
|
|
|
|
if ( $state === 'bi' ) {
|
|
|
|
|
|
$output .= '</b>';
|
|
|
|
|
|
}
|
|
|
|
|
|
// There might be lonely ''''', so make sure we have a buffer
|
|
|
|
|
|
if ( $state === 'both' && $buffer ) {
|
|
|
|
|
|
$output .= '<b><i>' . $buffer . '</i></b>';
|
2004-05-26 16:29:04 +00:00
|
|
|
|
}
|
2013-08-16 21:08:00 +00:00
|
|
|
|
return $output;
|
2004-05-26 16:29:04 +00:00
|
|
|
|
}
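// Illustrative sketch, not part of the Parser class: the first phase of
// doQuotes() splits a line on runs of apostrophes and then shifts surplus
// apostrophes into the neighbouring text. demoSplitQuotes() is a hypothetical
// name; it reproduces only that tokenising step, not the bold/italic
// balancing or the HTML state machine that follow it.

```php
<?php
// Hypothetical helper, for experimentation only. Returns the token array in
// which odd indexes hold mark-up (2, 3 or 5 apostrophes) and even indexes text.
function demoSplitQuotes( string $line ): array {
	$arr = preg_split( "/(''+)/", $line, -1, PREG_SPLIT_DELIM_CAPTURE );
	$count = count( $arr );
	for ( $i = 1; $i < $count; $i += 2 ) {
		$len = strlen( $arr[$i] );
		if ( $len == 4 ) {
			// Four apostrophes: one literal apostrophe plus bold mark-up.
			$arr[$i - 1] .= "'";
			$arr[$i] = "'''";
		} elseif ( $len > 5 ) {
			// More than five: everything but the last five is literal text.
			$arr[$i - 1] .= str_repeat( "'", $len - 5 );
			$arr[$i] = "'''''";
		}
	}
	return $arr;
}

print_r( demoSplitQuotes( "a ''''b'''' c" ) );
```

// In the example each run of four apostrophes is reduced to three, and the
// extra apostrophe is appended to the text token that precedes it.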
|
|
|
|
|
|
|
2019-10-28 19:52:50 +00:00
|
|
|
|
/**
|
|
|
|
|
|
* Replace external links (REL)
|
|
|
|
|
|
*
|
|
|
|
|
|
* Note: this is all very hackish and the order of execution matters a lot.
|
|
|
|
|
|
* Make sure to run tests/parser/parserTests.php if you change this code.
|
|
|
|
|
|
*
|
|
|
|
|
|
* @param string $text
|
|
|
|
|
|
* @return string
|
|
|
|
|
|
*/
|
|
|
|
|
|
private function handleExternalLinks( $text ) {
|
2007-11-20 10:55:08 +00:00
|
|
|
|
$bits = preg_split( $this->mExtLinkBracketedRegex, $text, -1, PREG_SPLIT_DELIM_CAPTURE );
|
2019-08-30 13:09:51 +00:00
|
|
|
|
// @phan-suppress-next-line PhanTypeComparisonFromArray See phan issue #3161
|
2012-08-29 08:07:10 +00:00
|
|
|
|
if ( $bits === false ) {
|
2022-11-09 16:19:20 +00:00
|
|
|
|
throw new RuntimeException( "PCRE failure" );
|
2012-08-29 08:07:10 +00:00
|
|
|
|
}
|
2008-08-26 14:37:15 +00:00
|
|
|
|
$s = array_shift( $bits );
|
2004-08-07 18:24:12 +00:00
|
|
|
|
|
|
|
|
|
|
$i = 0;
|
2013-04-13 11:36:24 +00:00
|
|
|
|
while ( $i < count( $bits ) ) {
|
2004-08-07 18:24:12 +00:00
|
|
|
|
$url = $bits[$i++];
|
2012-12-09 03:27:02 +00:00
|
|
|
|
$i++; // protocol
|
2004-08-07 18:24:12 +00:00
|
|
|
|
$text = $bits[$i++];
|
|
|
|
|
|
$trail = $bits[$i++];
|
2004-08-14 22:38:46 +00:00
|
|
|
|
|
2004-10-11 16:57:49 +00:00
|
|
|
|
# The characters '<' and '>' (which were escaped by
|
2022-03-04 19:05:41 +00:00
|
|
|
|
# internalRemoveHtmlTags()) should not be included in
|
2004-10-11 16:57:49 +00:00
|
|
|
|
# URLs, per RFC 2396.
|
2016-02-17 09:09:32 +00:00
|
|
|
|
$m2 = [];
|
2010-03-30 21:20:05 +00:00
|
|
|
|
if ( preg_match( '/&(lt|gt);/', $url, $m2, PREG_OFFSET_CAPTURE ) ) {
|
|
|
|
|
|
$text = substr( $url, $m2[0][1] ) . ' ' . $text;
|
|
|
|
|
|
$url = substr( $url, 0, $m2[0][1] );
|
2004-10-11 16:57:49 +00:00
|
|
|
|
}
|
|
|
|
|
|
|
2004-08-07 18:24:12 +00:00
|
|
|
|
# If the link text is an image URL, replace it with an <img> tag
|
|
|
|
|
|
# This happened by accident in the original parser, but some people used it extensively
|
2005-04-27 07:48:14 +00:00
|
|
|
|
$img = $this->maybeMakeExternalImage( $text );
|
2004-08-07 18:24:12 +00:00
|
|
|
|
if ( $img !== false ) {
|
|
|
|
|
|
$text = $img;
|
|
|
|
|
|
}
|
2004-04-12 23:59:37 +00:00
|
|
|
|
|
2004-08-07 18:24:12 +00:00
|
|
|
|
$dtrail = '';
|
2004-08-07 08:54:52 +00:00
|
|
|
|
|
2018-02-28 21:11:09 +00:00
|
|
|
|
# Set linktype for CSS
|
|
|
|
|
|
$linktype = 'text';
|
2005-01-15 23:56:26 +00:00
|
|
|
|
|
2004-08-07 18:24:12 +00:00
|
|
|
|
# No link text, e.g. [http://domain.tld/some.link]
|
|
|
|
|
|
if ( $text == '' ) {
|
2011-07-12 20:55:05 +00:00
|
|
|
|
# Autonumber
|
2012-03-05 05:53:12 +00:00
|
|
|
|
$langObj = $this->getTargetLanguage();
|
2011-07-12 20:55:05 +00:00
|
|
|
|
$text = '[' . $langObj->formatNum( ++$this->mAutonumber ) . ']';
|
|
|
|
|
|
$linktype = 'autonumber';
|
2004-08-07 18:24:12 +00:00
|
|
|
|
} else {
|
|
|
|
|
|
# Have link text, e.g. [http://domain.tld/some.link text]s
|
|
|
|
|
|
# Check for trail
|
2022-10-21 04:32:38 +00:00
|
|
|
|
[ $dtrail, $trail ] = Linker::splitTrail( $trail );
|
2004-08-07 18:24:12 +00:00
|
|
|
|
}
|
2004-08-14 22:38:46 +00:00
|
|
|
|
|
2018-06-08 16:18:05 +00:00
|
|
|
|
// Excluding protocol-relative URLs may avoid many false positives.
|
2022-04-28 13:33:39 +00:00
|
|
|
|
if ( preg_match( '/^(?:' . $this->urlUtils->validAbsoluteProtocols() . ')/', $text ) ) {
|
2020-01-23 18:39:23 +00:00
|
|
|
|
$text = $this->getTargetLanguageConverter()->markNoConversion( $text );
|
2018-06-08 16:18:05 +00:00
|
|
|
|
}
|
2006-10-17 08:49:27 +00:00
|
|
|
|
|
2006-07-11 19:54:20 +00:00
|
|
|
|
$url = Sanitizer::cleanUrl( $url );
|
2005-01-30 04:11:22 +00:00
|
|
|
|
|
2004-08-07 18:24:12 +00:00
|
|
|
|
# Use the encoded URL
|
|
|
|
|
|
# This means that users can paste URLs directly into the text
|
2010-05-30 17:33:59 +00:00
|
|
|
|
# Funny characters like ö aren't valid in URLs anyway
|
2004-08-07 18:24:12 +00:00
|
|
|
|
# This was changed in August 2004
|
2011-04-03 11:44:11 +00:00
|
|
|
|
$s .= Linker::makeExternalLink( $url, $text, false, $linktype,
|
2019-10-18 19:50:58 +00:00
|
|
|
|
$this->getExternalLinkAttribs( $url ), $this->getTitle() ) . $dtrail . $trail;
|
2006-01-26 13:29:14 +00:00
|
|
|
|
|
2006-03-17 01:02:14 +00:00
|
|
|
|
# Register link in the output object.
|
2016-03-11 01:08:06 +00:00
|
|
|
|
$this->mOutput->addExternalLink( $url );
|
2004-08-07 18:24:12 +00:00
|
|
|
|
}
|
2004-08-07 08:54:52 +00:00
|
|
|
|
|
2021-10-25 19:15:52 +00:00
|
|
|
|
// @phan-suppress-next-line PhanTypeMismatchReturnNullable False positive from array_shift
|
2004-02-26 13:37:26 +00:00
|
|
|
|
return $s;
|
|
|
|
|
|
}
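/*
Illustrative sketch, not part of the Parser class: the step above that keeps
escaped angle brackets out of the URL can be tried in isolation. The function
name splitUrlAtEscapedAngle() is hypothetical; the regular expression and the
substr() arithmetic follow the code above.

```php
<?php
// Hypothetical helper: cut $url at the first "&lt;" or "&gt;" and move the
// remainder to the front of the link text, separated by a space.
function splitUrlAtEscapedAngle( string $url, string $text ): array {
	$m = [];
	if ( preg_match( '/&(lt|gt);/', $url, $m, PREG_OFFSET_CAPTURE ) ) {
		// $m[0][1] is the byte offset of the whole match within $url.
		$text = substr( $url, $m[0][1] ) . ' ' . $text;
		$url = substr( $url, 0, $m[0][1] );
	}
	return [ $url, $text ];
}
```

With the URL "http://example.org/a&lt;b" and the text "label", the URL is cut
back to "http://example.org/a" and the text becomes "&lt;b label".
*/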
|
2014-03-15 19:57:00 +00:00
|
|
|
|
|
2012-11-12 22:52:43 +00:00
|
|
|
|
/**
|
|
|
|
|
|
* Get the rel attribute for a particular external link.
|
|
|
|
|
|
*
|
|
|
|
|
|
* @since 1.21
|
2019-10-28 19:52:50 +00:00
|
|
|
|
* @internal
|
2022-07-31 00:02:18 +00:00
|
|
|
|
* @param string|false $url Optional URL, to extract the domain from for rel =>
|
2012-11-12 22:52:43 +00:00
|
|
|
|
* nofollow if appropriate
|
2019-04-15 12:47:32 +00:00
|
|
|
|
* @param LinkTarget|null $title Optional LinkTarget, for wgNoFollowNsExceptions lookups
|
2014-04-21 23:38:39 +00:00
|
|
|
|
* @return string|null Rel attribute for $url
|
2012-11-12 22:52:43 +00:00
|
|
|
|
*/
|
2019-08-27 09:23:52 +00:00
|
|
|
|
public static function getExternalLinkRel( $url = false, LinkTarget $title = null ) {
|
2022-01-06 18:44:56 +00:00
|
|
|
|
$mainConfig = MediaWikiServices::getInstance()->getMainConfig();
|
2022-04-10 15:34:45 +00:00
|
|
|
|
$noFollowLinks = $mainConfig->get( MainConfigNames::NoFollowLinks );
|
|
|
|
|
|
$noFollowNsExceptions = $mainConfig->get( MainConfigNames::NoFollowNsExceptions );
|
|
|
|
|
|
$noFollowDomainExceptions = $mainConfig->get( MainConfigNames::NoFollowDomainExceptions );
|
2012-11-12 22:52:43 +00:00
|
|
|
|
$ns = $title ? $title->getNamespace() : false;
|
2022-01-06 18:44:56 +00:00
|
|
|
|
if ( $noFollowLinks && !in_array( $ns, $noFollowNsExceptions )
|
|
|
|
|
|
&& !wfMatchesDomainList( $url, $noFollowDomainExceptions )
|
2013-12-01 20:39:00 +00:00
|
|
|
|
) {
|
2012-11-12 22:52:43 +00:00
|
|
|
|
return 'nofollow';
|
|
|
|
|
|
}
|
|
|
|
|
|
return null;
|
|
|
|
|
|
}
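/*
Illustrative sketch, not part of the Parser class: the nofollow decision above
depends on three configuration values and on wfMatchesDomainList(). The
version below takes the configuration as plain arguments so it can run
outside MediaWiki. Both function names are hypothetical, and
hostMatchesDomainList() is only a simplified stand-in for
wfMatchesDomainList(), assuming it matches a host that equals a listed domain
or is a subdomain of one. Unlike the code above, the namespace check here is
strict.

```php
<?php
// Simplified stand-in for wfMatchesDomainList() (assumption, see above).
function hostMatchesDomainList( string $url, array $domains ): bool {
	$host = parse_url( $url, PHP_URL_HOST );
	if ( !is_string( $host ) ) {
		return false;
	}
	$host = strtolower( $host );
	foreach ( $domains as $domain ) {
		$domain = strtolower( $domain );
		// Exact match, or a match on a dot boundary for subdomains.
		if ( $host === $domain || substr( $host, -strlen( $domain ) - 1 ) === '.' . $domain ) {
			return true;
		}
	}
	return false;
}

// Hypothetical helper: returns 'nofollow' or null, like getExternalLinkRel().
function relForExternalLink(
	string $url, ?int $ns, bool $noFollowLinks, array $nsExceptions, array $domainExceptions
): ?string {
	if ( $noFollowLinks
		&& !in_array( $ns, $nsExceptions, true )
		&& !hostMatchesDomainList( $url, $domainExceptions )
	) {
		return 'nofollow';
	}
	return null;
}
```

A link is marked nofollow only when the feature is enabled and neither the
namespace nor the domain is exempt.
*/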
|
2014-03-15 19:57:00 +00:00
|
|
|
|
|
2009-01-23 18:03:12 +00:00
|
|
|
|
/**
|
|
|
|
|
|
* Get an associative array of additional HTML attributes appropriate for a
|
|
|
|
|
|
* particular external link. This currently may include rel => nofollow
|
|
|
|
|
|
* (depending on configuration, namespace, and the URL's domain) and/or a
|
|
|
|
|
|
* target attribute (depending on configuration).
|
|
|
|
|
|
*
|
2019-10-28 19:52:50 +00:00
|
|
|
|
* @internal
|
2016-06-01 04:25:18 +00:00
|
|
|
|
* @param string $url URL to extract the domain from for rel =>
|
2009-01-23 18:03:12 +00:00
|
|
|
|
* nofollow if appropriate
|
2014-04-21 23:38:39 +00:00
|
|
|
|
* @return array Associative array of HTML attributes
|
2009-01-23 18:03:12 +00:00
|
|
|
|
*/
|
2016-06-01 04:25:18 +00:00
|
|
|
|
public function getExternalLinkAttribs( $url ) {
|
2016-02-17 09:09:32 +00:00
|
|
|
|
$attribs = [];
|
2019-10-18 19:50:58 +00:00
|
|
|
|
$rel = self::getExternalLinkRel( $url, $this->getTitle() );
|
2016-04-25 18:08:46 +00:00
|
|
|
|
|
|
|
|
|
|
$target = $this->mOptions->getExternalLinkTarget();
|
|
|
|
|
|
if ( $target ) {
|
|
|
|
|
|
$attribs['target'] = $target;
|
|
|
|
|
|
if ( !in_array( $target, [ '_self', '_parent', '_top' ] ) ) {
|
|
|
|
|
|
// T133507. New windows can navigate parent cross-origin.
|
|
|
|
|
|
// Including noreferrer due to lacking browser
|
|
|
|
|
|
// support of noopener. Eventually noreferrer should be removed.
|
|
|
|
|
|
if ( $rel !== null && $rel !== '' ) {
|
|
|
|
|
|
$rel .= ' ';
|
|
|
|
|
|
}
|
|
|
|
|
|
$rel .= 'noreferrer noopener';
|
|
|
|
|
|
}
|
2008-09-30 01:00:40 +00:00
|
|
|
|
}
|
2016-04-25 18:08:46 +00:00
|
|
|
|
$attribs['rel'] = $rel;
|
2008-09-30 01:00:40 +00:00
|
|
|
|
return $attribs;
|
|
|
|
|
|
}
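/*
Illustrative sketch, not part of the Parser class: how the rel and target
attributes described above combine. The function name externalLinkAttribs()
is hypothetical, and the rel value and link target are passed in directly
instead of being read from the parser options. The sketch collects the rel
tokens in an array so that a null rel never produces a leading space.

```php
<?php
// Hypothetical helper: $rel is 'nofollow' or null, $target is a link target
// such as '_blank', or null/'' when no target is configured.
function externalLinkAttribs( ?string $rel, ?string $target ): array {
	$attribs = [];
	$tokens = [];
	if ( $rel !== null && $rel !== '' ) {
		$tokens[] = $rel;
	}
	if ( $target ) {
		$attribs['target'] = $target;
		// Targets that may open a new browsing context get extra protection.
		if ( !in_array( $target, [ '_self', '_parent', '_top' ], true ) ) {
			$tokens[] = 'noreferrer noopener';
		}
	}
	$attribs['rel'] = $tokens ? implode( ' ', $tokens ) : $rel;
	return $attribs;
}
```

For a nofollow link with the target "_blank" this gives a rel value of
"nofollow noreferrer noopener"; with the target "_self" the rel value stays
"nofollow".
*/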
|
|
|
|
|
|
|
2006-01-26 13:29:14 +00:00
|
|
|
|
/**
|
2013-12-21 02:14:48 +00:00
|
|
|
|
* Replace unusual escape codes in a URL with their equivalent characters
|
2011-08-05 00:33:03 +00:00
|
|
|
|
*
|
2013-12-21 02:14:48 +00:00
|
|
|
|
* This generally follows the syntax defined in RFC 3986, with special
|
|
|
|
|
|
* consideration for HTTP query strings.
|
2011-08-05 00:33:03 +00:00
|
|
|
|
*
|
2019-10-28 19:52:50 +00:00
|
|
|
|
* @internal
|
2013-12-21 02:14:48 +00:00
|
|
|
|
* @param string $url
|
2011-08-05 00:33:03 +00:00
|
|
|
|
* @return string
|
2006-01-26 13:29:14 +00:00
|
|
|
|
*/
|
2013-12-21 02:14:48 +00:00
|
|
|
|
public static function normalizeLinkUrl( $url ) {
|
2016-11-19 00:50:43 +00:00
|
|
|
|
# Test for RFC 3986 IPv6 syntax
|
|
|
|
|
|
$scheme = '[a-z][a-z0-9+.-]*:';
|
|
|
|
|
|
$userinfo = '(?:[a-z0-9\-._~!$&\'()*+,;=:]|%[0-9a-f]{2})*';
|
|
|
|
|
|
$ipv6Host = '\\[((?:[0-9a-f:]|%3[0-A]|%[46][1-6])+)\\]';
|
|
|
|
|
|
if ( preg_match( "<^(?:{$scheme})?//(?:{$userinfo}@)?{$ipv6Host}(?:[:/?#].*|)$>i", $url, $m ) &&
|
2019-06-25 18:53:15 +00:00
|
|
|
|
IPUtils::isValid( rawurldecode( $m[1] ) )
|
2016-11-19 00:50:43 +00:00
|
|
|
|
) {
|
|
|
|
|
|
$isIPv6 = rawurldecode( $m[1] );
|
|
|
|
|
|
} else {
|
|
|
|
|
|
$isIPv6 = false;
|
|
|
|
|
|
}
|
|
|
|
|
|
|
|
|
|
|
|
# Make sure unsafe characters are encoded
|
2021-02-25 20:19:54 +00:00
|
|
|
|
$url = preg_replace_callback(
|
|
|
|
|
|
'/[\x00-\x20"<>\[\\\\\]^`{|}\x7F-\xFF]/',
|
2021-02-10 22:31:02 +00:00
|
|
|
|
static function ( $m ) {
|
2013-12-21 02:14:48 +00:00
|
|
|
|
return rawurlencode( $m[0] );
|
|
|
|
|
|
},
|
|
|
|
|
|
$url
|
|
|
|
|
|
);
|
|
|
|
|
|
|
|
|
|
|
|
$ret = '';
|
|
|
|
|
|
$end = strlen( $url );
|
|
|
|
|
|
|
|
|
|
|
|
# Fragment part - 'fragment'
|
|
|
|
|
|
$start = strpos( $url, '#' );
|
|
|
|
|
|
if ( $start !== false && $start < $end ) {
|
|
|
|
|
|
$ret = self::normalizeUrlComponent(
|
|
|
|
|
|
substr( $url, $start, $end - $start ), '"#%<>[\]^`{|}' ) . $ret;
|
|
|
|
|
|
$end = $start;
|
|
|
|
|
|
}
|
|
|
|
|
|
|
|
|
|
|
|
# Query part - 'query' minus &=+;
|
|
|
|
|
|
$start = strpos( $url, '?' );
|
|
|
|
|
|
if ( $start !== false && $start < $end ) {
|
|
|
|
|
|
$ret = self::normalizeUrlComponent(
|
|
|
|
|
|
substr( $url, $start, $end - $start ), '"#%<>[\]^`{|}&=+;' ) . $ret;
|
|
|
|
|
|
$end = $start;
|
2006-01-26 13:29:14 +00:00
|
|
|
|
}
|
2013-12-21 02:14:48 +00:00
|
|
|
|
|
|
|
|
|
|
# Scheme and path part - 'pchar'
|
|
|
|
|
|
# (we assume no userinfo or encoded colons in the host)
|
|
|
|
|
|
$ret = self::normalizeUrlComponent(
|
|
|
|
|
|
substr( $url, 0, $end ), '"#%<>[\]^`{|}/?' ) . $ret;
|
|
|
|
|
|
|
2016-11-19 00:50:43 +00:00
|
|
|
|
# Fix IPv6 syntax
|
|
|
|
|
|
if ( $isIPv6 !== false ) {
|
|
|
|
|
|
$ipv6Host = "%5B({$isIPv6})%5D";
|
|
|
|
|
|
$ret = preg_replace(
|
|
|
|
|
|
"<^((?:{$scheme})?//(?:{$userinfo}@)?){$ipv6Host}(?=[:/?#]|$)>i",
|
|
|
|
|
|
"$1[$2]",
|
|
|
|
|
|
$ret
|
|
|
|
|
|
);
|
|
|
|
|
|
}
|
|
|
|
|
|
|
2013-12-21 02:14:48 +00:00
|
|
|
|
return $ret;
|
|
|
|
|
|
}
|
|
|
|
|
|
|
|
|
|
|
|
private static function normalizeUrlComponent( $component, $unsafe ) {
|
2021-02-10 22:31:02 +00:00
|
|
|
|
$callback = static function ( $matches ) use ( $unsafe ) {
|
2013-12-21 02:14:48 +00:00
|
|
|
|
$char = urldecode( $matches[0] );
|
|
|
|
|
|
$ord = ord( $char );
|
|
|
|
|
|
if ( $ord > 32 && $ord < 127 && strpos( $unsafe, $char ) === false ) {
|
|
|
|
|
|
# Unescape it
|
|
|
|
|
|
return $char;
|
|
|
|
|
|
} else {
|
|
|
|
|
|
# Leave it escaped, but use uppercase for a-f
|
|
|
|
|
|
return strtoupper( $matches[0] );
|
|
|
|
|
|
}
|
|
|
|
|
|
};
|
|
|
|
|
|
return preg_replace_callback( '/%[0-9A-Fa-f]{2}/', $callback, $component );
|
2006-01-26 13:29:14 +00:00
|
|
|
|
}
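/*
Illustrative sketch, not part of the Parser class: the per-component
normalisation above, written as a free function so it can be run on its own.
The name normalizeUrlComponentSketch() is hypothetical; the logic follows
normalizeUrlComponent(). Percent-escapes of printable ASCII characters that
are not listed in $unsafe are decoded, and every other escape is kept with
its hex digits in upper case.

```php
<?php
// Hypothetical standalone copy of the normalisation callback shown above.
function normalizeUrlComponentSketch( string $component, string $unsafe ): string {
	$callback = static function ( array $matches ) use ( $unsafe ) {
		$char = urldecode( $matches[0] );
		$ord = ord( $char );
		if ( $ord > 32 && $ord < 127 && strpos( $unsafe, $char ) === false ) {
			// Printable ASCII that is safe in this component: unescape it.
			return $char;
		}
		// Keep the escape, but use upper case for a-f.
		return strtoupper( $matches[0] );
	};
	return preg_replace_callback( '/%[0-9A-Fa-f]{2}/', $callback, $component );
}
```

With the unsafe set "/?#%", the input "/wiki/%41%2fb%c3%a9" becomes
"/wiki/A%2Fb%C3%A9": the escaped letter is decoded, the escaped slash and the
two non-ASCII bytes stay escaped.
*/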
|
|
|
|
|
|
|
2004-09-21 05:49:12 +00:00
|
|
|
|
/**
|
2005-10-26 22:13:02 +00:00
|
|
|
|
* Make an external image if it's allowed, either through the global
|
2008-09-01 18:49:14 +00:00
|
|
|
|
* option, through the exception, or through the on-wiki whitelist
|
2011-08-05 00:33:03 +00:00
|
|
|
|
*
|
2014-05-10 23:07:20 +00:00
|
|
|
|
* @param string $url
|
2011-08-05 00:33:03 +00:00
|
|
|
|
*
|
|
|
|
|
|
* @return string|false HTML for the image, or false if the URL may not be shown as an image
|
2004-09-21 05:49:12 +00:00
|
|
|
|
*/
|
2014-05-10 23:07:20 +00:00
|
|
|
|
private function maybeMakeExternalImage( $url ) {
|
2005-10-26 22:13:02 +00:00
|
|
|
|
$imagesfrom = $this->mOptions->getAllowExternalImagesFrom();
|
2010-03-30 21:20:05 +00:00
|
|
|
|
$imagesexception = !empty( $imagesfrom );
|
2004-08-07 18:24:12 +00:00
|
|
|
|
$text = false;
|
2008-09-01 18:49:14 +00:00
|
|
|
|
# $imagesfrom could be either a single string or an array of strings, parse out the latter
|
2010-03-30 21:20:05 +00:00
|
|
|
|
if ( $imagesexception && is_array( $imagesfrom ) ) {
|
2008-09-01 18:49:14 +00:00
|
|
|
|
$imagematch = false;
|
2010-03-30 21:20:05 +00:00
|
|
|
|
foreach ( $imagesfrom as $match ) {
|
|
|
|
|
|
if ( strpos( $url, $match ) === 0 ) {
|
2008-09-01 18:49:14 +00:00
|
|
|
|
$imagematch = true;
|
|
|
|
|
|
break;
|
|
|
|
|
|
}
|
|
|
|
|
|
}
|
2010-03-30 21:20:05 +00:00
|
|
|
|
} elseif ( $imagesexception ) {
|
|
|
|
|
|
$imagematch = ( strpos( $url, $imagesfrom ) === 0 );
|
2008-09-01 18:49:14 +00:00
|
|
|
|
} else {
|
|
|
|
|
|
$imagematch = false;
|
|
|
|
|
|
}
|
2014-05-10 23:03:45 +00:00
|
|
|
|
|
2006-01-07 13:09:30 +00:00
|
|
|
|
if ( $this->mOptions->getAllowExternalImages()
|
2014-05-10 23:03:45 +00:00
|
|
|
|
|| ( $imagesexception && $imagematch )
|
|
|
|
|
|
) {
|
2007-11-20 10:55:08 +00:00
|
|
|
|
if ( preg_match( self::EXT_IMAGE_REGEX, $url ) ) {
|
2004-08-07 18:24:12 +00:00
|
|
|
|
# Image found
|
2011-04-03 11:44:11 +00:00
|
|
|
|
$text = Linker::makeExternalImage( $url );
|
2004-08-07 18:24:12 +00:00
|
|
|
|
}
|
|
|
|
|
|
}
|
2010-03-30 21:20:05 +00:00
|
|
|
|
if ( !$text && $this->mOptions->getEnableImageWhitelist()
|
2014-05-10 23:03:45 +00:00
|
|
|
|
&& preg_match( self::EXT_IMAGE_REGEX, $url )
|
|
|
|
|
|
) {
|
|
|
|
|
|
$whitelist = explode(
|
|
|
|
|
|
"\n",
|
|
|
|
|
|
wfMessage( 'external_image_whitelist' )->inContentLanguage()->text()
|
|
|
|
|
|
);
|
|
|
|
|
|
|
2010-03-30 21:20:05 +00:00
|
|
|
|
foreach ( $whitelist as $entry ) {
|
2008-09-01 18:49:14 +00:00
|
|
|
|
# Sanitize the regex fragment, make it case-insensitive, ignore blank entries/comments
|
2010-03-30 21:20:05 +00:00
|
|
|
|
if ( strpos( $entry, '#' ) === 0 || $entry === '' ) {
|
2008-09-01 18:49:14 +00:00
|
|
|
|
continue;
|
2010-03-30 21:20:05 +00:00
|
|
|
|
}
|
2020-11-29 11:37:44 +00:00
|
|
|
|
// @phan-suppress-next-line SecurityCheck-ReDoS preg_quote is not wanted here
|
2010-03-30 21:20:05 +00:00
|
|
|
|
if ( preg_match( '/' . str_replace( '/', '\\/', $entry ) . '/i', $url ) ) {
|
2008-09-01 18:49:14 +00:00
|
|
|
|
# Image matches a whitelist entry
|
2011-04-03 11:44:11 +00:00
|
|
|
|
$text = Linker::makeExternalImage( $url );
|
2008-09-01 18:49:14 +00:00
|
|
|
|
break;
|
|
|
|
|
|
}
|
|
|
|
|
|
}
|
|
|
|
|
|
}
|
2004-08-07 18:24:12 +00:00
|
|
|
|
return $text;
|
|
|
|
|
|
}
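/*
Illustrative sketch, not part of the Parser class: how the on-wiki whitelist
text is interpreted above. Each line is a regular expression fragment; blank
lines and lines starting with "#" are skipped, slashes are escaped, and
matching is case-insensitive. The function name urlMatchesImageWhitelist() is
hypothetical, and the whitelist text is passed in as a string instead of
being read from the external_image_whitelist message. As in the code above,
the fragments are not quoted, so a malformed fragment makes preg_match()
raise a warning.

```php
<?php
// Hypothetical helper: true if any whitelist line matches $url.
function urlMatchesImageWhitelist( string $url, string $whitelistText ): bool {
	foreach ( explode( "\n", $whitelistText ) as $entry ) {
		// Ignore blank entries and comments.
		if ( $entry === '' || strpos( $entry, '#' ) === 0 ) {
			continue;
		}
		// Escape the delimiter and match case-insensitively.
		if ( preg_match( '/' . str_replace( '/', '\\/', $entry ) . '/i', $url ) ) {
			return true;
		}
	}
	return false;
}
```

A whitelist containing the single fragment ^https://upload\.example\.org/
accepts image URLs on that host, in any letter case, and rejects all others.
*/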
|
2005-07-03 07:15:53 +00:00
|
|
|
|
|
2019-10-28 19:52:50 +00:00
|
|
|
|
/**
|
|
|
|
|
|
* Process [[ ]] wikilinks
|
|
|
|
|
|
*
|
|
|
|
|
|
* @param string $text
|
|
|
|
|
|
*
|
|
|
|
|
|
* @return string Processed text
|
|
|
|
|
|
*/
|
|
|
|
|
|
private function handleInternalLinks( $text ) {
|
|
|
|
|
|
$this->mLinkHolders->merge( $this->handleInternalLinks2( $text ) );
|
|
|
|
|
|
return $text;
|
2008-08-26 14:37:15 +00:00
|
|
|
|
}
|
|
|
|
|
|
|
2019-10-28 19:52:50 +00:00
|
|
|
|
/**
|
|
|
|
|
|
* Process [[ ]] wikilinks (RIL)
|
|
|
|
|
|
* @param string &$s
|
|
|
|
|
|
* @return LinkHolderArray
|
|
|
|
|
|
*/
|
|
|
|
|
|
private function handleInternalLinks2( &$s ) {
|
2013-02-09 20:10:44 +00:00
|
|
|
|
static $tc = false, $e1, $e1_img;
|
2004-05-25 14:26:14 +00:00
|
|
|
|
# the % is needed to support urlencoded titles as well
|
2010-01-07 04:13:14 +00:00
|
|
|
|
if ( !$tc ) {
|
2008-08-26 14:37:15 +00:00
|
|
|
|
$tc = Title::legalChars() . '#%';
|
|
|
|
|
|
# Match a link having the form [[namespace:link|alternate]]trail
|
Moving Conrad's recent parser work out to a branch. Reverted r62434, r62416, r62150, r62111, r62085, r62081, r62080, r62077, r62076, r62069, r62049, r62035.
2010-02-19 05:19:32 +00:00
|
|
|
|
$e1 = "/^([{$tc}]+)(?:\\|(.+?))?]](.*)\$/sD";
|
2008-08-26 14:37:15 +00:00
|
|
|
|
# Match cases where there is no "]]", which might still be images
|
|
|
|
|
|
$e1_img = "/^([{$tc}]+)\\|(.*)\$/sD";
|
|
|
|
|
|
}
|
2005-07-03 07:15:53 +00:00
|
|
|
|
|
Hooks::run() call site migration
Migrate all callers of Hooks::run() to use the new
HookContainer/HookRunner system.
General principles:
* Use DI if it is already used. We're not changing the way state is
managed in this patch.
* HookContainer is always injected, not HookRunner. HookContainer
is a service, it's a more generic interface, it is the only
thing that provides isRegistered() which is needed in some cases,
and a HookRunner can be efficiently constructed from it
(confirmed by benchmark). Because HookContainer is needed
for object construction, it is also needed by all factories.
* "Ask your friendly local base class". Big hierarchies like
SpecialPage and ApiBase have getHookContainer() and getHookRunner()
methods in the base class, and classes that extend that base class
are not expected to know or care where the base class gets its
HookContainer from.
* ProtectedHookAccessorTrait provides protected getHookContainer() and
getHookRunner() methods, getting them from the global service
container. The point of this is to ease migration to DI by ensuring
that call sites ask their local friendly base class rather than
getting a HookRunner from the service container directly.
* Private $this->hookRunner. In some smaller classes where accessor
methods did not seem warranted, there is a private HookRunner property
which is accessed directly. Very rarely (two cases), there is a
protected property, for consistency with code that conventionally
assumes protected=private, but in cases where the class might actually
be overridden, a protected accessor is preferred over a protected
property.
* The last resort: Hooks::runner(). Mostly for static, file-scope and
global code. In a few cases it was used for objects with broken
construction schemes, out of horror or laziness.
Constructors with new required arguments:
* AuthManager
* BadFileLookup
* BlockManager
* ClassicInterwikiLookup
* ContentHandlerFactory
* ContentSecurityPolicy
* DefaultOptionsManager
* DerivedPageDataUpdater
* FullSearchResultWidget
* HtmlCacheUpdater
* LanguageFactory
* LanguageNameUtils
* LinkRenderer
* LinkRendererFactory
* LocalisationCache
* MagicWordFactory
* MessageCache
* NamespaceInfo
* PageEditStash
* PageHandlerFactory
* PageUpdater
* ParserFactory
* PermissionManager
* RevisionStore
* RevisionStoreFactory
* SearchEngineConfig
* SearchEngineFactory
* SearchFormWidget
* SearchNearMatcher
* SessionBackend
* SpecialPageFactory
* UserNameUtils
* UserOptionsManager
* WatchedItemQueryService
* WatchedItemStore
Constructors with new optional arguments:
* DefaultPreferencesFactory
* Language
* LinkHolderArray
* MovePage
* Parser
* ParserCache
* PasswordReset
* Router
setHookContainer() now required after construction:
* AuthenticationProvider
* ResourceLoaderModule
* SearchEngine
Change-Id: Id442b0dbe43aba84bd5cf801d86dedc768b082c7
2020-03-19 02:42:09 +00:00
|
|
|
|
$holders = new LinkHolderArray(
|
|
|
|
|
|
$this,
|
|
|
|
|
|
$this->getContentLanguageConverter(),
|
|
|
|
|
|
$this->getHookContainer() );
|
2006-11-08 07:12:03 +00:00
|
|
|
|
|
2012-03-19 21:40:39 +00:00
|
|
|
|
# split the entire text string on occurrences of [[
|
2008-08-26 14:37:15 +00:00
|
|
|
|
$a = StringUtils::explode( '[[', ' ' . $s );
|
2010-03-30 21:20:05 +00:00
|
|
|
|
# get the first element (all text up to first [[), and remove the space we added
|
2008-08-26 14:37:15 +00:00
|
|
|
|
$s = $a->current();
|
|
|
|
|
|
$a->next();
|
|
|
|
|
|
$line = $a->current(); # Workaround for broken ArrayIterator::next() that returns "void"
|
2004-05-26 16:29:04 +00:00
|
|
|
|
$s = substr( $s, 1 );
|
|
|
|
|
|
|
2019-10-18 19:50:58 +00:00
|
|
|
|
$nottalk = !$this->getTitle()->isTalkPage();
|
2019-09-15 13:53:14 +00:00
|
|
|
|
|
2012-03-05 05:53:12 +00:00
|
|
|
|
$useLinkPrefixExtension = $this->getTargetLanguage()->linkPrefixExtension();
|
2007-11-13 09:55:45 +00:00
|
|
|
|
$e2 = null;
|
|
|
|
|
|
if ( $useLinkPrefixExtension ) {
|
|
|
|
|
|
# Match the end of a line for a word that's not followed by whitespace,
|
|
|
|
|
|
# e.g. in the case of 'The Arab al[[Razi]]', 'al' will be matched
|
2018-08-03 08:25:15 +00:00
|
|
|
|
$charset = $this->contLang->linkPrefixCharset();
|
Remove linkprefix message, add $linkPrefixCharset
The existing "linkprefix" message is unlikely to be accurately
customized by message translators (as shown by the fact that, of the 10
distinct customizations prior to Iaa7eaa44 (which made them even more
complicated), 3 were broken or entirely ineffective, 1 was half
ineffective, and 2 more seem to have included the Latin-1 Supplement by
accident) or by local wiki admins. So, like linktrail before it, let's
move it out of the system messages and into a separate language
variable.
At the same time, let's make it a simple character set (like
$wgLegalTitleChars) rather than a complicated regular expression. The
complicated regex now lives in the parser.
This also adjusts the output of the API's action=query&meta=siteinfo and
adds an accessor parallel to the linkTrail accessor to Language.
Note the following changes that are not simply extracting the existing
charset from the linkprefix message for $linkPrefixCharset:
* The En message matched all non-ASCII UTF-8 characters by matching the
component bytes (\\x80-\\xff). The new character set is equivalent.
* Various languages were identical to En and so have no $linkPrefixCharset
set. These are: Ary Az Ce Ga Id Ka Kiu Km Ltg Mk Ms Ne Nn Ro Roa_tara Sc Si
Sr_ec Sr_el Tl Tt_cyrl Tt_latn Ug_arab War
* Cu, Uk, and Udm are changed to match any number of „ or « in the prefix.
* Cv tried to include "«" that was redundant to the range \\x80-\\xff
(see En comment). This was removed.
* Diq was entirely bogus, and so was removed.
* Gu included many additional UTF-8 characters that are redundant to the
range \\x80-\\xff (see En comment). These were removed, and the
resulting character set is equivalent to En.
* Mt has been broken since it was introduced in r37242. The charset used is
equivalent to the broken regex.
Bug: 56031
Change-Id: I3369851b33113fc118a1bace38f3ac310cdd9725
2013-10-23 18:06:19 +00:00
|
|
|
|
$e2 = "/^((?>.*[^$charset]|))(.+)$/sDu";
|
2016-02-17 09:09:32 +00:00
|
|
|
|
$m = [];
|
2004-06-02 22:54:01 +00:00
|
|
|
|
if ( preg_match( $e2, $s, $m ) ) {
|
|
|
|
|
|
$first_prefix = $m[2];
|
|
|
|
|
|
} else {
|
|
|
|
|
|
$first_prefix = false;
|
|
|
|
|
|
}
|
2022-03-29 19:28:18 +00:00
|
|
|
|
$prefix = false;
|
2004-05-26 16:29:04 +00:00
|
|
|
|
} else {
|
2022-03-29 19:28:18 +00:00
|
|
|
|
$first_prefix = false;
|
2004-06-02 22:54:01 +00:00
|
|
|
|
$prefix = '';
|
2004-05-26 16:29:04 +00:00
|
|
|
|
}
|
|
|
|
|
|
|
2019-10-29 07:32:44 +00:00
|
|
|
|
# Some namespaces don't allow subpages
|
|
|
|
|
|
$useSubpages = $this->nsInfo->hasSubpages(
|
2019-10-18 19:50:58 +00:00
|
|
|
|
$this->getTitle()->getNamespace()
|
2019-10-29 07:32:44 +00:00
|
|
|
|
);
|
2005-07-03 07:15:53 +00:00
|
|
|
|
|
2004-10-05 03:55:41 +00:00
|
|
|
|
# Loop for each link
|
2013-02-09 21:44:24 +00:00
|
|
|
|
for ( ; $line !== false && $line !== null; $a->next(), $line = $a->current() ) {
|
2008-08-26 14:37:15 +00:00
|
|
|
|
# Check for excessive memory usage
|
|
|
|
|
|
if ( $holders->isBig() ) {
|
|
|
|
|
|
# Too big
|
|
|
|
|
|
# Do the existence check, replace the link holders and clear the array
|
|
|
|
|
|
$holders->replace( $s );
|
|
|
|
|
|
$holders->clear();
|
|
|
|
|
|
}
|
|
|
|
|
|
|
2004-06-02 22:54:01 +00:00
|
|
|
|
if ( $useLinkPrefixExtension ) {
|
2021-10-25 19:15:52 +00:00
|
|
|
|
// @phan-suppress-next-line PhanTypeMismatchArgumentNullableInternal $e2 is set under this condition
|
2004-06-02 22:54:01 +00:00
|
|
|
|
if ( preg_match( $e2, $s, $m ) ) {
|
2022-10-21 04:32:38 +00:00
|
|
|
|
[ , $s, $prefix ] = $m;
|
2004-06-02 22:54:01 +00:00
|
|
|
|
} else {
|
2013-01-26 21:11:09 +00:00
|
|
|
|
$prefix = '';
|
2004-06-02 22:54:01 +00:00
|
|
|
|
}
|
|
|
|
|
|
# first link
|
2010-03-30 21:20:05 +00:00
|
|
|
|
if ( $first_prefix ) {
|
2004-06-02 22:54:01 +00:00
|
|
|
|
$prefix = $first_prefix;
|
|
|
|
|
|
$first_prefix = false;
|
|
|
|
|
|
}
|
2004-06-02 22:39:06 +00:00
|
|
|
|
}
|
2004-07-12 19:49:20 +00:00
|
|
|
|
|
2004-10-05 03:55:41 +00:00
|
|
|
|
$might_be_img = false;
|
2005-07-03 07:15:53 +00:00
|
|
|
|
|
2004-05-26 16:29:04 +00:00
|
|
|
|
if ( preg_match( $e1, $line, $m ) ) { # page with normal text or alt
|
Moving Conrad's recent parser work out to a branch. Reverted r62434, r62416, r62150, r62111, r62085, r62081, r62080, r62077, r62076, r62069, r62049, r62035.
2010-02-19 05:19:32 +00:00
|
|
|
|
$text = $m[2];
|
2005-04-12 06:07:23 +00:00
|
|
|
|
# If we get a ] at the beginning of $m[3] that means we have a link that's something like:
|
2008-05-05 20:50:40 +00:00
|
|
|
|
# [[Image:Foo.jpg|[http://example.com desc]]] <- having three ] in a row fucks up,
|
2005-04-12 06:07:23 +00:00
|
|
|
|
# the real problem is with the $e1 regex
|
2016-12-11 22:45:07 +00:00
|
|
|
|
# See T1500.
|
2005-05-18 09:21:47 +00:00
|
|
|
|
# Still some problems for cases where the ] is meant to be outside punctuation,
|
2016-12-11 22:45:07 +00:00
|
|
|
|
# and no image is in sight. See T4095.
|
2013-12-01 20:39:00 +00:00
|
|
|
|
if ( $text !== ''
|
|
|
|
|
|
&& substr( $m[3], 0, 1 ) === ']'
|
|
|
|
|
|
&& strpos( $text, '[' ) !== false
|
|
|
|
|
|
) {
|
Deprecate Parser implementation methods (will be private in next release)
The following public methods were renamed and made private; the old name
is hard-deprecated and calls the new renamed private method:
Parser::doMagicLinks() => handleMagicLinks()
Parser::doDoubleUnderscore() => handleDoubleUnderscore()
Parser::doHeadings() => handleHeadings()
Parser::doAllQuotes() => handleAllQuotes()
Parser::replaceExternalLinks() => handleExternalLinks()
Parser::replaceInternalLinks() => handleInternalLinks()
Parser::replaceInternalLinks2() => handleInternalLinks2()
Parser::getVariableValue() => expandMagicVariable()
Parser::initialiseVariables() => initializeVariables()
Parser::formatHeadings() => finalizeHeadings()
Parser::test{Pst,Preprocess,Srvus}() => fuzzTest{Pst,Preprocess,Srvus}()
Additionally, the following methods are not used externally, but are
used outside the Parser class by core code. They have been marked
@internal:
Parser::doQuotes() (used by {{#displaytitle}}),
Parser::getExternalLink{Rel,Attribs}() (used by Linker),
Parser::normalizeLinkUrl() (used by Special:LinkSearch and elsewhere).
Parser::{brace,arg,extension}Substitution() (used by PPFrame)
Code search query:
https://codesearch.wmflabs.org/deployed/?q=do%28MagicLinks%7CDoubleUnderscore%7CHeadings%7CAllQuotes%29%7Creplace%28ExternalLinks%7CInternalLinks%28%7C2%29%29%7CgetVariableValue%7CinitialiseVariables%7CformatHeadings%7Ctest%28Pst%7CPreprocess%7CSrvus%29%7CdoQuotes%7CgetExternalLink%28Rel%7CAttribs%29%7CnormalizeLinkUrl%7C%28brace%2Carg%2Cextension%29Substitution&i=nope&files=&repos=
Bug: T236810
Change-Id: I19a43ffc5dcfdd2981b51079c33422c964acb076
2019-10-28 19:52:50 +00:00
|
|
|
|
$text .= ']'; # so that handleExternalLinks($text) works later
|
2006-08-06 14:01:47 +00:00
|
|
|
|
$m[3] = substr( $m[3], 1 );
|
2005-04-12 06:07:23 +00:00
|
|
|
|
}
|
2004-05-26 16:29:04 +00:00
|
|
|
|
# fix up urlencoded title texts
|
2010-03-30 21:20:05 +00:00
|
|
|
|
if ( strpos( $m[1], '%' ) !== false ) {
|
2006-03-24 16:43:57 +00:00
|
|
|
|
# Should anchors '#' also be rejected?
|
2016-02-17 19:57:37 +00:00
|
|
|
|
$m[1] = str_replace( [ '<', '>' ], [ '&lt;', '&gt;' ], rawurldecode( $m[1] ) );
|
2006-08-06 14:01:47 +00:00
|
|
|
|
}
|
2004-05-26 16:29:04 +00:00
|
|
|
|
$trail = $m[3];
|
2014-05-10 23:03:45 +00:00
|
|
|
|
} elseif ( preg_match( $e1_img, $line, $m ) ) {
|
|
|
|
|
|
# Invalid, but might be an image with a link in its caption
|
2004-10-05 03:55:41 +00:00
|
|
|
|
$might_be_img = true;
|
|
|
|
|
|
$text = $m[2];
|
Moving Conrad's recent parser work out to a branch. Reverted r62434, r62416, r62150, r62111, r62085, r62081, r62080, r62077, r62076, r62069, r62049, r62035.
2010-02-19 05:19:32 +00:00
|
|
|
|
if ( strpos( $m[1], '%' ) !== false ) {
|
2016-06-08 02:35:15 +00:00
|
|
|
|
$m[1] = str_replace( [ '<', '>' ], [ '&lt;', '&gt;' ], rawurldecode( $m[1] ) );
|
Moving Conrad's recent parser work out to a branch. Reverted r62434, r62416, r62150, r62111, r62085, r62081, r62080, r62077, r62076, r62069, r62049, r62035.
2010-02-19 05:19:32 +00:00
|
|
|
|
}
|
2004-10-05 03:55:41 +00:00
|
|
|
|
$trail = "";
|
2004-10-05 00:21:52 +00:00
|
|
|
|
} else { # Invalid form; output directly
|
2013-02-09 21:44:24 +00:00
|
|
|
|
$s .= $prefix . '[[' . $line;
|
2004-10-05 00:21:52 +00:00
|
|
|
|
continue;
|
2004-02-26 13:37:26 +00:00
|
|
|
|
}
|
2004-05-26 16:29:04 +00:00
|
|
|
|
|
2022-03-28 20:10:05 +00:00
|
|
|
|
// @phan-suppress-next-line PhanTypePossiblyInvalidDimOffset preg_match success when reached here
|
2016-12-27 09:51:50 +00:00
|
|
|
|
$origLink = ltrim( $m[1], ' ' );
|
2014-07-21 02:35:50 +00:00
|
|
|
|
|
2004-09-24 18:29:01 +00:00
|
|
|
|
# Don't allow internal links to pages containing
|
|
|
|
|
|
# PROTO: where PROTO is a valid URL protocol; these
|
|
|
|
|
|
# should be external links.
|
2022-04-28 13:33:39 +00:00
|
|
|
|
if ( preg_match( '/^(?i:' . $this->urlUtils->validProtocols() . ')/', $origLink ) ) {
|
2013-02-09 21:44:24 +00:00
|
|
|
|
$s .= $prefix . '[[' . $line;
|
2004-09-24 18:29:01 +00:00
|
|
|
|
continue;
|
|
|
|
|
|
}
|
|
|
|
|
|
|
2004-09-25 20:35:38 +00:00
|
|
|
|
# Make subpage if necessary
|
2009-07-03 05:13:58 +00:00
|
|
|
|
if ( $useSubpages ) {
|
2019-10-29 07:32:44 +00:00
|
|
|
|
$link = Linker::normalizeSubpageLink(
|
2019-10-18 19:50:58 +00:00
|
|
|
|
$this->getTitle(), $origLink, $text
|
2019-10-29 07:32:44 +00:00
|
|
|
|
);
|
2004-11-28 03:29:50 +00:00
|
|
|
|
} else {
|
2014-07-21 02:35:50 +00:00
|
|
|
|
$link = $origLink;
|
2004-11-28 03:29:50 +00:00
|
|
|
|
}
|
2004-09-25 20:35:38 +00:00
|
|
|
|
|
2018-03-01 18:59:43 +00:00
|
|
|
|
// \x7f isn't a default legal title char, so most likely strip
|
|
|
|
|
|
// markers will force us into the "invalid form" path above. But,
|
|
|
|
|
|
// just in case, let's assert that xmlish tags aren't valid in
|
|
|
|
|
|
// the title position.
|
|
|
|
|
|
$unstrip = $this->mStripState->killMarkers( $link );
|
|
|
|
|
|
$noMarkers = ( $unstrip === $link );
|
|
|
|
|
|
|
|
|
|
|
|
$nt = $noMarkers ? Title::newFromText( $link ) : null;
|
2009-12-11 21:07:27 +00:00
|
|
|
|
if ( $nt === null ) {
|
2004-06-08 18:11:28 +00:00
|
|
|
|
$s .= $prefix . '[[' . $line;
|
2004-05-26 16:29:04 +00:00
|
|
|
|
continue;
|
2004-02-26 13:37:26 +00:00
|
|
|
|
}
|
2004-10-04 03:47:39 +00:00
|
|
|
|
|
2004-05-26 16:29:04 +00:00
|
|
|
|
$ns = $nt->getNamespace();
|
2014-01-23 09:21:57 +00:00
|
|
|
|
$iw = $nt->getInterwiki();
|
2006-10-17 08:49:27 +00:00
|
|
|
|
|
2017-06-26 23:20:31 +00:00
|
|
|
|
$noforce = ( substr( $origLink, 0, 1 ) !== ':' );
|
|
|
|
|
|
|
2009-07-03 05:13:58 +00:00
|
|
|
|
if ( $might_be_img ) { # if this is actually an invalid link
|
2020-07-22 17:29:48 +00:00
|
|
|
|
if ( $ns === NS_FILE && $noforce ) { # but might be an image
|
2004-10-05 03:55:41 +00:00
|
|
|
|
$found = false;
|
2008-08-26 14:37:15 +00:00
|
|
|
|
while ( true ) {
|
2010-03-30 21:20:05 +00:00
|
|
|
|
# look at the next 'line' to see if we can close it there
|
2008-08-26 14:37:15 +00:00
|
|
|
|
$a->next();
|
|
|
|
|
|
$next_line = $a->current();
|
|
|
|
|
|
if ( $next_line === false || $next_line === null ) {
|
|
|
|
|
|
break;
|
|
|
|
|
|
}
|
2006-08-06 14:01:47 +00:00
|
|
|
|
$m = explode( ']]', $next_line, 3 );
|
|
|
|
|
|
if ( count( $m ) == 3 ) {
|
|
|
|
|
|
# the first ]] closes the inner link, the second the image
|
2004-10-05 03:55:41 +00:00
|
|
|
|
$found = true;
|
2006-08-06 14:01:47 +00:00
|
|
|
|
$text .= "[[{$m[0]}]]{$m[1]}";
|
2004-10-05 03:55:41 +00:00
|
|
|
|
$trail = $m[2];
|
|
|
|
|
|
break;
|
2006-08-06 14:01:47 +00:00
|
|
|
|
} elseif ( count( $m ) == 2 ) {
|
2010-03-30 21:20:05 +00:00
|
|
|
|
# if there's exactly one ]] that's fine, we'll keep looking
|
2006-08-06 14:01:47 +00:00
|
|
|
|
$text .= "[[{$m[0]}]]{$m[1]}";
|
2004-10-05 03:55:41 +00:00
|
|
|
|
} else {
|
2010-03-30 21:20:05 +00:00
|
|
|
|
# if $next_line is invalid too, we need look no further
|
2004-10-05 03:55:41 +00:00
|
|
|
|
$text .= '[[' . $next_line;
|
|
|
|
|
|
break;
|
|
|
|
|
|
}
|
|
|
|
|
|
}
|
|
|
|
|
|
if ( !$found ) {
|
|
|
|
|
|
# we couldn't find the end of this imageLink, so output it raw
|
2010-03-30 21:20:05 +00:00
|
|
|
|
# but don't ignore what might be perfectly normal links in the text we've examined
|
2019-10-28 19:52:50 +00:00
|
|
|
|
$holders->merge( $this->handleInternalLinks2( $text ) );
|
2006-08-06 14:01:47 +00:00
|
|
|
|
$s .= "{$prefix}[[$link|$text";
|
2004-10-05 03:55:41 +00:00
|
|
|
|
# note: no $trail, because without an end, there *is* no trail
|
|
|
|
|
|
continue;
|
|
|
|
|
|
}
|
2010-03-30 21:20:05 +00:00
|
|
|
|
} else { # it's not an image, so output it raw
|
2006-08-06 14:01:47 +00:00
|
|
|
|
$s .= "{$prefix}[[$link|$text";
|
2004-10-05 03:55:41 +00:00
|
|
|
|
# note: no $trail, because without an end, there *is* no trail
|
|
|
|
|
|
continue;
|
|
|
|
|
|
}
|
|
|
|
|
|
}
|
|
|
|
|
|
|
2013-03-07 16:50:43 +00:00
|
|
|
|
$wasblank = ( $text == '' );
|
2010-03-30 21:20:05 +00:00
|
|
|
|
if ( $wasblank ) {
|
|
|
|
|
|
$text = $link;
|
2017-06-26 23:20:31 +00:00
|
|
|
|
if ( !$noforce ) {
|
|
|
|
|
|
# Strip off leading ':'
|
|
|
|
|
|
$text = substr( $text, 1 );
|
|
|
|
|
|
}
|
2010-06-23 23:29:54 +00:00
|
|
|
|
} else {
|
2016-12-11 22:45:07 +00:00
|
|
|
|
# T6598 madness. Handle the quotes only if they come from the alternate part
|
2010-12-14 09:01:48 +00:00
|
|
|
|
# [[Lista d''e paise d''o munno]] -> <a href="...">Lista d''e paise d''o munno</a>
|
2010-12-19 04:31:15 +00:00
|
|
|
|
# [[Criticism of Harry Potter|Criticism of ''Harry Potter'']]
|
2010-12-14 09:01:48 +00:00
|
|
|
|
# -> <a href="Criticism of Harry Potter">Criticism of <i>Harry Potter</i></a>
|
|
|
|
|
|
$text = $this->doQuotes( $text );
|
2010-03-30 21:20:05 +00:00
|
|
|
|
}
|
2004-10-05 03:55:41 +00:00
|
|
|
|
|
2004-08-16 20:01:21 +00:00
|
|
|
|
# Link not escaped by : , create the various objects
|
2014-08-01 08:17:49 +00:00
|
|
|
|
if ( $noforce && !$nt->wasLocalInterwiki() ) {
|
2004-08-16 20:01:21 +00:00
|
|
|
|
# Interwikis
|
2014-06-20 01:29:05 +00:00
|
|
|
|
if (
|
|
|
|
|
|
$iw && $this->mOptions->getInterwikiMagic() && $nottalk && (
|
2020-01-03 23:03:14 +00:00
|
|
|
|
MediaWikiServices::getInstance()->getLanguageNameUtils()
|
|
|
|
|
|
->getLanguageName(
|
|
|
|
|
|
$iw,
|
|
|
|
|
|
LanguageNameUtils::AUTONYMS,
|
|
|
|
|
|
LanguageNameUtils::DEFINED
|
|
|
|
|
|
)
|
2022-04-26 15:48:03 +00:00
|
|
|
|
|| in_array( $iw, $this->svcOptions->get( MainConfigNames::ExtraInterlanguageLinkPrefixes ) )
|
2014-08-01 08:17:49 +00:00
|
|
|
|
)
|
2014-05-10 23:03:45 +00:00
|
|
|
|
) {
|
2016-12-11 22:45:07 +00:00
|
|
|
|
# T26502: filter duplicates
|
2012-09-26 07:42:17 +00:00
|
|
|
|
if ( !isset( $this->mLangLinkLanguages[$iw] ) ) {
|
|
|
|
|
|
$this->mLangLinkLanguages[$iw] = true;
|
|
|
|
|
|
$this->mOutput->addLanguageLink( $nt->getFullText() );
|
|
|
|
|
|
}
|
2012-10-08 11:58:54 +00:00
|
|
|
|
|
2017-08-13 18:19:13 +00:00
|
|
|
|
/**
|
|
|
|
|
|
* Strip the whitespace interwiki links produce, see T10897
|
|
|
|
|
|
*/
|
2017-09-09 03:44:42 +00:00
|
|
|
|
$s = rtrim( $s . $prefix ) . $trail; # T175416
|
2004-05-26 16:29:04 +00:00
|
|
|
|
continue;
|
|
|
|
|
|
}
|
2005-07-03 07:15:53 +00:00
|
|
|
|
|
2020-07-22 17:29:48 +00:00
|
|
|
|
if ( $ns === NS_FILE ) {
|
2019-10-18 19:50:58 +00:00
|
|
|
|
if ( !$this->badFileLookup->isBadFile( $nt->getDBkey(), $this->getTitle() ) ) {
|
2009-07-03 05:13:58 +00:00
|
|
|
|
if ( $wasblank ) {
|
|
|
|
|
|
# if no parameters were passed, $text
|
|
|
|
|
|
# becomes something like "File:Foo.png",
|
|
|
|
|
|
# which we don't want to pass on to the
|
|
|
|
|
|
# image generator
|
|
|
|
|
|
$text = '';
|
|
|
|
|
|
} else {
|
|
|
|
|
|
# recursively parse links inside the image caption
|
|
|
|
|
|
# actually, this will parse them in any other parameters, too,
|
|
|
|
|
|
# but it might be hard to fix that, and it doesn't matter ATM
|
2019-10-28 19:52:50 +00:00
|
|
|
|
$text = $this->handleExternalLinks( $text );
|
|
|
|
|
|
$holders->merge( $this->handleInternalLinks2( $text ) );
|
2009-07-03 05:13:58 +00:00
|
|
|
|
}
|
2019-10-28 19:52:50 +00:00
|
|
|
|
# cloak any absolute URLs inside the image markup, so handleExternalLinks() won't touch them
|
2019-11-04 19:23:34 +00:00
|
|
|
|
$s .= $prefix . $this->armorLinks(
|
2011-03-23 03:13:37 +00:00
|
|
|
|
$this->makeImage( $nt, $text, $holders ) ) . $trail;
|
2015-01-08 17:53:33 +00:00
|
|
|
|
continue;
|
2005-04-12 04:03:21 +00:00
|
|
|
|
}
|
2020-07-22 17:29:48 +00:00
|
|
|
|
} elseif ( $ns === NS_CATEGORY ) {
|
2017-08-13 18:19:13 +00:00
|
|
|
|
/**
|
|
|
|
|
|
* Strip the whitespace Category links produce, see T2087
|
|
|
|
|
|
*/
|
2017-09-09 03:44:42 +00:00
|
|
|
|
$s = rtrim( $s . $prefix ) . $trail; # T2087, T87753
|
2004-05-26 16:29:04 +00:00
|
|
|
|
|
2004-09-07 22:08:01 +00:00
|
|
|
|
if ( $wasblank ) {
|
2022-02-16 18:54:01 +00:00
|
|
|
|
$sortkey = $this->mOutput->getPageProperty( 'defaultsort' ) ?? '';
|
2004-09-07 22:08:01 +00:00
|
|
|
|
} else {
|
|
|
|
|
|
$sortkey = $text;
|
|
|
|
|
|
}
|
2005-12-05 08:19:52 +00:00
|
|
|
|
$sortkey = Sanitizer::decodeCharReferences( $sortkey );
|
2006-06-23 09:20:44 +00:00
|
|
|
|
$sortkey = str_replace( "\n", '', $sortkey );
|
2020-01-23 18:39:23 +00:00
|
|
|
|
$sortkey = $this->getTargetLanguageConverter()->convertCategoryKey( $sortkey );
|
2005-12-30 09:33:11 +00:00
|
|
|
|
$this->mOutput->addCategory( $nt->getDBkey(), $sortkey );
|
2005-07-03 07:15:53 +00:00
|
|
|
|
|
2004-05-26 16:29:04 +00:00
|
|
|
|
continue;
|
|
|
|
|
|
}
|
2004-02-26 13:37:26 +00:00
|
|
|
|
}
|
2004-10-08 04:27:07 +00:00
|
|
|
|
|
2013-10-15 13:42:48 +00:00
|
|
|
|
# Self-link checking. For some languages, variants of the title are checked in
|
|
|
|
|
|
# LinkHolderArray::doVariants() to allow batching the existence checks necessary
|
|
|
|
|
|
# for linking to a different variant.
|
2020-07-22 17:29:48 +00:00
|
|
|
|
if ( $ns !== NS_SPECIAL && $nt->equals( $this->getTitle() ) && !$nt->hasFragment() ) {
|
2013-10-15 13:42:48 +00:00
|
|
|
|
$s .= $prefix . Linker::makeSelfLinkObj( $nt, $text, '', $trail );
|
|
|
|
|
|
continue;
|
2004-04-20 21:08:24 +00:00
|
|
|
|
}
|
2004-03-20 12:10:25 +00:00
|
|
|
|
|
2008-12-13 04:14:40 +00:00
|
|
|
|
# NS_MEDIA is a pseudo-namespace for linking directly to a file
|
2011-05-17 22:03:20 +00:00
|
|
|
|
# @todo FIXME: Should do batch file existence checks, see comment below
|
2020-07-22 17:29:48 +00:00
|
|
|
|
if ( $ns === NS_MEDIA ) {
|
2008-04-19 21:29:19 +00:00
|
|
|
|
# Give extensions a chance to select the file revision for us
|
2016-02-17 19:57:37 +00:00
|
|
|
|
$options = [];
|
2011-09-06 18:11:53 +00:00
|
|
|
|
$descQuery = false;
|
Hooks::run() call site migration
Migrate all callers of Hooks::run() to use the new
HookContainer/HookRunner system.
General principles:
* Use DI if it is already used. We're not changing the way state is
managed in this patch.
* HookContainer is always injected, not HookRunner. HookContainer
is a service, it's a more generic interface, it is the only
thing that provides isRegistered() which is needed in some cases,
and a HookRunner can be efficiently constructed from it
(confirmed by benchmark). Because HookContainer is needed
for object construction, it is also needed by all factories.
* "Ask your friendly local base class". Big hierarchies like
SpecialPage and ApiBase have getHookContainer() and getHookRunner()
methods in the base class, and classes that extend that base class
are not expected to know or care where the base class gets its
HookContainer from.
* ProtectedHookAccessorTrait provides protected getHookContainer() and
getHookRunner() methods, getting them from the global service
container. The point of this is to ease migration to DI by ensuring
that call sites ask their local friendly base class rather than
getting a HookRunner from the service container directly.
* Private $this->hookRunner. In some smaller classes where accessor
methods did not seem warranted, there is a private HookRunner property
which is accessed directly. Very rarely (two cases), there is a
protected property, for consistency with code that conventionally
assumes protected=private, but in cases where the class might actually
be overridden, a protected accessor is preferred over a protected
property.
* The last resort: Hooks::runner(). Mostly for static, file-scope and
global code. In a few cases it was used for objects with broken
construction schemes, out of horror or laziness.
Constructors with new required arguments:
* AuthManager
* BadFileLookup
* BlockManager
* ClassicInterwikiLookup
* ContentHandlerFactory
* ContentSecurityPolicy
* DefaultOptionsManager
* DerivedPageDataUpdater
* FullSearchResultWidget
* HtmlCacheUpdater
* LanguageFactory
* LanguageNameUtils
* LinkRenderer
* LinkRendererFactory
* LocalisationCache
* MagicWordFactory
* MessageCache
* NamespaceInfo
* PageEditStash
* PageHandlerFactory
* PageUpdater
* ParserFactory
* PermissionManager
* RevisionStore
* RevisionStoreFactory
* SearchEngineConfig
* SearchEngineFactory
* SearchFormWidget
* SearchNearMatcher
* SessionBackend
* SpecialPageFactory
* UserNameUtils
* UserOptionsManager
* WatchedItemQueryService
* WatchedItemStore
Constructors with new optional arguments:
* DefaultPreferencesFactory
* Language
* LinkHolderArray
* MovePage
* Parser
* ParserCache
* PasswordReset
* Router
setHookContainer() now required after construction:
* AuthenticationProvider
* ResourceLoaderModule
* SearchEngine
Change-Id: Id442b0dbe43aba84bd5cf801d86dedc768b082c7
2020-03-19 02:42:09 +00:00
|
|
|
|
$this->hookRunner->onBeforeParserFetchFileAndTitle(
|
2022-03-16 23:34:23 +00:00
|
|
|
|
// @phan-suppress-next-line PhanTypeMismatchArgument Type mismatch on pass-by-ref args
|
2022-03-01 22:39:53 +00:00
|
|
|
|
$this, $nt, $options, $descQuery
|
|
|
|
|
|
);
|
2011-03-24 01:44:48 +00:00
|
|
|
|
# Fetch and register the file (file title may be different via hooks)
|
2022-10-21 04:32:38 +00:00
|
|
|
|
[ $file, $nt ] = $this->fetchFileAndTitle( $nt, $options );
|
2019-10-28 19:52:50 +00:00
|
|
|
|
# Cloak with NOPARSE to avoid replacement in handleExternalLinks
|
2019-11-04 19:23:34 +00:00
|
|
|
|
$s .= $prefix . $this->armorLinks(
|
2011-04-03 11:44:11 +00:00
|
|
|
|
Linker::makeMediaLinkFile( $nt, $file, $text ) ) . $trail;
|
2004-06-05 02:22:16 +00:00
|
|
|
|
continue;
|
2004-05-26 16:29:04 +00:00
|
|
|
|
}
|
2008-12-13 04:14:40 +00:00
|
|
|
|
|
|
|
|
|
|
# Some titles, such as valid special pages or files in foreign repos, should
|
|
|
|
|
|
# be shown as bluelinks even though they're not included in the page table
|
2011-05-17 22:03:20 +00:00
|
|
|
|
# @todo FIXME: isAlwaysKnown() can be expensive for file links; we should really do
|
2008-12-13 04:14:40 +00:00
|
|
|
|
# batch file existence checks for NS_FILE and NS_MEDIA
|
2010-04-18 00:39:12 +00:00
|
|
|
|
if ( $iw == '' && $nt->isAlwaysKnown() ) {
|
2009-01-01 00:05:08 +00:00
|
|
|
|
$this->mOutput->addLink( $nt );
|
2019-11-04 19:23:34 +00:00
|
|
|
|
$s .= $this->makeKnownLinkHolder( $nt, $text, $trail, $prefix );
|
2008-12-13 04:14:40 +00:00
|
|
|
|
} else {
|
2009-01-01 00:05:08 +00:00
|
|
|
|
# Links will be added to the output link list after checking
|
2020-06-18 12:25:45 +00:00
|
|
|
|
$s .= $holders->makeHolder( $nt, $text, $trail, $prefix );
|
2008-12-13 04:14:40 +00:00
|
|
|
|
}
|
2004-02-28 23:38:08 +00:00
|
|
|
|
}
|
2008-08-26 14:37:15 +00:00
|
|
|
|
return $holders;
|
2004-02-26 13:37:26 +00:00
|
|
|
|
}
|
|
|
|
|
|
|
2019-10-29 07:34:25 +00:00
|
|
|
|
/**
|
|
|
|
|
|
* Render a forced-blue link inline; protect against double expansion of
|
|
|
|
|
|
* URLs if we're in a mode that prepends full URL prefixes to internal links.
|
|
|
|
|
|
* Since this little disaster has to split off the trail text to avoid
|
|
|
|
|
|
* breaking URLs in the following text without breaking trails on the
|
|
|
|
|
|
* wiki links, it's been made into a horrible function.
|
|
|
|
|
|
*
|
2021-04-25 17:29:33 +00:00
|
|
|
|
* @param LinkTarget $nt
|
2019-10-29 07:34:25 +00:00
|
|
|
|
* @param string $text
|
|
|
|
|
|
* @param string $trail
|
|
|
|
|
|
* @param string $prefix
|
|
|
|
|
|
* @return string HTML-wikitext mix oh yuck
|
|
|
|
|
|
*/
|
2021-04-25 17:29:33 +00:00
|
|
|
|
private function makeKnownLinkHolder( LinkTarget $nt, $text = '', $trail = '', $prefix = '' ) {
|
2022-10-21 04:32:38 +00:00
|
|
|
|
[ $inside, $trail ] = Linker::splitTrail( $trail );
|
2011-03-13 14:00:38 +00:00
|
|
|
|
|
|
|
|
|
|
if ( $text == '' ) {
|
2021-04-25 17:29:33 +00:00
|
|
|
|
$text = htmlspecialchars( $this->titleFormatter->getPrefixedText( $nt ) );
|
2011-03-13 14:00:38 +00:00
|
|
|
|
}
|
|
|
|
|
|
|
2016-05-13 00:37:17 +00:00
|
|
|
|
$link = $this->getLinkRenderer()->makeKnownLink(
|
2016-05-26 22:00:49 +00:00
|
|
|
|
$nt, new HtmlArmor( "$prefix$text$inside" )
|
2016-05-13 00:37:17 +00:00
|
|
|
|
);
|
2011-03-13 14:00:38 +00:00
|
|
|
|
|
2019-11-04 19:23:34 +00:00
|
|
|
|
return $this->armorLinks( $link ) . $trail;
|
2019-10-29 07:34:25 +00:00
|
|
|
|
}
|
|
|
|
|
|
|
|
|
|
|
|
/**
|
|
|
|
|
|
* Insert a NOPARSE hacky thing into any inline links in a chunk that's
|
|
|
|
|
|
* going to go through further parsing steps before inline URL expansion.
|
|
|
|
|
|
*
|
|
|
|
|
|
* Not needed quite as much as it used to be since free links are a bit
|
|
|
|
|
|
* more sensible these days. But bracketed links are still an issue.
|
|
|
|
|
|
*
|
|
|
|
|
|
* @param string $text More-or-less HTML
|
|
|
|
|
|
* @return string Less-or-more HTML with NOPARSE bits
|
|
|
|
|
|
*/
|
2019-11-04 19:23:34 +00:00
|
|
|
|
private function armorLinks( $text ) {
|
2022-04-28 13:33:39 +00:00
|
|
|
|
return preg_replace( '/\b((?i)' . $this->urlUtils->validProtocols() . ')/',
|
Use a fixed marker prefix string in the Parser and MWTidy
Generating one-time, unique strip markers hurts us in multiple ways:
* The strip marker regexes don't benefit from JIT compilation, so they are
slower to execute than they could be.
* Although the regexes don't benefit from JIT compilation, they are still
compiled, because HHVM bets on regexes getting reused. This extra work is
fairly costly (1-2% of CPU usage on the app servers) and doesn't pay off.
* The size of the PCRE JIT cache is finite, and the caching of one-off regexes
displaces from the cache regexes which are in fact reused.
Tim's preferred solution (per his review comment on
https://gerrit.wikimedia.org/r/167530/) is to use fixed strip markers.
So:
* Replace usage of $parser->mUniqPrefix with Parser::MARKER_PREFIX, which
complements the existing Parser::MARKER_SUFFIX.
* Deprecate Parser::mUniqPrefix and its accessor, Parser::uniqPrefix().
* Deprecate Parser::getRandomString(), since it is no longer useful.
* In Preprocessor_*:preprocessToObj() and Parser::fetchTemplateAndTitle,
replace any occurrences of \x7f with '?', to prevent strip marker forgery.
\x7f is not valid input anyway.
* Deprecate the $prefix parameter for StripState::__construct, since a custom
prefix may no longer be specified.
Change-Id: I31d4556bbb07acb72c33fda335fa5a230379a03f
2015-05-26 20:48:33 +00:00
|
|
|
|
self::MARKER_PREFIX . "NOPARSE$1", $text );
|
2005-12-28 22:58:54 +00:00
|
|
|
|
}
|
2005-04-27 07:48:14 +00:00
|
|
|
|
|
2004-09-21 05:49:12 +00:00
|
|
|
|
/**
|
2008-08-26 14:37:15 +00:00
|
|
|
|
* Make lists from lines starting with ':', '*', '#', etc. (DBL)
|
2004-09-21 05:49:12 +00:00
|
|
|
|
*
|
2014-04-08 15:29:17 +00:00
|
|
|
|
* @param string $text
|
|
|
|
|
|
* @param bool $linestart Whether or not this is at the start of a line.
|
2020-06-26 12:14:23 +00:00
|
|
|
|
* @internal
|
2014-04-08 15:29:17 +00:00
|
|
|
|
* @return string The lists rendered as HTML
|
2020-01-25 15:45:36 +00:00
|
|
|
|
* @deprecated since 1.35, will not be supported in future parsers
|
2004-09-21 05:49:12 +00:00
|
|
|
|
*/
|
2014-08-11 20:24:54 +00:00
|
|
|
|
public function doBlockLevels( $text, $linestart ) {
|
2020-01-25 15:45:36 +00:00
|
|
|
|
wfDeprecated( __METHOD__, '1.35' );
|
2016-05-06 02:56:37 +00:00
|
|
|
|
return BlockLevelPass::doBlockLevels( $text, $linestart );
|
2004-09-27 21:01:39 +00:00
|
|
|
|
}
|
|
|
|
|
|
|
Deprecate Parser implementation methods (will be private in next release)
The following public methods were renamed and made private; the old name
is hard-deprecated and calls the new renamed private method:
Parser::doMagicLinks() => handleMagicLinks()
Parser::doDoubleUnderscore() => handleMagicLinks()
Parser::doHeadings() => handleHeadings()
Parser::doAllQuotes() => handleAllQuotes()
Parser::replaceExternalLinks() => handleExternalLinks()
Parser::replaceInternalLinks() => handleInternalLinks()
Parser::replaceInternalLinks2() => handleInternalLinks2()
Parser::getVariableValue() => expandMagicVariable()
Parser::initialiseVariables() => initializeVariables()
Parser::formatHeadings() => finalizeHeadings()
Parser::test{Pst,Preprocess,Srvus}() => fuzzTest{Pst,Preprocess,Srvus}()
Additionally, the following methods are not used externally, but are
used outside the Parser class by core code. They have been marked
@internal:
Parser::doQuotes() (used by {{#displaytitle}}),
Parser::getExternalLink{Rel,Attribs}() (used by Linker),
Parser::normalizeLinkUrl() (used by Special:LinkSearch and elsewhere).
Parser::{brace,arg,extension}Substitution() (used by PPFrame)
Code search query:
https://codesearch.wmflabs.org/deployed/?q=do%28MagicLinks%7CDoubleUnderscore%7CHeadings%7CAllQuotes%29%7Creplace%28ExternalLinks%7CInternalLinks%28%7C2%29%29%7CgetVariableValue%7CinitialiseVariables%7CformatHeadings%7Ctest%28Pst%7CPreprocess%7CSrvus%29%7CdoQuotes%7CgetExternalLink%28Rel%7CAttribs%29%7CnormalizeLinkUrl%7C%28brace%2Carg%2Cextension%29Substitution&i=nope&files=&repos=
Bug: T236810
Change-Id: I19a43ffc5dcfdd2981b51079c33422c964acb076
2019-10-28 19:52:50 +00:00
|
|
|
|
/**
|
|
|
|
|
|
* Return value of a magic variable (like PAGENAME)
|
|
|
|
|
|
*
|
|
|
|
|
|
* @param string $index Magic variable identifier as mapped in MagicWordFactory::$mVariableIDs
|
2022-07-31 00:02:18 +00:00
|
|
|
|
* @param PPFrame|false $frame
|
2019-10-28 19:52:50 +00:00
|
|
|
|
*
|
|
|
|
|
|
* @return string
|
|
|
|
|
|
*/
|
|
|
|
|
|
private function expandMagicVariable( $index, $frame = false ) {
|
2004-11-21 14:07:24 +00:00
|
|
|
|
/**
|
|
|
|
|
|
* Some of these require message or data lookups and can be
|
|
|
|
|
|
* expensive to check many times.
|
|
|
|
|
|
*/
|
2022-08-12 17:09:39 +00:00
|
|
|
|
if ( isset( $this->mVarCache[$index] ) ) {
|
2019-03-29 20:12:24 +00:00
|
|
|
|
return $this->mVarCache[$index];
|
2006-10-17 08:49:27 +00:00
|
|
|
|
}
|
2005-11-26 23:04:05 +00:00
|
|
|
|
|
2022-03-11 16:21:33 +00:00
|
|
|
|
$ts = new MWTimestamp( $this->mOptions->getTimestamp() /* TS_MW */ );
|
|
|
|
|
|
if ( $this->hookContainer->isRegistered( 'ParserGetVariableValueTs' ) ) {
|
|
|
|
|
|
$s = $ts->getTimestamp( TS_UNIX );
|
|
|
|
|
|
$this->hookRunner->onParserGetVariableValueTs( $this, $s );
|
|
|
|
|
|
$ts = new MWTimestamp( $s );
|
|
|
|
|
|
}
|
2005-07-03 07:15:53 +00:00
|
|
|
|
|
2020-06-05 02:54:51 +00:00
|
|
|
|
$value = CoreMagicVariables::expand(
|
Hooks::run() call site migration
Migrate all callers of Hooks::run() to use the new
HookContainer/HookRunner system.
General principles:
* Use DI if it is already used. We're not changing the way state is
managed in this patch.
* HookContainer is always injected, not HookRunner. HookContainer
is a service, it's a more generic interface, it is the only
thing that provides isRegistered() which is needed in some cases,
and a HookRunner can be efficiently constructed from it
(confirmed by benchmark). Because HookContainer is needed
for object construction, it is also needed by all factories.
* "Ask your friendly local base class". Big hierarchies like
SpecialPage and ApiBase have getHookContainer() and getHookRunner()
methods in the base class, and classes that extend that base class
are not expected to know or care where the base class gets its
HookContainer from.
* ProtectedHookAccessorTrait provides protected getHookContainer() and
getHookRunner() methods, getting them from the global service
container. The point of this is to ease migration to DI by ensuring
that call sites ask their local friendly base class rather than
getting a HookRunner from the service container directly.
* Private $this->hookRunner. In some smaller classes where accessor
methods did not seem warranted, there is a private HookRunner property
which is accessed directly. Very rarely (two cases), there is a
protected property, for consistency with code that conventionally
assumes protected=private, but in cases where the class might actually
be overridden, a protected accessor is preferred over a protected
property.
* The last resort: Hooks::runner(). Mostly for static, file-scope and
global code. In a few cases it was used for objects with broken
construction schemes, out of horror or laziness.
Constructors with new required arguments:
* AuthManager
* BadFileLookup
* BlockManager
* ClassicInterwikiLookup
* ContentHandlerFactory
* ContentSecurityPolicy
* DefaultOptionsManager
* DerivedPageDataUpdater
* FullSearchResultWidget
* HtmlCacheUpdater
* LanguageFactory
* LanguageNameUtils
* LinkRenderer
* LinkRendererFactory
* LocalisationCache
* MagicWordFactory
* MessageCache
* NamespaceInfo
* PageEditStash
* PageHandlerFactory
* PageUpdater
* ParserFactory
* PermissionManager
* RevisionStore
* RevisionStoreFactory
* SearchEngineConfig
* SearchEngineFactory
* SearchFormWidget
* SearchNearMatcher
* SessionBackend
* SpecialPageFactory
* UserNameUtils
* UserOptionsManager
* WatchedItemQueryService
* WatchedItemStore
Constructors with new optional arguments:
* DefaultPreferencesFactory
* Language
* LinkHolderArray
* MovePage
* Parser
* ParserCache
* PasswordReset
* Router
setHookContainer() now required after construction:
* AuthenticationProvider
* ResourceLoaderModule
* SearchEngine
Change-Id: Id442b0dbe43aba84bd5cf801d86dedc768b082c7
2020-03-19 02:42:09 +00:00
|
|
|
|
$this, $index, $ts, $this->nsInfo, $this->svcOptions, $this->logger
|
2020-03-26 19:49:58 +00:00
|
|
|
|
);
|
2011-08-16 19:29:52 +00:00
|
|
|
|
|
2020-03-26 19:49:58 +00:00
|
|
|
|
if ( $value === null ) {
|
|
|
|
|
|
// Not a defined core magic word
|
2022-08-12 17:13:20 +00:00
|
|
|
|
// Don't give this hook unrestricted access to mVarCache
|
|
|
|
|
|
$fakeCache = [];
|
2022-03-01 22:39:53 +00:00
|
|
|
|
$this->hookRunner->onParserGetVariableValueSwitch(
|
2022-08-12 17:13:20 +00:00
|
|
|
|
// @phan-suppress-next-line PhanTypeMismatchArgument $value is passed as null but returned as string
|
|
|
|
|
|
$this, $fakeCache, $index, $value, $frame
|
2022-03-01 22:39:53 +00:00
|
|
|
|
);
|
2022-08-12 17:13:20 +00:00
|
|
|
|
// Cache the value returned by the hook by falling through here.
|
|
|
|
|
|
// Assert that the hook returned a non-null value for this MV
|
|
|
|
|
|
'@phan-var string $value';
|
2004-03-20 15:03:26 +00:00
|
|
|
|
}
|
2009-10-02 09:46:17 +00:00
|
|
|
|
|
2020-03-26 19:49:58 +00:00
|
|
|
|
$this->mVarCache[$index] = $value;
|
2009-10-02 09:46:17 +00:00
|
|
|
|
|
|
|
|
|
|
return $value;
|
2004-03-20 15:03:26 +00:00
|
|
|
|
}
|
|
|
|
|
|
|
2019-10-28 19:52:50 +00:00
|
|
|
|
/**
|
|
|
|
|
|
* Initialize the magic variables (like CURRENTMONTHNAME) and
|
|
|
|
|
|
* substitution modifiers.
|
|
|
|
|
|
*/
|
|
|
|
|
|
private function initializeVariables() {
|
2018-07-25 11:55:18 +00:00
|
|
|
|
$variableIDs = $this->magicWordFactory->getVariableIDs();
|
|
|
|
|
|
$substIDs = $this->magicWordFactory->getSubstIDs();
|
2006-07-03 11:07:00 +00:00
|
|
|
|
|
2018-07-25 12:22:00 +00:00
|
|
|
|
$this->mVariables = $this->magicWordFactory->newArray( $variableIDs );
|
|
|
|
|
|
$this->mSubstWords = $this->magicWordFactory->newArray( $substIDs );
|
2004-03-20 15:03:26 +00:00
|
|
|
|
}
|
|
|
|
|
|
|
2005-10-19 06:24:30 +00:00
|
|
|
|
/**
|
2020-04-07 23:52:41 +00:00
|
|
|
|
* Get the document object model for the given wikitext
|
2005-10-19 06:24:30 +00:00
|
|
|
|
*
|
2020-04-07 23:52:41 +00:00
|
|
|
|
* @see Preprocessor::preprocessToObj()
|
2008-01-05 12:39:12 +00:00
|
|
|
|
*
|
2008-01-19 09:03:45 +00:00
|
|
|
|
* The generated DOM tree must depend only on the input text and the flags.
|
2020-04-07 23:52:41 +00:00
|
|
|
|
* The DOM tree must be the same in OT_HTML and OT_WIKI mode, to avoid a
|
|
|
|
|
|
* regression of T6899.
|
2008-01-05 12:39:12 +00:00
|
|
|
|
*
|
2008-04-14 07:45:50 +00:00
|
|
|
|
* Any flag added to the $flags parameter here, or any other parameter liable to cause a
|
|
|
|
|
|
* change in the DOM tree for a given text, must be passed through the section identifier
|
|
|
|
|
|
* in the section edit link and thus back to extractSections().
|
2008-01-05 12:39:12 +00:00
|
|
|
|
*
|
2020-04-07 23:52:41 +00:00
|
|
|
|
* @param string $text Wikitext
|
|
|
|
|
|
* @param int $flags Bit field of Preprocessor::DOM_* constants
|
2011-05-01 23:54:41 +00:00
|
|
|
|
* @return PPNode
|
2021-02-19 22:49:35 +00:00
|
|
|
|
* @since 1.23 method is public
|
2005-10-19 06:24:30 +00:00
|
|
|
|
*/
|
2014-08-11 20:24:54 +00:00
|
|
|
|
public function preprocessToDom( $text, $flags = 0 ) {
|
2020-04-07 23:52:41 +00:00
|
|
|
|
return $this->getPreprocessor()->preprocessToObj( $text, $flags );
|
2007-11-20 10:55:08 +00:00
|
|
|
|
}
|
|
|
|
|
|
|
2004-09-21 05:49:12 +00:00
|
|
|
|
/**
|
|
|
|
|
|
* Replace magic variables, templates, and template arguments
|
|
|
|
|
|
* with the appropriate text. Templates are substituted recursively,
|
|
|
|
|
|
* taking care to avoid infinite loops.
|
|
|
|
|
|
*
|
|
|
|
|
|
* Note that the substitution depends on the value of $mOutputType:
|
2008-01-22 10:47:44 +00:00
|
|
|
|
* self::OT_WIKI: only {{subst:}} templates
|
|
|
|
|
|
* self::OT_PREPROCESS: templates but not extension tags
|
|
|
|
|
|
* self::OT_HTML: all templates and extension tags
|
2005-07-03 07:15:53 +00:00
|
|
|
|
*
|
2014-04-21 23:38:39 +00:00
|
|
|
|
* @param string $text The text to transform
|
2019-06-03 16:08:04 +00:00
|
|
|
|
* @param false|PPFrame|array $frame Object describing the arguments passed to the
|
2014-05-10 23:07:20 +00:00
|
|
|
|
* template. Arguments may also be provided as an associative array, as
|
|
|
|
|
|
* was the usual case before MW1.12. Providing arguments this way may be
|
|
|
|
|
|
* useful for extensions wishing to perform variable replacement
|
|
|
|
|
|
* explicitly.
|
|
|
|
|
|
* @param bool $argsOnly Only do argument (triple-brace) expansion, not
|
|
|
|
|
|
* double-brace expansion.
|
2011-05-01 23:54:41 +00:00
|
|
|
|
* @return string
|
2021-02-19 22:49:35 +00:00
|
|
|
|
* @since 1.24 method is public
|
2004-09-21 05:49:12 +00:00
|
|
|
|
*/
|
2014-05-10 23:07:20 +00:00
|
|
|
|
public function replaceVariables( $text, $frame = false, $argsOnly = false ) {
|
2009-03-02 02:06:01 +00:00
|
|
|
|
# Is there any text? Also, prevent overly large inclusions!
|
2015-09-09 01:48:33 +00:00
|
|
|
|
$textSize = strlen( $text );
|
|
|
|
|
|
if ( $textSize < 1 || $textSize > $this->mOptions->getMaxIncludeSize() ) {
|
2004-11-21 14:07:24 +00:00
|
|
|
|
return $text;
|
|
|
|
|
|
}
|
2004-04-12 23:59:37 +00:00
|
|
|
|
|
2007-11-20 10:55:08 +00:00
|
|
|
|
if ( $frame === false ) {
|
2008-01-21 16:36:08 +00:00
|
|
|
|
$frame = $this->getPreprocessor()->newFrame();
|
2007-11-20 10:55:08 +00:00
|
|
|
|
} elseif ( !( $frame instanceof PPFrame ) ) {
|
2019-06-27 03:35:50 +00:00
|
|
|
|
$this->logger->debug(
|
|
|
|
|
|
__METHOD__ . " called using plain parameters instead of " .
|
|
|
|
|
|
"a PPFrame instance. Creating custom frame."
|
|
|
|
|
|
);
|
2010-03-30 21:20:05 +00:00
|
|
|
|
$frame = $this->getPreprocessor()->newCustomFrame( $frame );
|
2004-05-23 03:39:24 +00:00
|
|
|
|
}
|
2004-04-12 23:59:37 +00:00
|
|
|
|
|
2007-11-20 10:55:08 +00:00
|
|
|
|
$dom = $this->preprocessToDom( $text );
|
|
|
|
|
|
$flags = $argsOnly ? PPFrame::NO_TEMPLATES : 0;
|
2007-12-01 07:13:31 +00:00
|
|
|
|
$text = $frame->expand( $dom, $flags );
|
2006-07-11 17:40:11 +00:00
|
|
|
|
|
2004-09-21 23:30:46 +00:00
|
|
|
|
return $text;
|
|
|
|
|
|
}
|
|
|
|
|
|
|
2008-05-19 21:33:47 +00:00
|
|
|
|
/**
|
|
|
|
|
|
* Warn the user when a parser limitation is reached
|
|
|
|
|
|
* Will warn the user at most once per limitation type
|
|
|
|
|
|
*
|
2015-09-28 22:39:31 +00:00
|
|
|
|
* The results are shown during preview and run through the Parser (see EditPage.php)
|
|
|
|
|
|
*
|
2014-04-21 23:38:39 +00:00
|
|
|
|
* @param string $limitationType Should be one of:
|
2010-05-15 10:35:54 +00:00
|
|
|
|
* 'expensive-parserfunction' (corresponding messages:
|
|
|
|
|
|
* 'expensive-parserfunction-warning',
|
2010-03-30 21:20:05 +00:00
|
|
|
|
* 'expensive-parserfunction-category')
|
2010-05-15 10:35:54 +00:00
|
|
|
|
* 'post-expand-template-argument' (corresponding messages:
|
|
|
|
|
|
* 'post-expand-template-argument-warning',
|
2010-03-30 21:20:05 +00:00
|
|
|
|
* 'post-expand-template-argument-category')
|
2010-05-15 10:35:54 +00:00
|
|
|
|
* 'post-expand-template-inclusion' (corresponding messages:
|
|
|
|
|
|
* 'post-expand-template-inclusion-warning',
|
2010-03-30 21:20:05 +00:00
|
|
|
|
* 'post-expand-template-inclusion-category')
|
2013-08-16 14:01:46 +00:00
|
|
|
|
* 'node-count-exceeded' (corresponding messages:
|
|
|
|
|
|
* 'node-count-exceeded-warning',
|
|
|
|
|
|
* 'node-count-exceeded-category')
|
|
|
|
|
|
* 'expansion-depth-exceeded' (corresponding messages:
|
|
|
|
|
|
* 'expansion-depth-exceeded-warning',
|
|
|
|
|
|
* 'expansion-depth-exceeded-category')
|
2014-05-10 23:05:51 +00:00
|
|
|
|
* @param string|int|null $current Current value
|
|
|
|
|
|
* @param string|int|null $max Maximum allowed, when an explicit limit has been
|
2018-05-19 20:46:54 +00:00
|
|
|
|
* exceeded, provide the values (optional)
|
2020-01-25 15:45:59 +00:00
|
|
|
|
* @internal
|
2008-05-19 21:33:47 +00:00
|
|
|
|
*/
|
2014-08-11 20:24:54 +00:00
|
|
|
|
public function limitationWarn( $limitationType, $current = '', $max = '' ) {
|
2010-03-30 21:53:56 +00:00
|
|
|
|
# It does no harm if $current and $max are present but unnecessary for the message
|
2015-09-28 22:39:31 +00:00
|
|
|
|
# Not doing ->inLanguage( $this->mOptions->getUserLangObj() ), since this is shown
|
|
|
|
|
|
# only during preview, and that would split the parser cache unnecessarily.
|
2021-10-15 19:42:40 +00:00
|
|
|
|
$this->mOutput->addWarningMsg(
|
|
|
|
|
|
"$limitationType-warning",
|
|
|
|
|
|
Message::numParam( $current ),
|
|
|
|
|
|
Message::numParam( $max )
|
|
|
|
|
|
);
|
2009-10-11 12:52:08 +00:00
|
|
|
|
$this->addTrackingCategory( "$limitationType-category" );
|
2008-05-19 21:33:47 +00:00
|
|
|
|
}
|
|
|
|
|
|
|
2004-09-21 05:49:12 +00:00
|
|
|
|
/**
|
|
|
|
|
|
* Return the text of a template, after recursively
|
|
|
|
|
|
* replacing any variables or templates within the template.
|
|
|
|
|
|
*
|
2014-04-08 15:29:17 +00:00
|
|
|
|
* @param array $piece The parts of the template
|
2014-05-10 23:05:51 +00:00
|
|
|
|
* $piece['title']: the title, i.e. the part before the |
|
|
|
|
|
|
* $piece['parts']: the parameter array
|
|
|
|
|
|
* $piece['lineStart']: whether the brace was at the start of a line
|
2014-04-08 15:29:17 +00:00
|
|
|
|
* @param PPFrame $frame The current frame, contains template arguments
|
2014-05-10 23:05:51 +00:00
|
|
|
|
* @throws Exception
|
2019-06-03 16:08:04 +00:00
|
|
|
|
* @return string|array The text of the template
|
2019-10-28 19:52:50 +00:00
|
|
|
|
* @internal
|
2004-09-21 05:49:12 +00:00
|
|
|
|
*/
|
2019-08-27 09:23:52 +00:00
|
|
|
|
public function braceSubstitution( array $piece, PPFrame $frame ) {
|
2014-05-10 23:03:45 +00:00
|
|
|
|
// Flags
|
|
|
|
|
|
|
|
|
|
|
|
// $text has been filled
|
|
|
|
|
|
$found = false;
|
2022-03-29 19:28:18 +00:00
|
|
|
|
$text = '';
|
2014-05-10 23:03:45 +00:00
|
|
|
|
// wiki markup in $text should be escaped
|
|
|
|
|
|
$nowiki = false;
|
|
|
|
|
|
// $text is HTML, armour it against wikitext transformation
|
|
|
|
|
|
$isHTML = false;
|
|
|
|
|
|
// Force interwiki transclusion to be done in raw mode, not rendered
|
|
|
|
|
|
$forceRawInterwiki = false;
|
|
|
|
|
|
// $text is a DOM node needing expansion in a child frame
|
|
|
|
|
|
$isChildObj = false;
|
|
|
|
|
|
// $text is a DOM node needing expansion in the current frame
|
|
|
|
|
|
$isLocalObj = false;
|
2004-07-12 19:49:20 +00:00
|
|
|
|
|
2006-02-01 04:41:53 +00:00
|
|
|
|
# Title object, where $text came from
|
2011-11-08 20:58:57 +00:00
|
|
|
|
$title = false;
|
2004-04-12 23:59:37 +00:00
|
|
|
|
|
2008-04-14 07:45:50 +00:00
|
|
|
|
# $part1 is the bit before the first |, and must contain only title characters.
|
|
|
|
|
|
# Various prefixes will be stripped from it later.
|
2007-11-20 10:55:08 +00:00
|
|
|
|
$titleWithSpaces = $frame->expand( $piece['title'] );
|
|
|
|
|
|
$part1 = trim( $titleWithSpaces );
|
|
|
|
|
|
$titleText = false;
|
2004-09-25 05:16:38 +00:00
|
|
|
|
|
2007-11-20 10:55:08 +00:00
|
|
|
|
# Original title text preserved for various purposes
|
|
|
|
|
|
$originalTitle = $part1;
|
2006-10-17 08:49:27 +00:00
|
|
|
|
|
2007-11-20 10:55:08 +00:00
|
|
|
|
# $args is a list of argument nodes, starting from index 0, not including $part1
|
2014-05-10 23:03:45 +00:00
|
|
|
|
# @todo FIXME: If $piece['parts'] is null then the call to getLength()
|
|
|
|
|
|
# below won't work b/c this $args isn't an object
|
2018-06-30 09:43:00 +00:00
|
|
|
|
$args = ( $piece['parts'] == null ) ? [] : $piece['parts'];
|
2011-10-27 01:19:34 +00:00
|
|
|
|
|
2014-11-12 20:28:32 +00:00
|
|
|
|
$profileSection = null; // profile templates
|
2011-06-17 16:05:05 +00:00
|
|
|
|
|
2020-07-29 18:59:27 +00:00
|
|
|
|
$sawDeprecatedTemplateEquals = false; // T91154
|
|
|
|
|
|
|
2004-03-20 15:03:26 +00:00
|
|
|
|
# SUBST
|
2020-11-20 13:31:40 +00:00
|
|
|
|
// @phan-suppress-next-line PhanImpossibleCondition
|
2004-05-23 03:39:24 +00:00
|
|
|
|
if ( !$found ) {
|
2010-02-15 09:34:51 +00:00
|
|
|
|
$substMatch = $this->mSubstWords->matchStartAndRemove( $part1 );
|
2010-01-30 11:58:19 +00:00
|
|
|
|
|
2010-01-30 12:46:16 +00:00
|
|
|
|
# Possibilities for substMatch: "subst", "safesubst" or FALSE
|
2010-02-15 09:34:51 +00:00
|
|
|
|
# Decide whether to expand template or keep wikitext as-is.
|
2010-02-22 07:02:12 +00:00
|
|
|
|
if ( $this->ot['wiki'] ) {
|
2010-02-15 09:34:51 +00:00
|
|
|
|
if ( $substMatch === false ) {
|
|
|
|
|
|
$literal = true; # literal when in PST with no prefix
|
|
|
|
|
|
} else {
|
|
|
|
|
|
$literal = false; # expand when in PST with subst: or safesubst:
|
|
|
|
|
|
}
|
|
|
|
|
|
} else {
|
|
|
|
|
|
if ( $substMatch == 'subst' ) {
|
|
|
|
|
|
$literal = true; # literal when not in PST with plain subst:
|
|
|
|
|
|
} else {
|
|
|
|
|
|
$literal = false; # expand when not in PST with safesubst: or no prefix
|
|
|
|
|
|
}
|
|
|
|
|
|
}
|
|
|
|
|
|
if ( $literal ) {
|
2008-01-21 16:36:08 +00:00
|
|
|
|
$text = $frame->virtualBracketedImplode( '{{', '|', '}}', $titleWithSpaces, $args );
|
|
|
|
|
|
$isLocalObj = true;
|
2007-11-20 10:55:08 +00:00
|
|
|
|
$found = true;
|
|
|
|
|
|
}
|
|
|
|
|
|
}
|
|
|
|
|
|
|
|
|
|
|
|
# Variables
|
2008-01-21 16:36:08 +00:00
|
|
|
|
if ( !$found && $args->getLength() == 0 ) {
|
2007-11-20 10:55:08 +00:00
|
|
|
|
$id = $this->mVariables->matchStartToEnd( $part1 );
|
|
|
|
|
|
if ( $id !== false ) {
|
2022-08-12 17:22:23 +00:00
|
|
|
|
if ( strpos( $part1, ':' ) !== false ) {
|
|
|
|
|
|
wfDeprecatedMsg(
|
|
|
|
|
|
'Registering a magic variable with a name including a colon',
|
|
|
|
|
|
'1.39', false, false
|
|
|
|
|
|
);
|
|
|
|
|
|
}
|
2019-10-28 19:52:50 +00:00
|
|
|
|
$text = $this->expandMagicVariable( $id, $frame );
|
2018-07-25 11:55:18 +00:00
|
|
|
|
if ( $this->magicWordFactory->getCacheTTL( $id ) > -1 ) {
|
|
|
|
|
|
$this->mOutput->updateCacheExpiry(
|
|
|
|
|
|
$this->magicWordFactory->getCacheTTL( $id ) );
|
2010-03-30 21:20:05 +00:00
|
|
|
|
}
|
2004-03-20 15:03:26 +00:00
|
|
|
|
$found = true;
|
|
|
|
|
|
}
|
2004-02-26 13:37:26 +00:00
|
|
|
|
}
|
2004-04-12 23:59:37 +00:00
|
|
|
|
|
2006-09-30 04:53:36 +00:00
|
|
|
|
# MSG, MSGNW and RAW
|
2004-03-20 15:03:26 +00:00
|
|
|
|
if ( !$found ) {
|
|
|
|
|
|
# Check for MSGNW:
|
2018-07-25 11:55:18 +00:00
|
|
|
|
$mwMsgnw = $this->magicWordFactory->get( 'msgnw' );
|
2004-04-11 16:46:06 +00:00
|
|
|
|
if ( $mwMsgnw->matchStartAndRemove( $part1 ) ) {
|
2004-03-20 15:03:26 +00:00
|
|
|
|
$nowiki = true;
|
2005-06-12 13:39:28 +00:00
|
|
|
|
} else {
|
|
|
|
|
|
# Remove obsolete MSG:
|
2018-07-25 11:55:18 +00:00
|
|
|
|
$mwMsg = $this->magicWordFactory->get( 'msg' );
|
2005-06-12 13:39:28 +00:00
|
|
|
|
$mwMsg->matchStartAndRemove( $part1 );
|
2004-03-20 15:03:26 +00:00
|
|
|
|
}
|
2011-07-30 15:56:54 +00:00
|
|
|
|
|
|
|
|
|
|
# Check for RAW:
|
2018-07-25 11:55:18 +00:00
|
|
|
|
$mwRaw = $this->magicWordFactory->get( 'raw' );
|
2011-07-30 15:56:54 +00:00
|
|
|
|
if ( $mwRaw->matchStartAndRemove( $part1 ) ) {
|
|
|
|
|
|
$forceRawInterwiki = true;
|
|
|
|
|
|
}
|
2004-03-20 15:03:26 +00:00
|
|
|
|
}
|
2004-04-12 23:59:37 +00:00
|
|
|
|
|
2006-07-03 11:07:00 +00:00
|
|
|
|
# Parser functions
|
2004-04-05 10:38:40 +00:00
|
|
|
|
if ( !$found ) {
|
2006-07-03 11:07:00 +00:00
|
|
|
|
$colonPos = strpos( $part1, ':' );
|
|
|
|
|
|
if ( $colonPos !== false ) {
|
2013-03-04 03:35:05 +00:00
|
|
|
|
$func = substr( $part1, 0, $colonPos );
|
2016-02-17 19:57:37 +00:00
|
|
|
|
$funcArgs = [ trim( substr( $part1, $colonPos + 1 ) ) ];
|
|
|
|
|
|
$argsLength = $args->getLength();
|
|
|
|
|
|
for ( $i = 0; $i < $argsLength; $i++ ) {
|
2013-03-04 03:35:05 +00:00
|
|
|
|
$funcArgs[] = $args->item( $i );
|
2006-04-30 18:02:03 +00:00
|
|
|
|
}
|
2018-05-27 01:14:51 +00:00
|
|
|
|
|
|
|
|
|
|
$result = $this->callParserFunction( $frame, $func, $funcArgs );
|
2013-03-04 03:35:05 +00:00
|
|
|
|
|
2017-12-07 21:16:47 +00:00
|
|
|
|
// Extract any forwarded flags
|
2017-12-30 10:32:06 +00:00
|
|
|
|
if ( isset( $result['title'] ) ) {
|
|
|
|
|
|
$title = $result['title'];
|
|
|
|
|
|
}
|
2017-12-07 21:16:47 +00:00
|
|
|
|
if ( isset( $result['found'] ) ) {
|
|
|
|
|
|
$found = $result['found'];
|
|
|
|
|
|
}
|
|
|
|
|
|
if ( array_key_exists( 'text', $result ) ) {
|
|
|
|
|
|
// a string or null
|
|
|
|
|
|
$text = $result['text'];
|
|
|
|
|
|
}
|
|
|
|
|
|
if ( isset( $result['nowiki'] ) ) {
|
|
|
|
|
|
$nowiki = $result['nowiki'];
|
|
|
|
|
|
}
|
|
|
|
|
|
if ( isset( $result['isHTML'] ) ) {
|
|
|
|
|
|
$isHTML = $result['isHTML'];
|
|
|
|
|
|
}
|
|
|
|
|
|
if ( isset( $result['forceRawInterwiki'] ) ) {
|
|
|
|
|
|
$forceRawInterwiki = $result['forceRawInterwiki'];
|
|
|
|
|
|
}
|
|
|
|
|
|
if ( isset( $result['isChildObj'] ) ) {
|
|
|
|
|
|
$isChildObj = $result['isChildObj'];
|
|
|
|
|
|
}
|
|
|
|
|
|
if ( isset( $result['isLocalObj'] ) ) {
|
|
|
|
|
|
$isLocalObj = $result['isLocalObj'];
|
|
|
|
|
|
}
|
2006-04-05 09:40:25 +00:00
|
|
|
|
}
|
|
|
|
|
|
}
|
|
|
|
|
|
|
2007-11-20 10:55:08 +00:00
|
|
|
|
# Finish mangling title and then check for loops.
|
|
|
|
|
|
# Set $title to a Title object and $titleText to the PDBK
|
2004-03-20 15:03:26 +00:00
|
|
|
|
if ( !$found ) {
|
2004-09-25 20:13:14 +00:00
|
|
|
|
$ns = NS_TEMPLATE;
|
2007-11-20 10:55:08 +00:00
|
|
|
|
# Split the title into page and subpage
|
2006-03-24 16:44:52 +00:00
|
|
|
|
$subpage = '';
|
2019-10-29 07:32:44 +00:00
|
|
|
|
$relative = Linker::normalizeSubpageLink(
|
2019-10-18 19:50:58 +00:00
|
|
|
|
$this->getTitle(), $part1, $subpage
|
2019-10-29 07:32:44 +00:00
|
|
|
|
);
|
2013-05-30 12:23:33 +00:00
|
|
|
|
if ( $part1 !== $relative ) {
|
|
|
|
|
|
$part1 = $relative;
|
2019-10-18 19:50:58 +00:00
|
|
|
|
$ns = $this->getTitle()->getNamespace();
|
2004-09-25 20:13:14 +00:00
|
|
|
|
}
|
|
|
|
|
|
$title = Title::newFromText( $part1, $ns );
|
2007-11-20 10:55:08 +00:00
|
|
|
|
if ( $title ) {
|
2006-08-17 22:20:06 +00:00
|
|
|
|
$titleText = $title->getPrefixedText();
|
2006-06-17 08:55:44 +00:00
|
|
|
|
# Check for language variants if the template is not found
|
2020-01-23 18:39:23 +00:00
|
|
|
|
if ( $this->getTargetLanguageConverter()->hasVariants() && $title->getArticleID() == 0 ) {
|
|
|
|
|
|
$this->getTargetLanguageConverter()->findVariantLink( $part1, $title, true );
|
2006-06-17 08:55:44 +00:00
|
|
|
|
}
|
2008-01-11 03:25:41 +00:00
|
|
|
|
# Do recursion depth check
|
|
|
|
|
|
$limit = $this->mOptions->getMaxTemplateDepth();
|
|
|
|
|
|
if ( $frame->depth >= $limit ) {
|
|
|
|
|
|
$found = true;
|
2010-05-15 10:35:54 +00:00
|
|
|
|
$text = '<span class="error">'
|
2012-08-29 08:07:10 +00:00
|
|
|
|
. wfMessage( 'parser-template-recursion-depth-warning' )
|
|
|
|
|
|
->numParams( $limit )->inContentLanguage()->text()
|
2010-03-30 21:20:05 +00:00
|
|
|
|
. '</span>';
|
2008-01-11 03:25:41 +00:00
|
|
|
|
}
|
2007-11-20 10:55:08 +00:00
|
|
|
|
}
|
|
|
|
|
|
}
|
2006-06-17 08:55:44 +00:00
|
|
|
|
|
2007-11-20 10:55:08 +00:00
|
|
|
|
# Load from database
|
|
|
|
|
|
if ( !$found && $title ) {
|
2014-11-12 20:28:32 +00:00
|
|
|
|
$profileSection = $this->mProfiler->scopedProfileIn( $title->getPrefixedDBkey() );
|
2007-11-20 10:55:08 +00:00
|
|
|
|
if ( !$title->isExternal() ) {
|
2011-11-02 20:55:08 +00:00
|
|
|
|
if ( $title->isSpecialPage()
|
2010-05-15 10:35:54 +00:00
|
|
|
|
&& $this->mOptions->getAllowSpecialInclusion()
|
2013-12-01 20:39:00 +00:00
|
|
|
|
&& $this->ot['html']
|
|
|
|
|
|
) {
|
2018-08-15 01:11:59 +00:00
|
|
|
|
$specialPage = $this->specialPageFactory->getPage( $title->getDBkey() );
|
2011-09-29 15:14:34 +00:00
|
|
|
|
// Pass the template arguments as URL parameters.
|
|
|
|
|
|
// "uselang" will have no effect since the Language object
|
|
|
|
|
|
// is forced to the one defined in ParserOptions.
|
2016-02-17 19:57:37 +00:00
|
|
|
|
$pageArgs = [];
|
2014-05-10 23:03:45 +00:00
|
|
|
|
$argsLength = $args->getLength();
|
|
|
|
|
|
for ( $i = 0; $i < $argsLength; $i++ ) {
|
2011-08-02 15:40:03 +00:00
|
|
|
|
$bits = $args->item( $i )->splitArg();
|
|
|
|
|
|
if ( strval( $bits['index'] ) === '' ) {
|
|
|
|
|
|
$name = trim( $frame->expand( $bits['name'], PPFrame::STRIP_COMMENTS ) );
|
|
|
|
|
|
$value = trim( $frame->expand( $bits['value'] ) );
|
|
|
|
|
|
$pageArgs[$name] = $value;
|
|
|
|
|
|
}
|
|
|
|
|
|
}
|
2011-09-29 15:14:34 +00:00
|
|
|
|
|
|
|
|
|
|
// Create a new context to execute the special page
|
2011-08-02 15:40:03 +00:00
|
|
|
|
$context = new RequestContext;
|
|
|
|
|
|
$context->setTitle( $title );
|
|
|
|
|
|
$context->setRequest( new FauxRequest( $pageArgs ) );
|
2014-06-20 16:18:29 +00:00
|
|
|
|
if ( $specialPage && $specialPage->maxIncludeCacheTime() === 0 ) {
|
2021-07-28 14:08:59 +00:00
|
|
|
|
$context->setUser( $this->userFactory->newFromUserIdentity( $this->getUserIdentity() ) );
|
2014-06-20 16:18:29 +00:00
|
|
|
|
} else {
|
|
|
|
|
|
// If this page is cached, the output must not vary per user.
|
|
|
|
|
|
$context->setUser( User::newFromName( '127.0.0.1', false ) );
|
|
|
|
|
|
}
|
2011-11-23 10:28:21 +00:00
|
|
|
|
$context->setLanguage( $this->mOptions->getUserLangObj() );
|
2018-08-15 01:11:59 +00:00
|
|
|
|
$ret = $this->specialPageFactory->capturePath( $title, $context, $this->getLinkRenderer() );
|
2011-08-02 15:40:03 +00:00
|
|
|
|
if ( $ret ) {
|
|
|
|
|
|
$text = $context->getOutput()->getHTML();
|
2011-08-02 16:31:22 +00:00
|
|
|
|
$this->mOutput->addOutputPageMetadata( $context->getOutput() );
|
2006-01-31 03:44:08 +00:00
|
|
|
|
$found = true;
|
2007-11-20 10:55:08 +00:00
|
|
|
|
$isHTML = true;
|
2014-06-20 16:18:29 +00:00
|
|
|
|
if ( $specialPage && $specialPage->maxIncludeCacheTime() !== false ) {
|
2016-08-30 19:35:08 +00:00
|
|
|
|
$this->mOutput->updateRuntimeAdaptiveExpiry(
|
|
|
|
|
|
$specialPage->maxIncludeCacheTime()
|
|
|
|
|
|
);
|
2014-06-20 16:18:29 +00:00
|
|
|
|
}
|
2007-11-20 10:55:08 +00:00
|
|
|
|
}
|
2018-08-05 12:50:01 +00:00
|
|
|
|
} elseif ( $this->nsInfo->isNonincludable( $title->getNamespace() ) ) {
|
2010-03-30 21:53:56 +00:00
|
|
|
|
$found = false; # access denied
|
2019-06-27 03:35:50 +00:00
|
|
|
|
$this->logger->debug(
|
|
|
|
|
|
__METHOD__ .
|
|
|
|
|
|
": template inclusion denied for " . $title->getPrefixedDBkey()
|
|
|
|
|
|
);
|
2007-11-20 10:55:08 +00:00
|
|
|
|
} else {
|
2022-10-21 04:32:38 +00:00
|
|
|
|
[ $text, $title ] = $this->getTemplateDom( $title );
|
2007-11-20 10:55:08 +00:00
|
|
|
|
if ( $text !== false ) {
|
|
|
|
|
|
$found = true;
|
2008-01-21 16:36:08 +00:00
|
|
|
|
$isChildObj = true;
|
2020-07-29 18:59:27 +00:00
|
|
|
|
if (
|
|
|
|
|
|
$title->getNamespace() === NS_TEMPLATE &&
|
|
|
|
|
|
$title->getDBkey() === '=' &&
|
|
|
|
|
|
$originalTitle === '='
|
|
|
|
|
|
) {
|
|
|
|
|
|
// Note that we won't get here if `=` is evaluated
|
|
|
|
|
|
// (in the future) as a parser function, nor if
|
|
|
|
|
|
// the Template namespace is given explicitly,
|
|
|
|
|
|
// i.e. `{{Template:=}}`. Only `{{=}}` triggers this.
|
|
|
|
|
|
$sawDeprecatedTemplateEquals = true; // T91154
|
|
|
|
|
|
}
|
2006-01-31 03:44:08 +00:00
|
|
|
|
}
|
2005-02-05 07:14:25 +00:00
|
|
|
|
}
|
2006-07-11 17:40:11 +00:00
|
|
|
|
|
2007-11-20 10:55:08 +00:00
|
|
|
|
# If the title is valid but undisplayable, make a link to it
|
|
|
|
|
|
if ( !$found && ( $this->ot['html'] || $this->ot['pre'] ) ) {
|
|
|
|
|
|
$text = "[[:$titleText]]";
|
|
|
|
|
|
$found = true;
|
2006-02-01 04:41:53 +00:00
|
|
|
|
}
|
2011-09-29 22:08:00 +00:00
|
|
|
|
} elseif ( $title->isTrans() ) {
|
|
|
|
|
|
# Interwiki transclusion
|
|
|
|
|
|
if ( $this->ot['html'] && !$forceRawInterwiki ) {
|
|
|
|
|
|
$text = $this->interwikiTransclude( $title, 'render' );
|
|
|
|
|
|
$isHTML = true;
|
|
|
|
|
|
} else {
|
|
|
|
|
|
$text = $this->interwikiTransclude( $title, 'raw' );
|
2010-03-30 21:53:56 +00:00
|
|
|
|
# Preprocess it like a template
|
2020-04-07 23:52:41 +00:00
|
|
|
|
$text = $this->preprocessToDom( $text, Preprocessor::DOM_FOR_INCLUSION );
|
2008-01-21 16:36:08 +00:00
|
|
|
|
$isChildObj = true;
|
2007-11-20 10:55:08 +00:00
|
|
|
|
}
|
2011-09-29 22:08:00 +00:00
|
|
|
|
$found = true;
|
2004-03-20 15:03:26 +00:00
|
|
|
|
}
|
2009-04-05 15:01:25 +00:00
|
|
|
|
|
|
|
|
|
|
# Do infinite loop check
|
|
|
|
|
|
# This has to be done after redirect resolution to avoid infinite loops via redirects
|
|
|
|
|
|
if ( !$frame->loopCheck( $title ) ) {
|
|
|
|
|
|
$found = true;
|
2012-08-29 08:07:10 +00:00
|
|
|
|
$text = '<span class="error">'
|
|
|
|
|
|
. wfMessage( 'parser-template-loop-warning', $titleText )->inContentLanguage()->text()
|
|
|
|
|
|
. '</span>';
|
2017-03-17 11:36:45 +00:00
|
|
|
|
$this->addTrackingCategory( 'template-loop-category' );
|
2021-10-15 19:42:40 +00:00
|
|
|
|
$this->mOutput->addWarningMsg(
|
|
|
|
|
|
'template-loop-warning',
|
|
|
|
|
|
Message::plaintextParam( $titleText )
|
|
|
|
|
|
);
|
2019-06-27 03:35:50 +00:00
|
|
|
|
$this->logger->debug( __METHOD__ . ": template loop broken at '$titleText'" );
|
2009-04-05 15:01:25 +00:00
|
|
|
|
}
|
2004-03-20 15:03:26 +00:00
|
|
|
|
}
|
2004-04-12 23:59:37 +00:00
|
|
|
|
|
2008-01-03 05:37:32 +00:00
|
|
|
|
# If we haven't found text to substitute by now, we're done
|
|
|
|
|
|
# Recover the source wikitext and return it
|
|
|
|
|
|
if ( !$found ) {
|
2008-01-21 16:36:08 +00:00
|
|
|
|
$text = $frame->virtualBracketedImplode( '{{', '|', '}}', $titleWithSpaces, $args );
|
2014-11-12 20:28:32 +00:00
|
|
|
|
if ( $profileSection ) {
|
|
|
|
|
|
$this->mProfiler->scopedProfileOut( $profileSection );
|
2011-10-27 01:19:34 +00:00
|
|
|
|
}
|
2016-02-17 19:57:37 +00:00
|
|
|
|
return [ 'object' => $text ];
|
2008-01-03 05:37:32 +00:00
|
|
|
|
}
|
2004-04-12 23:59:37 +00:00
|
|
|
|
|
2008-01-03 05:37:32 +00:00
|
|
|
|
# Expand DOM-style return values in a child frame
|
2008-01-21 16:36:08 +00:00
|
|
|
|
if ( $isChildObj ) {
|
2008-01-03 05:37:32 +00:00
|
|
|
|
# Create a child frame that carries the template arguments
|
|
|
|
|
|
$newFrame = $frame->newChild( $args, $title );
|
2007-11-20 10:55:08 +00:00
|
|
|
|
|
2008-01-24 04:29:56 +00:00
|
|
|
|
if ( $nowiki ) {
|
|
|
|
|
|
$text = $newFrame->expand( $text, PPFrame::RECOVER_ORIG );
|
|
|
|
|
|
} elseif ( $titleText !== false && $newFrame->isEmpty() ) {
|
2008-01-03 05:37:32 +00:00
|
|
|
|
# Expansion is eligible for the empty-frame cache
|
2014-05-29 00:54:55 +00:00
|
|
|
|
$text = $newFrame->cachedExpand( $titleText, $text );
|
2008-01-03 05:37:32 +00:00
|
|
|
|
} else {
|
2008-01-11 03:25:41 +00:00
|
|
|
|
# Uncached expansion
|
2008-01-03 05:37:32 +00:00
|
|
|
|
$text = $newFrame->expand( $text );
|
2004-09-25 16:06:10 +00:00
|
|
|
|
}
|
|
|
|
|
|
}
|
2008-01-24 04:29:56 +00:00
|
|
|
|
if ( $isLocalObj && $nowiki ) {
|
|
|
|
|
|
$text = $frame->expand( $text, PPFrame::RECOVER_ORIG );
|
|
|
|
|
|
$isLocalObj = false;
|
|
|
|
|
|
}
|
2007-11-20 10:55:08 +00:00
|
|
|
|
|
2014-11-12 20:28:32 +00:00
|
|
|
|
if ( $profileSection ) {
|
|
|
|
|
|
$this->mProfiler->scopedProfileOut( $profileSection );
|
2011-10-27 01:19:34 +00:00
|
|
|
|
}
|
2020-07-29 18:59:27 +00:00
|
|
|
|
if (
|
|
|
|
|
|
$sawDeprecatedTemplateEquals &&
|
|
|
|
|
|
$this->mStripState->unstripBoth( $text ) !== '='
|
|
|
|
|
|
) {
|
|
|
|
|
|
// T91154: {{=}} is deprecated when it doesn't expand to `=`;
|
|
|
|
|
|
// use {{Template:=}} if you must.
|
|
|
|
|
|
$this->addTrackingCategory( 'template-equals-category' );
|
2021-10-15 19:42:40 +00:00
|
|
|
|
$this->mOutput->addWarningMsg( 'template-equals-warning' );
|
2020-07-29 18:59:27 +00:00
|
|
|
|
}
|
2011-10-27 01:19:34 +00:00
|
|
|
|
|
2008-01-03 05:37:32 +00:00
|
|
|
|
# Replace raw HTML by a placeholder
|
|
|
|
|
|
if ( $isHTML ) {
|
2012-05-12 21:25:35 +00:00
|
|
|
|
$text = $this->insertStripItem( $text );
|
2010-03-30 21:20:05 +00:00
|
|
|
|
} elseif ( $nowiki && ( $this->ot['html'] || $this->ot['pre'] ) ) {
|
|
|
|
|
|
# Escape nowiki-style return values
|
2008-01-03 05:37:32 +00:00
|
|
|
|
$text = wfEscapeWikiText( $text );
|
2010-03-30 21:20:05 +00:00
|
|
|
|
} elseif ( is_string( $text )
|
2010-05-15 10:35:54 +00:00
|
|
|
|
&& !$piece['lineStart']
|
2013-12-01 20:39:00 +00:00
|
|
|
|
&& preg_match( '/^(?:{\\||:|;|#|\*)/', $text )
|
|
|
|
|
|
) {
|
2016-12-11 22:45:07 +00:00
|
|
|
|
# T2529: if the template begins with a table or block-level
|
2011-01-26 01:16:18 +00:00
|
|
|
|
# element, it should be treated as beginning a new line.
|
2013-03-04 08:44:38 +00:00
|
|
|
|
# This behavior is somewhat controversial.
|
2008-01-03 05:37:32 +00:00
|
|
|
|
$text = "\n" . $text;
|
|
|
|
|
|
}
|
2008-04-14 07:45:50 +00:00
|
|
|
|
|
2008-01-21 16:36:08 +00:00
|
|
|
|
if ( is_string( $text ) && !$this->incrementIncludeSize( 'post-expand', strlen( $text ) ) ) {
|
2006-08-10 21:28:49 +00:00
|
|
|
|
# Error, oversize inclusion
|
2010-05-19 00:03:54 +00:00
|
|
|
|
if ( $titleText !== false ) {
|
2016-12-11 22:45:07 +00:00
|
|
|
|
# Make a working, properly escaped link if possible (T25588)
|
2010-05-19 00:03:54 +00:00
|
|
|
|
$text = "[[:$titleText]]";
|
|
|
|
|
|
} else {
|
|
|
|
|
|
# This will probably not be a working link, but at least it may
|
|
|
|
|
|
# provide some hint of where the problem is
|
2021-08-26 19:24:53 +00:00
|
|
|
|
$originalTitle = preg_replace( '/^:/', '', $originalTitle );
|
2010-05-19 00:03:54 +00:00
|
|
|
|
$text = "[[:$originalTitle]]";
|
|
|
|
|
|
}
|
2014-05-10 23:03:45 +00:00
|
|
|
|
$text .= $this->insertStripItem( '<!-- WARNING: template omitted, '
|
|
|
|
|
|
. 'post-expand include size too large -->' );
|
2008-05-19 21:33:47 +00:00
|
|
|
|
$this->limitationWarn( 'post-expand-template-inclusion' );
|
2006-08-10 21:28:49 +00:00
|
|
|
|
}
|
|
|
|
|
|
|
2008-01-21 16:36:08 +00:00
|
|
|
|
if ( $isLocalObj ) {
|
2016-02-17 19:57:37 +00:00
|
|
|
|
$ret = [ 'object' => $text ];
|
2008-01-21 16:36:08 +00:00
|
|
|
|
} else {
|
2016-02-17 19:57:37 +00:00
|
|
|
|
$ret = [ 'text' => $text ];
|
2008-01-21 16:36:08 +00:00
|
|
|
|
}
|
|
|
|
|
|
|
|
|
|
|
|
return $ret;
|
2004-02-26 13:37:26 +00:00
|
|
|
|
}
|
2006-01-07 13:31:29 +00:00
|
|
|
|
|
2013-03-04 03:35:05 +00:00
|
|
|
|
/**
|
|
|
|
|
|
* Call a parser function and return an array with text and flags.
|
|
|
|
|
|
*
|
|
|
|
|
|
* The returned array will always contain a boolean 'found', indicating
|
|
|
|
|
|
* whether the parser function was found or not. It may also contain the
|
|
|
|
|
|
* following:
|
|
|
|
|
|
* text: string|object, resulting wikitext or PP DOM object
|
|
|
|
|
|
* isHTML: bool, $text is HTML, armour it against wikitext transformation
|
|
|
|
|
|
* isChildObj: bool, $text is a DOM node needing expansion in a child frame
|
|
|
|
|
|
* isLocalObj: bool, $text is a DOM node needing expansion in the current frame
|
|
|
|
|
|
* nowiki: bool, wiki markup in $text should be escaped
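*
* A minimal usage sketch, not part of the documented interface. It assumes
* the caller already holds a Parser in $parser and a PPFrame in $frame, and
* that a parser function is registered under the name 'lc'; all three are
* illustrative assumptions.
* @code
*   $result = $parser->callParserFunction( $frame, 'lc', [ 'SOME TEXT' ] );
*   if ( $result['found'] ) {
*       // 'text' is a wikitext string, or a PP DOM object when
*       // 'isChildObj' or 'isLocalObj' is set.
*       $text = $result['text'];
*   }
* @endcode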
|
|
|
|
|
|
*
|
|
|
|
|
|
* @since 1.21
|
2014-04-21 23:38:39 +00:00
|
|
|
|
* @param PPFrame $frame The current frame, contains template arguments
|
|
|
|
|
|
* @param string $function Function name
|
|
|
|
|
|
* @param array $args Arguments to the function
|
2013-03-04 03:35:05 +00:00
|
|
|
|
* @return array
|
|
|
|
|
|
*/
|
2019-08-27 09:23:52 +00:00
|
|
|
|
public function callParserFunction( PPFrame $frame, $function, array $args = [] ) {
|
2013-03-04 03:35:05 +00:00
|
|
|
|
# Case sensitive functions
|
|
|
|
|
|
if ( isset( $this->mFunctionSynonyms[1][$function] ) ) {
|
|
|
|
|
|
$function = $this->mFunctionSynonyms[1][$function];
|
|
|
|
|
|
} else {
|
|
|
|
|
|
# Case insensitive functions
|
2018-08-03 08:25:15 +00:00
|
|
|
|
$function = $this->contLang->lc( $function );
|
2013-03-04 03:35:05 +00:00
|
|
|
|
if ( isset( $this->mFunctionSynonyms[0][$function] ) ) {
|
|
|
|
|
|
$function = $this->mFunctionSynonyms[0][$function];
|
|
|
|
|
|
} else {
|
2016-02-17 19:57:37 +00:00
|
|
|
|
return [ 'found' => false ];
|
2013-03-04 03:35:05 +00:00
|
|
|
|
}
|
|
|
|
|
|
}
|
|
|
|
|
|
|
2022-10-21 04:32:38 +00:00
|
|
|
|
[ $callback, $flags ] = $this->mFunctionHooks[$function];
|
2013-03-04 03:35:05 +00:00
|
|
|
|
|
Hooks::run() call site migration
Migrate all callers of Hooks::run() to use the new
HookContainer/HookRunner system.
General principles:
* Use DI if it is already used. We're not changing the way state is
managed in this patch.
* HookContainer is always injected, not HookRunner. HookContainer
is a service, it's a more generic interface, it is the only
thing that provides isRegistered() which is needed in some cases,
and a HookRunner can be efficiently constructed from it
(confirmed by benchmark). Because HookContainer is needed
for object construction, it is also needed by all factories.
* "Ask your friendly local base class". Big hierarchies like
SpecialPage and ApiBase have getHookContainer() and getHookRunner()
methods in the base class, and classes that extend that base class
are not expected to know or care where the base class gets its
HookContainer from.
* ProtectedHookAccessorTrait provides protected getHookContainer() and
getHookRunner() methods, getting them from the global service
container. The point of this is to ease migration to DI by ensuring
that call sites ask their local friendly base class rather than
getting a HookRunner from the service container directly.
* Private $this->hookRunner. In some smaller classes where accessor
methods did not seem warranted, there is a private HookRunner property
which is accessed directly. Very rarely (two cases), there is a
protected property, for consistency with code that conventionally
assumes protected=private, but in cases where the class might actually
be overridden, a protected accessor is preferred over a protected
property.
* The last resort: Hooks::runner(). Mostly for static, file-scope and
global code. In a few cases it was used for objects with broken
construction schemes, out of horror or laziness.
Constructors with new required arguments:
* AuthManager
* BadFileLookup
* BlockManager
* ClassicInterwikiLookup
* ContentHandlerFactory
* ContentSecurityPolicy
* DefaultOptionsManager
* DerivedPageDataUpdater
* FullSearchResultWidget
* HtmlCacheUpdater
* LanguageFactory
* LanguageNameUtils
* LinkRenderer
* LinkRendererFactory
* LocalisationCache
* MagicWordFactory
* MessageCache
* NamespaceInfo
* PageEditStash
* PageHandlerFactory
* PageUpdater
* ParserFactory
* PermissionManager
* RevisionStore
* RevisionStoreFactory
* SearchEngineConfig
* SearchEngineFactory
* SearchFormWidget
* SearchNearMatcher
* SessionBackend
* SpecialPageFactory
* UserNameUtils
* UserOptionsManager
* WatchedItemQueryService
* WatchedItemStore
Constructors with new optional arguments:
* DefaultPreferencesFactory
* Language
* LinkHolderArray
* MovePage
* Parser
* ParserCache
* PasswordReset
* Router
setHookContainer() now required after construction:
* AuthenticationProvider
* ResourceLoaderModule
* SearchEngine
Change-Id: Id442b0dbe43aba84bd5cf801d86dedc768b082c7
2020-03-19 02:42:09 +00:00
|
|
|
|
$allArgs = [ $this ];
|
2014-11-11 19:28:28 +00:00
|
|
|
|
if ( $flags & self::SFH_OBJECT_ARGS ) {
|
2013-03-04 03:35:05 +00:00
|
|
|
|
# Convert arguments to PPNodes and collect for appending to $allArgs
|
2016-02-17 19:57:37 +00:00
|
|
|
|
$funcArgs = [];
|
2013-03-04 03:35:05 +00:00
|
|
|
|
foreach ( $args as $k => $v ) {
|
|
|
|
|
|
if ( $v instanceof PPNode || $k === 0 ) {
|
|
|
|
|
|
$funcArgs[] = $v;
|
|
|
|
|
|
} else {
|
2016-02-17 19:57:37 +00:00
|
|
|
|
$funcArgs[] = $this->mPreprocessor->newPartNodeArray( [ $k => $v ] )->item( 0 );
|
2013-03-04 03:35:05 +00:00
|
|
|
|
}
|
|
|
|
|
|
}
|
|
|
|
|
|
|
|
|
|
|
|
# Add a frame parameter, and pass the arguments as an array
|
|
|
|
|
|
$allArgs[] = $frame;
|
|
|
|
|
|
$allArgs[] = $funcArgs;
|
|
|
|
|
|
} else {
|
|
|
|
|
|
# Convert arguments to plain text and append to $allArgs
|
|
|
|
|
|
foreach ( $args as $k => $v ) {
|
|
|
|
|
|
if ( $v instanceof PPNode ) {
|
|
|
|
|
|
$allArgs[] = trim( $frame->expand( $v ) );
|
|
|
|
|
|
} elseif ( is_int( $k ) && $k >= 0 ) {
|
|
|
|
|
|
$allArgs[] = trim( $v );
|
|
|
|
|
|
} else {
|
|
|
|
|
|
$allArgs[] = trim( "$k=$v" );
|
|
|
|
|
|
}
|
|
|
|
|
|
}
|
|
|
|
|
|
}
|
|
|
|
|
|
|
2018-06-08 02:58:35 +00:00
|
|
|
|
$result = $callback( ...$allArgs );
|
2013-03-04 03:35:05 +00:00
|
|
|
|
|
|
|
|
|
|
# The interface for function hooks allows them to return a wikitext
|
|
|
|
|
|
# string or an array containing the string and any flags. This munges
|
|
|
|
|
|
# things around to match what this method should return.
|
|
|
|
|
|
if ( !is_array( $result ) ) {
|
2017-08-11 13:53:17 +00:00
|
|
|
|
$result = [
|
2013-03-04 03:35:05 +00:00
|
|
|
|
'found' => true,
|
|
|
|
|
|
'text' => $result,
|
2016-02-17 19:57:37 +00:00
|
|
|
|
];
|
2013-03-04 03:35:05 +00:00
|
|
|
|
} else {
|
|
|
|
|
|
if ( isset( $result[0] ) && !isset( $result['text'] ) ) {
|
|
|
|
|
|
$result['text'] = $result[0];
|
|
|
|
|
|
}
|
|
|
|
|
|
unset( $result[0] );
|
2016-02-17 19:57:37 +00:00
|
|
|
|
$result += [
|
2013-03-04 03:35:05 +00:00
|
|
|
|
'found' => true,
|
2016-02-17 19:57:37 +00:00
|
|
|
|
];
|
2013-03-04 03:35:05 +00:00
|
|
|
|
}
|
|
|
|
|
|
|
|
|
|
|
|
$noparse = true;
|
|
|
|
|
|
$preprocessFlags = 0;
|
|
|
|
|
|
if ( isset( $result['noparse'] ) ) {
|
|
|
|
|
|
$noparse = $result['noparse'];
|
|
|
|
|
|
}
|
|
|
|
|
|
if ( isset( $result['preprocessFlags'] ) ) {
|
|
|
|
|
|
$preprocessFlags = $result['preprocessFlags'];
|
|
|
|
|
|
}
|
|
|
|
|
|
|
|
|
|
|
|
if ( !$noparse ) {
|
|
|
|
|
|
$result['text'] = $this->preprocessToDom( $result['text'], $preprocessFlags );
|
|
|
|
|
|
$result['isChildObj'] = true;
|
|
|
|
|
|
}
|
|
|
|
|
|
|
|
|
|
|
|
return $result;
|
|
|
|
|
|
}
|
|
|
|
|
|
|
2007-11-20 10:55:08 +00:00
|
|
|
|
/**
|
|
|
|
|
|
* Get the semi-parsed DOM representation of a template with a given title,
|
|
|
|
|
|
* and its redirect destination title. Cached.
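*
* A minimal usage sketch; $parser (a Parser) and $title (a LinkTarget) are
* illustrative names, not part of this method:
* @code
*   [ $dom, $finalTitle ] = $parser->getTemplateDom( $title );
*   if ( $dom === false ) {
*       // The template text could not be fetched.
*   }
* @endcode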
|
2011-05-01 23:54:41 +00:00
|
|
|
|
*
|
2021-04-25 17:29:33 +00:00
|
|
|
|
* @param LinkTarget $title
|
2011-08-05 00:33:03 +00:00
|
|
|
|
*
|
2011-05-01 23:54:41 +00:00
|
|
|
|
* @return array
|
2021-02-19 22:49:35 +00:00
|
|
|
|
* @since 1.12
|
2007-11-20 10:55:08 +00:00
|
|
|
|
*/
|
2021-04-25 17:29:33 +00:00
|
|
|
|
public function getTemplateDom( LinkTarget $title ) {
|
2007-11-21 23:07:36 +00:00
|
|
|
|
$cacheTitle = $title;
|
2021-04-25 17:29:33 +00:00
|
|
|
|
$titleKey = CacheKeyHelper::getKeyForPage( $title );
|
2008-04-14 07:45:50 +00:00
|
|
|
|
|
2021-04-25 17:29:33 +00:00
|
|
|
|
if ( isset( $this->mTplRedirCache[$titleKey] ) ) {
|
2022-10-21 04:32:38 +00:00
|
|
|
|
[ $ns, $dbk ] = $this->mTplRedirCache[$titleKey];
|
2007-11-20 10:55:08 +00:00
|
|
|
|
$title = Title::makeTitle( $ns, $dbk );
|
2021-04-25 17:29:33 +00:00
|
|
|
|
$titleKey = CacheKeyHelper::getKeyForPage( $title );
|
2007-11-20 10:55:08 +00:00
|
|
|
|
}
|
2021-04-25 17:29:33 +00:00
|
|
|
|
if ( isset( $this->mTplDomCache[$titleKey] ) ) {
|
|
|
|
|
|
return [ $this->mTplDomCache[$titleKey], $title ];
|
2007-11-20 10:55:08 +00:00
|
|
|
|
}
|
|
|
|
|
|
|
2010-03-30 21:53:56 +00:00
|
|
|
|
# Cache miss, go to the database
|
2022-10-21 04:32:38 +00:00
|
|
|
|
[ $text, $title ] = $this->fetchTemplateAndTitle( $title );
|
2007-11-20 10:55:08 +00:00
|
|
|
|
|
|
|
|
|
|
if ( $text === false ) {
|
2021-04-25 17:29:33 +00:00
|
|
|
|
$this->mTplDomCache[$titleKey] = false;
|
2016-02-17 19:57:37 +00:00
|
|
|
|
return [ false, $title ];
|
2007-11-20 10:55:08 +00:00
|
|
|
|
}
|
|
|
|
|
|
|
2020-04-07 23:52:41 +00:00
|
|
|
|
$dom = $this->preprocessToDom( $text, Preprocessor::DOM_FOR_INCLUSION );
|
2021-04-25 17:29:33 +00:00
|
|
|
|
$this->mTplDomCache[$titleKey] = $dom;
|
2007-11-21 23:07:36 +00:00
|
|
|
|
|
2021-04-25 17:29:33 +00:00
|
|
|
|
if ( !$title->isSamePageAs( $cacheTitle ) ) {
|
|
|
|
|
|
$this->mTplRedirCache[ CacheKeyHelper::getKeyForPage( $cacheTitle ) ] =
|
2017-05-02 19:18:39 +00:00
|
|
|
|
[ $title->getNamespace(), $title->getDBkey() ];
|
2007-11-21 23:07:36 +00:00
|
|
|
|
}
|
|
|
|
|
|
|
2016-02-17 19:57:37 +00:00
|
|
|
|
return [ $dom, $title ];
|
2007-11-20 10:55:08 +00:00
|
|
|
|
}
|
|
|
|
|
|
|
2020-04-09 03:36:39 +00:00
|
|
|
|
/**
|
|
|
|
|
|
* Fetch the current revision of a given title as a RevisionRecord.
|
|
|
|
|
|
* Note that the revision (and even the title) may not exist in the database,
|
|
|
|
|
|
* so everything contributing to the output of the parser should use this method
|
|
|
|
|
|
* where possible, rather than getting the revisions themselves. This
|
|
|
|
|
|
* method also caches its results, so using it benefits performance.
|
|
|
|
|
|
*
|
2020-05-27 19:20:36 +00:00
|
|
|
|
* This can return null if the callback returns false
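*
* A minimal usage sketch; $parser (a Parser) and $link (a LinkTarget) are
* illustrative names, not part of this method:
* @code
*   $revRecord = $parser->fetchCurrentRevisionRecordOfTitle( $link );
*   if ( $revRecord !== null ) {
*       $revId = $revRecord->getId();
*   }
* @endcode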
|
2020-04-09 03:36:39 +00:00
|
|
|
|
*
|
|
|
|
|
|
* @since 1.35
|
2021-04-25 17:29:33 +00:00
|
|
|
|
* @param LinkTarget $link
|
2020-05-27 19:20:36 +00:00
|
|
|
|
* @return RevisionRecord|null
|
2020-04-09 03:36:39 +00:00
|
|
|
|
*/
|
2021-04-25 17:29:33 +00:00
|
|
|
|
public function fetchCurrentRevisionRecordOfTitle( LinkTarget $link ) {
|
|
|
|
|
|
$cacheKey = CacheKeyHelper::getKeyForPage( $link );
|
2014-09-16 00:07:52 +00:00
|
|
|
|
if ( !$this->currentRevisionCache ) {
|
|
|
|
|
|
$this->currentRevisionCache = new MapCacheLRU( 100 );
|
|
|
|
|
|
}
|
|
|
|
|
|
if ( !$this->currentRevisionCache->has( $cacheKey ) ) {
|
2021-04-25 17:29:33 +00:00
|
|
|
|
$title = Title::castFromLinkTarget( $link ); // hook signature compat
|
2020-05-27 19:20:36 +00:00
|
|
|
|
$revisionRecord =
|
2020-04-09 03:36:39 +00:00
|
|
|
|
// Defaults to Parser::statelessFetchRevisionRecord()
|
|
|
|
|
|
call_user_func(
|
|
|
|
|
|
$this->mOptions->getCurrentRevisionRecordCallback(),
|
|
|
|
|
|
$title,
|
|
|
|
|
|
$this
|
2020-05-27 19:20:36 +00:00
|
|
|
|
);
|
2022-11-09 02:12:50 +00:00
|
|
|
|
if ( $revisionRecord === false ) {
|
2020-05-27 19:20:36 +00:00
|
|
|
|
// Parser::statelessFetchRevisionRecord() can return false;
|
|
|
|
|
|
// normalize it to null.
|
|
|
|
|
|
$revisionRecord = null;
|
|
|
|
|
|
}
|
|
|
|
|
|
$this->currentRevisionCache->set( $cacheKey, $revisionRecord );
|
2014-09-16 00:07:52 +00:00
|
|
|
|
}
|
|
|
|
|
|
return $this->currentRevisionCache->get( $cacheKey );
|
|
|
|
|
|
}
|
|
|
|
|
|
|
2019-07-04 10:01:31 +00:00
|
|
|
|
/**
|
2021-04-25 17:29:33 +00:00
|
|
|
|
* @param LinkTarget $link
|
2019-07-04 10:01:31 +00:00
|
|
|
|
* @return bool
|
|
|
|
|
|
* @since 1.34
|
2020-01-25 15:45:59 +00:00
|
|
|
|
* @internal
|
2019-07-04 10:01:31 +00:00
|
|
|
|
*/
|
2021-04-25 17:29:33 +00:00
|
|
|
|
public function isCurrentRevisionOfTitleCached( LinkTarget $link ) {
|
|
|
|
|
|
$key = CacheKeyHelper::getKeyForPage( $link );
|
2019-07-04 10:01:31 +00:00
|
|
|
|
return (
|
|
|
|
|
|
$this->currentRevisionCache &&
|
2021-04-25 17:29:33 +00:00
|
|
|
|
$this->currentRevisionCache->has( $key )
|
2019-07-04 10:01:31 +00:00
|
|
|
|
);
|
|
|
|
|
|
}
|
|
|
|
|
|
|
2020-04-09 03:36:39 +00:00
|
|
|
|
/**
|
2021-05-02 23:55:07 +00:00
|
|
|
|
* Wrapper around RevisionLookup::getKnownCurrentRevision
|
2020-04-09 03:36:39 +00:00
|
|
|
|
*
|
|
|
|
|
|
* @since 1.34
|
2021-04-25 17:29:33 +00:00
|
|
|
|
* @param LinkTarget $link
|
2020-04-09 03:36:39 +00:00
|
|
|
|
* @param Parser|null $parser
|
2022-07-31 00:02:18 +00:00
|
|
|
|
* @return RevisionRecord|false False if missing
|
2020-04-09 03:36:39 +00:00
|
|
|
|
*/
|
2021-04-25 17:29:33 +00:00
|
|
|
|
public static function statelessFetchRevisionRecord( LinkTarget $link, $parser = null ) {
|
|
|
|
|
|
if ( $link instanceof PageIdentity ) {
|
|
|
|
|
|
// probably a Title, just use it.
|
|
|
|
|
|
$page = $link;
|
|
|
|
|
|
} else {
|
|
|
|
|
|
// XXX: use RevisionStore::getPageForLink()!
|
|
|
|
|
|
// ...but get the info for the current revision at the same time?
|
|
|
|
|
|
// Should RevisionStore::getKnownCurrentRevision accept a LinkTarget?
|
|
|
|
|
|
$page = Title::castFromLinkTarget( $link );
|
|
|
|
|
|
}
|
|
|
|
|
|
|
2020-04-09 03:36:39 +00:00
|
|
|
|
$revRecord = MediaWikiServices::getInstance()
|
|
|
|
|
|
->getRevisionLookup()
|
2021-10-25 19:15:52 +00:00
|
|
|
|
// @phan-suppress-next-line PhanTypeMismatchArgumentNullable castFrom does not return null here
|
2021-04-25 17:29:33 +00:00
|
|
|
|
->getKnownCurrentRevision( $page );
|
2020-04-09 03:36:39 +00:00
|
|
|
|
return $revRecord;
|
|
|
|
|
|
}
|
|
|
|
|
|
|
2006-01-11 02:19:41 +00:00
|
|
|
|
/**
|
2006-03-11 17:13:49 +00:00
|
|
|
|
* Fetch the unparsed text of a template and register a reference to it.
|
2021-04-25 17:29:33 +00:00
|
|
|
|
* @param LinkTarget $link
|
2014-04-21 23:38:39 +00:00
|
|
|
|
* @return array ( string or false, Title )
|
2021-02-19 22:49:35 +00:00
|
|
|
|
* @since 1.11
|
2006-01-11 02:19:41 +00:00
|
|
|
|
*/
|
2021-04-25 17:29:33 +00:00
|
|
|
|
public function fetchTemplateAndTitle( LinkTarget $link ) {
|
|
|
|
|
|
// Use Title for compatibility with callbacks and return type
|
|
|
|
|
|
$title = Title::castFromLinkTarget( $link );
|
|
|
|
|
|
|
2014-05-10 23:03:45 +00:00
|
|
|
|
// Defaults to Parser::statelessFetchTemplate()
|
|
|
|
|
|
$templateCb = $this->mOptions->getTemplateCallback();
|
2022-04-04 09:57:04 +00:00
|
|
|
|
$stuff = $templateCb( $title, $this );
|
|
|
|
|
|
$revRecord = $stuff['revision-record'] ?? null;
|
2020-06-24 01:22:48 +00:00
|
|
|
|
|
2007-11-20 10:55:08 +00:00
|
|
|
|
$text = $stuff['text'];
|
Use a fixed marker prefix string in the Parser and MWTidy
Generating one-time, unique strip markers hurts us in multiple ways:
* The strip marker regexes don't benefit from JIT compilation, so they are
slower to execute than they could be.
* Although the regexes don't benefit from JIT compilation, they are still
compiled, because HHVM bets on regexes getting reused. This extra work is
fairly costly (1-2% of CPU usage on the app servers) and doesn't pay off.
* The size of the PCRE JIT cache is finite, and the caching of one-off regexes
displaces from the cache regexes which are in fact reused.
Tim's preferred solution (per his review comment on
https://gerrit.wikimedia.org/r/167530/) is to use fixed strip markers.
So:
* Replace usage of $parser->mUniqPrefix with Parser::MARKER_PREFIX, which
complements the existing Parser::MARKER_SUFFIX.
* Deprecate Parser::mUniqPrefix and its accessor, Parser::uniqPrefix().
* Deprecate Parser::getRandomString(), since it is no longer useful.
* In Preprocessor_*:preprocessToObj() and Parser::fetchTemplateAndTitle,
replace any occurrences of \x7f with '?', to prevent strip marker forgery.
\x7f is not valid input anyway.
* Deprecate the $prefix parameter for StripState::__construct, since a custom
prefix may no longer be specified.
Change-Id: I31d4556bbb07acb72c33fda335fa5a230379a03f
2015-05-26 20:48:33 +00:00
|
|
|
|
if ( is_string( $stuff['text'] ) ) {
|
2019-07-29 14:29:58 +00:00
|
|
|
|
// We use U+007F DELETE to distinguish strip markers from regular text
|
2015-05-26 20:48:33 +00:00
|
|
|
|
$text = strtr( $text, "\x7f", "?" );
|
|
|
|
|
|
}
|
2017-10-06 22:17:58 +00:00
|
|
|
|
$finalTitle = $stuff['finalTitle'] ?? $title;
|
2019-07-29 14:29:58 +00:00
|
|
|
|
foreach ( ( $stuff['deps'] ?? [] ) as $dep ) {
|
|
|
|
|
|
$this->mOutput->addTemplate( $dep['title'], $dep['page_id'], $dep['rev_id'] );
|
2020-06-24 01:22:48 +00:00
|
|
|
|
if ( $dep['title']->equals( $this->getTitle() ) && $revRecord instanceof RevisionRecord ) {
|
2019-07-29 14:29:58 +00:00
|
|
|
|
// Self-transclusion; final result may change based on the new page version
|
2020-04-29 05:37:47 +00:00
|
|
|
|
try {
|
2020-06-24 01:22:48 +00:00
|
|
|
|
$sha1 = $revRecord->getSha1();
|
2020-04-29 05:37:47 +00:00
|
|
|
|
} catch ( RevisionAccessException $e ) {
|
|
|
|
|
|
$sha1 = null;
|
|
|
|
|
|
}
|
Add new ParserOutput::{get,set}OutputFlag() interface
This is a uniform mechanism to access a number of bespoke boolean
flags in ParserOutput. It allows extensibility in core (by adding new
field names to ParserOutputFlags) without exposing new getter/setter
methods to Parsoid. It replaces the ParserOutput::{get,set}Flag()
interface which (a) doesn't allow access to certain flags, and (b) is
typically called with a string rather than a constant, and (c) has a
very generic name. (Note that Parser::setOutputFlag() already called
these "output flags".)
In the future we might unify the representation so that we store
everything in $mFlags and don't have explicit properties in
ParserOutput, but those representation details should be invisible to
the clients of this API. (We might also use a proper enumeration
for ParserOutputFlags, when PHP supports this.)
There is some overlap with ParserOutput::{get,set}ExtensionData(), but
I've left those methods as-is because (a) they allow for non-boolean
data, unlike the *Flag() methods, and (b) it seems worthwhile to
distinguish properties set by extensions from properties used by core.
Code search:
https://codesearch.wmcloud.org/search/?q=%5BOo%5Dut%28put%29%3F%28%5C%28%5C%29%29%3F-%3E%28g%7Cs%29etFlag%5C%28&i=nope&files=&excludeFiles=&repos=
Bug: T292868
Change-Id: I39bc58d207836df6f328c54be9e3330719cebbeb
2021-10-08 20:04:37 +00:00
|
|
|
|
$this->setOutputFlag( ParserOutputFlags::VARY_REVISION_SHA1, 'Self transclusion' );
|
2020-04-29 05:37:47 +00:00
|
|
|
|
$this->getOutput()->setRevisionUsedSha1Base36( $sha1 );
|
2007-11-20 10:55:08 +00:00
|
|
|
|
}
|
|
|
|
|
|
}
|
2019-07-29 14:29:58 +00:00
|
|
|
|
|
2016-02-17 19:57:37 +00:00
|
|
|
|
return [ $text, $finalTitle ];
|
2007-11-20 10:55:08 +00:00
|
|
|
|
}
|
|
|
|
|
|
|
|
|
|
|
|
/**
|
|
|
|
|
|
* Static function to get a template
|
|
|
|
|
|
* Can be overridden via ParserOptions::setTemplateCallback().
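*
* A sketch of a replacement callback, for illustration only. The array keys
* mirror the ones Parser::fetchTemplateAndTitle() reads ('text',
* 'finalTitle', 'deps', and optionally 'revision-record'); $options is
* assumed to be a ParserOptions instance.
* @code
*   $options->setTemplateCallback( static function ( $page, $parser = false ) {
*       return [
*           'text' => false, // no template text available
*           'finalTitle' => $page,
*           'deps' => [],
*       ];
*   } );
* @endcode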
|
2011-05-01 23:54:41 +00:00
|
|
|
|
*
|
2021-04-25 17:29:33 +00:00
|
|
|
|
* @param LinkTarget $page
|
2022-07-31 00:02:18 +00:00
|
|
|
|
* @param Parser|false $parser
|
2011-08-05 00:33:03 +00:00
|
|
|
|
*
|
2021-05-02 23:55:07 +00:00
|
|
|
|
* @return array
|
2021-02-19 22:49:35 +00:00
|
|
|
|
* @since 1.12
|
2007-11-20 10:55:08 +00:00
|
|
|
|
*/
|
2021-04-25 17:29:33 +00:00
|
|
|
|
public static function statelessFetchTemplate( $page, $parser = false ) {
|
|
|
|
|
|
$title = Title::castFromLinkTarget( $page ); // for compatibility with return type
|
2007-05-31 16:01:26 +00:00
|
|
|
|
$text = $skip = false;
|
2007-05-01 22:42:41 +00:00
|
|
|
|
$finalTitle = $title;
|
2016-02-17 19:57:37 +00:00
|
|
|
|
$deps = [];
|
2020-06-03 03:48:42 +00:00
|
|
|
|
$revRecord = null;
|
parser: new BeforeParserFetchTemplateRevisionRecord hook
This new hook provides for the use case in T47096 (allowing the
Translate extension to transclude a page from another language) by
adding a new hook which would let us deprecate and replace two awkward
legacy hooks (one with an embarrassing capitalization issue). The new
hook is a little more tightly scoped in terms of what it allows and
gives access to, and it uses the new RevisionRecord API.
In addition, the new hook uses LinkTarget instead of Title per
current best practices. (PageIdentity is not appropriate for
reasons documented at the hook invocation site.)
The original BeforeParserFetchTemplateAndtitle (sic) hook allowed
redirecting the revision id of a template inclusion, but not the
title. The only known current use is Extension:ApprovedRevs; the
FlaggedRevs extension replaces the entire function using
ParserOptions::setCurrentRevisionRecordCallback().
Extension:Translate would like to redirect the title as well, possibly
recursively (for a limited number of hops) to handle fallback
languages. That is, when invoked on Foo/fr, including Template:Bar
would redirect to Template:Bar/fr -- and, if that doesn't exist, then
Template:Bar/fr would redirect to its fallback language, say
Template:Bar/en. It uses the top-level page title as context to set
the desired page language. This would require 2 invocations of the
hook; we've set the recursion limit to 3 to provide a little bit
of future-proofing.
The hook added in this patch uses RevisionRecord instead of int
$rev_id, and thus can handle the case where the redirect is to a page
which doesn't exist (by setting the RevisionRecord to a
MutableRevisionRecord with the correct title and no main slot content)
in the fallback language case above.
The new hook deprecates BeforeParserFetchTemplateAndtitle and replaces
ParserFetchTemplate as well (deprecated in 1.35). Code search:
https://codesearch.wmcloud.org/search/?q=BeforeParserFetchTemplateAndtitle&i=nope&files=&repos=
Bug: T47096
Change-Id: Ia5b5d339706ce4084c16948300e0e3418b11792e
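The fallback chain described above (Template:Bar to Template:Bar/fr to Template:Bar/en, with a limit of 3 hops) can be sketched as a standalone loop. This is a simplified model only: `resolveWithHopLimit()` and the `$resolve` callback are hypothetical, plain strings stand in for LinkTarget objects, and the real hook receives a RevisionRecord rather than returning a title.

```php
<?php
/**
 * Follow a chain of title redirections, giving up after $maxHops hops.
 * $resolve stands in for the hook: it returns the next title to try,
 * or null when the current title should be used as-is.
 */
function resolveWithHopLimit( string $title, callable $resolve, int $maxHops = 3 ): array {
	$visited = [ $title ];
	for ( $i = 0; $i < $maxHops; $i++ ) {
		$next = $resolve( $title );
		if ( $next === null || $next === $title ) {
			break;
		}
		$title = $next;
		$visited[] = $title;
	}
	return [ 'final' => $title, 'visited' => $visited ];
}

// Imaginary wiki: only the English template exists.
$existing = [ 'Template:Bar/en' => true ];
// Imaginary fallback rules for a page whose language is French.
$fallbacks = [
	'Template:Bar' => 'Template:Bar/fr',
	'Template:Bar/fr' => 'Template:Bar/en',
];
$resolve = static function ( string $title ) use ( $existing, $fallbacks ): ?string {
	if ( isset( $existing[$title] ) ) {
		return null; // page exists, stop here
	}
	return $fallbacks[$title] ?? null;
};

$result = resolveWithHopLimit( 'Template:Bar', $resolve );
// $result['final'] is 'Template:Bar/en' after two redirections.
```

Two redirections are enough for this case, so a limit of 3 leaves one spare hop, matching the "future-proofing" remark in the commit message.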
2020-07-29 23:32:45 +00:00
|
|
|
|
$contextTitle = $parser ? $parser->getTitle() : null;
|
2008-04-14 07:45:50 +00:00
|
|
|
|
|
2020-07-29 23:32:45 +00:00
|
|
|
|
# Loop to fetch the article, with up to 2 redirects
|
2020-04-15 21:50:29 +00:00
|
|
|
|
$revLookup = MediaWikiServices::getInstance()->getRevisionLookup();
|
2020-07-29 23:32:45 +00:00
|
|
|
|
for ( $i = 0; $i < 3 && is_object( $title ); $i++ ) {
|
2007-05-31 16:01:26 +00:00
|
|
|
|
# Give extensions a chance to select the revision instead
|
2020-07-29 23:32:45 +00:00
|
|
|
|
$revRecord = null; # Assume no hook
|
2010-03-30 21:53:56 +00:00
|
|
|
|
$id = false; # Assume current
|
2020-07-29 23:32:45 +00:00
|
|
|
|
$origTitle = $title;
|
|
|
|
|
|
$titleChanged = false;
|
|
|
|
|
|
Hooks::runner()->onBeforeParserFetchTemplateRevisionRecord(
|
|
|
|
|
|
# The $title is not a PageIdentity, as it may
|
|
|
|
|
|
# contain fragments or even represent an attempt to transclude
|
|
|
|
|
|
# a broken or otherwise-missing Title, which the hook may
|
|
|
|
|
|
# fix up. Similarly, the $contextTitle may represent a special
|
|
|
|
|
|
# page or other page which "exists" as a parsing context but
|
|
|
|
|
|
# is not in the DB.
|
|
|
|
|
|
$contextTitle, $title,
|
|
|
|
|
|
$skip, $revRecord
|
|
|
|
|
|
);
|
2008-04-14 07:45:50 +00:00
|
|
|
|
|
2010-03-30 21:20:05 +00:00
|
|
|
|
if ( $skip ) {
|
2007-05-31 16:01:26 +00:00
|
|
|
|
$text = false;
|
2016-02-17 19:57:37 +00:00
|
|
|
|
$deps[] = [
|
2013-04-20 15:38:24 +00:00
|
|
|
|
'title' => $title,
|
|
|
|
|
|
'page_id' => $title->getArticleID(),
|
|
|
|
|
|
'rev_id' => null
|
2016-02-17 19:57:37 +00:00
|
|
|
|
];
|
2007-05-31 16:01:26 +00:00
|
|
|
|
break;
|
|
|
|
|
|
}
|
2011-03-23 18:23:55 +00:00
|
|
|
|
# Get the revision
|
2020-07-29 23:32:45 +00:00
|
|
|
|
if ( !$revRecord ) {
|
|
|
|
|
|
if ( $id ) {
|
|
|
|
|
|
# Handle $id returned by deprecated legacy hook
|
|
|
|
|
|
$revRecord = $revLookup->getRevisionById( $id );
|
|
|
|
|
|
} elseif ( $parser ) {
|
|
|
|
|
|
$revRecord = $parser->fetchCurrentRevisionRecordOfTitle( $title );
|
|
|
|
|
|
} else {
|
|
|
|
|
|
$revRecord = $revLookup->getRevisionByTitle( $title );
|
|
|
|
|
|
}
|
|
|
|
|
|
}
|
|
|
|
|
|
if ( $revRecord ) {
|
|
|
|
|
|
# Update title, as $revRecord may have been changed by hook
|
|
|
|
|
|
$title = Title::newFromLinkTarget(
|
|
|
|
|
|
$revRecord->getPageAsLinkTarget()
|
|
|
|
|
|
);
|
|
|
|
|
|
$deps[] = [
|
|
|
|
|
|
'title' => $title,
|
|
|
|
|
|
'page_id' => $revRecord->getPageId(),
|
|
|
|
|
|
'rev_id' => $revRecord->getId(),
|
|
|
|
|
|
];
|
2014-09-16 00:07:52 +00:00
|
|
|
|
} else {
|
2020-07-29 23:32:45 +00:00
|
|
|
|
$deps[] = [
|
|
|
|
|
|
'title' => $title,
|
|
|
|
|
|
'page_id' => $title->getArticleID(),
|
|
|
|
|
|
'rev_id' => null,
|
|
|
|
|
|
];
|
|
|
|
|
|
}
|
|
|
|
|
|
if ( !$title->equals( $origTitle ) ) {
|
|
|
|
|
|
# If we fetched a rev from a different title, register
|
|
|
|
|
|
# the original title too...
|
|
|
|
|
|
$deps[] = [
|
|
|
|
|
|
'title' => $origTitle,
|
|
|
|
|
|
'page_id' => $origTitle->getArticleID(),
|
|
|
|
|
|
'rev_id' => null,
|
|
|
|
|
|
];
|
|
|
|
|
|
$titleChanged = true;
|
2014-09-16 00:07:52 +00:00
|
|
|
|
}
|
2010-03-30 21:53:56 +00:00
|
|
|
|
# If there is no current revision, there is no page
|
2020-07-29 23:32:45 +00:00
|
|
|
|
if ( $revRecord === null || $revRecord->getId() === null ) {
|
2018-06-11 06:55:11 +00:00
|
|
|
|
$linkCache = MediaWikiServices::getInstance()->getLinkCache();
|
2008-05-13 16:06:31 +00:00
|
|
|
|
$linkCache->addBadLinkObj( $title );
|
|
|
|
|
|
}
|
2020-06-03 03:48:42 +00:00
|
|
|
|
if ( $revRecord ) {
|
2020-07-29 23:32:45 +00:00
|
|
|
|
if ( $titleChanged && !$revRecord->hasSlot( SlotRecord::MAIN ) ) {
|
|
|
|
|
|
// We've added this (missing) title to the dependencies;
|
|
|
|
|
|
// give the hook another chance to redirect it to an
|
|
|
|
|
|
// actual page.
|
|
|
|
|
|
$text = false;
|
|
|
|
|
|
$finalTitle = $title;
|
|
|
|
|
|
continue;
|
2020-06-03 03:48:42 +00:00
|
|
|
|
}
|
2021-03-17 01:17:20 +00:00
|
|
|
|
if ( $revRecord->hasSlot( SlotRecord::MAIN ) ) { // T276476
|
|
|
|
|
|
$content = $revRecord->getContent( SlotRecord::MAIN );
|
|
|
|
|
|
$text = $content ? $content->getWikitextForTransclusion() : null;
|
|
|
|
|
|
} else {
|
|
|
|
|
|
$text = false;
|
|
|
|
|
|
}
|
2016-08-27 16:44:51 +00:00
|
|
|
|
|
2012-06-08 07:03:18 +00:00
|
|
|
|
if ( $text === false || $text === null ) {
|
|
|
|
|
|
$text = false;
|
|
|
|
|
|
break;
|
|
|
|
|
|
}
|
2020-07-22 17:29:48 +00:00
|
|
|
|
} elseif ( $title->getNamespace() === NS_MEDIAWIKI ) {
|
2018-07-26 12:37:13 +00:00
|
|
|
|
$message = wfMessage( MediaWikiServices::getInstance()->getContentLanguage()->
|
|
|
|
|
|
lcfirst( $title->getText() ) )->inContentLanguage();
|
2011-01-14 10:51:05 +00:00
|
|
|
|
if ( !$message->exists() ) {
|
2007-01-07 12:30:46 +00:00
|
|
|
|
$text = false;
|
|
|
|
|
|
break;
|
|
|
|
|
|
}
|
2011-01-14 10:51:05 +00:00
|
|
|
|
$text = $message->plain();
|
2021-08-24 12:17:12 +00:00
|
|
|
|
break;
|
2007-01-07 12:30:46 +00:00
|
|
|
|
} else {
|
2006-01-11 02:19:41 +00:00
|
|
|
|
break;
|
|
|
|
|
|
}
|
2022-03-29 18:11:06 +00:00
|
|
|
|
// @phan-suppress-next-line PhanPossiblyUndeclaredVariable Only reached when content is set
|
2012-06-11 10:35:46 +00:00
|
|
|
|
if ( !$content ) {
|
2006-01-11 02:19:41 +00:00
|
|
|
|
break;
|
|
|
|
|
|
}
|
2010-03-30 21:53:56 +00:00
|
|
|
|
# Redirect?
|
2007-05-01 22:42:41 +00:00
|
|
|
|
$finalTitle = $title;
|
2012-06-11 10:35:46 +00:00
|
|
|
|
$title = $content->getRedirectTarget();
|
2006-01-11 02:19:41 +00:00
|
|
|
|
}
|
2020-06-03 03:48:42 +00:00
|
|
|
|
|
2020-06-24 01:22:48 +00:00
|
|
|
|
$retValues = [
|
2021-05-02 23:55:07 +00:00
|
|
|
|
// previously, when this also returned a Revision object, we set
|
|
|
|
|
|
// 'revision-record' to false instead of null if it was unavailable,
|
|
|
|
|
|
// so that callers could use isset and then rely on the revision-record
|
|
|
|
|
|
// key instead of the revision key, even if there was no corresponding
|
|
|
|
|
|
// object - we continue to set to false here for backwards compatibility
|
|
|
|
|
|
'revision-record' => $revRecord ?: false,
|
2007-11-20 10:55:08 +00:00
|
|
|
|
'text' => $text,
|
|
|
|
|
|
'finalTitle' => $finalTitle,
|
2019-07-29 14:29:58 +00:00
|
|
|
|
'deps' => $deps
|
|
|
|
|
|
];
|
2021-05-02 23:55:07 +00:00
|
|
|
|
return $retValues;
|
2006-01-11 02:19:41 +00:00
|
|
|
|
}
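A caller-side sketch of how the array returned above might be consumed. `describeFetchResult()` is a hypothetical helper, and plain strings and integers stand in for the Title and RevisionRecord objects that the real method returns; only the four array keys are taken from the code above.

```php
<?php
/**
 * Summarize a result array shaped like the one built above.
 * 'text' is false when nothing could be fetched, and 'revision-record'
 * is false (never null) when no revision is available.
 */
function describeFetchResult( array $result ): string {
	if ( $result['text'] === false ) {
		return 'missing: ' . $result['finalTitle'];
	}
	$rev = $result['revision-record'];
	$revLabel = $rev !== false ? "rev $rev" : 'no revision';
	return sprintf(
		'%s (%s, deps: %d)',
		$result['finalTitle'],
		$revLabel,
		count( $result['deps'] )
	);
}

$found = [
	'revision-record' => 42,
	'text' => 'Hello',
	'finalTitle' => 'Template:Bar',
	'deps' => [ [ 'title' => 'Template:Bar', 'page_id' => 7, 'rev_id' => 42 ] ],
];
$missing = [
	'revision-record' => false,
	'text' => false,
	'finalTitle' => 'Template:Nope',
	'deps' => [ [ 'title' => 'Template:Nope', 'page_id' => 0, 'rev_id' => null ] ],
];

echo describeFetchResult( $found ), "\n";   // Template:Bar (rev 42, deps: 1)
echo describeFetchResult( $missing ), "\n"; // missing: Template:Nope
```

Note that a dependency is recorded even for the missing page, so that the output can be invalidated once that page is created.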
|
|
|
|
|
|
|
2011-03-23 17:35:40 +00:00
|
|
|
|
/**
|
|
|
|
|
|
* Fetch a file and its title and register a reference to it.
|
2011-09-06 18:11:53 +00:00
|
|
|
|
* If 'broken' is a key in $options then the file will appear as a broken thumbnail.
|
2021-04-25 17:29:33 +00:00
|
|
|
|
* @param LinkTarget $link
|
2013-03-11 17:15:01 +00:00
|
|
|
|
* @param array $options Array of options to RepoGroup::findFile
|
2014-04-21 23:38:39 +00:00
|
|
|
|
* @return array ( File or false, Title of file )
|
2021-02-19 22:49:35 +00:00
|
|
|
|
* @since 1.18
|
2011-03-23 17:35:40 +00:00
|
|
|
|
*/
|
2021-04-25 17:29:33 +00:00
|
|
|
|
public function fetchFileAndTitle( LinkTarget $link, array $options = [] ) {
|
|
|
|
|
|
$file = $this->fetchFileNoRegister( $link, $options );
|
2013-04-25 00:28:03 +00:00
|
|
|
|
|
2011-04-04 01:22:08 +00:00
|
|
|
|
$time = $file ? $file->getTimestamp() : false;
|
|
|
|
|
|
$sha1 = $file ? $file->getSha1() : false;
|
2011-03-23 18:23:55 +00:00
|
|
|
|
# Register the file as a dependency...
|
2021-04-25 17:29:33 +00:00
|
|
|
|
$this->mOutput->addImage( $link->getDBkey(), $time, $sha1 );
|
|
|
|
|
|
if ( $file && !$link->isSameLinkAs( $file->getTitle() ) ) {
|
2022-09-27 09:37:13 +00:00
|
|
|
|
# Update fetched file title after resolving redirects, etc.
|
|
|
|
|
|
$link = $file->getTitle();
|
|
|
|
|
|
$this->mOutput->addImage( $link->getDBkey(), $time, $sha1 );
|
2011-03-23 18:23:55 +00:00
|
|
|
|
}
|
2021-04-25 17:29:33 +00:00
|
|
|
|
|
|
|
|
|
|
$title = Title::castFromLinkTarget( $link ); // for return type compat
|
2016-02-17 19:57:37 +00:00
|
|
|
|
return [ $file, $title ];
|
2011-03-23 03:13:37 +00:00
|
|
|
|
}
|
|
|
|
|
|
|
2013-04-25 00:28:03 +00:00
|
|
|
|
/**
|
|
|
|
|
|
* Helper function for fetchFileAndTitle.
|
|
|
|
|
|
*
|
|
|
|
|
|
* Also useful if you need to fetch a file but not use it yet,
|
|
|
|
|
|
* for example to get the file's handler.
|
|
|
|
|
|
*
|
2021-04-25 17:29:33 +00:00
|
|
|
|
* @param LinkTarget $link
|
2013-04-25 00:28:03 +00:00
|
|
|
|
* @param array $options Array of options to RepoGroup::findFile
|
2022-07-31 00:02:18 +00:00
|
|
|
|
* @return File|false
|
2013-04-25 00:28:03 +00:00
|
|
|
|
*/
|
2021-04-25 17:29:33 +00:00
|
|
|
|
protected function fetchFileNoRegister( LinkTarget $link, array $options = [] ) {
|
2013-04-25 00:28:03 +00:00
|
|
|
|
if ( isset( $options['broken'] ) ) {
|
|
|
|
|
|
$file = false; // broken thumbnail forced by hook
|
2020-03-08 21:47:14 +00:00
|
|
|
|
} else {
|
|
|
|
|
|
$repoGroup = MediaWikiServices::getInstance()->getRepoGroup();
|
|
|
|
|
|
if ( isset( $options['sha1'] ) ) { // get by (sha1,timestamp)
|
|
|
|
|
|
$file = $repoGroup->findFileFromKey( $options['sha1'], $options );
|
|
|
|
|
|
} else { // get by (name,timestamp)
|
2021-04-25 17:29:33 +00:00
|
|
|
|
$file = $repoGroup->findFile( $link, $options );
|
2020-03-08 21:47:14 +00:00
|
|
|
|
}
|
2013-04-25 00:28:03 +00:00
|
|
|
|
}
|
|
|
|
|
|
return $file;
|
|
|
|
|
|
}
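The option handling in fetchFileNoRegister() can be summarized as a three-way dispatch. The sketch below is a hypothetical standalone function (`chooseLookup()` is not part of the class) that only reports which lookup path would be taken; it does not call RepoGroup.

```php
<?php
/**
 * Report which lookup path a given $options array would select.
 * Mirrors the isset() checks above, in the same order.
 */
function chooseLookup( array $options ): string {
	if ( isset( $options['broken'] ) ) {
		return 'none'; // broken thumbnail forced, no lookup at all
	}
	if ( isset( $options['sha1'] ) ) {
		return 'by-sha1'; // lookup by (sha1, timestamp)
	}
	return 'by-name'; // lookup by (name, timestamp)
}

echo chooseLookup( [] ), "\n";                                        // by-name
echo chooseLookup( [ 'sha1' => 'abc' ] ), "\n";                       // by-sha1
echo chooseLookup( [ 'broken' => true, 'sha1' => 'abc' ] ), "\n";     // none
```

Because the check uses isset(), any non-null value under 'broken' (including false) selects the broken path, while a null value does not.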
|
|
|
|
|
|
|
2011-09-29 22:08:00 +00:00
|
|
|
|
/**
|
|
|
|
|
|
* Transclude an interwiki link.
|
|
|
|
|
|
*
|
2021-04-25 17:29:33 +00:00
|
|
|
|
* @param LinkTarget $link
|
2018-06-08 20:43:03 +00:00
|
|
|
|
* @param string $action Usually one of (raw, render)
|
2011-09-29 22:08:00 +00:00
|
|
|
|
*
|
|
|
|
|
|
* @return string
|
2020-01-25 15:45:59 +00:00
|
|
|
|
* @internal
|
2011-09-29 22:08:00 +00:00
|
|
|
|
*/
|
2021-04-25 17:29:33 +00:00
|
|
|
|
public function interwikiTransclude( LinkTarget $link, $action ) {
|
2022-04-26 15:48:03 +00:00
|
|
|
|
if ( !$this->svcOptions->get( MainConfigNames::EnableScaryTranscluding ) ) {
|
2013-02-03 19:42:08 +00:00
|
|
|
|
return wfMessage( 'scarytranscludedisabled' )->inContentLanguage()->text();
|
2011-09-29 22:08:00 +00:00
|
|
|
|
}
|
2011-07-30 15:56:54 +00:00
|
|
|
|
|
2021-04-25 17:29:33 +00:00
|
|
|
|
// TODO: extract relevant functionality from Title
|
|
|
|
|
|
$title = Title::castFromLinkTarget( $link );
|
|
|
|
|
|
|
2016-02-17 19:57:37 +00:00
|
|
|
|
$url = $title->getFullURL( [ 'action' => $action ] );
|
2018-06-08 20:43:03 +00:00
|
|
|
|
if ( strlen( $url ) > 1024 ) {
|
2012-08-29 08:07:10 +00:00
|
|
|
|
return wfMessage( 'scarytranscludetoolong' )->inContentLanguage()->text();
|
2011-09-29 22:08:00 +00:00
|
|
|
|
}
|
|
|
|
|
|
|
2018-06-08 20:43:03 +00:00
|
|
|
|
$wikiId = $title->getTransWikiID(); // remote wiki ID or false
|
|
|
|
|
|
|
|
|
|
|
|
$fname = __METHOD__;
|
|
|
|
|
|
$cache = MediaWikiServices::getInstance()->getMainWANObjectCache();
|
|
|
|
|
|
|
|
|
|
|
|
$data = $cache->getWithSetCallback(
|
|
|
|
|
|
$cache->makeGlobalKey(
|
|
|
|
|
|
'interwiki-transclude',
|
|
|
|
|
|
( $wikiId !== false ) ? $wikiId : 'external',
|
|
|
|
|
|
sha1( $url )
|
|
|
|
|
|
),
|
2022-04-26 15:48:03 +00:00
|
|
|
|
$this->svcOptions->get( MainConfigNames::TranscludeCacheExpiry ),
|
2021-08-04 12:56:30 +00:00
|
|
|
|
function ( $oldValue, &$ttl ) use ( $url, $fname, $cache ) {
|
|
|
|
|
|
$req = $this->httpRequestFactory->create( $url, [], $fname );
|
2018-06-08 20:43:03 +00:00
|
|
|
|
|
|
|
|
|
|
$status = $req->execute(); // Status object
|
|
|
|
|
|
if ( !$status->isOK() ) {
|
|
|
|
|
|
$ttl = $cache::TTL_UNCACHEABLE;
|
|
|
|
|
|
} elseif ( $req->getResponseHeader( 'X-Database-Lagged' ) !== null ) {
|
|
|
|
|
|
$ttl = min( $cache::TTL_LAGGED, $ttl );
|
|
|
|
|
|
}
|
|
|
|
|
|
|
|
|
|
|
|
return [
|
|
|
|
|
|
'text' => $status->isOK() ? $req->getContent() : null,
|
|
|
|
|
|
'code' => $req->getStatus()
|
|
|
|
|
|
];
|
|
|
|
|
|
},
|
|
|
|
|
|
[
|
|
|
|
|
|
'checkKeys' => ( $wikiId !== false )
|
|
|
|
|
|
? [ $cache->makeGlobalKey( 'interwiki-page', $wikiId, $title->getDBkey() ) ]
|
|
|
|
|
|
: [],
|
|
|
|
|
|
'pcGroup' => 'interwiki-transclude:5',
|
|
|
|
|
|
'pcTTL' => $cache::TTL_PROC_LONG
|
|
|
|
|
|
]
|
|
|
|
|
|
);
|
|
|
|
|
|
|
|
|
|
|
|
if ( is_string( $data['text'] ) ) {
|
|
|
|
|
|
$text = $data['text'];
|
|
|
|
|
|
} elseif ( $data['code'] != 200 ) {
|
2014-05-10 23:03:45 +00:00
|
|
|
|
// Though we failed to fetch the content, this status is useless.
|
2018-06-08 20:43:03 +00:00
|
|
|
|
$text = wfMessage( 'scarytranscludefailed-httpstatus' )
|
|
|
|
|
|
->params( $url, $data['code'] )->inContentLanguage()->text();
|
2012-08-16 21:48:46 +00:00
|
|
|
|
} else {
|
2018-06-08 20:43:03 +00:00
|
|
|
|
$text = wfMessage( 'scarytranscludefailed', $url )->inContentLanguage()->text();
|
2011-09-29 22:08:00 +00:00
|
|
|
|
}
|
|
|
|
|
|
|
|
|
|
|
|
return $text;
|
|
|
|
|
|
}
|
|
|
|
|
|
|
2004-09-21 05:49:12 +00:00
|
|
|
|
/**
|
|
|
|
|
|
* Triple brace replacement -- used for template arguments
|
Deprecate Parser implementation methods (will be private in next release)
The following public methods were renamed and made private; the old name
is hard-deprecated and calls the new renamed private method:
Parser::doMagicLinks() => handleMagicLinks()
Parser::doDoubleUnderscore() => handleDoubleUnderscore()
Parser::doHeadings() => handleHeadings()
Parser::doAllQuotes() => handleAllQuotes()
Parser::replaceExternalLinks() => handleExternalLinks()
Parser::replaceInternalLinks() => handleInternalLinks()
Parser::replaceInternalLinks2() => handleInternalLinks2()
Parser::getVariableValue() => expandMagicVariable()
Parser::initialiseVariables() => initializeVariables()
Parser::formatHeadings() => finalizeHeadings()
Parser::test{Pst,Preprocess,Srvus}() => fuzzTest{Pst,Preprocess,Srvus}()
Additionally, the following methods are not used externally, but are
used outside the Parser class by core code. They have been marked
@internal:
Parser::doQuotes() (used by {{#displaytitle}}),
Parser::getExternalLink{Rel,Attribs}() (used by Linker),
Parser::normalizeLinkUrl() (used by Special:LinkSearch and elsewhere).
Parser::{brace,arg,extension}Substitution() (used by PPFrame)
Code search query:
https://codesearch.wmflabs.org/deployed/?q=do%28MagicLinks%7CDoubleUnderscore%7CHeadings%7CAllQuotes%29%7Creplace%28ExternalLinks%7CInternalLinks%28%7C2%29%29%7CgetVariableValue%7CinitialiseVariables%7CformatHeadings%7Ctest%28Pst%7CPreprocess%7CSrvus%29%7CdoQuotes%7CgetExternalLink%28Rel%7CAttribs%29%7CnormalizeLinkUrl%7C%28brace%2Carg%2Cextension%29Substitution&i=nope&files=&repos=
Bug: T236810
Change-Id: I19a43ffc5dcfdd2981b51079c33422c964acb076
2019-10-28 19:52:50 +00:00
|
|
|
|
* @internal
|
2011-05-01 23:54:41 +00:00
|
|
|
|
*
|
2014-04-21 23:38:39 +00:00
|
|
|
|
* @param array $piece
|
|
|
|
|
|
* @param PPFrame $frame
|
2011-05-01 23:54:41 +00:00
|
|
|
|
*
|
|
|
|
|
|
* @return array
|
2020-01-25 15:45:59 +00:00
|
|
|
|
|
2004-09-21 05:49:12 +00:00
|
|
|
|
*/
|
2019-08-27 09:23:52 +00:00
|
|
|
|
public function argSubstitution( array $piece, PPFrame $frame ) {
|
2007-11-20 10:55:08 +00:00
|
|
|
|
$error = false;
|
|
|
|
|
|
$parts = $piece['parts'];
|
2008-01-13 09:23:58 +00:00
|
|
|
|
$nameWithSpaces = $frame->expand( $piece['title'] );
|
|
|
|
|
|
$argName = trim( $nameWithSpaces );
|
2008-01-21 16:36:08 +00:00
|
|
|
|
$object = false;
|
2008-01-13 12:47:38 +00:00
|
|
|
|
$text = $frame->getArgument( $argName );
|
2013-02-03 19:42:08 +00:00
|
|
|
|
if ( $text === false && $parts->getLength() > 0
|
2013-12-01 19:58:51 +00:00
|
|
|
|
&& ( $this->ot['html']
|
|
|
|
|
|
|| $this->ot['pre']
|
|
|
|
|
|
|| ( $this->ot['wiki'] && $frame->isTemplate() )
|
|
|
|
|
|
)
|
2008-01-30 02:52:14 +00:00
|
|
|
|
) {
|
2008-01-13 06:40:14 +00:00
|
|
|
|
# No match in frame, use the supplied default
|
2008-01-21 16:36:08 +00:00
|
|
|
|
$object = $parts->item( 0 )->getChildren();
|
2004-05-23 03:39:24 +00:00
|
|
|
|
}
|
2006-08-10 21:28:49 +00:00
|
|
|
|
if ( !$this->incrementIncludeSize( 'arg', strlen( $text ) ) ) {
|
2007-11-20 10:55:08 +00:00
|
|
|
|
$error = '<!-- WARNING: argument omitted, expansion size too large -->';
|
2008-05-19 21:33:47 +00:00
|
|
|
|
$this->limitationWarn( 'post-expand-template-argument' );
|
2006-08-10 21:28:49 +00:00
|
|
|
|
}
|
2004-07-12 19:49:20 +00:00
|
|
|
|
|
2008-01-21 16:36:08 +00:00
|
|
|
|
if ( $text === false && $object === false ) {
|
2008-01-13 06:40:14 +00:00
|
|
|
|
# No match anywhere
|
2008-01-21 16:36:08 +00:00
|
|
|
|
$object = $frame->virtualBracketedImplode( '{{{', '|', '}}}', $nameWithSpaces, $parts );
|
2007-11-20 10:55:08 +00:00
|
|
|
|
}
|
|
|
|
|
|
if ( $error !== false ) {
|
|
|
|
|
|
$text .= $error;
|
|
|
|
|
|
}
|
2008-01-21 16:36:08 +00:00
|
|
|
|
if ( $object !== false ) {
|
2016-02-17 19:57:37 +00:00
|
|
|
|
$ret = [ 'object' => $object ];
|
2008-01-21 16:36:08 +00:00
|
|
|
|
} else {
|
2016-02-17 19:57:37 +00:00
|
|
|
|
$ret = [ 'text' => $text ];
|
2008-01-21 16:36:08 +00:00
|
|
|
|
}
|
2007-11-20 10:55:08 +00:00
|
|
|
|
|
2008-01-21 16:36:08 +00:00
|
|
|
|
return $ret;
|
2004-05-23 03:39:24 +00:00
|
|
|
|
}
|
|
|
|
|
|
|
2022-07-22 00:47:09 +00:00
|
|
|
|
/**
|
|
|
|
|
|
* @param string $lowerTagName
|
|
|
|
|
|
* @return bool
|
|
|
|
|
|
*/
|
|
|
|
|
|
public function tagNeedsNowikiStrippedInTagPF( string $lowerTagName ): bool {
|
|
|
|
|
|
$parsoidSiteConfig = MediaWikiServices::getInstance()->getParsoidSiteConfig();
|
|
|
|
|
|
return $parsoidSiteConfig->tagNeedsNowikiStrippedInTagPF( $lowerTagName );
|
|
|
|
|
|
}
|
|
|
|
|
|
|
2007-11-20 10:55:08 +00:00
|
|
|
|
/**
|
|
|
|
|
|
* Return the text to be used for a given extension tag.
|
|
|
|
|
|
* This is the ghost of strip().
|
|
|
|
|
|
*
|
2013-03-11 17:15:01 +00:00
|
|
|
|
* @param array $params Associative array of parameters:
|
2008-01-21 16:36:08 +00:00
|
|
|
|
* name PPNode for the tag name
|
|
|
|
|
|
* attr PPNode for unparsed text where tag attributes are thought to be
|
2008-01-09 07:13:54 +00:00
|
|
|
|
* attributes Optional associative array of parsed attributes
|
2007-11-20 10:55:08 +00:00
|
|
|
|
* inner Contents of extension element
|
|
|
|
|
|
* noClose Original text did not have a close tag
|
2014-04-21 23:38:39 +00:00
|
|
|
|
* @param PPFrame $frame
|
2022-07-22 00:47:09 +00:00
|
|
|
|
* @param bool $processNowiki Process nowiki tags by running the nowiki tag handler
|
|
|
|
|
|
* Normally, nowikis are only processed for the HTML output type. With this
|
|
|
|
|
|
* arg set to true, they are processed (and converted to a nowiki strip marker)
|
|
|
|
|
|
* for all output types.
|
2011-05-01 23:54:41 +00:00
|
|
|
|
*
|
2012-10-07 23:35:26 +00:00
|
|
|
|
* @throws MWException
|
2011-05-01 23:54:41 +00:00
|
|
|
|
* @return string
|
2019-10-28 19:52:50 +00:00
|
|
|
|
* @internal
|
2021-02-19 22:49:35 +00:00
|
|
|
|
* @since 1.12
|
2007-11-20 10:55:08 +00:00
|
|
|
|
*/
|
2022-07-22 00:47:09 +00:00
|
|
|
|
public function extensionSubstitution( array $params, PPFrame $frame, bool $processNowiki = false ) {
|
2016-10-13 16:53:31 +00:00
|
|
|
|
static $errorStr = '<span class="error">';
|
|
|
|
|
|
|
2007-11-20 10:55:08 +00:00
|
|
|
|
$name = $frame->expand( $params['name'] );
|
2022-10-08 14:50:45 +00:00
|
|
|
|
if ( str_starts_with( $name, $errorStr ) ) {
|
2016-10-13 16:53:31 +00:00
|
|
|
|
// Probably expansion depth or node count exceeded. Just punt the
|
|
|
|
|
|
// error up.
|
|
|
|
|
|
return $name;
|
|
|
|
|
|
}
|
|
|
|
|
|
|
2022-10-08 14:50:45 +00:00
|
|
|
|
// Parse attributes from XML-like wikitext syntax
|
2022-07-07 10:27:14 +00:00
|
|
|
|
$attrText = !isset( $params['attr'] ) ? '' : $frame->expand( $params['attr'] );
|
2022-10-08 14:50:45 +00:00
|
|
|
|
if ( str_starts_with( $attrText, $errorStr ) ) {
|
2016-10-13 16:53:31 +00:00
|
|
|
|
// See above
|
|
|
|
|
|
return $attrText;
|
|
|
|
|
|
}
|
|
|
|
|
|
|
2016-10-31 20:37:26 +00:00
|
|
|
|
// We can't safely check if the expansion for $content resulted in an
|
|
|
|
|
|
// error, because the content could happen to be the error string
|
|
|
|
|
|
// (T149622).
|
2008-01-09 07:13:54 +00:00
|
|
|
|
$content = !isset( $params['inner'] ) ? null : $frame->expand( $params['inner'] );
|
2016-10-13 16:53:31 +00:00
|
|
|
|
|
Use a fixed marker prefix string in the Parser and MWTidy
Generating one-time, unique strip markers hurts us in multiple ways:
* The strip marker regexes don't benefit from JIT compilation, so they are
slower to execute than they could be.
* Although the regexes don't benefit from JIT compilation, they are still
compiled, because HHVM bets on regexes getting reused. This extra work is
fairly costly (1-2% of CPU usage on the app servers) and doesn't pay off.
* The size of the PCRE JIT cache is finite, and the caching of one-off regexes
displaces from the cache regexes which are in fact reused.
Tim's preferred solution (per his review comment on
https://gerrit.wikimedia.org/r/167530/) is to use fixed strip markers.
So:
* Replace usage of $parser->mUniqPrefix with Parser::MARKER_PREFIX, which
complements the existing Parser::MARKER_SUFFIX.
* Deprecate Parser::mUniqPrefix and its accessor, Parser::uniqPrefix().
* Deprecate Parser::getRandomString(), since it is no longer useful.
* In Preprocessor_*:preprocessToObj() and Parser::fetchTemplateAndTitle,
replace any occurrences of \x7f with '?', to prevent strip marker forgery.
\x7f is not valid input anyway.
* Deprecate the $prefix parameter for StripState::__construct, since a custom
prefix may no longer be specified.
Change-Id: I31d4556bbb07acb72c33fda335fa5a230379a03f
2015-05-26 20:48:33 +00:00
|
|
|
|
$marker = self::MARKER_PREFIX . "-$name-"
|
2014-05-10 23:03:45 +00:00
|
|
|
|
. sprintf( '%08X', $this->mMarkerIndex++ ) . self::MARKER_SUFFIX;
|
2008-04-14 07:45:50 +00:00
|
|
|
|
|
Add support to enable Scribunto & Parsoid to handle nowikis properly
* Lua modules have been written to inspect nowiki strip state markers
and extract nowiki content to further process them. Callers might have
used nowikis in arguments for any number of reasons including needing
to have the argument be treated as raw text instead of wikitext.
While we might add first-class typing features to wikitext, templates,
extensions, and the like in the future which would let Parsoid process
template arguments based on type info (rather than as wikitext always),
we need a solution now to enable modules to work properly with Parsoid.
* The core issue is the decoupled model used by Parsoid where
transclusions are preprocessed before further processing. Since
nowikis cannot be processed and stripped during preprocessing,
Lua modules don't have access to nowiki strip markers in this model.
* In this patch, we change extension tag processing for nowikis.
When generating HTML, nowikis are replaced with a 'nowiki' strip
marker with the nowiki's "innerXML" (only tag contents).
In this patch, during preprocessing, instead of adding a 'general'
strip marker with the "outerXML" (tag contents and the tag wrapper),
we add a 'nowiki' strip marker with its "outerXML".
* Since Parsoid (and any clients using the preprocessed output) will
unstrip all strip markers, the shift from a general to nowiki
strip marker won't make a difference.
* To support Scribunto and Lua modules unstrip usage, this patch adds
new functionality to StripState to replace the (preprocessing-)nowiki
strip markers with whatever its users want. So, Scribunto could
pass in a callback that replaces these with the "innerXML" by
stripping out the tag wrapper.
* Hat tip to Tim Starling for recommending this strategy.
* Updated strip state tests.
Bug: T272507
Bug: T299103
Depends-On: Id6ea611549e98893f53094116a3851e9c42b8dc8
Change-Id: Ied0295feab06027a8df885b3215435e596f0353b
2022-08-25 00:25:51 +00:00
|
|
|
|
$normalizedName = strtolower( $name );
|
|
|
|
|
|
$isNowiki = $normalizedName === 'nowiki';
|
|
|
|
|
|
$markerType = $isNowiki ? 'nowiki' : 'general';
|
|
|
|
|
|
if ( $this->ot['html'] || ( $processNowiki && $isNowiki ) ) {
|
2008-01-09 07:13:54 +00:00
|
|
|
|
$attributes = Sanitizer::decodeTagAttributes( $attrText );
|
2022-10-08 14:50:45 +00:00
|
|
|
|
// Merge in attributes passed via {{#tag:}} parser function
|
2008-01-09 07:13:54 +00:00
|
|
|
|
if ( isset( $params['attributes'] ) ) {
|
2017-05-01 17:18:38 +00:00
|
|
|
|
$attributes += $params['attributes'];
|
2008-01-09 07:13:54 +00:00
|
|
|
|
}
|
2010-02-03 07:10:58 +00:00
|
|
|
|
|
2022-10-08 14:50:45 +00:00
|
|
|
|
if ( isset( $this->mTagHooks[$normalizedName] ) ) {
|
2020-12-23 20:21:46 +00:00
|
|
|
|
// Note that $content may be null here, for example if the
|
|
|
|
|
|
// tag is self-closed.
|
2022-10-08 14:50:45 +00:00
|
|
|
|
$output = call_user_func_array( $this->mTagHooks[$normalizedName],
|
2016-02-17 19:57:37 +00:00
|
|
|
|
[ $content, $attributes, $this, $frame ] );
|
2010-02-03 07:10:58 +00:00
|
|
|
|
} else {
|
|
|
|
|
|
$output = '<span class="error">Invalid tag extension name: ' .
|
2022-10-08 14:50:45 +00:00
|
|
|
|
htmlspecialchars( $normalizedName ) . '</span>';
|
2010-02-03 07:10:58 +00:00
|
|
|
|
}
|
|
|
|
|
|
|
|
|
|
|
|
if ( is_array( $output ) ) {
|
2017-12-07 21:16:47 +00:00
|
|
|
|
// Extract flags
|
2010-02-03 07:10:58 +00:00
|
|
|
|
$flags = $output;
|
|
|
|
|
|
$output = $flags[0];
|
2017-12-07 21:16:47 +00:00
|
|
|
|
if ( isset( $flags['markerType'] ) ) {
|
|
|
|
|
|
$markerType = $flags['markerType'];
|
|
|
|
|
|
}
|
2007-11-20 10:55:08 +00:00
|
|
|
|
}
|
|
|
|
|
|
} else {
|
2022-10-08 14:50:45 +00:00
|
|
|
|
// We're substituting a {{subst:#tag:}} parser function.
|
|
|
|
|
|
// Convert the attributes it passed into the XML-like string.
|
2008-01-19 09:03:45 +00:00
|
|
|
|
if ( isset( $params['attributes'] ) ) {
|
|
|
|
|
|
foreach ( $params['attributes'] as $attrName => $attrValue ) {
|
|
|
|
|
|
$attrText .= ' ' . htmlspecialchars( $attrName ) . '="' .
|
2022-01-25 05:23:23 +00:00
|
|
|
|
htmlspecialchars( $attrValue, ENT_COMPAT ) . '"';
|
2008-01-19 09:03:45 +00:00
|
|
|
|
}
|
|
|
|
|
|
}
|
2007-12-17 15:07:25 +00:00
|
|
|
|
if ( $content === null ) {
|
|
|
|
|
|
$output = "<$name$attrText/>";
|
2007-11-20 10:55:08 +00:00
|
|
|
|
} else {
|
2020-01-09 23:48:34 +00:00
|
|
|
|
$close = $params['close'] === null ? '' : $frame->expand( $params['close'] );
|
2022-10-08 14:50:45 +00:00
|
|
|
|
if ( str_starts_with( $close, $errorStr ) ) {
|
2016-10-13 16:53:31 +00:00
|
|
|
|
// See above
|
|
|
|
|
|
return $close;
|
|
|
|
|
|
}
|
2007-12-17 15:07:25 +00:00
|
|
|
|
$output = "<$name$attrText>$content$close";
|
2007-11-20 10:55:08 +00:00
|
|
|
|
}
|
|
|
|
|
|
}
|
|
|
|
|
|
|
2010-03-30 21:20:05 +00:00
|
|
|
|
if ( $markerType === 'none' ) {
|
2009-07-11 13:03:35 +00:00
|
|
|
|
return $output;
|
2010-02-03 07:10:58 +00:00
|
|
|
|
} elseif ( $markerType === 'nowiki' ) {
|
2011-02-23 06:58:15 +00:00
|
|
|
|
$this->mStripState->addNoWiki( $marker, $output );
|
2010-02-03 07:10:58 +00:00
|
|
|
|
} elseif ( $markerType === 'general' ) {
|
2011-02-23 06:58:15 +00:00
|
|
|
|
$this->mStripState->addGeneral( $marker, $output );
|
2010-02-03 07:10:58 +00:00
|
|
|
|
} else {
|
2013-01-26 21:11:09 +00:00
|
|
|
|
throw new MWException( __METHOD__ . ': invalid marker type' );
|
2007-11-20 10:55:08 +00:00
|
|
|
|
}
|
|
|
|
|
|
return $marker;
|
|
|
|
|
|
}
|
|
|
|
|
|
|
2004-09-21 05:49:12 +00:00
|
|
|
|
/**
|
2006-08-10 21:28:49 +00:00
|
|
|
|
* Increment an include size counter
|
|
|
|
|
|
*
|
2014-04-08 15:29:17 +00:00
|
|
|
|
* @param string $type The type of expansion
|
|
|
|
|
|
* @param int $size The size of the text
|
2014-04-21 23:38:39 +00:00
|
|
|
|
* @return bool False if this inclusion would take it over the maximum, true otherwise
|
2004-09-21 05:49:12 +00:00
|
|
|
|
*/
|
2020-01-30 17:45:17 +00:00
|
|
|
|
private function incrementIncludeSize( $type, $size ) {
|
2010-12-11 03:53:22 +00:00
|
|
|
|
if ( $this->mIncludeSizes[$type] + $size > $this->mOptions->getMaxIncludeSize() ) {
|
2004-04-11 16:46:06 +00:00
|
|
|
|
return false;
|
2006-08-10 21:28:49 +00:00
|
|
|
|
} else {
|
|
|
|
|
|
$this->mIncludeSizes[$type] += $size;
|
|
|
|
|
|
return true;
|
2004-04-11 16:46:06 +00:00
|
|
|
|
}
|
|
|
|
|
|
}
|
|
|
|
|
|
|
2008-04-18 14:34:38 +00:00
|
|
|
|
/**
* Increment the expensive parser function count.
*
|
2014-04-21 23:38:39 +00:00
|
|
|
|
* @return bool False if the limit has been exceeded
|
2021-02-19 22:49:35 +00:00
|
|
|
|
* @since 1.13
|
2008-04-18 14:34:38 +00:00
|
|
|
|
*/
|
2014-08-11 20:24:54 +00:00
|
|
|
|
public function incrementExpensiveFunctionCount() {
|
2008-04-18 14:34:38 +00:00
|
|
|
|
$this->mExpensiveFunctionCount++;
|
2012-05-04 18:56:28 +00:00
|
|
|
|
return $this->mExpensiveFunctionCount <= $this->mOptions->getExpensiveParserFunctionLimit();
|
2008-04-18 14:34:38 +00:00
|
|
|
|
}
|
|
|
|
|
|
|
2019-10-28 19:52:50 +00:00
|
|
|
|
/**
|
|
|
|
|
|
* Strip double-underscore items like __NOGALLERY__ and __NOTOC__
|
|
|
|
|
|
* Fills $this->mDoubleUnderscores, returns the modified text
|
|
|
|
|
|
*
|
|
|
|
|
|
* @param string $text
|
|
|
|
|
|
* @return string
|
|
|
|
|
|
*/
|
|
|
|
|
|
private function handleDoubleUnderscore( $text ) {
|
2010-03-30 21:53:56 +00:00
|
|
|
|
# The position of __TOC__ needs to be recorded
|
2018-07-25 11:55:18 +00:00
|
|
|
|
$mw = $this->magicWordFactory->get( 'toc' );
|
2010-03-30 21:20:05 +00:00
|
|
|
|
if ( $mw->match( $text ) ) {
|
2006-05-23 07:19:01 +00:00
|
|
|
|
$this->mShowToc = true;
|
|
|
|
|
|
$this->mForceTocPosition = true;
|
2006-07-11 17:40:11 +00:00
|
|
|
|
|
2010-03-30 21:53:56 +00:00
|
|
|
|
# Set a placeholder. At the end we'll fill it in with the TOC.
|
2021-09-15 01:00:06 +00:00
|
|
|
|
$text = $mw->replace( self::TOC_PLACEHOLDER, $text, 1 );
|
2006-07-11 17:40:11 +00:00
|
|
|
|
|
2010-03-30 21:53:56 +00:00
|
|
|
|
# Only keep the first one.
|
2006-05-23 07:19:01 +00:00
|
|
|
|
$text = $mw->replace( '', $text );
|
|
|
|
|
|
}
|
2008-02-20 08:53:12 +00:00
|
|
|
|
|
2010-03-30 21:53:56 +00:00
|
|
|
|
# Now match and remove the rest of them
|
2018-07-25 11:55:18 +00:00
|
|
|
|
$mwa = $this->magicWordFactory->getDoubleUnderscoreArray();
|
2008-02-20 08:53:12 +00:00
|
|
|
|
$this->mDoubleUnderscores = $mwa->matchAndRemove( $text );
|
|
|
|
|
|
|
|
|
|
|
|
if ( isset( $this->mDoubleUnderscores['nogallery'] ) ) {
|
2020-12-02 14:57:17 +00:00
|
|
|
|
$this->mOutput->setNoGallery( true );
|
2008-02-20 08:53:12 +00:00
|
|
|
|
}
|
|
|
|
|
|
if ( isset( $this->mDoubleUnderscores['notoc'] ) && !$this->mForceTocPosition ) {
|
|
|
|
|
|
$this->mShowToc = false;
|
|
|
|
|
|
}
|
2014-05-10 23:03:45 +00:00
|
|
|
|
if ( isset( $this->mDoubleUnderscores['hiddencat'] )
|
2020-07-22 17:29:48 +00:00
|
|
|
|
&& $this->getTitle()->getNamespace() === NS_CATEGORY
|
2014-05-10 23:03:45 +00:00
|
|
|
|
) {
|
2009-09-28 17:55:00 +00:00
|
|
|
|
$this->addTrackingCategory( 'hidden-category-category' );
|
2008-02-20 08:53:12 +00:00
|
|
|
|
}
|
2016-12-11 22:45:07 +00:00
|
|
|
|
# (T10068) Allow control over whether robots index a page.
|
|
|
|
|
|
# __INDEX__ always overrides __NOINDEX__, see T16899
|
2019-10-18 19:50:58 +00:00
|
|
|
|
if ( isset( $this->mDoubleUnderscores['noindex'] ) && $this->getTitle()->canUseNoindex() ) {
|
2008-07-23 19:49:46 +00:00
|
|
|
|
$this->mOutput->setIndexPolicy( 'noindex' );
|
2009-09-28 17:55:00 +00:00
|
|
|
|
$this->addTrackingCategory( 'noindex-category' );
|
|
|
|
|
|
}
|
2019-10-18 19:50:58 +00:00
|
|
|
|
if ( isset( $this->mDoubleUnderscores['index'] ) && $this->getTitle()->canUseNoindex() ) {
|
2008-07-23 19:49:46 +00:00
|
|
|
|
$this->mOutput->setIndexPolicy( 'index' );
|
2009-09-28 17:55:00 +00:00
|
|
|
|
$this->addTrackingCategory( 'index-category' );
|
2008-07-23 19:49:46 +00:00
|
|
|
|
}
|
2010-12-11 03:52:35 +00:00
|
|
|
|
|
2010-07-10 11:30:11 +00:00
|
|
|
|
# Cache all double underscores in the database
|
|
|
|
|
|
foreach ( $this->mDoubleUnderscores as $key => $val ) {
|
2021-10-07 16:13:46 +00:00
|
|
|
|
$this->mOutput->setPageProperty( $key, '' );
|
2010-07-10 11:30:11 +00:00
|
|
|
|
}
|
2010-01-07 04:13:14 +00:00
|
|
|
|
|
2006-05-23 07:19:01 +00:00
|
|
|
|
return $text;
|
2010-01-07 04:13:14 +00:00
|
|
|
|
}
|
|
|
|
|
|
|
2009-09-28 17:55:00 +00:00
|
|
|
|
/**
|
2014-09-23 14:36:40 +00:00
|
|
|
|
* @see ParserOutput::addTrackingCategory()
|
2014-04-08 15:29:17 +00:00
|
|
|
|
* @param string $msg Message key
|
|
|
|
|
|
* @return bool Whether the addition was successful
|
2021-02-19 22:49:35 +00:00
|
|
|
|
* @since 1.19 method is public
|
2009-09-28 17:55:00 +00:00
|
|
|
|
*/
|
2012-01-11 15:42:29 +00:00
|
|
|
|
public function addTrackingCategory( $msg ) {
|
2021-10-08 16:37:26 +00:00
|
|
|
|
return $this->trackingCategories->addTrackingCategory(
|
|
|
|
|
|
$this->mOutput, $msg, $this->getPage()
|
|
|
|
|
|
);
|
2006-05-23 07:19:01 +00:00
|
|
|
|
}
|
|
|
|
|
|
|
2019-10-28 19:52:50 +00:00
|
|
|
|
/**
|
|
|
|
|
|
* This function accomplishes several tasks:
|
|
|
|
|
|
* 1) Auto-number headings if that option is enabled
|
|
|
|
|
|
* 2) Add an [edit] link to sections for users who have enabled the option and can edit the page
|
|
|
|
|
|
* 3) Add a table of contents at the top for users who have enabled the option
|
|
|
|
|
|
* 4) Auto-anchor headings
|
|
|
|
|
|
*
|
|
|
|
|
|
* It loops through all headlines, collects the necessary data, then splits up the
|
|
|
|
|
|
* string and re-inserts the newly formatted headlines.
|
|
|
|
|
|
*
|
|
|
|
|
|
* @param string $text
|
|
|
|
|
|
* @param string $origText Original, untouched wikitext
|
|
|
|
|
|
* @param bool $isMain
|
|
|
|
|
|
* @return mixed|string
|
|
|
|
|
|
*/
|
|
|
|
|
|
private function finalizeHeadings( $text, $origText, $isMain = true ) {
|
2004-02-26 13:37:26 +00:00
|
|
|
|
# Inhibit editsection links if requested in the page
|
2010-08-05 14:37:50 +00:00
|
|
|
|
if ( isset( $this->mDoubleUnderscores['noeditsection'] ) ) {
|
2017-11-22 23:12:40 +00:00
|
|
|
|
$maybeShowEditLink = false;
|
2010-08-05 14:37:50 +00:00
|
|
|
|
} else {
|
2017-11-22 23:12:40 +00:00
|
|
|
|
$maybeShowEditLink = true; /* Actual presence will depend on post-cache transforms */
|
2011-01-03 20:17:20 +00:00
|
|
|
|
}
|
2004-02-26 13:37:26 +00:00
|
|
|
|
|
2004-04-03 04:38:09 +00:00
|
|
|
|
# Get all headlines for numbering them and adding funky stuff like [edit]
|
|
|
|
|
|
# links - this is for later, but we need the number of headlines right now
|
2019-10-28 19:52:50 +00:00
|
|
|
|
# NOTE: whitespace in headings has been trimmed in handleHeadings. It shouldn't
|
2018-03-21 01:01:55 +00:00
|
|
|
|
# be trimmed here since whitespace in HTML headings is significant.
|
2016-02-17 19:57:37 +00:00
|
|
|
|
$matches = [];
|
2014-05-10 23:03:45 +00:00
|
|
|
|
$numMatches = preg_match_all(
|
2018-03-21 01:01:55 +00:00
|
|
|
|
'/<H(?P<level>[1-6])(?P<attrib>.*?>)(?P<header>[\s\S]*?)<\/H[1-6] *>/i',
|
2014-05-10 23:03:45 +00:00
|
|
|
|
$text,
|
|
|
|
|
|
$matches
|
|
|
|
|
|
);
|
2004-04-03 04:38:09 +00:00
|
|
|
|
|
|
|
|
|
|
# if there are fewer than 4 headlines in the article, do not show TOC
|
2006-05-23 10:01:45 +00:00
|
|
|
|
# unless it's been explicitly enabled.
|
|
|
|
|
|
$enoughToc = $this->mShowToc &&
|
2010-03-30 21:20:05 +00:00
|
|
|
|
( ( $numMatches >= 4 ) || $this->mForceTocPosition );
|
2004-04-03 04:38:09 +00:00
|
|
|
|
|
2006-05-01 20:35:08 +00:00
|
|
|
|
# Allow user to stipulate that a page should have a "new section"
|
|
|
|
|
|
# link added via __NEWSECTIONLINK__
|
2008-02-20 08:53:12 +00:00
|
|
|
|
if ( isset( $this->mDoubleUnderscores['newsectionlink'] ) ) {
|
2006-05-01 20:35:08 +00:00
|
|
|
|
$this->mOutput->setNewSection( true );
|
2008-02-20 08:53:12 +00:00
|
|
|
|
}
|
2006-05-01 20:35:08 +00:00
|
|
|
|
|
2009-02-19 22:14:59 +00:00
|
|
|
|
# Allow user to remove the "new section"
|
|
|
|
|
|
# link via __NONEWSECTIONLINK__
|
|
|
|
|
|
if ( isset( $this->mDoubleUnderscores['nonewsectionlink'] ) ) {
|
2021-09-29 20:48:33 +00:00
|
|
|
|
$this->mOutput->setHideNewSection( true );
|
2009-02-19 22:14:59 +00:00
|
|
|
|
}
|
|
|
|
|
|
|
2006-05-23 10:01:45 +00:00
|
|
|
|
# if the string __FORCETOC__ (not case-sensitive) occurs in the HTML,
|
|
|
|
|
|
# override above conditions and always show TOC above first header
|
2008-02-20 08:53:12 +00:00
|
|
|
|
if ( isset( $this->mDoubleUnderscores['forcetoc'] ) ) {
|
2006-05-23 07:19:01 +00:00
|
|
|
|
$this->mShowToc = true;
|
2006-05-23 10:01:45 +00:00
|
|
|
|
$enoughToc = true;
|
2004-04-03 04:38:09 +00:00
|
|
|
|
}
|
2004-07-12 19:49:20 +00:00
|
|
|
|
|
2004-02-26 13:37:26 +00:00
|
|
|
|
# headline counter
|
2004-03-21 11:28:44 +00:00
|
|
|
|
$headlineCount = 0;
|
2007-04-30 19:51:56 +00:00
|
|
|
|
$numVisible = 0;
|
2004-02-26 13:37:26 +00:00
|
|
|
|
|
|
|
|
|
|
# Ugh .. the TOC should have neat indentation levels which can be
|
|
|
|
|
|
# passed to the skin functions. These are determined here
|
2004-06-08 18:11:28 +00:00
|
|
|
|
$toc = '';
|
|
|
|
|
|
$full = '';
|
2016-02-17 19:57:37 +00:00
|
|
|
|
$head = [];
|
|
|
|
|
|
$sublevelCount = [];
|
|
|
|
|
|
$levelCount = [];
|
2004-03-29 14:48:07 +00:00
|
|
|
|
$level = 0;
|
|
|
|
|
|
$prevlevel = 0;
|
2005-01-15 23:21:52 +00:00
|
|
|
|
$toclevel = 0;
|
|
|
|
|
|
$prevtoclevel = 0;
|
2015-05-26 20:48:33 +00:00
|
|
|
|
$markerRegex = self::MARKER_PREFIX . "-h-(\d+)-" . self::MARKER_SUFFIX;
|
2019-10-18 19:50:58 +00:00
|
|
|
|
$baseTitleText = $this->getTitle()->getPrefixedDBkey();
|
2009-06-20 18:25:30 +00:00
|
|
|
|
$oldType = $this->mOutputType;
|
|
|
|
|
|
$this->setOutputType( self::OT_WIKI );
|
|
|
|
|
|
$frame = $this->getPreprocessor()->newFrame();
|
|
|
|
|
|
$root = $this->preprocessToDom( $origText );
|
|
|
|
|
|
$node = $root->getFirstChild();
|
|
|
|
|
|
$byteOffset = 0;
|
2016-02-17 19:57:37 +00:00
|
|
|
|
$tocraw = [];
|
|
|
|
|
|
$refers = [];
|
2005-01-15 23:21:52 +00:00
|
|
|
|
|
2016-02-17 19:57:37 +00:00
|
|
|
|
$headlines = $numMatches !== false ? $matches[3] : [];
|
2015-05-12 18:43:59 +00:00
|
|
|
|
|
2022-04-26 15:48:03 +00:00
|
|
|
|
$maxTocLevel = $this->svcOptions->get( MainConfigNames::MaxTocLevel );
|
2015-05-12 18:43:59 +00:00
|
|
|
|
foreach ( $headlines as $headline ) {
|
2007-11-20 10:55:08 +00:00
|
|
|
|
$isTemplate = false;
|
|
|
|
|
|
$titleText = false;
|
|
|
|
|
|
$sectionIndex = false;
|
2005-01-15 23:21:52 +00:00
|
|
|
|
$numbering = '';
|
2016-02-17 19:57:37 +00:00
|
|
|
|
$markerMatches = [];
|
2013-02-09 22:03:53 +00:00
|
|
|
|
if ( preg_match( "/^$markerRegex/", $headline, $markerMatches ) ) {
|
2022-02-26 16:28:48 +00:00
|
|
|
|
$serial = (int)$markerMatches[1];
|
2022-10-21 04:32:38 +00:00
|
|
|
|
[ $titleText, $sectionIndex ] = $this->mHeadings[$serial];
|
2010-03-30 21:20:05 +00:00
|
|
|
|
$isTemplate = ( $titleText != $baseTitleText );
|
2013-03-06 20:06:31 +00:00
|
|
|
|
$headline = preg_replace( "/^$markerRegex\\s*/", "", $headline );
|
2004-09-21 04:33:51 +00:00
|
|
|
|
}
|
2004-09-21 08:31:25 +00:00
|
|
|
|
|
2010-03-30 21:20:05 +00:00
|
|
|
|
if ( $toclevel ) {
|
2004-03-21 11:28:44 +00:00
|
|
|
|
$prevlevel = $level;
|
|
|
|
|
|
}
|
2022-03-05 20:05:01 +00:00
|
|
|
|
$level = (int)$matches[1][$headlineCount];
|
2005-07-03 07:15:53 +00:00
|
|
|
|
|
2009-06-25 11:05:22 +00:00
|
|
|
|
if ( $level > $prevlevel ) {
|
|
|
|
|
|
# Increase TOC level
|
|
|
|
|
|
$toclevel++;
|
|
|
|
|
|
$sublevelCount[$toclevel] = 0;
|
2018-08-08 14:49:46 +00:00
|
|
|
|
if ( $toclevel < $maxTocLevel ) {
|
2009-06-25 11:05:22 +00:00
|
|
|
|
$prevtoclevel = $toclevel;
|
2011-04-03 11:44:11 +00:00
|
|
|
|
$toc .= Linker::tocIndent();
|
2009-06-25 11:05:22 +00:00
|
|
|
|
$numVisible++;
|
|
|
|
|
|
}
|
2010-03-30 21:20:05 +00:00
|
|
|
|
} elseif ( $level < $prevlevel && $toclevel > 1 ) {
|
2009-06-25 11:05:22 +00:00
|
|
|
|
# Decrease TOC level, find level to jump to
|
|
|
|
|
|
|
2010-03-30 21:20:05 +00:00
|
|
|
|
for ( $i = $toclevel; $i > 0; $i-- ) {
|
2019-12-07 18:32:45 +00:00
|
|
|
|
// @phan-suppress-next-line PhanTypeInvalidDimOffset
|
2009-06-25 11:05:22 +00:00
|
|
|
|
if ( $levelCount[$i] == $level ) {
|
|
|
|
|
|
# Found last matching level
|
|
|
|
|
|
$toclevel = $i;
|
|
|
|
|
|
break;
|
2010-03-30 21:20:05 +00:00
|
|
|
|
} elseif ( $levelCount[$i] < $level ) {
|
2019-12-07 18:32:45 +00:00
|
|
|
|
// @phan-suppress-previous-line PhanTypeInvalidDimOffset
|
2009-06-25 11:05:22 +00:00
|
|
|
|
# Found first matching level below current level
|
|
|
|
|
|
$toclevel = $i + 1;
|
|
|
|
|
|
break;
|
2006-04-25 19:43:46 +00:00
|
|
|
|
}
|
2005-01-15 23:21:52 +00:00
|
|
|
|
}
|
2010-03-30 21:20:05 +00:00
|
|
|
|
if ( $i == 0 ) {
|
|
|
|
|
|
$toclevel = 1;
|
|
|
|
|
|
}
|
2018-08-08 14:49:46 +00:00
|
|
|
|
if ( $toclevel < $maxTocLevel ) {
|
|
|
|
|
|
if ( $prevtoclevel < $maxTocLevel ) {
|
2009-06-25 11:05:22 +00:00
|
|
|
|
# Unindent only if the previous toc level was shown :p
|
2011-04-03 11:44:11 +00:00
|
|
|
|
$toc .= Linker::tocUnindent( $prevtoclevel - $toclevel );
|
2009-06-25 11:05:22 +00:00
|
|
|
|
$prevtoclevel = $toclevel;
|
|
|
|
|
|
} else {
|
2011-04-03 11:44:11 +00:00
|
|
|
|
$toc .= Linker::tocLineEnd();
|
2006-04-25 19:43:46 +00:00
|
|
|
|
}
|
2005-01-15 23:21:52 +00:00
|
|
|
|
}
|
2010-03-30 21:20:05 +00:00
|
|
|
|
} else {
|
2009-06-25 11:05:22 +00:00
|
|
|
|
# No change in level, end TOC line
|
2018-08-08 14:49:46 +00:00
|
|
|
|
if ( $toclevel < $maxTocLevel ) {
|
2011-04-03 11:44:11 +00:00
|
|
|
|
$toc .= Linker::tocLineEnd();
|
2009-06-25 11:05:22 +00:00
|
|
|
|
}
|
|
|
|
|
|
}
|
2005-07-03 07:15:53 +00:00
|
|
|
|
|
2009-06-25 11:05:22 +00:00
|
|
|
|
$levelCount[$toclevel] = $level;
|
2005-01-15 23:21:52 +00:00
|
|
|
|
|
2009-06-25 11:05:22 +00:00
|
|
|
|
# count number of headlines for each level
|
2013-03-17 14:53:37 +00:00
|
|
|
|
$sublevelCount[$toclevel]++;
|
2009-06-25 11:05:22 +00:00
|
|
|
|
$dot = 0;
|
2013-04-20 15:38:24 +00:00
|
|
|
|
for ( $i = 1; $i <= $toclevel; $i++ ) {
|
2010-03-30 21:20:05 +00:00
|
|
|
|
if ( !empty( $sublevelCount[$i] ) ) {
|
|
|
|
|
|
if ( $dot ) {
|
2009-06-25 11:05:22 +00:00
|
|
|
|
$numbering .= '.';
|
2004-02-26 13:37:26 +00:00
|
|
|
|
}
|
2012-03-05 05:53:12 +00:00
|
|
|
|
$numbering .= $this->getTargetLanguage()->formatNum( $sublevelCount[$i] );
|
2009-06-25 11:05:22 +00:00
|
|
|
|
$dot = 1;
|
2004-02-26 13:37:26 +00:00
|
|
|
|
}
|
|
|
|
|
|
}
|
|
|
|
|
|
|
2009-01-10 17:16:21 +00:00
|
|
|
|
# The safe header is a version of the header text safe to use for links
|
|
|
|
|
|
|
|
|
|
|
|
# Replace link placeholders with the link text.
|
|
|
|
|
|
# <!--LINK number-->
|
|
|
|
|
|
# turns into
|
|
|
|
|
|
# link text with suffix
|
2012-03-20 04:39:09 +00:00
|
|
|
|
# Do this before unstrip since link text can contain strip markers
|
2019-11-04 19:23:34 +00:00
|
|
|
|
$safeHeadline = $this->replaceLinkHoldersText( $headline );
|
2012-03-20 04:39:09 +00:00
|
|
|
|
|
|
|
|
|
|
# Avoid insertion of weird stuff like <math> by expanding the relevant sections
|
|
|
|
|
|
$safeHeadline = $this->mStripState->unstripBoth( $safeHeadline );
|
2009-01-10 17:16:21 +00:00
|
|
|
|
|
2018-07-02 15:17:06 +00:00
|
|
|
|
# Remove any <style> or <script> tags (T198618)
|
|
|
|
|
|
$safeHeadline = preg_replace(
|
|
|
|
|
|
'#<(style|script)(?: [^>]*[^>/])?>.*?</\1>#is',
|
|
|
|
|
|
'',
|
|
|
|
|
|
$safeHeadline
|
|
|
|
|
|
);
|
|
|
|
|
|
|
2012-01-21 16:27:27 +00:00
|
|
|
|
# Strip out HTML (first regex removes any tag not allowed)
|
2012-09-03 08:28:08 +00:00
|
|
|
|
# Allowed tags are:
|
2016-12-11 22:45:07 +00:00
|
|
|
|
# * <sup> and <sub> (T10393)
|
|
|
|
|
|
# * <i> (T28375)
|
2012-09-03 08:28:08 +00:00
|
|
|
|
# * <b> (r105284)
|
2016-12-11 22:45:07 +00:00
|
|
|
|
# * <bdi> (T74884)
|
|
|
|
|
|
# * <span dir="rtl"> and <span dir="ltr"> (T37167)
|
2016-09-21 19:59:56 +00:00
|
|
|
|
# * <s> and <strike> (T35715)
|
2022-03-23 20:42:07 +00:00
|
|
|
|
# * <q> (T251672)
|
2012-09-03 08:28:08 +00:00
|
|
|
|
# We strip any parameter from accepted tags (second regex), except dir="rtl|ltr" from <span>,
|
|
|
|
|
|
# to allow setting directionality in toc items.
|
2009-01-10 17:16:21 +00:00
|
|
|
|
$tocline = preg_replace(
|
2016-02-17 19:57:37 +00:00
|
|
|
|
[
|
2022-03-23 20:42:07 +00:00
|
|
|
|
'#<(?!/?(span|sup|sub|bdi|i|b|s|strike|q)(?: [^>]*)?>).*?>#',
|
2016-09-21 19:59:56 +00:00
|
|
|
|
'#<(/?(?:span(?: dir="(?:rtl|ltr)")?|sup|sub|bdi|i|b|s|strike))(?: .*?)?>#'
|
2016-02-17 19:57:37 +00:00
|
|
|
|
],
|
|
|
|
|
|
[ '', '<$1>' ],
|
2009-01-10 17:16:21 +00:00
|
|
|
|
$safeHeadline
|
|
|
|
|
|
);
|
2015-04-15 17:44:28 +00:00
|
|
|
|
|
|
|
|
|
|
# Strip '<span></span>', which is the result from the above if
|
|
|
|
|
|
# <span id="foo"></span> is used to produce an additional anchor
|
|
|
|
|
|
# for a section.
|
|
|
|
|
|
$tocline = str_replace( '<span></span>', '', $tocline );
|
|
|
|
|
|
|
2009-01-10 17:16:21 +00:00
|
|
|
|
$tocline = trim( $tocline );
|
|
|
|
|
|
|
|
|
|
|
|
# For the anchor, strip out HTML-y stuff period
|
2015-01-27 06:02:56 +00:00
|
|
|
|
$safeHeadline = preg_replace( '/<.*?>/', '', $safeHeadline );
|
2010-06-21 01:17:36 +00:00
|
|
|
|
$safeHeadline = Sanitizer::normalizeSectionNameWhitespace( $safeHeadline );
|
2009-01-10 17:16:21 +00:00
|
|
|
|
|
|
|
|
|
|
# Save headline for section edit hint before it's escaped
|
|
|
|
|
|
$headlineHint = $safeHeadline;
|
|
|
|
|
|
|
2016-05-02 05:14:45 +00:00
|
|
|
|
# Decode HTML entities
|
|
|
|
|
|
$safeHeadline = Sanitizer::decodeCharReferences( $safeHeadline );
|
2017-11-03 02:35:11 +00:00
|
|
|
|
|
2017-11-22 23:06:21 +00:00
|
|
|
|
$safeHeadline = self::normalizeSectionName( $safeHeadline );
|
2017-11-03 02:35:11 +00:00
|
|
|
|
|
2017-06-30 00:13:12 +00:00
|
|
|
|
$fallbackHeadline = Sanitizer::escapeIdForAttribute( $safeHeadline, Sanitizer::ID_FALLBACK );
|
|
|
|
|
|
$linkAnchor = Sanitizer::escapeIdForLink( $safeHeadline );
|
|
|
|
|
|
$safeHeadline = Sanitizer::escapeIdForAttribute( $safeHeadline, Sanitizer::ID_PRIMARY );
|
|
|
|
|
|
if ( $fallbackHeadline === $safeHeadline ) {
|
|
|
|
|
|
# No reason to have both (in fact, we can't)
|
|
|
|
|
|
$fallbackHeadline = false;
|
2009-01-10 17:16:21 +00:00
|
|
|
|
}
|
2009-01-05 15:59:46 +00:00
|
|
|
|
|
2017-06-30 00:13:12 +00:00
|
|
|
|
# HTML IDs must be case-insensitively unique for IE compatibility (T12721).
|
2009-01-10 17:16:21 +00:00
|
|
|
|
$arrayKey = strtolower( $safeHeadline );
|
2017-06-30 00:13:12 +00:00
|
|
|
|
if ( $fallbackHeadline === false ) {
|
|
|
|
|
|
$fallbackArrayKey = false;
|
2009-01-05 15:59:46 +00:00
|
|
|
|
} else {
|
2017-06-30 00:13:12 +00:00
|
|
|
|
$fallbackArrayKey = strtolower( $fallbackHeadline );
|
2009-01-05 15:59:46 +00:00
|
|
|
|
}
|
2008-03-13 18:30:50 +00:00
|
|
|
|
|
2014-12-28 20:07:49 +00:00
|
|
|
|
# Create the anchor for linking from the TOC to the section
|
|
|
|
|
|
$anchor = $safeHeadline;
|
2017-06-30 00:13:12 +00:00
|
|
|
|
$fallbackAnchor = $fallbackHeadline;
|
2009-01-05 15:59:46 +00:00
|
|
|
|
if ( isset( $refers[$arrayKey] ) ) {
|
2022-07-29 00:45:09 +00:00
|
|
|
|
for ( $i = 2; isset( $refers["{$arrayKey}_$i"] ); ++$i );
|
2014-12-28 20:07:49 +00:00
|
|
|
|
$anchor .= "_$i";
|
2017-06-30 00:13:12 +00:00
|
|
|
|
$linkAnchor .= "_$i";
|
2022-07-29 00:45:09 +00:00
|
|
|
|
$refers["{$arrayKey}_$i"] = true;
|
2009-01-05 15:59:46 +00:00
|
|
|
|
} else {
|
2014-12-28 20:07:49 +00:00
|
|
|
|
$refers[$arrayKey] = true;
|
2009-01-05 15:59:46 +00:00
|
|
|
|
}
|
2017-06-30 00:13:12 +00:00
|
|
|
|
if ( $fallbackHeadline !== false && isset( $refers[$fallbackArrayKey] ) ) {
|
2022-07-29 00:45:09 +00:00
|
|
|
|
for ( $i = 2; isset( $refers["{$fallbackArrayKey}_$i"] ); ++$i );
|
2017-06-30 00:13:12 +00:00
|
|
|
|
$fallbackAnchor .= "_$i";
|
2022-07-29 00:45:09 +00:00
|
|
|
|
$refers["{$fallbackArrayKey}_$i"] = true;
|
2009-01-05 15:59:46 +00:00
|
|
|
|
} else {
|
2017-06-30 00:13:12 +00:00
|
|
|
|
$refers[$fallbackArrayKey] = true;
|
2009-01-05 15:59:46 +00:00
|
|
|
|
}
|
2004-02-26 13:37:26 +00:00
|
|
|
|
|
2018-08-08 14:49:46 +00:00
|
|
|
|
if ( $enoughToc && ( !isset( $maxTocLevel ) || $toclevel < $maxTocLevel ) ) {
|
2021-02-25 20:19:54 +00:00
|
|
|
|
$toc .= Linker::tocLine(
|
|
|
|
|
|
$linkAnchor,
|
|
|
|
|
|
$tocline,
|
|
|
|
|
|
$numbering,
|
|
|
|
|
|
$toclevel,
|
|
|
|
|
|
( $isTemplate ? false : $sectionIndex )
|
|
|
|
|
|
);
|
2004-02-26 13:37:26 +00:00
|
|
|
|
}
|
2010-01-07 04:13:14 +00:00
|
|
|
|
|
2009-06-25 11:05:22 +00:00
|
|
|
|
# Add the section to the section tree
|
|
|
|
|
|
# Find the DOM node for this header
|
2013-10-09 15:03:40 +00:00
|
|
|
|
$noOffset = ( $isTemplate || $sectionIndex === false );
|
|
|
|
|
|
while ( $node && !$noOffset ) {
|
2009-06-25 11:05:22 +00:00
|
|
|
|
if ( $node->getName() === 'h' ) {
|
|
|
|
|
|
$bits = $node->splitHeading();
|
2010-10-18 23:50:33 +00:00
|
|
|
|
if ( $bits['i'] == $sectionIndex ) {
|
2009-06-25 11:05:22 +00:00
|
|
|
|
break;
|
2010-10-18 23:50:33 +00:00
|
|
|
|
}
|
2009-06-25 11:05:22 +00:00
|
|
|
|
}
|
2021-02-25 20:19:54 +00:00
|
|
|
|
$byteOffset += mb_strlen(
|
|
|
|
|
|
$this->mStripState->unstripBoth(
|
|
|
|
|
|
$frame->expand( $node, PPFrame::RECOVER_ORIG )
|
|
|
|
|
|
)
|
|
|
|
|
|
);
|
2009-06-25 11:05:22 +00:00
|
|
|
|
$node = $node->getNextSibling();
|
|
|
|
|
|
}
|
2016-02-17 19:57:37 +00:00
|
|
|
|
$tocraw[] = [
|
2009-06-25 11:05:22 +00:00
|
|
|
|
'toclevel' => $toclevel,
|
2022-03-18 16:54:51 +00:00
|
|
|
|
// cast $level to string in order to keep b/c for the parse api
|
|
|
|
|
|
'level' => (string)$level,
|
2009-06-25 11:05:22 +00:00
|
|
|
|
'line' => $tocline,
|
|
|
|
|
|
'number' => $numbering,
|
2010-03-30 21:20:05 +00:00
|
|
|
|
'index' => ( $isTemplate ? 'T-' : '' ) . $sectionIndex,
|
2009-06-25 11:05:22 +00:00
|
|
|
|
'fromtitle' => $titleText,
|
2013-10-09 15:03:40 +00:00
|
|
|
|
'byteoffset' => ( $noOffset ? null : $byteOffset ),
|
2009-06-25 11:05:22 +00:00
|
|
|
|
'anchor' => $anchor,
|
2022-09-09 20:18:22 +00:00
|
|
|
|
'linkAnchor' => $linkAnchor,
|
2016-02-17 19:57:37 +00:00
|
|
|
|
];
|
2010-01-07 04:13:14 +00:00
|
|
|
|
|
2006-10-18 06:53:19 +00:00
|
|
|
|
# give headline the correct <h#> tag
|
2011-10-25 22:18:33 +00:00
|
|
|
|
if ( $maybeShowEditLink && $sectionIndex !== false ) {
|
2011-01-04 11:31:06 +00:00
|
|
|
|
// Output edit section links as markers with styles that can be customized by skins
|
|
|
|
|
|
if ( $isTemplate ) {
|
|
|
|
|
|
# Put a T flag in the section identifier, to indicate to extractSections()
|
|
|
|
|
|
# that sections inside <includeonly> should be counted.
|
2014-07-04 20:19:48 +00:00
|
|
|
|
$editsectionPage = $titleText;
|
|
|
|
|
|
$editsectionSection = "T-$sectionIndex";
|
2008-01-05 12:39:12 +00:00
|
|
|
|
} else {
|
2019-10-18 19:50:58 +00:00
|
|
|
|
$editsectionPage = $this->getTitle()->getPrefixedText();
|
2014-07-04 20:19:48 +00:00
|
|
|
|
$editsectionSection = $sectionIndex;
|
2011-01-04 11:31:06 +00:00
|
|
|
|
}
|
2022-04-19 19:25:45 +00:00
|
|
|
|
$editsectionContent = $headlineHint;
|
2014-05-10 23:03:45 +00:00
|
|
|
|
// We use a bit of pseudo-xml for editsection markers. The
|
|
|
|
|
|
// language converter is run later on. Using a UNIQ style marker
|
|
|
|
|
|
// leads to the converter screwing up the tokens when it
|
|
|
|
|
|
// converts stuff. And trying to insert strip tags fails too. At
|
|
|
|
|
|
// this point all real inputted tags have already been escaped,
|
|
|
|
|
|
// so we don't have to worry about a user trying to input one of
|
|
|
|
|
|
// these markers directly. We use a page and section attribute
|
|
|
|
|
|
// to stop the language converter from converting these
|
|
|
|
|
|
// important bits of data, but put the headline hint inside a
|
|
|
|
|
|
// content block because the language converter is supposed to
|
2011-01-04 11:31:06 +00:00
|
|
|
|
// be able to convert that piece of data.
|
2014-07-04 20:19:48 +00:00
|
|
|
|
// Gets replaced with html in ParserOutput::getText
|
2022-01-25 05:23:23 +00:00
|
|
|
|
$editlink = '<mw:editsection page="' . htmlspecialchars( $editsectionPage, ENT_COMPAT );
|
|
|
|
|
|
$editlink .= '" section="' . htmlspecialchars( $editsectionSection, ENT_COMPAT ) . '"';
|
2022-04-19 19:25:45 +00:00
|
|
|
|
$editlink .= '>' . $editsectionContent . '</mw:editsection>';
|
2007-01-08 02:11:45 +00:00
|
|
|
|
} else {
|
|
|
|
|
|
$editlink = '';
|
2004-02-26 13:37:26 +00:00
|
|
|
|
}
|
2021-02-25 20:19:54 +00:00
|
|
|
|
$head[$headlineCount] = Linker::makeHeadline(
|
|
|
|
|
|
$level,
|
|
|
|
|
|
$matches['attrib'][$headlineCount],
|
|
|
|
|
|
$anchor,
|
|
|
|
|
|
$headline,
|
|
|
|
|
|
$editlink,
|
|
|
|
|
|
$fallbackAnchor
|
|
|
|
|
|
);
|
2004-04-12 23:59:37 +00:00
|
|
|
|
|
2004-03-21 11:28:44 +00:00
|
|
|
|
$headlineCount++;
|
2004-04-12 23:59:37 +00:00
|
|
|
|
}
|
2004-02-26 13:37:26 +00:00
|
|
|
|
|
2009-06-20 18:25:30 +00:00
|
|
|
|
$this->setOutputType( $oldType );
|
2007-12-27 20:14:07 +00:00
|
|
|
|
|
2022-05-27 16:20:35 +00:00
|
|
|
|
# Never ever show TOC if no headers (or suppressed)
|
|
|
|
|
|
$suppressToc = $this->mOptions->getSuppressTOC();
|
|
|
|
|
|
if ( $numVisible < 1 || $suppressToc ) {
|
2007-04-30 19:51:56 +00:00
|
|
|
|
$enoughToc = false;
|
|
|
|
|
|
}
|
2008-04-14 07:45:50 +00:00
|
|
|
|
|
2010-03-30 21:20:05 +00:00
|
|
|
|
if ( $enoughToc ) {
|
2018-08-08 14:49:46 +00:00
|
|
|
|
if ( $prevtoclevel > 0 && $prevtoclevel < $maxTocLevel ) {
|
2011-04-03 11:44:11 +00:00
|
|
|
|
$toc .= Linker::tocUnindent( $prevtoclevel - 1 );
|
2006-04-25 19:43:46 +00:00
|
|
|
|
}
|
2011-10-19 15:30:02 +00:00
|
|
|
|
$toc = Linker::tocList( $toc, $this->mOptions->getUserLangObj() );
|
2009-07-14 13:35:07 +00:00
|
|
|
|
$this->mOutput->setTOCHTML( $toc );
|
2022-07-27 22:07:09 +00:00
|
|
|
|
// Record the fact that the TOC should be shown. T294950
|
|
|
|
|
|
// (We shouldn't be looking at ::getTOCHTML() for this because
|
|
|
|
|
|
// eventually that will be replaced (T293513) and
|
|
|
|
|
|
// ::getSections() will contain sections even if there aren't
|
|
|
|
|
|
// $enoughToc to show.)
|
|
|
|
|
|
$this->mOutput->setOutputFlag( ParserOutputFlags::SHOW_TOC );
|
2004-02-26 13:37:26 +00:00
|
|
|
|
}
|
2010-01-07 04:13:14 +00:00
|
|
|
|
|
2022-05-27 16:20:35 +00:00
|
|
|
|
if ( $isMain && !$suppressToc ) {
|
|
|
|
|
|
// We generally output the section information via the API
|
|
|
|
|
|
// even if there isn't "enough" of a ToC to merit showing
|
|
|
|
|
|
// it -- but the "suppress TOC" parser option is set when
|
|
|
|
|
|
// any sections that might be found aren't "really there"
|
|
|
|
|
|
// (i.e., JavaScript content that might have spurious === or
|
|
|
|
|
|
// <h2>: T307691) so we will *not* set section information
|
|
|
|
|
|
// in that case.
|
2009-06-21 12:52:24 +00:00
|
|
|
|
$this->mOutput->setSections( $tocraw );
|
|
|
|
|
|
}
|
2004-02-26 13:37:26 +00:00
|
|
|
|
|
2004-03-21 11:28:44 +00:00
|
|
|
|
# split up and insert constructed headlines
|
2015-01-27 06:02:56 +00:00
|
|
|
|
$blocks = preg_split( '/<H[1-6].*?>[\s\S]*?<\/H[1-6]>/i', $text );
|
2004-03-21 11:28:44 +00:00
|
|
|
|
$i = 0;
|
2011-07-24 21:36:04 +00:00
|
|
|
|
|
2011-07-18 23:23:14 +00:00
|
|
|
|
// build an array of document sections
|
2016-02-17 19:57:37 +00:00
|
|
|
|
$sections = [];
|
2011-07-24 21:36:04 +00:00
|
|
|
|
foreach ( $blocks as $block ) {
|
2011-07-18 23:23:14 +00:00
|
|
|
|
// $head is zero-based, sections aren't.
|
|
|
|
|
|
if ( empty( $head[$i - 1] ) ) {
|
|
|
|
|
|
$sections[$i] = $block;
|
|
|
|
|
|
} else {
|
|
|
|
|
|
$sections[$i] = $head[$i - 1] . $block;
|
2004-02-26 13:37:26 +00:00
|
|
|
|
}
|
|
|
|
|
|
|
|
|
|
|
|
$i++;
|
|
|
|
|
|
}
|
2011-07-18 23:23:14 +00:00
|
|
|
|
|
|
|
|
|
|
if ( $enoughToc && $isMain && !$this->mForceTocPosition ) {
|
|
|
|
|
|
// append the TOC at the beginning
|
|
|
|
|
|
// Top anchor now in skin
|
2022-03-28 20:10:05 +00:00
|
|
|
|
// @phan-suppress-next-line PhanTypePossiblyInvalidDimOffset At least one element when enoughToc is true
|
2021-09-15 01:00:06 +00:00
|
|
|
|
$sections[0] .= self::TOC_PLACEHOLDER . "\n";
|
2011-07-18 23:23:14 +00:00
|
|
|
|
}
|
2011-07-24 21:36:04 +00:00
|
|
|
|
|
2016-03-08 08:13:12 +00:00
|
|
|
|
$full .= implode( '', $sections );
|
2011-07-24 21:36:04 +00:00
|
|
|
|
|
2021-09-15 01:00:06 +00:00
|
|
|
|
return $full;
|
2004-02-26 13:37:26 +00:00
|
|
|
|
}
|
2010-01-07 04:13:14 +00:00
|
|
|
|
|
2004-09-21 05:49:12 +00:00
|
|
|
|
/**
|
2012-07-10 12:48:06 +00:00
|
|
|
|
* Transform wiki markup when saving a page by doing "\r\n" -> "\n"
|
2014-11-08 19:07:19 +00:00
|
|
|
|
* conversion, substituting signatures, {{subst:}} templates, etc.
|
2004-09-21 05:49:12 +00:00
|
|
|
|
*
|
2014-04-08 15:29:17 +00:00
|
|
|
|
* @param string $text The text to transform
|
2021-04-25 17:29:33 +00:00
|
|
|
|
* @param PageReference $page the current article
|
2021-09-23 01:08:02 +00:00
|
|
|
|
* @param UserIdentity $user the current user
|
2014-04-08 15:29:17 +00:00
|
|
|
|
* @param ParserOptions $options Parsing options
|
|
|
|
|
|
* @param bool $clearState Whether to clear the parser state first
|
|
|
|
|
|
* @return string The altered wiki markup
|
2021-02-19 22:49:35 +00:00
|
|
|
|
* @since 1.3
|
2004-09-21 05:49:12 +00:00
|
|
|
|
*/
|
2021-07-11 19:11:37 +00:00
|
|
|
|
public function preSaveTransform(
|
|
|
|
|
|
$text,
|
|
|
|
|
|
PageReference $page,
|
|
|
|
|
|
UserIdentity $user,
|
|
|
|
|
|
ParserOptions $options,
|
|
|
|
|
|
$clearState = true
|
2014-05-10 23:03:45 +00:00
|
|
|
|
) {
|
2013-10-27 20:18:06 +00:00
|
|
|
|
if ( $clearState ) {
|
|
|
|
|
|
$magicScopeVariable = $this->lock();
|
|
|
|
|
|
}
|
2021-04-25 17:29:33 +00:00
|
|
|
|
$this->startParse( $page, $options, self::OT_WIKI, $clearState );
|
2010-12-10 18:17:20 +00:00
|
|
|
|
$this->setUser( $user );
|
2004-04-12 23:59:37 +00:00
|
|
|
|
|
2017-02-27 21:27:15 +00:00
|
|
|
|
// Strip U+0000 NULL (T159174)
|
|
|
|
|
|
$text = str_replace( "\000", '', $text );
|
|
|
|
|
|
|
2021-04-14 18:49:56 +00:00
|
|
|
|
// We still normalize line endings (including trimming trailing whitespace) for
|
|
|
|
|
|
// backwards-compatibility with other code that just calls PST, but this should already
|
2016-08-16 21:58:15 +00:00
|
|
|
|
// be handled in TextContent subclasses
|
|
|
|
|
|
$text = TextContent::normalizeLineEndings( $text );
|
|
|
|
|
|
|
2013-04-20 15:38:24 +00:00
|
|
|
|
if ( $options->getPreSaveTransform() ) {
|
2011-01-16 21:12:26 +00:00
|
|
|
|
$text = $this->pstPass2( $text, $user );
|
|
|
|
|
|
}
|
2007-11-20 10:55:08 +00:00
|
|
|
|
$text = $this->mStripState->unstripBoth( $text );
|
2010-12-10 18:17:20 +00:00
|
|
|
|
|
2021-04-14 18:49:56 +00:00
|
|
|
|
// Trim trailing whitespace again, because the previous steps can introduce it.
|
|
|
|
|
|
$text = rtrim( $text );
|
|
|
|
|
|
|
Hooks::run() call site migration
Migrate all callers of Hooks::run() to use the new
HookContainer/HookRunner system.
General principles:
* Use DI if it is already used. We're not changing the way state is
managed in this patch.
* HookContainer is always injected, not HookRunner. HookContainer
is a service, it's a more generic interface, it is the only
thing that provides isRegistered() which is needed in some cases,
and a HookRunner can be efficiently constructed from it
(confirmed by benchmark). Because HookContainer is needed
for object construction, it is also needed by all factories.
* "Ask your friendly local base class". Big hierarchies like
SpecialPage and ApiBase have getHookContainer() and getHookRunner()
methods in the base class, and classes that extend that base class
are not expected to know or care where the base class gets its
HookContainer from.
* ProtectedHookAccessorTrait provides protected getHookContainer() and
getHookRunner() methods, getting them from the global service
container. The point of this is to ease migration to DI by ensuring
that call sites ask their local friendly base class rather than
getting a HookRunner from the service container directly.
* Private $this->hookRunner. In some smaller classes where accessor
methods did not seem warranted, there is a private HookRunner property
which is accessed directly. Very rarely (two cases), there is a
protected property, for consistency with code that conventionally
assumes protected=private, but in cases where the class might actually
be overridden, a protected accessor is preferred over a protected
property.
* The last resort: Hooks::runner(). Mostly for static, file-scope and
global code. In a few cases it was used for objects with broken
construction schemes, out of horror or laziness.
Constructors with new required arguments:
* AuthManager
* BadFileLookup
* BlockManager
* ClassicInterwikiLookup
* ContentHandlerFactory
* ContentSecurityPolicy
* DefaultOptionsManager
* DerivedPageDataUpdater
* FullSearchResultWidget
* HtmlCacheUpdater
* LanguageFactory
* LanguageNameUtils
* LinkRenderer
* LinkRendererFactory
* LocalisationCache
* MagicWordFactory
* MessageCache
* NamespaceInfo
* PageEditStash
* PageHandlerFactory
* PageUpdater
* ParserFactory
* PermissionManager
* RevisionStore
* RevisionStoreFactory
* SearchEngineConfig
* SearchEngineFactory
* SearchFormWidget
* SearchNearMatcher
* SessionBackend
* SpecialPageFactory
* UserNameUtils
* UserOptionsManager
* WatchedItemQueryService
* WatchedItemStore
Constructors with new optional arguments:
* DefaultPreferencesFactory
* Language
* LinkHolderArray
* MovePage
* Parser
* ParserCache
* PasswordReset
* Router
setHookContainer() now required after construction:
* AuthenticationProvider
* ResourceLoaderModule
* SearchEngine
Change-Id: Id442b0dbe43aba84bd5cf801d86dedc768b082c7
2020-03-19 02:42:09 +00:00
|
|
|
|
$this->hookRunner->onParserPreSaveTransformComplete( $this, $text );
|
2020-02-10 02:20:34 +00:00
|
|
|
|
|
2016-02-17 19:57:37 +00:00
|
|
|
|
$this->setUser( null ); # Reset
|
2010-12-10 18:17:20 +00:00
|
|
|
|
|
2004-03-06 01:49:16 +00:00
|
|
|
|
return $text;
|
|
|
|
|
|
}
|
|
|
|
|
|
|
2004-09-21 05:49:12 +00:00
|
|
|
|
/**
|
|
|
|
|
|
* Pre-save transform helper function
|
2011-08-05 00:33:03 +00:00
|
|
|
|
*
|
2014-04-21 23:38:39 +00:00
|
|
|
|
* @param string $text
|
2021-03-16 18:31:27 +00:00
|
|
|
|
* @param UserIdentity $user
|
2011-08-05 00:33:03 +00:00
|
|
|
|
*
|
|
|
|
|
|
* @return string
|
2004-09-21 05:49:12 +00:00
|
|
|
|
*/
|
2021-03-16 18:31:27 +00:00
|
|
|
|
private function pstPass2( $text, UserIdentity $user ) {
|
2018-07-26 12:37:13 +00:00
|
|
|
|
# Note: This is the timestamp saved as hardcoded wikitext to the database, we use
|
2018-08-03 08:25:15 +00:00
|
|
|
|
# $this->contLang here in order to give everyone the same signature and use the default one
|
|
|
|
|
|
# rather than the one selected in each user's preferences. (see also T14815)
|
2008-02-09 20:39:32 +00:00
|
|
|
|
$ts = $this->mOptions->getTimestamp();
|
2013-07-06 19:59:35 +00:00
|
|
|
|
$timestamp = MWTimestamp::getLocalInstance( $ts );
|
|
|
|
|
|
$ts = $timestamp->format( 'YmdHis' );
|
2014-12-26 01:57:14 +00:00
|
|
|
|
$tzMsg = $timestamp->getTimezoneMessage()->inContentLanguage()->text();
|
2010-01-08 01:48:53 +00:00
|
|
|
|
|
2018-08-03 08:25:15 +00:00
|
|
|
|
$d = $this->contLang->timeanddate( $ts, false, false ) . " ($tzMsg)";
|
2004-03-06 01:49:16 +00:00
|
|
|
|
|
2006-01-07 23:37:40 +00:00
|
|
|
|
# Variable replacement
|
|
|
|
|
|
# Because mOutputType is OT_WIKI, this will only process {{subst:xxx}} type tags
|
|
|
|
|
|
$text = $this->replaceVariables( $text );
|
2006-07-11 17:40:11 +00:00
|
|
|
|
|
2011-02-19 19:18:02 +00:00
|
|
|
|
# This works almost by chance, as the replaceVariables are done before the getUserSig(),
|
2012-08-29 08:07:10 +00:00
|
|
|
|
# which may corrupt this parser instance via its wfMessage()->text() call.
|
2011-01-23 18:45:21 +00:00
|
|
|
|
|
2006-01-12 15:42:38 +00:00
|
|
|
|
# Signatures
|
2017-07-06 23:23:32 +00:00
|
|
|
|
if ( strpos( $text, '~~~' ) !== false ) {
|
|
|
|
|
|
$sigText = $this->getUserSig( $user );
|
|
|
|
|
|
$text = strtr( $text, [
|
|
|
|
|
|
'~~~~~' => $d,
|
|
|
|
|
|
'~~~~' => "$sigText $d",
|
|
|
|
|
|
'~~~' => $sigText
|
|
|
|
|
|
] );
|
|
|
|
|
|
# The main two signature forms used above are time-sensitive
|
Add new ParserOutput::{get,set}OutputFlag() interface
This is a uniform mechanism to access a number of bespoke boolean
flags in ParserOutput. It allows extensibility in core (by adding new
field names to ParserOutputFlags) without exposing new getter/setter
methods to Parsoid. It replaces the ParserOutput::{get,set}Flag()
interface which (a) doesn't allow access to certain flags, and (b) is
typically called with a string rather than a constant, and (c) has a
very generic name. (Note that Parser::setOutputFlag() already called
these "output flags".)
In the future we might unify the representation so that we store
everything in $mFlags and don't have explicit properties in
ParserOutput, but those representation details should be invisible to
the clients of this API. (We might also use a proper enumeration
for ParserOutputFlags, when PHP supports this.)
There is some overlap with ParserOutput::{get,set}ExtensionData(), but
I've left those methods as-is because (a) they allow for non-boolean
data, unlike the *Flag() methods, and (b) it seems worthwhile to
distingush properties set by extensions from properties used by core.
Code search:
https://codesearch.wmcloud.org/search/?q=%5BOo%5Dut%28put%29%3F%28%5C%28%5C%29%29%3F-%3E%28g%7Cs%29etFlag%5C%28&i=nope&files=&excludeFiles=&repos=
Bug: T292868
Change-Id: I39bc58d207836df6f328c54be9e3330719cebbeb
2021-10-08 20:04:37 +00:00
|
|
|
|
$this->setOutputFlag( ParserOutputFlags::USER_SIGNATURE, 'User signature detected' );
|
2017-07-06 23:23:32 +00:00
|
|
|
|
}
|
2006-01-07 23:37:40 +00:00
|
|
|
|
|
2012-12-06 06:14:58 +00:00
|
|
|
|
# Context links ("pipe tricks"): [[|name]] and [[name (context)|]]
|
2012-05-04 19:47:00 +00:00
|
|
|
|
$tc = '[' . Title::legalChars() . ']';
|
Moving Conrad's recent parser work out to a branch. Reverted r62434, r62416, r62150, r62111, r62085, r62081, r62080, r62077, r62076, r62069, r62049, r62035.
2010-02-19 05:19:32 +00:00
|
|
|
|
$nc = '[ _0-9A-Za-z\x80-\xff-]'; # Namespaces can use non-ascii!
|
|
|
|
|
|
|
2014-05-10 23:03:45 +00:00
|
|
|
|
// [[ns:page (context)|]]
|
|
|
|
|
|
$p1 = "/\[\[(:?$nc+:|:|)($tc+?)( ?\\($tc+\\))\\|]]/";
|
|
|
|
|
|
// [[ns:page（context）|]] (double-width brackets, added in r40257)
|
|
|
|
|
|
$p4 = "/\[\[(:?$nc+:|:|)($tc+?)( ?（$tc+）)\\|]]/";
|
2021-01-04 09:37:08 +00:00
|
|
|
|
// [[ns:page (context), context|]] (using single, double-width or Arabic comma)
|
|
|
|
|
|
$p3 = "/\[\[(:?$nc+:|:|)($tc+?)( ?\\($tc+\\)|)((?:, |，|، )$tc+|)\\|]]/";
|
2014-05-10 23:03:45 +00:00
|
|
|
|
// [[|page]] (reverse pipe trick: add context from page title)
|
|
|
|
|
|
$p2 = "/\[\[\\|($tc+)]]/";
|
Moving Conrad's recent parser work out to a branch. Reverted r62434, r62416, r62150, r62111, r62085, r62081, r62080, r62077, r62076, r62069, r62049, r62035.
2010-02-19 05:19:32 +00:00
|
|
|
|
|
|
|
|
|
|
# try $p1 first, to turn "[[A, B (C)|]]" into "[[A, B (C)|A, B]]"
|
|
|
|
|
|
$text = preg_replace( $p1, '[[\\1\\2\\3|\\2]]', $text );
|
|
|
|
|
|
$text = preg_replace( $p4, '[[\\1\\2\\3|\\2]]', $text );
|
|
|
|
|
|
$text = preg_replace( $p3, '[[\\1\\2\\3\\4|\\2]]', $text );
|
|
|
|
|
|
|
2019-10-18 19:50:58 +00:00
|
|
|
|
$t = $this->getTitle()->getText();
|
2016-02-17 19:57:37 +00:00
|
|
|
|
$m = [];
|
Moving Conrad's recent parser work out to a branch. Reverted r62434, r62416, r62150, r62111, r62085, r62081, r62080, r62077, r62076, r62069, r62049, r62035.
2010-02-19 05:19:32 +00:00
|
|
|
|
if ( preg_match( "/^($nc+:|)$tc+?( \\($tc+\\))$/", $t, $m ) ) {
|
|
|
|
|
|
$text = preg_replace( $p2, "[[$m[1]\\1$m[2]|\\1]]", $text );
|
|
|
|
|
|
} elseif ( preg_match( "/^($nc+:|)$tc+?(, $tc+|)$/", $t, $m ) && "$m[1]$m[2]" != '' ) {
|
|
|
|
|
|
$text = preg_replace( $p2, "[[$m[1]\\1$m[2]|\\1]]", $text );
|
|
|
|
|
|
} else {
|
|
|
|
|
|
# if there's no context, don't bother duplicating the title
|
|
|
|
|
|
$text = preg_replace( $p2, '[[\\1]]', $text );
|
|
|
|
|
|
}
|
2004-04-12 23:59:37 +00:00
|
|
|
|
|
2004-03-06 01:49:16 +00:00
|
|
|
|
return $text;
|
|
|
|
|
|
}
|
2006-01-07 13:31:29 +00:00
|
|
|
|
|
2005-11-15 00:38:39 +00:00
|
|
|
|
/**
|
|
|
|
|
|
* Fetch the user's signature text, if any, and normalize to
|
|
|
|
|
|
* validated, ready-to-insert wikitext.
|
2009-09-30 10:35:34 +00:00
|
|
|
|
* If you have pre-fetched the nickname or the fancySig option, you can
|
|
|
|
|
|
* specify them here to save a database query.
|
2011-01-23 18:45:21 +00:00
|
|
|
|
* Do not reuse this parser instance after calling getUserSig(),
|
2019-04-11 13:36:15 +00:00
|
|
|
|
* as it may have changed.
|
2005-11-15 00:38:39 +00:00
|
|
|
|
*
|
2021-03-16 18:31:27 +00:00
|
|
|
|
* @param UserIdentity $user
|
2019-11-08 16:17:35 +00:00
|
|
|
|
* @param string|false $nickname Nickname to use or false to use user's default nickname
|
2014-04-21 23:38:39 +00:00
|
|
|
|
* @param bool|null $fancySig whether the nickname is the complete signature
|
|
|
|
|
|
* or null to use default value
|
2005-11-15 00:38:39 +00:00
|
|
|
|
* @return string
|
2021-02-19 22:49:35 +00:00
|
|
|
|
* @since 1.6
|
2005-11-15 00:38:39 +00:00
|
|
|
|
*/
|
2021-03-16 18:31:27 +00:00
|
|
|
|
public function getUserSig( UserIdentity $user, $nickname = false, $fancySig = null ) {
|
2006-01-07 23:09:21 +00:00
|
|
|
|
$username = $user->getName();
|
2010-01-07 04:13:14 +00:00
|
|
|
|
|
2010-03-30 21:53:56 +00:00
|
|
|
|
# If not given, retrieve from the user object.
|
2013-04-20 15:38:24 +00:00
|
|
|
|
if ( $nickname === false ) {
|
2021-03-16 18:31:27 +00:00
|
|
|
|
$nickname = $this->userOptionsLookup->getOption( $user, 'nickname' );
|
2013-04-20 15:38:24 +00:00
|
|
|
|
}
|
2010-01-07 04:13:14 +00:00
|
|
|
|
|
2020-01-09 23:48:34 +00:00
|
|
|
|
if ( $fancySig === null ) {
|
2021-03-16 18:31:27 +00:00
|
|
|
|
$fancySig = $this->userOptionsLookup->getBoolOption( $user, 'fancysig' );
|
2010-03-30 21:20:05 +00:00
|
|
|
|
}
|
2010-01-07 04:13:14 +00:00
|
|
|
|
|
2019-11-08 16:17:35 +00:00
|
|
|
|
if ( $nickname === null || $nickname === '' ) {
|
2021-08-27 15:49:20 +00:00
|
|
|
|
// Empty value results in the default signature (even when fancysig is enabled)
|
2019-11-08 16:17:35 +00:00
|
|
|
|
$nickname = $username;
|
2022-04-26 15:48:03 +00:00
|
|
|
|
} elseif ( mb_strlen( $nickname ) > $this->svcOptions->get( MainConfigNames::MaxSigChars ) ) {
|
2007-06-13 16:28:19 +00:00
|
|
|
|
$nickname = $username;
|
2019-06-27 03:35:50 +00:00
|
|
|
|
$this->logger->debug( __METHOD__ . ": $username has overlong signature." );
|
2010-03-30 21:20:05 +00:00
|
|
|
|
} elseif ( $fancySig !== false ) {
|
2006-01-07 23:09:21 +00:00
|
|
|
|
# Sig. might contain markup; validate this
|
preferences: Signature validation (lint errors, user links, nested subst)
Three new checks are now applied to user signatures in preferences:
* Disallow invalid HTML and lint errors (T140606)
Since 15e0e9bb4b we can rely on Parsoid to check the signature for
lint errors. (The old PHP Parser doesn't have this capability.)
Most importantly, this will disallow unclosed HTML tags. Unclosed
formatting tags like `<i>` (and also wikitext markup like `''`)
could affect the entire page with the bad markup.
New configuration variable $wgSignatureAllowedLintErrors is added
to allow ignoring some errors. The default value ignores the
'obsolete-tag' error (caused by HTML tags like `<font>` and `<tt>`.)
* Require a link to user page, talk page or contributions (T237700)
Various tools don't work correctly when such a link is missing. For
example, Echo notifications are not sent, DiscussionTools will not
allow replying to these comments, English Wikipedia's SineBot treats
these comments as unsigned.
Such requirement has been present for a long time in many Wikimedia
wikis' policies, but it was not enforced by software.
* Disallow "nested" substitution in signature (T230652)
Clever abuse of "subst" markup and tildes allows users to save edits
containing wikitext in which substitution occurs again when the page
is next saved. Disallow this in signatures, at least.
New configuration variable $wgSignatureValidation is added to control
what we do about the result of the validation described above. The
options are:
* 'warning':
Only displays a warning near the field on Special:Preferences if
the current signature is invalid. Signatures can still be changed
regardless of validity and will be used when signing comments.
* 'new':
In addition to the above, if a user tries to change their signature,
the new one must be valid. Existing invalid signatures are still
used when signing comments.
* 'disallow':
In addition to the above, existing invalid signatures are no longer
used when signing comments.
Bug: T140606
Bug: T237700
Bug: T230652
Change-Id: I07c575c2d9d2afe7a89c4847d16ac044417297bf
2019-11-09 00:15:51 +00:00
|
|
|
|
$isValid = $this->validateSig( $nickname ) !== false;
|
|
|
|
|
|
|
|
|
|
|
|
# New validator
|
2022-04-26 15:48:03 +00:00
|
|
|
|
$sigValidation = $this->svcOptions->get( MainConfigNames::SignatureValidation );
|
2019-11-09 00:15:51 +00:00
|
|
|
|
if ( $isValid && $sigValidation === 'disallow' ) {
|
2022-04-07 23:52:05 +00:00
|
|
|
|
$parserOpts = new ParserOptions(
|
|
|
|
|
|
$this->mOptions->getUserIdentity(),
|
|
|
|
|
|
$this->contLang
|
|
|
|
|
|
);
|
|
|
|
|
|
$validator = $this->signatureValidatorFactory
|
2021-12-12 15:18:36 +00:00
|
|
|
|
->newSignatureValidator( $user, null, $parserOpts );
|
2019-11-09 00:15:51 +00:00
|
|
|
|
$isValid = !$validator->validateSignature( $nickname );
|
|
|
|
|
|
}
|
|
|
|
|
|
|
|
|
|
|
|
if ( $isValid ) {
|
2006-01-07 23:09:21 +00:00
|
|
|
|
# Validated; clean up (if needed) and return it
|
2006-04-30 20:09:44 +00:00
|
|
|
|
return $this->cleanSig( $nickname, true );
|
2005-11-15 00:38:39 +00:00
|
|
|
|
} else {
|
2006-01-07 23:09:21 +00:00
|
|
|
|
# Failed to validate; fall back to the default
|
|
|
|
|
|
$nickname = $username;
|
2019-11-09 00:15:51 +00:00
|
|
|
|
$this->logger->debug( __METHOD__ . ": $username has invalid signature." );
|
2005-11-15 00:38:39 +00:00
|
|
|
|
}
|
|
|
|
|
|
}
|
2006-01-07 13:31:29 +00:00
|
|
|
|
|
2010-03-30 21:53:56 +00:00
|
|
|
|
# Make sure the nickname doesn't get a sig in a sig
|
2011-12-06 23:07:13 +00:00
|
|
|
|
$nickname = self::cleanSigInSig( $nickname );
|
2006-06-23 19:50:55 +00:00
|
|
|
|
|
2006-01-07 23:09:21 +00:00
|
|
|
|
# If we're still here, make it a link to the user page
|
2007-11-15 03:30:03 +00:00
|
|
|
|
$userText = wfEscapeWikiText( $username );
|
|
|
|
|
|
$nickText = wfEscapeWikiText( $nickname );
|
2022-04-11 01:26:51 +00:00
|
|
|
|
if ( $this->userNameUtils->isTemp( $username ) ) {
|
|
|
|
|
|
$msgName = 'signature-temp';
|
|
|
|
|
|
} elseif ( $user->isRegistered() ) {
|
|
|
|
|
|
$msgName = 'signature';
|
|
|
|
|
|
} else {
|
|
|
|
|
|
$msgName = 'signature-anon';
|
|
|
|
|
|
}
|
2011-06-22 17:45:31 +00:00
|
|
|
|
|
2014-05-10 23:03:45 +00:00
|
|
|
|
return wfMessage( $msgName, $userText, $nickText )->inContentLanguage()
|
2021-05-13 20:15:25 +00:00
|
|
|
|
->page( $this->getPage() )->text();
|
2005-11-15 00:38:39 +00:00
|
|
|
|
}
|
2006-01-07 13:31:29 +00:00
|
|
|
|
|
2005-11-15 00:38:39 +00:00
|
|
|
|
/**
|
2006-01-07 23:09:21 +00:00
|
|
|
|
* Check that the user's signature contains no bad XML
|
2005-11-15 00:38:39 +00:00
|
|
|
|
*
|
2014-04-21 23:38:39 +00:00
|
|
|
|
* @param string $text
|
2022-07-31 00:02:18 +00:00
|
|
|
|
* @return string|false The signature text unchanged, or false if it is not a well-formed XML fragment.
|
2021-02-19 22:49:35 +00:00
|
|
|
|
* @since 1.6
|
2005-11-15 00:38:39 +00:00
|
|
|
|
*/
|
2014-08-11 20:24:54 +00:00
|
|
|
|
public function validateSig( $text ) {
|
2013-04-26 14:42:31 +00:00
|
|
|
|
return Xml::isWellFormedXmlFragment( $text ) ? $text : false;
|
2005-11-15 00:38:39 +00:00
|
|
|
|
}
|
2006-07-11 17:40:11 +00:00
|
|
|
|
|
2006-01-07 23:09:21 +00:00
|
|
|
|
/**
|
|
|
|
|
|
* Clean up signature text
|
|
|
|
|
|
*
|
2015-04-01 00:37:28 +00:00
|
|
|
|
* 1) Strip 3, 4 or 5 tildes out of signatures @see cleanSigInSig
|
2006-01-13 09:47:09 +00:00
|
|
|
|
* 2) Substitute all transclusions
|
2006-01-07 23:09:21 +00:00
|
|
|
|
*
|
2014-04-08 15:29:17 +00:00
|
|
|
|
* @param string $text
|
2013-03-11 17:15:01 +00:00
|
|
|
|
* @param bool $parsing True when called while parsing, false when cleaning (preferences save)
|
2014-04-08 15:29:17 +00:00
|
|
|
|
* @return string Signature text
|
2021-02-19 22:49:35 +00:00
|
|
|
|
* @since 1.6
|
2006-01-07 23:09:21 +00:00
|
|
|
|
*/
|
2011-12-06 23:07:13 +00:00
|
|
|
|
public function cleanSig( $text, $parsing = false ) {
|
2007-12-01 06:52:25 +00:00
|
|
|
|
if ( !$parsing ) {
|
|
|
|
|
|
global $wgTitle;
|
2013-10-27 20:18:06 +00:00
|
|
|
|
$magicScopeVariable = $this->lock();
|
2020-09-18 15:07:18 +00:00
|
|
|
|
$this->startParse(
|
|
|
|
|
|
$wgTitle,
|
|
|
|
|
|
ParserOptions::newFromUser( RequestContext::getMain()->getUser() ),
|
|
|
|
|
|
self::OT_PREPROCESS,
|
|
|
|
|
|
true
|
|
|
|
|
|
);
|
2007-12-01 06:52:25 +00:00
|
|
|
|
}
|
2006-07-11 17:40:11 +00:00
|
|
|
|
|
2008-07-31 09:41:28 +00:00
|
|
|
|
# Option to disable this feature
|
|
|
|
|
|
if ( !$this->mOptions->getCleanSignatures() ) {
|
|
|
|
|
|
return $text;
|
|
|
|
|
|
}
|
|
|
|
|
|
|
2011-05-17 22:03:20 +00:00
|
|
|
|
# @todo FIXME: Regex doesn't respect extension tags or nowiki
|
2007-12-01 06:52:25 +00:00
|
|
|
|
# => Move this logic to braceSubstitution()
|
2018-07-25 11:55:18 +00:00
|
|
|
|
$substWord = $this->magicWordFactory->get( 'subst' );
|
2006-01-13 09:47:09 +00:00
|
|
|
|
$substRegex = '/\{\{(?!(?:' . $substWord->getBaseRegex() . '))/x' . $substWord->getRegexCase();
|
|
|
|
|
|
$substText = '{{' . $substWord->getSynonym( 0 );
|
|
|
|
|
|
|
|
|
|
|
|
$text = preg_replace( $substRegex, $substText, $text );
|
2011-12-06 23:07:13 +00:00
|
|
|
|
$text = self::cleanSigInSig( $text );
|
2007-12-01 06:52:25 +00:00
|
|
|
|
$dom = $this->preprocessToDom( $text );
|
2008-01-21 16:36:08 +00:00
|
|
|
|
$frame = $this->getPreprocessor()->newFrame();
|
|
|
|
|
|
$text = $frame->expand( $dom );
|
2007-12-01 06:52:25 +00:00
|
|
|
|
|
|
|
|
|
|
if ( !$parsing ) {
|
|
|
|
|
|
$text = $this->mStripState->unstripBoth( $text );
|
|
|
|
|
|
}
|
2006-07-11 17:40:11 +00:00
|
|
|
|
|
2006-01-08 05:29:58 +00:00
|
|
|
|
return $text;
|
2006-01-07 23:09:21 +00:00
|
|
|
|
}
|
2006-06-23 19:50:55 +00:00
|
|
|
|
|
|
|
|
|
|
/**
|
2015-04-01 00:37:28 +00:00
|
|
|
|
* Strip 3, 4 or 5 tildes out of signatures.
|
2010-06-10 21:05:58 +00:00
|
|
|
|
*
|
2014-04-08 15:29:17 +00:00
|
|
|
|
* @param string $text
|
|
|
|
|
|
* @return string Signature text with /~{3,5}/ removed
|
2021-02-19 22:49:35 +00:00
|
|
|
|
* @since 1.7
|
2006-06-23 19:50:55 +00:00
|
|
|
|
*/
|
2011-12-06 23:07:13 +00:00
|
|
|
|
public static function cleanSigInSig( $text ) {
|
2006-06-23 19:50:55 +00:00
|
|
|
|
$text = preg_replace( '/~{3,5}/', '', $text );
|
|
|
|
|
|
return $text;
|
|
|
|
|
|
}
|
2006-07-11 17:40:11 +00:00
|
|
|
|
|
2021-09-15 01:00:06 +00:00
|
|
|
|
/**
|
|
|
|
|
|
* Replace table of contents marker in parsed HTML.
|
|
|
|
|
|
*
|
|
|
|
|
|
* Used to remove or replace the marker. This method should be
|
|
|
|
|
|
* used instead of direct access to Parser::TOC_PLACEHOLDER, since
|
|
|
|
|
|
* in the future the placeholder might have additional attributes
|
|
|
|
|
|
* attached which should be ignored when the replacement is made.
|
|
|
|
|
|
*
|
|
|
|
|
|
* @since 1.38
|
|
|
|
|
|
* @stable
|
|
|
|
|
|
*
|
|
|
|
|
|
* @param string $text Parsed HTML
|
|
|
|
|
|
* @param string $toc HTML table of contents string, or else an empty
|
|
|
|
|
|
* string to remove the marker.
|
|
|
|
|
|
* @return string Result HTML
|
|
|
|
|
|
*/
|
|
|
|
|
|
public static function replaceTableOfContentsMarker( $text, $toc ) {
|
2022-09-15 16:03:38 +00:00
|
|
|
|
return preg_replace_callback(
|
|
|
|
|
|
self::TOC_PLACEHOLDER_REGEX,
|
|
|
|
|
|
static function ( array $matches ) use( $toc ) {
|
|
|
|
|
|
return $toc; // Use a callback so that $1, \1 etc. in $toc are not treated as backreferences
|
|
|
|
|
|
},
|
2021-12-21 03:26:38 +00:00
|
|
|
|
// For backwards compatibility during transition period,
|
|
|
|
|
|
// also replace "old" TOC_PLACEHOLDER value
|
|
|
|
|
|
str_replace( '<mw:tocplace></mw:tocplace>', $toc, $text )
|
2021-09-15 01:00:06 +00:00
|
|
|
|
);
|
|
|
|
|
|
}
|
|
|
|
|
|
|
2004-09-21 05:49:12 +00:00
|
|
|
|
/**
|
|
|
|
|
|
* Set up some variables which are usually set up in parse()
|
|
|
|
|
|
* so that an external function can call some class members with confidence
|
2011-08-05 00:33:03 +00:00
|
|
|
|
*
|
2021-04-25 17:29:33 +00:00
|
|
|
|
* @param ?PageReference $page
|
2014-04-21 23:38:39 +00:00
|
|
|
|
* @param ParserOptions $options
|
|
|
|
|
|
* @param int $outputType
|
|
|
|
|
|
* @param bool $clearState
|
2019-08-12 06:10:22 +00:00
|
|
|
|
* @param int|null $revId
|
2021-02-19 22:49:35 +00:00
|
|
|
|
* @since 1.3
|
2004-09-21 05:49:12 +00:00
|
|
|
|
*/
|
2021-04-25 17:29:33 +00:00
|
|
|
|
public function startExternalParse( ?PageReference $page, ParserOptions $options,
|
2019-08-12 06:10:22 +00:00
|
|
|
|
$outputType, $clearState = true, $revId = null
|
2014-05-10 23:03:45 +00:00
|
|
|
|
) {
|
2021-04-25 17:29:33 +00:00
|
|
|
|
$this->startParse( $page, $options, $outputType, $clearState );
|
2019-08-12 06:10:22 +00:00
|
|
|
|
if ( $revId !== null ) {
|
|
|
|
|
|
$this->mRevisionId = $revId;
|
|
|
|
|
|
}
|
2011-02-22 15:05:08 +00:00
|
|
|
|
}
|
2011-02-24 20:23:49 +00:00
|
|
|
|
|
2011-08-05 00:33:03 +00:00
|
|
|
|
/**
|
2021-04-25 17:29:33 +00:00
|
|
|
|
* @param ?PageReference $page
|
2014-04-21 23:38:39 +00:00
|
|
|
|
* @param ParserOptions $options
|
|
|
|
|
|
* @param int $outputType
|
|
|
|
|
|
* @param bool $clearState
|
2011-08-05 00:33:03 +00:00
|
|
|
|
*/
|
2021-04-25 17:29:33 +00:00
|
|
|
|
private function startParse( ?PageReference $page, ParserOptions $options,
|
2014-05-10 23:03:45 +00:00
|
|
|
|
$outputType, $clearState = true
|
|
|
|
|
|
) {
|
2021-04-25 17:29:33 +00:00
|
|
|
|
$this->setPage( $page );
|
2004-03-27 22:47:25 +00:00
|
|
|
|
$this->mOptions = $options;
|
2006-08-14 07:10:31 +00:00
|
|
|
|
$this->setOutputType( $outputType );
|
2004-03-27 22:47:25 +00:00
|
|
|
|
if ( $clearState ) {
|
|
|
|
|
|
$this->clearState();
|
|
|
|
|
|
}
|
|
|
|
|
|
}
|
2004-04-05 10:38:40 +00:00
|
|
|
|
|
2004-09-21 05:49:12 +00:00
|
|
|
|
/**
|
2008-01-19 09:03:45 +00:00
|
|
|
|
* Wrapper for preprocess()
|
2007-11-20 10:55:08 +00:00
|
|
|
|
*
|
2014-04-08 15:29:17 +00:00
|
|
|
|
* @param string $text The text to preprocess
|
2017-12-28 15:06:10 +00:00
|
|
|
|
* @param ParserOptions $options
|
2021-04-25 17:29:33 +00:00
|
|
|
|
* @param ?PageReference $page The context page or null to use $wgTitle
|
2014-04-08 15:29:17 +00:00
|
|
|
|
* @return string
|
2021-02-19 22:49:35 +00:00
|
|
|
|
* @since 1.3
|
2004-09-21 05:49:12 +00:00
|
|
|
|
*/
|
2021-04-25 17:29:33 +00:00
|
|
|
|
public function transformMsg( $text, ParserOptions $options, ?PageReference $page = null ) {
|
2004-04-05 10:38:40 +00:00
|
|
|
|
static $executing = false;
|
2004-04-12 23:59:37 +00:00
|
|
|
|
|
2004-04-05 10:38:40 +00:00
|
|
|
|
# Guard against infinite recursion
|
|
|
|
|
|
if ( $executing ) {
|
|
|
|
|
|
return $text;
|
|
|
|
|
|
}
|
|
|
|
|
|
$executing = true;
|
|
|
|
|
|
|
2021-04-25 17:29:33 +00:00
|
|
|
|
if ( !$page ) {
|
2011-02-09 15:19:45 +00:00
|
|
|
|
global $wgTitle;
|
2021-04-25 17:29:33 +00:00
|
|
|
|
$page = $wgTitle;
|
2011-02-09 15:19:45 +00:00
|
|
|
|
}
|
2012-11-24 09:40:06 +00:00
|
|
|
|
|
2021-04-25 17:29:33 +00:00
|
|
|
|
$text = $this->preprocess( $text, $page, $options );
|
2004-04-12 23:59:37 +00:00
|
|
|
|
|
2004-04-05 10:38:40 +00:00
|
|
|
|
$executing = false;
|
|
|
|
|
|
return $text;
|
|
|
|
|
|
}
|
2004-06-09 12:15:42 +00:00
|
|
|
|
|
2004-09-21 05:49:12 +00:00
|
|
|
|
/**
|
2012-07-10 12:48:06 +00:00
|
|
|
|
* Create an HTML-style tag, e.g. "<yourtag>special text</yourtag>"
|
2006-02-28 05:18:36 +00:00
|
|
|
|
* The callback should have the following form:
|
2022-10-08 14:50:45 +00:00
|
|
|
|
* function myParserHook( $text, array $params, Parser $parser, PPFrame $frame ) { ... }
|
2006-02-28 05:18:36 +00:00
|
|
|
|
*
|
|
|
|
|
|
* Transform and return $text. Use $parser for any required context, e.g. use
|
|
|
|
|
|
* $parser->getTitle() and $parser->getOptions(), not $wgTitle or $wgOut->mParserOptions
|
2005-11-26 23:04:05 +00:00
|
|
|
|
*
|
2011-04-22 19:06:52 +00:00
|
|
|
|
* Hooks may return extended information by returning an array, of which the
|
2022-10-08 14:50:45 +00:00
|
|
|
|
* first numbered element (index 0) must be the return string. The following other
|
|
|
|
|
|
* keys are used:
|
|
|
|
|
|
* - 'markerType': used by some core tag hooks to override which strip
|
|
|
|
|
|
* array their results are placed in, 'general' or 'nowiki'.
|
2011-04-22 19:06:52 +00:00
|
|
|
|
*
|
2014-04-08 15:29:17 +00:00
|
|
|
|
* @param string $tag The tag to use, e.g. 'hook' for "<hook>"
|
2022-10-08 14:50:45 +00:00
|
|
|
|
* @param callable $callback The callback to use for the tag
|
2012-10-07 23:35:26 +00:00
|
|
|
|
* @throws MWException
|
2014-07-03 19:20:35 +00:00
|
|
|
|
* @return callable|null The old value of the mTagHooks array associated with the hook
|
2021-02-19 22:49:35 +00:00
|
|
|
|
* @since 1.3
|
2004-09-21 05:49:12 +00:00
|
|
|
|
*/
|
2017-06-07 10:49:24 +00:00
|
|
|
|
public function setHook( $tag, callable $callback ) {
|
2006-06-01 06:41:32 +00:00
|
|
|
|
$tag = strtolower( $tag );
|
2012-02-09 19:29:36 +00:00
|
|
|
|
if ( preg_match( '/[<>\r\n]/', $tag, $m ) ) {
|
|
|
|
|
|
throw new MWException( "Invalid character {$m[0]} in setHook('$tag', ...) call" );
|
|
|
|
|
|
}
|
2017-10-06 22:17:58 +00:00
|
|
|
|
$oldVal = $this->mTagHooks[$tag] ?? null;
|
2004-06-12 06:15:09 +00:00
|
|
|
|
$this->mTagHooks[$tag] = $callback;
|
2010-03-30 21:20:05 +00:00
|
|
|
|
if ( !in_array( $tag, $this->mStripList ) ) {
|
2008-03-18 21:45:18 +00:00
|
|
|
|
$this->mStripList[] = $tag;
|
|
|
|
|
|
}
|
2006-01-07 13:31:29 +00:00
|
|
|
|
|
2004-06-09 12:15:42 +00:00
|
|
|
|
return $oldVal;
|
|
|
|
|
|
}
|
2004-10-15 17:39:10 +00:00
|
|
|
|
|
2008-01-22 10:10:21 +00:00
|
|
|
|
/**
|
|
|
|
|
|
* Remove all tag hooks
|
2021-02-19 22:49:35 +00:00
|
|
|
|
* @since 1.12
|
2008-01-22 10:10:21 +00:00
|
|
|
|
*/
|
2014-08-11 20:24:54 +00:00
|
|
|
|
public function clearTagHooks() {
|
2016-02-17 19:57:37 +00:00
|
|
|
|
$this->mTagHooks = [];
|
2020-03-30 17:45:35 +00:00
|
|
|
|
$this->mStripList = [];
|
2008-01-22 10:10:21 +00:00
|
|
|
|
}
|
|
|
|
|
|
|
2006-04-05 09:40:25 +00:00
|
|
|
|
/**
|
|
|
|
|
|
* Create a function, e.g. {{sum:1|2|3}}
|
|
|
|
|
|
* The callback function should have the form:
|
|
|
|
|
|
* function myParserFunction( &$parser, $arg1, $arg2, $arg3 ) { ... }
|
|
|
|
|
|
*
|
2014-11-11 19:28:28 +00:00
|
|
|
|
* Or with Parser::SFH_OBJECT_ARGS:
|
2008-07-23 14:51:39 +00:00
|
|
|
|
* function myParserFunction( $parser, $frame, $args ) { ... }
|
|
|
|
|
|
*
|
2006-10-17 08:49:27 +00:00
|
|
|
|
* The callback may either return the text result of the function, or an array with the text
|
|
|
|
|
|
* in element 0, and a number of flags in the other elements. The names of the flags are
|
2006-04-05 09:40:25 +00:00
|
|
|
|
* specified in the keys. Valid flags are:
|
2006-10-17 08:49:27 +00:00
|
|
|
|
* found The text returned is valid, stop processing the template. This
|
2006-04-05 09:40:25 +00:00
|
|
|
|
* is on by default.
|
|
|
|
|
|
* nowiki Wiki markup in the return value should be escaped
|
|
|
|
|
|
* isHTML The returned text is HTML, armour it against wikitext transformation
|
|
|
|
|
|
*
|
2013-03-11 17:15:01 +00:00
|
|
|
|
* @param string $id The magic word ID
|
2014-07-03 19:20:35 +00:00
|
|
|
|
* @param callable $callback The callback function (and object) to use
|
2014-04-08 15:29:17 +00:00
|
|
|
|
* @param int $flags A combination of the following flags:
|
2014-11-11 19:28:28 +00:00
|
|
|
|
* Parser::SFH_NO_HASH No leading hash, i.e. {{plural:...}} instead of {{#if:...}}
|
2008-07-23 14:51:39 +00:00
|
|
|
|
*
|
2014-11-11 19:28:28 +00:00
|
|
|
|
* Parser::SFH_OBJECT_ARGS Pass the template arguments as PPNode objects instead of text.
|
|
|
|
|
|
* This allows for conditional expansion of the parse tree, allowing you to eliminate dead
|
2010-01-07 04:13:14 +00:00
|
|
|
|
* branches and thus speed up parsing. It is also possible to analyse the parse tree of
|
2008-07-23 14:51:39 +00:00
|
|
|
|
* the arguments, and to control the way they are expanded.
|
|
|
|
|
|
*
|
|
|
|
|
|
* The $frame parameter is a PPFrame. This can be used to produce expanded text from the
|
|
|
|
|
|
* arguments, for instance:
|
|
|
|
|
|
* $text = isset( $args[0] ) ? $frame->expand( $args[0] ) : '';
|
|
|
|
|
|
*
|
2010-01-07 04:13:14 +00:00
|
|
|
|
* For technical reasons, $args[0] is pre-expanded and will be a string. This may change in
|
2008-07-23 14:51:39 +00:00
|
|
|
|
* future versions. Please call $frame->expand() on it anyway so that your code keeps
|
|
|
|
|
|
* working if/when this is changed.
|
|
|
|
|
|
*
|
|
|
|
|
|
* If you want whitespace to be trimmed from $args, you need to do it yourself, post-
|
|
|
|
|
|
* expansion.
|
|
|
|
|
|
*
|
2010-01-07 04:13:14 +00:00
|
|
|
|
* Please read the documentation in includes/parser/Preprocessor.php for more information
|
2008-07-23 14:51:39 +00:00
|
|
|
|
* about the methods available in PPFrame and PPNode.
|
2006-04-05 09:40:25 +00:00
|
|
|
|
*
|
2012-10-07 23:35:26 +00:00
|
|
|
|
* @throws MWException
|
2022-03-01 21:42:03 +00:00
|
|
|
|
* @return string|callable|null The old callback function for this name, if any
|
2021-02-19 22:49:35 +00:00
|
|
|
|
* @since 1.6
|
2006-04-05 09:40:25 +00:00
|
|
|
|
*/
|
2017-06-07 10:49:24 +00:00
|
|
|
|
public function setFunctionHook( $id, callable $callback, $flags = 0 ) {
|
2020-02-28 15:55:22 +00:00
|
|
|
|
$oldVal = $this->mFunctionHooks[$id][0] ?? null;
|
2016-02-17 19:57:37 +00:00
|
|
|
|
$this->mFunctionHooks[$id] = [ $callback, $flags ];
|
2006-07-03 11:07:00 +00:00
|
|
|
|
|
2006-07-03 03:29:57 +00:00
|
|
|
|
# Add to function cache
|
2018-07-25 11:55:18 +00:00
|
|
|
|
$mw = $this->magicWordFactory->get( $id );
|
2013-04-20 15:38:24 +00:00
|
|
|
|
if ( !$mw ) {
|
2013-01-26 21:11:09 +00:00
|
|
|
|
throw new MWException( __METHOD__ . '() expecting a magic word identifier.' );
|
2013-04-20 15:38:24 +00:00
|
|
|
|
}
|
2006-07-03 11:07:00 +00:00
|
|
|
|
|
2006-07-14 16:08:16 +00:00
|
|
|
|
$synonyms = $mw->getSynonyms();
|
|
|
|
|
|
$sensitive = intval( $mw->isCaseSensitive() );
|
|
|
|
|
|
|
2006-07-03 11:07:00 +00:00
|
|
|
|
foreach ( $synonyms as $syn ) {
|
|
|
|
|
|
# Case
|
|
|
|
|
|
if ( !$sensitive ) {
|
2018-08-03 08:25:15 +00:00
|
|
|
|
$syn = $this->contLang->lc( $syn );
|
2006-07-03 11:07:00 +00:00
|
|
|
|
}
|
|
|
|
|
|
# Add leading hash
|
2014-11-11 19:28:28 +00:00
|
|
|
|
if ( !( $flags & self::SFH_NO_HASH ) ) {
|
2006-07-03 11:07:00 +00:00
|
|
|
|
$syn = '#' . $syn;
|
|
|
|
|
|
}
|
|
|
|
|
|
# Remove trailing colon
|
2008-08-26 14:37:15 +00:00
|
|
|
|
if ( substr( $syn, -1, 1 ) === ':' ) {
|
2006-07-03 11:07:00 +00:00
|
|
|
|
$syn = substr( $syn, 0, -1 );
|
|
|
|
|
|
}
|
|
|
|
|
|
$this->mFunctionSynonyms[$sensitive][$syn] = $id;
|
2006-07-02 17:43:32 +00:00
|
|
|
|
}
|
2006-07-03 03:29:57 +00:00
|
|
|
|
return $oldVal;
|
2006-07-02 17:43:32 +00:00
|
|
|
|
}
|
|
|
|
|
|
|
2006-08-30 07:45:07 +00:00
|
|
|
|
/**
|
|
|
|
|
|
* Get all registered function hook identifiers
|
|
|
|
|
|
*
|
2014-04-21 23:38:39 +00:00
|
|
|
|
* @return array
|
2021-02-19 22:49:35 +00:00
|
|
|
|
* @since 1.8
|
2006-08-30 07:45:07 +00:00
|
|
|
|
*/
|
2014-08-11 20:24:54 +00:00
|
|
|
|
public function getFunctionHooks() {
|
2006-08-30 07:45:07 +00:00
|
|
|
|
return array_keys( $this->mFunctionHooks );
|
|
|
|
|
|
}
|
|
|
|
|
|
|
2004-10-15 17:39:10 +00:00
|
|
|
|
/**
|
2012-07-10 12:48:06 +00:00
|
|
|
|
* Replace "<!--LINK-->" link placeholders with actual links, in the buffer
|
2016-01-02 22:47:08 +00:00
|
|
|
|
* Placeholders created in Linker::link()
|
2011-08-05 00:33:03 +00:00
|
|
|
|
*
|
2017-08-11 00:23:16 +00:00
|
|
|
|
* @param string &$text
|
2014-04-21 23:38:39 +00:00
|
|
|
|
* @param int $options
|
2019-10-29 16:52:47 +00:00
|
|
|
|
* @deprecated since 1.34; should not be used outside parser class.
|
2004-10-15 17:39:10 +00:00
|
|
|
|
*/
|
2014-08-11 20:24:54 +00:00
|
|
|
|
public function replaceLinkHolders( &$text, $options = 0 ) {
|
2019-10-29 16:52:47 +00:00
|
|
|
|
$this->replaceLinkHoldersPrivate( $text, $options );
|
|
|
|
|
|
}
|
|
|
|
|
|
|
|
|
|
|
|
/**
|
|
|
|
|
|
* Replace "<!--LINK-->" link placeholders with actual links, in the buffer
|
|
|
|
|
|
* Placeholders created in Linker::link()
|
|
|
|
|
|
*
|
|
|
|
|
|
* @param string &$text
|
|
|
|
|
|
* @param int $options
|
|
|
|
|
|
*/
|
|
|
|
|
|
private function replaceLinkHoldersPrivate( &$text, $options = 0 ) {
|
2014-11-02 15:47:51 +00:00
|
|
|
|
$this->mLinkHolders->replace( $text );
|
2004-10-15 17:39:10 +00:00
|
|
|
|
}
|
2005-07-03 07:15:53 +00:00
|
|
|
|
|
2019-10-29 16:52:47 +00:00
|
|
|
|
/**
|
|
|
|
|
|
* Replace "<!--LINK-->" link placeholders with plain text of links
|
|
|
|
|
|
* (not HTML-formatted).
|
|
|
|
|
|
*
|
|
|
|
|
|
* @param string $text
|
|
|
|
|
|
* @return string
|
|
|
|
|
|
*/
|
2019-11-04 19:23:34 +00:00
|
|
|
|
private function replaceLinkHoldersText( $text ) {
|
2008-08-26 14:37:15 +00:00
|
|
|
|
return $this->mLinkHolders->replaceText( $text );
|
2005-05-31 08:49:03 +00:00
|
|
|
|
}
|
2004-11-20 11:28:37 +00:00
|
|
|
|
|
2004-11-13 12:04:31 +00:00
|
|
|
|
/**
|
|
|
|
|
|
* Renders an image gallery from a text with one line per image.
|
|
|
|
|
|
* Text labels may be given by using |-style alternative text. E.g.
|
|
|
|
|
|
* Image:one.jpg|The number "1"
|
|
|
|
|
|
* Image:tree.jpg|A tree
|
|
|
|
|
|
* given as text will return the HTML of a gallery with two images,
|
|
|
|
|
|
* labeled 'The number "1"' and
|
|
|
|
|
|
* 'A tree'.
|
2011-04-22 19:06:52 +00:00
|
|
|
|
*
|
|
|
|
|
|
* @param string $text
|
2011-08-05 00:33:03 +00:00
|
|
|
|
* @param array $params
|
2011-04-22 19:06:52 +00:00
|
|
|
|
* @return string HTML
|
2020-01-25 15:45:59 +00:00
|
|
|
|
* @internal
|
2004-11-13 12:04:31 +00:00
|
|
|
|
*/
|
2019-08-27 09:23:52 +00:00
|
|
|
|
public function renderImageGallery( $text, array $params ) {
|
New more slick gallery display
This extension adds a "mode" parameter to the gallery
tag, allowing different formats for the gallery tag
(galleries in the ui can be controlled by a global)
The added modes are:
*traditional - The original gallery
*nolines - Like the original, no borders, less padding
*packed - All images aligned by having same height.
JS also justifies the images.
(I think this one is the one that will go over best
with users.)
*packed-overlay - like packed, but caption goes over
top the image in a transloucent box.
*packed-hover - like packed-overlay, but caption only
visible on hover. Degrades gracefully on screen
readers, and falls back to packed-overlay if
you are using a touch screen. I kind of like
this mode when the caption is not that important
(e.g. a category where it's just the file name).
This also adds a hook to allow people to make their
own gallery version. I believe there would be interest
in this, as different people have done different
experiments. For example:
* Wikia: http://community.wikia.com/wiki/Help:Galleries,_Slideshows,_and_Sliders/wikitext
* Wikinews: https://en.wikinews.org/wiki/Template:Picture_select
What I would like to see for this patch, is first it gets
enabled, with the default still "traditional". After
about a month or two we consult with users. If feedback
is positive, we change the default mode to one of the
others (probably "packed").
Adds a "mode" parameter to gallery for different
mode, including one 'height-constrained-overlay'
which looks much more like other modern websites.
Note: This makes one change to the old gallery format.
It makes Nonexistent files be rendered like thumbnails
(i.e. they are rendered with a little grey border).
One thing I'm slightly worried about with this patch,
is that I added an option to MediaTransformOutput::toHtml
to override the width attribute. I'm not sure if that
is the best approach, and would appreciate thoughts
on that.
This should be merged at the same time as Ie82c1548
Change-Id: I33462a8b52502ed76aeb163b66e3704c8618ba23
2013-06-08 04:47:07 +00:00
|
|
|
|
$mode = false;
|
|
|
|
|
|
if ( isset( $params['mode'] ) ) {
|
|
|
|
|
|
$mode = $params['mode'];
|
|
|
|
|
|
}
|
|
|
|
|
|
|
|
|
|
|
|
try {
|
|
|
|
|
|
$ig = ImageGalleryBase::factory( $mode );
|
2022-02-01 01:11:09 +00:00
|
|
|
|
} catch ( ImageGalleryClassNotFoundException $e ) {
|
2013-06-08 04:47:07 +00:00
|
|
|
|
// If an invalid mode is set, fall back to the default.
|
|
|
|
|
|
$ig = ImageGalleryBase::factory( false );
|
|
|
|
|
|
}
|
|
|
|
|
|
|
2019-10-18 19:50:58 +00:00
|
|
|
|
$ig->setContextTitle( $this->getTitle() );
|
2004-11-13 12:04:31 +00:00
|
|
|
|
$ig->setShowBytes( false );
|
2017-03-06 12:32:38 +00:00
|
|
|
|
$ig->setShowDimensions( false );
|
2004-11-13 12:04:31 +00:00
|
|
|
|
$ig->setShowFilename( false );
|
2007-08-22 13:40:22 +00:00
|
|
|
|
$ig->setParser( $this );
|
|
|
|
|
|
$ig->setHideBadImages();
|
2017-04-01 13:59:21 +00:00
|
|
|
|
$ig->setAttributes( Sanitizer::validateTagAttributes( $params, 'ul' ) );
|
2004-11-20 11:28:37 +00:00
|
|
|
|
|
2010-03-30 21:20:05 +00:00
|
|
|
|
if ( isset( $params['showfilename'] ) ) {
|
2010-03-13 09:43:37 +00:00
|
|
|
|
$ig->setShowFilename( true );
|
|
|
|
|
|
} else {
|
|
|
|
|
|
$ig->setShowFilename( false );
|
|
|
|
|
|
}
|
2010-03-30 21:20:05 +00:00
|
|
|
|
if ( isset( $params['caption'] ) ) {
|
2018-03-07 20:04:33 +00:00
|
|
|
|
// NOTE: We aren't passing a frame here or below. Frame info
|
|
|
|
|
|
// is currently opaque to Parsoid, which acts on OT_PREPROCESS.
|
|
|
|
|
|
// See T107332#4030581
|
|
|
|
|
|
$caption = $this->recursiveTagParse( $params['caption'] );
|
2007-01-05 01:07:04 +00:00
|
|
|
|
$ig->setCaptionHtml( $caption );
|
2007-01-04 19:47:11 +00:00
|
|
|
|
}
|
2010-03-30 21:20:05 +00:00
|
|
|
|
if ( isset( $params['perrow'] ) ) {
|
2007-02-02 03:32:03 +00:00
|
|
|
|
$ig->setPerRow( $params['perrow'] );
|
|
|
|
|
|
}
|
2010-03-30 21:20:05 +00:00
|
|
|
|
if ( isset( $params['widths'] ) ) {
|
2007-02-02 03:32:03 +00:00
|
|
|
|
$ig->setWidths( $params['widths'] );
|
|
|
|
|
|
}
|
2010-03-30 21:20:05 +00:00
|
|
|
|
if ( isset( $params['heights'] ) ) {
|
2007-02-02 03:32:03 +00:00
|
|
|
|
$ig->setHeights( $params['heights'] );
|
|
|
|
|
|
}
|
2013-06-08 04:47:07 +00:00
|
|
|
|
$ig->setAdditionalOptions( $params );
|
2008-04-14 07:45:50 +00:00
|
|
|
|
|
2008-08-26 14:37:15 +00:00
|
|
|
|
$lines = StringUtils::explode( "\n", $text );
|
2004-11-13 12:04:31 +00:00
|
|
|
|
foreach ( $lines as $line ) {
|
2004-11-20 11:28:37 +00:00
|
|
|
|
# match lines like these:
|
|
|
|
|
|
# Image:someimage.jpg|This is some image
|
2016-02-17 19:57:37 +00:00
|
|
|
|
$matches = [];
|
2004-11-13 12:04:31 +00:00
|
|
|
|
preg_match( "/^([^|]+)(\\|(.*))?$/", $line, $matches );
|
|
|
|
|
|
# Skip empty lines
|
|
|
|
|
|
if ( count( $matches ) == 0 ) {
|
|
|
|
|
|
continue;
|
|
|
|
|
|
}
|
2010-01-07 04:13:14 +00:00
|
|
|
|
|
2010-03-30 21:20:05 +00:00
|
|
|
|
if ( strpos( $matches[0], '%' ) !== false ) {
|
2010-12-24 09:53:08 +00:00
|
|
|
|
$matches[1] = rawurldecode( $matches[1] );
|
2010-03-30 21:20:05 +00:00
|
|
|
|
}
|
2011-04-25 13:51:54 +00:00
|
|
|
|
$title = Title::newFromText( $matches[1], NS_FILE );
|
2020-01-09 23:48:34 +00:00
|
|
|
|
if ( $title === null ) {
|
2004-12-15 09:07:21 +00:00
|
|
|
|
# Bogus title. Ignore these so we don't bomb out later.
|
|
|
|
|
|
continue;
|
|
|
|
|
|
}
|
2011-05-25 15:39:47 +00:00
|
|
|
|
|
2013-04-25 00:28:03 +00:00
|
|
|
|
# We need to get what handler the file uses, to figure out parameters.
|
2022-05-09 09:09:00 +00:00
|
|
|
|
# Note, a hook can override the file name, and choose an entirely different
|
2013-04-25 00:28:03 +00:00
|
|
|
|
# file (which potentially could be of a different type and have a different handler).
|
2016-02-17 19:57:37 +00:00
|
|
|
|
$options = [];
|
2013-04-25 00:28:03 +00:00
|
|
|
|
$descQuery = false;
|
Hooks::run() call site migration
Migrate all callers of Hooks::run() to use the new
HookContainer/HookRunner system.
General principles:
* Use DI if it is already used. We're not changing the way state is
managed in this patch.
* HookContainer is always injected, not HookRunner. HookContainer
is a service, it's a more generic interface, it is the only
thing that provides isRegistered() which is needed in some cases,
and a HookRunner can be efficiently constructed from it
(confirmed by benchmark). Because HookContainer is needed
for object construction, it is also needed by all factories.
* "Ask your friendly local base class". Big hierarchies like
SpecialPage and ApiBase have getHookContainer() and getHookRunner()
methods in the base class, and classes that extend that base class
are not expected to know or care where the base class gets its
HookContainer from.
* ProtectedHookAccessorTrait provides protected getHookContainer() and
getHookRunner() methods, getting them from the global service
container. The point of this is to ease migration to DI by ensuring
that call sites ask their local friendly base class rather than
getting a HookRunner from the service container directly.
* Private $this->hookRunner. In some smaller classes where accessor
methods did not seem warranted, there is a private HookRunner property
which is accessed directly. Very rarely (two cases), there is a
protected property, for consistency with code that conventionally
assumes protected=private, but in cases where the class might actually
be overridden, a protected accessor is preferred over a protected
property.
* The last resort: Hooks::runner(). Mostly for static, file-scope and
global code. In a few cases it was used for objects with broken
construction schemes, out of horror or laziness.
Constructors with new required arguments:
* AuthManager
* BadFileLookup
* BlockManager
* ClassicInterwikiLookup
* ContentHandlerFactory
* ContentSecurityPolicy
* DefaultOptionsManager
* DerivedPageDataUpdater
* FullSearchResultWidget
* HtmlCacheUpdater
* LanguageFactory
* LanguageNameUtils
* LinkRenderer
* LinkRendererFactory
* LocalisationCache
* MagicWordFactory
* MessageCache
* NamespaceInfo
* PageEditStash
* PageHandlerFactory
* PageUpdater
* ParserFactory
* PermissionManager
* RevisionStore
* RevisionStoreFactory
* SearchEngineConfig
* SearchEngineFactory
* SearchFormWidget
* SearchNearMatcher
* SessionBackend
* SpecialPageFactory
* UserNameUtils
* UserOptionsManager
* WatchedItemQueryService
* WatchedItemStore
Constructors with new optional arguments:
* DefaultPreferencesFactory
* Language
* LinkHolderArray
* MovePage
* Parser
* ParserCache
* PasswordReset
* Router
setHookContainer() now required after construction:
* AuthenticationProvider
* ResourceLoaderModule
* SearchEngine
Change-Id: Id442b0dbe43aba84bd5cf801d86dedc768b082c7
2020-03-19 02:42:09 +00:00
|
|
|
|
$this->hookRunner->onBeforeParserFetchFileAndTitle(
|
2022-03-16 23:34:23 +00:00
|
|
|
|
// @phan-suppress-next-line PhanTypeMismatchArgument Type mismatch on pass-by-ref args
|
2022-03-01 22:39:53 +00:00
|
|
|
|
$this, $title, $options, $descQuery
|
|
|
|
|
|
);
|
2017-01-04 02:15:40 +00:00
|
|
|
|
# Don't register it now, as TraditionalImageGallery does that later.
|
2013-04-25 00:28:03 +00:00
|
|
|
|
$file = $this->fetchFileNoRegister( $title, $options );
|
|
|
|
|
|
$handler = $file ? $file->getHandler() : false;
|
|
|
|
|
|
|
2016-02-17 19:57:37 +00:00
|
|
|
|
$paramMap = [
|
2013-04-25 00:28:03 +00:00
|
|
|
|
'img_alt' => 'gallery-internal-alt',
|
|
|
|
|
|
'img_link' => 'gallery-internal-link',
|
2016-02-17 19:57:37 +00:00
|
|
|
|
];
|
2013-04-25 00:28:03 +00:00
|
|
|
|
if ( $handler ) {
|
2017-05-01 17:18:38 +00:00
|
|
|
|
$paramMap += $handler->getParamMap();
|
2013-04-25 00:28:03 +00:00
|
|
|
|
// We don't want people to specify per-image widths.
|
|
|
|
|
|
// Additionally the width parameter would need special casing anyhow.
|
|
|
|
|
|
unset( $paramMap['img_width'] );
|
|
|
|
|
|
}
|
|
|
|
|
|
|
2018-07-25 12:22:00 +00:00
|
|
|
|
$mwArray = $this->magicWordFactory->newArray( array_keys( $paramMap ) );
|
2013-04-25 00:28:03 +00:00
|
|
|
|
|
2011-04-25 13:51:54 +00:00
|
|
|
|
$label = '';
|
|
|
|
|
|
$alt = '';
|
2016-02-17 19:57:37 +00:00
|
|
|
|
$handlerOptions = [];
|
2022-05-10 19:40:16 +00:00
|
|
|
|
$imageOptions = [];
|
2022-06-07 22:08:23 +00:00
|
|
|
|
$hasAlt = false;
|
2022-05-10 19:40:16 +00:00
|
|
|
|
|
2004-11-13 12:04:31 +00:00
|
|
|
|
if ( isset( $matches[3] ) ) {
|
2011-04-25 13:51:54 +00:00
|
|
|
|
// look for an |alt= definition while trying not to break existing
|
|
|
|
|
|
// captions with multiple pipes (|) in them, until a more sensible grammar
|
|
|
|
|
|
// is defined for images in galleries
|
2011-05-25 15:39:47 +00:00
|
|
|
|
|
2013-04-25 00:28:03 +00:00
|
|
|
|
// FIXME: Doing recursiveTagParse at this stage, and the trim before
|
|
|
|
|
|
// splitting on '|' is a bit odd, and different from makeImage.
|
2011-04-25 13:51:54 +00:00
|
|
|
|
$matches[3] = $this->recursiveTagParse( trim( $matches[3] ) );
|
2016-09-21 18:25:26 +00:00
|
|
|
|
// Protect LanguageConverter markup
|
|
|
|
|
|
$parameterMatches = StringUtils::delimiterExplode(
|
2021-02-25 20:19:54 +00:00
|
|
|
|
'-{', '}-',
|
|
|
|
|
|
'|',
|
|
|
|
|
|
$matches[3],
|
|
|
|
|
|
true /* nested */
|
2016-09-21 18:25:26 +00:00
|
|
|
|
);
|
2011-04-25 13:51:54 +00:00
|
|
|
|
|
2012-04-10 10:30:17 +00:00
|
|
|
|
foreach ( $parameterMatches as $parameterMatch ) {
|
2022-10-21 04:32:38 +00:00
|
|
|
|
[ $magicName, $match ] = $mwArray->matchVariableStartToEnd( $parameterMatch );
|
2021-02-25 20:19:54 +00:00
|
|
|
|
if ( !$magicName ) {
|
2016-01-22 03:24:03 +00:00
|
|
|
|
// Last pipe wins.
|
2018-05-16 15:29:10 +00:00
|
|
|
|
$label = $parameterMatch;
|
2021-02-25 20:19:54 +00:00
|
|
|
|
continue;
|
|
|
|
|
|
}
|
|
|
|
|
|
|
|
|
|
|
|
$paramName = $paramMap[$magicName];
|
|
|
|
|
|
switch ( $paramName ) {
|
|
|
|
|
|
case 'gallery-internal-alt':
|
2022-06-07 22:08:23 +00:00
|
|
|
|
$hasAlt = true;
|
2021-02-25 20:19:54 +00:00
|
|
|
|
$alt = $this->stripAltText( $match, false );
|
|
|
|
|
|
break;
|
|
|
|
|
|
case 'gallery-internal-link':
|
|
|
|
|
|
$linkValue = $this->stripAltText( $match, false );
|
2021-10-13 06:39:36 +00:00
|
|
|
|
if ( preg_match( '/^-{R\|(.*)}-$/', $linkValue ) ) {
|
2021-02-25 20:19:54 +00:00
|
|
|
|
// Result of LanguageConverter::markNoConversion
|
|
|
|
|
|
// invoked on an external link.
|
|
|
|
|
|
$linkValue = substr( $linkValue, 4, -2 );
|
|
|
|
|
|
}
|
2022-10-21 04:32:38 +00:00
|
|
|
|
[ $type, $target ] = $this->parseLinkParameter( $linkValue );
|
2022-05-10 23:15:44 +00:00
|
|
|
|
if ( $type ) {
|
|
|
|
|
|
if ( $type === 'no-link' ) {
|
|
|
|
|
|
$target = true;
|
|
|
|
|
|
}
|
2022-05-10 19:40:16 +00:00
|
|
|
|
$imageOptions[$type] = $target;
|
2021-02-25 20:19:54 +00:00
|
|
|
|
}
|
|
|
|
|
|
break;
|
|
|
|
|
|
default:
|
|
|
|
|
|
// Must be a handler specific parameter.
|
|
|
|
|
|
if ( $handler->validateParam( $paramName, $match ) ) {
|
|
|
|
|
|
$handlerOptions[$paramName] = $match;
|
|
|
|
|
|
} else {
|
|
|
|
|
|
// Guess not, consider it as caption.
|
|
|
|
|
|
$this->logger->debug(
|
|
|
|
|
|
"$parameterMatch failed parameter validation" );
|
|
|
|
|
|
$label = $parameterMatch;
|
|
|
|
|
|
}
|
2011-04-25 13:51:54 +00:00
|
|
|
|
}
|
|
|
|
|
|
}
|
2004-11-13 12:04:31 +00:00
|
|
|
|
}
|
2005-07-03 07:15:53 +00:00
|
|
|
|
|
2022-06-07 22:08:23 +00:00
|
|
|
|
// Match makeImage when !$hasVisibleCaption
|
|
|
|
|
|
if ( !$hasAlt ) {
|
|
|
|
|
|
if ( $label !== '' ) {
|
|
|
|
|
|
$alt = $this->stripAltText( $label, false );
|
|
|
|
|
|
} else {
|
|
|
|
|
|
$alt = $title->getText();
|
|
|
|
|
|
}
|
|
|
|
|
|
}
|
2022-05-13 20:58:41 +00:00
|
|
|
|
$imageOptions['title'] = $this->stripAltText( $label, false );
|
|
|
|
|
|
|
2022-05-10 19:40:16 +00:00
|
|
|
|
$ig->add(
|
|
|
|
|
|
$title, $label, $alt, '', $handlerOptions,
|
|
|
|
|
|
ImageGalleryBase::LOADING_DEFAULT, $imageOptions
|
|
|
|
|
|
);
|
2004-11-13 12:04:31 +00:00
|
|
|
|
}
|
2013-04-25 00:28:03 +00:00
|
|
|
|
$html = $ig->toHTML();
|
2020-03-19 02:42:09 +00:00
|
|
|
|
$this->hookRunner->onAfterParserFetchFileAndTitle( $this, $ig, $html );
|
2013-04-25 00:28:03 +00:00
|
|
|
|
return $html;
|
2004-11-13 12:04:31 +00:00
|
|
|
|
}
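The per-line splitting done at the top of the loop in renderImageGallery() can be sketched in isolation as below. This is only an illustration under stated assumptions: splitGalleryLine() is a hypothetical helper that does not exist in Parser, and it stops before title resolution, hooks and option parsing. It reuses the same regular expression and the same percent-decoding rule as the method above.

```php
<?php
/**
 * Hypothetical helper (not part of Parser): split one gallery line into the
 * file name and the raw text that follows the first pipe.
 *
 * @param string $line One line of gallery text, without the trailing newline
 * @return array|null [ 'name' => string, 'rest' => string ], or null if the line is skipped
 */
function splitGalleryLine( string $line ): ?array {
	$matches = [];
	// Same pattern as in renderImageGallery(): everything up to the first
	// pipe is the name, everything after it is caption/option text.
	preg_match( "/^([^|]+)(\\|(.*))?$/", $line, $matches );
	if ( count( $matches ) == 0 ) {
		// Empty line, or a line that starts with a pipe: nothing to add.
		return null;
	}
	$name = $matches[1];
	// As in the method above, decode the name only when the line contains '%'.
	if ( strpos( $matches[0], '%' ) !== false ) {
		$name = rawurldecode( $name );
	}
	// Group 3 is absent when the line has no pipe at all.
	return [ 'name' => $name, 'rest' => $matches[3] ?? '' ];
}

var_dump( splitGalleryLine( 'Image:one.jpg|The number "1"' ) );
```

In the real method the name is then passed to Title::newFromText() with NS_FILE, and the remaining text is parsed and split further on pipes outside of -{ }- language converter markup.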
|
2005-04-27 07:48:14 +00:00
|
|
|
|
|
2011-08-05 00:33:03 +00:00
|
|
|
|
/**
|
2019-08-27 09:23:52 +00:00
|
|
|
|
* @param MediaHandler|false $handler
|
2011-08-05 00:33:03 +00:00
|
|
|
|
* @return array Two-element array: the parameter map and the matching MagicWordArray
|
|
|
|
|
|
*/
|
2019-11-04 19:23:34 +00:00
|
|
|
|
private function getImageParams( $handler ) {
|
Basic integrated audio/video support, with Ogg implementation.
* JavaScript video player based loosely on Greg Maxwell's player
* Image page text snippet customisation
* Abstraction of transform parameters in the parser. Introduced Linker::makeImageLink2().
* Made canRender(), mustRender() depend on file, not just on handler. Moved width=0, height=0 checking to ImageHandler::canRender(), since audio streams have width=height=0 but should be rendered.
Also:
* Automatic upgrade for oldimage rows on image page view, allows media handler selection based on oi_*_mime
* oi_*_mime unconditionally referenced, REQUIRES SCHEMA UPGRADE
* Don't destroy file info for missing files on upgrade
* Simple, centralised extension message file handling
* Made MessageCache::loadAllMessages non-static, optimised for repeated-call case due to abuse in User.php
* Support for lightweight parser output hooks, with callback whitelist for security
* Moved Linker::formatSize() to Language, to join the new formatTimePeriod() and formatBitrate()
* Introduced MagicWordArray, regex capture trick requires that magic word IDs DO NOT CONTAIN HYPHENS.
2007-08-15 10:50:09 +00:00
|
|
|
|
if ( $handler ) {
|
|
|
|
|
|
$handlerClass = get_class( $handler );
|
|
|
|
|
|
} else {
|
|
|
|
|
|
$handlerClass = '';
|
|
|
|
|
|
}
|
2013-02-03 19:42:08 +00:00
|
|
|
|
if ( !isset( $this->mImageParams[$handlerClass] ) ) {
|
2010-03-30 21:53:56 +00:00
|
|
|
|
# Initialise static lists
|
2016-02-17 19:57:37 +00:00
|
|
|
|
static $internalParamNames = [
|
|
|
|
|
|
'horizAlign' => [ 'left', 'right', 'center', 'none' ],
|
|
|
|
|
|
'vertAlign' => [ 'baseline', 'sub', 'super', 'top', 'text-top', 'middle',
|
|
|
|
|
|
'bottom', 'text-bottom' ],
|
|
|
|
|
|
'frame' => [ 'thumbnail', 'manualthumb', 'framed', 'frameless',
|
|
|
|
|
|
'upright', 'border', 'link', 'alt', 'class' ],
|
|
|
|
|
|
];
|
2007-08-15 10:50:09 +00:00
|
|
|
|
static $internalParamMap;
|
|
|
|
|
|
if ( !$internalParamMap ) {
|
2016-02-17 19:57:37 +00:00
|
|
|
|
$internalParamMap = [];
|
2007-08-15 10:50:09 +00:00
|
|
|
|
foreach ( $internalParamNames as $type => $names ) {
|
|
|
|
|
|
foreach ( $names as $name ) {
|
2017-04-14 21:42:15 +00:00
|
|
|
|
// For grep: img_left, img_right, img_center, img_none,
|
|
|
|
|
|
// img_baseline, img_sub, img_super, img_top, img_text_top, img_middle,
|
|
|
|
|
|
// img_bottom, img_text_bottom,
|
|
|
|
|
|
// img_thumbnail, img_manualthumb, img_framed, img_frameless, img_upright,
|
|
|
|
|
|
// img_border, img_link, img_alt, img_class
|
2007-08-15 10:50:09 +00:00
|
|
|
|
$magicName = str_replace( '-', '_', "img_$name" );
|
2016-02-17 19:57:37 +00:00
|
|
|
|
$internalParamMap[$magicName] = [ $type, $name ];
|
2007-08-15 10:50:09 +00:00
|
|
|
|
}
|
|
|
|
|
|
}
|
|
|
|
|
|
}
|
|
|
|
|
|
|
2010-03-30 21:53:56 +00:00
|
|
|
|
# Add handler params
|
2007-08-15 10:50:09 +00:00
|
|
|
|
$paramMap = $internalParamMap;
|
|
|
|
|
|
if ( $handler ) {
|
|
|
|
|
|
$handlerParamMap = $handler->getParamMap();
|
|
|
|
|
|
foreach ( $handlerParamMap as $magic => $paramName ) {
|
2016-02-17 19:57:37 +00:00
|
|
|
|
$paramMap[$magic] = [ 'handler', $paramName ];
|
2007-08-15 10:50:09 +00:00
|
|
|
|
}
|
2018-02-13 22:51:22 +00:00
|
|
|
|
} else {
|
|
|
|
|
|
// Parse the size for non-existent files. See T273013
|
|
|
|
|
|
$paramMap[ 'img_width' ] = [ 'handler', 'width' ];
|
2007-08-15 10:50:09 +00:00
|
|
|
|
}
|
|
|
|
|
|
$this->mImageParams[$handlerClass] = $paramMap;
|
2018-07-25 12:22:00 +00:00
|
|
|
|
$this->mImageParamsMagicArray[$handlerClass] =
|
|
|
|
|
|
$this->magicWordFactory->newArray( array_keys( $paramMap ) );
|
2007-08-15 10:50:09 +00:00
|
|
|
|
}
|
2016-02-17 19:57:37 +00:00
|
|
|
|
return [ $this->mImageParams[$handlerClass], $this->mImageParamsMagicArray[$handlerClass] ];
|
2007-08-15 10:50:09 +00:00
|
|
|
|
}
|
|
|
|
|
|
|
2005-04-27 07:48:14 +00:00
|
|
|
|
/**
|
|
|
|
|
|
* Parse image options text and use it to make an image
|
2010-06-10 21:05:58 +00:00
|
|
|
|
*
|
2021-04-25 17:29:33 +00:00
|
|
|
|
* @param LinkTarget $link
|
2014-04-21 23:38:39 +00:00
|
|
|
|
* @param string $options
|
2022-07-31 00:02:18 +00:00
|
|
|
|
* @param LinkHolderArray|false $holders
|
2011-03-23 03:13:37 +00:00
|
|
|
|
* @return string HTML
|
2021-02-19 22:49:35 +00:00
|
|
|
|
* @since 1.5
|
2005-04-27 07:48:14 +00:00
|
|
|
|
*/
|
2021-04-25 17:29:33 +00:00
|
|
|
|
public function makeImage( LinkTarget $link, $options, $holders = false ) {
|
2005-04-27 07:48:14 +00:00
|
|
|
|
# Check if the options text is of the form "options|alt text"
|
|
|
|
|
|
# Options are:
|
2008-10-08 16:33:36 +00:00
|
|
|
|
# * thumbnail make a thumbnail with enlarge-icon and caption, alignment depends on lang
|
|
|
|
|
|
# * left no resizing, just left align. label is used for alt= only
|
|
|
|
|
|
# * right same, but right aligned
|
|
|
|
|
|
# * none same, but not aligned
|
|
|
|
|
|
# * ___px scale to ___ pixels width, no aligning. e.g. use in taxobox
|
|
|
|
|
|
# * center center the image
|
2022-04-06 23:29:59 +00:00
|
|
|
|
# * framed Keep original image size, no magnify-button.
|
2008-10-08 16:33:36 +00:00
|
|
|
|
# * frameless like 'thumb' but without a frame. Keeps user preferences for width
|
|
|
|
|
|
# * upright reduce width for upright images, rounded to full __0 px
|
|
|
|
|
|
# * border draw a 1px border around the image
|
|
|
|
|
|
# * alt Text for HTML alt attribute (defaults to empty)
|
2012-08-29 08:07:10 +00:00
|
|
|
|
# * class Set a class for img node
|
2010-01-07 04:13:14 +00:00
|
|
|
|
# * link Set the target of the image link. Can be external, interwiki, or local
|
2007-02-02 00:44:42 +00:00
|
|
|
|
# vertical-align values (no % or length right now):
|
|
|
|
|
|
# * baseline
|
|
|
|
|
|
# * sub
|
|
|
|
|
|
# * super
|
|
|
|
|
|
# * top
|
|
|
|
|
|
# * text-top
|
|
|
|
|
|
# * middle
|
|
|
|
|
|
# * bottom
|
|
|
|
|
|
# * text-bottom
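#
# Worked example (an illustrative sketch only; the option text is
# hypothetical and English magic word aliases are assumed):
#   'thumb|left|200px|alt=A red kite|A kite in flight'
# is split on '|' and each part is classified roughly as follows:
#   thumb             -> $params['frame']['thumbnail']
#   left              -> $params['horizAlign']['left'], later copied to
#                        $params['frame']['align']
#   200px             -> $params['handler']['width'] = 200, if validation passes
#   alt=A red kite    -> $params['frame']['alt'], after stripAltText()
#   A kite in flight  -> matches no magic word, so it becomes $caption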
|
2008-04-14 07:45:50 +00:00
|
|
|
|
|
2016-12-13 19:49:36 +00:00
|
|
|
|
# Protect LanguageConverter markup when splitting into parts
|
|
|
|
|
|
$parts = StringUtils::delimiterExplode(
|
|
|
|
|
|
'-{', '}-', '|', $options, true /* allow nesting */
|
|
|
|
|
|
);
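# For illustration (a sketch with a hypothetical input): with nesting
# protection, 'thumb|-{A|zh-hans:x}-|Caption' should split into
# 'thumb', '-{A|zh-hans:x}-' and 'Caption', because the '|' inside the
# '-{ ... }-' pair is not treated as a separator.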
|
2005-04-27 07:48:14 +00:00
|
|
|
|
|
2007-08-15 10:50:09 +00:00
|
|
|
|
# Give extensions a chance to select the file revision for us
|
2016-02-17 19:57:37 +00:00
|
|
|
|
$options = [];
|
2011-09-06 18:11:53 +00:00
|
|
|
|
$descQuery = false;
|
2021-04-25 17:29:33 +00:00
|
|
|
|
$title = Title::castFromLinkTarget( $link ); // hook signature compat
|
2020-03-19 02:42:09 +00:00
|
|
|
|
$this->hookRunner->onBeforeParserFetchFileAndTitle(
|
2022-03-16 23:34:23 +00:00
|
|
|
|
// @phan-suppress-next-line PhanTypeMismatchArgument Type mismatch on pass-by-ref args
|
2022-03-01 22:39:53 +00:00
|
|
|
|
$this, $title, $options, $descQuery
|
|
|
|
|
|
);
|
2011-03-23 17:35:40 +00:00
|
|
|
|
# Fetch and register the file (file title may be different via hooks)
|
2022-10-21 04:32:38 +00:00
|
|
|
|
[ $file, $link ] = $this->fetchFileAndTitle( $link, $options );
|
2011-03-24 01:44:48 +00:00
|
|
|
|
|
2007-08-15 10:50:09 +00:00
|
|
|
|
# Get parameter map
|
|
|
|
|
|
$handler = $file ? $file->getHandler() : false;
|
2005-04-27 07:48:14 +00:00
|
|
|
|
|
2022-10-21 04:32:38 +00:00
|
|
|
|
[ $paramMap, $mwArray ] = $this->getImageParams( $handler );
|
2007-08-15 10:50:09 +00:00
|
|
|
|
|
2011-04-20 19:43:47 +00:00
|
|
|
|
if ( !$file ) {
|
|
|
|
|
|
$this->addTrackingCategory( 'broken-file-category' );
|
|
|
|
|
|
}
|
|
|
|
|
|
|
2007-08-15 10:50:09 +00:00
|
|
|
|
# Process the input parameters
|
2008-08-08 21:50:37 +00:00
|
|
|
|
$caption = '';
|
2016-02-17 19:57:37 +00:00
|
|
|
|
$params = [ 'frame' => [], 'handler' => [],
|
|
|
|
|
|
'horizAlign' => [], 'vertAlign' => [] ];
|
2014-03-18 16:59:21 +00:00
|
|
|
|
$seenformat = false;
|
2010-03-30 21:20:05 +00:00
|
|
|
|
foreach ( $parts as $part ) {
|
2008-08-26 14:37:15 +00:00
|
|
|
|
$part = trim( $part );
|
2022-10-21 04:32:38 +00:00
|
|
|
|
[ $magicName, $value ] = $mwArray->matchVariableStartToEnd( $part );
|
2008-03-25 05:17:42 +00:00
|
|
|
|
$validated = false;
|
2010-03-30 21:20:05 +00:00
|
|
|
|
if ( isset( $paramMap[$magicName] ) ) {
|
2022-10-21 04:32:38 +00:00
|
|
|
|
[ $type, $paramName ] = $paramMap[$magicName];
|
2008-03-25 05:17:42 +00:00
|
|
|
|
|
2010-03-30 21:53:56 +00:00
|
|
|
|
# Special case; width and height come in one variable together
|
2010-03-30 21:20:05 +00:00
|
|
|
|
if ( $type === 'handler' && $paramName === 'width' ) {
|
2016-06-19 06:41:43 +00:00
|
|
|
|
$parsedWidthParam = self::parseWidthParam( $value );
|
2018-02-13 22:51:22 +00:00
|
|
|
|
// Parsoid applies data-(width|height) attributes to broken
|
|
|
|
|
|
// media spans, for client use. See T273013
|
|
|
|
|
|
$validateFunc = static function ( $name, $value ) use ( $handler ) {
|
|
|
|
|
|
return $handler
|
|
|
|
|
|
? $handler->validateParam( $name, $value )
|
|
|
|
|
|
: $value > 0;
|
|
|
|
|
|
};
|
2013-04-20 15:38:24 +00:00
|
|
|
|
if ( isset( $parsedWidthParam['width'] ) ) {
|
2012-07-25 15:31:47 +00:00
|
|
|
|
$width = $parsedWidthParam['width'];
|
2018-02-13 22:51:22 +00:00
|
|
|
|
if ( $validateFunc( 'width', $width ) ) {
|
2008-03-25 08:23:10 +00:00
|
|
|
|
$params[$type]['width'] = $width;
|
|
|
|
|
|
$validated = true;
|
|
|
|
|
|
}
|
2012-07-25 15:31:47 +00:00
|
|
|
|
}
|
2013-04-20 15:38:24 +00:00
|
|
|
|
if ( isset( $parsedWidthParam['height'] ) ) {
|
2012-07-25 15:31:47 +00:00
|
|
|
|
$height = $parsedWidthParam['height'];
|
2018-02-13 22:51:22 +00:00
|
|
|
|
if ( $validateFunc( 'height', $height ) ) {
|
2008-03-25 08:23:10 +00:00
|
|
|
|
$params[$type]['height'] = $height;
|
|
|
|
|
|
$validated = true;
|
|
|
|
|
|
}
|
2012-07-25 15:31:47 +00:00
|
|
|
|
}
|
2016-12-11 22:45:07 +00:00
|
|
|
|
# else no validation -- T15436
|
2008-03-25 05:17:42 +00:00
|
|
|
|
} else {
|
2008-08-26 14:37:15 +00:00
|
|
|
|
if ( $type === 'handler' ) {
|
2008-03-25 05:17:42 +00:00
|
|
|
|
# Validate handler parameter
|
|
|
|
|
|
$validated = $handler->validateParam( $paramName, $value );
|
|
|
|
|
|
} else {
|
|
|
|
|
|
# Validate internal parameters
|
2013-04-26 14:42:31 +00:00
|
|
|
|
switch ( $paramName ) {
|
2017-12-11 03:07:50 +00:00
|
|
|
|
case 'alt':
|
|
|
|
|
|
case 'class':
|
2008-10-06 05:55:27 +00:00
|
|
|
|
$validated = true;
|
2019-11-04 19:23:34 +00:00
|
|
|
|
$value = $this->stripAltText( $value, $holders );
|
2017-12-11 03:07:50 +00:00
|
|
|
|
break;
|
|
|
|
|
|
case 'link':
|
2022-10-21 04:32:38 +00:00
|
|
|
|
[ $paramName, $value ] =
|
2019-11-04 19:23:34 +00:00
|
|
|
|
$this->parseLinkParameter(
|
|
|
|
|
|
$this->stripAltText( $value, $holders )
|
2018-10-15 20:39:19 +00:00
|
|
|
|
);
|
2018-05-19 13:29:52 +00:00
|
|
|
|
if ( $paramName ) {
|
2008-10-06 05:55:27 +00:00
|
|
|
|
$validated = true;
|
2018-05-19 13:29:52 +00:00
|
|
|
|
if ( $paramName === 'no-link' ) {
|
|
|
|
|
|
$value = true;
|
|
|
|
|
|
}
|
2008-10-06 05:55:27 +00:00
|
|
|
|
}
|
2017-12-11 03:07:50 +00:00
|
|
|
|
break;
|
2022-04-07 14:30:54 +00:00
|
|
|
|
case 'manualthumb':
|
|
|
|
|
|
# @todo FIXME: Possibly check validity here for
|
|
|
|
|
|
# manualthumb? downstream behavior seems odd with
|
|
|
|
|
|
# missing manual thumbs.
|
|
|
|
|
|
$value = $this->stripAltText( $value, $holders );
|
|
|
|
|
|
// fall through
|
2017-12-11 03:07:50 +00:00
|
|
|
|
case 'frameless':
|
|
|
|
|
|
case 'framed':
|
|
|
|
|
|
case 'thumbnail':
|
|
|
|
|
|
// use first appearing option, discard others.
|
|
|
|
|
|
$validated = !$seenformat;
|
|
|
|
|
|
$seenformat = true;
|
|
|
|
|
|
break;
|
|
|
|
|
|
default:
|
|
|
|
|
|
# Most other things appear to be empty or numeric...
|
|
|
|
|
|
$validated = ( $value === false || is_numeric( trim( $value ) ) );
|
2008-04-07 23:30:45 +00:00
|
|
|
|
}
|
2008-03-25 05:17:42 +00:00
|
|
|
|
}
|
|
|
|
|
|
|
|
|
|
|
|
if ( $validated ) {
|
|
|
|
|
|
$params[$type][$paramName] = $value;
|
2007-08-25 15:49:36 +00:00
|
|
|
|
}
|
|
|
|
|
|
}
|
2008-03-25 05:17:42 +00:00
|
|
|
|
}
|
|
|
|
|
|
if ( !$validated ) {
|
2007-08-15 10:50:09 +00:00
|
|
|
|
$caption = $part;
|
|
|
|
|
|
}
|
|
|
|
|
|
}
|
|
|
|
|
|
|
|
|
|
|
|
# Process alignment parameters
|
2022-02-18 00:10:12 +00:00
|
|
|
|
if ( $params['horizAlign'] !== [] ) {
|
2022-10-21 03:55:44 +00:00
|
|
|
|
$params['frame']['align'] = array_key_first( $params['horizAlign'] );
|
2007-08-15 10:50:09 +00:00
|
|
|
|
}
|
2022-02-18 00:10:12 +00:00
|
|
|
|
if ( $params['vertAlign'] !== [] ) {
|
2022-10-21 03:55:44 +00:00
|
|
|
|
$params['frame']['valign'] = array_key_first( $params['vertAlign'] );
|
2007-08-15 10:50:09 +00:00
|
|
|
|
}
|
2008-08-08 21:50:37 +00:00
|
|
|
|
|
2008-10-08 16:33:36 +00:00
|
|
|
|
$params['frame']['caption'] = $caption;
|
|
|
|
|
|
|
2009-07-03 05:13:58 +00:00
|
|
|
|
# Will the image be presented in a frame, with the caption below?
|
2022-02-18 00:10:12 +00:00
|
|
|
|
// @phan-suppress-next-line PhanImpossibleCondition
|
2022-05-09 20:35:04 +00:00
|
|
|
|
$hasVisibleCaption = isset( $params['frame']['framed'] )
|
2022-02-18 00:10:12 +00:00
|
|
|
|
// @phan-suppress-next-line PhanImpossibleCondition
|
2013-12-01 19:58:51 +00:00
|
|
|
|
|| isset( $params['frame']['thumbnail'] )
|
2022-02-18 00:10:12 +00:00
|
|
|
|
// @phan-suppress-next-line PhanImpossibleCondition
|
2013-12-01 19:58:51 +00:00
|
|
|
|
|| isset( $params['frame']['manualthumb'] );
|
2008-10-08 16:33:36 +00:00
|
|
|
|
|
|
|
|
|
|
# In the old days, [[Image:Foo|text...]] would set alt text. Later it
# came to also set the caption, ordinary text after the image -- which
# makes no sense, because that just repeats the text multiple times in
# screen readers. It *also* came to set the title attribute.
# Now that we have an alt attribute, we should not set the alt text to
# equal the caption: that's worse than useless, it just repeats the
# text. This is the framed/thumbnail case. If there's no caption, we
# use the unnamed parameter for alt text as well, just for the time
# being, if the unnamed param is set and the alt param is not.
# For the future, we need to figure out if we want to tweak this more,
# e.g., introducing a title= parameter for the title; ignoring the
# unnamed parameter entirely for images without a caption; adding an
# explicit caption= parameter and preserving the old magic unnamed
# parameter for BC; ...
|
2022-05-09 20:35:04 +00:00
|
|
|
|
if ( $hasVisibleCaption ) {
|
2022-02-18 00:10:12 +00:00
|
|
|
|
// @phan-suppress-next-line PhanImpossibleCondition
|
2009-07-03 05:13:58 +00:00
|
|
|
|
if ( $caption === '' && !isset( $params['frame']['alt'] ) ) {
|
|
|
|
|
|
# No caption or alt text, add the filename as the alt text so
|
|
|
|
|
|
# that screen readers at least get some description of the image
|
2021-04-25 17:29:33 +00:00
|
|
|
|
$params['frame']['alt'] = $link->getText();
|
2009-07-03 05:13:58 +00:00
|
|
|
|
}
|
2022-05-09 20:35:04 +00:00
|
|
|
|
# Do not set $params['frame']['title'] because tooltips are unnecessary
|
|
|
|
|
|
# for framed images, the caption is visible
|
|
|
|
|
|
} else {
|
2022-02-18 00:10:12 +00:00
|
|
|
|
// @phan-suppress-next-line PhanImpossibleCondition
|
2009-07-03 05:13:58 +00:00
|
|
|
|
if ( !isset( $params['frame']['alt'] ) ) {
|
|
|
|
|
|
# No alt text, use the "caption" for the alt text
|
2013-02-09 22:03:53 +00:00
|
|
|
|
if ( $caption !== '' ) {
|
2019-11-04 19:23:34 +00:00
|
|
|
|
$params['frame']['alt'] = $this->stripAltText( $caption, $holders );
|
2009-07-03 05:13:58 +00:00
|
|
|
|
} else {
|
|
|
|
|
|
# No caption, fall back to using the filename for the
|
|
|
|
|
|
# alt text
|
2021-04-25 17:29:33 +00:00
|
|
|
|
$params['frame']['alt'] = $link->getText();
|
2009-07-03 05:13:58 +00:00
|
|
|
|
}
|
|
|
|
|
|
}
|
|
|
|
|
|
# Use the "caption" for the tooltip text
|
2019-11-04 19:23:34 +00:00
|
|
|
|
$params['frame']['title'] = $this->stripAltText( $caption, $holders );
|
2008-10-08 16:33:36 +00:00
|
|
|
|
}
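# Worked summary of the branches above (a sketch derived from this code
# only; 'Cap' is a hypothetical caption and alt= is assumed to be unset):
#   framed/thumb, caption 'Cap' -> alt stays unset, title is not set
#   framed/thumb, no caption    -> alt = $link->getText(), title is not set
#   unframed, caption 'Cap'     -> alt = title = stripAltText( 'Cap' )
#   unframed, no caption        -> alt = $link->getText(),
#                                  title = stripAltText( '' )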
|
2019-03-08 23:23:10 +00:00
|
|
|
|
$params['handler']['targetlang'] = $this->getTargetLanguage()->getCode();
|
2007-05-31 16:01:26 +00:00
|
|
|
|
|
2021-04-25 17:29:33 +00:00
|
|
|
|
// hook signature compat again, $link may have changed
|
|
|
|
|
|
$title = Title::castFromLinkTarget( $link );
|
2020-03-19 02:42:09 +00:00
|
|
|
|
$this->hookRunner->onParserMakeImageParams( $title, $file, $params, $this );
|
2008-02-21 10:25:44 +00:00
|
|
|
|
|
2005-04-27 07:48:14 +00:00
|
|
|
|
# Linker does the rest
|
2017-10-06 22:17:58 +00:00
|
|
|
|
$time = $options['time'] ?? false;
|
2021-04-25 17:29:33 +00:00
|
|
|
|
$ret = Linker::makeImageLink( $this, $link, $file, $params['frame'], $params['handler'],
|
2011-03-23 03:13:37 +00:00
|
|
|
|
$time, $descQuery, $this->mOptions->getThumbSize() );
|
2007-08-15 10:50:09 +00:00
|
|
|
|
|
|
|
|
|
|
# Give the handler a chance to modify the parser object
|
|
|
|
|
|
if ( $handler ) {
|
|
|
|
|
|
$handler->parserTransformHook( $this, $file );
|
2007-05-31 16:01:26 +00:00
|
|
|
|
}
|
2021-12-02 05:42:04 +00:00
|
|
|
|
if ( $file ) {
|
|
|
|
|
|
$this->modifyImageHtml( $file, $params, $ret );
|
|
|
|
|
|
}
|
2007-08-15 10:50:09 +00:00
|
|
|
|
|
|
|
|
|
|
return $ret;
|
2005-04-27 07:48:14 +00:00
|
|
|
|
}
|
2010-01-07 04:13:14 +00:00
|
|
|
|
|
2019-10-29 07:34:25 +00:00
|
|
|
|
/**
|
|
|
|
|
|
* Parse the value of 'link' parameter in image syntax (`[[File:Foo.jpg|link=<value>]]`).
|
|
|
|
|
|
*
|
|
|
|
|
|
* Adds an entry to appropriate link tables.
|
|
|
|
|
|
*
|
|
|
|
|
|
* @since 1.32
|
|
|
|
|
|
* @param string $value
|
|
|
|
|
|
* @return array of `[ type, target ]`, where:
|
|
|
|
|
|
* - `type` is one of:
|
|
|
|
|
|
* - `null`: Given value is not a valid link target, use default
|
|
|
|
|
|
* - `'no-link'`: Given value is empty, do not generate a link
|
|
|
|
|
|
* - `'link-url'`: Given value is a valid external link
|
|
|
|
|
|
* - `'link-title'`: Given value is a valid internal link
|
|
|
|
|
|
* - `target` is:
|
|
|
|
|
|
* - When `type` is `null` or `'no-link'`: `false`
|
|
|
|
|
|
* - When `type` is `'link-url'`: URL string corresponding to given value
|
|
|
|
|
|
* - When `type` is `'link-title'`: Title object corresponding to given value
|
|
|
|
|
|
*/
|
2019-11-04 19:23:34 +00:00
|
|
|
|
private function parseLinkParameter( $value ) {
|
2018-05-19 13:29:52 +00:00
|
|
|
|
$chars = self::EXT_LINK_URL_CLASS;
|
|
|
|
|
|
$addr = self::EXT_LINK_ADDR;
|
2022-04-28 13:33:39 +00:00
|
|
|
|
$prots = $this->urlUtils->validProtocols();
|
2018-05-19 13:29:52 +00:00
|
|
|
|
$type = null;
|
|
|
|
|
|
$target = false;
|
|
|
|
|
|
if ( $value === '' ) {
|
|
|
|
|
|
$type = 'no-link';
|
|
|
|
|
|
} elseif ( preg_match( "/^((?i)$prots)/", $value ) ) {
|
|
|
|
|
|
if ( preg_match( "/^((?i)$prots)$addr$chars*$/u", $value, $m ) ) {
|
|
|
|
|
|
$this->mOutput->addExternalLink( $value );
|
|
|
|
|
|
$type = 'link-url';
|
|
|
|
|
|
$target = $value;
|
|
|
|
|
|
}
|
|
|
|
|
|
} else {
|
2022-04-06 20:02:40 +00:00
|
|
|
|
// Percent-decode link arguments for consistency with wikilink
|
|
|
|
|
|
// handling (T216003#7836261).
|
|
|
|
|
|
//
|
|
|
|
|
|
// There's a slight concern here, though. The |link= option supports
|
|
|
|
|
|
// two formats, link=Test%22test vs link=[[Test%22test]], both of
|
|
|
|
|
|
// which are about to be decoded.
|
|
|
|
|
|
//
|
|
|
|
|
|
// In the former case, the decoding here is straightforward and
|
|
|
|
|
|
// desirable.
|
|
|
|
|
|
//
|
|
|
|
|
|
// In the latter case, there's a potential for double decoding,
|
|
|
|
|
|
// because the wikilink syntax has a higher precedence and has
|
|
|
|
|
|
// already been parsed as a link before we get here. $value
|
|
|
|
|
|
// has had stripAltText() called on it, which in turn calls
|
|
|
|
|
|
// replaceLinkHoldersText() on the link. So, the text we're
|
|
|
|
|
|
// getting at this point has already been percent decoded.
|
|
|
|
|
|
//
|
|
|
|
|
|
// The problematic case is if %25 is in the title, since that
|
|
|
|
|
|
// decodes to %, which could combine with trailing characters.
|
|
|
|
|
|
// However, % is not a valid link title character, so it would
|
|
|
|
|
|
// not parse as a link and the string we received here would
|
|
|
|
|
|
// still contain the encoded %25.
|
|
|
|
|
|
//
|
|
|
|
|
|
// Hence, double decoding is not an issue. See the test,
|
|
|
|
|
|
// "Should not double decode the link option"
|
|
|
|
|
|
if ( strpos( $value, '%' ) !== false ) {
|
|
|
|
|
|
$value = rawurldecode( $value );
|
|
|
|
|
|
}
|
2018-05-19 13:29:52 +00:00
|
|
|
|
$linkTitle = Title::newFromText( $value );
|
|
|
|
|
|
if ( $linkTitle ) {
|
|
|
|
|
|
$this->mOutput->addLink( $linkTitle );
|
|
|
|
|
|
$type = 'link-title';
|
|
|
|
|
|
$target = $linkTitle;
|
|
|
|
|
|
}
|
|
|
|
|
|
}
|
|
|
|
|
|
return [ $type, $target ];
|
|
|
|
|
|
}
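The three-way branching in parseLinkParameter() can be illustrated with a standalone sketch. Everything here is an assumption made for illustration: `classifyLinkValueSketch()` is a hypothetical name, the protocol list is a tiny stand-in for `UrlUtils::validProtocols()`, the URL character check stands in for `EXT_LINK_ADDR` and `EXT_LINK_URL_CLASS`, and the title check is a crude substitute for `Title::newFromText()`. It also skips the link-table bookkeeping.

```php
<?php
/**
 * Hypothetical stand-in for the branching in Parser::parseLinkParameter().
 * It mirrors only the control flow, not the real URL grammar or Title rules.
 */
function classifyLinkValueSketch( string $value ): array {
	// Assumption: a short protocol list instead of UrlUtils::validProtocols().
	$prots = 'https?:\/\/|ftp:\/\/|mailto:';

	if ( $value === '' ) {
		// Empty value: do not generate a link at all.
		return [ 'no-link', false ];
	}

	if ( preg_match( '/^(?i:' . $prots . ')/', $value ) ) {
		// Assumption: "no whitespace, quotes or angle brackets" approximates
		// the character classes used by the real parser.
		if ( preg_match( '/^(?i:' . $prots . ')[^\s"<>]+$/u', $value ) ) {
			return [ 'link-url', $value ];
		}
		// Looks like a URL but is malformed: fall back to the default.
		return [ null, false ];
	}

	// Percent-decode, as the real method does before building a Title.
	if ( strpos( $value, '%' ) !== false ) {
		$value = rawurldecode( $value );
	}

	// Assumption: reject a few characters that are not legal in titles by default.
	if ( preg_match( '/[<>\[\]{}|]/', $value ) ) {
		return [ null, false ];
	}
	return [ 'link-title', $value ];
}
```

In the real method the `'link-title'` target is a Title object rather than a string; the sketch returns the decoded text only to stay self-contained.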
|
|
|
|
|
|
|
2021-12-02 05:42:04 +00:00
|
|
|
|
/**
|
|
|
|
|
|
* Give hooks a chance to modify image thumbnail HTML
|
|
|
|
|
|
*
|
|
|
|
|
|
* @param File $file
|
|
|
|
|
|
* @param array $params
|
|
|
|
|
|
* @param string &$html
|
|
|
|
|
|
*/
|
|
|
|
|
|
public function modifyImageHtml( File $file, array $params, string &$html ) {
|
|
|
|
|
|
$this->hookRunner->onParserModifyImageHTML( $this, $file, $params, $html );
|
|
|
|
|
|
}
|
|
|
|
|
|
|
2019-10-29 07:34:25 +00:00
|
|
|
|
/**
|
|
|
|
|
|
* @param string $caption
|
2022-07-31 00:02:18 +00:00
|
|
|
|
* @param LinkHolderArray|false $holders
|
2019-10-29 07:34:25 +00:00
|
|
|
|
* @return mixed|string
|
|
|
|
|
|
*/
|
2019-11-04 19:23:34 +00:00
|
|
|
|
private function stripAltText( $caption, $holders ) {
|
2008-10-15 21:20:13 +00:00
|
|
|
|
# Strip bad stuff out of the title (tooltip). We can't just use
|
|
|
|
|
|
# replaceLinkHoldersText() here, because if this function is called
|
Deprecate Parser implementation methods (will be private in next release)
The following public methods were renamed and made private; the old name
is hard-deprecated and calls the new renamed private method:
Parser::doMagicLinks() => handleMagicLinks()
Parser::doDoubleUnderscore() => handleDoubleUnderscore()
Parser::doHeadings() => handleHeadings()
Parser::doAllQuotes() => handleAllQuotes()
Parser::replaceExternalLinks() => handleExternalLinks()
Parser::replaceInternalLinks() => handleInternalLinks()
Parser::replaceInternalLinks2() => handleInternalLinks2()
Parser::getVariableValue() => expandMagicVariable()
Parser::initialiseVariables() => initializeVariables()
Parser::formatHeadings() => finalizeHeadings()
Parser::test{Pst,Preprocess,Srvus}() => fuzzTest{Pst,Preprocess,Srvus}()
Additionally, the following methods are not used externally, but are
used outside the Parser class by core code. They have been marked
@internal:
Parser::doQuotes() (used by {{#displaytitle}}),
Parser::getExternalLink{Rel,Attribs}() (used by Linker),
Parser::normalizeLinkUrl() (used by Special:LinkSearch and elsewhere).
Parser::{brace,arg,extension}Substitution() (used by PPFrame)
Code search query:
https://codesearch.wmflabs.org/deployed/?q=do%28MagicLinks%7CDoubleUnderscore%7CHeadings%7CAllQuotes%29%7Creplace%28ExternalLinks%7CInternalLinks%28%7C2%29%29%7CgetVariableValue%7CinitialiseVariables%7CformatHeadings%7Ctest%28Pst%7CPreprocess%7CSrvus%29%7CdoQuotes%7CgetExternalLink%28Rel%7CAttribs%29%7CnormalizeLinkUrl%7C%28brace%2Carg%2Cextension%29Substitution&i=nope&files=&repos=
Bug: T236810
Change-Id: I19a43ffc5dcfdd2981b51079c33422c964acb076
2019-10-28 19:52:50 +00:00
|
|
|
|
# from handleInternalLinks2(), mLinkHolders won't be up-to-date.
|
2008-10-15 21:20:13 +00:00
|
|
|
|
if ( $holders ) {
|
|
|
|
|
|
$tooltip = $holders->replaceText( $caption );
|
|
|
|
|
|
} else {
|
2019-11-04 19:23:34 +00:00
|
|
|
|
$tooltip = $this->replaceLinkHoldersText( $caption );
|
2008-10-15 21:20:13 +00:00
|
|
|
|
}
|
|
|
|
|
|
|
|
|
|
|
|
# make sure there are no placeholders in thumbnail attributes
|
|
|
|
|
|
# that are later expanded to HTML, so expand them now and
|
|
|
|
|
|
# remove the tags
|
|
|
|
|
|
$tooltip = $this->mStripState->unstripBoth( $tooltip );
|
2018-11-26 17:53:51 +00:00
|
|
|
|
# Compatibility hack! In HTML certain entity references not terminated
|
|
|
|
|
|
# by a semicolon are decoded (but not if we're in an attribute; that's
|
|
|
|
|
|
# how link URLs get away without properly escaping & in queries).
|
|
|
|
|
|
# But wikitext has always required semicolon-termination of entities,
|
|
|
|
|
|
# so encode & where needed to avoid decode of semicolon-less entities.
|
|
|
|
|
|
# See T209236 and
|
|
|
|
|
|
# https://www.w3.org/TR/html5/syntax.html#named-character-references
|
|
|
|
|
|
# T210437 discusses moving this workaround to Sanitizer::stripAllTags.
|
|
|
|
|
|
$tooltip = preg_replace( "/
|
|
|
|
|
|
& # 1. entity prefix
|
|
|
|
|
|
(?= # 2. followed by:
|
|
|
|
|
|
(?: # a. one of the legacy semicolon-less named entities
|
|
|
|
|
|
A(?:Elig|MP|acute|circ|grave|ring|tilde|uml)|
|
|
|
|
|
|
C(?:OPY|cedil)|E(?:TH|acute|circ|grave|uml)|
|
|
|
|
|
|
GT|I(?:acute|circ|grave|uml)|LT|Ntilde|
|
|
|
|
|
|
O(?:acute|circ|grave|slash|tilde|uml)|QUOT|REG|THORN|
|
|
|
|
|
|
U(?:acute|circ|grave|uml)|Yacute|
|
|
|
|
|
|
a(?:acute|c(?:irc|ute)|elig|grave|mp|ring|tilde|uml)|brvbar|
|
|
|
|
|
|
c(?:cedil|edil|urren)|cent(?!erdot;)|copy(?!sr;)|deg|
|
|
|
|
|
|
divide(?!ontimes;)|e(?:acute|circ|grave|th|uml)|
|
|
|
|
|
|
frac(?:1(?:2|4)|34)|
|
|
|
|
|
|
gt(?!c(?:c|ir)|dot|lPar|quest|r(?:a(?:pprox|rr)|dot|eq(?:less|qless)|less|sim);)|
|
|
|
|
|
|
i(?:acute|circ|excl|grave|quest|uml)|laquo|
|
|
|
|
|
|
lt(?!c(?:c|ir)|dot|hree|imes|larr|quest|r(?:Par|i(?:e|f|));)|
|
|
|
|
|
|
m(?:acr|i(?:cro|ddot))|n(?:bsp|tilde)|
|
|
|
|
|
|
not(?!in(?:E|dot|v(?:a|b|c)|)|ni(?:v(?:a|b|c)|);)|
|
|
|
|
|
|
o(?:acute|circ|grave|rd(?:f|m)|slash|tilde|uml)|
|
|
|
|
|
|
p(?:lusmn|ound)|para(?!llel;)|quot|r(?:aquo|eg)|
|
|
|
|
|
|
s(?:ect|hy|up(?:1|2|3)|zlig)|thorn|times(?!b(?:ar|)|d;)|
|
|
|
|
|
|
u(?:acute|circ|grave|ml|uml)|y(?:acute|en|uml)
|
|
|
|
|
|
)
|
|
|
|
|
|
(?:[^;]|$)) # b. and not followed by a semicolon
|
|
|
|
|
|
# S = study, for efficiency
|
|
|
|
|
|
/Sx", '&amp;', $tooltip );
|
2008-10-15 21:20:13 +00:00
|
|
|
|
$tooltip = Sanitizer::stripAllTags( $tooltip );
|
2010-01-07 04:13:14 +00:00
|
|
|
|
|
2008-10-15 21:20:13 +00:00
|
|
|
|
return $tooltip;
|
|
|
|
|
|
}
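The ampersand escaping described in stripAltText() can be tried out in isolation. The sketch below is a reduced version under stated assumptions: `escapeLegacyEntitiesSketch()` is a hypothetical name, and the pattern covers only five legacy entity names, whereas the real pattern lists every legacy name and uses negative lookaheads to skip longer names that share a prefix.

```php
<?php
/**
 * Hypothetical, reduced version of the "legacy semicolon-less entity" escape.
 * An ampersand is rewritten to &amp; only when it starts one of the listed
 * names and that name is NOT terminated by a semicolon.
 */
function escapeLegacyEntitiesSketch( string $text ): string {
	// Assumption: a five-name subset chosen for illustration.
	return preg_replace(
		'/&(?=(?:amp|copy|nbsp|quot|reg)(?:[^;]|$))/',
		'&amp;',
		$text
	);
}
```

Properly terminated references such as `&copy;` pass through unchanged, as do ampersands that do not start a listed name.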
|
2005-08-07 12:09:46 +00:00
|
|
|
|
|
2010-06-10 21:05:58 +00:00
|
|
|
|
/**
|
2005-08-23 21:49:48 +00:00
|
|
|
|
* Callback from the Sanitizer for expanding items found in HTML attribute
|
|
|
|
|
|
* values, so they can be safely tested and escaped.
|
2010-06-10 21:05:58 +00:00
|
|
|
|
*
|
2017-08-11 00:23:16 +00:00
|
|
|
|
* @param string &$text
|
2022-07-31 00:02:18 +00:00
|
|
|
|
* @param PPFrame|false $frame
|
2014-04-21 23:38:39 +00:00
|
|
|
|
* @return string
|
2020-01-25 15:45:18 +00:00
|
|
|
|
* @deprecated since 1.35, internal callback should not have been public
|
2005-08-23 21:49:48 +00:00
|
|
|
|
*/
|
2014-08-11 20:24:54 +00:00
|
|
|
|
public function attributeStripCallback( &$text, $frame = false ) {
|
2020-01-25 15:45:18 +00:00
|
|
|
|
wfDeprecated( __METHOD__, '1.35' );
|
2007-11-20 10:55:08 +00:00
|
|
|
|
$text = $this->replaceVariables( $text, $frame );
|
2006-11-21 09:53:45 +00:00
|
|
|
|
$text = $this->mStripState->unstripBoth( $text );
|
2005-08-29 23:34:37 +00:00
|
|
|
|
return $text;
|
|
|
|
|
|
}
|
2006-01-07 13:31:29 +00:00
|
|
|
|
|
2010-06-10 21:05:58 +00:00
|
|
|
|
/**
|
2006-01-08 15:13:37 +00:00
|
|
|
|
* Accessor
|
2011-04-29 23:34:37 +00:00
|
|
|
|
*
|
|
|
|
|
|
* @return array
|
2021-02-19 22:49:35 +00:00
|
|
|
|
* @since 1.6
|
2006-01-08 15:13:37 +00:00
|
|
|
|
*/
|
2022-09-28 20:39:23 +00:00
|
|
|
|
public function getTags(): array {
|
2020-08-26 17:43:57 +00:00
|
|
|
|
return array_keys( $this->mTagHooks );
|
2010-03-30 21:20:05 +00:00
|
|
|
|
}
|
2006-06-06 00:51:34 +00:00
|
|
|
|
|
2018-08-14 05:44:48 +00:00
|
|
|
|
/**
|
|
|
|
|
|
* @since 1.32
|
|
|
|
|
|
* @return array
|
|
|
|
|
|
*/
|
|
|
|
|
|
public function getFunctionSynonyms() {
|
|
|
|
|
|
return $this->mFunctionSynonyms;
|
|
|
|
|
|
}
|
|
|
|
|
|
|
|
|
|
|
|
/**
|
|
|
|
|
|
* @since 1.32
|
|
|
|
|
|
* @return string
|
|
|
|
|
|
*/
|
|
|
|
|
|
public function getUrlProtocols() {
|
2022-04-28 13:33:39 +00:00
|
|
|
|
return $this->urlUtils->validProtocols();
|
2018-08-14 05:44:48 +00:00
|
|
|
|
}
|
|
|
|
|
|
|
2006-06-06 00:51:34 +00:00
|
|
|
|
/**
|
|
|
|
|
|
* Break wikitext input into sections, and either pull or replace
|
|
|
|
|
|
* some particular section's text.
|
|
|
|
|
|
*
|
|
|
|
|
|
* External callers should use the getSection and replaceSection methods.
|
|
|
|
|
|
*
|
2013-03-11 17:15:01 +00:00
|
|
|
|
* @param string $text Page wikitext
|
2016-12-14 16:01:47 +00:00
|
|
|
|
* @param string|int $sectionId A section identifier string of the form:
|
2012-07-10 12:48:06 +00:00
|
|
|
|
* "<flag1> - <flag2> - ... - <section number>"
|
2008-01-05 12:39:12 +00:00
|
|
|
|
*
|
|
|
|
|
|
* Currently the only recognised flag is "T", which means the target section number
|
2008-04-14 07:45:50 +00:00
|
|
|
|
* was derived during a template inclusion parse, in other words this is a template
|
|
|
|
|
|
* section edit link. If no flags are given, it was an ordinary section edit link.
|
|
|
|
|
|
* This flag is required to avoid a section numbering mismatch when a section is
|
2016-12-11 22:45:07 +00:00
|
|
|
|
* enclosed by "<includeonly>" (T8563).
|
2008-01-05 12:39:12 +00:00
|
|
|
|
*
|
2008-04-14 07:45:50 +00:00
|
|
|
|
* The section number 0 pulls the text before the first heading; other numbers will
|
|
|
|
|
|
* pull the given section along with its lower-level subsections. If the section is
|
2008-01-05 12:39:12 +00:00
|
|
|
|
* not found, $mode=get will return $newText, and $mode=replace will return $text.
|
|
|
|
|
|
*
|
2011-09-05 06:56:08 +00:00
|
|
|
|
* Section 0 is always considered to exist, even if it only contains the empty
|
2011-09-14 15:07:20 +00:00
|
|
|
|
* string. If $text is the empty string and section 0 is replaced, $newText is
|
2011-09-05 06:56:08 +00:00
|
|
|
|
* returned.
|
|
|
|
|
|
*
|
2014-04-08 15:29:17 +00:00
|
|
|
|
* @param string $mode One of "get" or "replace"
|
2022-03-08 22:57:00 +00:00
|
|
|
|
* @param string|false $newText Replacement text for section data.
|
2014-04-08 15:29:17 +00:00
|
|
|
|
* @return string For "get", the extracted section text.
|
|
|
|
|
|
* For "replace", the whole page with the section replaced.
|
2006-06-06 00:51:34 +00:00
|
|
|
|
*/
|
2014-06-12 14:05:18 +00:00
|
|
|
|
private function extractSections( $text, $sectionId, $mode, $newText = '' ) {
|
2011-01-23 16:07:13 +00:00
|
|
|
|
global $wgTitle; # not generally used but removes an ugly failure mode
|
2013-10-27 20:18:06 +00:00
|
|
|
|
|
|
|
|
|
|
$magicScopeVariable = $this->lock();
|
2020-09-18 15:07:18 +00:00
|
|
|
|
$this->startParse(
|
|
|
|
|
|
$wgTitle,
|
|
|
|
|
|
ParserOptions::newFromUser( RequestContext::getMain()->getUser() ),
|
|
|
|
|
|
self::OT_PLAIN,
|
|
|
|
|
|
true
|
|
|
|
|
|
);
|
2007-11-20 10:55:08 +00:00
|
|
|
|
$outText = '';
|
2008-01-21 16:36:08 +00:00
|
|
|
|
$frame = $this->getPreprocessor()->newFrame();
|
2007-11-20 10:55:08 +00:00
|
|
|
|
|
2010-03-30 21:53:56 +00:00
|
|
|
|
# Process section extraction flags
|
2008-01-05 12:39:12 +00:00
|
|
|
|
$flags = 0;
|
2014-06-12 14:05:18 +00:00
|
|
|
|
$sectionParts = explode( '-', $sectionId );
|
2022-11-18 14:19:54 +00:00
|
|
|
|
// The section ID may either be a magic string such as 'new' (which should be treated as 0),
|
|
|
|
|
|
// or a numbered section ID in the format of "T-<section index>".
|
|
|
|
|
|
// Explicitly coerce the section index into a number accordingly. (T323373)
|
|
|
|
|
|
$sectionIndex = (int)array_pop( $sectionParts );
|
2008-01-05 12:39:12 +00:00
|
|
|
|
foreach ( $sectionParts as $part ) {
|
2008-08-26 14:37:15 +00:00
|
|
|
|
if ( $part === 'T' ) {
|
2020-04-07 23:52:41 +00:00
|
|
|
|
$flags |= Preprocessor::DOM_FOR_INCLUSION;
|
2008-01-05 12:39:12 +00:00
|
|
|
|
}
|
|
|
|
|
|
}
|
2011-09-05 06:56:08 +00:00
|
|
|
|
|
|
|
|
|
|
# Check for empty input
|
|
|
|
|
|
if ( strval( $text ) === '' ) {
|
|
|
|
|
|
# Only sections 0 and T-0 exist in an empty document
|
2022-11-18 14:19:54 +00:00
|
|
|
|
if ( $sectionIndex === 0 ) {
|
2011-09-05 06:56:08 +00:00
|
|
|
|
if ( $mode === 'get' ) {
|
|
|
|
|
|
return '';
|
|
|
|
|
|
}
|
2019-03-29 20:12:24 +00:00
|
|
|
|
|
|
|
|
|
|
return $newText;
|
2011-09-05 06:56:08 +00:00
|
|
|
|
} else {
|
|
|
|
|
|
if ( $mode === 'get' ) {
|
|
|
|
|
|
return $newText;
|
|
|
|
|
|
}
|
2019-03-29 20:12:24 +00:00
|
|
|
|
|
|
|
|
|
|
return $text;
|
2011-09-05 06:56:08 +00:00
|
|
|
|
}
|
|
|
|
|
|
}
|
|
|
|
|
|
|
2010-03-30 21:53:56 +00:00
|
|
|
|
# Preprocess the text
|
2008-01-21 16:36:08 +00:00
|
|
|
|
$root = $this->preprocessToDom( $text, $flags );
|
2007-11-20 10:55:08 +00:00
|
|
|
|
|
2010-03-30 21:53:56 +00:00
|
|
|
|
# <h> nodes indicate section breaks
|
|
|
|
|
|
# They can only occur at the top level, so we can find them by iterating the root's children
|
2008-01-21 16:36:08 +00:00
|
|
|
|
$node = $root->getFirstChild();
|
2007-11-20 10:55:08 +00:00
|
|
|
|
|
2010-03-30 21:53:56 +00:00
|
|
|
|
# Find the target section
|
2022-11-18 14:19:54 +00:00
|
|
|
|
if ( $sectionIndex === 0 ) {
|
2011-03-21 15:18:11 +00:00
|
|
|
|
# Section zero doesn't nest, level=big
|
|
|
|
|
|
$targetLevel = 1000;
|
2007-11-20 10:55:08 +00:00
|
|
|
|
} else {
|
2010-06-10 21:05:58 +00:00
|
|
|
|
while ( $node ) {
|
|
|
|
|
|
if ( $node->getName() === 'h' ) {
|
|
|
|
|
|
$bits = $node->splitHeading();
|
2008-02-01 01:35:55 +00:00
|
|
|
|
if ( $bits['i'] == $sectionIndex ) {
|
2010-06-10 21:05:58 +00:00
|
|
|
|
$targetLevel = $bits['level'];
|
2007-11-20 10:55:08 +00:00
|
|
|
|
break;
|
|
|
|
|
|
}
|
|
|
|
|
|
}
|
2008-08-26 14:37:15 +00:00
|
|
|
|
if ( $mode === 'replace' ) {
|
2007-12-01 07:13:31 +00:00
|
|
|
|
$outText .= $frame->expand( $node, PPFrame::RECOVER_ORIG );
|
2007-11-20 10:55:08 +00:00
|
|
|
|
}
|
2008-01-21 16:36:08 +00:00
|
|
|
|
$node = $node->getNextSibling();
|
2006-06-06 00:51:34 +00:00
|
|
|
|
}
|
|
|
|
|
|
}
|
2007-11-20 10:55:08 +00:00
|
|
|
|
|
|
|
|
|
|
if ( !$node ) {
|
2010-03-30 21:53:56 +00:00
|
|
|
|
# Not found
|
2008-08-26 14:37:15 +00:00
|
|
|
|
if ( $mode === 'get' ) {
|
2007-11-20 10:55:08 +00:00
|
|
|
|
return $newText;
|
2006-06-06 00:51:34 +00:00
|
|
|
|
} else {
|
2007-11-20 10:55:08 +00:00
|
|
|
|
return $text;
|
2006-06-06 00:51:34 +00:00
|
|
|
|
}
|
2007-11-20 10:55:08 +00:00
|
|
|
|
}
|
|
|
|
|
|
|
2010-03-30 21:53:56 +00:00
|
|
|
|
# Find the end of the section, including nested sections
|
2007-11-20 10:55:08 +00:00
|
|
|
|
do {
|
2008-08-26 14:37:15 +00:00
|
|
|
|
if ( $node->getName() === 'h' ) {
|
2008-01-21 16:36:08 +00:00
|
|
|
|
$bits = $node->splitHeading();
|
|
|
|
|
|
$curLevel = $bits['level'];
|
2022-03-29 18:11:06 +00:00
|
|
|
|
// @phan-suppress-next-line PhanPossiblyUndeclaredVariable False positive
|
2010-01-27 02:41:22 +00:00
|
|
|
|
if ( $bits['i'] != $sectionIndex && $curLevel <= $targetLevel ) {
|
2007-11-20 10:55:08 +00:00
|
|
|
|
break;
|
2006-06-06 00:51:34 +00:00
|
|
|
|
}
|
|
|
|
|
|
}
|
2008-08-26 14:37:15 +00:00
|
|
|
|
if ( $mode === 'get' ) {
|
2007-12-01 07:13:31 +00:00
|
|
|
|
$outText .= $frame->expand( $node, PPFrame::RECOVER_ORIG );
|
2007-11-20 10:55:08 +00:00
|
|
|
|
}
|
2008-01-21 16:36:08 +00:00
|
|
|
|
$node = $node->getNextSibling();
|
2007-11-20 10:55:08 +00:00
|
|
|
|
} while ( $node );
|
2008-04-14 07:45:50 +00:00
|
|
|
|
|
2010-03-30 21:53:56 +00:00
|
|
|
|
# Write out the remainder (in replace mode only)
|
2008-08-26 14:37:15 +00:00
|
|
|
|
if ( $mode === 'replace' ) {
|
2010-03-30 21:53:56 +00:00
|
|
|
|
# Output the replacement text
|
|
|
|
|
|
# Add two newlines on -- trailing whitespace in $newText is conventionally
|
|
|
|
|
|
# stripped by the editor, so we need both newlines to restore the paragraph gap
|
|
|
|
|
|
# Only add trailing whitespace if there is newText
|
2010-03-30 21:20:05 +00:00
|
|
|
|
if ( $newText != "" ) {
|
2009-02-01 18:58:18 +00:00
|
|
|
|
$outText .= $newText . "\n\n";
|
|
|
|
|
|
}
|
|
|
|
|
|
|
2007-11-20 10:55:08 +00:00
|
|
|
|
while ( $node ) {
|
2007-12-01 07:13:31 +00:00
|
|
|
|
$outText .= $frame->expand( $node, PPFrame::RECOVER_ORIG );
|
2008-01-21 16:36:08 +00:00
|
|
|
|
$node = $node->getNextSibling();
|
2007-11-20 10:55:08 +00:00
|
|
|
|
}
|
2006-06-06 00:51:34 +00:00
|
|
|
|
}
|
2007-03-14 18:20:21 +00:00
|
|
|
|
|
2020-06-24 17:29:59 +00:00
|
|
|
|
# Re-insert stripped tags
|
|
|
|
|
|
$outText = rtrim( $this->mStripState->unstripBoth( $outText ) );
|
2007-11-20 10:55:08 +00:00
|
|
|
|
|
|
|
|
|
|
return $outText;
|
2006-06-06 00:51:34 +00:00
|
|
|
|
}
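The section identifier handling at the top of extractSections() can be sketched on its own. `parseSectionIdSketch()` and the `forInclusion` key are hypothetical names; the real method sets the `Preprocessor::DOM_FOR_INCLUSION` bit in `$flags` instead of returning a boolean.

```php
<?php
/**
 * Hypothetical helper mirroring how extractSections() reads a section ID of
 * the form "<flag1>-<flag2>-...-<section number>".
 */
function parseSectionIdSketch( $sectionId ): array {
	// Integers are accepted too, so normalise to a string first.
	$parts = explode( '-', (string)$sectionId );

	// The last part is the section index. Magic strings such as 'new'
	// are not numeric and therefore coerce to 0.
	$index = (int)array_pop( $parts );

	// "T" is the only recognised flag: the index was derived during a
	// template inclusion parse.
	$forInclusion = in_array( 'T', $parts, true );

	return [ 'index' => $index, 'forInclusion' => $forInclusion ];
}
```

Unknown flags are silently ignored, which matches the loop in the method above.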
|
2006-07-11 17:40:11 +00:00
|
|
|
|
|
2006-06-06 00:51:34 +00:00
|
|
|
|
/**
|
|
|
|
|
|
* This function returns the text of a section, specified by a number ($sectionId).
|
|
|
|
|
|
* A section is text under a heading like == Heading == or \<h1\>Heading\</h1\>, or
|
|
|
|
|
|
* the first section before any such heading (section 0).
|
|
|
|
|
|
*
|
|
|
|
|
|
* If a section contains subsections, these are also returned.
|
|
|
|
|
|
*
|
2014-04-21 23:38:39 +00:00
|
|
|
|
* @param string $text Text to look in
|
2016-12-14 16:01:47 +00:00
|
|
|
|
* @param string|int $sectionId Section identifier as a number or string
|
2014-06-12 14:05:18 +00:00
|
|
|
|
* (e.g. 0, 1 or 'T-1').
|
2022-03-08 22:57:00 +00:00
|
|
|
|
* @param string|false $defaultText Default to return if section is not found
|
2014-06-12 14:05:18 +00:00
|
|
|
|
*
|
2014-04-21 23:38:39 +00:00
|
|
|
|
* @return string Text of the requested section
|
2021-02-19 22:49:35 +00:00
|
|
|
|
* @since 1.7
|
2006-06-06 00:51:34 +00:00
|
|
|
|
*/
|
2014-06-12 14:05:18 +00:00
|
|
|
|
public function getSection( $text, $sectionId, $defaultText = '' ) {
|
|
|
|
|
|
return $this->extractSections( $text, $sectionId, 'get', $defaultText );
|
2006-06-06 00:51:34 +00:00
|
|
|
|
}
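The rule that a section is returned together with its subsections can be shown with plain heading levels. This is a sketch, not the parser's code: `sectionSpanSketch()` is a hypothetical helper that works on a list of heading levels (first heading is section 1) instead of preprocessor DOM nodes.

```php
<?php
/**
 * Hypothetical helper: given the heading levels of a page in document order,
 * return [ first, last ] heading numbers covered by section $index.
 * A section ends at the next heading whose level is less than or equal to
 * its own level.
 */
function sectionSpanSketch( array $levels, int $index ): ?array {
	if ( $index < 1 || $index > count( $levels ) ) {
		// Section not found.
		return null;
	}
	$targetLevel = $levels[$index - 1];
	$end = $index;
	for ( $i = $index + 1; $i <= count( $levels ); $i++ ) {
		if ( $levels[$i - 1] <= $targetLevel ) {
			// Same or higher-ranked heading: the section stops here.
			break;
		}
		$end = $i;
	}
	return [ $index, $end ];
}
```

For a page with headings at levels 2, 3, 3, 2, section 1 spans headings 1 to 3, while section 2 covers only itself.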
|
2006-07-11 17:40:11 +00:00
|
|
|
|
|
2010-12-26 19:30:10 +00:00
|
|
|
|
/**
|
2011-02-19 19:18:02 +00:00
|
|
|
|
* This function returns $oldText after the content of the section
|
2011-09-14 15:07:20 +00:00
|
|
|
|
* specified by $sectionId has been replaced with $newText. If the target
|
2011-09-05 06:56:08 +00:00
|
|
|
|
* section does not exist, $oldText is returned unchanged.
|
2011-02-19 19:18:02 +00:00
|
|
|
|
*
|
2014-06-12 14:05:18 +00:00
|
|
|
|
* @param string $oldText Former text of the article
|
2016-12-14 16:01:47 +00:00
|
|
|
|
* @param string|int $sectionId Section identifier as a number or string
|
2014-06-12 14:05:18 +00:00
|
|
|
|
* (e.g. 0, 1 or 'T-1').
|
2022-03-08 22:57:00 +00:00
|
|
|
|
* @param string|false $newText Replacing text
|
2014-06-12 14:05:18 +00:00
|
|
|
|
*
|
2014-04-08 15:29:17 +00:00
|
|
|
|
* @return string Modified text
|
2021-02-19 22:49:35 +00:00
|
|
|
|
* @since 1.7
|
2010-12-26 19:30:10 +00:00
|
|
|
|
*/
|
2014-06-12 14:05:18 +00:00
|
|
|
|
public function replaceSection( $oldText, $sectionId, $newText ) {
|
|
|
|
|
|
return $this->extractSections( $oldText, $sectionId, 'replace', $newText );
|
2006-06-06 00:51:34 +00:00
|
|
|
|
}
|
|
|
|
|
|
|
2019-11-04 04:23:23 +00:00
|
|
|
|
/**
|
|
|
|
|
|
* Get an array of preprocessor section information.
|
|
|
|
|
|
*
|
|
|
|
|
|
* Preprocessor sections are those identified by wikitext-style syntax, not
|
|
|
|
|
|
* HTML-style syntax. Templates are not expanded, so these sections do not
|
|
|
|
|
|
* include sections created by templates or parser functions. This is the
|
|
|
|
|
|
* same definition of a section as used by section editing, but not the
|
|
|
|
|
|
* same as TOC generation.
|
|
|
|
|
|
*
|
|
|
|
|
|
* These sections are typically smaller than those acted on by getSection() and
|
|
|
|
|
|
* replaceSection() since they are not nested. Section nesting could be
|
|
|
|
|
|
* reconstructed from the heading levels.
|
|
|
|
|
|
*
|
|
|
|
|
|
* The return value is an array of associative array info structures. Each
|
|
|
|
|
|
* associative array contains the following keys, describing a section:
|
|
|
|
|
|
*
|
|
|
|
|
|
* - index: An integer identifying the section.
|
|
|
|
|
|
* - level: The heading level, e.g. 1 for <h1>. For the section before the
|
|
|
|
|
|
* first heading, this will be 0.
|
|
|
|
|
|
* - offset: The byte offset within the wikitext at which the section starts
|
|
|
|
|
|
* - heading: The wikitext for the header which introduces the section,
|
|
|
|
|
|
* including equals signs. For the section before the first heading, this
|
|
|
|
|
|
* will be an empty string.
|
|
|
|
|
|
* - text: The complete text of the section.
|
|
|
|
|
|
*
|
|
|
|
|
|
* @param string $text
|
|
|
|
|
|
* @return array[]
|
2020-01-25 15:45:59 +00:00
|
|
|
|
* @internal
|
2019-11-04 04:23:23 +00:00
|
|
|
|
*/
|
|
|
|
|
|
public function getFlatSectionInfo( $text ) {
|
|
|
|
|
|
$magicScopeVariable = $this->lock();
|
2020-09-18 15:07:18 +00:00
|
|
|
|
$this->startParse(
|
|
|
|
|
|
null,
|
|
|
|
|
|
ParserOptions::newFromUser( RequestContext::getMain()->getUser() ),
|
|
|
|
|
|
self::OT_PLAIN,
|
|
|
|
|
|
true
|
|
|
|
|
|
);
|
2019-11-04 04:23:23 +00:00
|
|
|
|
$frame = $this->getPreprocessor()->newFrame();
|
|
|
|
|
|
$root = $this->preprocessToDom( $text, 0 );
|
|
|
|
|
|
$node = $root->getFirstChild();
|
|
|
|
|
|
$offset = 0;
|
|
|
|
|
|
$currentSection = [
|
|
|
|
|
|
'index' => 0,
|
|
|
|
|
|
'level' => 0,
|
|
|
|
|
|
'offset' => 0,
|
|
|
|
|
|
'heading' => '',
|
|
|
|
|
|
'text' => ''
|
|
|
|
|
|
];
|
|
|
|
|
|
$sections = [];
|
|
|
|
|
|
|
|
|
|
|
|
while ( $node ) {
|
|
|
|
|
|
$nodeText = $frame->expand( $node, PPFrame::RECOVER_ORIG );
|
|
|
|
|
|
if ( $node->getName() === 'h' ) {
|
|
|
|
|
|
$bits = $node->splitHeading();
|
|
|
|
|
|
$sections[] = $currentSection;
|
|
|
|
|
|
$currentSection = [
|
|
|
|
|
|
'index' => $bits['i'],
|
|
|
|
|
|
'level' => $bits['level'],
|
|
|
|
|
|
'offset' => $offset,
|
|
|
|
|
|
'heading' => $nodeText,
|
|
|
|
|
|
'text' => $nodeText
|
|
|
|
|
|
];
|
|
|
|
|
|
} else {
|
|
|
|
|
|
$currentSection['text'] .= $nodeText;
|
|
|
|
|
|
}
|
|
|
|
|
|
$offset += strlen( $nodeText );
|
|
|
|
|
|
$node = $node->getNextSibling();
|
|
|
|
|
|
}
|
|
|
|
|
|
$sections[] = $currentSection;
|
|
|
|
|
|
return $sections;
|
|
|
|
|
|
}
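The shape of the array returned by getFlatSectionInfo() can be reproduced with a much simpler splitter. Note the swapped-in technique: this sketch uses a line-based heading regex, not the preprocessor DOM, so it ignores comments, `<nowiki>` and other constructs the real method handles. `flatSectionInfoSketch()` is a hypothetical name, and the section index is a simple counter rather than the preprocessor's `i` value.

```php
<?php
/**
 * Hypothetical line-based approximation of Parser::getFlatSectionInfo().
 */
function flatSectionInfoSketch( string $text ): array {
	$current = [ 'index' => 0, 'level' => 0, 'offset' => 0, 'heading' => '', 'text' => '' ];
	$sections = [];
	$offset = 0;
	$index = 0;

	// Split after each newline so every chunk keeps its terminator and the
	// byte offsets add up to strlen( $text ).
	$lines = preg_split( '/(?<=\n)/', $text, -1, PREG_SPLIT_NO_EMPTY );

	foreach ( $lines as $line ) {
		// Assumption: a heading is a whole line of the form "== Title ==".
		if ( preg_match( '/^(={1,6}).+?\1\s*$/', $line, $m ) ) {
			// Close the running section and start a new one.
			$sections[] = $current;
			$current = [
				'index' => ++$index,
				'level' => strlen( $m[1] ),
				'offset' => $offset,
				'heading' => $line,
				'text' => $line,
			];
		} else {
			$current['text'] .= $line;
		}
		$offset += strlen( $line );
	}
	$sections[] = $current;
	return $sections;
}
```

As in the real method, the sections are flat: concatenating every `text` value in order gives back the input, and nesting has to be reconstructed from `level`.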
|
|
|
|
|
|
|
2010-06-10 21:05:58 +00:00
|
|
|
|
/**
|
|
|
|
|
|
* Get the ID of the revision we are parsing
|
|
|
|
|
|
*
|
2019-04-13 23:43:06 +00:00
|
|
|
|
* The return value will be either:
|
|
|
|
|
|
* - a) Positive, indicating a specific revision ID (current or old)
|
2020-04-09 03:36:39 +00:00
|
|
|
|
* - b) Zero, meaning the revision ID is specified by getCurrentRevisionRecordCallback()
|
2019-04-13 23:43:06 +00:00
|
|
|
|
* - c) Null, meaning the parse is for preview mode and there is no revision
|
|
|
|
|
|
*
|
2014-04-08 15:29:17 +00:00
|
|
|
|
* @return int|null
|
2021-02-19 22:49:35 +00:00
|
|
|
|
* @since 1.13
|
2010-06-10 21:05:58 +00:00
|
|
|
|
*/
|
2014-08-11 20:24:54 +00:00
|
|
|
|
public function getRevisionId() {
|
2010-06-10 21:05:58 +00:00
|
|
|
|
return $this->mRevisionId;
|
|
|
|
|
|
}
|
|
|
|
|
|
|
2020-04-09 03:36:39 +00:00
|
|
|
|
/**
|
|
|
|
|
|
* Get the revision record object for $this->mRevisionId
|
|
|
|
|
|
*
|
|
|
|
|
|
* @return RevisionRecord|null Either a RevisionRecord object or null
|
|
|
|
|
|
* @since 1.35
|
|
|
|
|
|
*/
|
|
|
|
|
|
public function getRevisionRecordObject() {
|
|
|
|
|
|
if ( $this->mRevisionRecordObject ) {
|
|
|
|
|
|
return $this->mRevisionRecordObject;
|
|
|
|
|
|
}
|
|
|
|
|
|
|
2021-05-02 23:55:07 +00:00
|
|
|
|
// NOTE: try to get the RevisionRecord object even if mRevisionId is null.
|
2019-06-27 04:30:35 +00:00
|
|
|
|
// This is useful when parsing a revision that has not yet been saved.
|
2018-09-05 18:03:15 +00:00
|
|
|
|
// However, if we get back a saved revision even though we are in
|
|
|
|
|
|
// preview mode, we'll have to ignore it, see below.
|
|
|
|
|
|
// NOTE: This callback may be used to inject an OLD revision that was
|
|
|
|
|
|
// already loaded, so "current" is a bit of a misnomer. We can't just
|
|
|
|
|
|
// skip it if mRevisionId is set.
|
2015-03-31 04:00:13 +00:00
|
|
|
|
$rev = call_user_func(
|
2020-04-09 03:36:39 +00:00
|
|
|
|
$this->mOptions->getCurrentRevisionRecordCallback(),
|
2019-06-27 04:30:35 +00:00
|
|
|
|
$this->getTitle(),
|
|
|
|
|
|
$this
|
2015-03-31 04:00:13 +00:00
|
|
|
|
);
|
|
|
|
|
|
|
2020-05-06 16:38:07 +00:00
|
|
|
|
if ( $rev === false ) {
|
|
|
|
|
|
// The revision record callback returns `false` (not null) to
|
|
|
|
|
|
// indicate that the revision is missing. (See for example
|
|
|
|
|
|
// Parser::statelessFetchRevisionRecord(), the default callback.)
|
|
|
|
|
|
// This API expects `null` instead. (T251952)
|
|
|
|
|
|
$rev = null;
|
|
|
|
|
|
}
|
|
|
|
|
|
|
2018-09-05 18:03:15 +00:00
|
|
|
|
if ( $this->mRevisionId === null && $rev && $rev->getId() ) {
|
|
|
|
|
|
// We are in preview mode (mRevisionId is null), and the current revision callback
|
|
|
|
|
|
// returned an existing revision. Ignore it and return null; it is probably the page's
|
|
|
|
|
|
// current revision, which is not what we want here. Note that we do want to call the
|
|
|
|
|
|
// callback to allow the unsaved revision to be injected here, e.g. for
|
|
|
|
|
|
// self-transclusion previews.
|
|
|
|
|
|
return null;
|
|
|
|
|
|
}
|
|
|
|
|
|
|
|
|
|
|
|
// If the parse is for a new revision, then the callback should have
|
|
|
|
|
|
// already been set to force the object and should match mRevisionId.
|
2021-11-19 23:19:42 +00:00
|
|
|
|
// If not, try to fetch by mRevisionId instead.
|
2018-08-07 16:52:40 +00:00
|
|
|
|
if ( $this->mRevisionId && $rev && $rev->getId() != $this->mRevisionId ) {
|
2020-04-09 03:36:39 +00:00
|
|
|
|
$rev = MediaWikiServices::getInstance()
|
|
|
|
|
|
->getRevisionLookup()
|
|
|
|
|
|
->getRevisionById( $this->mRevisionId );
|
2015-03-31 04:00:13 +00:00
|
|
|
|
}
|
|
|
|
|
|
|
2020-04-09 03:36:39 +00:00
|
|
|
|
$this->mRevisionRecordObject = $rev;
|
2015-03-31 04:00:13 +00:00
|
|
|
|
|
2020-04-09 03:36:39 +00:00
|
|
|
|
return $this->mRevisionRecordObject;
|
2010-12-10 18:17:20 +00:00
|
|
|
|
}
|
|
|
|
|
|
|
2006-11-21 09:53:45 +00:00
|
|
|
|
/**
|
2007-01-17 19:48:48 +00:00
|
|
|
|
* Get the timestamp associated with the current revision, adjusted for
|
2006-12-02 23:56:25 +00:00
|
|
|
|
* the default server-local timestamp
|
2019-04-19 01:36:40 +00:00
|
|
|
|
* @return string TS_MW timestamp
|
2021-02-19 22:49:35 +00:00
|
|
|
|
* @since 1.9
|
2006-11-21 09:53:45 +00:00
|
|
|
|
*/
|
2014-08-11 20:24:54 +00:00
|
|
|
|
public function getRevisionTimestamp() {
|
2019-04-19 01:36:40 +00:00
|
|
|
|
if ( $this->mRevisionTimestamp !== null ) {
|
|
|
|
|
|
return $this->mRevisionTimestamp;
|
2006-11-21 09:53:45 +00:00
|
|
|
|
}
|
2019-04-19 01:36:40 +00:00
|
|
|
|
|
|
|
|
|
|
# Use specified revision timestamp, falling back to the current timestamp
|
2020-04-17 20:29:22 +00:00
|
|
|
|
$revObject = $this->getRevisionRecordObject();
|
2022-07-26 18:02:27 +00:00
|
|
|
|
$timestamp = $revObject && $revObject->getTimestamp()
|
|
|
|
|
|
? $revObject->getTimestamp()
|
|
|
|
|
|
: $this->mOptions->getTimestamp();
|
2019-04-19 01:36:40 +00:00
|
|
|
|
$this->mOutput->setRevisionTimestampUsed( $timestamp ); // unadjusted time zone
|
|
|
|
|
|
|
|
|
|
|
|
# The cryptic '' timezone parameter tells userAdjust() to use the site-default
|
|
|
|
|
|
# timezone offset instead of the user settings.
|
|
|
|
|
|
# Since this value will be saved into the parser cache, served
|
|
|
|
|
|
# to other users, and potentially even used inside links and such,
|
|
|
|
|
|
# it needs to be consistent for all visitors.
|
|
|
|
|
|
$this->mRevisionTimestamp = $this->contLang->userAdjust( $timestamp, '' );
|
|
|
|
|
|
|
2006-11-21 09:53:45 +00:00
|
|
|
|
return $this->mRevisionTimestamp;
|
|
|
|
|
|
}
|
2007-01-17 19:48:48 +00:00
|
|
|
|
|
2009-03-07 23:01:59 +00:00
|
|
|
|
/**
|
|
|
|
|
|
* Get the name of the user that edited the last revision
|
2010-06-10 21:05:58 +00:00
|
|
|
|
*
|
2020-03-31 16:11:00 +00:00
|
|
|
|
* @return string|null User name
|
2021-02-19 22:49:35 +00:00
|
|
|
|
* @since 1.15
|
2009-03-07 23:01:59 +00:00
|
|
|
|
*/
|
2020-03-31 16:11:00 +00:00
|
|
|
|
public function getRevisionUser(): ?string {
|
2020-01-09 23:48:34 +00:00
|
|
|
|
if ( $this->mRevisionUser === null ) {
|
2020-04-17 20:29:22 +00:00
|
|
|
|
$revObject = $this->getRevisionRecordObject();
|
2010-12-10 18:17:20 +00:00
|
|
|
|
|
|
|
|
|
|
# If this template is subst:ed, the revision ID will be blank,
|
|
|
|
|
|
# so just use the current user's name
|
2020-04-17 20:29:22 +00:00
|
|
|
|
if ( $revObject && $revObject->getUser() ) {
|
|
|
|
|
|
$this->mRevisionUser = $revObject->getUser()->getName();
|
2013-04-20 15:38:24 +00:00
|
|
|
|
} elseif ( $this->ot['wiki'] || $this->mOptions->getIsPreview() ) {
|
2021-07-28 14:08:59 +00:00
|
|
|
|
$this->mRevisionUser = $this->getUserIdentity()->getName();
|
2020-03-31 16:11:00 +00:00
|
|
|
|
} else {
|
|
|
|
|
|
# Note that we fall through here with
|
|
|
|
|
|
# $this->mRevisionUser still null
|
2010-12-10 18:17:20 +00:00
|
|
|
|
}
|
2009-03-07 23:01:59 +00:00
|
|
|
|
}
|
2010-12-10 18:17:20 +00:00
|
|
|
|
return $this->mRevisionUser;
|
2009-03-07 23:01:59 +00:00
|
|
|
|
}
|
|
|
|
|
|
|
2013-09-04 19:09:36 +00:00
|
|
|
|
/**
|
|
|
|
|
|
* Get the size of the revision
|
|
|
|
|
|
*
|
2014-04-21 23:38:39 +00:00
|
|
|
|
* @return int|null Revision size
|
2021-02-19 22:49:35 +00:00
|
|
|
|
* @since 1.22
|
2013-09-04 19:09:36 +00:00
|
|
|
|
*/
|
2014-08-11 20:24:54 +00:00
|
|
|
|
public function getRevisionSize() {
|
2020-01-09 23:48:34 +00:00
|
|
|
|
if ( $this->mRevisionSize === null ) {
|
2020-04-17 20:29:22 +00:00
|
|
|
|
$revObject = $this->getRevisionRecordObject();
|
2013-09-04 19:09:36 +00:00
|
|
|
|
|
|
|
|
|
|
# If this variable is subst:ed, the revision ID will be blank,
|
2022-01-09 17:44:44 +00:00
|
|
|
|
# so just use the parser input size, because the substitution itself
|
2013-09-04 19:09:36 +00:00
|
|
|
|
# will change the size.
|
|
|
|
|
|
if ( $revObject ) {
|
|
|
|
|
|
$this->mRevisionSize = $revObject->getSize();
|
2016-06-10 04:46:54 +00:00
|
|
|
|
} else {
|
2013-09-04 19:09:36 +00:00
|
|
|
|
$this->mRevisionSize = $this->mInputSize;
|
|
|
|
|
|
}
|
|
|
|
|
|
}
|
|
|
|
|
|
return $this->mRevisionSize;
|
|
|
|
|
|
}
|
|
|
|
|
|
|
2006-12-29 10:39:35 +00:00
|
|
|
|
/**
|
2022-02-16 18:54:01 +00:00
|
|
|
|
* Mutator for the 'defaultsort' page property.
|
2006-12-29 10:39:35 +00:00
|
|
|
|
*
|
2013-03-11 17:15:01 +00:00
|
|
|
|
* @param string $sort New value
|
2021-02-19 22:49:35 +00:00
|
|
|
|
* @since 1.0
|
2022-02-16 18:54:01 +00:00
|
|
|
|
* @deprecated since 1.38, use
|
|
|
|
|
|
* $parser->getOutput()->setPageProperty('defaultsort', $sort)
|
2006-12-29 10:39:35 +00:00
|
|
|
|
*/
|
|
|
|
|
|
public function setDefaultSort( $sort ) {
|
2022-02-16 19:00:09 +00:00
|
|
|
|
wfDeprecated( __METHOD__, '1.38' );
|
2021-10-07 16:13:46 +00:00
|
|
|
|
$this->mOutput->setPageProperty( 'defaultsort', $sort );
|
2006-12-29 10:39:35 +00:00
|
|
|
|
}
|
2007-01-17 19:48:48 +00:00
|
|
|
|
|
2006-12-29 10:39:35 +00:00
|
|
|
|
/**
|
2022-02-16 18:54:01 +00:00
|
|
|
|
* Accessor for the 'defaultsort' page property.
|
2011-02-05 02:16:13 +00:00
|
|
|
|
* Will use the empty string if none is set.
|
|
|
|
|
|
*
|
|
|
|
|
|
* This value is treated as a prefix, so the
|
|
|
|
|
|
* empty string is equivalent to sorting by
|
|
|
|
|
|
* page name.
|
2006-12-29 10:39:35 +00:00
|
|
|
|
*
|
|
|
|
|
|
* @return string
|
2021-02-19 22:49:35 +00:00
|
|
|
|
* @since 1.9
|
2022-02-16 18:54:01 +00:00
|
|
|
|
* @deprecated since 1.38, use
|
|
|
|
|
|
* $parser->getOutput()->getPageProperty('defaultsort') ?? ''
|
2006-12-29 10:39:35 +00:00
|
|
|
|
*/
|
|
|
|
|
|
public function getDefaultSort() {
|
2022-02-16 19:00:09 +00:00
|
|
|
|
wfDeprecated( __METHOD__, '1.38' );
|
2022-02-16 18:54:01 +00:00
|
|
|
|
return $this->mOutput->getPageProperty( 'defaultsort' ) ?? '';
|
2007-09-08 02:08:08 +00:00
|
|
|
|
}
|
|
|
|
|
|
|
2008-11-02 14:21:04 +00:00
|
|
|
|
/**
|
2022-02-16 18:54:01 +00:00
|
|
|
|
* Accessor for the 'defaultsort' page property.
|
2008-11-02 14:21:04 +00:00
|
|
|
|
* Unlike getDefaultSort(), returns false if none is set.
|
|
|
|
|
|
*
|
2022-07-31 00:02:18 +00:00
|
|
|
|
* @return string|false
|
2021-02-19 22:49:35 +00:00
|
|
|
|
* @since 1.14
|
2022-02-16 18:54:01 +00:00
|
|
|
|
* @deprecated since 1.38, use
|
|
|
|
|
|
* $parser->getOutput()->getPageProperty('defaultsort') ?? false
|
2008-11-02 14:21:04 +00:00
|
|
|
|
*/
|
|
|
|
|
|
public function getCustomDefaultSort() {
|
2022-02-16 19:00:09 +00:00
|
|
|
|
wfDeprecated( __METHOD__, '1.38' );
|
2022-02-16 18:54:01 +00:00
|
|
|
|
return $this->mOutput->getPageProperty( 'defaultsort' ) ?? false;
|
2008-11-02 14:21:04 +00:00
|
|
|
|
}
|
|
|
|
|
|
|
2017-11-22 23:06:21 +00:00
|
|
|
|
private static function getSectionNameFromStrippedText( $text ) {
|
|
|
|
|
|
$text = Sanitizer::normalizeSectionNameWhitespace( $text );
|
|
|
|
|
|
$text = Sanitizer::decodeCharReferences( $text );
|
|
|
|
|
|
$text = self::normalizeSectionName( $text );
|
|
|
|
|
|
return $text;
|
|
|
|
|
|
}
|
|
|
|
|
|
|
|
|
|
|
|
private static function makeAnchor( $sectionName ) {
|
|
|
|
|
|
return '#' . Sanitizer::escapeIdForLink( $sectionName );
|
|
|
|
|
|
}
|
|
|
|
|
|
|
2018-08-08 14:49:46 +00:00
|
|
|
|
private function makeLegacyAnchor( $sectionName ) {
|
2022-04-26 15:48:03 +00:00
|
|
|
|
$fragmentMode = $this->svcOptions->get( MainConfigNames::FragmentMode );
|
2018-08-08 14:49:46 +00:00
|
|
|
|
if ( isset( $fragmentMode[1] ) && $fragmentMode[1] === 'legacy' ) {
|
2017-11-22 23:06:21 +00:00
|
|
|
|
// ForAttribute() and ForLink() are the same for legacy encoding
|
2018-01-04 20:27:11 +00:00
|
|
|
|
$id = Sanitizer::escapeIdForAttribute( $sectionName, Sanitizer::ID_FALLBACK );
|
2017-11-22 23:06:21 +00:00
|
|
|
|
} else {
|
2018-01-04 20:27:11 +00:00
|
|
|
|
$id = Sanitizer::escapeIdForLink( $sectionName );
|
2017-11-22 23:06:21 +00:00
|
|
|
|
}
|
|
|
|
|
|
|
|
|
|
|
|
return "#$id";
|
|
|
|
|
|
}
|
|
|
|
|
|
|
2007-09-08 02:08:08 +00:00
|
|
|
|
/**
|
2008-04-14 07:45:50 +00:00
|
|
|
|
* Try to guess the section anchor name based on a wikitext fragment
|
|
|
|
|
|
* presumably extracted from a heading, for example "Header" from
|
2007-09-08 02:08:08 +00:00
|
|
|
|
* "== Header ==".
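*
* A minimal usage sketch; $parser and the heading text are illustrative
* assumptions, and the exact escaping of the result depends on the wiki's
* fragment mode configuration:
*
* @code
*   // Link markup is stripped first, so the anchor is built from "Bar baz".
*   $anchor = $parser->guessSectionNameFromWikiText( '[[Foo|Bar]] baz' );
*   // $anchor starts with '#' and can be appended to a page URL.
* @endcode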
|
2011-07-24 21:36:04 +00:00
|
|
|
|
*
|
2014-04-21 23:38:39 +00:00
|
|
|
|
* @param string $text
|
2017-11-22 23:06:21 +00:00
|
|
|
|
* @return string Anchor (starting with '#')
|
2021-02-19 22:49:35 +00:00
|
|
|
|
* @since 1.12
|
2007-09-08 02:08:08 +00:00
|
|
|
|
*/
|
|
|
|
|
|
public function guessSectionNameFromWikiText( $text ) {
|
2009-01-10 17:16:21 +00:00
|
|
|
|
# Strip out wikitext links (they break the anchor)
|
2007-09-08 02:08:08 +00:00
|
|
|
|
$text = $this->stripSectionName( $text );
|
2017-11-22 23:06:21 +00:00
|
|
|
|
$sectionName = self::getSectionNameFromStrippedText( $text );
|
|
|
|
|
|
return self::makeAnchor( $sectionName );
|
2007-09-08 02:08:08 +00:00
|
|
|
|
}
|
|
|
|
|
|
|
2010-08-05 20:16:43 +00:00
|
|
|
|
/**
|
|
|
|
|
|
* Same as guessSectionNameFromWikiText(), but produces legacy anchors
|
2017-06-30 00:13:12 +00:00
|
|
|
|
* instead, if possible. For use in redirects, since various versions
|
|
|
|
|
|
* of Microsoft browsers interpret Location: headers as something other
|
|
|
|
|
|
* than UTF-8, resulting in breakage.
|
2010-08-05 20:16:43 +00:00
|
|
|
|
*
|
2013-03-11 17:15:01 +00:00
|
|
|
|
* @param string $text The section name
|
2017-11-22 23:06:21 +00:00
|
|
|
|
* @return string Anchor (starting with '#')
|
2021-02-19 22:49:35 +00:00
|
|
|
|
* @since 1.17
|
2010-08-05 20:16:43 +00:00
|
|
|
|
*/
|
|
|
|
|
|
public function guessLegacySectionNameFromWikiText( $text ) {
|
|
|
|
|
|
# Strip out wikitext links (they break the anchor)
|
|
|
|
|
|
$text = $this->stripSectionName( $text );
|
2017-11-22 23:06:21 +00:00
|
|
|
|
$sectionName = self::getSectionNameFromStrippedText( $text );
|
2018-08-08 14:49:46 +00:00
|
|
|
|
return $this->makeLegacyAnchor( $sectionName );
|
2017-11-22 23:06:21 +00:00
|
|
|
|
}
|
2017-06-30 00:13:12 +00:00
|
|
|
|
|
2017-11-22 23:06:21 +00:00
|
|
|
|
/**
|
|
|
|
|
|
* Like guessSectionNameFromWikiText(), but takes already-stripped text as input.
|
|
|
|
|
|
* @param string $text Section name (plain text)
|
|
|
|
|
|
* @return string Anchor (starting with '#')
|
2021-02-19 22:49:35 +00:00
|
|
|
|
* @since 1.31
|
2017-11-22 23:06:21 +00:00
|
|
|
|
*/
|
|
|
|
|
|
public static function guessSectionNameFromStrippedText( $text ) {
|
|
|
|
|
|
$sectionName = self::getSectionNameFromStrippedText( $text );
|
|
|
|
|
|
return self::makeAnchor( $sectionName );
|
2010-08-05 20:16:43 +00:00
|
|
|
|
}
|
2017-11-03 02:35:11 +00:00
|
|
|
|
|
|
|
|
|
|
/**
|
|
|
|
|
|
* Apply the same normalization as code making links to this section would.
|
|
|
|
|
|
*
|
|
|
|
|
|
* @param string $text
|
|
|
|
|
|
* @return string
|
|
|
|
|
|
*/
|
2017-11-22 23:06:21 +00:00
|
|
|
|
private static function normalizeSectionName( $text ) {
|
2017-11-03 02:35:11 +00:00
|
|
|
|
# T90902: ensure the same normalization is applied for IDs as to links
|
2019-08-31 16:14:38 +00:00
|
|
|
|
/** @var MediaWikiTitleCodec $titleParser */
|
2017-11-03 02:35:11 +00:00
|
|
|
|
$titleParser = MediaWikiServices::getInstance()->getTitleParser();
|
2019-08-31 16:14:38 +00:00
|
|
|
|
'@phan-var MediaWikiTitleCodec $titleParser';
|
2017-11-03 02:35:11 +00:00
|
|
|
|
try {
|
|
|
|
|
|
|
|
|
|
|
|
$parts = $titleParser->splitTitleString( "#$text" );
|
|
|
|
|
|
} catch ( MalformedTitleException $ex ) {
|
|
|
|
|
|
return $text;
|
|
|
|
|
|
}
|
|
|
|
|
|
return $parts['fragment'];
|
|
|
|
|
|
}
|
2010-08-05 20:16:43 +00:00
|
|
|
|
|
2007-09-08 02:08:08 +00:00
|
|
|
|
/**
|
|
|
|
|
|
* Strips a text string of wikitext for use in a section anchor
|
2008-04-14 07:45:50 +00:00
|
|
|
|
*
|
2007-09-08 02:08:08 +00:00
|
|
|
|
* Accepts a text string and then removes all wikitext from the
|
|
|
|
|
|
* string and leaves only the resultant text (e.g. the result of
|
|
|
|
|
|
* [[User:WikiSysop|Sysop]] would be "Sysop" and the result of
|
|
|
|
|
|
* [[User:WikiSysop]] would be "User:WikiSysop") - this is intended
|
2022-01-09 17:44:44 +00:00
|
|
|
|
* to create valid section anchors by mimicking the output of the
|
2007-09-08 02:08:08 +00:00
|
|
|
|
* parser when headings are parsed.
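*
* A minimal usage sketch; $parser and the input string are illustrative
* assumptions rather than code taken from core:
*
* @code
*   // The piped link is reduced to its label ("Sysop"). The italic quote
*   // markup is turned into HTML tags by doQuotes(), and those tags are
*   // then stripped again, leaving plain text for the anchor.
*   $plain = $parser->stripSectionName( "[[User:WikiSysop|Sysop]] ''notes''" );
* @endcode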
|
2008-04-14 07:45:50 +00:00
|
|
|
|
*
|
2014-04-21 23:38:39 +00:00
|
|
|
|
* @param string $text Text string to be stripped of wikitext
|
2007-09-08 02:08:08 +00:00
|
|
|
|
* for use in a Section anchor
|
2012-02-09 19:29:36 +00:00
|
|
|
|
* @return string Filtered text string
|
2021-02-19 22:49:35 +00:00
|
|
|
|
* @since 1.12
|
2007-09-08 02:08:08 +00:00
|
|
|
|
*/
|
|
|
|
|
|
public function stripSectionName( $text ) {
|
|
|
|
|
|
# Strip internal link markup
|
2010-03-30 21:20:05 +00:00
|
|
|
|
$text = preg_replace( '/\[\[:?([^[|]+)\|([^[]+)\]\]/', '$2', $text );
|
|
|
|
|
|
$text = preg_replace( '/\[\[:?([^[]+)\|?\]\]/', '$1', $text );
|
2008-04-14 07:45:50 +00:00
|
|
|
|
|
2011-05-17 22:03:20 +00:00
|
|
|
|
# Strip external link markup
|
|
|
|
|
|
# @todo FIXME: Not tolerant of blank link text
|
2014-03-13 22:23:56 +00:00
|
|
|
|
# E.g. [https://www.mediawiki.org] will render as [1] or something depending
|
2007-09-08 02:08:08 +00:00
|
|
|
|
# on how many empty links there are on the page - need to figure that out.
|
2022-04-28 13:33:39 +00:00
|
|
|
|
$text = preg_replace(
|
|
|
|
|
|
'/\[(?i:' . $this->urlUtils->validProtocols() . ')([^ ]+?) ([^[]+)\]/', '$2', $text );
|
2008-04-14 07:45:50 +00:00
|
|
|
|
|
2007-09-08 02:08:08 +00:00
|
|
|
|
# Parse wikitext quotes (italics & bold)
|
2010-03-30 21:20:05 +00:00
|
|
|
|
$text = $this->doQuotes( $text );
|
2008-04-14 07:45:50 +00:00
|
|
|
|
|
2007-09-08 02:08:08 +00:00
|
|
|
|
# Strip HTML tags
|
|
|
|
|
|
$text = StringUtils::delimiterReplace( '<', '>', '', $text );
|
|
|
|
|
|
return $text;
|
2006-12-29 10:39:35 +00:00
|
|
|
|
}
|
2007-11-20 10:55:08 +00:00
|
|
|
|
|
Deprecate Parser implementation methods (will be private in next release)
The following public methods were renamed and made private; the old name
is hard-deprecated and calls the new renamed private method:
Parser::doMagicLinks() => handleMagicLinks()
Parser::doDoubleUnderscore() => handleDoubleUnderscore()
Parser::doHeadings() => handleHeadings()
Parser::doAllQuotes() => handleAllQuotes()
Parser::replaceExternalLinks() => handleExternalLinks()
Parser::replaceInternalLinks() => handleInternalLinks()
Parser::replaceInternalLinks2() => handleInternalLinks2()
Parser::getVariableValue() => expandMagicVariable()
Parser::initialiseVariables() => initializeVariables()
Parser::formatHeadings() => finalizeHeadings()
Parser::test{Pst,Preprocess,Srvus}() => fuzzTest{Pst,Preprocess,Srvus}()
Additionally, the following methods are not used externally, but are
used outside the Parser class by core code. They have been marked
@internal:
Parser::doQuotes() (used by {{#displaytitle}}),
Parser::getExternalLink{Rel,Attribs}() (used by Linker),
Parser::normalizeLinkUrl() (used by Special:LinkSearch and elsewhere).
Parser::{brace,arg,extension}Substitution() (used by PPFrame)
Code search query:
https://codesearch.wmflabs.org/deployed/?q=do%28MagicLinks%7CDoubleUnderscore%7CHeadings%7CAllQuotes%29%7Creplace%28ExternalLinks%7CInternalLinks%28%7C2%29%29%7CgetVariableValue%7CinitialiseVariables%7CformatHeadings%7Ctest%28Pst%7CPreprocess%7CSrvus%29%7CdoQuotes%7CgetExternalLink%28Rel%7CAttribs%29%7CnormalizeLinkUrl%7C%28brace%2Carg%2Cextension%29Substitution&i=nope&files=&repos=
Bug: T236810
Change-Id: I19a43ffc5dcfdd2981b51079c33422c964acb076
2019-10-28 19:52:50 +00:00
|
|
|
|
/**
|
|
|
|
|
|
* Strip/replaceVariables/unstrip for preprocessor regression testing
|
|
|
|
|
|
*
|
2021-08-17 19:15:15 +00:00
|
|
|
|
* Called in preprocessorFuzzTest.php maintenance script
|
|
|
|
|
|
* with the help of TestingAccessWrapper to hide it from the public interface
|
|
|
|
|
|
*
|
2019-10-28 19:52:50 +00:00
|
|
|
|
* @param string $text
|
2021-04-25 17:29:33 +00:00
|
|
|
|
* @param PageReference $page
|
2019-10-28 19:52:50 +00:00
|
|
|
|
* @param ParserOptions $options
|
|
|
|
|
|
* @param int $outputType
|
|
|
|
|
|
*
|
|
|
|
|
|
* @return string
|
|
|
|
|
|
*/
|
2021-04-25 17:29:33 +00:00
|
|
|
|
private function fuzzTestSrvus( $text, PageReference $page, ParserOptions $options,
|
2019-10-28 19:52:50 +00:00
|
|
|
|
$outputType = self::OT_HTML
|
2016-02-17 19:57:37 +00:00
|
|
|
|
) {
|
2013-10-27 20:18:06 +00:00
|
|
|
|
$magicScopeVariable = $this->lock();
|
2021-04-25 17:29:33 +00:00
|
|
|
|
$this->startParse( $page, $options, $outputType, true );
|
2011-01-23 16:07:13 +00:00
|
|
|
|
|
2007-11-20 10:55:08 +00:00
|
|
|
|
$text = $this->replaceVariables( $text );
|
|
|
|
|
|
$text = $this->mStripState->unstripBoth( $text );
|
2022-03-04 19:05:41 +00:00
|
|
|
|
$text = Sanitizer::internalRemoveHtmlTags( $text );
|
2007-11-20 10:55:08 +00:00
|
|
|
|
return $text;
|
|
|
|
|
|
}
|
2008-01-21 16:36:08 +00:00
|
|
|
|
|
2019-10-28 19:52:50 +00:00
|
|
|
|
/**
|
2021-08-17 19:15:15 +00:00
|
|
|
|
* Run preSaveTransform() on the text for preprocessor regression testing
|
|
|
|
|
|
*
|
|
|
|
|
|
* Called in preprocessorFuzzTest.php maintenance script
|
|
|
|
|
|
* with the help of TestingAccessWrapper to hide it from the public interface
|
|
|
|
|
|
*
|
2019-10-28 19:52:50 +00:00
|
|
|
|
* @param string $text
|
2021-04-25 17:29:33 +00:00
|
|
|
|
* @param PageReference $page
|
2019-10-28 19:52:50 +00:00
|
|
|
|
* @param ParserOptions $options
|
|
|
|
|
|
* @return string
|
|
|
|
|
|
*/
|
2021-04-25 17:29:33 +00:00
|
|
|
|
private function fuzzTestPst( $text, PageReference $page, ParserOptions $options ) {
|
2021-07-28 14:08:59 +00:00
|
|
|
|
return $this->preSaveTransform( $text, $page, $options->getUserIdentity(), $options );
|
2008-01-21 16:36:08 +00:00
|
|
|
|
}
|
|
|
|
|
|
|
2019-10-28 19:52:50 +00:00
|
|
|
|
/**
|
2021-08-17 19:15:15 +00:00
|
|
|
|
* Strip/replaceVariables/unstrip in OT_PREPROCESS mode for preprocessor regression testing
|
|
|
|
|
|
*
|
|
|
|
|
|
* Called in preprocessorFuzzTest.php maintenance script
|
|
|
|
|
|
* with the help of TestingAccessWrapper to hide it from the public interface
|
|
|
|
|
|
*
|
2019-10-28 19:52:50 +00:00
|
|
|
|
* @param string $text
|
2021-04-25 17:29:33 +00:00
|
|
|
|
* @param PageReference $page
|
2019-10-28 19:52:50 +00:00
|
|
|
|
* @param ParserOptions $options
|
|
|
|
|
|
* @return string
|
|
|
|
|
|
*/
|
2021-04-25 17:29:33 +00:00
|
|
|
|
private function fuzzTestPreprocess( $text, PageReference $page, ParserOptions $options ) {
|
|
|
|
|
|
return $this->fuzzTestSrvus( $text, $page, $options, self::OT_PREPROCESS );
|
2008-01-21 16:36:08 +00:00
|
|
|
|
}
|
2008-01-24 09:07:47 +00:00
|
|
|
|
|
2011-02-23 06:58:15 +00:00
|
|
|
|
/**
|
2011-02-24 20:23:49 +00:00
|
|
|
|
* Call a callback function on all regions of the given text that are not
|
|
|
|
|
|
* inside strip markers, and replace those regions with the return value
|
2011-02-23 06:58:15 +00:00
|
|
|
|
* of the callback. For example, with input:
|
|
|
|
|
|
*
|
|
|
|
|
|
* aaa<MARKER>bbb
|
|
|
|
|
|
*
|
2011-02-24 20:23:49 +00:00
|
|
|
|
* This will call the callback function twice, with 'aaa' and 'bbb'. Those
|
2011-02-23 06:58:15 +00:00
|
|
|
|
* two strings will be replaced with the value returned by the callback in
|
|
|
|
|
|
* each case.
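*
* A minimal usage sketch, assuming a caller that wants to transform only
* the text outside strip markers (the closure below is a hypothetical
* example, not taken from core):
*
* @code
*   $out = $parser->markerSkipCallback( $s, static function ( $chunk ) {
*       // $chunk never contains a strip marker. It may be an empty string
*       // when a marker sits at the very start of $s or when two markers
*       // are directly adjacent.
*       return strtoupper( $chunk );
*   } );
* @endcode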
|
2011-05-01 23:54:41 +00:00
|
|
|
|
*
|
2014-04-21 23:38:39 +00:00
|
|
|
|
* @param string $s
|
|
|
|
|
|
* @param callable $callback
|
2011-08-05 00:33:03 +00:00
|
|
|
|
*
|
2011-05-01 23:54:41 +00:00
|
|
|
|
* @return string
|
2020-01-25 15:45:59 +00:00
|
|
|
|
* @internal
|
2021-02-19 22:49:35 +00:00
|
|
|
|
* @since 1.12
|
2011-02-23 06:58:15 +00:00
|
|
|
|
*/
|
2019-08-27 09:23:52 +00:00
|
|
|
|
public function markerSkipCallback( $s, callable $callback ) {
|
2008-01-24 09:07:47 +00:00
|
|
|
|
$i = 0;
|
|
|
|
|
|
$out = '';
|
|
|
|
|
|
while ( $i < strlen( $s ) ) {
|
Use a fixed marker prefix string in the Parser and MWTidy
Generating one-time, unique strip markers hurts us in multiple ways:
* The strip marker regexes don't benefit from JIT compilation, so they are
slower to execute than they could be.
* Although the regexes don't benefit from JIT compilation, they are still
compiled, because HHVM bets on regexes getting reused. This extra work is
fairly costly (1-2% of CPU usage on the app servers) and doesn't pay off.
* The size of the PCRE JIT cache is finite, and the caching of one-off regexes
displaces from the cache regexes which are in fact reused.
Tim's preferred solution (per his review comment on
https://gerrit.wikimedia.org/r/167530/) is to use fixed strip markers.
So:
* Replace usage of $parser->mUniqPrefix with Parser::MARKER_PREFIX, which
complements the existing Parser::MARKER_SUFFIX.
* Deprecate Parser::mUniqPrefix and its accessor, Parser::uniqPrefix().
* Deprecate Parser::getRandomString(), since it is no longer useful.
* In Preprocessor_*:preprocessToObj() and Parser::fetchTemplateAndTitle,
replace any occurrences of \x7f with '?', to prevent strip marker forgery.
\x7f is not valid input anyway.
* Deprecate the $prefix parameter for StripState::__construct, since a custom
prefix may no longer be specified.
Change-Id: I31d4556bbb07acb72c33fda335fa5a230379a03f
2015-05-26 20:48:33 +00:00
|
|
|
|
$markerStart = strpos( $s, self::MARKER_PREFIX, $i );
|
2008-01-24 09:07:47 +00:00
|
|
|
|
if ( $markerStart === false ) {
|
|
|
|
|
|
$out .= call_user_func( $callback, substr( $s, $i ) );
|
|
|
|
|
|
break;
|
|
|
|
|
|
} else {
|
|
|
|
|
|
$out .= call_user_func( $callback, substr( $s, $i, $markerStart - $i ) );
|
2008-03-27 00:00:25 +00:00
|
|
|
|
$markerEnd = strpos( $s, self::MARKER_SUFFIX, $markerStart );
|
2008-01-24 09:07:47 +00:00
|
|
|
|
if ( $markerEnd === false ) {
|
|
|
|
|
|
$out .= substr( $s, $markerStart );
|
|
|
|
|
|
break;
|
|
|
|
|
|
} else {
|
2008-03-27 00:00:25 +00:00
|
|
|
|
$markerEnd += strlen( self::MARKER_SUFFIX );
|
2008-01-24 09:07:47 +00:00
|
|
|
|
$out .= substr( $s, $markerStart, $markerEnd - $markerStart );
|
|
|
|
|
|
$i = $markerEnd;
|
|
|
|
|
|
}
|
|
|
|
|
|
}
|
|
|
|
|
|
}
|
|
|
|
|
|
return $out;
|
|
|
|
|
|
}
|
2009-02-03 04:58:08 +00:00
|
|
|
|
|
2012-03-20 04:39:09 +00:00
|
|
|
|
/**
|
|
|
|
|
|
* Remove any strip markers found in the given text.
|
|
|
|
|
|
*
|
2017-12-28 15:06:10 +00:00
|
|
|
|
* @param string $text
|
2012-03-20 04:39:09 +00:00
|
|
|
|
* @return string
|
2021-02-19 22:49:35 +00:00
|
|
|
|
* @since 1.19
|
2012-03-20 04:39:09 +00:00
|
|
|
|
*/
|
2014-08-11 20:24:54 +00:00
|
|
|
|
public function killMarkers( $text ) {
|
2012-03-20 04:39:09 +00:00
|
|
|
|
return $this->mStripState->killMarkers( $text );
|
|
|
|
|
|
}
|
|
|
|
|
|
|
2012-07-25 15:31:47 +00:00
|
|
|
|
/**
|
|
|
|
|
|
* Parse a width param of an image link, such as 300px or 200x300px
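*
* Illustrative calls (a sketch traced from the regexes in the method body;
* the input values are made up and this is not an exhaustive specification):
*
*     Parser::parseWidthParam( '300px' );     // [ 'width' => 300 ]
*     Parser::parseWidthParam( '200x300px' ); // [ 'width' => 200, 'height' => 300 ]
*     Parser::parseWidthParam( '' );          // []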
|
|
|
|
|
|
*
|
2014-04-21 23:38:39 +00:00
|
|
|
|
* @param string $value
|
2016-06-19 06:41:43 +00:00
|
|
|
|
* @param bool $parseHeight
|
2012-07-25 15:31:47 +00:00
|
|
|
|
*
|
|
|
|
|
|
* @return array
|
|
|
|
|
|
* @since 1.20
|
2020-01-25 15:45:59 +00:00
|
|
|
|
* @internal
|
2012-07-25 15:31:47 +00:00
|
|
|
|
*/
|
2016-06-19 06:41:43 +00:00
|
|
|
|
public static function parseWidthParam( $value, $parseHeight = true ) {
|
2016-02-17 19:57:37 +00:00
|
|
|
|
$parsedWidthParam = [];
|
2013-04-20 15:38:24 +00:00
|
|
|
|
if ( $value === '' ) {
|
2012-07-25 15:31:47 +00:00
|
|
|
|
return $parsedWidthParam;
|
|
|
|
|
|
}
|
2016-02-17 19:57:37 +00:00
|
|
|
|
$m = [];
|
2016-12-11 22:45:07 +00:00
|
|
|
|
# (T15500) In both cases (width/height and width only),
|
2012-07-25 15:31:47 +00:00
|
|
|
|
# permit trailing "px" for backward compatibility.
|
2016-06-19 06:41:43 +00:00
|
|
|
|
if ( $parseHeight && preg_match( '/^([0-9]*)x([0-9]*)\s*(?:px)?\s*$/', $value, $m ) ) {
|
2012-07-25 15:31:47 +00:00
|
|
|
|
$width = intval( $m[1] );
|
|
|
|
|
|
$height = intval( $m[2] );
|
|
|
|
|
|
$parsedWidthParam['width'] = $width;
|
|
|
|
|
|
$parsedWidthParam['height'] = $height;
|
|
|
|
|
|
} elseif ( preg_match( '/^[0-9]*\s*(?:px)?\s*$/', $value ) ) {
|
|
|
|
|
|
$width = intval( $value );
|
|
|
|
|
|
$parsedWidthParam['width'] = $width;
|
|
|
|
|
|
}
|
|
|
|
|
|
return $parsedWidthParam;
|
|
|
|
|
|
}
|
2013-10-27 20:18:06 +00:00
|
|
|
|
|
|
|
|
|
|
/**
|
|
|
|
|
|
* Lock the current instance of the parser.
|
|
|
|
|
|
*
|
|
|
|
|
|
* This is meant to stop someone from calling the parser
|
|
|
|
|
|
* recursively and messing up all the strip state.
|
|
|
|
|
|
*
|
|
|
|
|
|
* @throws MWException If the parser is already in the middle of a parse
|
|
|
|
|
|
* @return ScopedCallback The lock will be released once the return value goes out of scope.
|
|
|
|
|
|
*/
|
|
|
|
|
|
protected function lock() {
|
|
|
|
|
|
if ( $this->mInParse ) {
|
2014-05-10 23:03:45 +00:00
|
|
|
|
throw new MWException( "Parser state cleared while parsing. "
|
2017-05-23 12:48:32 +00:00
|
|
|
|
. "Did you call Parser::parse recursively? Lock is held by: " . $this->mInParse );
|
2013-10-27 20:18:06 +00:00
|
|
|
|
}
|
2017-05-23 12:48:32 +00:00
|
|
|
|
|
|
|
|
|
|
// Save the backtrace when locking, so that if some code tries locking again,
|
|
|
|
|
|
// we can print the lock owner's backtrace for easier debugging
|
|
|
|
|
|
$e = new Exception;
|
|
|
|
|
|
$this->mInParse = $e->getTraceAsString();
|
2013-10-27 20:18:06 +00:00
|
|
|
|
|
2017-06-26 16:35:31 +00:00
|
|
|
|
$recursiveCheck = new ScopedCallback( function () {
|
2016-02-11 08:40:54 +00:00
|
|
|
|
$this->mInParse = false;
|
2013-10-27 20:18:06 +00:00
|
|
|
|
} );
|
|
|
|
|
|
|
|
|
|
|
|
return $recursiveCheck;
|
|
|
|
|
|
}
|
2014-02-22 15:21:36 +00:00
|
|
|
|
|
2022-06-20 03:48:44 +00:00
|
|
|
|
/**
|
|
|
|
|
|
* Will entry points such as parse() throw an exception due to the parser
|
|
|
|
|
|
* already being active?
|
|
|
|
|
|
*
|
|
|
|
|
|
* @since 1.39
|
|
|
|
|
|
* @return bool
|
|
|
|
|
|
*/
|
|
|
|
|
|
public function isLocked() {
|
|
|
|
|
|
return (bool)$this->mInParse;
|
|
|
|
|
|
}
|
|
|
|
|
|
|
2014-02-22 15:21:36 +00:00
|
|
|
|
/**
|
|
|
|
|
|
* Strip outer <p></p> tag from the HTML source of a single paragraph.
|
|
|
|
|
|
*
|
|
|
|
|
|
* Returns original HTML if the <p/> tag has any attributes, if there's no wrapping <p/> tag,
|
|
|
|
|
|
* or if there is more than one <p/> tag in the input HTML.
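*
* Illustrative calls (a sketch traced from the regex in the method body;
* the input values are made up):
*
*     Parser::stripOuterParagraph( '<p>Hello</p>' );        // 'Hello'
*     Parser::stripOuterParagraph( '<p>a</p><p>b</p>' );    // returned unchanged
*     Parser::stripOuterParagraph( '<p class="x">Hi</p>' ); // returned unchanged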
|
|
|
|
|
|
*
|
|
|
|
|
|
* @param string $html
|
|
|
|
|
|
* @return string
|
2014-05-16 17:50:09 +00:00
|
|
|
|
* @since 1.24
|
2014-02-22 15:21:36 +00:00
|
|
|
|
*/
|
|
|
|
|
|
public static function stripOuterParagraph( $html ) {
|
2016-02-17 19:57:37 +00:00
|
|
|
|
$m = [];
|
2019-03-29 20:12:24 +00:00
|
|
|
|
if ( preg_match( '/^<p>(.*)\n?<\/p>\n?$/sU', $html, $m ) && strpos( $m[1], '</p>' ) === false ) {
|
|
|
|
|
|
$html = $m[1];
|
2014-02-22 15:21:36 +00:00
|
|
|
|
}
|
|
|
|
|
|
|
|
|
|
|
|
return $html;
|
|
|
|
|
|
}
|
2014-06-20 20:38:10 +00:00
|
|
|
|
|
2022-08-09 02:52:53 +00:00
|
|
|
|
/**
|
|
|
|
|
|
* Add HTML tags marking the parts of a page title, to be displayed in the first heading of the page.
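*
* Illustrative call (a sketch; the argument values are made up, and plain
* strings are assumed to pass through HtmlArmor::getHtml() with HTML escaping):
*
*     Parser::formatPageTitle( 'Talk', ':', 'Main Page' );
*     // '<span class="mw-page-title-namespace">Talk</span>'
*     // . '<span class="mw-page-title-separator">:</span>'
*     // . '<span class="mw-page-title-main">Main Page</span>'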
|
|
|
|
|
|
*
|
|
|
|
|
|
* @internal
|
|
|
|
|
|
* @since 1.39
|
|
|
|
|
|
* @param string|HtmlArmor $nsText
|
|
|
|
|
|
* @param string|HtmlArmor $nsSeparator
|
|
|
|
|
|
* @param string|HtmlArmor $mainText
|
|
|
|
|
|
* @return string HTML
|
|
|
|
|
|
*/
|
|
|
|
|
|
public static function formatPageTitle( $nsText, $nsSeparator, $mainText ): string {
|
|
|
|
|
|
$html = '';
|
|
|
|
|
|
if ( $nsText !== '' ) {
|
|
|
|
|
|
$html .= '<span class="mw-page-title-namespace">' . HtmlArmor::getHtml( $nsText ) . '</span>';
|
|
|
|
|
|
$html .= '<span class="mw-page-title-separator">' . HtmlArmor::getHtml( $nsSeparator ) . '</span>';
|
|
|
|
|
|
}
|
|
|
|
|
|
$html .= '<span class="mw-page-title-main">' . HtmlArmor::getHtml( $mainText ) . '</span>';
|
|
|
|
|
|
return $html;
|
|
|
|
|
|
}
|
|
|
|
|
|
|
2014-06-20 20:38:10 +00:00
|
|
|
|
/**
|
|
|
|
|
|
* Return this parser if it is not doing anything, otherwise
|
|
|
|
|
|
* get a fresh parser. You can use this method by doing
|
2019-04-11 13:36:15 +00:00
|
|
|
|
* $newParser = $oldParser->getFreshParser(), or more simply
|
|
|
|
|
|
* $oldParser->getFreshParser()->parse( ... );
|
|
|
|
|
|
* if you're unsure if $oldParser is safe to use.
|
2014-06-20 20:38:10 +00:00
|
|
|
|
*
|
2022-06-20 03:48:44 +00:00
|
|
|
|
* @deprecated since 1.39, use ParserFactory::getInstance()
|
2014-06-20 20:38:10 +00:00
|
|
|
|
* @since 1.24
|
|
|
|
|
|
* @return Parser A parser object that is not parsing anything
|
|
|
|
|
|
*/
|
|
|
|
|
|
public function getFreshParser() {
|
|
|
|
|
|
if ( $this->mInParse ) {
|
2018-08-03 08:43:00 +00:00
|
|
|
|
return $this->factory->create();
|
2014-06-20 20:38:10 +00:00
|
|
|
|
} else {
|
|
|
|
|
|
return $this;
|
|
|
|
|
|
}
|
|
|
|
|
|
}
|
2015-07-25 15:32:08 +00:00
|
|
|
|
|
|
|
|
|
|
/**
|
|
|
|
|
|
* Sets up the PHP implementation of OOUI for use in this request
|
|
|
|
|
|
* and instructs OutputPage to enable OOUI for itself.
|
|
|
|
|
|
*
|
|
|
|
|
|
* @since 1.26
|
2021-01-13 14:12:50 +00:00
|
|
|
|
* @deprecated since 1.35, use $parser->getOutput()->setEnableOOUI() instead.
|
2015-07-25 15:32:08 +00:00
|
|
|
|
*/
|
|
|
|
|
|
public function enableOOUI() {
|
2020-03-26 19:59:25 +00:00
|
|
|
|
wfDeprecated( __METHOD__, '1.35' );
|
2015-07-25 15:32:08 +00:00
|
|
|
|
OutputPage::setupOOUI();
|
|
|
|
|
|
$this->mOutput->setEnableOOUI( true );
|
|
|
|
|
|
}
|
2019-06-27 03:35:50 +00:00
|
|
|
|
|
|
|
|
|
|
/**
|
2020-03-26 19:49:58 +00:00
|
|
|
|
* Sets the flag on the parser output but also does some debug logging.
|
2020-06-05 02:54:51 +00:00
|
|
|
|
* Note that there is a copy of this method in CoreMagicVariables as well.
|
2019-06-27 03:35:50 +00:00
|
|
|
|
* @param string $flag
|
|
|
|
|
|
* @param string $reason
|
|
|
|
|
|
*/
|
2020-03-26 19:49:58 +00:00
|
|
|
|
private function setOutputFlag( string $flag, string $reason ): void {
|
Add new ParserOutput::{get,set}OutputFlag() interface
This is a uniform mechanism to access a number of bespoke boolean
flags in ParserOutput. It allows extensibility in core (by adding new
field names to ParserOutputFlags) without exposing new getter/setter
methods to Parsoid. It replaces the ParserOutput::{get,set}Flag()
interface which (a) doesn't allow access to certain flags, and (b) is
typically called with a string rather than a constant, and (c) has a
very generic name. (Note that Parser::setOutputFlag() already called
these "output flags".)
In the future we might unify the representation so that we store
everything in $mFlags and don't have explicit properties in
ParserOutput, but those representation details should be invisible to
the clients of this API. (We might also use a proper enumeration
for ParserOutputFlags, when PHP supports this.)
There is some overlap with ParserOutput::{get,set}ExtensionData(), but
I've left those methods as-is because (a) they allow for non-boolean
data, unlike the *Flag() methods, and (b) it seems worthwhile to
distinguish properties set by extensions from properties used by core.
Code search:
https://codesearch.wmcloud.org/search/?q=%5BOo%5Dut%28put%29%3F%28%5C%28%5C%29%29%3F-%3E%28g%7Cs%29etFlag%5C%28&i=nope&files=&excludeFiles=&repos=
Bug: T292868
Change-Id: I39bc58d207836df6f328c54be9e3330719cebbeb
2021-10-08 20:04:37 +00:00
|
|
|
|
$this->mOutput->setOutputFlag( $flag );
|
2019-10-18 19:50:58 +00:00
|
|
|
|
$name = $this->getTitle()->getPrefixedText();
|
2019-06-27 03:35:50 +00:00
|
|
|
|
$this->logger->debug( __METHOD__ . ": set $flag flag on '$name'; $reason" );
|
|
|
|
|
|
}
|
2008-01-17 04:02:57 +00:00
|
|
|
|
}
|