<?php
/**
 * Preprocessor using PHP arrays
 *
 * This program is free software; you can redistribute it and/or modify
 * it under the terms of the GNU General Public License as published by
 * the Free Software Foundation; either version 2 of the License, or
 * (at your option) any later version.
 *
 * This program is distributed in the hope that it will be useful,
 * but WITHOUT ANY WARRANTY; without even the implied warranty of
 * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
 * GNU General Public License for more details.
 *
 * You should have received a copy of the GNU General Public License along
 * with this program; if not, write to the Free Software Foundation, Inc.,
 * 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301, USA.
 * http://www.gnu.org/copyleft/gpl.html
 *
 * @file
 * @ingroup Parser
 */

/**
 * Differences from DOM schema:
 * * attribute nodes are children
 * * "<h>" nodes that aren't at the top are replaced with <possible-h>
 *
 * Nodes are stored in a recursive array data structure. A node store is an
 * array where each element may be either a scalar (representing a text node)
 * or a "descriptor", which is a two-element array where the first element is
 * the node name and the second element is the node store for the children.
 *
 * Attributes are represented as children that have a node name starting with
 * "@", and a single text node child.
 *
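 * As an illustrative sketch (our example, not generated output): the child
 * store built for the text "a<!--x-->b" would look roughly like
 *
 * @code
 * [ 'a', [ 'comment', [ '<!--x-->' ] ], 'b' ]
 * @endcode
 *
 * where the plain strings are text nodes and the two-element array is a
 * descriptor for a comment node.
 *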
 * @todo: Consider replacing descriptor arrays with objects of a new class.
 * Benchmark and measure resulting memory impact.
 *
 * @ingroup Parser
 */
// @codingStandardsIgnoreStart Squiz.Classes.ValidClassName.NotCamelCaps
class Preprocessor_Hash extends Preprocessor {
	// @codingStandardsIgnoreEnd

	/**
	 * @var Parser
	 */
	public $parser;

	const CACHE_PREFIX = 'preprocess-hash';

	const CACHE_VERSION = 2;

	public function __construct( $parser ) {
		$this->parser = $parser;
	}

	/**
	 * @return PPFrame_Hash
	 */
	public function newFrame() {
		return new PPFrame_Hash( $this );
	}

	/**
	 * @param array $args
	 * @return PPCustomFrame_Hash
	 */
	public function newCustomFrame( $args ) {
		return new PPCustomFrame_Hash( $this, $args );
	}

	/**
	 * @param array $values
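	 *
	 * Illustrative sketch (assumed usage; $preprocessor stands for an
	 * instance of this class):
	 * @code
	 * $parts = $preprocessor->newPartNodeArray( [ 'class' => 'foo', 'x' ] );
	 * // builds one named part ("class" = "foo") and one numbered part whose
	 * // name carries an @index attribute of 0 and whose value is "x"
	 * @endcode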
	 * @return PPNode_Hash_Array
	 */
	public function newPartNodeArray( $values ) {
		$list = [];

		foreach ( $values as $k => $val ) {
			if ( is_int( $k ) ) {
				$store = [ [ 'part', [
					[ 'name', [ [ '@index', [ $k ] ] ] ],
					[ 'value', [ strval( $val ) ] ],
				] ] ];
			} else {
				$store = [ [ 'part', [
					[ 'name', [ strval( $k ) ] ],
					'=',
					[ 'value', [ strval( $val ) ] ],
				] ] ];
			}

			$list[] = new PPNode_Hash_Tree( $store, 0 );
		}

		$node = new PPNode_Hash_Array( $list );
		return $node;
	}

	/**
	 * Preprocess some wikitext and return the document tree.
	 *
	 * @param string $text The text to parse
	 * @param int $flags Bitwise combination of:
	 *   Parser::PTD_FOR_INCLUSION  Handle "<noinclude>" and "<includeonly>" as if the text is being
	 *   included. Default is to assume a direct page view.
	 *
	 * The generated DOM tree must depend only on the input text and the flags.
	 * The DOM tree must be the same in OT_HTML and OT_WIKI mode, to avoid a regression of bug 4899.
	 *
	 * Any flag added to the $flags parameter here, or any other parameter liable to cause a
	 * change in the DOM tree for a given text, must be passed through the section identifier
	 * in the section edit link and thus back to extractSections().
	 *
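	 * A minimal usage sketch (assumed caller, not prescribed by this file):
	 * @code
	 * $dom = $parser->getPreprocessor()->preprocessToObj( $text, Parser::PTD_FOR_INCLUSION );
	 * @endcode
	 *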
	 * @throws MWException
	 * @return PPNode_Hash_Tree
	 */
	public function preprocessToObj( $text, $flags = 0 ) {
		$tree = $this->cacheGetTree( $text, $flags );
		if ( $tree !== false ) {
			$store = json_decode( $tree );
			if ( is_array( $store ) ) {
				return new PPNode_Hash_Tree( $store, 0 );
			}
		}

		$forInclusion = $flags & Parser::PTD_FOR_INCLUSION;

		$xmlishElements = $this->parser->getStripList();
		$xmlishAllowMissingEndTag = [ 'includeonly', 'noinclude', 'onlyinclude' ];
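		// These tags may run unclosed to the end of the input; any other
		// unclosed extension tag is emitted as literal text instead (see the
		// "No end tag" branch below).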
		$enableOnlyinclude = false;
		if ( $forInclusion ) {
			$ignoredTags = [ 'includeonly', '/includeonly' ];
			$ignoredElements = [ 'noinclude' ];
			$xmlishElements[] = 'noinclude';
			if ( strpos( $text, '<onlyinclude>' ) !== false
				&& strpos( $text, '</onlyinclude>' ) !== false
			) {
				$enableOnlyinclude = true;
			}
		} else {
			$ignoredTags = [ 'noinclude', '/noinclude', 'onlyinclude', '/onlyinclude' ];
			$ignoredElements = [ 'includeonly' ];
			$xmlishElements[] = 'includeonly';
		}
		$xmlishRegex = implode( '|', array_merge( $xmlishElements, $ignoredTags ) );

		// Use "A" modifier (anchored) instead of "^", because ^ doesn't work with an offset
		$elementsRegex = "~($xmlishRegex)(?:\s|\/>|>)|(!--)~iA";
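		// e.g. (illustrative) with a strip list of [ 'ref', 'pre' ], this
		// matches "ref>", "pre " or "ref/>" right after a "<", or "!--" for
		// an HTML comment opener.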

		$stack = new PPDStack_Hash;

		$searchBase = "[{<\n";
		// For fast reverse searches
		$revText = strrev( $text );
		$lengthText = strlen( $text );

		// Input pointer, starts out pointing to a pseudo-newline before the start
		$i = 0;
		// Current accumulator. See the doc comment for Preprocessor_Hash for the format.
		$accum =& $stack->getAccum();
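		// (Illustrative: after scanning "a<!--x-->", $accum would hold
		// [ 'a', [ 'comment', [ '<!--x-->' ] ] ].)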
		// True to find equals signs in arguments
		$findEquals = false;
		// True to take notice of pipe characters
		$findPipe = false;
		$headingIndex = 1;
		// True if $i is inside a possible heading
		$inHeading = false;
		// True if there are no more greater-than (>) signs right of $i
		$noMoreGT = false;
		// Map of tag name => true if there are no more closing tags of given type right of $i
		$noMoreClosingTag = [];
		// True to ignore all input up to the next <onlyinclude>
		$findOnlyinclude = $enableOnlyinclude;
		// Do a line-start run without outputting an LF character
		$fakeLineStart = true;

		while ( true ) {
			// $this->memCheck();

			if ( $findOnlyinclude ) {
				// Ignore all input up to the next <onlyinclude>
				$startPos = strpos( $text, '<onlyinclude>', $i );
				if ( $startPos === false ) {
					// Ignored section runs to the end
					$accum[] = [ 'ignore', [ substr( $text, $i ) ] ];
					break;
				}
				$tagEndPos = $startPos + strlen( '<onlyinclude>' ); // past-the-end
				$accum[] = [ 'ignore', [ substr( $text, $i, $tagEndPos - $i ) ] ];
				$i = $tagEndPos;
				$findOnlyinclude = false;
			}

			if ( $fakeLineStart ) {
				$found = 'line-start';
				$curChar = '';
			} else {
				# Find next opening brace, closing brace or pipe
				$search = $searchBase;
				if ( $stack->top === false ) {
					$currentClosing = '';
				} else {
					$currentClosing = $stack->top->close;
					$search .= $currentClosing;
				}
				if ( $findPipe ) {
					$search .= '|';
				}
				if ( $findEquals ) {
					// First equals will be for the template
					$search .= '=';
				}
				$rule = null;
				# Output literal section, advance input counter
				$literalLength = strcspn( $text, $search, $i );
				if ( $literalLength > 0 ) {
					self::addLiteral( $accum, substr( $text, $i, $literalLength ) );
					$i += $literalLength;
				}
				if ( $i >= $lengthText ) {
					if ( $currentClosing == "\n" ) {
						// Do a past-the-end run to finish off the heading
						$curChar = '';
						$found = 'line-end';
					} else {
						# All done
						break;
					}
				} else {
					$curChar = $text[$i];
					if ( $curChar == '|' ) {
						$found = 'pipe';
					} elseif ( $curChar == '=' ) {
						$found = 'equals';
					} elseif ( $curChar == '<' ) {
						$found = 'angle';
					} elseif ( $curChar == "\n" ) {
						if ( $inHeading ) {
							$found = 'line-end';
						} else {
							$found = 'line-start';
						}
					} elseif ( $curChar == $currentClosing ) {
						$found = 'close';
					} elseif ( isset( $this->rules[$curChar] ) ) {
						$found = 'open';
						$rule = $this->rules[$curChar];
					} else {
						# Some versions of PHP have a strcspn which stops on null characters
						# Ignore and continue
						++$i;
						continue;
					}
				}
			}

			if ( $found == 'angle' ) {
				$matches = false;
				// Handle </onlyinclude>
				if ( $enableOnlyinclude
					&& substr( $text, $i, strlen( '</onlyinclude>' ) ) == '</onlyinclude>'
				) {
					$findOnlyinclude = true;
					continue;
				}

				// Determine element name
				if ( !preg_match( $elementsRegex, $text, $matches, 0, $i + 1 ) ) {
					// Element name missing or not listed
					self::addLiteral( $accum, '<' );
					++$i;
					continue;
				}
				// Handle comments
				if ( isset( $matches[2] ) && $matches[2] == '!--' ) {
					// To avoid leaving blank lines, when a sequence of
					// space-separated comments is both preceded and followed by
					// a newline (ignoring spaces), then
					// trim leading and trailing spaces and the trailing newline.
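					// (Illustrative: in "foo\n <!--a--> <!--b--> \nbar" the
					// surrounding spaces and the trailing newline are absorbed,
					// so no blank line is left in the output.)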

					// Find the end
					$endPos = strpos( $text, '-->', $i + 4 );
					if ( $endPos === false ) {
						// Unclosed comment in input, runs to end
						$inner = substr( $text, $i );
						$accum[] = [ 'comment', [ $inner ] ];
						$i = $lengthText;
					} else {
						// Search backwards for leading whitespace
						$wsStart = $i ? ( $i - strspn( $revText, " \t", $lengthText - $i ) ) : 0;

						// Search forwards for trailing whitespace
						// $wsEnd will be the position of the last space (or the '>' if there's none)
						$wsEnd = $endPos + 2 + strspn( $text, " \t", $endPos + 3 );

						// Keep looking forward as long as we're finding more
						// comments.
						$comments = [ [ $wsStart, $wsEnd ] ];
						while ( substr( $text, $wsEnd + 1, 4 ) == '<!--' ) {
							$c = strpos( $text, '-->', $wsEnd + 4 );
							if ( $c === false ) {
								break;
							}
							$c = $c + 2 + strspn( $text, " \t", $c + 3 );
							$comments[] = [ $wsEnd + 1, $c ];
							$wsEnd = $c;
						}

						// Eat the line if possible
						// TODO: This could theoretically be done if $wsStart == 0, i.e. for comments at
						// the overall start. That's not how Sanitizer::removeHTMLcomments() did it, but
						// it's a possible beneficial b/c break.
						if ( $wsStart > 0 && substr( $text, $wsStart - 1, 1 ) == "\n"
							&& substr( $text, $wsEnd + 1, 1 ) == "\n"
						) {
							// Remove leading whitespace from the end of the accumulator
							$wsLength = $i - $wsStart;
							$endIndex = count( $accum ) - 1;

							// Sanity check
							if ( $wsLength > 0
								&& $endIndex >= 0
								&& is_string( $accum[$endIndex] )
								&& strspn( $accum[$endIndex], " \t", -$wsLength ) === $wsLength
							) {
								$accum[$endIndex] = substr( $accum[$endIndex], 0, -$wsLength );
							}

							// Dump all but the last comment to the accumulator
							foreach ( $comments as $j => $com ) {
								$startPos = $com[0];
								$endPos = $com[1] + 1;
								if ( $j == ( count( $comments ) - 1 ) ) {
									break;
								}
								$inner = substr( $text, $startPos, $endPos - $startPos );
								$accum[] = [ 'comment', [ $inner ] ];
							}

							// Do a line-start run next time to look for headings after the comment
							$fakeLineStart = true;
						} else {
							// No line to eat, just take the comment itself
							$startPos = $i;
							$endPos += 2;
						}

						if ( $stack->top ) {
							$part = $stack->top->getCurrentPart();
							if ( !( isset( $part->commentEnd ) && $part->commentEnd == $wsStart - 1 ) ) {
								$part->visualEnd = $wsStart;
							}
							// Else comments abutting, no change in visual end
							$part->commentEnd = $endPos;
						}
						$i = $endPos + 1;
						$inner = substr( $text, $startPos, $endPos - $startPos + 1 );
						$accum[] = [ 'comment', [ $inner ] ];
					}
					continue;
				}
				$name = $matches[1];
				$lowerName = strtolower( $name );
				$attrStart = $i + strlen( $name ) + 1;

				// Find end of tag
				$tagEndPos = $noMoreGT ? false : strpos( $text, '>', $attrStart );
				if ( $tagEndPos === false ) {
					// Infinite backtrack
					// Disable tag search to prevent worst-case O(N^2) performance
					$noMoreGT = true;
					self::addLiteral( $accum, '<' );
					++$i;
					continue;
				}

				// Handle ignored tags
				if ( in_array( $lowerName, $ignoredTags ) ) {
					$accum[] = [ 'ignore', [ substr( $text, $i, $tagEndPos - $i + 1 ) ] ];
					$i = $tagEndPos + 1;
					continue;
				}

				$tagStartPos = $i;
				if ( $text[$tagEndPos - 1] == '/' ) {
					// Short end tag
					$attrEnd = $tagEndPos - 1;
					$inner = null;
					$i = $tagEndPos + 1;
					$close = null;
				} else {
					$attrEnd = $tagEndPos;
					// Find closing tag
					if (
						!isset( $noMoreClosingTag[$name] ) &&
						preg_match( "/<\/" . preg_quote( $name, '/' ) . "\s*>/i",
							$text, $matches, PREG_OFFSET_CAPTURE, $tagEndPos + 1 )
					) {
						$inner = substr( $text, $tagEndPos + 1, $matches[0][1] - $tagEndPos - 1 );
						$i = $matches[0][1] + strlen( $matches[0][0] );
						$close = $matches[0][0];
					} else {
						// No end tag
						if ( in_array( $name, $xmlishAllowMissingEndTag ) ) {
							// Let it run out to the end of the text.
							$inner = substr( $text, $tagEndPos + 1 );
							$i = $lengthText;
							$close = null;
						} else {
							// Don't match the tag, treat opening tag as literal and resume parsing.
							$i = $tagEndPos + 1;
							self::addLiteral( $accum,
								substr( $text, $tagStartPos, $tagEndPos + 1 - $tagStartPos ) );
							// Cache results, otherwise we have O(N^2) performance for input like <foo><foo><foo>...
							$noMoreClosingTag[$name] = true;
							continue;
						}
					}
				}
|
|
|
|
|
// <includeonly> and <noinclude> just become <ignore> tags
|
2008-03-05 01:07:47 +00:00
|
|
|
if ( in_array( $lowerName, $ignoredElements ) ) {
|
Preprocessor_Hash: use child arrays instead of linked lists
The singly-linked list data structure of Preprocessor_Hash was causing
stack exhaustion due to the need for a recursion depth proportional to
the number of children of a given PPNode, in serialize() and on
object destruction. So, switch to array-based storage. PPNode_* becomes
a temporary proxy around the underlying storage, which avoids circular
references and keeps the storage very compact. Preprocessor_DOM uses
similar temporary PPNode objects, so the fact that
$node->getFirstChild() !== $node->getFirstChild()
should not cause any new problems.
* Increment cache version
* Use JSON serialization of the store array instead of serialize(),
since JSON is more compact, even after gzipping.
* For efficiency, make $accum a plain array, and use it as an array
where possible, instead of using helper functions.
Performance and memory usage for typical input are slightly improved:
something like 4% faster for the whole parse, and 20% less memory for
the tree.
Bug: T73486
Change-Id: I0d6c162b790d6dc1ddb0352aba6e4753854f4c56
2016-07-18 02:05:13 +00:00
|
|
|
$accum[] = [ 'ignore', [ substr( $text, $tagStartPos, $i - $tagStartPos ) ] ];
|
2008-02-05 08:23:58 +00:00
|
|
|
continue;
|
|
|
|
|
}
|
|
|
|
|
|
|
|
|
|
if ( $attrEnd <= $attrStart ) {
|
|
|
|
|
$attr = '';
|
|
|
|
|
} else {
|
2008-04-14 07:45:50 +00:00
|
|
|
// Note that the attr element contains the whitespace between name and attribute,
|
2008-02-05 08:23:58 +00:00
|
|
|
// this is necessary for precise reconstruction during pre-save transform.
|
|
|
|
|
$attr = substr( $text, $attrStart, $attrEnd - $attrStart );
|
|
|
|
|
}
|
|
|
|
|
|
Preprocessor_Hash: use child arrays instead of linked lists
The singly-linked list data structure of Preprocessor_Hash was causing
stack exhaustion due to the need for a recursion depth proportional to
the number of children of a given PPNode, in serialize() and on
object destruction. So, switch to array-based storage. PPNode_* becomes
a temporary proxy around the underlying storage, which avoids circular
references and keeps the storage very compact. Preprocessor_DOM uses
similar temporary PPNode objects, so the fact that
$node->getFirstChild() !== $node->getFirstChild()
should not cause any new problems.
* Increment cache version
* Use JSON serialization of the store array instead of serialize(),
since JSON is more compact, even after gzipping.
* For efficiency, make $accum a plain array, and use it as an array
where possible, instead of using helper functions.
Performance and memory usage for typical input are slightly improved:
something like 4% faster for the whole parse, and 20% less memory for
the tree.
Bug: T73486
Change-Id: I0d6c162b790d6dc1ddb0352aba6e4753854f4c56
2016-07-18 02:05:13 +00:00
|
|
|
				$children = [
					[ 'name', [ $name ] ],
					[ 'attr', [ $attr ] ] ];
				if ( $inner !== null ) {
					$children[] = [ 'inner', [ $inner ] ];
				}
				if ( $close !== null ) {
					$children[] = [ 'close', [ $close ] ];
				}

				$accum[] = [ 'ext', $children ];
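				// For example, "<ref name=x>text</ref>" yields a descriptor along
				// the lines of (sketch, whitespace handling elided):
				//   [ 'ext', [ [ 'name', [ 'ref' ] ], [ 'attr', [ ' name=x' ] ],
				//     [ 'inner', [ 'text' ] ], [ 'close', [ '</ref>' ] ] ] ]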
			} elseif ( $found == 'line-start' ) {
				// Is this the start of a heading?
				// Line break belongs before the heading element in any case
				if ( $fakeLineStart ) {
					$fakeLineStart = false;
				} else {
					self::addLiteral( $accum, $curChar );
					$i++;
				}

				$count = strspn( $text, '=', $i, 6 );
				if ( $count == 1 && $findEquals ) {
					// DWIM: This looks kind of like a name/value separator.
					// Let's let the equals handler have it and break the potential
					// heading. This is heuristic, but AFAICT the methods for
					// completely correct disambiguation are very complex.
				} elseif ( $count > 0 ) {
					$piece = [
						'open' => "\n",
						'close' => "\n",
						'parts' => [ new PPDPart_Hash( str_repeat( '=', $count ) ) ],
						'startPos' => $i,
						'count' => $count ];
					$stack->push( $piece );
					$accum =& $stack->getAccum();
					extract( $stack->getFlags() );
					$i += $count;
				}
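				// ( A heading piece uses "\n" as both its open and close string, so
				// it is the next line-end event, not a brace match, that decides
				// whether this run of "=" signs becomes a real heading. )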
			} elseif ( $found == 'line-end' ) {
				$piece = $stack->top;
				// A heading must be open, otherwise \n wouldn't have been in the search list
				assert( $piece->open === "\n" );
				$part = $piece->getCurrentPart();
				// Search back through the input to see if it has a proper close.
				// Do this using the reversed string since the other solutions
				// (end anchor, etc.) are inefficient.
				$wsLength = strspn( $revText, " \t", $lengthText - $i );
				$searchStart = $i - $wsLength;
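				// ( $revText is the input reversed, so an strspn() at offset
				// $lengthText - $i counts characters running backwards from
				// position $i in $text. )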
				if ( isset( $part->commentEnd ) && $searchStart - 1 == $part->commentEnd ) {
					// Comment found at line end
					// Search for equals signs before the comment
					$searchStart = $part->visualEnd;
					$searchStart -= strspn( $revText, " \t", $lengthText - $searchStart );
				}

				$count = $piece->count;
				$equalsLength = strspn( $revText, '=', $lengthText - $searchStart );
				if ( $equalsLength > 0 ) {
					if ( $searchStart - $equalsLength == $piece->startPos ) {
						// This is just a single string of equals signs on its own line
						// Replicate the doHeadings behavior /={count}(.+)={count}/
						// First find out how many equals signs there really are (don't stop at 6)
						$count = $equalsLength;
						if ( $count < 3 ) {
							$count = 0;
						} else {
							$count = min( 6, intval( ( $count - 1 ) / 2 ) );
						}
					} else {
						$count = min( $equalsLength, $count );
					}
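					// ( e.g. a line consisting of "=====" alone has $equalsLength 5,
					// giving min( 6, intval( ( 5 - 1 ) / 2 ) ) = 2: a level-2 heading
					// whose content is the single middle "=". )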
					if ( $count > 0 ) {
						// Normal match, output <h>
						$element = [ [ 'possible-h',
							array_merge(
								[
									[ '@level', [ $count ] ],
									[ '@i', [ $headingIndex++ ] ]
								],
								$accum
							)
						] ];
					} else {
						// Single equals sign on its own line, count=0
						$element = $accum;
					}
				} else {
					// No match, no <h>, just pass down the inner text
					$element = $accum;
				}

				// Unwind the stack
				$stack->pop();
				$accum =& $stack->getAccum();
				extract( $stack->getFlags() );

				// Append the result to the enclosing accumulator
				array_splice( $accum, count( $accum ), 0, $element );
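				// ( array_splice() rather than $accum[] because $element is itself
				// a node store: its entries are appended individually instead of
				// being nested as one array value. )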

				// Note that we do NOT increment the input pointer.
				// This is because the closing linebreak could be the opening linebreak of
				// another heading. Infinite loops are avoided because the next iteration MUST
				// hit the heading open case above, which unconditionally increments the
				// input pointer.
			} elseif ( $found == 'open' ) {
				# Count opening brace characters
				$count = strspn( $text, $curChar, $i );

				# We only need to add to the stack if the opening brace count is
				# enough for one of the rules
				if ( $count >= $rule['min'] ) {
					# Add it to the stack
					$piece = [
						'open' => $curChar,
						'close' => $rule['end'],
						'count' => $count,
						'lineStart' => ( $i > 0 && $text[$i - 1] == "\n" ),
					];

					$stack->push( $piece );
					$accum =& $stack->getAccum();
					extract( $stack->getFlags() );
				} else {
					# Add literal brace(s)
					self::addLiteral( $accum, str_repeat( $curChar, $count ) );
				}
				$i += $count;
			} elseif ( $found == 'close' ) {
				$piece = $stack->top;
				# Let's check if there are enough characters for a closing brace
				$maxCount = $piece->count;
				$count = strspn( $text, $curChar, $i, $maxCount );

				# Check for maximum matching characters (if there are 5 closing
				# characters, we will probably need only 3 - depending on the rules)
				$rule = $this->rules[$piece->open];

				if ( $count > $rule['max'] ) {
					# The specified maximum exists in the callback array, unless the caller
					# has made an error
					$matchingCount = $rule['max'];
				} else {
					# Count is less than the maximum
					# Skip any gaps in the callback array to find the true largest match
					# Need to use array_key_exists not isset because the callback can be null
					$matchingCount = $count;
					while ( $matchingCount > 0 && !array_key_exists( $matchingCount, $rule['names'] ) ) {
						--$matchingCount;
					}
				}
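				# e.g. "{{{foo}}" opens with three braces but only two can be
				# matched by the closing "}}", so $matchingCount is 2 and the rule
				# names a template rather than a template argument.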

				if ( $matchingCount <= 0 ) {
					# No matching element found in callback array
					# Output a literal closing brace and continue
					self::addLiteral( $accum, str_repeat( $curChar, $count ) );
					$i += $count;
					continue;
				}

				$name = $rule['names'][$matchingCount];
				if ( $name === null ) {
					// No element, just literal text
					$element = $piece->breakSyntax( $matchingCount );
					self::addLiteral( $element, str_repeat( $rule['end'], $matchingCount ) );
				} else {
					# Create XML element
					$parts = $piece->parts;
					$titleAccum = $parts[0]->out;
					unset( $parts[0] );

					$children = [];

					# The invocation is at the start of the line if lineStart is set in
					# the stack, and all opening brackets are used up.
					if ( $maxCount == $matchingCount && !empty( $piece->lineStart ) ) {
						$children[] = [ '@lineStart', [ 1 ] ];
					}
					$titleNode = [ 'title', $titleAccum ];
					$children[] = $titleNode;
					$argIndex = 1;
					foreach ( $parts as $part ) {
						if ( isset( $part->eqpos ) ) {
							$equalsNode = $part->out[$part->eqpos];
							$nameNode = [ 'name', array_slice( $part->out, 0, $part->eqpos ) ];
							$valueNode = [ 'value', array_slice( $part->out, $part->eqpos + 1 ) ];
							$partNode = [ 'part', [ $nameNode, $equalsNode, $valueNode ] ];
							$children[] = $partNode;
						} else {
							$nameNode = [ 'name', [ [ '@index', [ $argIndex++ ] ] ] ];
							$valueNode = [ 'value', $part->out ];
							$partNode = [ 'part', [ $nameNode, $valueNode ] ];
							$children[] = $partNode;
						}
					}
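					// ( e.g. in "{{foo|a=b|c}}" the part "a=b" carries an eqpos and
					// is split into name/equals/value children above, while "c" gets
					// a numeric '@index' attribute instead of an explicit name. )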
					$element = [ [ $name, $children ] ];
				}

				# Advance input pointer
				$i += $matchingCount;

				# Unwind the stack
				$stack->pop();
				$accum =& $stack->getAccum();

				# Re-add the old stack element if it still has unmatched opening characters remaining
				if ( $matchingCount < $piece->count ) {
					$piece->parts = [ new PPDPart_Hash ];
					$piece->count -= $matchingCount;
					# Do we still qualify for any callback with the remaining count?
					$min = $this->rules[$piece->open]['min'];
					if ( $piece->count >= $min ) {
						$stack->push( $piece );
						$accum =& $stack->getAccum();
					} else {
						self::addLiteral( $accum, str_repeat( $piece->open, $piece->count ) );
					}
				}

				extract( $stack->getFlags() );

				# Add XML element to the enclosing accumulator
				array_splice( $accum, count( $accum ), 0, $element );
			} elseif ( $found == 'pipe' ) {
				$findEquals = true; // shortcut for getFlags()
				$stack->addPart();
				$accum =& $stack->getAccum();
				++$i;
			} elseif ( $found == 'equals' ) {
				$findEquals = false; // shortcut for getFlags()
				$accum[] = [ 'equals', [ '=' ] ];
				$stack->getCurrentPart()->eqpos = count( $accum ) - 1;
				++$i;
			}
		}

		# Output any remaining unclosed brackets
		foreach ( $stack->stack as $piece ) {
			array_splice( $stack->rootAccum, count( $stack->rootAccum ), 0, $piece->breakSyntax() );
		}

		# Enable top-level headings
		foreach ( $stack->rootAccum as &$node ) {
			if ( is_array( $node ) && $node[PPNode_Hash_Tree::NAME] === 'possible-h' ) {
				$node[PPNode_Hash_Tree::NAME] = 'h';
			}
		}
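		// ( Only descriptors at the root of the store are renamed here; any
		// 'possible-h' node nested inside another element keeps that name and
		// is not treated as a heading. )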

		$rootStore = [ [ 'root', $stack->rootAccum ] ];
		$rootNode = new PPNode_Hash_Tree( $rootStore, 0 );

		// Cache
		$tree = json_encode( $rootStore, JSON_UNESCAPED_SLASHES | JSON_UNESCAPED_UNICODE );
		if ( $tree !== false ) {
			$this->cacheSetTree( $text, $flags, $tree );
		}
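		// ( json_encode() returns false for values it cannot encode, e.g. text
		// that is not valid UTF-8; in that case the parse simply isn't cached. )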

		return $rootNode;
	}
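
	/**
	 * Append literal text to a node store, merging it into a trailing text
	 * node where possible, so that e.g. adding "b" after "a" produces the
	 * single scalar "ab" rather than two adjacent entries.
	 *
	 * @param array &$accum The node store to append to
	 * @param string $text The literal text to add
	 */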
	private static function addLiteral( array &$accum, $text ) {
		$n = count( $accum );
		if ( $n && is_string( $accum[$n - 1] ) ) {
			$accum[$n - 1] .= $text;
		} else {
			$accum[] = $text;
		}
	}
}

/**
 * Stack class to help Preprocessor::preprocessToObj()
 * @ingroup Parser
 */
// @codingStandardsIgnoreStart Squiz.Classes.ValidClassName.NotCamelCaps
class PPDStack_Hash extends PPDStack {
	// @codingStandardsIgnoreEnd

	public function __construct() {
		$this->elementClass = 'PPDStackElement_Hash';
		parent::__construct();
		$this->rootAccum = [];
	}
}

/**
 * @ingroup Parser
 */
// @codingStandardsIgnoreStart Squiz.Classes.ValidClassName.NotCamelCaps
class PPDStackElement_Hash extends PPDStackElement {
	// @codingStandardsIgnoreEnd

	public function __construct( $data = [] ) {
		$this->partClass = 'PPDPart_Hash';
		parent::__construct( $data );
	}

	/**
	 * Get the accumulator that would result if the close is not found.
	 *
	 * @param int|bool $openingCount Number of opening characters to keep as
	 *   literal text, or false to use all of them
	 * @return array
	 */
	public function breakSyntax( $openingCount = false ) {
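		// ( e.g. an unclosed "{{foo|bar" comes back as the node store
		// [ '{{foo|bar' ]: opening braces, part texts, and "|" separators are
		// re-joined below, with adjacent strings merged into one scalar. )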
		if ( $this->open == "\n" ) {
			$accum = $this->parts[0]->out;
		} else {
			if ( $openingCount === false ) {
				$openingCount = $this->count;
			}
			$accum = [ str_repeat( $this->open, $openingCount ) ];
			$lastIndex = 0;
			$first = true;
			foreach ( $this->parts as $part ) {
				if ( $first ) {
					$first = false;
				} elseif ( is_string( $accum[$lastIndex] ) ) {
					$accum[$lastIndex] .= '|';
				} else {
					$accum[++$lastIndex] = '|';
				}
				foreach ( $part->out as $node ) {
					if ( is_string( $node ) && is_string( $accum[$lastIndex] ) ) {
						$accum[$lastIndex] .= $node;
					} else {
						$accum[++$lastIndex] = $node;
					}
				}
			}
		}
		return $accum;
	}
}

/**
 * @ingroup Parser
 */
// @codingStandardsIgnoreStart Squiz.Classes.ValidClassName.NotCamelCaps
class PPDPart_Hash extends PPDPart {
	// @codingStandardsIgnoreEnd

	public function __construct( $out = '' ) {
		if ( $out !== '' ) {
			$accum = [ $out ];
		} else {
			$accum = [];
		}
		parent::__construct( $accum );
	}
}

/**
 * An expansion frame, used as a context to expand the result of preprocessToObj()
 * @ingroup Parser
 */
// @codingStandardsIgnoreStart Squiz.Classes.ValidClassName.NotCamelCaps
class PPFrame_Hash implements PPFrame {
	// @codingStandardsIgnoreEnd

	/**
	 * @var Parser
	 */
	public $parser;

	/**
	 * @var Preprocessor
	 */
	public $preprocessor;

	/**
	 * @var Title
	 */
	public $title;

	public $titleCache;

	/**
	 * Hashtable listing templates which are disallowed for expansion in this frame,
	 * having been encountered previously in parent frames.
	 */
	public $loopCheckHash;

	/**
	 * Recursion depth of this frame, top = 0
	 * Note that this is NOT the same as expansion depth in expand()
	 */
	public $depth;

	private $volatile = false;
	private $ttl = null;

	/**
	 * @var array
	 */
	protected $childExpansionCache;

	/**
	 * Construct a new preprocessor frame.
	 * @param Preprocessor $preprocessor The parent preprocessor
	 */
	public function __construct( $preprocessor ) {
		$this->preprocessor = $preprocessor;
		$this->parser = $preprocessor->parser;
		$this->title = $this->parser->mTitle;
		$this->titleCache = [ $this->title ? $this->title->getPrefixedDBkey() : false ];
		$this->loopCheckHash = [];
		$this->depth = 0;
		$this->childExpansionCache = [];
	}

	/**
	 * Create a new child frame
	 * $args is optionally a multi-root PPNode or array containing the template arguments
	 *
	 * @param array|bool|PPNode_Hash_Array $args
	 * @param Title|bool $title
	 * @param int $indexOffset
	 * @throws MWException
	 * @return PPTemplateFrame_Hash
	 */
	public function newChild( $args = false, $title = false, $indexOffset = 0 ) {
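		// ( Duplicate arguments, e.g. "{{foo|bar|1=baz}}" defining argument 1
		// twice, raise the duplicate-args warning below; the later value wins
		// because each assignment unsets the same key in the other table. )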
		$namedArgs = [];
		$numberedArgs = [];
		if ( $title === false ) {
			$title = $this->title;
		}
		if ( $args !== false ) {
			if ( $args instanceof PPNode_Hash_Array ) {
				$args = $args->value;
			} elseif ( !is_array( $args ) ) {
				throw new MWException( __METHOD__ . ': $args must be array or PPNode_Hash_Array' );
			}
			foreach ( $args as $arg ) {
				$bits = $arg->splitArg();
				if ( $bits['index'] !== '' ) {
					// Numbered parameter
					$index = $bits['index'] - $indexOffset;
					if ( isset( $namedArgs[$index] ) || isset( $numberedArgs[$index] ) ) {
						$this->parser->getOutput()->addWarning( wfMessage( 'duplicate-args-warning',
							wfEscapeWikiText( $this->title ),
							wfEscapeWikiText( $title ),
							wfEscapeWikiText( $index ) )->text() );
						$this->parser->addTrackingCategory( 'duplicate-args-category' );
					}
					$numberedArgs[$index] = $bits['value'];
					unset( $namedArgs[$index] );
				} else {
					// Named parameter
					$name = trim( $this->expand( $bits['name'], PPFrame::STRIP_COMMENTS ) );
					if ( isset( $namedArgs[$name] ) || isset( $numberedArgs[$name] ) ) {
						$this->parser->getOutput()->addWarning( wfMessage( 'duplicate-args-warning',
							wfEscapeWikiText( $this->title ),
							wfEscapeWikiText( $title ),
							wfEscapeWikiText( $name ) )->text() );
						$this->parser->addTrackingCategory( 'duplicate-args-category' );
					}
					$namedArgs[$name] = $bits['value'];
					unset( $numberedArgs[$name] );
				}
			}
		}
		return new PPTemplateFrame_Hash( $this->preprocessor, $this, $numberedArgs, $namedArgs, $title );
	}

	/**
	 * @throws MWException
	 * @param string|int $key
	 * @param string|PPNode $root
	 * @param int $flags
	 * @return string
	 */
	public function cachedExpand( $key, $root, $flags = 0 ) {
		// we don't have a parent, so we don't have a cache
		return $this->expand( $root, $flags );
	}

	/**
	 * @throws MWException
	 * @param string|PPNode $root
	 * @param int $flags
	 * @return string
	 */
	public function expand( $root, $flags = 0 ) {
|
2008-10-23 14:40:10 +00:00
|
|
|
static $expansionDepth = 0;
|
2008-02-05 08:23:58 +00:00
|
|
|
if ( is_string( $root ) ) {
|
|
|
|
|
return $root;
|
|
|
|
|
}
|
|
|
|
|
|
2011-02-24 17:04:49 +00:00
|
|
|
if ( ++$this->parser->mPPNodeCount > $this->parser->mOptions->getMaxPPNodeCount() ) {
|
2012-04-21 11:09:03 +00:00
|
|
|
$this->parser->limitationWarn( 'node-count-exceeded',
|
|
|
|
|
$this->parser->mPPNodeCount,
|
|
|
|
|
$this->parser->mOptions->getMaxPPNodeCount()
|
|
|
|
|
);
|
2008-02-05 08:23:58 +00:00
|
|
|
return '<span class="error">Node-count limit exceeded</span>';
|
|
|
|
|
}
|
2010-08-05 19:01:47 +00:00
|
|
|
if ( $expansionDepth > $this->parser->mOptions->getMaxPPExpandDepth() ) {
|
2012-04-21 11:09:03 +00:00
|
|
|
$this->parser->limitationWarn( 'expansion-depth-exceeded',
|
|
|
|
|
$expansionDepth,
|
|
|
|
|
$this->parser->mOptions->getMaxPPExpandDepth()
|
|
|
|
|
);
|
2008-03-25 04:26:58 +00:00
|
|
|
return '<span class="error">Expansion depth limit exceeded</span>';
|
|
|
|
|
}
|
2008-10-23 14:40:10 +00:00
|
|
|
++$expansionDepth;
|
2012-05-04 20:44:14 +00:00
|
|
|
if ( $expansionDepth > $this->parser->mHighestExpansionDepth ) {
|
|
|
|
|
$this->parser->mHighestExpansionDepth = $expansionDepth;
|
|
|
|
|
}
|
2008-02-05 08:23:58 +00:00
|
|
|
|
2016-02-17 09:09:32 +00:00
|
|
|
$outStack = [ '', '' ];
|
|
|
|
|
$iteratorStack = [ false, $root ];
|
|
|
|
|
$indexStack = [ 0, 0 ];
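
		// The three arrays above are parallel stacks that stand in for the
		// call stack of a recursive implementation: $iteratorStack holds the
		// node or node list being walked at each nesting level, $indexStack
		// the position within it, and $outStack the text accumulated so far.
		// Level 0 is a sentinel whose accumulator receives the final result.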

		while ( count( $iteratorStack ) > 1 ) {
			$level = count( $outStack ) - 1;
			$iteratorNode =& $iteratorStack[$level];
			$out =& $outStack[$level];
			$index =& $indexStack[$level];

			if ( is_array( $iteratorNode ) ) {
				if ( $index >= count( $iteratorNode ) ) {
					// All done with this iterator
					$iteratorStack[$level] = false;
					$contextNode = false;
				} else {
					$contextNode = $iteratorNode[$index];
					$index++;
				}
			} elseif ( $iteratorNode instanceof PPNode_Hash_Array ) {
				if ( $index >= $iteratorNode->getLength() ) {
					// All done with this iterator
					$iteratorStack[$level] = false;
					$contextNode = false;
				} else {
					$contextNode = $iteratorNode->item( $index );
					$index++;
				}
			} else {
				// Copy to $contextNode and then delete from iterator stack,
				// because this is not an iterator but we do have to execute it once
				$contextNode = $iteratorStack[$level];
				$iteratorStack[$level] = false;
			}

			$newIterator = false;
			$contextName = false;
			$contextChildren = false;

			if ( $contextNode === false ) {
				// nothing to do
			} elseif ( is_string( $contextNode ) ) {
				$out .= $contextNode;
			} elseif ( $contextNode instanceof PPNode_Hash_Array ) {
				$newIterator = $contextNode;
			} elseif ( $contextNode instanceof PPNode_Hash_Attr ) {
				// No output
			} elseif ( $contextNode instanceof PPNode_Hash_Text ) {
				$out .= $contextNode->value;
			} elseif ( $contextNode instanceof PPNode_Hash_Tree ) {
				$contextName = $contextNode->name;
				$contextChildren = $contextNode->getRawChildren();
			} elseif ( is_array( $contextNode ) ) {
				// Node descriptor array
				if ( count( $contextNode ) !== 2 ) {
					throw new MWException( __METHOD__ .
						': found an array where a node descriptor should be' );
				}
				list( $contextName, $contextChildren ) = $contextNode;
			} else {
				throw new MWException( __METHOD__ . ': Invalid parameter type' );
			}

			// Handle node descriptor array or tree object
			if ( $contextName === false ) {
				// Not a node, already handled above
			} elseif ( $contextName[0] === '@' ) {
				// Attribute: no output
			} elseif ( $contextName === 'template' ) {
				# Double-brace expansion
				$bits = PPNode_Hash_Tree::splitRawTemplate( $contextChildren );
				if ( $flags & PPFrame::NO_TEMPLATES ) {
					$newIterator = $this->virtualBracketedImplode(
						'{{', '|', '}}',
						$bits['title'],
						$bits['parts']
					);
				} else {
					$ret = $this->parser->braceSubstitution( $bits, $this );
					if ( isset( $ret['object'] ) ) {
						$newIterator = $ret['object'];
					} else {
						$out .= $ret['text'];
					}
				}
			} elseif ( $contextName === 'tplarg' ) {
				# Triple-brace expansion
				$bits = PPNode_Hash_Tree::splitRawTemplate( $contextChildren );
				if ( $flags & PPFrame::NO_ARGS ) {
					$newIterator = $this->virtualBracketedImplode(
						'{{{', '|', '}}}',
						$bits['title'],
						$bits['parts']
					);
				} else {
					$ret = $this->parser->argSubstitution( $bits, $this );
					if ( isset( $ret['object'] ) ) {
						$newIterator = $ret['object'];
					} else {
						$out .= $ret['text'];
					}
				}
			} elseif ( $contextName === 'comment' ) {
				# HTML-style comment
				# Remove it in HTML, pre+remove and STRIP_COMMENTS modes
				# Not in RECOVER_COMMENTS mode (msgnw) though.
				if ( ( $this->parser->ot['html']
					|| ( $this->parser->ot['pre'] && $this->parser->mOptions->getRemoveComments() )
					|| ( $flags & PPFrame::STRIP_COMMENTS )
					) && !( $flags & PPFrame::RECOVER_COMMENTS )
				) {
					$out .= '';
				} elseif ( $this->parser->ot['wiki'] && !( $flags & PPFrame::RECOVER_COMMENTS ) ) {
					# Add a strip marker in PST mode so that pstPass2() can
					# run some old-fashioned regexes on the result.
					# Not in RECOVER_COMMENTS mode (extractSections) though.
					$out .= $this->parser->insertStripItem( $contextChildren[0] );
				} else {
					# Recover the literal comment in RECOVER_COMMENTS and pre+no-remove
					$out .= $contextChildren[0];
				}
			} elseif ( $contextName === 'ignore' ) {
				# Output suppression used by <includeonly> etc.
				# OT_WIKI will only respect <ignore> in substed templates.
				# The other output types respect it unless NO_IGNORE is set.
				# extractSections() sets NO_IGNORE and so never respects it.
				if ( ( !isset( $this->parent ) && $this->parser->ot['wiki'] )
					|| ( $flags & PPFrame::NO_IGNORE )
				) {
					$out .= $contextChildren[0];
				} else {
					// $out .= '';
				}
			} elseif ( $contextName === 'ext' ) {
				# Extension tag
				$bits = PPNode_Hash_Tree::splitRawExt( $contextChildren ) +
					[ 'attr' => null, 'inner' => null, 'close' => null ];
				if ( $flags & PPFrame::NO_TAGS ) {
					$s = '<' . $bits['name']->getFirstChild()->value;
					if ( $bits['attr'] ) {
						$s .= $bits['attr']->getFirstChild()->value;
					}
					if ( $bits['inner'] ) {
						$s .= '>' . $bits['inner']->getFirstChild()->value;
						if ( $bits['close'] ) {
							$s .= $bits['close']->getFirstChild()->value;
						}
					} else {
						$s .= '/>';
					}
					$out .= $s;
				} else {
					$out .= $this->parser->extensionSubstitution( $bits, $this );
				}
			} elseif ( $contextName === 'h' ) {
				# Heading
				if ( $this->parser->ot['html'] ) {
					# Expand immediately and insert heading index marker
					$s = $this->expand( $contextChildren, $flags );
					$bits = PPNode_Hash_Tree::splitRawHeading( $contextChildren );
					$titleText = $this->title->getPrefixedDBkey();
					$this->parser->mHeadings[] = [ $titleText, $bits['i'] ];
					$serial = count( $this->parser->mHeadings ) - 1;
					$marker = Parser::MARKER_PREFIX . "-h-$serial-" . Parser::MARKER_SUFFIX;
					$s = substr( $s, 0, $bits['level'] ) . $marker . substr( $s, $bits['level'] );
					$this->parser->mStripState->addGeneral( $marker, '' );
					$out .= $s;
				} else {
					# Expand in virtual stack
					$newIterator = $contextChildren;
				}
			} else {
				# Generic recursive expansion
				$newIterator = $contextChildren;
			}

			if ( $newIterator !== false ) {
				$outStack[] = '';
				$iteratorStack[] = $newIterator;
				$indexStack[] = 0;
			} elseif ( $iteratorStack[$level] === false ) {
				// Return accumulated value to parent
				// With tail recursion
				while ( $iteratorStack[$level] === false && $level > 0 ) {
					$outStack[$level - 1] .= $out;
					array_pop( $outStack );
					array_pop( $iteratorStack );
					array_pop( $indexStack );
					$level--;
				}
			}
		}
		--$expansionDepth;
		return $outStack[0];
	}
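
	// A sketch of the typical call sequence, assuming $preprocessor and
	// $wikitext are supplied by the caller: preprocessToObj() builds the
	// node store, and expand() flattens it back to text with templates,
	// arguments and extension tags resolved.
	//
	//     $dom = $preprocessor->preprocessToObj( $wikitext );
	//     $frame = $preprocessor->newFrame();
	//     $expanded = $frame->expand( $dom );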

	/**
	 * @param string $sep
	 * @param int $flags
	 * @param string|PPNode $args,...
	 * @return string
	 */
	public function implodeWithFlags( $sep, $flags /*, ... */ ) {
		$args = array_slice( func_get_args(), 2 );

		$first = true;
		$s = '';
		foreach ( $args as $root ) {
			if ( $root instanceof PPNode_Hash_Array ) {
				$root = $root->value;
			}
			if ( !is_array( $root ) ) {
				$root = [ $root ];
			}
			foreach ( $root as $node ) {
				if ( $first ) {
					$first = false;
				} else {
					$s .= $sep;
				}
				$s .= $this->expand( $node, $flags );
			}
		}
		return $s;
	}

	/**
	 * Implode with no flags specified
	 * This previously called implodeWithFlags but has now been inlined to reduce stack depth
	 * @param string $sep
	 * @param string|PPNode $args,...
	 * @return string
	 */
	public function implode( $sep /*, ... */ ) {
		$args = array_slice( func_get_args(), 1 );

		$first = true;
		$s = '';
		foreach ( $args as $root ) {
			if ( $root instanceof PPNode_Hash_Array ) {
				$root = $root->value;
			}
			if ( !is_array( $root ) ) {
				$root = [ $root ];
			}
			foreach ( $root as $node ) {
				if ( $first ) {
					$first = false;
				} else {
					$s .= $sep;
				}
				$s .= $this->expand( $node );
			}
		}
		return $s;
	}
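
	// Illustrative only: implode() expands each argument before joining, so
	// plain strings and PPNode values can be mixed freely. $partNode is an
	// assumed placeholder for a node from the same store.
	//
	//     $joined = $frame->implode( '|', 'literal text', $partNode );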

	/**
	 * Makes an object that, when expand()ed, will be the same as one obtained
	 * with implode()
	 *
	 * @param string $sep
	 * @param string|PPNode $args,...
	 * @return PPNode_Hash_Array
	 */
	public function virtualImplode( $sep /*, ... */ ) {
		$args = array_slice( func_get_args(), 1 );
		$out = [];
		$first = true;

		foreach ( $args as $root ) {
			if ( $root instanceof PPNode_Hash_Array ) {
				$root = $root->value;
			}
			if ( !is_array( $root ) ) {
				$root = [ $root ];
			}
			foreach ( $root as $node ) {
				if ( $first ) {
					$first = false;
				} else {
					$out[] = $sep;
				}
				$out[] = $node;
			}
		}
		return new PPNode_Hash_Array( $out );
	}
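
	// Sketch of the contrast with implode(): virtualImplode() defers all
	// expansion. The PPNode_Hash_Array it returns interleaves the separator
	// with the unexpanded nodes, so expanding it later yields the same string
	// that implode() would have produced eagerly.
	//
	//     $virtual = $frame->virtualImplode( '|', $a, $b );
	//     // $frame->expand( $virtual ) === $frame->implode( '|', $a, $b )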

	/**
	 * Virtual implode with brackets
	 *
	 * @param string $start
	 * @param string $sep
	 * @param string $end
	 * @param string|PPNode $args,...
	 * @return PPNode_Hash_Array
	 */
	public function virtualBracketedImplode( $start, $sep, $end /*, ... */ ) {
		$args = array_slice( func_get_args(), 3 );
		$out = [ $start ];
		$first = true;

		foreach ( $args as $root ) {
			if ( $root instanceof PPNode_Hash_Array ) {
				$root = $root->value;
			}
			if ( !is_array( $root ) ) {
				$root = [ $root ];
			}
			foreach ( $root as $node ) {
				if ( $first ) {
					$first = false;
				} else {
					$out[] = $sep;
				}
				$out[] = $node;
			}
		}
		$out[] = $end;
		return new PPNode_Hash_Array( $out );
	}

	public function __toString() {
		return 'frame{}';
	}

	/**
	 * @param int|bool $level
	 * @return string|bool
	 */
	public function getPDBK( $level = false ) {
		if ( $level === false ) {
			return $this->title->getPrefixedDBkey();
		} else {
			return isset( $this->titleCache[$level] ) ? $this->titleCache[$level] : false;
		}
	}

	/**
	 * @return array
	 */
	public function getArguments() {
		return [];
	}

	/**
	 * @return array
	 */
	public function getNumberedArguments() {
		return [];
	}

	/**
	 * @return array
	 */
	public function getNamedArguments() {
		return [];
	}

	/**
	 * Returns true if there are no arguments in this frame
	 *
	 * @return bool
	 */
	public function isEmpty() {
		return true;
	}

	/**
	 * @param int|string $name
	 * @return bool Always false in this implementation.
	 */
	public function getArgument( $name ) {
		return false;
	}

	/**
	 * Returns true if the infinite loop check is OK, false if a loop is detected
	 *
	 * @param Title $title
	 *
	 * @return bool
	 */
	public function loopCheck( $title ) {
		return !isset( $this->loopCheckHash[$title->getPrefixedDBkey()] );
	}
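
	// Illustrative use of loopCheck(), with $childFrame and $templateTitle
	// as assumed placeholders: before recursing into a template, the parser
	// checks whether the target title already appears in the frame's
	// ancestry. PPTemplateFrame_Hash::__construct() records each frame's
	// title in loopCheckHash, which is what this test consults.
	//
	//     if ( !$childFrame->loopCheck( $templateTitle ) ) {
	//         // Template loop detected: emit an error instead of recursing.
	//     }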

	/**
	 * Return true if the frame is a template frame
	 *
	 * @return bool
	 */
	public function isTemplate() {
		return false;
	}

	/**
	 * Get the title of this frame
	 *
	 * @return Title
	 */
	public function getTitle() {
		return $this->title;
	}

	/**
	 * Set the volatile flag
	 *
	 * @param bool $flag
	 */
	public function setVolatile( $flag = true ) {
		$this->volatile = $flag;
	}

	/**
	 * Get the volatile flag
	 *
	 * @return bool
	 */
	public function isVolatile() {
		return $this->volatile;
	}

	/**
	 * Set the TTL
	 *
	 * @param int $ttl
	 */
	public function setTTL( $ttl ) {
		if ( $ttl !== null && ( $this->ttl === null || $ttl < $this->ttl ) ) {
			$this->ttl = $ttl;
		}
	}

	/**
	 * Get the TTL
	 *
	 * @return int|null
	 */
	public function getTTL() {
		return $this->ttl;
	}
}

/**
 * Expansion frame with template arguments
 * @ingroup Parser
 */
// @codingStandardsIgnoreStart Squiz.Classes.ValidClassName.NotCamelCaps
class PPTemplateFrame_Hash extends PPFrame_Hash {
	// @codingStandardsIgnoreEnd

	public $numberedArgs, $namedArgs, $parent;
	public $numberedExpansionCache, $namedExpansionCache;

	/**
	 * @param Preprocessor $preprocessor
	 * @param bool|PPFrame $parent
	 * @param array $numberedArgs
	 * @param array $namedArgs
	 * @param bool|Title $title
	 */
	public function __construct( $preprocessor, $parent = false, $numberedArgs = [],
		$namedArgs = [], $title = false
	) {
		parent::__construct( $preprocessor );

		$this->parent = $parent;
		$this->numberedArgs = $numberedArgs;
		$this->namedArgs = $namedArgs;
		$this->title = $title;
		$pdbk = $title ? $title->getPrefixedDBkey() : false;
		$this->titleCache = $parent->titleCache;
		$this->titleCache[] = $pdbk;
		$this->loopCheckHash = /*clone*/ $parent->loopCheckHash;
		if ( $pdbk !== false ) {
			$this->loopCheckHash[$pdbk] = true;
		}
		$this->depth = $parent->depth + 1;
		$this->numberedExpansionCache = $this->namedExpansionCache = [];
	}

	public function __toString() {
		$s = 'tplframe{';
		$first = true;
		$args = $this->numberedArgs + $this->namedArgs;
		foreach ( $args as $name => $value ) {
			if ( $first ) {
				$first = false;
			} else {
				$s .= ', ';
			}
			$s .= "\"$name\":\"" .
				str_replace( '"', '\\"', $value->__toString() ) . '"';
		}
		$s .= '}';
		return $s;
	}

	/**
	 * @throws MWException
	 * @param string|int $key
	 * @param string|PPNode $root
	 * @param int $flags
	 * @return string
	 */
	public function cachedExpand( $key, $root, $flags = 0 ) {
		if ( isset( $this->parent->childExpansionCache[$key] ) ) {
			return $this->parent->childExpansionCache[$key];
		}
		$retval = $this->expand( $root, $flags );
		if ( !$this->isVolatile() ) {
			$this->parent->childExpansionCache[$key] = $retval;
		}
		return $retval;
	}
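
	// A sketch of the caching contract for two expansions of the same node
	// under one parent frame: the first call computes the text and stores it
	// in the parent's childExpansionCache, the second returns it directly,
	// unless setVolatile() has marked this frame as uncacheable.
	//
	//     $a = $childFrame->cachedExpand( 'key', $node ); // computed
	//     $b = $childFrame->cachedExpand( 'key', $node ); // served from cache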

	/**
	 * Returns true if there are no arguments in this frame
	 *
	 * @return bool
	 */
	public function isEmpty() {
		return !count( $this->numberedArgs ) && !count( $this->namedArgs );
	}

	/**
	 * @return array
	 */
	public function getArguments() {
		$arguments = [];
		foreach ( array_merge(
			array_keys( $this->numberedArgs ),
			array_keys( $this->namedArgs ) ) as $key ) {
			$arguments[$key] = $this->getArgument( $key );
		}
		return $arguments;
	}

	/**
	 * @return array
	 */
	public function getNumberedArguments() {
		$arguments = [];
		foreach ( array_keys( $this->numberedArgs ) as $key ) {
			$arguments[$key] = $this->getArgument( $key );
		}
		return $arguments;
	}

	/**
	 * @return array
	 */
	public function getNamedArguments() {
		$arguments = [];
		foreach ( array_keys( $this->namedArgs ) as $key ) {
			$arguments[$key] = $this->getArgument( $key );
		}
		return $arguments;
	}

	/**
	 * @param int $index
	 * @return string|bool
	 */
	public function getNumberedArgument( $index ) {
		if ( !isset( $this->numberedArgs[$index] ) ) {
			return false;
		}
		if ( !isset( $this->numberedExpansionCache[$index] ) ) {
			# No trimming for unnamed arguments
			$this->numberedExpansionCache[$index] = $this->parent->expand(
				$this->numberedArgs[$index],
				PPFrame::STRIP_COMMENTS
			);
		}
		return $this->numberedExpansionCache[$index];
	}
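
	// Illustrative lookup path for {{{1}}} inside a template: the argument
	// node captured by newChild() is expanded in the *parent* frame and then
	// memoized, so repeated uses of the same parameter expand only once.
	//
	//     $value = $tplFrame->getNumberedArgument( 1 ); // string, or false if unset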

	/**
	 * @param string $name
	 * @return string|bool
	 */
	public function getNamedArgument( $name ) {
		if ( !isset( $this->namedArgs[$name] ) ) {
			return false;
		}
		if ( !isset( $this->namedExpansionCache[$name] ) ) {
			# Trim named arguments post-expand, for backwards compatibility
			$this->namedExpansionCache[$name] = trim(
				$this->parent->expand( $this->namedArgs[$name], PPFrame::STRIP_COMMENTS ) );
		}
		return $this->namedExpansionCache[$name];
	}

	/**
	 * @param int|string $name
	 * @return string|bool
	 */
	public function getArgument( $name ) {
		$text = $this->getNumberedArgument( $name );
		if ( $text === false ) {
			$text = $this->getNamedArgument( $name );
		}
		return $text;
	}

	/**
	 * Return true if the frame is a template frame
	 *
	 * @return bool
	 */
	public function isTemplate() {
		return true;
	}

	public function setVolatile( $flag = true ) {
		parent::setVolatile( $flag );
		$this->parent->setVolatile( $flag );
	}

	public function setTTL( $ttl ) {
		parent::setTTL( $ttl );
		$this->parent->setTTL( $ttl );
	}
}

/**
 * Expansion frame with custom arguments
 * @ingroup Parser
 */
// @codingStandardsIgnoreStart Squiz.Classes.ValidClassName.NotCamelCaps
class PPCustomFrame_Hash extends PPFrame_Hash {
	// @codingStandardsIgnoreEnd

	public $args;

	public function __construct( $preprocessor, $args ) {
		parent::__construct( $preprocessor );
		$this->args = $args;
	}

	public function __toString() {
		$s = 'cstmframe{';
		$first = true;
		foreach ( $this->args as $name => $value ) {
			if ( $first ) {
				$first = false;
			} else {
				$s .= ', ';
			}
			$s .= "\"$name\":\"" .
				str_replace( '"', '\\"', $value->__toString() ) . '"';
		}
		$s .= '}';
		return $s;
	}

	/**
	 * @return bool
	 */
	public function isEmpty() {
		return !count( $this->args );
	}

	/**
	 * @param int|string $index
	 * @return string|bool
	 */
	public function getArgument( $index ) {
		if ( !isset( $this->args[$index] ) ) {
			return false;
		}
		return $this->args[$index];
	}

	public function getArguments() {
		return $this->args;
	}
}

/**
 * @ingroup Parser
 */
// @codingStandardsIgnoreStart Squiz.Classes.ValidClassName.NotCamelCaps
class PPNode_Hash_Tree implements PPNode {
	// @codingStandardsIgnoreEnd

	public $name;

	/**
	 * The store array for children of this node. It is "raw" in the sense that
	 * nodes are two-element arrays ("descriptors") rather than PPNode_Hash_*
	 * objects.
	 */
	private $rawChildren;

	/**
	 * The store array for the siblings of this node, including this node itself.
	 */
	private $store;

	/**
	 * The index into $this->store which contains the descriptor of this node.
	 */
	private $index;

	/**
	 * The offset of the name within descriptors, used in some places for
	 * readability.
	 */
	const NAME = 0;

	/**
	 * The offset of the child list within descriptors, used in some places for
	 * readability.
	 */
	const CHILDREN = 1;

	/**
	 * Construct an object using the data from $store[$index]. The rest of the
	 * store array can be accessed via getNextSibling().
	 *
	 * @param array $store
	 * @param integer $index
	 */
	public function __construct( array $store, $index ) {
		$this->store = $store;
		$this->index = $index;
		list( $this->name, $this->rawChildren ) = $this->store[$index];
	}
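
	// A sketch of the store layout this constructor consumes, with invented
	// values: scalars are text nodes, and two-element arrays are descriptors
	// of the form [ name, children ].
	//
	//     $store = [
	//         'plain text',             // text node
	//         [ 'template', [           // descriptor: name, child store
	//             [ 'title', [ 'Foo' ] ],
	//         ] ],
	//     ];
	//     $node = new PPNode_Hash_Tree( $store, 1 ); // wraps the template node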

	/**
	 * Construct an appropriate PPNode_Hash_* object with a class that depends
	 * on what is at the relevant store index.
	 *
	 * @param array $store
	 * @param integer $index
	 * @return PPNode_Hash_Tree|PPNode_Hash_Attr|PPNode_Hash_Text
	 */
	public static function factory( array $store, $index ) {
		if ( !isset( $store[$index] ) ) {
			return false;
		}

		$descriptor = $store[$index];
		if ( is_string( $descriptor ) ) {
			$class = 'PPNode_Hash_Text';
		} elseif ( is_array( $descriptor ) ) {
			if ( $descriptor[self::NAME][0] === '@' ) {
				$class = 'PPNode_Hash_Attr';
			} else {
				$class = 'PPNode_Hash_Tree';
			}
		} else {
			throw new MWException( __METHOD__ . ': invalid node descriptor' );
		}
		return new $class( $store, $index );
	}
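
	// Dispatch sketch for factory(), using the invented $store from the
	// sketch above: index 0 holds a string, so a PPNode_Hash_Text proxy is
	// returned; index 1 holds a descriptor whose name does not start with
	// '@', so the result is a PPNode_Hash_Tree. A name such as '@title'
	// would yield a PPNode_Hash_Attr instead.
	//
	//     $text = PPNode_Hash_Tree::factory( $store, 0 ); // PPNode_Hash_Text
	//     $tree = PPNode_Hash_Tree::factory( $store, 1 ); // PPNode_Hash_Tree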

	/**
	 * Convert a node to XML, for debugging
	 */
	public function __toString() {
		$inner = '';
		$attribs = '';
		for ( $node = $this->getFirstChild(); $node; $node = $node->getNextSibling() ) {
			if ( $node instanceof PPNode_Hash_Attr ) {
				$attribs .= ' ' . $node->name . '="' . htmlspecialchars( $node->value ) . '"';
			} else {
				$inner .= $node->__toString();
			}
		}
		if ( $inner === '' ) {
			return "<{$this->name}$attribs/>";
		} else {
			return "<{$this->name}$attribs>$inner</{$this->name}>";
		}
	}

	/**
	 * @return PPNode_Hash_Array
	 */
	public function getChildren() {
		$children = [];
		foreach ( $this->rawChildren as $i => $child ) {
			$children[] = self::factory( $this->rawChildren, $i );
		}
		return new PPNode_Hash_Array( $children );
	}

	/**
	 * Get the first child, or false if there is none. Note that this will
	 * return a temporary proxy object: different instances will be returned
	 * if this is called more than once on the same node.
	 *
	 * @return PPNode_Hash_Tree|PPNode_Hash_Attr|PPNode_Hash_Text|boolean
	 */
	public function getFirstChild() {
		if ( !isset( $this->rawChildren[0] ) ) {
			return false;
		} else {
			return self::factory( $this->rawChildren, 0 );
		}
	}
	/**
	 * Get the next sibling, or false if there is none. Note that this will
	 * return a temporary proxy object: different instances will be returned
	 * if this is called more than once on the same node.
	 *
	 * @return PPNode_Hash_Tree|PPNode_Hash_Attr|PPNode_Hash_Text|boolean
	 */
	public function getNextSibling() {
		return self::factory( $this->store, $this->index + 1 );
	}

	/**
	 * Get an array of the children with a given node name
	 *
	 * @param string $name
	 * @return PPNode_Hash_Array
	 */
	public function getChildrenOfType( $name ) {
		$children = [];
		foreach ( $this->rawChildren as $i => $child ) {
			if ( is_array( $child ) && $child[self::NAME] === $name ) {
				$children[] = self::factory( $this->rawChildren, $i );
			}
		}
		return new PPNode_Hash_Array( $children );
	}

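	// Usage sketch (illustrative, not from the original source): collecting
	// the <part> children of a template node and walking the resulting list:
	//
	//   $parts = $node->getChildrenOfType( 'part' );
	//   for ( $i = 0; $i < $parts->getLength(); $i++ ) {
	//       $part = $parts->item( $i );
	//       // each $part is a PPNode_Hash_Tree proxy into the same store
	//   }
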
	/**
	 * Get the raw child array. For internal use.
	 * @return array
	 */
	public function getRawChildren() {
		return $this->rawChildren;
	}

	/**
	 * @return bool Always false, since a tree node is not a node list
	 */
	public function getLength() {
		return false;
	}

	/**
	 * @param int $i
	 * @return bool Always false, since a tree node is not a node list
	 */
	public function item( $i ) {
		return false;
	}

	/**
	 * @return string
	 */
	public function getName() {
		return $this->name;
	}

	/**
	 * Split a "<part>" node into an associative array containing:
	 *  - name          PPNode name
	 *  - index         String index
	 *  - value         PPNode value
	 *
	 * @throws MWException
	 * @return array
	 */
	public function splitArg() {
		return self::splitRawArg( $this->rawChildren );
	}

	/**
	 * Like splitArg() but for a raw child array. For internal use only.
	 *
	 * @param array $children
	 * @return array
	 */
	public static function splitRawArg( array $children ) {
		$bits = [];
		foreach ( $children as $i => $child ) {
			if ( !is_array( $child ) ) {
				continue;
			}
			if ( $child[self::NAME] === 'name' ) {
				$bits['name'] = new self( $children, $i );
				if ( isset( $child[self::CHILDREN][0][self::NAME] )
					&& $child[self::CHILDREN][0][self::NAME] === '@index'
				) {
					$bits['index'] = $child[self::CHILDREN][0][self::CHILDREN][0];
				}
			} elseif ( $child[self::NAME] === 'value' ) {
				$bits['value'] = new self( $children, $i );
			}
		}

		if ( !isset( $bits['name'] ) ) {
			throw new MWException( 'Invalid brace node passed to ' . __METHOD__ );
		}
		if ( !isset( $bits['index'] ) ) {
			$bits['index'] = '';
		}
		return $bits;
	}

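	// Illustrative sketch (the exact array shapes are assumptions based on
	// the node store format described in the file header): a named parameter
	// written as "|bar=baz" would produce a <part> child array roughly like
	//
	//   $children = [ [ 'name', [ 'bar' ] ], '=', [ 'value', [ 'baz' ] ] ];
	//   $bits = PPNode_Hash_Tree::splitRawArg( $children );
	//   // $bits['name'] and $bits['value'] are PPNode_Hash_Tree proxies;
	//   // $bits['index'] === '' because the parameter is named.
	//
	// A positional parameter instead carries an @index attribute on the name
	// node, e.g. [ 'name', [ [ '@index', [ '1' ] ] ] ], in which case
	// $bits['index'] would be the string '1'.
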
	/**
	 * Split an "<ext>" node into an associative array containing name, attr,
	 * inner and close. All values in the resulting array are PPNodes. Inner
	 * and close are optional.
	 *
	 * @throws MWException
	 * @return array
	 */
	public function splitExt() {
		return self::splitRawExt( $this->rawChildren );
	}

	/**
	 * Like splitExt() but for a raw child array. For internal use only.
	 *
	 * @param array $children
	 * @return array
	 */
	public static function splitRawExt( array $children ) {
		$bits = [];
		foreach ( $children as $i => $child ) {
			if ( !is_array( $child ) ) {
				continue;
			}
			switch ( $child[self::NAME] ) {
			case 'name':
				$bits['name'] = new self( $children, $i );
				break;
			case 'attr':
				$bits['attr'] = new self( $children, $i );
				break;
			case 'inner':
				$bits['inner'] = new self( $children, $i );
				break;
			case 'close':
				$bits['close'] = new self( $children, $i );
				break;
			}
		}
		if ( !isset( $bits['name'] ) ) {
			throw new MWException( 'Invalid ext node passed to ' . __METHOD__ );
		}
		return $bits;
	}

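	// Illustrative sketch (assumed shapes, not from the original source): an
	// extension tag such as <ref name="a">text</ref> would yield an <ext>
	// child array roughly like
	//
	//   $children = [
	//       [ 'name', [ 'ref' ] ],
	//       [ 'attr', [ ' name="a"' ] ],
	//       [ 'inner', [ 'text' ] ],
	//       [ 'close', [ '</ref>' ] ],
	//   ];
	//   $bits = PPNode_Hash_Tree::splitRawExt( $children );
	//   // $bits has proxies under 'name', 'attr', 'inner' and 'close';
	//   // a self-closing tag would omit 'inner' and 'close'.
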
	/**
	 * Split an "<h>" node
	 *
	 * @throws MWException
	 * @return array
	 */
	public function splitHeading() {
		if ( $this->name !== 'h' ) {
			throw new MWException( 'Invalid h node passed to ' . __METHOD__ );
		}
		return self::splitRawHeading( $this->rawChildren );
	}

	/**
	 * Like splitHeading() but for a raw child array. For internal use only.
	 *
	 * @param array $children
	 * @return array
	 */
	public static function splitRawHeading( array $children ) {
		$bits = [];
		foreach ( $children as $i => $child ) {
			if ( !is_array( $child ) ) {
				continue;
			}
			if ( $child[self::NAME] === '@i' ) {
				$bits['i'] = $child[self::CHILDREN][0];
			} elseif ( $child[self::NAME] === '@level' ) {
				$bits['level'] = $child[self::CHILDREN][0];
			}
		}
		if ( !isset( $bits['i'] ) ) {
			throw new MWException( 'Invalid h node passed to ' . __METHOD__ );
		}
		return $bits;
	}

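	// Illustrative sketch (assumed shapes): a heading written as "== Foo =="
	// would yield an <h> child array carrying the @level and @i attributes
	// alongside the raw heading text, roughly
	//
	//   $children = [ [ '@level', [ '2' ] ], [ '@i', [ '1' ] ], '== Foo ==' ];
	//   $bits = PPNode_Hash_Tree::splitRawHeading( $children );
	//   // $bits === [ 'level' => '2', 'i' => '1' ]
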
	/**
	 * Split a "<template>" or "<tplarg>" node
	 *
	 * @throws MWException
	 * @return array
	 */
	public function splitTemplate() {
		return self::splitRawTemplate( $this->rawChildren );
	}

	/**
	 * Like splitTemplate() but for a raw child array. For internal use only.
	 *
	 * @param array $children
	 * @return array
	 */
	public static function splitRawTemplate( array $children ) {
		$parts = [];
		$bits = [ 'lineStart' => '' ];
		foreach ( $children as $i => $child ) {
			if ( !is_array( $child ) ) {
				continue;
			}
			switch ( $child[self::NAME] ) {
			case 'title':
				$bits['title'] = new self( $children, $i );
				break;
			case 'part':
				$parts[] = new self( $children, $i );
				break;
			case '@lineStart':
				$bits['lineStart'] = '1';
				break;
			}
		}
		if ( !isset( $bits['title'] ) ) {
			throw new MWException( 'Invalid node passed to ' . __METHOD__ );
		}
		$bits['parts'] = new PPNode_Hash_Array( $parts );
		return $bits;
	}

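	// Illustrative sketch (assumed shapes): "{{foo|bar}}" at the start of a
	// line would yield a <template> child array roughly like
	//
	//   $children = [
	//       [ '@lineStart', [ '1' ] ],
	//       [ 'title', [ 'foo' ] ],
	//       [ 'part', [ [ 'name', [ [ '@index', [ '1' ] ] ] ], [ 'value', [ 'bar' ] ] ] ],
	//   ];
	//   $bits = PPNode_Hash_Tree::splitRawTemplate( $children );
	//   // 'title' is a proxy, 'parts' a PPNode_Hash_Array of <part> proxies,
	//   // and 'lineStart' === '1'.
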
}

/**
 * @ingroup Parser
 */
// @codingStandardsIgnoreStart Squiz.Classes.ValidClassName.NotCamelCaps
class PPNode_Hash_Text implements PPNode {
	// @codingStandardsIgnoreEnd

	public $value;
	private $store, $index;

	/**
	 * Construct an object using the data from $store[$index]. The rest of the
	 * store array can be accessed via getNextSibling().
	 *
	 * @param array $store
	 * @param integer $index
	 */
	public function __construct( array $store, $index ) {
		$this->value = $store[$index];
		if ( !is_scalar( $this->value ) ) {
			throw new MWException( __CLASS__ . ' given object instead of string' );
		}
		$this->store = $store;
		$this->index = $index;
	}

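	// Illustrative note (a sketch, not from the original source): a text node
	// keeps a reference to the enclosing store so that sibling traversal can
	// continue past it, e.g.
	//
	//   $store = [ 'hello', [ 'h', [ '== Foo ==' ] ] ];
	//   $text = new PPNode_Hash_Text( $store, 0 );
	//   echo $text->value;       // 'hello'
	//   $text->getNextSibling(); // a PPNode_Hash_Tree proxy for index 1
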
	public function __toString() {
		return htmlspecialchars( $this->value );
	}

	public function getNextSibling() {
		return PPNode_Hash_Tree::factory( $this->store, $this->index + 1 );
	}

	public function getChildren() {
		return false;
	}

	public function getFirstChild() {
		return false;
	}

	public function getChildrenOfType( $name ) {
		return false;
	}

	public function getLength() {
		return false;
	}

	public function item( $i ) {
		return false;
	}

	public function getName() {
		return '#text';
	}

	public function splitArg() {
		throw new MWException( __METHOD__ . ': not supported' );
	}

	public function splitExt() {
		throw new MWException( __METHOD__ . ': not supported' );
	}

	public function splitHeading() {
		throw new MWException( __METHOD__ . ': not supported' );
	}
}

/**
 * @ingroup Parser
 */
// @codingStandardsIgnoreStart Squiz.Classes.ValidClassName.NotCamelCaps
class PPNode_Hash_Array implements PPNode {
	// @codingStandardsIgnoreEnd

	public $value;

	public function __construct( $value ) {
		$this->value = $value;
	}

	public function __toString() {
		return var_export( $this, true );
	}

	public function getLength() {
		return count( $this->value );
	}

	public function item( $i ) {
		return $this->value[$i];
	}

	public function getName() {
		return '#nodelist';
	}

	public function getNextSibling() {
		return false;
	}

	public function getChildren() {
		return false;
	}

	public function getFirstChild() {
		return false;
	}

	public function getChildrenOfType( $name ) {
		return false;
	}

	public function splitArg() {
		throw new MWException( __METHOD__ . ': not supported' );
	}

	public function splitExt() {
		throw new MWException( __METHOD__ . ': not supported' );
	}

	public function splitHeading() {
		throw new MWException( __METHOD__ . ': not supported' );
	}
}

/**
 * @ingroup Parser
 */
// @codingStandardsIgnoreStart Squiz.Classes.ValidClassName.NotCamelCaps
class PPNode_Hash_Attr implements PPNode {
	// @codingStandardsIgnoreEnd

	public $name, $value;
	private $store, $index;

	/**
	 * Construct an object using the data from $store[$index]. The rest of the
	 * store array can be accessed via getNextSibling().
	 *
	 * @param array $store
	 * @param integer $index
	 */
	public function __construct( array $store, $index ) {
		$descriptor = $store[$index];
		if ( $descriptor[PPNode_Hash_Tree::NAME][0] !== '@' ) {
			throw new MWException( __METHOD__ . ': invalid name in attribute descriptor' );
		}
		$this->name = substr( $descriptor[PPNode_Hash_Tree::NAME], 1 );
		$this->value = $descriptor[PPNode_Hash_Tree::CHILDREN][0];
		$this->store = $store;
		$this->index = $index;
	}

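	// Illustrative sketch following the layout described in the file header:
	// an attribute descriptor is a child whose name starts with "@" and whose
	// single child is the attribute's text value, so for the store
	//
	//   $store = [ [ '@index', [ '1' ] ] ];
	//   $attr = new PPNode_Hash_Attr( $store, 0 );
	//   $attr->getName(); // 'index' (the leading "@" is stripped)
	//   $attr->value;     // '1'
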
	public function __toString() {
		return "<@{$this->name}>" . htmlspecialchars( $this->value ) . "</@{$this->name}>";
	}

	public function getName() {
		return $this->name;
	}

	public function getNextSibling() {
		return PPNode_Hash_Tree::factory( $this->store, $this->index + 1 );
	}

	public function getChildren() {
		return false;
	}

	public function getFirstChild() {
		return false;
	}

	public function getChildrenOfType( $name ) {
		return false;
	}

	public function getLength() {
		return false;
	}

	public function item( $i ) {
		return false;
	}

	public function splitArg() {
		throw new MWException( __METHOD__ . ': not supported' );
	}

	public function splitExt() {
		throw new MWException( __METHOD__ . ': not supported' );
	}

	public function splitHeading() {
		throw new MWException( __METHOD__ . ': not supported' );
	}
}