Parsoid uses <figure-inline> for inline figures. The intention is to
transition core to use <figure> and <figure-inline> as well in the
future (T118517). As a first step (and to keep Parsoid and the legacy
parser in sync) allow <figure-inline> attributes in the Sanitizer.
Note that this does not allow <figure-inline> in wikitext,
since neither <figure> nor <figure-inline> is on the
getRecognizedTagData() list.
Bug: T51097
Bug: T118517
Bug: T118520
Change-Id: I5248717739bef0f7106c2bcf0b4a15acbc3c9a68
We synchronized the allowed attributes for <video> in
4e7483ffd3 but then decided to use the
<audio> tag for audio media in Parsoid commit
5f3dbdc8794f2605101609f28e679df29a0387bc and updated its Sanitizer,
but never updated core to match.
Bug: T163583
Bug: T133673
Change-Id: Iefcbead2f335949eb45e2880861fd9473b810367
If {{REVISIONID}} results in a re-parse, that re-parse will be post-send
unless the user has canonical parser options and will need the output for
page views anyway (e.g. the refresh after editing).
Also make getPreparedEdit() allow lazy-loading of the parser output via
a callback. A magic __get() method handles objects created the new way
but accessed by other code the old way.
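
A minimal sketch of the lazy-loading idea, written in Python rather than
PHP. The class and attribute names are hypothetical and are not the
MediaWiki API: the object takes a callback instead of a ready-made
output, and a fallback attribute hook (Python's __getattr__, playing the
role of a magic __get()) serves callers that still read the old field.

```python
class PreparedEdit:
    """Hypothetical stand-in for a prepared-edit object with lazy output."""

    def __init__(self, output_callback):
        # Store the callback only; nothing is parsed yet.
        self._output_callback = output_callback
        self._output = None
        self._loaded = False

    def get_output(self):
        # New-style access: run the callback at most once and cache the result.
        if not self._loaded:
            self._output = self._output_callback()
            self._loaded = True
        return self._output

    def __getattr__(self, name):
        # Invoked only when normal attribute lookup fails, so it catches
        # old-style reads of the "output" field and loads it on demand.
        if name == "output":
            return self.get_output()
        raise AttributeError(name)
```

With this shape, code that never touches the output never pays for the
parse, and code that reads the old field keeps working unchanged.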
Bug: T216306
Change-Id: I2012437c45dd605a6c0868dea47cf43dc67061d8
This avoids a double parse when the edit stash is not used,
which can be confirmed via the SaveParse log for a page
using {{REVISIONID}} when edit stashing is disabled. This
now matches the reuse for the edit stash hit case.
Change-Id: I405c39d4d7ac04e39fbdfe400f73238b734c7833
HHVM does not support variadic arguments with type hints. This is
mostly not a big problem, because we can just drop the type hint, but
for some reason PHPUnit adds a type hint of "array" when it creates
mocks, so a class with a variadic method can't be mocked (at least in
some cases). As such, I left alone all the classes that seem like
someone might like to mock them, like Title and User. If anyone wants
to mock them in the future, they'll have to switch back to
func_get_args(). Some of the changes are definitely safe, like
functions and test classes.
In most cases, func_get_args() (and/or func_get_arg(), func_num_args())
were only present because the code was written before we required PHP
5.6, and writing them as variadic functions is strictly superior. In
some cases I left them alone, aside from HHVM compatibility:
* Forwarding all arguments to another function. It's useful to keep
func_get_args() here where we want to keep the list of expected
arguments and their meanings in the function signature line for
documentation purposes, but don't want to copy-paste a long line of
argument names.
* Handling deprecated calling conventions.
* One or two miscellaneous cases where we're basically using the
arguments individually but want to use them as an array as well for
some reason.
Change-Id: I066ec95a7beb7c0665146195a08e7cce1222c788
This code is surprisingly little changed since I added the class in
November 2003, and needs some modernisation.
* Remove the "linked" option, unused since 1.21. Similarly, make the
"match-whole" option implied. This allows the regexes to be
simplified. Nothing will be broken, according to CodeSearch.
* Instead of ucfirst(), use the canonical month name from the language.
This will work with e.g. French which does not capitalise month names.
* Stop caching DateFormatter instances in APC. Caching was added
in 2005 when initialisation was being done on every request, but now
it is only needed when parsing a page with {{#formatdate}}, which is
rare, and the constructor overhead is only 200µs after Language
object data initialisation. Instead, use an in-process cache via a
factory service.
* Add docs and extra tests.
* Remove todo note obsolete since 38 minutes after the original commit.
* Rename many variables.
* Use double-slash comments
* Don't store the Language object, just get arrays.
* Use mb_strtolower() instead of Language::lc() -- any customisation of
Language::lc() would break PCRE case-insensitive matching.
* Use named subpatterns instead of "keys"
* Remove the ISO1/ISO2 distinction, the only difference was linking.
* Use closure variables instead of temporary object members
Change-Id: I25fb1203dba2930724d7bc28ad0d51f59f88e1ea
Some languages have date abbreviations that contain ".", which allows
the non-ISO regexes to match an input string containing an invalid month
name. Use preg_quote() to avoid this.
Also fix the error handling case of makeIsoMonth(). If the input date is
invalid, don't try to wrap it in a date span, since that's semantically
incorrect and may also access unset members of $bits, causing a notice.
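
The escaping problem can be illustrated with a small Python sketch, where
re.escape() plays the role of preg_quote(). The function names and the
abbreviation list are hypothetical; the real regexes in DateFormatter are
more involved.

```python
import re


def build_month_pattern_unescaped(abbreviations):
    # Buggy variant: a "." inside an abbreviation is treated as a regex
    # wildcard, so it matches any character in that position.
    return re.compile(r"^(?:" + "|".join(abbreviations) + r")$", re.IGNORECASE)


def build_month_pattern(abbreviations):
    # Fixed variant: escape each name so "." only matches a literal dot.
    escaped = [re.escape(name) for name in abbreviations]
    return re.compile(r"^(?:" + "|".join(escaped) + r")$", re.IGNORECASE)


# Assumed sample data: abbreviations that end in "."
abbreviations = ["janv.", "avr."]
loose = build_month_pattern_unescaped(abbreviations)
strict = build_month_pattern(abbreviations)
```

Here loose.match("janvX") succeeds even though "janvX" is not a month
name, while strict.match("janvX") returns None and strict.match("janv.")
still matches.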
Bug: T220563
Change-Id: Ib2b3fb315dc93b60de595d3c445637f6bcc78a1a
PhanTypeExpectedObjectPropAccess was flagged by phan in the
DatabaseMysqli and Preprocessor_Hash classes.
In Database, the $conn property might be a standard object, such as
`\mysqli`, which is not a resource.
In Preprocessor, phan was getting confused and thinking
PPDStack::getCurrentPart() was returning bool, and not a PPDPart object.
Adding explicit documentation about the return value fixed that.
Change-Id: I0a3aa219693da5cb46ff9c0936841ed740c6968a
This code is functionally identical, but less error prone (not so easy
to forget or mix these numerical indexes).
This patch happens to touch the Parser, which might be a bit scary. We
can remove this file from this patch if you prefer.
Change-Id: I8cbe3a9a6725d1c42b86e67678c1af15fbc5961a
In T208070 / I120ca25a77b7b933de4afddd1d458e36a95e26da we added a
check whether we were processing the last line of input, in order
to avoid emitting extra trailing newlines. But if the number of
input lines is large, StringUtils::explode() will, for efficiency,
return an iterator which doesn't implement Countable.
I22eebb70af1b19d7c25241fc78bfcced4470e78a fixed this, but at the
cost of scanning the string twice: once just to count the number
of newlines before we begin to iterate over the lines.
This patch uses Iterator::valid() to determine if we're on the
last iteration without having to scan the string twice.
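
A rough Python sketch of single-pass last-item detection. Python
iterators have no valid() method, so this version buffers one item ahead
instead; the helper names are hypothetical and this is not the Parser
code.

```python
def lines_with_last_flag(lines):
    """Yield (line, is_last) pairs from any iterable, in a single pass."""
    iterator = iter(lines)
    try:
        current = next(iterator)
    except StopIteration:
        return  # empty input: nothing to yield
    for upcoming in iterator:
        # Another item exists, so the buffered one is not the last.
        yield current, False
        current = upcoming
    # The iterator is exhausted, so the buffered item is the last one.
    yield current, True


def join_lines(lines):
    # Emit a newline after every line except the last, without counting
    # the lines up front.
    out = []
    for line, is_last in lines_with_last_flag(lines):
        out.append(line)
        if not is_last:
            out.append("\n")
    return "".join(out)
```

Because nothing here needs the total number of lines, it works equally
well for lists and for one-shot iterators.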
Bug: T208070
Bug: T218817
Change-Id: I41a45266d266195aa6002d3854e018cacf052ca6
The previous fix for T218817 (I22eebb70af1b19d7c25241fc78bfcced4470e78a)
was a bit premature: we didn't notice that ExplodeIterator *also*
used a different Iterator::key() than ArrayIterator -- it used
the string position as a key, not the line number. Combined with
an inequality test for "not the last line", this meant that almost
every line was now the "last line" and we were missing a lot of needed
newlines.
Count the lines ourselves to fix the problem.
Bug: T208070
Bug: T218817
Change-Id: I55a2c4c0ec304292162c51aa88b206fea0142392
StringUtils::explode() returns an ExplodeIterator if the number of
separators is too high, which doesn't implement Countable.
So count the way that explode does.
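
As a sketch of what counting this way means, assuming it amounts to
counting separators and adding one (the helper name is hypothetical, and
Python's str.count() stands in for a PHP substring count):

```python
def count_pieces(separator, subject):
    # Splitting on a separator always yields one more piece than there
    # are non-overlapping separator occurrences.
    return subject.count(separator) + 1


# Assumed sample input with an empty line in the middle.
text = "a\nb\n\nc"
pieces = text.split("\n")
```

For this input, count_pieces("\n", text) and len(pieces) are both 4, so
the count is available without materialising the pieces.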
Bug: T218817
Change-Id: I22eebb70af1b19d7c25241fc78bfcced4470e78a
This now at least matches the function names even though what's actually
meant is more like 'block-level tag'. Section is a poor choice of name
since there are wikitext sections unrelated to this.
Change-Id: Ic83aff4d862800b778441c28884194480b7e7d96
HTML, generated by some infoboxes and perhaps other places, gets
stripped in a way that merges words together that should not be
merged. Add tr, th, and td to the list of tags that should force
word separation.
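
A simplified, regex-based Python sketch of the idea, not the actual
implementation: tags on a separating list are replaced by a space before
the remaining tags are dropped. Only tr, th, and td come from this
change; the function name and the rest are assumptions for illustration.

```python
import re

# Tags that should force a word break (only the three named above).
SEPARATING_TAGS = ("tr", "th", "td")

_SEPARATING_RE = re.compile(
    r"</?(?:" + "|".join(SEPARATING_TAGS) + r")\b[^>]*>", re.IGNORECASE
)
_ANY_TAG_RE = re.compile(r"<[^>]*>")


def strip_tags_keep_word_breaks(html):
    # 1. Turn separating tags into spaces so adjacent cells stay apart.
    text = _SEPARATING_RE.sub(" ", html)
    # 2. Drop every other tag without inserting anything.
    text = _ANY_TAG_RE.sub("", text)
    # 3. Collapse the whitespace introduced in step 1.
    return re.sub(r"\s+", " ", text).strip()
```

For "<table><tr><th>Name</th><td>Alice</td></tr></table>" this returns
"Name Alice", where stripping all tags alike would give "NameAlice".
Inline markup such as "<b>bo</b>ld" still collapses to "bold".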
Bug: T218001
Change-Id: Ib374339628b1f543ea4e07f24aa3e3b76f3117b5
This only applies to content namespaces for now, since elsewhere
the cost of vary-revision-id is much less of a concern.
The potential harm to page save time far outweighs whatever
use they have, which is almost entirely just hacks to check
for preview mode. These have nothing to do with the actual
revision ID nor timestamp itself. They simply check whether
the value is the empty string. Since this magic word still
only returns an empty string in preview mode, such checks
will keep working.
Bug: T137900
Depends-on: I1809354055513a5b9d9589e2d6acda7579af76e2
Change-Id: Ieff8423ae3804b42d264f630e1a029199abf5976
It's a temporary feature flag not included in any release, so just
remove it outright. The functionality will now always be enabled.
Bug: T205040
Change-Id: Ia9da82e6f6b2d270f1790a99fc8c35ad5e6aee5e
The addModuleScripts() methods were deprecated in 1.31 and 1.32
and are now removed.
The getModuleScripts() methods are now deprecated as well and always
return an empty array. To be removed in 1.34.
Depends on commits for bundled/wmf-deployed extensions that
remove the last few remaining callers to the deprecated functions
in: 3D, Collection, Flow, GlobalUserPage, and Wikibase.
Bug: T188689
Depends-On: If9f0bc6aef85117587fa1929f34f8861c8d80314
Depends-On: Ia8d41b97fbf6822f5f8f7ac889408acce1ac9a3a
Depends-On: I503b919739ea474ff33726815b0da55e2f7e2724
Depends-On: I236ef637fd03b810a46eb361e25067a037e9d183
Depends-On: I62e17779753b977a452cc0c9694947941e999cc3
Change-Id: I5a19b8f164ccf666485d2971202194b747f882df
I would argue that these comments do not add any information that
is not already there. Having them adds mental overhead, because
one needs to read both the comment and the next line of code to
understand that they say exactly the same thing. I find this more
distracting than helpful.
Change-Id: I39c98f25225947ebffdcc2fd8f0243e7a6c070d7