Encourage localization and factor out common code by taking a message
key as the first argument to ::addWarningMsg() instead of a wikitext
string. This also plays nicer with Parsoid by separating out the
localization code from the parse.
Bug: T293515
Change-Id: I6a7c04c67ac586ab00d4edcbb3d09485a7794e23
This is a uniform mechanism to access a number of bespoke boolean
flags in ParserOutput. It allows extensibility in core (by adding new
field names to ParserOutputFlags) without exposing new getter/setter
methods to Parsoid. It replaces the ParserOutput::{get,set}Flag()
interface which (a) doesn't allow access to certain flags, and (b) is
typically called with a string rather than a constant, and (c) has a
very generic name. (Note that Parser::setOutputFlag() already called
these "output flags".)
In the future we might unify the representation so that we store
everything in $mFlags and don't have explicit properties in
ParserOutput, but those representation details should be invisible to
the clients of this API. (We might also use a proper enumeration
for ParserOutputFlags, when PHP supports this.)
There is some overlap with ParserOutput::{get,set}ExtensionData(), but
I've left those methods as-is because (a) they allow for non-boolean
data, unlike the *Flag() methods, and (b) it seems worthwhile to
distingush properties set by extensions from properties used by core.
Code search:
https://codesearch.wmcloud.org/search/?q=%5BOo%5Dut%28put%29%3F%28%5C%28%5C%29%29%3F-%3E%28g%7Cs%29etFlag%5C%28&i=nope&files=&excludeFiles=&repos=
Bug: T292868
Change-Id: I39bc58d207836df6f328c54be9e3330719cebbeb
The ::getProperty() naming is too generic and doesn't clearly indicate
that these are "page properties" (which have their own table in the DB).
As part of refactoring a clean API out of ParserOutput which can be used
by Parsoid, clean up the naming here.
Soft-deprecation in this patch, there are a handful of external users
which need to be cleaned up before we hard-deprecate.
Bug: T287216
Change-Id: Ie963eea5aa0f0e984ced7c4dfa0fd65d57313cfa
Updates for the removal of the Revision class itself
and the various methods/hooks/variables removed in the
process, including:
- Update some documentation removing most references
to the Revision class and updating the MCR migration
notes to reflect the past tense for Revision methods.
- Change some capitalization from "Revision" to "revision"
to make it clear comments are about revisions in general,
not the Revision class in particular.
- Minor code tweaks including removing unused variables that
were around for the old hooks that were removed, and
removing the use of DeprecatablePropertyArray where no
longer needed for anything.
- Fix incorrect documentation for PageUpdater::getStatus(),
the status value changed a while ago to have revision-record
in addition to revision, and recently to only have the
revision-record, but ironically PageUpdater was never updated.
- Removed Parser::$mRevisionObject, used to be a Revision object
and was deprecated in 1.35, missed earlier because it was no
longer being set to Revision objects, always null.
- Add RevisionRecord typehints in DummyLinker to match those
in the corresponding Linker methods
This should be a no-op in terms of functionality.
Bug: T247143
Change-Id: I03bbb94fc29085855448780b1a5ad9063911ecc4
This has effectively been the case since 1.35; this just cleans up the
remaining code which assumed it still needed to explicitly call
Parser::firstCallInit() on a newly-constructed Parser.
Bug: T250444
Change-Id: I340947c721172f12ff413322b4283627c0b0b3a4
This is micro-optimization of closure code to avoid binding the closure
to $this where it is not needed.
Created by I25a17fb22b6b669e817317a0f45051ae9c608208
Change-Id: I0ffc6200f6c6693d78a3151cb8cea7dce7c21653
The PHP function is_numeric() returns true for numbers like '123.456'
and even '1.23e45'. However, it returns false for (string)NAN,
(string)INF, and (string)-INF (which are "NAN", "INF" and "-INF"
respectively). We can return the appropriate unicode characters for
the infinities to localize these/make them universal, and allow a
localization of the "Not a Number" message.
Make the corresponding change to Language::parseFormattedNumber() so
that its remains the inverse operation to ::formatNum().
Accept "NAN"/"INF"/"-INF" only when they stand alone in the string;
in the legacy case where text and numbers are intermingled, split
only on "traditional" numbers; I think we're more likely to find
INF/NAN "innocently" in the middle of text than we are to find it
as a "real" number.
Change-Id: I3ff227a4aac66fc938182dc9fb8a7b743e94faca
NumberFormatter handles exponential notation fine, and is_numeric
recognizes it, but some of our checks on the {{formatnum}} parser
function were a bit too strict.
Bug: T237467
Change-Id: I20c51da1e58bffeefba18237815541c1b6ccb415
* Update digitGroupingPattern to match CLDR 31: New versions of CLDR has
digit grouping pattern with decimal part. Update digitGroupingPattern
values in Message classes with this improved pattern.
Refer: http://unicode.org/reports/tr35/tr35-numbers.html
* Refer the following chart for the decimal patterns.
http://www.unicode.org/cldr/charts/31/by_type/numbers.number_formatting_patterns.html
* Uses PHP NumberFormatter class for the commafy implementation, which
is available in PHP 7.
* Some tests need to update to match the TR 35 spec
* The formatNum public method in Language.php is the preferred way to
use this feature. It does separator transformation and digit transformation
wherever applicable.
* Renamed the second param name for formatNum from noCommafy to noSeparators
* commafy method is deprecated and formatNum is preferred. Practically,
we are not just adding comma, but seperators according to the language.
Replaced some tests based on commafy methods with tests based on formatNum.
Note: The corresponding js implementation is not changed in this commit.
It would probably be a good idea to use globalize.js, which is also based
on the CLDR patterns.
Note: This patch preserves the existing off-by-one error in
$minimumGroupingDigits; T262500 will eventually fix this.
Bug: T167088
Co-Authored-By: C. Scott Ananian <cscott@cscott.net>
Change-Id: Ic721b9a91e78e4ef07040339d1006b7a90a910c0
The {{formatnum}} parser function can take anything, not just numeric
strings. We'd like to restrict Language::commafy() to operate only on
numeric strings, however (see T237467). Split the argument to the
{{formatnum}} parser function so that we only invoke
Language::commafy() on numeric strings. Add a tracking category so we
can (gradually) lint our content appropriately.
Bug: T237467
Change-Id: Ib6c832df1f69aa4579402701fad1f77e548291ee
This reverts commit c45ccd7ca8.
Reason for revert: Assuming that I6af7aeabbba fixes the real issue.
Change-Id: Ie1fc595a18e54f0c29b43740039cd7114d8e071e
This newly-added method returns `false` on error; the caller expects
it to return `null`.
Bug: T253725
Followup-To: If36b35391f7833a1aded8b5a0de706d44187d423
Change-Id: I6af7aeabbba9f95338497026fd08d9ae23f75c22
Reason for revert: issue arose again when deployed with wmf.34
Partial revert: keep the intended fix in Parser.php, revert
removal of fail-safe logic in CoreParserFunctions.hp
This reverts commit 2712cb8330.
Bug: T253725
Change-Id: I06266ca8bd29520b2c8f86c430d0f1e2d5dd20c0
Parser::getRevisionRecordObject() returns `null` if the revision is
missing, but it invokes ParserOptions::getCurrentRevisionRecordCallback()
(ie, Parser::statelessFetchRevisionRecord() by default) which returns
`false` as its error condition.
This reverts commit ae74a29af3, and instead
fixes the bug at its root.
Bug: T251952
Change-Id: If36b35391f7833a1aded8b5a0de706d44187d423
Private method, no need to worry about deprecation
Most of its uses in the class called Revision methods that were
identical to the RevisionRecord methods, and didn't need to change
to reflect the new type being returned.
Remove a use of Revision::getUserText
Bug: T249393
Bug: T250579
Change-Id: Ide0dcd01caee3d3388038e6f40edda25528f55d8
Done:
* Replace LanguageConverter::newConverter by LanguageConverterFactory::getLanguageConverter
* Remove LanguageConverter::newConverter from all subclasses
* Add LanguageConverterFactory integration tests which covers all languages by their code.
* Caching of LanguageConverters in factory
* Make all tests running (hope that's would be enough)
* Uncomment the deprecated functions.
* Rename FakeConverter to TrivialLanguageConverter
* Create ILanguageConverter to have shared ancestor
* Make the LanguageConverter class abstract.
* Create table with mapping between lang code and converter instead of using name convention
* ILanguageConverter @internal
* Clean up code
Change-Id: I0e4d77de0f44e18c19956a1ffd69d30e63cf51bf
Bug: T226833, T243332
These were discovered by setting `null_casts_as_any_type` to true in
phan, and filtering by `PhanTypeMismatchReturnNullable`. Of course there
are others, some of which are false positives, but we cannot suppress
them now (or the UnusedSuppressionPlugin will complain).
Change-Id: Ia8443e575c22f47a6d8c63038f4e7ac36815fc27
This works similarly to speculative rev IDs with {{REVISIONID}}.
Re-parses can be avoided if the page ID is correctly guessed.
Also make the {{PAGEID:X}} parser function set vary-page-id.
Bug: T226785
Change-Id: I0b19be45e6ddd6cde330bfcd09d243e4e5beda01
This can be used to avoid double parsed on save if the prior output
can be reused in-spite of involving a self content reference.
Change-Id: Idcd30a3fa3f7012dac76ce8bbf46625453ae331f
These global functions were deprecated in 1.34 and services made
available to replace them. See services below;
* wfFindFile() - MediaWikiServices::getInstance()->getRepoGroup()->findFile()
* wfLocalFind() - MediaWikiServices::getInstance()->getRepoGroup()->getLocalRepo()->newFile()
NOTES:
* wfFindFile() and wfLocalFind() usages in tests have been ignored
in this change per @Timo's comments about state of objects.
* includes/upload/UploadBase.php also maintained for now as it causes
some failures I don't fully understand, will investigate and handle
it in a follow up patch.
* Also, includes/MovePage.php
Change-Id: I9437494de003f40fbe591321da7b42d16bb732d6
HHVM does not support variadic arguments with type hints. This is
mostly not a big problem, because we can just drop the type hint, but
for some reason PHPUnit adds a type hint of "array" when it creates
mocks, so a class with a variadic method can't be mocked (at least in
some cases). As such, I left alone all the classes that seem like
someone might like to mock them, like Title and User. If anyone wants
to mock them in the future, they'll have to switch back to
func_get_args(). Some of the changes are definitely safe, like
functions and test classes.
In most cases, func_get_args() (and/or func_get_arg(), func_num_args() )
were only present because the code was written before we required PHP
5.6, and writing them as variadic functions is strictly superior. In
some cases I left them alone, aside from HHVM compatibility:
* Forwarding all arguments to another function. It's useful to keep
func_get_args() here where we want to keep the list of expected
arguments and their meanings in the function signature line for
documentation purposes, but don't want to copy-paste a long line of
argument names.
* Handling deprecated calling conventions.
* One or two miscellaneous cases where we're basically using the
arguments individually but want to use them as an array as well for
some reason.
Change-Id: I066ec95a7beb7c0665146195a08e7cce1222c788
This code is surprisingly little changed since I added the class in
November 2003, and needs some modernisation.
* Remove the "linked" option, unused since 1.21. Similarly, make the
"match-whole" option implied. This allows the regexes to be
simplified. Nothing will be broken, according to CodeSearch.
* Instead of ucfirst(), use the canonical month name from the language.
This will work with e.g. French which does not capitalise month names.
* Stop caching DateFormatter instances in APC. Caching was added
in 2005 when initialisation was being done on every request, but now
it is only needed when parsing a page with {{#formatdate}}, which is
rarely, and the constructor overhead is only 200µs after Language
object data initialisation. Instead, use an in-process cache via a
factory service.
* Add docs and extra tests.
* Remove todo note obsolete since 38 minutes after the original commit.
* Rename many variables.
* Use double-slash comments
* Don't store the Language object, just get arrays.
* Use mb_strtolower() instead of Language::lc() -- any customisation of
Language::lc() would break PCRE case-insensitive matching.
* Use named subpatterns instead of "keys"
* Remove the ISO1/ISO2 distinction, the only difference was linking.
* Use closure variables instead of temporary object members
Change-Id: I25fb1203dba2930724d7bc28ad0d51f59f88e1ea
I would argue that these comments do not add any information that
would not be there already. Having them adds mental overhead, because
one needs to read both the comment and the next line of code first to
understand they say the exact same. I don't find this helpful, but
more distracting.
Change-Id: I39c98f25225947ebffdcc2fd8f0243e7a6c070d7
bacd87e49 moved the displaytitle error message from the content to
outside of the content. Only the content is converted by the language
conversion. The error message outside of the content is not converted.
Therefor markNoConversion is not needed here anymore.
This change removes the -{R|...}- around the displaytitle in the error
message when the language converter is active.
Bug: T208249
Change-Id: Ieec43e9af045d19b0b7a82afb889e076b347eed1
Ever since 184658eb32, the output of a non-existing message will be
HTML safe, regardless of output format, so we can treat non-existing messages
exactly the same as messages that do exist.
The pre-existing "int keyword - non-existing message" parser test verifies
that no change in output has ocurred in this patch.
Change-Id: I0e32be14f1b420d7f222ac3c76e1cc266f912b69
The getConverterLanguage() method was added in March 2012 in commit
561424c266 as a workaround for a regression
in mediawiki 1.19. It was an indirection which checked the global variable
$wgBug34832TransitionalRollback to return a different converter language
for Chinese wikis.
When this temporary bugfix was reverted in January 2013 in commit
a3fbdaaa2c, the temporary global variable
was removed, but not the getConverterLanguage() indirection. Since then,
new code in the parser seems to have faithfully used getConverterLanguage()
instead of getTargetLanguage(), even though they are identical and the
need for getConverterLanguage() has long since passed.
Strike a small blow for elegant minimalism by removing the completely
unnecessary Parser::getConverterLanguage() indirection. Well, sort
of: since this blight has been slowly growing inside Parser.php for
so long, we need to deprecate getConverterLanguage() first just in
case any external dependency has been infected. Next release we
can finally excise the unnecessary method.
Change-Id: I567c29c9c7699020955699b76cbe8578d02e2fe6