1050 commits

Author SHA1 Message Date
Arlo Breault
b12f5d8e20 Sync up with Parsoid parserTests.txt
This now aligns with Parsoid commit 67180924cc1d78eed9b300b6f867498da51c35bc

Change-Id: Icb1c8c3cc4e19db9fa5c93b62a6afadb9f6676dc
2020-12-18 12:30:25 -05:00
Arlo Breault
c2cef6cb58 Consistent label escaping in makeBrokenImageLinkObj
Html::element is more lenient about which characters it escapes.

But really this is just factored out of the next patch for ease of
review.

Change-Id: I9abb4d866a624df7bf4628ab9cc581967e715160
2020-12-18 11:41:09 -05:00
Arlo Breault
c203c574bd Sync up with Parsoid parserTests.txt
This now aligns with Parsoid commit c2952b434c1dc52d7c73154ca47bda19f2c2602f

Change-Id: Ic878dd183592c0ace77e3e078c40df40e54b7eab
2020-12-17 13:18:44 -05:00
Reedy
c75d2e41cc ParserTestRunner: Fix skipping typo
Change-Id: I72f2ed394141acf1fba9a450d10302607545225f
2020-12-16 23:50:51 +00:00
jenkins-bot
fe8779ebf9 Merge "Parser tests: Update TestFileReader to the latest reader from Parsoid"
2020-12-16 17:45:01 +00:00
David Kamholz
a7ad0547bc Implement <langconvert> tag
The <langconvert> tag takes two attributes: from (language variant from) and to (language variant to). It returns the content of the tag converted using LanguageConverter. It returns an error if the attributes are not present, if the variants do not exist, or if the variants belong to different languages. Currently it does not work for IuConverter, because the variants use the code ike rather than iu, and ike isn't in the list of languages with converters available.

This patchset reimplements the feature as a tag instead of a parser function, and renames it from transliterate to langconvert.

Bug: T263082
Change-Id: Idc3a32c66d5a0466c63e7ce8753d2619354c30b0
2020-12-14 19:40:31 -08:00
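Illustrative sketch (not part of the commit): the attribute checks described for a7ad0547bc could look roughly like this in Python. The variant table, the function name, and the error strings are hypothetical; the real tag is implemented in PHP and relies on LanguageConverter.

```python
# Hypothetical variant table: variant code -> base language code.
VARIANTS = {
    "zh-hans": "zh",
    "zh-hant": "zh",
    "sr-ec": "sr",
    "sr-el": "sr",
}


def validate_langconvert(attrs):
    """Return None if the attributes are usable, else an error string."""
    src = attrs.get("from")
    dst = attrs.get("to")
    # Both attributes must be present.
    if src is None or dst is None:
        return "missing 'from' or 'to' attribute"
    # Both variants must exist.
    if src not in VARIANTS or dst not in VARIANTS:
        return "unknown variant"
    # Both variants must belong to the same language.
    if VARIANTS[src] != VARIANTS[dst]:
        return "variants belong to different languages"
    return None
```

The three checks are kept in the order the commit message lists them, so the first failing condition decides the error.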
C. Scott Ananian
02ec9de651 Parser tests: Update TestFileReader to the latest reader from Parsoid
This replaces the 'requirements' from parser tests (hooks and
functionhooks) with a more flexible 'options' clause to allow
additional file-level requirements/options to support running parser
tests in multiple modes.  (For example, with the legacy parser or in
one of two parsoid modes.)

Bug: T254181
Depends-On: I636bd1f2c8aee327acbbd1636e2ac76355f1d80e
Change-Id: I58373d135c3a804f4ce9967112c338435f5cd4b6
2020-12-14 18:49:11 -05:00
Ammar Abdulhamid
71571191d4 Chain MutableRevisionRecord method calls 2
Change-Id: I86578cfbc892f171a4e433283b86d1b78fe4167d
2020-11-27 05:26:54 +01:00
C. Scott Ananian
c64e71615e Replace $wgDisable{Lang,Title}Conversion with LanguageConverterFactory methods
Replace direct access to $wgDisableLangConversion with
LanguageConverterFactory::isConversionDisabled(), and replace direct
access to $wgDisableTitleConversion with
LanguageConverterFactory::isTitleConversionDisabled().  However, most
places that check ::isTitleConversionDisabled() actually want
::isLinkConversionDisabled(), so add that too (and deprecate
isTitleConversionDisabled()).

Code search:
https://codesearch.wmcloud.org/search/?q=Disable%28Lang|Title%29Conversion&i=nope&files=&repos=

This change removes a number of spurious dependencies on the global
configuration and reduces code duplication (for example, if the logic
for disabling language conversion were ever to change).

Depends-On: I6fa8230ae97b0e34c381003548e61f9b7387d363
Change-Id: Icc4687638ff1815003dd903854efdbd904854f1e
2020-11-25 12:47:26 -05:00
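Illustrative sketch (not part of the commit): the idea in c64e71615e of hiding global configuration behind factory methods, written in Python. The class and the snake_case method names merely mirror the PHP names for illustration, and the rule used for link conversion is an assumption, not something the commit message states.

```python
class ConverterFactory:
    """Hypothetical stand-in for LanguageConverterFactory."""

    def __init__(self, disable_lang_conversion, disable_title_conversion):
        # The settings are read once here instead of at every call site.
        self._disable_lang = disable_lang_conversion
        self._disable_title = disable_title_conversion

    def is_conversion_disabled(self):
        return self._disable_lang

    def is_title_conversion_disabled(self):
        return self._disable_title

    def is_link_conversion_disabled(self):
        # Assumption for this sketch: link conversion is off when either
        # setting is off. Callers no longer repeat this logic themselves.
        return self._disable_lang or self._disable_title
```

With the logic in one place, a later change to how conversion is disabled touches only the factory.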
Aaron Schulz
90865e0e89 tests: Use FileBackend::quick*() methods in ParserTestRunner
This avoids needless I/O from lock files

Change-Id: I0b8661712154c61f16c37c0c4303909c7b678cad
2020-11-13 06:06:59 +00:00
jenkins-bot
2752717667 Merge "Minor updates to some unrelated PHPDoc tags"
2020-10-30 17:17:57 +00:00
Umherirrender
c85a43561e Improve class property documentation
Reformat existing documentation to match the format

Change-Id: I190b54b5e962f17bab6502dd1b3c02f11dc926d2
2020-10-30 10:38:58 +01:00
Thiemo Kreuz
3306f15ce3 Minor updates to some unrelated PHPDoc tags
The past weeks I collected a few minor updates in my local dev
environment, and would like to submit them now.

Change-Id: Ibe00d72763f1b66c50cf73e00c8fa52d265043fc
2020-10-28 19:00:48 +00:00
Arlo Breault
acb40ea0d5 Sync up with Parsoid parserTests.txt
This now aligns with Parsoid commit 010856ed7d4aeb9617ac264782809cc58d94fc47

Change-Id: Iae87302abe2c11deb36088100e50a638d58cffe6
2020-10-05 16:09:18 -04:00
Ed Sanders
95a64c8740 Use the same character set for link prefix as for suffix in Arabic 'ar'
The default is a-z plus every non-ASCII character, but this
is too broad.

Instead use the same character set as is used for link trails,
specifically Latin & Arabic letters.

Bonus:
Add combining diacritics to both sets as when these are appended
to letters, the resulting glyph is still considered a letter.

Bug: T263266
Change-Id: I358673f79989491799d3d68da17e73b806b167e0
2020-10-05 18:31:33 +00:00
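Illustrative sketch (not part of the commit): the difference between the broad default prefix set and a narrower Latin plus Arabic set, as described in 95a64c8740. The exact code point ranges below are assumptions chosen for illustration, and the helper name is hypothetical.

```python
import re

# Default described in the commit: a-z plus every non-ASCII character.
BROAD_PREFIX = re.compile(r"[a-z\u0080-\U0010FFFF]+$")

# Narrower set: Latin letters, Arabic letters (U+0621..U+064A), and
# Arabic combining marks (U+064B..U+065F). Ranges are assumptions.
NARROW_PREFIX = re.compile(r"[a-z\u0621-\u064A\u064B-\u065F]+$")


def link_prefix(text_before_link, pattern):
    """Return the run of prefix characters at the end of the text."""
    match = pattern.search(text_before_link)
    return match.group(0) if match else ""
```

With the broad pattern, any non-ASCII text directly before a link (for example CJK characters) is pulled into the link; the narrow pattern only takes Latin and Arabic letters and their diacritics.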
C. Scott Ananian
33c506843b phpunit tests: Reset the SpecialPageFactory when Content Language changes
This is causing problems for Parsoid CI, as parser tests fail when
phpunit runs the tests at a different point than they are run in
core's CI due to the side-effects of content-language changes made in
other phpunit tests. (For example, phpunit runs all extension tests
after core tests, so the same parsertest can pass if included in core
and then fail when included in an extension.)

SpecialPageFactory::$aliases has a dependency on the current content
language, with no way to reset it other than to recreate the
SpecialPageFactory.

Change-Id: I278580ed5cf2c85403cbaf601f8af4753e14a9d0
2020-09-23 14:17:31 -04:00
C. Scott Ananian
c443177614 Allow parserTests to declare a dependency on a particular extension
The parsertests file allows certain tests to declare a dependency on
a particular tag hook, but this doesn't work for extensions like
TimedMediaHandler which affect the output but don't register a
unique extension tag name.  Allow using 'extension:Foo' in the
`hooks` clause to register a dependency on the specific extension name,
instead of indirectly on the registered extension tag name.

Change-Id: I2d3f7e1313b4456733f820e6d8c504bb8d7427a7
2020-09-22 12:51:34 -04:00
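Illustrative sketch (not part of the commit): how a `hooks` clause with 'extension:Foo' entries, as described in c443177614, might be checked. The function name and the way registered tags and loaded extensions are passed in are hypothetical.

```python
def unmet_requirements(hooks_clause, registered_tags, loaded_extensions):
    """Return the entries of a `hooks` clause that are not satisfied."""
    unmet = []
    for entry in hooks_clause:
        if entry.startswith("extension:"):
            # Dependency on an extension by name.
            name = entry[len("extension:"):]
            if name not in loaded_extensions:
                unmet.append(entry)
        elif entry not in registered_tags:
            # Dependency on a registered extension tag name.
            unmet.append(entry)
    return unmet
```

A test whose clause yields a non-empty list would be skipped rather than run and failed.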
Arlo Breault
8329352f08 Don't preface test description with comment
This seems to primarily be used in ParserTestPrinter::showTesting()
to print the string,

"Running test $desc... "

Follow up to 585cbcd

Change-Id: I53fc98ae56e3e9faad6ab1ca5a5a778f1c146fd1
2020-09-17 15:51:44 -04:00
jenkins-bot
13dc0e893d Merge "Tracking category and parser warning for deprecated uses of {{=}}"
2020-09-15 21:48:22 +00:00
jenkins-bot
448da261a2 Merge "Allow independent parser test files to (re)define articles w/ the same names"
2020-09-15 21:47:43 +00:00
C. Scott Ananian
aeb3f45c20 Tracking category and parser warning for deprecated uses of {{=}}
We plan to add {{=}} as a built-in parser function, expanding to `=`,
in the same way that `{{!}}` is a built-in.  It will be used to
automatically escape uses of `=` in template arguments (again, in the
same way that `{{!}}` can be used to protect uses of `|` in template
arguments).

Some wikis have non-standard definitions of `Template:=`; add a
tracking category to warn these wikis to transition before we turn on
the built-in parser function in a future release.

New parser test file added, so we can re-define Template:= and test
both cases of this new warning.

Bug: T91154
Change-Id: I50ff8a7b6be95901ebb14ffbe64940a0f499cfac
2020-09-15 20:16:37 +00:00
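Illustrative sketch (not part of the commit): why a bare `=` in a template argument needs protecting, and how `{{=}}` avoids the problem, per aeb3f45c20. This is a deliberately naive model that ignores nested templates and links; both function names are hypothetical.

```python
PLACEHOLDER = "\x00"  # stands in for {{=}} while looking for a real "="


def escape_equals(value):
    """Protect every "=" in a value by writing it as {{=}}."""
    return value.replace("=", "{{=}}")


def parse_argument(arg):
    """Split a raw template argument into (name, value).

    name is None when the argument is positional.
    """
    masked = arg.replace("{{=}}", PLACEHOLDER)
    if "=" in masked:
        # A bare "=" turns the argument into a named parameter.
        name, value = masked.split("=", 1)
        name = name.strip().replace(PLACEHOLDER, "=")
    else:
        name, value = None, masked
    return name, value.replace(PLACEHOLDER, "=")
```

For example, `parse_argument("1+1=2")` treats `1+1` as a parameter name, while the escaped form keeps the whole text as a positional value.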
C. Scott Ananian
ce663741bc Allow independent parser test files to (re)define articles w/ the same names
It leads to surprising results when the definitions in one parser test
file leak into all the others.  This can cause spurious test failures
when you happen to have two extensions which define conflicting
article fixtures, and prevents you from using parser tests to test
patches like I50ff8a7b6be95901ebb14ffbe64940a0f499cfac, where you
deliberately want to set up and test two different definitions for the
same template name.

Change-Id: I958c6305a95ca32418d83b7f33f7c180a3b370cd
2020-09-15 16:15:44 -04:00
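Illustrative sketch (not part of the commit): article fixtures scoped per parser test file, the behaviour ce663741bc asks for. The class and method names are hypothetical and do not describe the actual ParserTestRunner code.

```python
class ArticleFixtures:
    """Keep article definitions separate for each parser test file."""

    def __init__(self):
        self._by_file = {}

    def define(self, test_file, title, text):
        # A definition is only visible to the file that made it.
        self._by_file.setdefault(test_file, {})[title] = text

    def lookup(self, test_file, title):
        return self._by_file.get(test_file, {}).get(title)
```

Two files can then define the same template title with different content without affecting each other.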
DannyS712
e834b31b2b Use recordUpload3, hard deprecate recordUpload2
Also reduces references to $wgUser in importImages

Bug: T248813
Change-Id: I17c850000044f65f2fcfdfcfb82f852583a99000
2020-09-10 00:00:42 +00:00
C. Scott Ananian
e66f8e2393 Sanitizer: use RemexHtml entity table, instead of its own
Reduce code duplication by using the authoritative HTML entity list
from Remex, instead of duplicating the table inside MediaWiki.

This also extends the set of entities accepted in wikitext to nearly
match HTML5.  (HTML5 allows some entities which are not
semicolon-terminated; wikitext insists on the semicolon.)

This patch brings the core parser closer to Parsoid output, as in most
cases Parsoid already accepted the full HTML5 entity list.
(I873a6120e4bd1c69fee9da76d266e24e97a22add is a corresponding patch to
Parsoid to unify its copy of Sanitizer.)

Also deprecate Sanitizer::hackDocType() while we're updating it, since
this method should not be public.

Bug: T94603
Change-Id: Ia08bc261c3644f83109f13df04b692101b4e8ef2
2020-08-21 00:02:44 +00:00
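Illustrative sketch (not part of the commit): decoding only semicolon-terminated named entities from a shared HTML5 table, the behaviour e66f8e2393 describes for wikitext. Python's standard `html.entities.html5` table is used here as a stand-in for the RemexHtml table; the function name is hypothetical.

```python
import re
from html.entities import html5

# html5 contains names such as "amp;" and also legacy unterminated forms
# such as "amp". Keep only the names that end in ";".
ENTITIES = {name[:-1]: value for name, value in html5.items() if name.endswith(";")}

NAMED_ENTITY = re.compile(r"&([A-Za-z][A-Za-z0-9]*);")


def decode_named_entities(text):
    """Replace known &name; entities; leave everything else untouched."""
    def repl(match):
        return ENTITIES.get(match.group(1), match.group(0))
    return NAMED_ENTITY.sub(repl, text)
```

Reusing one authoritative table means a newly accepted entity needs no second copy to be updated.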
C. Scott Ananian
9ccb93db25 Sync up with Parsoid parserTests.txt
This now aligns with Parsoid commit d9a3e14dfcb422e95de7a79f0eb662fd43f9354f

Change-Id: Iecb80aefb82ddedd8121ddbd633f7481c3436f34
2020-08-18 16:52:34 -04:00
C. Scott Ananian
585cbcd77f Use parser test file parser from Parsoid
One (test file) parser to rule them all.  Reduce a little bit of
redundant code between core and Parsoid by using Parsoid's parser test
file parser to run core's parser tests.

This should have no effect on users of TestFileReader::read() *except*
that Parsoid's test file reader is more strict about bogus lines in
the test file, including duplicate test names, and we've removed support
for the old v1 format (hard deprecated in 1.35).

Next step will be to be able to execute parser tests on extensions
using Parsoid's parser as well.

Bug: T254181
Depends-On: I8ab4a8c59ed1b6837dba428f96a8ba0084b7fb68
Change-Id: I5acaf82819ae964895a831be4f28c31c77a09e84
2020-08-17 14:52:08 -04:00
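Illustrative sketch (not part of the commit): one of the stricter checks mentioned for 585cbcd77f, rejecting duplicate test names. The function is hypothetical and is not Parsoid's reader.

```python
def check_unique_test_names(names):
    """Raise ValueError on the first test name that appears twice."""
    seen = set()
    for name in names:
        if name in seen:
            raise ValueError(f"duplicate test name: {name!r}")
        seen.add(name)
```

A lenient reader would silently keep one of the duplicates; failing early makes the conflict visible in the test file itself.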
C. Scott Ananian
fb7ae07c8a Hard-deprecate parser tests targeting Preprocessor_DOM
Support for Preprocessor_DOM was removed in 1.35; it's time to clean up
any old parser tests which required it.

Change-Id: I36c7906b8ce31ef6885aef54175749e67e51d07c
2020-08-04 14:21:32 -04:00
C. Scott Ananian
7bbb14f87d Fix parserTests.php by ensuring change_tag tables are cloned
This fixes a regression in the parser test CLI runner caused by
435d5f4d55, which added a change tag for
"manual reverts".  The parser test framework was triggering this
change tag addition as it set up its mock article database, and then
subsequent attempts to write the change tag to the database failed
because the tables were missing.

Bug: T259186
Change-Id: I232e918dfdc83244a010681b6adffd6c1171cf24
2020-07-29 16:06:20 -04:00
C. Scott Ananian
b05d862405 Sync up with Parsoid parserTests.txt
This now aligns with Parsoid commit 77db605163990ad851e3da0fb4fa7eca2081f379

Change-Id: Ib4f9f5b84ff337a6e22566387336ef163665df4f
2020-07-28 15:54:56 -04:00
C. Scott Ananian
7101c981b2 Fix parser test class naming
This isn't really user visible, but the algorithm for ensuring there
are no conflicts in automatically-generated parser test class names
had a number of issues which led to inconsistent naming.

Change-Id: I50ff5b72381332c77f0d99af08e689796019a7af
2020-07-24 00:29:12 -04:00
C. Scott Ananian
a12976911f Correct some bogus comment lines in parserTests.txt
The current parser ignores lines it doesn't understand, but these
aren't valid comments.

Change-Id: I3fd621cdfe3d2cb4f7b58559290856d1f51d9c0f
2020-07-15 23:10:50 +00:00
C. Scott Ananian
39fb017285 Sync up with Parsoid parserTests.txt
This now aligns with Parsoid commit 7321ab547b7663ba86c1cfe0bc021ff1918c0970

Change-Id: I6301f1cefb423373aa6ac5cc8a1917a3331cfcd1
2020-07-15 11:38:14 -04:00
jenkins-bot
5de63c7396 Merge "Revert "Adding = as a parser function""
2020-07-03 15:21:54 +00:00
C. Scott Ananian
9bd4fc0ae9 Revert "Adding = as a parser function"
This reverts commit 4bc0dc348a.

Reason for revert: Dutch Wiktionary uses {{=}} for something else;
see https://phabricator.wikimedia.org/T91154#6276915 for details.

Revert for now so it doesn't disrupt next week's train; we'll add it back with a config var or some other mitigation.

Bug: T91154
Change-Id: I9f81c7b73a04d6c1d77b67ce311cc7e6d279eb8b
2020-07-03 14:52:27 +00:00
DannyS712
2f4b71fc6c Replace uses of Revision constants
Bug: T257010
Change-Id: Id63123e8b8becd31756d5b68ca11edb238ec8a59
2020-07-03 01:23:44 +00:00
jenkins-bot
3b3027a6e1 Merge "Adding = as a parser function"
2020-07-01 19:30:47 +00:00
Base
4bc0dc348a Adding = as a parser function
Bug: T91154
Change-Id: I8c9df0a8ce07e1febef137946615efd74d4800e3
2020-07-01 14:20:37 -04:00
addshore
959bc315f2 MediaWikiTestCase to MediaWikiIntegrationTestCase
The name change happened some time ago, and I think it's
about time to start using the new name!
(Done with a find and replace)

My personal motivation for doing this is that I have started
trying out vscode as an IDE for mediawiki development, and
right now it doesn't appear to handle php aliases very well
or at all.

Change-Id: I412235d91ae26e4c1c6a62e0dbb7e7cf3c5ed4a6
2020-06-30 17:02:22 +01:00
Tim Starling
d459add63d Introduce wfDeprecatedMsg()
Deprecating something means to say something nasty about it, or to draw
its character into question. For example, "this function is lazy and good
for nothing". Deprecatory remarks by a developer are generally taken as a
warning that violence will soon be done against the function in question.
Other developers are thus warned to avoid associating with the deprecated
function.

However, since wfDeprecated() was introduced, it has become obvious that
the targets of deprecation are not limited to functions. Developers can
deprecate literally anything: a parameter, a return value, a file
format, Mondays, the concept of being, etc. wfDeprecated() requires
every deprecatory statement to begin with "use of", leading to some
awkward sentences. For example, one might say: "Use of your mouth to
cough without it being covered by your arm is deprecated since 2020."

So, introduce wfDeprecatedMsg(), which allows deprecation messages to be
specified in plain text, with the caller description being optionally
appended. Migrate incorrect or grammatically awkward uses of wfDeprecated()
to wfDeprecatedMsg().

Change-Id: Ib3dd2fe37677d98425d0f3692db5c9e988943ae8
2020-06-22 14:34:39 +10:00
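Illustrative sketch (not part of the commit): the contrast drawn in d459add63d between a helper that forces every message to begin with "Use of" and one that takes plain text. Both Python functions, their parameters, and the exact message formats are hypothetical; they are not the PHP signatures.

```python
import warnings


def deprecated(what, version):
    """Older style: every message is forced to start with "Use of"."""
    text = f"Use of {what} was deprecated in {version}."
    warnings.warn(text, DeprecationWarning, stacklevel=2)
    return text


def deprecated_msg(message, caller=None):
    """Newer style: plain-text message, caller description optional."""
    text = message if caller is None else f"{message} [Called from {caller}]"
    warnings.warn(text, DeprecationWarning, stacklevel=2)
    return text
```

The second form lets a message about a file format or a parameter read naturally instead of being bent into a "Use of ..." sentence.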
DannyS712
a6d16bd03d Remove unneeded creation of revision objects
Clean up some technical debt; use MutableRevisionRecord instead of
manually constructing a Revision from an array, remove last uses of
RevisionStoreDbTestBase::revisionToRow and remove the method.

Each file can be reviewed separately (except that the removal of
revisionToRow depends on replacing its usage)

Bug: T246284
Change-Id: I0bdc069b21a5c41ef8f9e972c5b17ff189d4a741
2020-06-10 09:09:55 +00:00
Thiemo Kreuz
6aa6d10e86 Replace all call_user_func(_array) in all tests
There is native support for all of this now in PHP, thanks to changes
and additions that have been made in later versions. There should be no
need any more to ever use call_user_func() or call_user_func_array().

Reviewing this should be fairly easy: Because this patch touches
exclusively tests, but no production code, there is no such thing as
"insufficient test coverage". As long as CI goes green, this should be
fine.

Change-Id: Ib9690103687734bb5a85d3dab0e5642a07087bbc
2020-06-06 18:41:20 +02:00
DannyS712
381d873a8b Replace core uses and hard deprecate Parser(Options) Revision methods
Bug: T249384
Change-Id: Iff10e76120eb8b6b4fbb939182dede83c86d3da2
2020-06-03 05:55:35 +00:00
Tim Starling
47a1619027 Remove terminating line breaks from debug messages
A terminating line break has not been required in wfDebug() since 2014,
however no migration was done. Some of these line breaks found their way
into LoggerInterface::debug() calls, where they mess up the formatting
of the debug log.

So, remove terminating line breaks from wfDebug() and
LoggerInterface::debug() calls.

Also:
* Fix the stripping of leading line breaks from the log header emitted
  by Setup.php. This feature, accidentally broken in 2014, allows
  requests to be distinguished in the log file.
* Avoid using the global variable $self.
* Move the logging of the client IP back to Setup.php. It was moved to
  WebRequest in the hopes that it would not always be needed, however
  $wgRequest->getIP() is now called unconditionally a few lines up in
  Setup.php. This means that it is put in its proper place after the
  "start request" message.
* Wrap the log header code in a closure so that variables like $name do
  not leak into global scope.
* In Linker.php, remove a few instances of an unnecessary second
  parameter to wfDebug().

Change-Id: I96651d3044a95b9d210b51cb8368edc76bebbb9e
2020-06-03 12:01:16 +10:00
Tim Starling
68c433bd23 Hooks::run() call site migration
Migrate all callers of Hooks::run() to use the new
HookContainer/HookRunner system.

General principles:
* Use DI if it is already used. We're not changing the way state is
  managed in this patch.
* HookContainer is always injected, not HookRunner. HookContainer
  is a service, it's a more generic interface, it is the only
  thing that provides isRegistered() which is needed in some cases,
  and a HookRunner can be efficiently constructed from it
  (confirmed by benchmark). Because HookContainer is needed
  for object construction, it is also needed by all factories.
* "Ask your friendly local base class". Big hierarchies like
  SpecialPage and ApiBase have getHookContainer() and getHookRunner()
  methods in the base class, and classes that extend that base class
  are not expected to know or care where the base class gets its
  HookContainer from.
* ProtectedHookAccessorTrait provides protected getHookContainer() and
  getHookRunner() methods, getting them from the global service
  container. The point of this is to ease migration to DI by ensuring
  that call sites ask their local friendly base class rather than
  getting a HookRunner from the service container directly.
* Private $this->hookRunner. In some smaller classes where accessor
  methods did not seem warranted, there is a private HookRunner property
  which is accessed directly. Very rarely (two cases), there is a
  protected property, for consistency with code that conventionally
  assumes protected=private, but in cases where the class might actually
  be overridden, a protected accessor is preferred over a protected
  property.
* The last resort: Hooks::runner(). Mostly for static, file-scope and
  global code. In a few cases it was used for objects with broken
  construction schemes, out of horror or laziness.

Constructors with new required arguments:
* AuthManager
* BadFileLookup
* BlockManager
* ClassicInterwikiLookup
* ContentHandlerFactory
* ContentSecurityPolicy
* DefaultOptionsManager
* DerivedPageDataUpdater
* FullSearchResultWidget
* HtmlCacheUpdater
* LanguageFactory
* LanguageNameUtils
* LinkRenderer
* LinkRendererFactory
* LocalisationCache
* MagicWordFactory
* MessageCache
* NamespaceInfo
* PageEditStash
* PageHandlerFactory
* PageUpdater
* ParserFactory
* PermissionManager
* RevisionStore
* RevisionStoreFactory
* SearchEngineConfig
* SearchEngineFactory
* SearchFormWidget
* SearchNearMatcher
* SessionBackend
* SpecialPageFactory
* UserNameUtils
* UserOptionsManager
* WatchedItemQueryService
* WatchedItemStore

Constructors with new optional arguments:
* DefaultPreferencesFactory
* Language
* LinkHolderArray
* MovePage
* Parser
* ParserCache
* PasswordReset
* Router

setHookContainer() now required after construction:
* AuthenticationProvider
* ResourceLoaderModule
* SearchEngine

Change-Id: Id442b0dbe43aba84bd5cf801d86dedc768b082c7
2020-05-30 14:23:28 +00:00
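Illustrative sketch (not part of the commit): the principle from 68c433bd23 that the container is injected and a runner is built from it. This is a Python model of the idea only; the hook name, the method names, and the consumer class are hypothetical and do not reproduce the PHP interfaces.

```python
class HookContainer:
    """Generic service: stores handlers and runs them by hook name."""

    def __init__(self):
        self._handlers = {}

    def register(self, name, handler):
        self._handlers.setdefault(name, []).append(handler)

    def is_registered(self, name):
        return bool(self._handlers.get(name))

    def run(self, name, args):
        # A handler returning False aborts the remaining handlers.
        for handler in self._handlers.get(name, []):
            if handler(*args) is False:
                return False
        return True


class HookRunner:
    """Thin wrapper offering one named method per hook."""

    def __init__(self, container):
        self._container = container

    def on_page_saved(self, title, user):  # hypothetical hook
        return self._container.run("PageSaved", [title, user])


class PageSaver:
    """Hypothetical consumer: receives the container, builds its runner."""

    def __init__(self, hook_container):
        self._hook_container = hook_container
        self._hook_runner = HookRunner(hook_container)

    def save(self, title, user):
        return self._hook_runner.on_page_saved(title, user)
```

Injecting the container keeps `is_registered()` available to the consumer, while the runner remains cheap to construct wherever it is needed.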
Arlo Breault
f45fa7be70 Sync up with Parsoid parserTests.txt
This now aligns with Parsoid commit fab551559cc3b779e2ace29ecabc03559adf0f93

Change-Id: I1699422880e79c096055853cfda9e85f7df6bb94
2020-05-29 11:10:37 -04:00
jenkins-bot
5a412a6046 Merge "Use HTML5 semantics for self-closed HTML tags in wikitext"
2020-05-28 23:02:54 +00:00
Arlo Breault
938b7234a4 Add caption to always suppressing
In brief, the BlockLevelPass looks at opening and closing tags on a line
to determine whether it should do paragraph wrapping.  The blockElems
want to stop wrapping when opened and start again when closed.  The
antiBlockElems want the opposite, to start when they're opened and stop
when closed.  "table" is a blockElems and "td"|"th" are antiBlockElems
so that content found in the interstitial spaces of tables is never
paragraph wrapped.

That means that, to date, "caption" elements are always found in a place
where paragraph wrapping is always suppressed and so adding them to that
set won't change any test results.  However, a new test is added to spec
out this behaviour.

In the legacy parser, "captions" are always found in the right place
because handleTables runs at an earlier stage.  In Parsoid, however, the
treebuilder is relied on to close table cells [0] so when we get to the
token stream paragraph wrapping pass, "caption"s are found in table
cells and therefore get wrapped, even though the treebuilder is about to
be induced to close the cell before opening the caption.

Therefore, in Parsoid, the fix would require us to make captions always-
suppressing to match the legacy parser behaviour.  Thus, this change
here is just to keep these lists [1] consistent between the two
parsers.

[0] 5e11a3f390/src/Wt2Html/TokenizerUtils.php (L138-L151)
[1] 5e11a3f390/src/Wt2Html/TT/ParagraphWrapper.php (L71-L78)

Bug: T210647
Change-Id: I8ccefd69d47dca740f50924b235dffa3873d1f99
2020-05-27 12:29:59 -04:00
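Illustrative sketch (not part of the commit): the paragraph-wrapping toggle that 938b7234a4 describes, reduced to a per-line state update. The element sets are shortened, the tag regex is simplistic, and the function name is hypothetical; this is not the BlockLevelPass implementation.

```python
import re

BLOCK = {"table", "ul", "ol", "pre"}        # stop wrapping when opened
ANTI_BLOCK = {"td", "th"}                   # start wrapping when opened
ALWAYS_SUPPRESS = {"tr", "caption", "li"}   # suppress on open and on close

TAG = re.compile(r"<(/?)([a-z0-9]+)[^>]*>", re.IGNORECASE)


def wrapping_after(line, wrapping=True):
    """Return whether paragraph wrapping is active after this line."""
    for closing, name in TAG.findall(line):
        name = name.lower()
        is_close = closing == "/"
        if name in ALWAYS_SUPPRESS:
            wrapping = False
        elif name in BLOCK:
            # Opening stops wrapping, closing resumes it.
            wrapping = is_close
        elif name in ANTI_BLOCK:
            # Opening resumes wrapping, closing stops it.
            wrapping = not is_close
    return wrapping
```

Placing "caption" in the always-suppressing set means a caption encountered inside an open cell still turns wrapping off, which is the consistency the commit is after.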
C. Scott Ananian
05bc687111 Use HTML5 semantics for self-closed HTML tags in wikitext
This behavior has been deprecated, with a tracking category, since
1.28.  Time to remove the temporary parameter added to
Sanitizer::removeHTMLtags() and (finally) tweak the behavior to match
HTML5.

Bug: T134423
Change-Id: I5c725175d05854139c95a2b3d8d35ff63cb6707b
2020-05-27 11:59:18 -04:00
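Illustrative sketch (not part of the commit): HTML5 treatment of a self-closing slash, which 05bc687111 adopts. In HTML5 the slash is ignored on non-void elements, so `<div/>` is just an opening tag. The void-element list is partial and the function name is hypothetical; this is not Sanitizer::removeHTMLtags().

```python
import re

VOID_ELEMENTS = {"br", "hr", "img", "wbr"}  # partial list, for illustration

SELF_CLOSED = re.compile(r"<([a-z][a-z0-9]*)([^<>]*?)\s*/>", re.IGNORECASE)


def html5_self_close(html):
    """Drop the trailing slash from self-closed non-void tags."""
    def repl(match):
        name, attrs = match.group(1), match.group(2)
        if name.lower() in VOID_ELEMENTS:
            # Void elements have no content, so the slash is harmless.
            return match.group(0)
        return f"<{name}{attrs}>"
    return SELF_CLOSED.sub(repl, html)
```

After this rewrite the element stays open until a matching close tag appears, instead of being treated as an empty element.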
jenkins-bot
c0cb506ad8 Merge "Move french space armoring after doBlockLevels"
2020-05-19 22:09:52 +00:00
DannyS712
b31cec3cec Remove more IE6 and IE7 compatibility and notes
Neither is supported

Bug: T232563
Change-Id: Ia7902f0b1df6148d819621dd5e57d2fe91a50973
2020-05-19 00:31:46 +00:00