Commit graph

1335 commits

Author SHA1 Message Date
jenkins-bot
f05e371bbd Merge "Copy over Parsoid's Config and ServiceWiring classes" 2022-03-29 18:29:16 +00:00
C. Scott Ananian
2d66ee70a2 Copy over Parsoid's Config and ServiceWiring classes
* This is the first step of migrating Parsoid integration code into
  core and transitioning Parsoid from an extension to a pure library.

* Parsoid already has conditional code to skip loading Parsoid's
  copy of its classes, but it relies on the existence of ParsoidServices.
  Technically ParsoidServices isn't needed once Parsoid is migrated to
  core -- users can just use MediaWikiServices instead -- but we need
  to temporarily add ParsoidServices as a marker class during the
  transition.

This version of Parsoid's ServiceWiring comes from Parsoid commit
898c813fd832b3f2d7b5a37f60bd65e8368ce18f.

Bug: T302118
Change-Id: I0b388d93143a782c2c3b72e46407572e5c586e4a
2022-03-28 12:36:38 -04:00
Alexander Vorwerk
38d71c95ad Remove DBMasterPos as an alias for DBPrimaryPos
Bug: T282894
Change-Id: I1f214653d431a2393f40bf963fd422de06fa4363
2022-03-27 21:15:19 +02:00
jenkins-bot
2df0aa9200 Merge "Remove deprecated EventRelayerKafka and KafkaHandler" 2022-03-25 15:16:33 +00:00
Amir Sarabadani
603569ab22 maintenance: Add migrateLinksTable.php
To populate tl_target_id and other tables in the future.

Bug: T299423
Change-Id: I067e861f95555bf7630f32a26c8f6a2284367897
2022-03-25 02:21:34 +01:00
Timo Tijhof
2de79774e1 Remove deprecated EventRelayerKafka and KafkaHandler
Also remove the unmaintained kafka-php package from the "suggested"
and "dev" composer dependencies, as it is no longer used.

Change-Id: If5668974f417b627df95bce47db18d46fa03327c
2022-03-25 00:07:22 +00:00
jenkins-bot
7894e49ea8 Merge "Generate config name constants." 2022-03-18 11:09:44 +00:00
jenkins-bot
18875c2c0f Merge "Generate DefaultSettings.php from schema" 2022-03-18 11:02:51 +00:00
daniel
6880561a4d Generate config name constants.
We can generate a class that contains constants for all config variable
names. This removes the need to rely on string literals when calling
Config::get, and it provides a place for documentation that integrates
better with IDEs than the markdown file.

Change-Id: I817dc14c4ce8fc0a29d9c07e8fd393c4f359cade
2022-03-18 10:32:58 +01:00
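As a rough illustration of the idea in 6880561a4d above, the sketch below generates a class of name constants from a schema mapping. It is written in Python for brevity and is not the generator used in MediaWiki; the schema contents, the class name `ConfigNames`, and the function name are all assumptions made for the example.

```python
# Hypothetical schema: setting name -> metadata. Not MediaWiki's real schema.
SCHEMA = {
    "Sitename": {"description": "Name of the site."},
    "CacheEpoch": {"description": "Invalidate caches older than this."},
}


def generate_constants_class(schema, class_name="ConfigNames"):
    """Return source code for a class with one string constant per setting."""
    lines = [f"class {class_name}:"]
    for name in sorted(schema):
        description = schema[name].get("description")
        if description:
            # The description becomes a comment next to the constant,
            # which is where an IDE can surface it.
            lines.append(f"    # {description}")
        lines.append(f"    {name} = {name!r}")
    if not schema:
        lines.append("    pass")
    return "\n".join(lines) + "\n"


source = generate_constants_class(SCHEMA)
namespace = {}
exec(source, namespace)  # load the generated class for demonstration
ConfigNames = namespace["ConfigNames"]
```

A caller can then write `config.get(ConfigNames.Sitename)` instead of `config.get("Sitename")`, so a misspelled name fails at attribute lookup rather than silently at runtime.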
daniel
1ee8fde16b Generate DefaultSettings.php from schema
Avoid having to maintain defaults twice.

Change-Id: I7a883fe3c952cc653d43b7e399631ec3beab0bc3
2022-03-18 10:25:52 +01:00
jenkins-bot
29e0cf9c9d Merge "Use class constants to define config schema, rather than config-schema.yaml" 2022-03-18 07:44:45 +00:00
daniel
2fe23d6860 Use class constants to define config schema, rather than config-schema.yaml
Instead of maintaining the config schema as a yaml file, we
maintain it as a set of constants in a class. From the information in
these constants, we can generate a JSON schema (yaml) file, and a
PHP file containing optimized arrays for fast loading.

Advantages:
- PHP doc available to IDEs. The generated markdown file is no longer
  needed.
- Can use PHP constants when defining default values.

NOTE: needs backport to 1.38

Change-Id: I663c08b8a200644cbe7e5f65c20f1592a4f3974d
2022-03-17 21:20:03 +01:00
Ladsgroup
31c1ca8658 Revert "rdbms: make automatic connection recovery apply to more cases"
This reverts commit 4cac31de4e.

Reason for revert: Blocking the train, reverting the chain.

Change-Id: I7f275b3a25379c6f3256e90947c8eed4b232c0f4
2022-03-17 20:11:10 +01:00
Ladsgroup
226da4c3a0 Revert "rdbms: factor out session state helper class from Database"
This reverts commit d189665ea6.

Reason for revert: reverting the chain

Change-Id: I7d9a1237b71bdb2195cbdbe629c9b8634c96893e
2022-03-17 20:10:58 +01:00
Aaron Schulz
d189665ea6 rdbms: factor out session state helper class from Database
Also, rename some private session state methods in Database

Change-Id: Ie784c45f2eed9cd6dd88dc35f765d6384097e24e
2022-03-15 22:32:52 +00:00
jenkins-bot
b9b75d9613 Merge "rdbms: make automatic connection recovery apply to more cases" 2022-03-10 00:08:35 +00:00
Aaron Schulz
4cac31de4e rdbms: make automatic connection recovery apply to more cases
Rename canRecoverFromDisconnect() in order to better describe
its function. Make it use the transaction ID and query walltime
as arguments and return an ERR_* class constant instead of a bool.
Avoid retries of slow queries that yield lost connection errors.

Add methods and class constants to track session state errors
caused by the loss of named locks or temp tables. Such errors can
be resolved by a "session flush" method.

Make assertQueryIsCurrentlyAllowed() better distinguish ROLLBACK
queries from ROLLBACK TO SAVEPOINT queries. For some scenarios,
only full transaction ROLLBACK queries should be allowed.

Add flushSession() method to Database and flushPrimarySessions()
methods to LBFactory/LoadBalancer.

Also:
* Rename wasKnownStatementRollbackError() and make it take the
  error number as an argument, similar to wasConnectionError().
  Add mysql error codes for query timeouts since they only cause
  statement rollbacks.
* Rename wasConnectionError() and mark it as protected. This is an
  internal method with no outside callers.
* Rename wasQueryTimeout(), remove some HHVM-specific code, and
  simplify the arguments.
* Make executeQuery() use a for loop for the query retry logic
  to reduce code duplication.
* Move the error state setting logic in executeQueryAttempt() up
  in order to reduce code duplication.
* Move the beginIfImplied() call in executeQueryAttempt() up to the
  retry loop in executeQuery(). This narrows the executeQueryAttempt()
  concerns to sending a single query and updating tracking fields.
* Make closeConnection() and doHandleSessionLossPreconnect() in
  DatabaseSqlite more consistent with the base class by releasing named locks.
* Mark trxStatus() as @internal.

Bug: T281451
Bug: T293859
Change-Id: I200f90e413b8a725828745f81925b54985c72180
2022-03-09 15:49:38 +11:00
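The retry behaviour described in 4cac31de4e above (a for loop around a single-attempt function, with a recovery check that takes the transaction ID and query walltime and returns a constant rather than a bool) can be sketched as follows. This is a Python illustration under stated assumptions, not the rdbms code: `Recovery`, `LostConnection`, `assess_disconnect`, `execute_query`, and the 3-second threshold are all hypothetical.

```python
import time
from enum import Enum


class Recovery(Enum):
    """Hypothetical stand-ins for the ERR_* class constants."""
    OK = 0
    IN_TRANSACTION = 1
    SLOW_QUERY = 2


class LostConnection(Exception):
    """Raised by the (hypothetical) driver when the connection drops."""


def assess_disconnect(trx_id, walltime, slow_threshold=3.0):
    """Decide whether a query that lost its connection may be retried."""
    if trx_id is not None:
        # An open transaction was lost with the connection; do not retry.
        return Recovery.IN_TRANSACTION
    if walltime > slow_threshold:
        # Avoid re-running a slow query that may just time out again.
        return Recovery.SLOW_QUERY
    return Recovery.OK


def execute_query(send, trx_id=None, max_attempts=2, clock=time.monotonic):
    """Run send() and retry on LostConnection when recovery is allowed."""
    for attempt in range(1, max_attempts + 1):
        started = clock()
        try:
            return send()
        except LostConnection:
            verdict = assess_disconnect(trx_id, clock() - started)
            if verdict is not Recovery.OK or attempt == max_attempts:
                raise
            # Otherwise loop again; real code would reconnect first.
```

Keeping the retry decision in one loop means the single-attempt function only has to send the query and report failure, which matches the narrowing of concerns the commit message describes.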
Timo Tijhof
f8ecea1e5c rcfeed: Deprecate $wgRCEngines and RCFeedEngine
Follows-up 39a6e3dc4d (I8be497c623c5d92).

* Improve documentation all around and advertise 'class'
  everywhere instead of 'uri'.

* Add test coverage for RCFeed::factory().

* Deprecate the $wgRCEngines "uri to class" mapping in favour
  of specifying "class" directly in $wgRCFeeds.

* Deprecate RCFeedEngine in favour of FormattedRCFeed.
  Convert to class_alias so that UDPRCFeedEngine no longer has
  to extend the deprecated class name explicitly (for instanceof compat).

* Hard-deprecate RecentChange::getEngine.

Bug: T250628
Depends-On: Ie939e1d06b9ee2d841ec7256c8d24cc4e7e386dd
Change-Id: Ib6758d724c7200404c89c7ab157aa55f1cad9763
2022-03-08 19:50:19 +00:00
C. Scott Ananian
9f14fbd002 Add Sanitizer::removeSomeTags() which uses Remex to tokenize
The existing Sanitizer::removeHTMLtags() method, in addition to having
dodgy capitalization, uses regular expressions to parse the HTML.
That produces corner cases like T298401 and T67747 and is not guaranteed
to yield balanced or well-formed HTML.

Instead, introduce and use a new Sanitizer::removeSomeTags() method
which is guaranteed to always return balanced and well-formed HTML.

Note that Sanitizer::removeHTMLtags()/::removeSomeTags() take a callback
argument which (as far as I can tell) is never used outside core. Mark
that argument as @internal, and clean up the version used by
::removeSomeTags().

Use the new ::removeSomeTags() method in the two places where
DISPLAYTITLE is handled (following up on T67747).  The use by the
legacy parser is more difficult to replace (and would have a
performance cost), so leave the old ::removeHTMLtags() method in place
for that call site for now: when the legacy parser is replaced by
Parsoid the need for the old ::removeHTMLtags() will go away.  In a
follow-up patch we'll rename ::removeHTMLtags() and mark it @internal
so that we can deprecate ::removeHTMLtags() for external use.

Some benchmarking code added.  On my machine, with PHP 7.4, the new
method tidies short 30-character title strings at a rate of about
6764/s while the tidy-based method being replaced here managed 6384/s.
Sanitizer::removeHTMLtags blazes through short strings 20x faster
(120,915/s); some of this difference is due to the set up cost of
creating the tag whitelist and the Remex pipeline, so further
optimizations could doubtless be done if Sanitizer::removeSomeTags()
is more widely used.

Bug: T299722
Bug: T67747
Change-Id: Ic864c01471c292f11799c4fbdac4d7d30b8bc50f
2022-03-04 14:06:02 -05:00
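To show why a tokenizer-based filter can guarantee balanced output where a regex cannot, as 9f14fbd002 above argues, here is a small Python sketch using the standard library's `html.parser` as a stand-in for Remex. It is not the Sanitizer implementation: `TagFilter` and `remove_some_tags` are hypothetical names, attributes are dropped, and disallowed tags are simply removed while their text is kept, which is a simplification.

```python
from html import escape
from html.parser import HTMLParser


class TagFilter(HTMLParser):
    """Keep only allowed tags and always emit balanced markup."""

    def __init__(self, allowed):
        super().__init__(convert_charrefs=True)
        self.allowed = set(allowed)
        self.out = []
        self.open_tags = []  # stack of allowed tags currently open

    def handle_starttag(self, tag, attrs):
        if tag in self.allowed:
            self.open_tags.append(tag)
            self.out.append(f"<{tag}>")  # attributes dropped for simplicity

    def handle_endtag(self, tag):
        if tag in self.allowed and tag in self.open_tags:
            # Close any inner tags first so nesting stays well-formed.
            while self.open_tags:
                top = self.open_tags.pop()
                self.out.append(f"</{top}>")
                if top == tag:
                    break

    def handle_data(self, data):
        self.out.append(escape(data, quote=False))

    def result(self):
        self.close()  # flush any buffered text
        while self.open_tags:
            self.out.append(f"</{self.open_tags.pop()}>")
        return "".join(self.out)


def remove_some_tags(text, allowed):
    parser = TagFilter(allowed)
    parser.feed(text)
    return parser.result()
```

For example, `remove_some_tags("<i>Title <b>bold</i>", {"i", "b"})` closes the dangling `<b>` before `</i>`, because the filter tracks open elements on a stack instead of matching tags textually.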
Tim Starling
74249a8812 Add "grep.php" to search pages for a regex
Change-Id: I56bea66f122050c77e011cb4ae13ddef056ad8e6
2022-02-23 14:20:18 +11:00
Clare Ming
e70633b3b2 Introduce SkinComponentTableOfContents
- Moves Skin::getSectionsData and related methods into new TOC component.
- Update SkinMustacheTest for toc getTemplateData method.
- Add test for new TOC component.

Bug: T301523
Change-Id: I29dda96f1e91da6892840d38a80c6102d425d0f7
2022-02-17 17:43:19 -07:00
Roan Kattouw
d84a3ecbed Add Codex v0.1.0-alpha.2
Bug: T299148
Change-Id: I414698b1c8251a5e873a2b6385b61051f32f54c0
2022-02-14 19:59:53 -08:00
Tim Starling
9c40889117 Add script to benchmark TRUNCATE versus DELETE
I noticed that the CentralAuth integration tests were slow due to their
use of permanent rather than temporary tables, which cause MariaDB to do
unlink+fsync on truncate. The result was an incredible ~100ms of latency
per truncate query, reproducible with this script, although it was
reduced to ~10ms by mounting with nobarrier. Since this is likely
system-dependent, I am curious to know if others have the same issue.

Change-Id: Ifb25ff7675fff179e0b6e465aa3043b928048fc0
2022-02-14 17:03:06 +11:00
mainframe98
a3c6ebba98 Generate abstract schemas with one script call
This rewrites both generateSchemaChangeSql.php and generateSchemaSql.php
to deduplicate logic and adds the 'all' option to --type to generate
schemas for each supported DBMS platform.

Specifying a path with --sql will have the script insert a directory
named after the platform, but only when --type=all is provided.
Only if a directory named mysql/ exists will sql files for mysql be
placed there, to allow using a setup similar to MediaWiki, where only
non-mysql types have their sql files placed in dedicated directories.

This also adds preparations for T298320, so that the abstract schema
can be validated and any errors can be handled before generating a
schema.

Bug: T268587
Change-Id: Ief8282017c8d38659b79262afb8fc691b5bda256
2022-02-11 18:50:25 +01:00
jenkins-bot
7c96930134 Merge "Introduce SkinComponent and SkinComponentLogos" 2022-02-10 21:14:13 +00:00
jenkins-bot
b955dcb47e Merge "Rename ForeignRepoWithMWAPI -> IForeignRepoWithMWApi" 2022-02-10 17:34:42 +00:00
jdlrobson
f7693089a8 Introduce SkinComponent and SkinComponentLogos
In preparation for refactoring SkinTemplate so that SkinMustache
extends Skin rather than SkinTemplate, we take the opportunity
to reorganize the skin code around the concept of components.

Going forward a skin will consist of multiple components, each
of which must return template data that can be passed to an
associated template.

This will result in code that is easier to work with, compared
with the existing 3000 line skin class.

This is the beginning of that journey. Other components will follow
while maintaining backwards compatibility.

Bug: T263213
Change-Id: Ib62724c24601e04aa13ab09b3242e70d7d6436ca
2022-02-10 08:10:24 -08:00
Petr Pchelko
ef73bfafd9 Add simple configuration doc generator
This is a first draft of the configuration doc renderer.
The resulting markdown certainly needs some love, but
we can work on improvements incrementally. This gives
us a baseline to reference on doc.wikimedia.org

Bug: T296647
Change-Id: I3c426b9fc37b1cf7ce8423969b2d7589767ee6cc
2022-02-09 07:09:32 -08:00
Brian Wolff
b64e9d17af Rename ForeignRepoWithMWAPI -> IForeignRepoWithMWApi
Per Krinkle. This was just introduced and not included in any
released version of MW, so we don't have to worry about back-compat.
However, this patch should be merged at the same time as the
corresponding patch for TimedMediaHandler.

Change-Id: Ife41cabb0f5a6b89b160aec9a123220275edf914
2022-02-08 08:11:51 -08:00
Petr Pchelko
d54e3d216c Add pre-generator for config-schema.php
Bug: T300129
Change-Id: Ib2620993114af5a659bda60dc45b6fb3bed657b0
2022-02-03 13:27:47 +00:00
jenkins-bot
2becc606d3 Merge "Try not to discard Excimer timeout exceptions" 2022-02-03 00:28:16 +00:00
Tim Starling
ca71e69fc6 Try not to discard Excimer timeout exceptions
Don't catch and discard exceptions from the RequestTimeout library,
except when the exception is properly handled and the code seems to be
trying to wrap things up.

In most cases the exception is rethrown. Ideally it should instead be
done by narrowing the catch, and this was feasible in a few cases. But
sometimes the exception being caught is an instance of the base class
(notably DateTime::__construct()). Often Exception is the root of the
hierarchy of exceptions being thrown and so is the obvious catch-all.

Notes on specific callers:

* In the case of ResourceLoader::respond(), exceptions were caught for API
  correctness, but processing continued. I added an outer try block for
  timeout handling so that termination would be more prompt.
* In LCStoreCDB the Exception being caught was Cdb\Exception not
  \Exception. I added an alias to avoid confusion.
* In ImageGallery I added a special exception class.
* In Message::__toString() the rationale for catching disappears
  in PHP 7.4.0+, so I added a PHP version check.
* In PoolCounterRedis, let the shutdown function do its thing, but
  rethrow the exception for logging.

Change-Id: I4c3770b9efc76a1ce42ed9f59329c36de04d657c
2022-02-02 16:27:44 +11:00
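The rethrow pattern described in ca71e69fc6 above, for call sites where the catch cannot be narrowed, can be sketched like this. The example is in Python and uses a hypothetical `RequestTimeout` exception and `parse_timestamp` helper; it illustrates the control flow only and is not code from the RequestTimeout library.

```python
class RequestTimeout(Exception):
    """Hypothetical timeout exception that shares a base class with ordinary errors."""


def parse_timestamp(text, parser):
    """Return parser(text), or None if the input is unparseable.

    The broad catch is kept because the parser may raise the base
    exception class, but a timeout is rethrown instead of discarded.
    """
    try:
        return parser(text)
    except RequestTimeout:
        raise  # never swallow the timeout
    except Exception:
        return None
```

Listing the timeout handler first keeps the catch-all from hiding it, so the request still terminates promptly while ordinary parse failures are handled as before.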
jenkins-bot
5a2b4acdae Merge "Add hook UserEditCountUpdate" 2022-02-01 20:13:15 +00:00
jenkins-bot
491945c9d7 Merge "Add a new interface ForeignRepoWithMWApi" 2022-01-31 22:31:37 +00:00
Tim Starling
960a7925ae Add hook UserEditCountUpdate
Add a hook which runs at the end of UserEditCountUpdate. The idea is to
allow CentralAuth to keep its own similar edit count. There's no
other hook appropriate for this purpose since
UserEditTracker::incrementUserEditCount() has multiple callers.

Convert the private associative array in UserEditCountUpdate to a class.
Instances of the class are passed to the new hook. A class is better
suited to a public interface.

Bug: T300075
Change-Id: I16a92e6a6e9faf1be4c7fbe25354a08559df163d
2022-01-31 17:00:49 +11:00
jenkins-bot
62b141bf1d Merge "rdbms: Introduce TransactionManager class to move out the logic" 2022-01-28 19:17:56 +00:00
Amir Sarabadani
83adf1eb88 rdbms: Introduce TransactionManager class to move out the logic
This makes the Database class smaller and encapsulates the transaction
logic, making it easier to understand and change.

Bug: T299698
Change-Id: I409a474209ab4b714c5f62e5e7c0b7a62b9e82c1
2022-01-28 19:02:17 +00:00
Brian Wolff
64ede1431a Add a new interface ForeignRepoWithMWApi
Right now, TimedMediaHandler hardcodes looking for ForeignAPIRepo
class name to do some api queries. This means if you make an extension
file repo that supports making api queries that you want to work
with TimedMediaHandler, you need to subclass ForeignAPIRepo.

This isn't very friendly to extension developers. Instead introduce
an interface that extension file repos can implement, so that they
can mark their repos as working for this purpose.

Change-Id: Iadcb831442529d0b9443a422069ada3ce9eac0dd
2022-01-28 08:42:36 +00:00
Aaron Schulz
1cc67dccc7 rdbms: remove deprecated DBAccessBase class
Change-Id: I6dfa4c3b5e1a456565daab58a3252feaa0e05cf1
2022-01-26 17:08:14 -08:00
jenkins-bot
c60409405b Merge "Move LinksUpdate and LinksDeletionUpdate into the new namespace" 2022-01-10 19:17:10 +00:00
jenkins-bot
47705a46e6 Merge "Remove old orphans.php script" 2022-01-06 12:11:15 +00:00
Tim Starling
682aad7557 Move LinksUpdate and LinksDeletionUpdate into the new namespace
Change-Id: I5cf7a08324d08aa89c23540222ba8eddc1ae2647
2022-01-04 15:35:57 +11:00
Aaron Schulz
2ab5e58857 Remove old orphans.php script
This script from ~2005 is not runnable on large wikis, is partly redundant
given deleteOrphanedRevisions.php, and is pretty scary. It also is one
of the few callers of lockTables().

Bug: T294969
Change-Id: I8d4608c51ce83ac2e221a91727959b8c1df13db7
2021-12-17 15:55:20 -08:00
Tim Starling
d7cb90dc5e Add a ParserModifyImageHTML hook for PageImages
PageImages is expensively loading and reparsing the lead section during
LinksUpdate because the parser's image link hooks do not give enough
context to tell whether the image is in the lead section.

So, add a hook which allows PageImages to add a marker to image links.
The marker can be associated with the section number in ParserAfterTidy.

Bug: T296895
Bug: T176520
Change-Id: I24528381e8d24ca8d138bceadb9397c83fd31356
2021-12-15 15:33:16 +11:00
Tim Starling
e85d532aa2 RemoteIcuCollation
Add a collation that gets its data from a remote Shellbox instance. This
is meant as a migration helper to use during an ICU upgrade.

Add a batch method to Collation so that this can be somewhat efficient
when adding multiple categories.

Bug: T263437
Change-Id: I76610d251fb55df90c78acb9f59fd81421f876dd
2021-12-13 22:13:10 +00:00
Amir Sarabadani
9bcd3fdfa5 Remove ActionAjax
Bug: T42786
Change-Id: I8bda0c281e1f4abbffbddb80ac74a6d61a034d28
2021-12-01 22:31:30 +01:00
jenkins-bot
7205783166 Merge "Rename Special:Delete/Protect to Special:DeletePage/ProtectPage" 2021-11-18 18:52:30 +00:00
Timo Tijhof
6f061f4db2 resourceloader: Bundle user.defaults as part of mediawiki.base
== Background ==

The `user.options` module is private, and thus has to be embedded in
the page HTML. This data is quite large. For example, on enwiki the
finalized mw.user.options object is about 3KB serialized/compressed
(7KB uncompressed).

The `user.defaults` module is an implementation detail of
`user.options`, and was created to accomplish mainly two things:

* Save significant data transfers by allowing it to be cached
  client-side without being part of the article.
* Ensure consistency between articles and allow faster deployment of
  changes, by not being part of the cacheable article HTML.

All our pageviews already load `user.defaults`, as a dependency of
the popular `mediawiki.api` and `mediawiki.user` modules. These are
used by `mediawiki.page.ready` (queued on all pages), and on Wikipedia
these are also loaded on all pages by ULS, VisualEditor, EventLogging,
and more.

As such, in practice, bundling "user.defaults" with "mediawiki.base"
will not cause the data to be loaded more often than before.

== What ==

* Add virtual "user.json" package file with the same data that
  was previously exported by ResourceLoaderUserDefaultsModule,
  and pass it to mw.user.options.set() from base module's entry point.

  An alternative way would be to use a "user.js" file, which would
  return a generated "mw.user.options.set()" expression. I went
  for exporting it as JSON for improved maintainability (reducing
  the amount of JS code written in PHP), and because it performs
  slightly better. The JS file would implicitly come with a file
  closure (tiny bit more bytes), and would then be lazy executed
  (tiny bit more time).

  The chosen approach allows the browser to compile the JSON
  off-the-main-thread ahead of time while the module response downloads.
  Then when the module executes, we can reference the JSON object
  and use it directly.

* Update internal dependency from `user.options`.

* Remove `user.defaults` module without deprecation. It is an internal
  module with no direct use anywhere in Git (Codesearch), and no use
  anywhere on-wiki (Global Search).

Change-Id: Id3916f94f75078808951863dea2b3a9c71b0e30c
2021-11-18 04:41:09 +00:00
Alexander Vorwerk
439838c33f Rename Special:Delete/Protect to Special:DeletePage/ProtectPage
Otherwise they are using the 'delete' and 'protect' message keys,
which are already defined.

Follow-Up: Idf076573c0f429171221660145b616ec83516a2a
Bug: T295611
Change-Id: I797f0e23ca639dac6889f3f363b7a8c72e30f681
2021-11-18 02:12:20 +01:00
Alexander Vorwerk
f172fd717c Create redirect Special Pages for delete and protect action
Adding Special:Delete and Special:Protect as redirects to the delete and
protect action.

Bug: T295611
Change-Id: Idf076573c0f429171221660145b616ec83516a2a
2021-11-13 13:21:49 +00:00