162 commits

Author SHA1 Message Date
Reedy
499dbfb4cb maintenance: Use more of namespaced Maintenance class
Change-Id: I53f2e32c73c92cc3a0deee48ebe6d13329a7a0cf
2024-10-16 01:09:19 +00:00
Bartosz Dziewoński
ecb8b0d689 Specify caller in DB queries
Found warnings about this in WMF production logs.

Change-Id: I3ba973c320d672604c0c0ffa1c229a32231261b9
2024-09-11 15:20:47 +02:00
Dreamy Jazz
e7393b3cc7 Exclude boilerplate maintenance code from code coverage reports
Why:
* Maintenance scripts in core have boilerplate code that is
  added before and after the class to allow directly running
  the maintenance script.
* Running the maintenance script directly has been deprecated
  since 1.40, so this boilerplate code is only to support a now
  deprecated method of running maintenance scripts.
* This code also cannot be marked as covered, because PHPUnit
  does not recognise code coverage for files.
* Therefore, it is best to ignore this boilerplate code in code
  coverage reports as it cannot be marked as covered and also
  is for deprecated code.

What:
* Wrap the boilerplate code (requiring Maintenance.php and then
  later defining the maintenance script class and running if the
  maintenance script was called directly) with @codeCoverageIgnore
  comments.
* Some files use different boilerplate code; however, these
  should also be marked as ignored for coverage, for the same
  reason: coverage is not properly reported for files.

Bug: T371167
Change-Id: I32f5c6362dfb354149a48ce9c28da9a7fc494f7c
2024-08-27 13:22:29 +01:00
Umherirrender
81c6df6a46 maintenance: Use expression builder instead of raw sql
Bug: T361023
Change-Id: Ieb229d8088cb1ff3f03e44f7ac99eb612f48bc7b
2024-07-22 22:29:20 +02:00
Umherirrender
fea5c2f687 Use expression builder to avoid raw sql via BETWEEN operator
Replace BETWEEN with >= and <= operator

Change-Id: Ic21b6f4cc11c773c967d9d4c5f20e762c2ff9629
2024-04-21 14:24:21 +02:00
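
The rewrite above relies on BETWEEN being inclusive on both ends, so a BETWEEN condition selects the same rows as a >= comparison combined with a <= comparison. A minimal sketch of that equivalence, written in Python against an in-memory SQLite database; the table and column names are hypothetical and this is not the MediaWiki expression builder itself:

```python
import sqlite3

# Hypothetical table and column names, used only for this illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE page (page_id INTEGER PRIMARY KEY)")
conn.executemany(
    "INSERT INTO page (page_id) VALUES (?)", [(i,) for i in range(1, 7)]
)

start, end = 2, 4

# Original form: a raw BETWEEN fragment, inclusive on both ends.
between_ids = [row[0] for row in conn.execute(
    "SELECT page_id FROM page WHERE page_id BETWEEN ? AND ? ORDER BY page_id",
    (start, end),
)]

# Rewritten form: two separate comparisons joined with AND.
range_ids = [row[0] for row in conn.execute(
    "SELECT page_id FROM page WHERE page_id >= ? AND page_id <= ? ORDER BY page_id",
    (start, end),
)]

# Both forms select the same rows.
assert between_ids == range_ids
print(between_ids)  # prints [2, 3, 4]
```

Splitting the condition into two simple comparisons is what lets each half be expressed as a structured field/operator/value triple instead of a raw SQL string.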
Umherirrender
8018e157e8 maintenance: Migrate to IDatabase::newUpdateQueryBuilder
Bug: T353219
Change-Id: Ic278c8534dad40a3f34674db2d5fbfbca5984da8
2024-04-14 18:47:55 +00:00
Amir Sarabadani
d9370003fb maintenance: Introduce getReplicaDB() and getPrimaryDB()
And start using them instead of wfGetDB(), LB/LBF connection methods or
worse, $this->getDB().

$this->getDB() reuses the database object regardless of whether you're
calling a replica or primary, leading to returning a replica on a
primary and the other way around.

Bug: T330641
Change-Id: I9e2cf85ca277022284fc26b9f37db57bd12aaa81
2024-01-18 15:12:04 +01:00
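
The problem described above can be modelled as a cache that ignores which role was requested. The following is a simplified Python sketch under that assumption; the class and method names are hypothetical and do not reproduce the Maintenance class:

```python
class NaiveConnectionHolder:
    """Caches one handle and ignores the requested role (the bug pattern)."""

    def __init__(self):
        self._cached = None

    def get_db(self, role):
        # The first call decides which handle every later call receives.
        if self._cached is None:
            self._cached = f"{role}-connection"
        return self._cached


class RoleAwareConnectionHolder:
    """Keeps one handle per role and exposes one accessor per role."""

    def __init__(self):
        self._cached = {}

    def _get(self, role):
        if role not in self._cached:
            self._cached[role] = f"{role}-connection"
        return self._cached[role]

    def get_replica_db(self):
        return self._get("replica")

    def get_primary_db(self):
        return self._get("primary")


naive = NaiveConnectionHolder()
naive.get_db("replica")
# A caller asking for the primary silently receives the replica handle.
wrong = naive.get_db("primary")

aware = RoleAwareConnectionHolder()
replica = aware.get_replica_db()
primary = aware.get_primary_db()
print(wrong, replica, primary)
```

With separate accessors, the role is part of the call itself, so a cached handle for one role can never be returned for the other.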
Amir Sarabadani
5a3e6564e4 maintenance: Migrate to DeleteQueryBuilder
Bug: T353219
Change-Id: Iecb55ab3f905ee9ed4e32e9cbb58c36f8cacf669
2024-01-02 13:13:49 +01:00
Tim Starling
9c02258a04 Use thousands separators in selected integer literals
For readability. Allowed since PHP 7.4.

I searched for integer literals of 6 or more digits, and also changed
some nearby smaller numbers for consistency.

Bug: T353205
Change-Id: I8518e04889ba8fd52e0f9476a74f8e3e1454b678
2023-12-12 09:22:45 +11:00
James D. Forrester
67217d08df Namespace remaining files under includes/deferred
Bug: T166010
Change-Id: Ibd40734b96fd2900e3ce12239d09becfb4150059
2023-11-22 10:08:53 -05:00
Bartosz Dziewoński
c9683b7092 Replace more single-value $db->buildComparison() with $db->expr()
A few more fairly simple cases that don't quite match the regexp in
I2cfc3070c2a08fc3888ad48a995f7d79198cc336 or required other tweaks.

Change-Id: I5438c777344e9ba07f3b62a452fce9ec63baa48a
2023-10-22 01:06:04 +02:00
Bartosz Dziewoński
695a489d5b Re-apply "Remove allowances for missing redirect rows"
This reverts commit c5f4ffd4e6,
re-applies commit b0fe2c4111.

WikiPage::getRedirectTarget() needs to still allow missing rows,
but for a different reason.

Bug: T348881
Change-Id: I6e1fd823fbe140819c28096d5adc41cd15bcc8c0
2023-10-18 23:50:01 +00:00
Bartosz Dziewoński
c5f4ffd4e6 Revert "Remove allowances for missing redirect rows"
This reverts commit b0fe2c4111.

Reason for revert: Causing test failures in the UserMerge extension.

Bug: T348881
Change-Id: I35e82df7a7f95150927dc6e4ad68588c3400b63f
2023-10-13 16:10:34 +00:00
Bartosz Dziewoński
b0fe2c4111 Remove allowances for missing redirect rows
After the other changes in T346290 there must always be a `redirect`
table row for each page with `page_is_redirect=1`. The only place
that needs to handle missing rows is the migration script
fixInconsistentRedirects.php.

Bug: T346290
Change-Id: I7e991aa5a33be37e0d6c9ef0900306706c171466
2023-10-03 19:31:13 +02:00
Bartosz Dziewoński
35596d980a installer: Add database updater for 2008/2011 redirect schema changes
In 2008, the `redirect` table was added, and in 2011, it gained the
fields `rd_interwiki` and `rd_fragment`. We have never performed
proper maintenance for those changes, instead relying on code in
WikiPage to update it when the page was visited, or on an optional
run of refreshLinks.php.

I would like to remove the code in WikiPage, so we probably need to
perform this maintenance in the database updater. You know, for the
millions of people who have been dutifully upgrading their MediaWiki
installations since 2008, but never visited the pages there.

The script is a trimmed-down version of refreshLinks.php, without all
the weird stuff, and using a better index for the queries.

Bug: T346290
Change-Id: Iea251d2737b2fb472c4efb060ad2b97735b4ac53
2023-09-21 20:28:13 -07:00
Derick Alangi
74033c50cd maintenance: Begin using Maintenance::getServiceContainer()
Maintenance class provides a method for getting a fresh reference
of the MW services container instance. Let's make use of these in
maintenance scripts now that we have it.

NOTE: There are still some static methods, like in refreshLinks.php,
that make use of services, for which we can't use this method for now.

Change-Id: Idba744057577896fc97c9ecf4724db27542bf01c
2023-09-04 10:39:58 +00:00
Matěj Suchánek
1c8896a0dd Fix various typos and documentation issues
Change-Id: I2cd4b647c01d84cfe0e1b4d55e155ced8c918b17
2023-08-27 12:05:11 +02:00
Func
b0451af27c refreshLinks: Use join instead of subquery for dfnCheckInterval()
From now on, this script is fully migrated to the select query builder.

Change-Id: I61623632d9f61fadf58bbda62fcd3be38690b641
2023-08-25 22:22:35 +08:00
Func
5618428f63 refreshLinks: Fix refreshing pages in category
Follow up to commit 49e56ae11, the `page_id` field should always be
selected since we use it later.

Sorry, I only noticed this issue when I ran it with the `--verbose` option.

Bug: T344402
Change-Id: Ia8a3affea3324955a94ba5b2cd7a9fb39596cc44
2023-08-24 14:24:04 +08:00
Func
9c55a2613f refreshLinks: Introduce --touched-only option
So that we can fix only pages that have been touched after the last update.

Bug: T344402
Change-Id: I141e8c9c36801373f89141155ed5124ca2234388
2023-08-22 00:28:31 +00:00
Func
aec6191b5e refreshLinks: Skip DFN if the namespace option is given
This feature cannot support querying by namespace: only a few link tables
have the `xxx_from_namespace` field, and we are looking for non-existing
pages.

Bug: T344402
Change-Id: I21485e2ce843489072a0d6dbeec621ceec9fe6ae
2023-08-22 00:28:09 +00:00
Func
49e56ae11d refreshLinks: Improve efficiency of page filtering
Use select queries for page IDs, so we can avoid loading and checking
each page via the WikiPage object.
Also, reused the doRefreshLinks() method for refreshCategory(), so the
fixRedirect() check also works for this case.

Other behavioral changes:
 - Pass the start/end options to deleteLinksFromNonexistent() even when
   not in the `dfn-only` mode, since we may want to run different
   intervals in parallel to save time, and we don't need DFN without
   intervals.
 - fixRedirect() now won't delete entries for nonexistent pages,
   since the page filtering method changed to use select queries,
   and the deleting is covered by deleteLinksFromNonexistent().
 - Removed the clearing of link cache, which was added to control the
   memory usage in 2006, but now LinkCache uses a MapCacheLRU.

Bug: T344402
Change-Id: Iaefeeb0391393a2273edfa0f32d4f75ff4b7b22b
2023-08-22 00:15:46 +00:00
Func
a1a6cf3a19 refreshLinks: Remove unrelated check on the tracking-category option
Tracking category keys are something like `broken-file-category`, not
category page names suitable for Title::makeTitleSafe().

This partially reverts commit 06e2d0e874.

Bug: T331473
Change-Id: Ic744a58ef56981c3aecc4e7cf5322b77894a9249
2023-08-17 10:47:20 +08:00
Umherirrender
6e0065ad20 Simplify WHERE conditions with field IS NULL
Reduce raw sql fragments on simple compares

Change-Id: I3f2340dfdbf5197cc22546911e6c5653dc5a6269
2023-07-24 19:22:36 +02:00
Amir Sarabadani
310333906a maintenance: Switch simple calls of Database::select to SQB
Done semi-automatically via a php parser written on top of ANTLR4.

Bug: T311866
Change-Id: I33f5b6703c0aa9c80c907a21c2a770e30642edd3
2023-07-19 17:42:23 +02:00
David Causse
7cbb253de3 refreshLinks: set a causeAction for SecondaryDataUpdates
Knowing the reason why a secondary update was triggered is useful
information for debugging. Some LinksUpdate hook consumers might also
want to fine-tune their behaviors based on this value.

Change-Id: I19c0620e409b31995080ee0111b0b78782276563
2023-06-12 15:55:34 +02:00
samtar
06e2d0e874 refreshLinks: Add verbose option
Add verbose output to refreshCategory() with total pages to refresh,
and per-page refresh status.
Add `Title::makeTitleSafe` to the passed `tracking-category` before
passing to refreshTrackingCategory().

Bug: T331473
Change-Id: I3234b1560156813b95355754de2212508f7ee6af
2023-03-07 19:47:25 +00:00
Umherirrender
17fdcb4c3d TrackingCategories: Change doc from Title to LinkTarget
Repair maintenance script (from cba68e4) while testing

Follow-Up: I697ce188a912e445a6a748121575548e79aabac6
Change-Id: Id0cc2cafbe5780f11855d0cf608296f2b331e1ee
2023-03-02 20:51:40 +01:00
James D. Forrester
ad06527fb4 Reorg: Namespace the Title class
This is moderately messy.

Process was principally:

* xargs rg --files-with-matches '^use Title;' | grep 'php$' | \
  xargs -P 1 -n 1 sed -i -z 's/use Title;/use MediaWiki\\Title\\Title;/1'
* rg --files-without-match 'MediaWiki\\Title\\Title;' . | grep 'php$' | \
  xargs rg --files-with-matches 'Title\b' | \
  xargs -P 1 -n 1 sed -i -z 's/\nuse /\nuse MediaWiki\\Title\\Title;\nuse /1'
* composer fix

Then manual fix-ups for a few files that don't have any use statements.

Bug: T166010
Follows-Up: Ia5d8cb759dc3bc9e9bbe217d0fb109e2f8c4101a
Change-Id: If8fc9d0d95fc1a114021e282a706fc3e7da3524b
2023-03-02 08:46:53 -05:00
thiemowmde
d923d7aa8d Use more narrow database interfaces in maintenance scripts
This makes this code easier to read and to maintain because it's more
obvious why a DB connection is passed. For now this patch focuses
exclusively on private methods.

Change-Id: Id60dc90b124f4cae1dfbede990f45e3c69491a25
2023-02-27 15:58:37 +00:00
jenkins-bot
ae5b8b774b Merge "Add --before-timestamp option to refreshLinks.php"
2023-02-21 02:35:23 +00:00
Kunal Mehta
058ec520c5 Add --before-timestamp option to refreshLinks.php
We want to be able to refresh pages that haven't been updated in quite a
while to ensure any MediaWiki parsing changes, etc. get reflected in
links tables. This information is already tracked in the
page.page_links_updated field, which we'll filter with. We can't
actually do it in SQL because there's no index on the column, but it
gets loaded by WikiPageFactory::newFromID(), so we simply check it
in the fixRedirect() and fixLinksFromArticle() functions.

== Test plan ==
* Run plain `refreshLinks.php` and see that all page_links_updated
  fields have been updated to now (this patch doesn't break existing
  functionality).
* Backdate some page_links_updated fields.
* Run refreshLinks.php --before-timestamp X, where X is between your
  backdated values and now.
* Observe that only the backdated pages have had their
  page_links_updated modified to now.

Bug: T159512
Change-Id: I695d971ef7cbabddda3125361975be0f94dabf4c
2023-02-19 17:42:03 -05:00
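
The filtering described above happens in application code rather than in SQL, because the column has no index. A minimal Python sketch of that idea, assuming timestamps are 14-digit `YYYYMMDDHHMMSS` strings (which sort chronologically as plain strings) and assuming that a page with no recorded value counts as needing a refresh; the function name, sample data and the handling of missing values are hypothetical and not taken from refreshLinks.php:

```python
from typing import Optional


def needs_refresh(page_links_updated: Optional[str], before_timestamp: str) -> bool:
    """Return True if the links were last updated before the cutoff.

    Assumption: None (never refreshed) is treated as needing a refresh.
    """
    if page_links_updated is None:
        return True
    # Fixed-width digit strings compare in chronological order.
    return page_links_updated < before_timestamp


# Hypothetical rows: page ID mapped to its page_links_updated value.
pages = {
    1: "20200101000000",  # backdated
    2: "20230219120000",  # recently refreshed
    3: None,              # never refreshed (assumed to qualify)
}

cutoff = "20220101000000"
to_refresh = [pid for pid, ts in pages.items() if needs_refresh(ts, cutoff)]
print(to_refresh)  # prints [1, 3]
```

This mirrors the test plan: only pages whose recorded value is older than the cutoff are selected, and recently refreshed pages are skipped.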
Kunal Mehta
2e82f45606 refreshLinks: Use namespaceCond()
It does the same thing.

Change-Id: I9c3554487e0207ef6df95cbde309d36cc610aa05
2023-02-18 12:56:10 +00:00
Timo Tijhof
24509f97c0 ParserCache: Improve docs for cache type and purgeParserCache.php
Change-Id: I3301ff90a135bd0103c1dfc7e86f1dd1ba245a5a
2023-01-30 20:49:52 +00:00
jenkins-bot
d6c13414b8 Merge "Use buildComparison() instead of raw SQL in more maintenance scripts"
2022-12-01 14:51:58 +00:00
Bartosz Dziewoński
865002b57c Use buildComparison() instead of raw SQL in more maintenance scripts
Bug: T321422
Change-Id: Ibe46e5df64a3a6a6e8042a56e10aa286dd3797dd
2022-11-15 09:54:05 +01:00
Umherirrender
ea5ea60b31 Various doc fixes about false on method arguments/return types
Doc-only changes

Change-Id: I5177f582ae7ee70c357e9389fed14819faf79463
2022-11-10 19:23:46 +00:00
Amir Sarabadani
b525884e11 maintenance: Use $this->waitForReplication()
This adds reconfiguring db pools in case a replica gets depooled

Bug: T298485
Change-Id: Id052ce8ed45c51e51b071778858d27b48605bf93
2022-10-24 21:11:53 +02:00
Thiemo Kreuz
67c56155c7 Replace trivial usages of code in strings with concatenation
This is really hard to read. What is code, what is string? These
places are so simple, they really don't need the "{$var}" syntax.

Change-Id: I589dedb8c0193eec4eef500bbb896b5b790b727b
2022-08-26 12:26:44 +00:00
Derick Alangi
8fe9e0317f Introduce Redirect(Lookup&Store) services to handle redirects
The concept of a redirect chain didn't really work for a value of
max redirect > 1. In the ideal world, we just want to have a source
which points to target (source -> target) discarding the concept of
a redirect chain completely.

Having something like: source -> target -> target1 -> target2 doesn't
really work well with the current database design.

NOTE: Support for $wgMaxRedirect will be removed soon hence
deprecation without interfaces for replacement.

Bug: T290639
Change-Id: I469de6f85e405e8ddbe7abaa5b99b77cb9cf415d
2021-12-01 19:14:22 +01:00
DannyS712
b9663bed45 Convert TrackingCategories to a service with DI
Bug: T247194
Change-Id: I50012e2a5e65aeee7671023d2fd5367e21e8ae67
2021-10-08 16:36:20 -04:00
James D. Forrester
df5eb22f83 Replace uses of DB_MASTER with DB_PRIMARY
Just an auto-replace from codesniffer for now.

Change-Id: I5240dc9ac5929d291b0ef1c743ea2bfd3f428266
2021-04-29 09:24:31 -07:00
Umherirrender
62002cdcf1 build: Update mediawiki/mediawiki-codesniffer to 35.0.0
Change-Id: Idb413be4b8cba8611afdc022af59810ce1a4531e
2021-01-31 13:34:38 +00:00
Reedy
cba68e4f02 refreshLinks.php: use hasOption() rather than getOption and assignment in conditional
Change-Id: I0f5bc2117b5d26e10f116b879d60e7c996690463
2021-01-28 23:52:28 +00:00
Umherirrender
a3194f2194 Replace deprecated WikiPage::factory/newFromID in maintenance scripts
Change-Id: I5b2d4313f986484368da9b63c9a19892c2328dae
2020-11-12 21:48:21 +00:00
Umherirrender
5a318bd5bf Pass function name to database functions (maintenance scripts)
Useful for logging

Change-Id: I79fe037abcd74f56c935abc118d706bef0198124
2020-06-07 17:24:10 +00:00
Tim Starling
68c433bd23 Hooks::run() call site migration
Migrate all callers of Hooks::run() to use the new
HookContainer/HookRunner system.

General principles:
* Use DI if it is already used. We're not changing the way state is
  managed in this patch.
* HookContainer is always injected, not HookRunner. HookContainer
  is a service, it's a more generic interface, it is the only
  thing that provides isRegistered() which is needed in some cases,
  and a HookRunner can be efficiently constructed from it
  (confirmed by benchmark). Because HookContainer is needed
  for object construction, it is also needed by all factories.
* "Ask your friendly local base class". Big hierarchies like
  SpecialPage and ApiBase have getHookContainer() and getHookRunner()
  methods in the base class, and classes that extend that base class
  are not expected to know or care where the base class gets its
  HookContainer from.
* ProtectedHookAccessorTrait provides protected getHookContainer() and
  getHookRunner() methods, getting them from the global service
  container. The point of this is to ease migration to DI by ensuring
  that call sites ask their local friendly base class rather than
  getting a HookRunner from the service container directly.
* Private $this->hookRunner. In some smaller classes where accessor
  methods did not seem warranted, there is a private HookRunner property
  which is accessed directly. Very rarely (two cases), there is a
  protected property, for consistency with code that conventionally
  assumes protected=private, but in cases where the class might actually
  be overridden, a protected accessor is preferred over a protected
  property.
* The last resort: Hooks::runner(). Mostly for static, file-scope and
  global code. In a few cases it was used for objects with broken
  construction schemes, out of horror or laziness.

Constructors with new required arguments:
* AuthManager
* BadFileLookup
* BlockManager
* ClassicInterwikiLookup
* ContentHandlerFactory
* ContentSecurityPolicy
* DefaultOptionsManager
* DerivedPageDataUpdater
* FullSearchResultWidget
* HtmlCacheUpdater
* LanguageFactory
* LanguageNameUtils
* LinkRenderer
* LinkRendererFactory
* LocalisationCache
* MagicWordFactory
* MessageCache
* NamespaceInfo
* PageEditStash
* PageHandlerFactory
* PageUpdater
* ParserFactory
* PermissionManager
* RevisionStore
* RevisionStoreFactory
* SearchEngineConfig
* SearchEngineFactory
* SearchFormWidget
* SearchNearMatcher
* SessionBackend
* SpecialPageFactory
* UserNameUtils
* UserOptionsManager
* WatchedItemQueryService
* WatchedItemStore

Constructors with new optional arguments:
* DefaultPreferencesFactory
* Language
* LinkHolderArray
* MovePage
* Parser
* ParserCache
* PasswordReset
* Router

setHookContainer() now required after construction:
* AuthenticationProvider
* ResourceLoaderModule
* SearchEngine

Change-Id: Id442b0dbe43aba84bd5cf801d86dedc768b082c7
2020-05-30 14:23:28 +00:00
Reedy
c7eb28aac9 Fix various MediaWiki.WhiteSpace.SpaceBeforeSingleLineComment.NewLineComment
Change-Id: I50c7c93f1534e966224f98a835ca01f93eb9416d
2020-05-21 01:06:05 +00:00
Reedy
f648dd236e Fix PSR12.Properties.ConstantVisibility.NotFound in maintenance/
Change-Id: Ib0f081f7b278fdd3f4083fc5020bcac97f6015b4
2020-05-09 23:54:58 +00:00
Reedy
8ba1c75559 Replace wfWaitForSlaves() with LBFactory::waitForReplication()
Change-Id: I337147d61e2ec686a8672d0340dff4b6783f78cd
2020-05-02 02:00:01 +00:00