Commit graph

24 commits

Author SHA1 Message Date
Brad Jorsch
42b14d98b5 MigrateActors: Improve query for log_search rows
The query in question is, generically,

 SELECT ls_value, ls_log_id, actor_id
  FROM log_search LEFT JOIN actor ON (ls_value = actor_user)
  WHERE ls_field = 'target_author_id'
  ORDER BY ls_value, ls_log_id
  LIMIT 100;

The intention is that it'll pull out 100 rows from log_search using its
primary key, and for each of those 100 rows find the actor_id for the
referenced user_id (using the actor_user index) if an actor ID exists.

The twist comes in the fact that ls_value is a string-type column while
actor_user is an integer-type. MySQL doesn't usually care, but other DBs
do so we have to cast one into the other.

Currently the code is casting actor_user to a string. But that means the
DB can't use the index on actor_user to find the one matching row;
instead, it needs to scan the whole table.

The fix is simple enough: instead of casting actor_user to a string,
cast ls_value to an integer. That allows the actor_user index to be used
as expected.
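
The effect described above can be sketched outside MediaWiki. The
following is a minimal illustration using Python's sqlite3 module as a
stand-in for the real database; the simplified schema, the index name
actor_user_idx and the helper actor_plan() are assumptions made for the
example, and other database engines will report their plans differently.

```python
import sqlite3

# Simplified, assumed schema: only the columns the query touches.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE actor (
        actor_id   INTEGER PRIMARY KEY,
        actor_user INTEGER
    );
    CREATE UNIQUE INDEX actor_user_idx ON actor (actor_user);
    CREATE TABLE log_search (
        ls_field  TEXT NOT NULL,
        ls_value  TEXT NOT NULL,
        ls_log_id INTEGER NOT NULL,
        PRIMARY KEY (ls_field, ls_value, ls_log_id)
    );
    """
)


def actor_plan(join_condition):
    """Return the query-plan lines that mention the actor table."""
    sql = (
        "EXPLAIN QUERY PLAN "
        "SELECT ls_value, ls_log_id, actor_id "
        "FROM log_search LEFT JOIN actor ON (" + join_condition + ") "
        "WHERE ls_field = 'target_author_id' "
        "ORDER BY ls_value, ls_log_id LIMIT 100"
    )
    # The fourth column of each plan row holds the human-readable detail.
    return [row[3] for row in conn.execute(sql) if "actor" in row[3]]


# Casting the indexed column hides it from the planner.
plan_cast_column = actor_plan("ls_value = CAST(actor_user AS TEXT)")
# Casting the other side leaves actor_user as a plain column reference.
plan_cast_value = actor_plan("CAST(ls_value AS INTEGER) = actor_user")

print("cast actor_user:", plan_cast_column)
print("cast ls_value:  ", plan_cast_value)
```

In SQLite the first plan reports a scan of actor, while the second
reports an index search keyed on actor_user, which mirrors the
behaviour the commit message describes.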

Bug: T215525
Change-Id: I2f7a6ba9fd537336594088a0281a62ea5601cd59
2019-04-02 10:37:24 -04:00
Brad Jorsch
a3c101ee07 Fix typo in MigrateActors.php
Change-Id: Ic08210a8cd394f6ad49673a2d2e4800e6bcf2989
2019-04-02 09:49:00 -04:00
jenkins-bot
d94c86dbac Merge "make xml abstracts, stubs and page log dumps work again" 2019-03-24 20:36:56 +00:00
Thiemo Kreuz
9314453c93 Make use of the list() feature where it makes sense
This code is functionally identical, but less error prone (not so easy
to forget or mix these numerical indexes).

This patch happens to touch the Parser, which might be a bit scary. We
can remove this file from this patch if you prefer.

Change-Id: I8cbe3a9a6725d1c42b86e67678c1af15fbc5961a
2019-03-24 20:12:23 +00:00
Ariel T. Glenn
a397704f82 make xml abstracts, stubs and page log dumps work again
Broken in I979b6c8f0a72bc1f5ecce1d499d3fdfa0f671588

Bug: T174031
Change-Id: I494fe7578f936a2316c27f9c419e981055c38ed4
2019-03-24 13:43:37 +02:00
jenkins-bot
b9789f9c56 Merge "add lbzip2 output processor for exports" 2019-03-23 23:35:29 +00:00
Ariel T. Glenn
b01ff36537 add lbzip2 output processor for exports
Bug: T214293
Change-Id: I98e26b833df473bbeb3dc1b881f428174d776b64
2019-03-24 01:20:04 +02:00
Brad Jorsch
23b5c0891a RevDel: Avoid log_search rows with empty values for target_author_actor
During migration, RevDel may wind up being used on items where an actor
has not been assigned yet. The code creating log_search rows for
target_author_actor needs to take this into account.

Also, to clean this up on Wikimedia wikis, I've added code to
MigrateActors to delete these rows before (re-)migrating log_search
and a --tables option so a re-run can skip trying to process all the
already-processed tables (cf. T188327#4892827).

Bug: T215525
Change-Id: Ica15e2e30445e23761e6d3d6405b3eb39a086161
2019-03-21 16:42:48 -04:00
daniel
45f3912bf1 Make the XML dump schema version configurable.
Bug: T174031
Change-Id: I979b6c8f0a72bc1f5ecce1d499d3fdfa0f671588
2019-03-21 12:43:32 +01:00
Brian Wolff
1af807c10f Various fixes for phan-taint-check
Change-Id: I56f42ef2d2e9b4f3c23e1e93d1a4d3db64f16de7
2019-03-16 21:12:40 +00:00
Thiemo Kreuz
b7cd670cb7 maintenance: Remove unused code from several maintenance scripts
The most notable removal is done in the orphans script. This code was
really never used. Brion introduced it in 2005, already disabled.

I have all the respect for what Brion did. I just think it does not make
much sense to keep code around for so long if it does not work and
must be rewritten from scratch anyway now that we have multi-content
revisions and such.

Change-Id: I4e8050929f90e44a6e6051bf938993a8b0cdf649
2019-03-03 16:57:19 +00:00
Thiemo Kreuz
007bfbf835 maintenance: Add missing limit parameters to some explode()
In theory, a missing limit is a loophole that can not only cause such
code to consume surprising amounts of memory and runtime, but can also
create surprising results. For example, an input like

 -param="might contain a = char"

might result in a cut-off value.

Not so much of a problem in a maintenance script. But still good
practice, I find.
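
The cut-off can be reproduced in a few lines. This sketch uses Python's
str.split() as an analogue of PHP's explode(); it is not the code from
the patch, and the variable names are made up for the example. Note that
the two APIs count differently: explode() with a limit of 2 returns at
most two pieces, which corresponds to maxsplit=1 in Python.

```python
arg = '-param="might contain a = char"'

# Unbounded split: every "=" is a separator, so the value is broken up.
parts = arg.split("=")
name_cut, value_cut = parts[0], parts[1]  # the third piece is silently dropped

# Bounded split: only the first "=" separates the name from the value.
name, value = arg.split("=", 1)

print(repr(value_cut))
print(repr(value))
```

With the bounded split, the value keeps its embedded "=" and the number
of pieces no longer depends on the input.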

Change-Id: I14fb278e6fdb61d0c486ca7e23229851ea479408
2019-03-01 17:17:40 +00:00
Umherirrender
c242c67803 Add missing use for IMaintainableDatabase
Change-Id: I00b30466fa6044988768493586993c3db253c975
2019-02-20 20:57:18 +01:00
Brad Jorsch
aa83f3a6bd MigrateActors: Don't delete log_search rows when migrating
When I4764c1c78 switched from being run during read-both/write-new to
write-both/read-old, we should have also removed the code that
blanked/deleted the old rows. That was done for the main migration, but
was overlooked for log_search.

Bug: T215464
Change-Id: Icbba54dbd57fe0fa07ea0f6dcdde30089f067ace
2019-02-15 14:54:29 -05:00
Matěj Suchánek
93aeb14e08 Move migrateActors.php to includes
This way it can be subclassed in extensions, like AbuseFilter
(Ic755526d5f989c4a66b1d37527cda235f61cb437).

Bug: T188180
Change-Id: Idf320232011c72e39267b1f3c39848aea35d37fe
2019-02-09 11:42:38 +01:00
Derick Alangi
027fb1c8cd Fix condition if...else in getDB() & PHPDoc comment for getUserDB()
The conditional check should by default return $this->mDb if it's not
null, so the else seems not to be needed(?). If we have a database handle
to process the current batch, $this->getDB() will return an
IMaintainableDatabase, but if it's not available (null), a call to
$this->getDB() will return an instance of \Wikimedia\Rdbms\Database instead.

In accordance with the documentation (phpdoc), update the method getUserDB()
to be compliant with the callers' return type.

Change-Id: I95f3407dd2ffe8e4a1ad7a70be86b6cf3b65ff50
2019-02-06 09:50:59 +00:00
Thiemo Kreuz
d62f4688e8 Use the ?: shortcut from PHP 5.3 where it makes sense
Change-Id: Ieff70f23b19f0be3670c4ed3e2a5c30ef3792d7f
2019-01-12 21:56:41 +00:00
jenkins-bot
e53f26bde1 Merge "Replace WikiExporter streaming (unbuffered) mode with batched queries" 2018-10-02 05:16:07 +00:00
Amir Sarabadani
609d1fa001 Add waitForReplication in DeleteLocalPasswords
It almost brought commonswiki down when it was run

Bug: T201009
Change-Id: Ia825f9572b8c71c5627eb627c58f51a689c2f8aa
2018-10-01 13:42:12 +00:00
Bill Pirkle
085b6e4787 Replace WikiExporter streaming (unbuffered) mode with batched queries
WikiExporter allows streaming mode, using unbuffered mode on
the database connection. We are moving away from this technique.
Instead, do multiple normal queries and retrieve the
information in batches.
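
As a rough illustration of the batching idea (not WikiExporter's actual
implementation), the sketch below pages through a table with several
ordinary queries, each resuming after the last key seen. It uses
Python's sqlite3 module; the table demo_rows, its columns and the helper
iter_batches() are hypothetical names chosen for the example.

```python
import sqlite3


def iter_batches(conn, batch_size):
    """Yield lists of (row_id, payload), at most batch_size rows each."""
    last_id = 0
    while True:
        # A normal, fully buffered query that resumes after the last key.
        rows = conn.execute(
            "SELECT row_id, payload FROM demo_rows "
            "WHERE row_id > ? ORDER BY row_id LIMIT ?",
            (last_id, batch_size),
        ).fetchall()
        if not rows:
            return
        yield rows
        last_id = rows[-1][0]


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE demo_rows (row_id INTEGER PRIMARY KEY, payload TEXT)")
conn.executemany(
    "INSERT INTO demo_rows (row_id, payload) VALUES (?, ?)",
    [(i, "row %d" % i) for i in range(1, 8)],
)

batches = list(iter_batches(conn, 3))
for batch in batches:
    print([row_id for row_id, _ in batch])
```

Each query holds its result only briefly, so no unbuffered result set
has to stay open on the connection while rows are processed.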

Bug: T203424
Change-Id: I582240b67c91a8be993a68c23831c9d86617350e
2018-09-28 10:55:05 -05:00
Umherirrender
a4caa4d0c6 build: Updating mediawiki/mediawiki-codesniffer to 22.0.0
Added spaces around .
Removed empty return statements which are not required
Removed return after phpunit markTestIncomplete,
which throws to exit the test, so no return is needed

Change-Id: I2c80b965ee52ba09949e70ea9e7adfc58a1d89ce
2018-09-16 15:51:11 +00:00
Amir Sarabadani
3fc713d455 Fix --user option in DeleteLocalPasswords
Bug: T201009
Change-Id: I69c14741f578b59cd73e5c8c5576f8c250825a30
2018-09-12 13:31:59 +02:00
Timo Tijhof
55875bcd2d maintenance: Move backup.inc to a regular php class file
Move the class to maintenance/includes/, following the precedent
set by d788692076.

Bug: T184782
Change-Id: I0ba86e4401e2c97db4cf2ad9f0e78c04b5565ee8
2018-08-02 17:20:30 +01:00
Gergő Tisza
d788692076 Add maintenance script for deleting local passwords
This is mainly for the benefit of authentication extensions which
all need similar functionality for removing local passwords on a
wiki where local authentication was used for a while but has been
disabled, but can be used directly to just indiscriminately remove
the passwords of all users.

To test the change without irreversibly locking out users, an
option is provided to make the password invalid in an
easy-to-reverse way.

The immediate use case is I974184899c33.

This patch also introduces the maintenance/includes directory
to hold PHP files which are not executable scripts themselves.
(Previously such files had a .inc extension, but that is so PHP4.)

Bug: T57420
Change-Id: If7207b80a2c8374e90182e0b09d8f76ee94264b0
2018-08-02 13:47:33 +00:00