Common utility method for maintenance scripts that's a little more
clever than LBFactory::waitForReplication(). Previously it was
included in Maintenance::commitTransaction but that logs an error
when there is no uncommitted change, and one might want to commit
more often and wait for replicas to catch up only after some amount
of commits.
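
A minimal sketch of the pattern described above, written in Python rather
than PHP. The class name, the `commit` and `wait_for_replication` callables,
and the `wait_every` parameter are all hypothetical stand-ins for
illustration; they are not the actual Maintenance API.

```python
from typing import Callable


class BatchedReplicationWaiter:
    """Commit often, but wait for replicas only every `wait_every` commits.

    Both callables are hypothetical placeholders for whatever the
    maintenance script really uses to commit and to wait for replication.
    """

    def __init__(
        self,
        commit: Callable[[], None],
        wait_for_replication: Callable[[], None],
        wait_every: int = 10,
    ) -> None:
        if wait_every < 1:
            raise ValueError("wait_every must be at least 1")
        self._commit = commit
        self._wait = wait_for_replication
        self._wait_every = wait_every
        self._pending = 0  # commits since the last replication wait

    def commit(self) -> bool:
        """Commit once; return True if a replication wait was performed."""
        self._commit()
        self._pending += 1
        if self._pending >= self._wait_every:
            self._wait()
            self._pending = 0
            return True
        return False
```

Keeping the counter separate from the commit itself is what lets a caller
commit after every batch while paying the replication wait only occasionally.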
Change-Id: I3394536eea01eb982a4a2033fd2062bc67f6bdc1
- in the instructions for how to extract IDs of bad revisions using
grep, the expression was looking for the wrong string.
- output for bad revisions didn't include the timestamp, making it
harder to determine the duration of the problem that caused the
bad revisions.
- documentation advertised YYYY-MM-DD_HH:MM:SS as an allowed date
format, but it wasn't actually supported.
Bug: T272540
Change-Id: Iac0c184c5a7008aec3b0899df30c6fb6644b23d9
The findBadBlobs.php maintenance script unnecessarily required
cleanupTable.inc instead of the typical Maintenance.php. While this
worked (because cleanupTable.inc requires Maintenance.php), it was
slightly confusing and slightly less efficient. Change to just
require Maintenance.php instead.
Bug: T263604
Change-Id: I42dfb5220b701ec90f39e9ad905c1e32c9c28904
This allows bad actor IDs to be overwritten with some default. This
solves the problem of rows in tables like ipblocks, logging, or
revision not being found due to a failing join against the actor table.
Bug: T261325
Change-Id: Ibc554d0b6f52e7b30cdde5138ac165774831ec36
This makes the following changes to the findBadBlobs utility:
- rename --from-date to --scan-from, to match the intended use.
- require the usage of --revisions with --mark, so revisions
cannot be marked directly when found by a scan.
- catch any exception when testing for bad blobs, casting
a wider net.
- change the output format, so the IDs of bad revisions can easily be
extracted by command line tools for further processing.
- warn when trying to mark blobs that can successfully be read.
The idea is to allow detection of blobs that are "bad" in a
large variety of ways, including due to misconfiguration, while at the
same time making sure that blobs do not get marked as bad due to
temporary outages.
The intended usage of findBadBlobs is to first scan a potentially
problematic set of revisions using --scan-from, review the errors found,
and then determine which of the revisions should be marked as bad.
Once the bad revisions have been identified, a list with their IDs
can be extracted from the output, and supplied back to findBadBlobs
via the --revisions option.
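
The extraction step can be sketched as follows. This is an illustration
only: the line format matched here ("! bad revision <id> ...") is a
hypothetical placeholder, since the actual output format of findBadBlobs is
not given in this message, and the function name is likewise invented.

```python
import re
from typing import List

# ASSUMPTION: hypothetical output line format, NOT the real findBadBlobs format.
HYPOTHETICAL_BAD_LINE = re.compile(r"^! bad revision (\d+)\b")


def extract_bad_revision_ids(scan_output: str) -> List[int]:
    """Collect revision IDs from scan output, keeping first-seen order."""
    ids: List[int] = []
    seen = set()
    for line in scan_output.splitlines():
        match = HYPOTHETICAL_BAD_LINE.match(line)
        if match is None:
            continue  # progress messages and other noise are skipped
        rev_id = int(match.group(1))
        if rev_id not in seen:
            seen.add(rev_id)
            ids.append(rev_id)
    return ids
```

The resulting list is meant to be reviewed by a human before any of the IDs
are passed back for marking, which is the point of separating the scan from
the mark step.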
Bug: T251778
Change-Id: I47c11190b665c1dac88db32ee2bf683728cb3dc6
Force the database to use the rev_timestamp index.
MySQL/MariaDB was coming up with very slow query plans.
Bug: T205936
Change-Id: Iab68253c62a51463ba4afd072cd7bff2d1fafdde
When for some reason we can't determine the title for a revision
in the batch, this should not trigger a fatal TypeError, but should be
handled gracefully, with helpful information included in the error message.
Bug: T205936
Change-Id: I0c7d2c1fee03d1c9208669a9b5ad66612494a47c
This adds a --revisions parameter to markBadBlobs.php that can be used to
specify individual revisions, instead of scanning by date.
Bug: T205936
Change-Id: Ie1a907f2c15f1d4a85affff2701ff2289bfa77ea
This script scans for content blobs that can't be loaded due to
database corruption, and can change their entry in the content table
to an address starting with "bad:". Such addresses cause the content
to be read as empty, with no log entry. This is useful to avoid
errors and log spam due to known bad revisions.
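
The effect of a "bad:" address can be sketched like this. The Python
functions below are a simplified model for illustration, not the real blob
store code: `mark_bad`, `load_content`, and the `load_blob` callable are
hypothetical names, and the exact form of the address after the "bad:"
prefix is an assumption.

```python
from typing import Callable

BAD_PREFIX = "bad:"


def mark_bad(address: str) -> str:
    """Return the address with the "bad:" prefix, without doubling it."""
    if address.startswith(BAD_PREFIX):
        return address
    # ASSUMPTION: the original address is simply appended after the prefix.
    return BAD_PREFIX + address


def load_content(address: str, load_blob: Callable[[str], str]) -> str:
    """Read content, treating known-bad addresses as empty.

    `load_blob` is a hypothetical loader that may raise for corrupt blobs.
    Addresses marked as bad never reach it, so no error is raised or logged.
    """
    if address.startswith(BAD_PREFIX):
        return ""
    return load_blob(address)
```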
The script is designed to scan a limited number of revisions from a
given start date. The assumption is that database corruption is
generally caused by an intermittent bug or system failure which will
affect many revisions over a short period of time.
Bug: T205936
Change-Id: I6f513133e90701bee89d63efa618afc3f91c2d2b