This patch restores the old behavior of omitting the xml:space attribute
on empty <text> tags in stub dumps and suppressed revisions.
Bug: T228763
Change-Id: I12e72a3f4f3583e4e41daa11a9a28a96cadf7725
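As a rough illustration of the xml:space behavior described above, here is a minimal Python sketch (not the actual XmlDumpWriter code). The function name `render_text_tag` and its parameters are hypothetical; the only rule taken from the commit message is that an empty <text> tag carries no xml:space attribute.

```python
from xml.sax.saxutils import escape, quoteattr


def render_text_tag(content, attrs=None):
    """Render a <text> element; hypothetical helper for illustration only."""
    attrs = dict(attrs or {})
    if content:
        # Whitespace only matters when there is content to preserve.
        attrs["xml:space"] = "preserve"
    attr_str = "".join(f" {name}={quoteattr(str(value))}" for name, value in attrs.items())
    if not content:
        # Empty tags (stubs, suppressed revisions) get no xml:space attribute.
        return f"<text{attr_str} />"
    return f"<text{attr_str}>{escape(content)}</text>"


stub = render_text_tag("", {"id": 42})
full = render_text_tag("a<b")
```

Which other attributes the real writer emits is not stated here, so the sketch passes them through unchanged.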
WikiExporter used to require SCHEMA_COMPAT_WRITE_OLD to be enabled,
until that requirement was fixed in I5ea972bb07ca1cfb3a2ad8ef120aef7.
However, I failed to remove the explicit check for the flag at the
time, causing all exports to fail in SCHEMA_COMPAT_NEW mode. This
change removes the obsolete check.
Bug: T236735
Change-Id: I809ed4e2f1f30fdc4bd817f815d733d8a62f3d4f
These were all checked with codesearch to ensure nothing is overriding
these methods.
For the most part, I've updated the signatures to use nullable types; for
two Pagers, I've just made all parameters non-optional, because the
required parameter at the end already forces callers to pass them.
Bug: T231636
Change-Id: Ie047891f55fcd322039194cfa9a8549e4f1f6f14
The base implementation says it can accept an array with a single
element, but the subclasses only had `string` in the docblock (although
they could handle the array case). Hence, replace docblocks in
subclasses with @inheritDoc to copy the parent description and avoid
such discrepancies in the future.
Plus, change `array` to `string[]` for better type inference.
Change-Id: Ica9929fd50f31d8d5f0e29f7c60364086ea39ae5
Using * in a SELECT is not the preferred way.
List all needed columns to make their use visible and to avoid issues when
new fields holding large data get added.
As each column name is unique, there is no need to get the table name for
prefixing the columns.
The following columns are no longer selected:
- log_user_text -> not used due to use of ActorMigration class
- log_actor -> Added by ActorMigration class
- log_comment_id -> Added by CommentStore
- log_page -> Unused in the writer, the ns/title pair is used instead
Move the arrays out of the loop, because they do not depend on
values that change in the loop.
Change-Id: I140641b7ed75bc2b8db2e7612020d668f1be663b
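The explicit column list and the hoisting of loop-invariant arrays described above can be sketched roughly as follows. This is a Python/sqlite3 illustration, not the MediaWiki code; the `LOG_FIELDS` subset and the `fetch_log_rows` helper are assumptions made for the example.

```python
import sqlite3

# Built once, outside any loop, because it never changes between iterations.
# Hypothetical subset of the logging columns, for illustration only.
LOG_FIELDS = ("log_id", "log_type", "log_action", "log_timestamp")


def fetch_log_rows(conn):
    """Select only the named columns instead of using SELECT *."""
    sql = f"SELECT {', '.join(LOG_FIELDS)} FROM logging ORDER BY log_id"
    return conn.execute(sql).fetchall()


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE logging (log_id INTEGER, log_type TEXT, log_action TEXT,"
    " log_timestamp TEXT, log_page INTEGER)"
)
conn.execute("INSERT INTO logging VALUES (1, 'delete', 'delete', '20190101000000', 7)")
rows = fetch_log_rows(conn)
```

Columns that exist in the table but are not listed (log_page in this sketch) never reach the caller, so a later schema addition cannot silently inflate the result set.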
This allows us to remove many suppressions for phan false positives.
Bug: T231636
Depends-On: I82a279e1f7b0fdefd3bb712e46c7d0665429d065
Change-Id: I5c251e9584a1ae9fb1577afcafb5001e0dcd41c7
Loading content can also throw InvalidArgumentException when
the cluster address is an unknown cluster.
Bug: T228720
Change-Id: I313f9a5a27b21a33e90639abae3f505640c30e23
In the WMF databases, we have several revisions for which we cannot
load the content. They typically (but not necessarily) have
content_address = "tt:0" and content_sha1 = "" and rev_sha1 = ""
and content_size = 0 and rev_len = 0.
This patch makes sure we can still generate dumps in the presence of
such revisions.
Bug: T228720
Change-Id: Iaadad44eb5b5fe5a4f2e60da406ffc11f39c735b
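One way to keep a dump running when content cannot be loaded, as the two changes above describe, is to catch the load failure per revision and fall back to empty text. The Python sketch below is an assumption about the general shape, not the actual patch: `load_content`, `BlobAccessError`, and `text_for_dump` are hypothetical names, and `ValueError` stands in for PHP's InvalidArgumentException.

```python
import sys


class BlobAccessError(Exception):
    """Hypothetical stand-in for a storage-layer failure."""


def load_content(address, blobs):
    """Hypothetical loader: resolves 'tt:<id>' addresses against a dict."""
    scheme, _, key = address.partition(":")
    if scheme != "tt":
        # Analogous to the unknown-cluster case mentioned above.
        raise ValueError(f"unknown cluster or scheme in {address!r}")
    if key not in blobs:
        raise BlobAccessError(f"no blob for {address!r}")
    return blobs[key]


def text_for_dump(address, blobs, log=sys.stderr):
    """Return the content, or an empty string if it cannot be loaded."""
    try:
        return load_content(address, blobs)
    except (BlobAccessError, ValueError) as err:
        # Report the problem but keep the dump going.
        print(f"warning: cannot load {address}: {err}", file=log)
        return ""


blobs = {"4567": "hello"}
```

Catching both exception types in one place matters here: a revision with a bad address and a revision with a missing blob should both degrade the same way instead of aborting the whole run.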
This introduces a way to construct a RevisionRecord based on a
known set of SlotRecords. To allow this to be used consistently
with the legacy revision schema, some tweaks had to be made
to getSlotsQueryInfo().
Bug: T220493
Change-Id: I5ea972bb07ca1cfb3a2ad8ef120aef77e460745c
These global functions were deprecated in 1.34, and services were made
available to replace them:
* wfFindFile() - MediaWikiServices::getInstance()->getRepoGroup()->findFile()
* wfLocalFile() - MediaWikiServices::getInstance()->getRepoGroup()->getLocalRepo()->newFile()
NOTES:
* wfFindFile() and wfLocalFile() usages in tests have been ignored
in this change per @Timo's comments about state of objects.
* includes/upload/UploadBase.php is also left unchanged for now, as converting
it causes some failures I don't fully understand. I will investigate and handle
it in a follow-up patch.
* The same applies to includes/MovePage.php.
Also drop ordering of revs within pages, since there is only one
revision being dumped.
Bug: T207628
Change-Id: I5e4f0bea7b54506ca389818407c43152a290da6e
We don't alter the db query for this, but throw away the extraneous
rows before doing any processing on them whatsoever.
Use of the DumpNamespaceFilter comes too late to avoid processing
for each revision done in XmlDumpWriter::writeRevision.
Bug: T220940
Change-Id: I9cb30ce612d862d97d96720ac68ff2327409f485
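The early filtering described above (same query, unwanted rows discarded before any per-revision work) might look roughly like this in Python. The row shape and the `rows_in_namespaces` helper are hypothetical; only the idea of dropping rows before processing comes from the commit message.

```python
def rows_in_namespaces(rows, namespaces):
    """Yield only rows whose namespace is wanted, before any processing."""
    wanted = frozenset(namespaces)
    for row in rows:
        # Set membership compares by value and type: "0" does not match 0.
        if row["page_namespace"] in wanted:
            yield row


rows = [
    {"page_namespace": 0, "page_title": "Main_Page"},
    {"page_namespace": 1, "page_title": "Main_Page"},
    {"page_namespace": 10, "page_title": "Infobox"},
]
kept = list(rows_in_namespaces(rows, [0, 10]))
```

Because this is a generator, the expensive per-revision step downstream only ever sees rows that survive the filter.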
...and not as numbers!! Also added strict compare for the namespaces
field while we're in here.
Bug: T220257
Change-Id: If68b79334188c2f3be5d254bea3c1e27d52c4a9f
This makes BackupDumper compatible with the new mechanism for accessing
revision content.
This requires some changes to the way database connections are re-used,
since RevisionStore/SqlBlobStore needs to be able to run queries against
the database while the overall result set is being streamed.
This change does not yet add handling for extra slots to BackupDumper.
That first needs a spec for how extra slots will be represented in the
XML schema (T174031).
NOTE: this changes the output of fetchText from using integer text_id
values to using content_address values (e.g. "tt:4567" for text row
with old_id 4567). It also changes fetchText to accept such addresses
as input, for forward-compatibility. XML stub dumps still use the
numeric format in the id attribute, pending T199121.
Bug: T198706
Change-Id: If4c31b7975b4d901afa8c194c10446c99e27eadf
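To make the NOTE above concrete, here is a small Python sketch of accepting either form of reference and resolving it to the text row's old_id. The function name `text_row_id` is hypothetical, and the assumption that plain integers remain valid input alongside "tt:" addresses is mine, not something the commit message spells out.

```python
import re

# Optional "tt:" prefix followed by ASCII digits only.
_TEXT_REF = re.compile(r"(?:tt:)?(\d+)", re.ASCII)


def text_row_id(ref):
    """Map '4567' or 'tt:4567' to the integer old_id 4567."""
    match = _TEXT_REF.fullmatch(str(ref).strip())
    if match is None:
        raise ValueError(f"unsupported text reference: {ref!r}")
    return int(match.group(1))


legacy = text_row_id(4567)
address = text_row_id("tt:4567")
```

Anything that is neither a bare number nor a "tt:" address raises, so other address schemes fail loudly instead of being misread as a row id.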
For abstracts, this specific case does not matter at all, since the
problem is a self-redirect: redirects are filtered out of the stream
at the end, so it won't even show up.
For everything else, we do what dumpTextPass already does, which is to
leave the text alone and emit it as is.
Bug: T217329
Change-Id: I39cdf89531c67962b1a9bba4e0a91f7c655ad6f3
htmlentities() can output entity references that are invalid in XML.
Use htmlspecialchars() instead.
Additionally, cast user-id to int for phan-taint-check.
Bug: T216348
Change-Id: Idf781f5a3ffc3c6463969b3f5af63f0f08ae837c
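The reason named HTML entities are a problem can be checked with a short Python sketch. XML predefines only amp, lt, gt, quot, and apos, so a reference such as &eacute; is undefined unless a DTD declares it. The helper name `is_well_formed` is hypothetical, and `xml.sax.saxutils.escape` is used here as a rough Python analogue of escaping only the XML-special characters.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape


def is_well_formed(xml_text):
    """Return True if the string parses as XML, False otherwise."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


# What HTML-style entity encoding would produce for "café".
named_entity = "<title>caf&eacute;</title>"
# Escaping only &, < and > leaves other characters as they are.
minimal = "<title>" + escape("café & <b>") + "</title>"
```

The first string is valid HTML but fails an XML parser, while the second parses cleanly because non-ASCII characters are left as literal text.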