The check makes sure that the page id is known for the title
passed to the constructor. LinksUpdate needs to know this id in
order to update the various links tables. If the page id is not
known to the title (e.g. because the page doesn't actually exist)
something is wrong, and LinksUpdate can't operate.
Amend: use MWException instead of assert()
Change-Id: I4873211a71099fe3563b52a53532c95b6a2ff30f
This supercedes I6d03bf2a, using better names for the new classes and
incorporating the changes requested by Aaron.
This change introduces the base class SecondaryDataUpdate to be used for any
updates that need to be applied when a page is changed or deleted. Until now,
this was done by the LinksUpdate class for updates and WikiPage::doDeletionUpdates
upon deletion. This patch uses a list of SecondaryDataUpdates in both cases.
This allows extensions (e.g. via the ContentHandler facility, once that is in) to
easily specify what needs to be done when a page is updated or deleted in order to
keep any secondary data stores (such as link tables) in sync.
Note that limited transactional logic is also introduced, so SecondaryDataUpdate
can be implemented to only commit their changes if all updates were performed
sucessfully.
Patch Set 2: fixing some coding style issues mentioned by Nikerabbit.
Patch Set 4: some stuff I kept from the old LinksUpdate class needs cleanup,
but might break extensions when changed. Marking as todo for now.
Patch Set 5: fixed misnamed member in LinksDeletionUpdate (thanks Aaron).
Change-Id: Ibe3e88fadd8c1d4063cf13bb6972f2a23569a73f
* Introduce new hooks which allow BacklinkCache to handle non-core tables
* Make table name a parameter to RefreshLinks2 job (instead of hardcoded templatelinks)
Breaks unit tests as below, not going to be able to fix them before I disappear for the evening, so might aswell leave trunk clean
ArticleTablesTest testbug14404
Error:
ArticleTablesTest::testbug14404
Undefined offset: 0
/home/ci/cruisecontrol-bin-2.8.3/projects/mw/source/tests/phpunit/includes/ArticleTablesTest.php:31
/home/ci/cruisecontrol-bin-2.8.3/projects/mw/source/tests/phpunit/MediaWikiTestCase.php:60
/home/ci/cruisecontrol-bin-2.8.3/projects/mw/source/tests/phpunit/MediaWikiPHPUnitCommand.php:20
/home/ci/cruisecontrol-bin-2.8.3/projects/mw/source/tests/phpunit/phpunit.php:60
ParserTests testParserTest #552 - testParserTest with data set #551
Failure:
ParserTests::testParserTest with data set #551 ('RAW magic word', '{{RAW:QUERTY}}', '<p><a href="/index.php?title=Template:QUERTY&action=edit&redlink=1" class="new" title="Template:QUERTY (page does not exist)">Template:QUERTY</a>
</p>', '', '')
RAW magic word
Failed asserting that two strings are equal.
--- Expected
+++ Actual
@@ @@
-<p><a href="/index.php?title=Template:QUERTY&action=edit&redlink=1" class="new" title="Template:QUERTY (page does not exist)">Template:QUERTY</a>
+<p><a href="/index.php?title=Template:RAW:QUERTY&action=edit&redlink=1" class="new" title="Template:RAW:QUERTY (page does not exist)">Template:RAW:QUERTY</a>
</p>
/home/ci/cruisecontrol-bin-2.8.3/projects/mw/source/tests/phpunit/includes/parser/NewParserTest.php:545
/home/ci/cruisecontrol-bin-2.8.3/projects/mw/source/tests/phpunit/MediaWikiTestCase.php:60
/home/ci/cruisecontrol-bin-2.8.3/projects/mw/source/tests/phpunit/MediaWikiPHPUnitCommand.php:20
/home/ci/cruisecontrol-bin-2.8.3/projects/mw/source/tests/phpunit/phpunit.php:60
Make it so that by default, the parser consdiers categories to have a sortkey of "" unless
specified otherwise. Before, the parser said the sortkey was the page name, and then in LinksUpdate
we reset the sortkey (prefix) to "" if it was the same as the page name.
This way, the parser uniformly outputs the sortkey prefix, instead of a mix between prefix or pagename.
It should be noted, this changes the output of api.php?action=parse for categories that do not have
a sortkey.
* Added a maintenance script which generates a list of first letters. Unified Han are omitted for performance, and because they shouldn't be used as headings anyway. A future collation specific to Chinese would provide the KangXi radicals as "first letters".
* Provided a precomputed list of first letters. Used Unicode 6.0.0 data and ICU 4.2.
* Moved collation functionality from Language to a Collation class hierarchy with factory function. Removed the recently-added methods from Language and updated all callers.
* Changed Title::getCategorySortkey() to separate its parts with a line break instead of a null character. All collations supported by the intl extension ignore the null character, i.e. "ab" == "a\0b". It would have required a lot of hacking to make it work.
* Fixed the uppercase collation to handle non-ASCII characters, redundantly with r80436. I don't think it's necessary to change the collation name as was done there, so I reverted that in the course of my conflict merge. A --force option to updateCollation.php might be nice though.
Caused by two things:
*after the new category collation stuff, the if this category has changed
code was comparing the user supplied sortkey, with the collation sortable
sortkey, which didn't work. (See also CR for r70415) This is new issue in 1.17
*If the sortkey was too long (long enough to get truncated by DB), then the
did this sortkey change code also failed since it was comparing the
non-truncated sortkey to the truncated sortkey. This issue is present
in older (all previous?) versions of mediawiki. This issue especially affects
non-latin wikis that use 3-byte characters.
This bug primarily is an issue for people using DPL type extension.
I wasn't sure if i should add RELEASE-NOTES, since i'm tagging this 1.17
Per review by Tim, I made two changes:
1) Fix cl_sortkey to be varbinary(255).
2) Expand cl_collation to varbinary(32), and change $wgCollationVersion
to $wgCategoryCollation, to account for the variety of collations we
might have. tinyint is too small. I could have gone with int, but
that's annoyingly inscrutable in practice, as we all know from namespace
fields.
To make the upgrade easier for non-trunk users, I updated the old patch
file to incorporate the new changes, using the updatelog table so that
people upgrading from 1.16 won't have to do two alters on categorylinks.
I didn't test the upgrade-from-1.16 code path yet, so if anyone tests
that and it seems not to break, commenting to that effect would be
appreciated.
Also removed wfDeprecated() from archive(). Do *not* add this to
functions that are still actively used in core. If you think this
function is so terrible that it really mustn't be used, remove callers
yourself, don't pester every single developer with messages in the hope
that someone else will do it for you.
Patch best viewed with whitespace changes ignored. This will doubtless
introduce a bunch of bugs. Please report any so I can fix them. If
they're big enough and the fix isn't obvious, please revert.
As Bawolff pointed out at [[mw:User talk:Simetrical/Collation]], the
prefixing scheme I was using meant that the page "Z" with sort key of
"F" would sort after a page named "A" with a sort key of "FF", since the
first one's raw sort key would compute to "FZ", and the second's would
compute to "FFA". I've fixed this by separating the prefix from the
unprefixed part by a null byte (cl_sortkey is eventually going to be
totally binary anyway, may as well start now).
In response to feedback by Phillipe Verdy on bug 164. Now if a bunch of
pages have [[Category:Foo| ]], they'll sort amongst themselves according
to page name, instead of in basically random order as it is currently.
This also makes storage more elegant and intuitive: instead of giving
NULL a magic meaning when there's no custom sortkey specified, we just
store an empty string, since there's no prefix.
This means {{defaultsort:}} really now means {{defaultsortprefix:}},
which is slightly confusing, and a lot of code is now slightly
misleading or poorly named. But it should all work fine.
Also, while I was at it, I made updateCollation.php work as a transition
script, so you can apply the SQL patch and then run updateCollation.php
and things will work. However, with the new schema it's not trivial to
reverse this -- you'd have to recover the raw sort keys with some PHP.
Conversion goes at about a thousand rows a second for me, and seems to
be CPU-bound. Could probably be optimized.
I also adjusted the transition script so it will fix rows with collation
versions *greater* than the current one, as well as less. Thus if some
site wants to use their own collation, they can call it 137 or
something, and if they later want to switch back to MediaWiki stock
collation 7, it will work.
Also fixed a silly bug in updateCollation.php where it would say "1000
done" if it did nothing, and changed $res->numRows() >= self::BATCH_SIZE
to == so people don't wonder how it could be bigger (since it can't, I
hope).
Hidden behind $wgExperimentalCategorySort until it's reasonably
complete. If that's false, no behavior should change (but I didn't test
carefully, so poke me if there's a bug). See DefaultSettings.php for
documentation on setting it to true. Currently you should not do this
except if you're working on the feature, since functionality is not
close to reasonable yet and will change rapidly.
Bug 1211 is already fixed with this commit for me. However, many other
things still need to be done, so this is all very much a
proof-of-concept.
Like langlinks, this stores the interwiki prefix (as iwl_prefix) and full page title (as iwl_title), attached to the page doing the liking (as iwl_from -> page_id).
Unlike langlinks, there can be multiple entries stored per interwiki prefix.
Updater to add the table confirmed on MySQL, untested on SQLite but should work.
Someone may still need to add and test a PostgreSQL updater.
Refactored makeWhereFrom2d() out of LinkBatch to Database so it could be re-used for the similar mapping for the interwiki links, which need a string prefix rather than an int namespace key.
Also cleaned it up internally to reuse existing code for building where clauses from arrays. (Tim & Domas -- if the previous more verbose code was there to reduce function call and array processing overhead on very large link lists, feel free to unroll it again if the difference is measurable. Just swap the var names around from the old LinkBatch code and escape the base key value if it's not an integer, it'll be functionally equivalent.)