* Introduce new hooks which allow BacklinkCache to handle non-core tables
* Make table name a parameter to RefreshLinks2 job (instead of hardcoded templatelinks)
Breaks unit tests as below, not going to be able to fix them before I disappear for the evening, so might aswell leave trunk clean
ArticleTablesTest testbug14404
Error:
ArticleTablesTest::testbug14404
Undefined offset: 0
/home/ci/cruisecontrol-bin-2.8.3/projects/mw/source/tests/phpunit/includes/ArticleTablesTest.php:31
/home/ci/cruisecontrol-bin-2.8.3/projects/mw/source/tests/phpunit/MediaWikiTestCase.php:60
/home/ci/cruisecontrol-bin-2.8.3/projects/mw/source/tests/phpunit/MediaWikiPHPUnitCommand.php:20
/home/ci/cruisecontrol-bin-2.8.3/projects/mw/source/tests/phpunit/phpunit.php:60
ParserTests testParserTest #552 - testParserTest with data set #551
Failure:
ParserTests::testParserTest with data set #551 ('RAW magic word', '{{RAW:QUERTY}}', '<p><a href="/index.php?title=Template:QUERTY&action=edit&redlink=1" class="new" title="Template:QUERTY (page does not exist)">Template:QUERTY</a>
</p>', '', '')
RAW magic word
Failed asserting that two strings are equal.
--- Expected
+++ Actual
@@ @@
-<p><a href="/index.php?title=Template:QUERTY&action=edit&redlink=1" class="new" title="Template:QUERTY (page does not exist)">Template:QUERTY</a>
+<p><a href="/index.php?title=Template:RAW:QUERTY&action=edit&redlink=1" class="new" title="Template:RAW:QUERTY (page does not exist)">Template:RAW:QUERTY</a>
</p>
/home/ci/cruisecontrol-bin-2.8.3/projects/mw/source/tests/phpunit/includes/parser/NewParserTest.php:545
/home/ci/cruisecontrol-bin-2.8.3/projects/mw/source/tests/phpunit/MediaWikiTestCase.php:60
/home/ci/cruisecontrol-bin-2.8.3/projects/mw/source/tests/phpunit/MediaWikiPHPUnitCommand.php:20
/home/ci/cruisecontrol-bin-2.8.3/projects/mw/source/tests/phpunit/phpunit.php:60
Make it so that by default, the parser consdiers categories to have a sortkey of "" unless
specified otherwise. Before, the parser said the sortkey was the page name, and then in LinksUpdate
we reset the sortkey (prefix) to "" if it was the same as the page name.
This way, the parser uniformly outputs the sortkey prefix, instead of a mix between prefix or pagename.
It should be noted, this changes the output of api.php?action=parse for categories that do not have
a sortkey.
* Added a maintenance script which generates a list of first letters. Unified Han are omitted for performance, and because they shouldn't be used as headings anyway. A future collation specific to Chinese would provide the KangXi radicals as "first letters".
* Provided a precomputed list of first letters. Used Unicode 6.0.0 data and ICU 4.2.
* Moved collation functionality from Language to a Collation class hierarchy with factory function. Removed the recently-added methods from Language and updated all callers.
* Changed Title::getCategorySortkey() to separate its parts with a line break instead of a null character. All collations supported by the intl extension ignore the null character, i.e. "ab" == "a\0b". It would have required a lot of hacking to make it work.
* Fixed the uppercase collation to handle non-ASCII characters, redundantly with r80436. I don't think it's necessary to change the collation name as was done there, so I reverted that in the course of my conflict merge. A --force option to updateCollation.php might be nice though.
Caused by two things:
*after the new category collation stuff, the if this category has changed
code was comparing the user supplied sortkey, with the collation sortable
sortkey, which didn't work. (See also CR for r70415) This is new issue in 1.17
*If the sortkey was too long (long enough to get truncated by DB), then the
did this sortkey change code also failed since it was comparing the
non-truncated sortkey to the truncated sortkey. This issue is present
in older (all previous?) versions of mediawiki. This issue especially affects
non-latin wikis that use 3-byte characters.
This bug primarily is an issue for people using DPL type extension.
I wasn't sure if i should add RELEASE-NOTES, since i'm tagging this 1.17
Per review by Tim, I made two changes:
1) Fix cl_sortkey to be varbinary(255).
2) Expand cl_collation to varbinary(32), and change $wgCollationVersion
to $wgCategoryCollation, to account for the variety of collations we
might have. tinyint is too small. I could have gone with int, but
that's annoyingly inscrutable in practice, as we all know from namespace
fields.
To make the upgrade easier for non-trunk users, I updated the old patch
file to incorporate the new changes, using the updatelog table so that
people upgrading from 1.16 won't have to do two alters on categorylinks.
I didn't test the upgrade-from-1.16 code path yet, so if anyone tests
that and it seems not to break, commenting to that effect would be
appreciated.
Also removed wfDeprecated() from archive(). Do *not* add this to
functions that are still actively used in core. If you think this
function is so terrible that it really mustn't be used, remove callers
yourself, don't pester every single developer with messages in the hope
that someone else will do it for you.
Patch best viewed with whitespace changes ignored. This will doubtless
introduce a bunch of bugs. Please report any so I can fix them. If
they're big enough and the fix isn't obvious, please revert.
As Bawolff pointed out at [[mw:User talk:Simetrical/Collation]], the
prefixing scheme I was using meant that the page "Z" with sort key of
"F" would sort after a page named "A" with a sort key of "FF", since the
first one's raw sort key would compute to "FZ", and the second's would
compute to "FFA". I've fixed this by separating the prefix from the
unprefixed part by a null byte (cl_sortkey is eventually going to be
totally binary anyway, may as well start now).
In response to feedback by Phillipe Verdy on bug 164. Now if a bunch of
pages have [[Category:Foo| ]], they'll sort amongst themselves according
to page name, instead of in basically random order as it is currently.
This also makes storage more elegant and intuitive: instead of giving
NULL a magic meaning when there's no custom sortkey specified, we just
store an empty string, since there's no prefix.
This means {{defaultsort:}} really now means {{defaultsortprefix:}},
which is slightly confusing, and a lot of code is now slightly
misleading or poorly named. But it should all work fine.
Also, while I was at it, I made updateCollation.php work as a transition
script, so you can apply the SQL patch and then run updateCollation.php
and things will work. However, with the new schema it's not trivial to
reverse this -- you'd have to recover the raw sort keys with some PHP.
Conversion goes at about a thousand rows a second for me, and seems to
be CPU-bound. Could probably be optimized.
I also adjusted the transition script so it will fix rows with collation
versions *greater* than the current one, as well as less. Thus if some
site wants to use their own collation, they can call it 137 or
something, and if they later want to switch back to MediaWiki stock
collation 7, it will work.
Also fixed a silly bug in updateCollation.php where it would say "1000
done" if it did nothing, and changed $res->numRows() >= self::BATCH_SIZE
to == so people don't wonder how it could be bigger (since it can't, I
hope).
Hidden behind $wgExperimentalCategorySort until it's reasonably
complete. If that's false, no behavior should change (but I didn't test
carefully, so poke me if there's a bug). See DefaultSettings.php for
documentation on setting it to true. Currently you should not do this
except if you're working on the feature, since functionality is not
close to reasonable yet and will change rapidly.
Bug 1211 is already fixed with this commit for me. However, many other
things still need to be done, so this is all very much a
proof-of-concept.
Like langlinks, this stores the interwiki prefix (as iwl_prefix) and full page title (as iwl_title), attached to the page doing the liking (as iwl_from -> page_id).
Unlike langlinks, there can be multiple entries stored per interwiki prefix.
Updater to add the table confirmed on MySQL, untested on SQLite but should work.
Someone may still need to add and test a PostgreSQL updater.
Refactored makeWhereFrom2d() out of LinkBatch to Database so it could be re-used for the similar mapping for the interwiki links, which need a string prefix rather than an int namespace key.
Also cleaned it up internally to reuse existing code for building where clauses from arrays. (Tim & Domas -- if the previous more verbose code was there to reduce function call and array processing overhead on very large link lists, feel free to unroll it again if the difference is measurable. Just swap the var names around from the old LinkBatch code and escape the base key value if it's not an integer, it'll be functionally equivalent.)
* r41081 was causing the job queue to be flooded with tiny htmlCacheUpdate jobs which were less than the batch size and so, according to the original logic, should have been done immediately. This was causing template updates to be delayed even when the template has few backlinks. This is fixed.
* Introduced a shared cache called BacklinkCache with the main purpose of sharing data from these backlink queries, thus recovering the performance of r41081.
* Refactored backlink partitioning code, which in r40741 was copied from HTMLCacheUpdate to LinksUpdate with a bug intact. The bug caused every htmlCacheUpdate or refreshLinks2 job to be split into at least two pieces even when number of rows is less than the batch size.
* Fixed a bug from r40741 causing refreshLinks2 jobs with end=false to be ignored.
* Made SquidUpdate::newFromTitles() accept a TitleArray
This is a global search and replace of NS_IMAGE and NS_IMAGE_TALK with NS_FILE and NS_FILE_TALK respectively in all core files, excluding those already updated in step 1 (r44004).