Thijs/wiki.techinc.nl

Author	SHA1	Message	Date
Kunal Mehta	6e9b4f0e9c	Convert all array() syntax to [] Per wikitech-l consensus: https://lists.wikimedia.org/pipermail/wikitech-l/2016-February/084821.html Notes: * Disabled CallTimePassByReference due to false positives (T127163) Change-Id: I2c8ce713ce6600a0bb7bf67537c87044c7a45c4b	2016-02-17 01:33:00 -08:00
Max Semenik	59db24e90b	Use addDescription() instead of accessing mDescription directly Change-Id: I0e2aa83024b8abf5298cfea4b21bf45722ad3103	2016-01-30 01:28:32 -08:00
Kevin Israel	1dd4c867e5	updateCollation.php: Switch back to using cl_from index for now Using the cl_sortkey index instead (to reduce disruption to a live site), as currently implemented, seems to have two serious problems: * MySQL / MariaDB filesorts all rows that "sort above the given row [the last row of the previous batch]", not just a single category at a time until the row limit is reached. * The current approach to pagination is broken in that it does not work with ENUM columns such as cl_type, causing 'file' rows to be skipped, or rows of any type to be repeated. See T119173. This reverts part of commit `a43f751cf6`. Bug: T58041 Change-Id: I619564e85b2122f249bdacc45d547b9ce1b3beb5	2016-01-21 05:57:48 +00:00
Aaron Schulz	fa8e1a9b00	Clean up transactions in maintenance scripts Add transaction methods to complement getDB(). This makes it easy to grep for direct begin()/commit() calls to IDatabase by having script use their own wrapper. Maintenance scripts are one of the few places that can (and need to) use begin/commit instead of the start/end atomic methods. Eventually, there should be almost no direct callers and those methods can be made stricter about throwing errors on nested calls. Change-Id: Ibbfc7a77c0d2a55f7fc2261087f6c3a19061e0aa	2015-12-30 23:40:35 +00:00
Kevin Israel	924a34c298	Remove --max-slave-lag options and remnants from maintenance scripts Change-Id: Id01fb9a82bcfe1af8cbce23a9aec7eccaa0f6b21	2015-03-26 19:33:35 -04:00
umherirrender	b0cfcd0fcb	Add missing @return and @param to doc blocks Change-Id: I9d99ba1968ed8f97624d957754c8847dfe1b41da	2014-08-27 21:57:45 +02:00
umherirrender	6b4c44c2db	Add missing @param to function docs Change-Id: Ib26407bc55dff7969d8a3b1e2ae51751b202d8fb	2014-08-18 16:24:59 +00:00
Siebrand Mazeland	606c680b21	Update formatting in maintenance/ (4/4) Change-Id: I6b58d014a4bfd6600e4e6f80188fdcfce18482ca	2014-04-23 20:09:26 +02:00
Mark A. Hershberger	0b5acd0623	Move reference to $row where it is in-scope and doesn't produce E_STRICT notices. Bug: 57575 Change-Id: Ic508ebbb0816acd32be355b5f19b46637d58c36a	2013-11-25 22:18:55 -05:00
MatmaRex	c9e8cffc81	updateCollation.php: sanity check the collation before proceeding In some cases the constructor will work, but trying to access first letter data will raise an exception, breaking all category pages. Bug: 46615 Change-Id: I77de040f97080653fe0d1734d38490eaa2d322db	2013-07-04 05:21:04 +00:00
Timo Tijhof	beb1c4a0ec	phpcs: More require/include is not a function Follows-up I1343872de7, Ia533aedf63 and I2df2f80b81. Also updated usage in text in documentation and the installer LocalSettingsGenerator. Most of them were handled by this regex: - find: (require|include|require_once|include_once)\s*\(\s*(.+?)\s*\)\s*;$ - replace: $1 $2; Change-Id: I6b38aad9a5149c9c43ce18bd8edbab14b8ce43fa	2013-05-21 23:26:28 +02:00
Brian Wolff	af6d3572fa	Revert "(bug 46615) updateCollation.php: sanity check the collation before proceeding" Sorry, forgot that method was not in the base class, and I had only tested with uca based collations. This breaks on uppercase type collations. This reverts commit `6eb84144df` Change-Id: Ib7b9597ff842a76185ba5c153922834ffb741237	2013-05-15 22:40:29 +00:00
Timo Tijhof	50e7985d4d	phpcs: Fix WhiteSpace.LanguageConstructSpacing warnings Squiz.WhiteSpace.LanguageConstructSpacing: Language constructs must be followed by a single space; expected "require_once expression" but found "require_once(expression)" It is a keyword (e.g. like `new`, `return` and `print`). As such the parentheses don't make sense. Per our code conventions, we use a space after keywords like these. We appeared to have an unwritten exception for `require` that doesn't make sense. About 60% of require/include usage was missing the space and/or had superfluous parentheses. It is as silly as print("foo") or return("foo"), it works because keywords have no significance for whitespace between it and the expression that follows, and since expressions can be wrapped in parentheses for clarity (e.g. when doing string concatenation or mathematical operations) the parentheses before and after are basically just ignored. Change-Id: I2df2f80b8123714bea7e0771bf94b51ad5bb4b87	2013-05-09 05:56:26 +02:00
MatmaRex	6eb84144df	(bug 46615) updateCollation.php: sanity check the collation before proceeding Change-Id: I5be1b1ec1823fdb7438c3f501fb6194142c1e9dc	2013-03-27 21:16:57 +01:00
Platonides	c3f1a3c9ea	`a43f751` removed the usage of $wgMiserMode Change-Id: I5528dba582d218721324431015bd930b9b6ab57e	2013-03-18 04:21:55 +00:00
Tim Starling	1db83c1b76	Restore SET cl_timestamp=cl_timestamp Apparently cl_timestamp=cl_timestamp is a workaround for obscure behaviour of the timestamp type in MySQL Change-Id: I803f20bcf4e28e8e2833a07bcf00e7edc00ad84b	2013-03-13 10:18:12 +11:00
Tim Starling	a43f751cf6	Reduce disruption during updateCollation.php Have updateCollation.php order by cl_to, so that each category is updated all at once. This minimises the time during which a category will appear to be incorrectly sorted, while the maintenance script is in progress. Mark the cl_collation index as needing deletion, it was always pretty pointless. You can't do much better than a full table scan when you're changing the collation value on a wiki. Increase the batch size since the lack of a cl_to,cl_from index means that it will have to filesort each category. A larger batch size means less sorts. As noted by Liangent on bug 45970, you can't order by cl_sortkey since that will change during execution. Also fix an inappropriate use of $wgMiserMode and remove a no-op from the SET clause of the UPDATE. Very lightly tested. Change-Id: I19bc8d6701f5f78040aa9c521427ac98ef488d89	2013-03-12 23:08:29 +00:00
Marius Hoch	652c4be7c2	Clean up: Declare variables with public instead of var Variables in classes should be declared using public $foo instead of var $foo for various reasons. As we require PHP 5.3 we don't have to take care about that PHP4 left over, but can get rid of it in favour of the more clear and better readable public. See also: http://php.net/manual/en/language.oop5.visibility.php (Divided into several commits to keep reviewable) Change-Id: Ic723d0347ab2e3c78bc0097345c68bbee3dc035a	2012-09-14 21:00:00 +02:00
Alexandre Emsenhuber	2a7478b4fb	Improve documentation of maintenance scripts. Change-Id: Id7a04ff816dc47a8cc81a4da5ab0dff26b688bd5	2012-09-03 20:10:09 +02:00
jeroendedauw	38c7f444e1	Use __DIR__ instead of dirname( __FILE__ ) We can now do this since we finally switched to PHP 5.3 for MW 1.20 and get rid of the silly dirname(__FILE__) stuff :) Change-Id: Id9b2c9cd2e678197aa81c78adced5d1d31ff57b1	2012-08-27 21:45:00 +02:00
Tim Starling	8df24d5586	updateCollation.php size histogram feature Added a feature allowing updateCollation.php to show a histogram of sort key sizes, to assess the effect of index size truncation. Added --dry-run and --target-collation options to allow the index truncation to be assessed without actually changing the collation. Change-Id: I497b5d0740384f5d6fdebc6d5ccfea5d853fbd37	2012-07-18 13:23:14 +10:00
Reedy	a8cdc7df3a	Use estimateRowPage if wiki is using wgMiserMode Change-Id: I59404e9514a87f65faf3eb865fafe358d9f01079	2012-07-06 17:57:40 +01:00
Sam Reed	c47f83a4d4	More __METHOD__ in our madness	2012-02-24 18:45:24 +00:00
Sam Reed	62491fef13	Comments, braces, explicit member variables Remove a couple of unused variables	2011-11-16 13:22:03 +00:00
Roan Kattouw	a47f2dcb2d	Followup r97146: drop the $lb->waitTimeout() call per Tim. Was used so Tim could sleep while a schema change was going on, but this is the kind of live hack that doesn't belong in core.	2011-09-15 12:42:29 +00:00
Roan Kattouw	c6fb8af8ef	Merge live hacks from r83992 to trunk, after cleaning some things up. * Wait for slaves after every thousand rows rather than after processing every batch. r83992 had 1000 hard-coded, I put it in SYNC_INTERVAL * Set $lb->waitTimeout(100000). I have no idea why, but it was in the live hack. Maybe Tim or Domas could enlighten me * Use a STRAIGHT JOIN for the query on categorylinks and page because MySQL appears to want to join the tables the wrong way around * Use cl_collation='previousValue' rather than cl_collation!='newValue' if possible. This was originally a dirty live hack, but I re-implemented it nicely with a --previous-collation command line option * Print a status update both before and after the SELECT query. This allows the user to notice when the SELECT queries are getting increasingly slower, which is an indication you may want to set --previous-collation	2011-09-15 12:17:44 +00:00
Max Semenik	c79a16167a	Introduced Maintenance::getDB() and corresponding setDB() to control externally what database object should be used by maintenance script. Currently used by updater to avoid DatabaseSqliteTest from running stuff like Populate* on the live database instead of the one used for testing.	2011-05-24 17:48:22 +00:00
Sam Reed	fa7662d94a	Ensure $collationConds is defined on all paths	2011-04-14 18:46:37 +00:00
Sam Reed	b88afb0daa	Fixup/add documentation Remove some unused variables	2011-03-30 19:00:11 +00:00
Roan Kattouw	a38fd53df2	(bug 27975) Fix r83529 (slave catchup in updateCollation.php) to not try to wait for slaves if there are none. Reporter was getting a permission error for getting the master position on a single-server setup	2011-03-14 09:30:56 +00:00
Aryeh Gregor	8c69bdb0a6	Change collationUpdate batch size from 1000 to 50 It selects that many rows, then does PHP processing and an individual update query for each one. This is not a good idea when each batch is done in a single transaction: 1000 MySQL updates interspersed with PHP processing might take a second or more while locks are held.	2011-03-08 21:21:08 +00:00
Roan Kattouw	ff6fec1e6f	Make updateCollation.php a bit less murderous for WMF databases: * Don't run a COUNT() query on what's potentially the entire categorylinks table on enwiki (hundreds of millions of rows). Put it in a miser mode check * Wait for DB replication to catch up before processing the next batch. Implemented LoadBalancer::waitAll() for this purpose, which should behave more nicely than wfWaitForSlaves()	2011-03-08 16:47:26 +00:00
Tim Starling	f1869f59b0	Add --force option to updateCollation.php.	2011-01-20 06:24:11 +00:00
Tim Starling	eaeea84b44	* Introduced a non-dummy collation for $wgCategoryCollation, namely UCA with default tables. * Added a maintenance script which generates a list of first letters. Unified Han are omitted for performance, and because they shouldn't be used as headings anyway. A future collation specific to Chinese would provide the KangXi radicals as "first letters". * Provided a precomputed list of first letters. Used Unicode 6.0.0 data and ICU 4.2. * Moved collation functionality from Language to a Collation class hierarchy with factory function. Removed the recently-added methods from Language and updated all callers. * Changed Title::getCategorySortkey() to separate its parts with a line break instead of a null character. All collations supported by the intl extension ignore the null character, i.e. "ab" == "a\0b". It would have required a lot of hacking to make it work. * Fixed the uppercase collation to handle non-ASCII characters, redundantly with r80436. I don't think it's necessary to change the collation name as was done there, so I reverted that in the course of my conflict merge. A --force option to updateCollation.php might be nice though.	2011-01-17 14:02:22 +00:00
Brian Wolff	c79b4bdd21	Change the default collation from strtoupper to Language::uc, so that non-ascii characters get to play too. I know the uppercase thing is just a standby until a real collation function is written. However in the meantime, I think it'd be really weird for a wiki with $wgCapitalLinks = false to suddenly have [[a]] and [[A]] sort under the same letter in a category page, but [[Ä]] and [[ä]] sort nowhere near each other, even though on a capitalized wiki they would be the same page. See discussion on r69816. Also fix an issue with maintenance/updateCollation.php, where php thinks that 'uppercase' == 0 (?!). I don't really know what the deal with that is, but using a ! instead of == 0 seems to fix it. (Follow-up r69961)	2011-01-17 06:27:49 +00:00
Chad Horohoe	26505b170a	Fix concern raised by Brion in r74108 (but has really existed since the maintenance rewrite). Right now, including a maintenance script causes it to execute. This is bad when you want to reuse the particular class but not have it start executing all by itself. Until now, we relied on setting MW_NO_SETUP which was a) hacky, b) irreversible, and c) likely to be forgotten if you didn't use one of the wrappers like runChild(). Instead, move the freaky magic to doMaintenance and have it check if it's in a specific call stack that indicates this is being run from the file scope and should be executed. Rename DO_MAINTENANCE to RUN_MAINTENANCE_IF_MAIN so it's nice and clear what magic happens behind the require_once().	2011-01-13 22:58:55 +00:00
Alexandre Emsenhuber	9f5d06527c	Part of bug 26280: added license headers to PHP files in maintenance	2010-12-16 19:15:12 +00:00
Mark A. Hershberger	617a5b1e15	Whitespace fixup under the maint directory.	2010-12-04 03:20:14 +00:00
Aryeh Gregor	dcd5d260d4	Further categorylinks schema changes Per review by Tim, I made two changes: 1) Fix cl_sortkey to be varbinary(255). 2) Expand cl_collation to varbinary(32), and change $wgCollationVersion to $wgCategoryCollation, to account for the variety of collations we might have. tinyint is too small. I could have gone with int, but that's annoyingly inscrutable in practice, as we all know from namespace fields. To make the upgrade easier for non-trunk users, I updated the old patch file to incorporate the new changes, using the updatelog table so that people upgrading from 1.16 won't have to do two alters on categorylinks. I didn't test the upgrade-from-1.16 code path yet, so if anyone tests that and it seems not to break, commenting to that effect would be appreciated. Also removed wfDeprecated() from archive(). Do not add this to functions that are still actively used in core. If you think this function is so terrible that it really mustn't be used, remove callers yourself, don't pester every single developer with messages in the hope that someone else will do it for you.	2010-09-03 20:52:08 +00:00
Aryeh Gregor	5b132a4f47	Preserve cl_timestamp in updateCollation.php For those crazy Wikinews people, and other DPL users. Why do we use a crazy auto-updating column type instead of specifying the current time explicitly when we want to update it, again . . . ?	2010-08-04 00:29:20 +00:00
Aryeh Gregor	34db6f4b6f	Use exact counts in updateCollation.php There's no reason to avoid a one-time COUNT(*), is there? It will be free if collations are actually up-to-date, because the column is indexed.	2010-08-03 21:11:16 +00:00
Aryeh Gregor	a30d4319a5	Sort pages in categories without namespace prefix This removes $wgCategoryPrefixedDefaultSortkey and effectively always makes it false. The setting was added in the first place to hack around the default, clearly broken behavior, but this just fixes it instead, so the setting is no longer needed. Running maintenance/updateCollation.php for the first time will fix this, no need to run refreshLinks.php. If you've already run updateCollation.php, you can do UPDATE categorylinks SET cl_collation = 76; or such and then run the script again.	2010-08-03 20:50:31 +00:00
Aryeh Gregor	7ec501be6a	Enable new category sort by default Patch best viewed with whitespace changes ignored. This will doubtless introduce a bunch of bugs. Please report any so I can fix them. If they're big enough and the fix isn't obvious, please revert.	2010-08-03 20:50:01 +00:00
Aryeh Gregor	2ffa5e4876	Fix bug in prefixing scheme As Bawolff pointed out at [[mw:User talk:Simetrical/Collation]], the prefixing scheme I was using meant that the page "Z" with sort key of "F" would sort after a page named "A" with a sort key of "FF", since the first one's raw sort key would compute to "FZ", and the second's would compute to "FFA". I've fixed this by separating the prefix from the unprefixed part by a null byte (cl_sortkey is eventually going to be totally binary anyway, may as well start now).	2010-07-26 22:04:19 +00:00
Aryeh Gregor	022b7ba140	Reconcept cl_raw_sortkey as cl_sortkey_prefix In response to feedback by Phillipe Verdy on bug 164. Now if a bunch of pages have [[Category:Foo| ]], they'll sort amongst themselves according to page name, instead of in basically random order as it is currently. This also makes storage more elegant and intuitive: instead of giving NULL a magic meaning when there's no custom sortkey specified, we just store an empty string, since there's no prefix. This means {{defaultsort:}} really now means {{defaultsortprefix:}}, which is slightly confusing, and a lot of code is now slightly misleading or poorly named. But it should all work fine. Also, while I was at it, I made updateCollation.php work as a transition script, so you can apply the SQL patch and then run updateCollation.php and things will work. However, with the new schema it's not trivial to reverse this -- you'd have to recover the raw sort keys with some PHP. Conversion goes at about a thousand rows a second for me, and seems to be CPU-bound. Could probably be optimized. I also adjusted the transition script so it will fix rows with collation versions greater than the current one, as well as less. Thus if some site wants to use their own collation, they can call it 137 or something, and if they later want to switch back to MediaWiki stock collation 7, it will work. Also fixed a silly bug in updateCollation.php where it would say "1000 done" if it did nothing, and changed $res->numRows() >= self::BATCH_SIZE to == so people don't wonder how it could be bigger (since it can't, I hope).	2010-07-26 19:27:13 +00:00
Aryeh Gregor	3783aa2a3c	Add non-identity collation, with migration script It seemed to work correctly, with the newly-created page "bob" sorting as "BOB", but then I nuked all my cl_sortkey by running the migration script before refreshLinks.php had finished running, so I'll have to wait a while to see if it works properly with a non-messed-up database. It's possible there's something wrong with the display of section letters in the categories, but otherwise I think this is working right.	2010-07-23 20:58:11 +00:00

46 commits