Consolidate more logic into JobRunner::execute() and make
it public. Add a "caught" field to the resulting map. The
intended caller of this method is JobExecutor; calling it
from there could cut down on code duplication.
Also:
* Use try/finally to restore state instead of ScopedCallback
(see the sketch after this list).
* Catch the more generic Throwable instead of Exception.
* Reorganize JobRunner::run() slightly for readability.
* Set class constant visibility and improve code comments.
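A minimal sketch of the combined pattern (saveState()/restoreState() are
hypothetical stand-ins for the actual state handling):

    $priorState = $this->saveState();
    $caught = null;
    try {
        $job->run();
    } catch ( Throwable $e ) {
        // Throwable also covers PHP 7 Error, not just Exception
        $caught = $e;
    } finally {
        // Runs even if the job throws, replacing ScopedCallback
        $this->restoreState( $priorState );
    }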
Bug: T243492
Change-Id: I90566a49c603aa78f45b35c0d3fc1925d2cfe2f8
Repeating the variable name doesn't do anything. Documentation
generators don't need it. It's more stuff to read that doesn't add new
information. And it can become outdated.
Note that there are two types of @var docs: when used inline (and not
on a class property), the variable name is still needed.
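For illustration (Title and the property name are arbitrary):

    /** @var Title $title */  // before: name repeats the line below
    private $title;

    /** @var Title */         // after: type alone is enough here
    private $title;

    /** @var Title $title */  // inline use still needs the name
    $title = $page->getTitle();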
Change-Id: If5a520405efacd8cefd90b878c999b842b91ac61
Use this in JobRunner to avoid overly sensitive lag timeouts and
log spam. The 3-second timeout sits between the regular web default
and the CLI default.
Follow-up to e8df0fbab1.
Bug: T235244
Change-Id: I92f657a638031d913b0575d74bf48c3e3a63cd17
PHP doesn't care much, but I think we humans do, because we should
call methods by the names we give them. Methods fixed are:
- isOk() -> isOK()
- setOk() -> setOK()
- teardown() -> tearDown()
Change-Id: I6b3f0cf3902887058efa426968da380803869e0b
If JobRunner is called while replica transactions exist, the first job
would previously use the stale REPEATABLE-READ snapshot data.
Also clear any master connection snapshots via commitMasterChanges().
This makes the code more similar to DeferredUpdates::attemptUpdate().
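Roughly, before the first job runs (method names as in the LBFactory of
that era):

    // Discard stale REPEATABLE-READ snapshots on replica handles
    $lbFactory->flushReplicaSnapshots( __METHOD__ );
    // Clear any master connection snapshots as well
    $lbFactory->commitMasterChanges( __METHOD__ );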
Change-Id: I2157a91fb01ea8c233f964b1f3164e8c3b1a07ca
JobQueueGroup is giving RunnableJob on pop(), so it should take the same
type for ack() and deduplicateRootJob().
JobQueue::ack() already accepts the interface.
Also use RunnableJob in JobRunner to match the type returned by the
job queue.
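The aligned signatures look roughly like (bodies elided):

    // In JobQueueGroup
    public function ack( RunnableJob $job ) { /* ... */ }
    public function deduplicateRootJob( RunnableJob $job ) { /* ... */ }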
Change-Id: I7b09586cff8affabe807ee16e80d04f5137dce45
This is slightly more robust and makes the intent much clearer
than random calling code checking getServerCount() all over the
place. In addition, this yields better separation of concerns.
Also, clean up the LoadBalancer constructor a bit and make the
validation a bit stricter.
Make some server index comparisons strict while at it.
Change-Id: Icc1a35bd65c6862ff81faa3ab9b2aa7cafe29443
This ensures that MergeableUpdate tasks that lazy-push jobs will actually
have those jobs run, instead of having them added after the lone callback
update that calls JobQueueGroup::pushLazyJobs() has already run.
This also makes it more obvious that the push will happen, since a mergeable
update is added each time lazyPush() is called and a job is buffered,
rather than relying on some magic callback enqueued into DeferredUpdates at
just the right point in multiple entry points.
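Conceptually, the mergeable update looks like this (the class name and
fields are illustrative, not the actual ones):

    class LazyJobPushUpdate implements MergeableUpdate {
        /** @var IJobSpecification[] Jobs buffered by lazyPush() */
        private $jobs = [];

        public function merge( MergeableUpdate $update ) {
            // DeferredUpdates collapses repeated adds into one instance
            $this->jobs = array_merge( $this->jobs, $update->jobs );
        }

        public function doUpdate() {
            JobQueueGroup::singleton()->push( $this->jobs );
        }
    }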
Bug: T207809
Change-Id: I13382ef4a17a9ba0fd3f9964b8c62f564e47e42d
Use the given $fname in all places.
__METHOD__ inside the unlock closure would be shown as {closure} in
logs.
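For example (simplified sketch; the lock name is illustrative):

    $fname = __METHOD__; // e.g. "JobRunner::run"
    $unlocker = function () use ( $dbw, $fname ) {
        // __METHOD__ here would be logged as "{closure}";
        // the captured $fname keeps the real method name
        $dbw->unlock( 'jobqueue-work', $fname );
    };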
Change-Id: I87ef26e893af858f58d1a77dcb2d8ee192456f5c
For maintenance scripts it is usually harmful to throw an exception.
For jobs the exception was already caught and handled appropriately,
so this can continue as before. For DeferredUpdates it was extremely
harmful to throw an exception. So in the web case, reduce the timeout to
1s and continue as normal if the 1s timeout is reached. This allows the
DeferredUpdate to be throttled without being killed.
In the updater, increase the replication wait timeout to 5 minutes.
ALTER TABLE could indeed cause replication lag, but exiting the update
script with an exception will probably ruin your day. Update actions are
not necessarily efficiently restartable.
Do not call JobQueue::waitForBackups() when jobs are popped. Maybe it
makes sense to call a queue-specific replication wait function for
bulk inserts, like copyJobQueue.php, but doing it when jobs are popped
just makes no sense. Surely the worst that could happen is that the
queue would become locally empty? Removing this waitForBackups() call
avoids waiting for replication twice when JobQueueDB is used.
Bug: T201482
Change-Id: Ia820196caccf9c95007aea12175faf809800f084
Find: /isset\(\s*([^()]+?)\s*\)\s*\?\s*\1\s*:\s*/
Replace with: '\1 ?? '
(Everywhere except includes/PHPVersionCheck.php)
(Then, manually fix some line length and indentation issues)
Then manually reviewed the replacements for cases where confusing
operator precedence would result in incorrect results
(fixing those in I478db046a1cc162c6767003ce45c9b56270f3372).
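A typical replacement (note that ?? binds more loosely than concatenation
and comparison, hence the manual review):

    // Before:
    $limit = isset( $params['limit'] ) ? $params['limit'] : 50;
    // After:
    $limit = $params['limit'] ?? 50;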
Change-Id: I33b421c8cb11cdd4ce896488c9ff5313f03a38cf
This replaces the hacky use of onTransactionIdle(), which no longer runs
immediately in explicit transaction rounds since d4c31cf841.
Also clarified TransactionRoundDefiningUpdate comment about rounds.
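Usage is roughly:

    DeferredUpdates::addUpdate( new TransactionRoundDefiningUpdate( function () {
        // Runs in its own transaction round rather than inside an outer one
        /* ... */
    } ) );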
Change-Id: Ie17eacdcaea4e47019cc94e1c7beed9d7fec5cf2
These were meant as sanity checks, but would fail in those
unusual cases anyway with exceptions. Instead, have an
early check to make sure no explicit transaction rounds
are active when JobRunner::run() is called.
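The early check is along these lines (a sketch):

    if ( $lbFactory->hasTransactionRound() ) {
        throw new LogicException(
            __METHOD__ . ' called with an explicit transaction round active.'
        );
    }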
Change-Id: I723c77c8d3ef7ec4dcf09ce6d549b4fd57bdf1c2
The mediawiki.runJobs errors are collected in Logstash, but the whole job
description and errors end up in a single message field. That makes it
challenging to split logs per job type, find the longest-running jobs,
and so on.
That can be worked around on the log-receiving side by parsing
MediaWiki messages, e.g. https://gerrit.wikimedia.org/r/#/c/312504/
Bryan Davis suggested that a better long-term solution is to use the PSR-3
logger with structured log messages.
Culprit: 'type' is a reserved word. Hence, prefix all context variables
with 'job_'.
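With the prefix, a structured log call looks roughly like ($elapsed is a
placeholder):

    $this->logger->info( 'Finished job {job_type} in {job_duration}s', [
        'job_type' => $job->getType(),
        'job_duration' => $elapsed,
    ] );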
Bug: T146469
Change-Id: Ib6a771c7d3f83bd75b2994bfab9bbebfd1f5aa6c
Previously, tryOpportunisticExecute() tried to nest transaction rounds,
which would fail. Added LBFactory::hasTransactionRound() as needed.
Also cleaned up some unqualified class names in callbacks and set the
PRESEND flag for the JobQueueDB AutoCommitUpdate callback. Use the
proper getMasterDB() method while at it. These follow up 24842cfac.
Bug: T154425
Change-Id: Ib1d38f68bd217903d1a7d46fb15b7d7d9620daa6
This is needed for the deferred updates LinksDeletionUpdate and LinksUpdate;
otherwise, callbacks registered with onTransactionIdle() prevent other
transactions from being executed, at least in this case.
Bug: T154425
Bug: T154438
Bug: T157679
Change-Id: Iecd396d584a62ac936cd963915339159467b44cd
During post-send processing, some jobs can be executed, but given that the
deferred updates were already "closed", any new DeferredUpdate was directly
called (as explained by Krinkle on T165714), and the transactions opened by
classical jobs got badly mixed with transactions (directly) executed by
DeferredUpdates, issuing a DBError and skipping the job, which then stayed
in 'claimed' status even though it failed.
Quite similarly, some DeferredUpdates callables use JobQueueGroup::lazyPush(),
so the generated jobs need to really be pushed.
This change removes the run-immediately-deferred-updates behaviour even
in the post-connection shutdown; given that there is already a call to
DeferredUpdates::doUpdates() in JobRunner::execute(), it is not necessary to
add another one, and the execution of web jobs becomes more similar to that
of CLI jobs. In the same spirit of reconciling web jobs and CLI jobs, the
call to JobQueueGroup::pushLazyJobs() is done in JobRunner::execute().
Bug: T165714
Bug: T100085
Change-Id: I721e7167eca5b0b6227234fe516005243ab22388
This is similar to $wgMaxUserDBWriteDuration except for jobs.
Also use the Config class in JobRunner instead of globals.
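Config access replaces the global, along these lines (the job setting
name is assumed here):

    // Instead of reading global $wgMaxJobDBWriteDuration directly:
    $maxWrite = $this->config->get( 'MaxJobDBWriteDuration' );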
Bug: T95501
Change-Id: I4949bb99c26451429c7acf82ecc4444bf9fb835f
Such errors are likely to persist longer than other random
exceptions. In that case, it is better to avoid burning
through the job retry count.
Change-Id: I6785bd608856f98d21e0b0b05d3899a7081c38e2
Use HTTPS instead of HTTP where the HTTP link is a redirect to the HTTPS link.
Also update some broken links.
Change-Id: Ic3a5eac910d098ed5c2a21e9f47c9b6ee06b2643
This lets the runJobs.php $wgCommandLineMode hack be removed.
Some fixes based on unit tests:
* Only call applyTransactionRoundFlags() for master connections
for transaction rounds from beginMasterChanges().
* Also cleaned up the commitAndWaitForReplication() reset logic.
* Removed deprecated DataUpdate::doUpdate() calls from jobs
since they cannot nest in a transaction round.
Change-Id: Ia9b91f539dc11a5c05bdac4bcd99d6615c4dc48d
This is better than having to use the less safe commitAll(),
which also checks and commits masters with writes.
Change-Id: I01c95f1ebae6927ed5acf0c23dd19b5c2413f661
* Use this to exclude some common cases of harmless queries that
happen to block on row-level locks for a long time. This does
not apply to UPDATE/DELETE however, due to the ambiguity of
time spent scanning vs locking.
* Update commitMasterChanges() and JobRunner to use the new
mode to avoid pointless rollback or lag checks.
Change-Id: Ifc2743f2d8cd109840c45cda5028fbb4df55d231
These are not safe for the common case where the local DB
handle is used for the queue (and other table writes).
Change-Id: Ic24a05c18bf31e49bf7e9a3c058deb5d35271511
* Refactor out some code duplication in query() into a
separate private method.
* Remove the total master/slave query profiling, which is
unnecessary and redundant.
* Provide a default implementation for reconnect().
* Make reconnect() catch errors so it can match the docs that say
it returns true/false to indicate failure. Likewise for ping().
* Optimize ping() to no-op if there was obvious recent activity
(see the sketch after this list).
* Move the ping() round in JobRunner to approveMasterChanges.
This way, all commit rounds benefit from this logic.
* Add more doc comments for DatabaseBase fields.
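The ping() short-circuit is roughly (field and constant names assumed):

    public function ping() {
        if ( ( microtime( true ) - $this->lastActiveTime ) < self::PING_TTL ) {
            return true; // obvious recent activity; skip the round trip
        }
        // ... otherwise actually probe the connection ...
    }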
Change-Id: Ic90ce2be4187244a0e8d44854c39d4b78be8e642
Using waitForOne() barely goes beyond semi-sync replication
already in place on serious DB clusters.
Change-Id: Idb719deaa5993bc2f818cd110d49d09567e0afb3
* Disallow $ignoreErrors in query() on deadlocks, since that would otherwise
silently rollback all changes from any other callers.
* Move recoverability checks for disconnects to canRecoverFromDisconnect().
* The first write of a DBO_TRX transaction is now considered recoverable.
* Run onTransactionResolution() callbacks on disconnect/deadlock rollback.
Some DeferrableUpdate need this to know to abort.
* Disallow $ignoreErrors on disconnects considered unrecoverable. This
makes it so that query() callers cannot cause writes from other callers
to be silently lost, which is hard to reason about.
* Moved ping() logic to a simple reconnect() method; ping() now simply
does a dummy SELECT, which triggers reconnection if safe. Previously,
ping() might cause subtle partial transaction loss.
* Remove ping() from strencode(), which would cause partial transaction
loss where it actually reached.
* Remove mysqlPing() per https://bugs.php.net/bug.php?id=52561.
Bug: T142079
Change-Id: Ifb7f772ae849d67c0d92240a115c3f392e252937
teardown() callbacks are primarily used to reset session state after a
job is done. It seems important to do this even if an exception is
thrown by the job.
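A minimal sketch of the intended flow:

    try {
        $status = $job->run();
    } finally {
        // Reset session state even if run() threw
        $job->teardown();
    }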
Change-Id: I0bd449414527321b0ed9063cea268dea5b0766c4
We currently push a request id into structured logging (monolog/
logstash) to allow seeing all logs that were triggered by the same
request. This extends that to pass the id through jobs so jobs triggered
by a web request also share the same id and can be tracked together.
This web request id will follow jobs both directly created by a request,
and jobs created by those jobs.
This should give us some more visibility when debugging into what
started a particular job, and into whether a large number of jobs
blowing up the job queue are somehow related.
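Conceptually, when enqueueing (WebRequest::getRequestId() exists in
MediaWiki; the parameter key is illustrative):

    // Stash the originating request's id in the job parameters ...
    $params['requestId'] = WebRequest::getRequestId();
    // ... and have the runner put it back into the logging context
    // when the job (or any job it spawns) executes.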
Change-Id: Iedbd031e6e9bb18fd6f7b923c8c305102255ab4b