44 commits

Author SHA1 Message Date
Aaron Schulz
7c07943eab Update more docs and type hints to use IDatabase
Change-Id: I8c8d85b32a8aba21e14d2a2dde4c25eb509186c1
2015-10-06 18:49:52 -07:00
Aaron Schulz
16999c8d50 Added jobqueue.pickup_root_age metric
* This tracks the average age of the root job for jobs being
  run that have root jobs defined.

Change-Id: Ifed709644cfa9ac60fc2b0cfd376142adebbaf68
2015-08-25 16:34:00 -07:00
Aaron Schulz
8a126ee3b3 Switched job run time profiling to context getStats()
* This replaces the scoped profiling calls

Change-Id: I73caffad0e0d31d9ffbd3c0decfe31e17ea85398
2015-08-19 20:21:29 +00:00
Aaron Schulz
4f0b16b914 Fixed BufferingStatsdDataFactory::timing() callers to use ms
* The interface actually demands this

Change-Id: I1e334c2696a8a8eca73a6ae7f71428190cad3107
2015-08-19 01:01:49 +00:00
Aaron Schulz
9d39f50904 Made JobRunner bail more smoothly on near OOM
* Use the regular limit-X style response instead of throwing an
  exception. This avoids loss of statsd data and the like.

Change-Id: Ia08384a0d13c268f6e7a673b2265ab77772e5539
2015-08-07 15:09:22 -07:00
Aaron Schulz
df2dc2ef9b Improved job pickup time stats for delayed jobs
* The delay time should not count

Bug: T102743
Change-Id: I9e8b1f33b65681fd9f4f667233bce280bf6f227d
2015-07-01 21:00:09 +00:00
Ori Livneh
ca8cb1c90e Fix-up for I2ac604d3c042d
Log time until picked up.

Change-Id: I67310aa2fdbfcb8b1fd394f490ef4885cf596b0c
2015-06-25 21:56:49 -07:00
Ori Livneh
427bdb6dbd jobqueue: use more sensible metric key names
* Since JobQueue metrics are qualified with 'jobqueue.', don't add a 'job-'
  prefix to each metric.
* Separate the key from the job type with a dot rather than a dash.
* To avoid having a Graphite node that is both a "directory" and a metric, use
  '.all' as a suffix for aggregates.

Change-Id: I2ac604d3c042dbfb0b3a27759800f435ec22041e
2015-06-14 22:38:02 -07:00
Aaron Schulz
189017d244 Various code cleanup to JobRunner
* Made the pickup stats name be similar to other queue stats
* Renamed $jobsRun => $jobPopped
* Simplified some code and comments

Change-Id: I8ab1a68f04fc3ab4c0ba7f6f0b428a5a811a97fb
2015-06-05 11:15:57 -07:00
Kunal Mehta
f138447de1 jobqueue: Record stats on how long it takes before a job is run
Bug: T101054
Change-Id: I5dc13d79a5ec2e8cb6679e3ff2535b5cb031ca30
2015-06-03 12:54:27 -07:00
Nik Everett
f204da4bff Commit all connections after each job
If you don't commit the slave connections then they keep their old snapshots.
This clears the snapshots so they don't get out-of-date views of the world.

Bug: T100838
Change-Id: I1f6f910d88324beb589b2ad9466d8786376eda55
2015-06-01 17:43:43 -04:00
Aaron Schulz
04d11e6590 Make JobRunner flush DeferredUpdates after each job
Change-Id: Iff6625ddc04a15751d2bb07dc6558145e7ceb14a
2015-05-18 18:52:34 -07:00
jenkins-bot
2e89500994 Merge "Removed executeReadyPeriodicTasks() method"
2015-05-12 20:03:02 +00:00
Aaron Schulz
dc6a4d27de Added explicit profile sections to JobRunner
Change-Id: Iba60204e1ab7c81686f05b36661080c000b10157
2015-05-12 17:15:29 +00:00
Bryan Davis
9b8da198e0 jobrunner: Change logging level for STARTING messages
Mark log events describing the start of processing a job as debug
level rather than info level.

Bug: T87521
Change-Id: I1ce3dabf4a344369fe396c5bb056ed5ed6308c87
2015-05-11 10:36:04 -06:00
Aaron Schulz
ec3da97659 Removed executeReadyPeriodicTasks() method
* Moved all these hacks to JobQueueDB, which is the only queue that
  should need this (for stock installs). Newer queues should always
  have the queue store manage stuff like this, not MediaWiki.
* This also avoids expensive object construction that does nothing
  when non-DB queues are used.

Change-Id: Id718cda25750be73044a049b39958cca55aa3172
2015-05-06 20:20:40 -07:00
Aaron Schulz
fa07f92527 Pass __METHOD__ to ping query in JobRunner::commitMasterChanges()
Change-Id: I7f79acca0f89a2e07ed2f9eb427e0788c5440ee7
2015-05-01 14:53:41 -07:00
jenkins-bot
a576e4066a Merge "Added $wgTrxProfilerLimits and slow query limits"
2015-04-28 08:26:06 +00:00
Aaron Schulz
7ea13643f5 Added $wgTrxProfilerLimits and slow query limits
* Limits are now configurable instead of being hard-coded

Change-Id: I99133586eb82e8e9e84061548c8d1a99695fde5c
2015-04-28 10:18:11 +02:00
Aaron Schulz
7c821caef5 Added $wgJobSerialCommitThreshold setting
* This is used to avoid lag by certain jobs

Bug: T95501
Change-Id: Id707c9a840fa23d56407e03aaae4e25149a1f906
2015-04-24 11:38:16 -07:00
Aaron Schulz
ef23382324 Added max lag comment to JobRunner
Change-Id: I9bb9948190d349d563f65d3e15bf1c6fa0d8adec
2015-04-22 07:45:53 -07:00
Aaron Schulz
05a5ec406d Lowered $maxAllowedLag to 3 in JobRunner
Change-Id: I7cb771c667bac21e9b67069e31c6243d9314dac5
2015-04-22 06:11:25 -07:00
Aaron Schulz
b4b932b5eb Lowered JobRunner lag check interval from 3 => 1 second
Change-Id: I4a2147316a6c43199587bb7b28aeaec2fc252c84
2015-04-21 12:47:35 -07:00
Aaron Schulz
6525642cf2 Made JobRunner avoid slave lag more aggressively
Bug: T95501
Change-Id: Ibba6d2947638a17c86edcdaadf484c7aa45cd1c6
2015-04-10 04:39:29 +00:00
Aaron Schulz
fcb0872e8b Warn when jobs do large DB writes at once
Change-Id: I57e9bb630accd5b262188ab16b17b558cd3a2bc1
2015-04-08 15:47:58 -07:00
Bryan Davis
1195e11a8a Move MWLogger classes to MediaWiki\Logger namespace
Move the MWLogger PSR-3 logging related classes into the
MediaWiki\Logger namespace. Create shim classes to ease migration of
existing MWLoggerFactory usage to the namespaced classes.

Bug: T93406
Change-Id: I359cc81fbd2dcf8937742311dcc7d3dee08747b0
2015-04-03 11:32:24 -07:00
Aaron Schulz
bd649e6566 Made JobRunner bail sooner for bogus job --type parameters
Change-Id: I1259682b8a6543e76f1c9a4d99324b457115a277
2015-03-03 12:20:12 -08:00
Bryan Davis
2eea1d5a42 Convert JobRunner to PSR-3 logger
* Implement Psr\Log\LoggerAwareInterface
* Categorize log events with levels (debug, info, error)

Bug: T87521
Change-Id: I2637c40a44e396b1020b76f54c2e8b931f764f02
2015-01-26 15:04:12 -08:00
Aaron Schulz
5b6e17e611 Made JobRunner bail if wfReadOnly() is true
Change-Id: I97ef66718bf4033768cd820b42521af31539b3f6
2015-01-16 23:05:02 -08:00
Aaron Schulz
6921770414 Updated some try-catch statements: MWException -> Exception
Change-Id: I76601a86e30f4984e3b1a8c8ec5ef5a0f652433a
2015-01-09 17:20:22 -08:00
Aaron Schulz
4ff8136807 Removed remaining profile calls
Change-Id: I31c81c78715048004fc8fca0f27d09c1fa71c118
2015-01-08 02:49:33 -08:00
Chad Horohoe
aa21e125a3 Remove obvious function-level profiling
Xhprof generates this data now. Custom profiling of various
sub-function units is kept.

Calls to profiler represented about 3% of page execution
time on Special:BlankPage (1.5% in/out); after this change
it's down to about 0.98% of page execution time.

Change-Id: Id9a1dc9d8f80bbd52e42226b724a1e1213d07af7
2015-01-07 11:14:24 -08:00
Aaron Schulz
2607ba2b89 Made JobRunner wait for all applicable slaves, not just the main cluster
Change-Id: Ib610684fd3d9b76ea13fe585a290983c071b88f4
2014-10-22 18:46:55 +00:00
Aaron Schulz
855a19ec86 Simplified getMaxLag() to use getLagTimes()
* This method now benefits from more cache sharing and de-duplicated
  lag time querying to reduce connection stampedes.

Change-Id: I2f3b9a22e4adabea703fbae1f96e65fb65125e2b
2014-09-26 00:21:27 +00:00
jenkins-bot
cb60e0511c Merge "Randomize the JobRunner slave lags checks a bit"
2014-09-24 01:20:34 +00:00
Aaron Schulz
845ae42348 Randomize the JobRunner slave lags checks a bit
Change-Id: Iee777426776c12051761d29c90da80cea27619b1
2014-09-23 18:10:04 -07:00
Aaron Schulz
a55544180b Slave lag check tweaks to JobRunner
* Do not block forever, but wait up to 10 seconds. Likewise,
  check the lag times in memcached on startup. This at least
  lets runners avoid lagged wikis but still work on others.
* Made a few small related documentation and code cleanups.

Change-Id: Ic1339bab54cba6b6cbea7d97a80ff87c7c5c87af
2014-09-24 00:59:55 +00:00
Aaron Schulz
963430d34c Fixed --maxtime handling by JobRunner
Bug: 71073
Change-Id: I4ddebd5aad27d0882dd2e4614df91ac565a71d2d
2014-09-19 22:45:59 +00:00
Aaron Schulz
797c7c9005 More tweaks to job backoff code
* Replace one time() call with microtime() in syncBackoffDeltas().
  Also moved the call down slightly to not count flock() delay.
* Moved read-only case logic into syncBackoffDeltas().
* Moved $backoffExpireFunc logic into syncBackoffDeltas().
* Tightened the syncBackoffDeltas() checks around pop() for
  better accuracy.

Change-Id: Ifed3d24ba62277c0e0f52cdc1051990a590be18a
2014-09-03 19:37:50 +00:00
Aaron Schulz
db5ad07b24 Improved job backoff handling to be more properly per-server
Change-Id: I7c8e44a474ca8d05771477665ba508a358c5747d
2014-09-03 10:25:30 -04:00
Bartosz Dziewoński
9e5443abc3 No space within the ?: operator
This style is a lot more common in our code.

Change-Id: I7f2fb3716c24c4a95a4c6c4a732b0226c315f242
2014-08-25 17:40:51 +02:00
Aaron Schulz
5fc32ab830 Added more JobRunner docs
Change-Id: I4fbb947a2fc2b5f325dff97127bf39edf18f0e13
2014-08-15 15:19:46 -07:00
Aaron Schulz
e0ad91f058 Just log exceptions instead of spamming them in JobRunner
* Even the CLI script already shows the error=X snippet.
  Logging the error should be enough, and avoids showing
  output when all we want is the JSON.

Change-Id: Iade412ea61cf427865d841ecab5498e4fcdb7e13
2014-07-29 16:01:40 -07:00
Aaron Schulz
094d901b88 Refactored duplicated code into JobRunner.php
* Also added an async flag to SpecialRunJobs so that it can be
  set to false to get a JSON blob back with a regular 200 status.

Change-Id: I2f5763e017684c3c61f3d3f27ddf7f7834bdfce2
2014-07-25 17:28:10 +00:00
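
Illustrative sketch (not part of the commit history). Several of the commits above describe conventions for job queue metrics: 427bdb6dbd qualifies keys with 'jobqueue.', separates the key from the job type with a dot, and uses '.all' for aggregates; df2dc2ef9b excludes the delay time from pickup stats; 4f0b16b914 notes that timing values must be in milliseconds. The Python below is only a rough illustration of those three rules taken together. It is not the MediaWiki code (which is PHP), and the function names, parameter names, and the metric name "pickup_delay" are assumptions made up for this example.

```python
from typing import Optional


def metric_key(name: str, job_type: Optional[str] = None) -> str:
    """Build a dotted metric key (hypothetical helper).

    Follows the naming rules described in commit 427bdb6dbd:
    'jobqueue.' qualifier, no 'job-' prefix, a dot before the job
    type, and '.all' as the suffix for aggregates.
    """
    suffix = job_type if job_type is not None else "all"
    return f"jobqueue.{name}.{suffix}"


def pickup_delay_ms(queued_ts: float,
                    release_ts: Optional[float],
                    popped_ts: float) -> float:
    """Return the pickup time in milliseconds (hypothetical helper).

    All timestamps are in seconds. For a delayed job, the clock starts
    at its release time, so the intentional delay does not count.
    """
    start = queued_ts if release_ts is None else max(queued_ts, release_ts)
    # Convert seconds to milliseconds; clamp at zero against clock skew.
    return max(0.0, (popped_ts - start) * 1000.0)


# Example: an aggregate key and a per-type key ("refreshLinks" is used
# here purely as a sample job type).
aggregate_key = metric_key("pickup_delay")
per_type_key = metric_key("pickup_delay", "refreshLinks")

# Example: a job queued at t=100s, released at t=130s, popped at t=130.5s.
delayed_ms = pickup_delay_ms(100.0, 130.0, 130.5)
```

Keeping the key builder separate from the timing calculation means the same naming rule can be reused for any other metric under the 'jobqueue.' prefix.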