Commit graph

167 commits

Author SHA1 Message Date
jenkins-bot
996b7350f3 Merge "Add tags support to patrol, protect, unblock, and undelete" 2016-03-03 16:28:45 +00:00
aude
437f60f358 Add ContentHandler::supportsCategories method
and check for this in WikiPage::doEditUpdates before
inserting a new CategoryMembershipChangeJob.

Some content models like the Wikibase ones do not
have categories and it's wasteful to add these jobs
for all Wikibase edits.

Bug: T126977
Change-Id: I2c54a4ba1546445dc41101e15cb83a2c6cc2b1c9
2016-03-02 15:20:56 +01:00
Geoffrey Mon
e70c4eb664 Add tags support to patrol, protect, unblock, and undelete
- Add 'tags' parameters to appropriate API modules
- Add tag-adding logic to the functions that carry out the
  relevant actions
- ManualLogEntry::{set,get}Tags to handle adding tags to log
  entries in a cleaner fashion
- Use ManualLogEntry::setTags in LocalFile::recordUpload2

Bug: T97720
Change-Id: I98c52da7985623bfdafda2dc2dae937b39b72419
2016-02-29 16:59:31 -05:00
Kunal Mehta
6e9b4f0e9c Convert all array() syntax to []
Per wikitech-l consensus:
 https://lists.wikimedia.org/pipermail/wikitech-l/2016-February/084821.html

Notes:
* Disabled CallTimePassByReference due to false positives (T127163)

Change-Id: I2c8ce713ce6600a0bb7bf67537c87044c7a45c4b
2016-02-17 01:33:00 -08:00
jenkins-bot
df85890b76 Merge "Make change tagging of edits in RecentChange::notifyNew/Edit" 2016-02-12 02:46:31 +00:00
jenkins-bot
d4ecfc1a5c Merge "Don't modify $wgHooks on language object construction" 2016-02-11 04:05:56 +00:00
cenarium
b009e0af21 Make change tagging of edits in RecentChange::notifyNew/Edit
Change tags to apply to an edit can now be passed directly to the
WikiPage::doEditContent function. They are then passed to the
RecentChange::notifyNew/Edit functions where tagging is made
after the recent change is saved. This ensures that other callers
of doEditContent will not run into the same issue as T100248.
ApiRollback is fixed in this way.
In addition, we'll have to pass tags in this way for core tagging
of edits (I2e48bd458fc8d7c289f04dc276f9287516e0b987), and this makes
it possible to merge the arrays of tags and call ChangeTags::addTags
only once.

Change-Id: I829960c7a33b70464065839d7504d7529dfd0b72
2016-02-10 13:03:30 +01:00
Aaron Schulz
f4a4457403 Convert page modification to using startAtomic()/endAtomic()
A few semantic changes result from this:
* If multiple pages are edited in a request, the updates happen
  in the same order relative to each other, but all in one second
  step instead of after each page edit.
* If the same page is edited twice in a request, the WikiPage hook
  argument will reflect the last request edit, not always the edit
  that fired the hook.

Bug: T120718
Change-Id: I9429f29e5a90f24e4d7af5797a80e63a9cc34146
2016-02-09 00:03:05 +00:00
addshore
e95da15372 Page is an interface not a class
Change-Id: I299067320a2f8e541eba668464eb18c42bfb56e5
2016-02-02 19:13:39 +01:00
Tim Starling
059fd9a2ae Don't modify $wgHooks on language object construction
Previously various language objects would install a hook to update the
shared conversion table cache when the object was constructed. This is
not a good idea since language objects may be constructed even when they
are not the content language, but only the content language is
associated with variant conversion and the conversion cache.

Instead, have WikiPage call a method on $wgContLang directly. I put this
with message cache update since the logic is almost identical.

Change-Id: Ief9c0ef993e39645e74a6e158cb4e6e2139ce91d
2016-01-29 15:03:56 +11:00
Aaron Schulz
36a87a8902 Convert page creation to using startAtomic()/endAtomic()
A few semantic changes result from this:

* If multiple pages are created in a request, the updates happen
  in the same order relative to each other, but all in one second
  step instead of after each page edit.
* If an extension set some extra Status info or errors via the
  PageContentInsertComplete hook, they will not be seen by the
  caller (unless it was a CLI script possibly). Few callers use
  $status at all, and I did not see any that mutated it. Since
  the page is already committed when this hook runs (as has always
  been the case), they cannot veto edits and callers do not care or know what
  to do with random hook-set status errors; there was never much use
  in changing the Status anyway.

Bug: T120718
Change-Id: Ieba35056be31b2f648c57f59d19d3cbbe58f1b05
2016-01-27 11:38:05 -08:00
Florian
0acebab76c Remove WikiPage::getRawText()
Bug: T122754
Depends-On: I29ec61c482057c5b3b1048c834aedac182174929
Depends-On: I74e57d8e76149b452a9635ad8a6eca91c3df96a9
Depends-On: Ia624ffeb2d9b1862f943f7c3103df417d90001c5
Depends-On: I4a5a0d34156f9aae09a3edbe736fd924bc74773f
Depends-On: I2355b7d4a1b831cd752cbaa88bf1878e0d5554fb
Depends-On: I02051f0c74b4db93093f171f1250c03b99f6cec6
Depends-On: Ie88d05a6534ac1d02fb79494603cea17108e6bb9
Change-Id: Ie3247a7143859bf10580e67cd5383d152540a25b
2016-01-13 20:45:23 +00:00
This, that and the other
b2da5cf4e4 Update the WikiPage object with the new ID when undeleting
Issue introduced by 0aa6486cbf.

Change-Id: I2c4fde5e66f280a6bc2de1b13453f1c40385a20d
2016-01-13 00:30:33 +11:00
victorbarbu
0aa6486cbf Use ar_page_id on undeletion
Bug: T28123
Change-Id: I882b8ba09d68e7475e1d0934328730059574e292
2016-01-11 23:09:50 +00:00
Reedy
48db568102 WikiPage::testPreSaveTransform() was removed
Change-Id: Ibce32556b8213a36876dcb4c6f385afa6e6875aa
2016-01-02 20:56:43 +00:00
Reedy
07e384e490 WikiPage::updateRestrictions() was removed
Change-Id: If5bdf84b94fec928387ee12492fbec1f511ca059
2016-01-02 20:47:25 +00:00
Reedy
cfaf26e501 WikiPage::getUsedTemplates() was removed
Change-Id: I5c17a57042025b2f72083a97034a5a2dd6c8cfb5
2016-01-02 02:56:17 +00:00
Aaron Schulz
4860e1d5ac Move ArticleSaveComplete hook to doCreate()/doModify() methods
This makes it easier to migrate the methods to using atomic
sections without the PageContentInsertComplete hook
ending up in a separate transaction from
ArticleSaveComplete.

Bug: T120718
Change-Id: I492514413ec9c37c2f9343bb207798fc8e24a5a9
2015-12-13 04:42:46 -08:00
Aaron Schulz
4302b0419d Rename getSquidURLs() => getCdnUrls()
Change-Id: I433acc7990a5fcefd0d2ff5b14ba33dec0424706
2015-12-11 16:40:35 -08:00
Aaron Schulz
6af3c39c07 Replace "squid" with "CDN" in various comments
Change-Id: Idcc528daf28e119349155d36e30a9bcf61b2e7d5
2015-12-09 17:35:37 -08:00
Aaron Schulz
282c5fa9f3 Rename SquidUpdate => CdnCacheUpdate
Squid is not the only possible CDN

Change-Id: Ie2a2955847c5706e630322bbbab71c9d063b378f
2015-12-09 16:31:17 -08:00
Aaron Schulz
c655e38f75 Renamed confusing initial $status var in doEditContent()
Change-Id: I22cad9eb3fb4040e5506b0cccd573871d108d257
2015-12-07 19:10:17 -08:00
Aaron Schulz
ebc55440f9 Split out edit/create methods from doEditContent()
* Make the method sizes a bit more manageable.
  This will be useful for replacing the begin/commit
  calls later (with startAtomic/endAtomic).
* Cleaned up a few inconsistencies in code style.

Change-Id: I8d66503a5575ca369cd5feb56058af7d24001629
2015-12-06 19:49:40 -08:00
jenkins-bot
e0b8359250 Merge "Defer the redirect table update in WikiPage::insertRedirect()" 2015-12-04 19:30:05 +00:00
jenkins-bot
410d5e5cd2 Merge "Remove unused WikiPage::getLastNAuthors() method" 2015-12-04 19:20:33 +00:00
Aaron Schulz
afbff42aca Make CDN purge calls use DeferredUpdates
* Using addUpdate() makes sure purges are coalesced and
  de-duplicated.
* Also removed inconsistent $wgUseSquid checks. If CDN caching
  is not used, then $wgSquidServers will just be empty anyway.

Bug: T119016
Change-Id: I8b448366f037f668385d252f9d68289b71d1a707
2015-12-04 19:09:03 +00:00
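
The coalescing and de-duplication described in afbff42aca can be pictured with a small Python model. This is only a sketch of the idea, not MediaWiki's PHP implementation: `PurgeQueue`, `add_update`, `do_updates`, and the `send` callback are hypothetical names chosen for illustration.

```python
class PurgeQueue:
    """Toy deferred-update queue that merges CDN purge requests (hypothetical)."""

    def __init__(self, send):
        self._send = send                      # callback that performs the real purge
        self._pending: dict[str, None] = {}    # dict used as an insertion-ordered set

    def add_update(self, urls):
        """Queue URLs for purging; duplicates are dropped."""
        for url in urls:
            self._pending.setdefault(url, None)

    def do_updates(self):
        """Send one merged purge for everything queued so far."""
        if self._pending:
            self._send(list(self._pending))
            self._pending.clear()


sent = []
queue = PurgeQueue(sent.append)
queue.add_update(["/wiki/A", "/wiki/A?action=history"])
queue.add_update(["/wiki/A", "/wiki/B"])   # "/wiki/A" is already queued
queue.do_updates()
```

Two separate purge requests end up as a single call carrying three distinct URLs, which is the behaviour the commit message attributes to using addUpdate().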
Aaron Schulz
6dedffc2d7 Move category membership RC updates to CategoryMembershipChangeJob
* Recursive link updates no longer mention any category changes.
  It's hard to avoid either duplicate mentioning of changes or
  confusing explicit and automatic category changes.
* LinksUpdate no longer handles this logic, but rather WikiPage
  decides to spawn this update when needed in doEditUpdates().
* Fix race conditions with calculating category deltas. Do not
  rely on the link tables for the read used to determine these
  writes, as they may be out-of-date due to slave lag. Using the
  master would still not be good enough since that would assume
  FIFO and serialized job execution, which is not guaranteed.
  Use the parser output of the relevant revisions to determine
  the RC rows. If 3 users quickly edit a page's categories, the
  old way could misattribute who actually changed what.
* Make sure RC rows are inserted in an order that matches that
  of the corresponding revisions.
* Better avoid mentioning time-based (parser functions) category
  changes so they don't get attributed to the next editor.
* Also wait for slaves between RC row insertions if there were
  many category changes (in theory it could be well over 10K rows).
* Using a separate job better separates concerns as LinksUpdate
  should not have to care about recent changes updates.
* Added more docs to $wgRCWatchCategoryMembership.

Bug: T95501
Change-Id: I5863e7d7483a4fd1fa633597af66a0088ace4c68
2015-12-03 11:28:05 +00:00
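
The attribution rule in 6dedffc2d7 (derive category changes from the parser output of the revisions themselves, in revision order) can be sketched in Python as below. This is an illustrative model only: the `Revision` tuple and `membership_changes` function are hypothetical, and each revision is assumed to already carry the category set taken from its parser output.

```python
from typing import NamedTuple


class Revision(NamedTuple):
    rev_id: int
    user: str
    categories: frozenset   # categories in this revision's parser output (assumed given)


def membership_changes(revisions):
    """Return (rev_id, user, added, removed) per revision, in revision order."""
    ordered = sorted(revisions, key=lambda r: r.rev_id)
    changes = []
    for parent, rev in zip(ordered, ordered[1:]):
        # Compare each revision with its parent rather than with shared link tables.
        added = sorted(rev.categories - parent.categories)
        removed = sorted(parent.categories - rev.categories)
        if added or removed:
            changes.append((rev.rev_id, rev.user, added, removed))
    return changes


history = [
    Revision(1, "Alice", frozenset({"Cats"})),
    Revision(2, "Bob", frozenset({"Cats", "Dogs"})),
    Revision(3, "Carol", frozenset({"Dogs"})),
]
changes = membership_changes(history)
```

Because each delta is computed from a revision and its parent, Bob is credited with adding "Dogs" and Carol with removing "Cats", regardless of the order in which the jobs happen to run.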
Aaron Schulz
325faeea83 Defer the redirect table update in WikiPage::insertRedirect()
This avoids contention slams and synchronous master DB writes 
on HTTP GET requests.

Bug: T119742
Bug: T92357
Change-Id: I7b3ebac0d6a11542c47ddf3219911be54380c537
2015-12-02 22:53:07 +00:00
Aaron Schulz
068045a257 Remove unused WikiPage::getLastNAuthors() method
Change-Id: I06b617d7af5169046b484d22931922bf2f9a5b74
2015-12-01 01:58:57 -08:00
Aaron Schulz
fb4cb75e9f Split out WikiPage 'page' field for EditPage
* This fixes numerous IDEA warnings
* Also fixed some other warnings by fixing documentation

Change-Id: I2a76ce79c0d04a28a6cd74116dfce4e67435f44a
2015-11-25 16:14:01 -08:00
Aaron Schulz
9b386d2436 Race condition fixes for refreshLinks jobs
* Use READ_LATEST when needed to distinguish slave lag
  affecting new pages from page deletions that happened
  after the job was pushed. Run-of-the-mill mass backlink
  updates still typically use "masterPos" and READ_NORMAL.
* Search for the expected revision (via READ_LATEST)
  for jobs triggered by direct page edits. This avoids lag
  problems for edits to existing pages.
* Added a CAS-style check to avoid letting jobs clobber
  the work of other jobs that saw a newer page version.
* Rename and expose WikiPage::lock() method.
* Split out position wait logic to a separate protected
  method and made sure it only got called once instead of
  per-title (which didn't do anything). Note that there is
  normally 1 title per job in any case.
* Add FIXME about a related race condition.

Bug: T117332
Change-Id: Ib3fa0fc77040646b9a4e5e4b3dc9ae3c51ac29b3
2015-11-16 13:21:05 -08:00
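
A compare-and-set style guard of the kind mentioned in 9b386d2436 might look like the following Python sketch. It is a simplified model under stated assumptions: `PageState`, `links_updated_rev`, and `refresh_links` are hypothetical names, and page versions are compared by revision number purely for illustration; the commit message does not say which value the real check compares.

```python
from dataclasses import dataclass


@dataclass
class PageState:
    links_updated_rev: int   # revision the link tables were last refreshed for (assumed field)


def refresh_links(page: PageState, job_rev: int) -> bool:
    """Refresh links for job_rev unless a job that saw a newer version already ran."""
    if page.links_updated_rev > job_rev:
        return False             # stale job: do not clobber newer work
    page.links_updated_rev = job_rev
    return True


page = PageState(links_updated_rev=0)
first = refresh_links(page, 5)    # job for revision 5 runs first
second = refresh_links(page, 3)   # late job for revision 3 is skipped
```

In a real database the comparison and the write would need to happen under the page row lock or in a single conditional UPDATE; the sketch leaves that out.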
Aaron Schulz
0686221b2d Remove 'bot' check before trying the edit stash
Some user groups, like the 'flood' one on wmf sites might
get caught up as false positives.

Change-Id: I31be62b2239477572bc063f1af0329f248bbcaf6
2015-11-05 21:36:02 -08:00
Aaron Schulz
25d821830c Remove WikiPage::doQuickEdit
Change-Id: If56f790b8a29b2262cba0feff7a96312c69cdb0c
2015-11-04 02:15:40 +00:00
Aaron Schulz
6da45bcaef Add updateRevisionOn() sanity check for existing pages too
Change-Id: I4f2fc07b0365183efb431a828d40c557b691b18c
2015-11-03 16:44:52 -08:00
jenkins-bot
b48f38a9fc Merge "Avoid use of rollback() in WikiPage::doEditContent()" 2015-11-03 23:36:53 +00:00
Aaron Schulz
c11474714c Avoid use of rollback() in WikiPage::doEditContent()
* Use the CAS style page row checks instead
* This is a first step towards switching to startAtomic/endAtomic
* Only a sanity check uses rollback() now, which should never
  be hit unless there is a serious DB layer bug

Change-Id: Ideb189f918dee5d3e3c7b91cb92179df514ef35a
2015-11-03 15:22:51 -08:00
jenkins-bot
31cd8cd486 Merge "objectcache: Introduce IExpiringStore for convenient TTL constants" 2015-10-28 04:31:33 +00:00
Timo Tijhof
e8275758fe objectcache: Introduce IExpiringStore for convenient TTL constants
Also consistently use self:: instead of BagOStuff:: for constants
referenced within the BagOStuff class.

Change-Id: I20fde9fa5cddcc9e92fa6a02b05dc7effa846742
2015-10-28 04:07:25 +00:00
Kunal Mehta
c52e5a21f6 LinksUpdate: Keep track of the triggering User
So extensions like Echo are able to attribute post-edit link updates to
the specific users who triggered them.

Bug: T116485
Change-Id: I083736a174b6bc15e3ce60b2b107c697d0ac13da
2015-10-27 17:10:19 -07:00
jenkins-bot
cc167acbc6 Merge "Fixes related to WikiPage::triggerOpportunisticLinksUpdate()" 2015-10-27 10:07:48 +00:00
jenkins-bot
0a90e7ca65 Merge "Convert doDeleteArticleReal to startAtomic()/endAtomic()" 2015-10-26 19:23:52 +00:00
Aaron Schulz
c2a52446e2 Convert doDeleteArticleReal to startAtomic()/endAtomic()
* They no longer commit the update, but just make sure
  it is part of a transaction. The BEGIN/COMMIT will
  happen at request start/end given DBO_TRX or in this
  method otherwise (like when in CLI mode). This avoids
  premature committing of other transactions.
* FileDeleteForm was the only caller using $commit=false
  for WikiPage::doDeleteArticleReal() and its proxies
  WikiPage::doDeleteArticle() and Article::doDeleteArticle().
  The ugly $commit flag is now removed.
* No caller was using $id for WikiPage::doDeleteArticleReal()
  and its proxies WikiPage::doDeleteArticle() and
  Article::doDeleteArticle(). It is now removed and we can
  be sure the lock() and CAS logic always hit in the method.
  The rollback() calls are not needed given the page row lock
  and having them there could break outer transactions.
* Updated FileDeleteForm to use startAtomic()/endAtomic() so
  the article and file delete are still wrapped in a
  transaction. Note that LocalFile::delete() does reference
  counting and trxLevel() checks so it will not try to begin()
  and break FileDeleteForm's transaction via startAtomic().
* Moved less important 'page-recent-delete' key update down
  for sanity in case it blows up.

Change-Id: Idb98510506c0edd02236c30badaec97d86e07696
2015-10-26 12:08:53 -07:00
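
The first point of c2a52446e2 (atomic sections join an open transaction and only commit when they had to open one themselves) can be modelled in a few lines of Python. This is a toy sketch, not the MediaWiki database API: the `Connection` class, its attributes, and the `log` list are all hypothetical.

```python
class Connection:
    """Toy DB handle with atomic sections (illustrative model only)."""

    def __init__(self) -> None:
        self.trx_open = False           # is a transaction currently open?
        self.trx_automatic = False      # was it opened implicitly by start_atomic()?
        self.sections: list[str] = []   # stack of open atomic section names
        self.log: list[str] = []        # statements "sent" to the server

    def begin(self) -> None:
        self.trx_open = True
        self.log.append("BEGIN")

    def commit(self) -> None:
        self.trx_open = False
        self.trx_automatic = False
        self.log.append("COMMIT")

    def start_atomic(self, name: str) -> None:
        if not self.trx_open:           # e.g. CLI mode: nothing open yet
            self.begin()
            self.trx_automatic = True
        self.sections.append(name)

    def end_atomic(self, name: str) -> None:
        if not self.sections or self.sections[-1] != name:
            raise RuntimeError(f"unbalanced atomic section: {name}")
        self.sections.pop()
        # Commit only if the outermost section opened the transaction itself.
        if not self.sections and self.trx_automatic:
            self.commit()


cli = Connection()                      # no surrounding transaction
cli.start_atomic("delete")
cli.log.append("DELETE page")
cli.end_atomic("delete")

web = Connection()                      # request-wide transaction already open
web.begin()
web.start_atomic("delete")
web.log.append("DELETE page")
web.end_atomic("delete")
```

In the first case the section wraps the write in its own BEGIN/COMMIT; in the second it leaves the outer transaction open, so nothing is committed prematurely.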
Aaron Schulz
d705ae970a Fixes related to WikiPage::triggerOpportunisticLinksUpdate()
* Focus on updating links that would *not* already be updated
  by jobs, not those that already *will* be updated.
* Place the jobs into a dedicated queue so they don't wait
  behind jobs that actually have to parse every time. This
  helps avoid queue buildup.
* Make Job::factory() set the command field to match the value
  it had when enqueued. This makes it easier to have the same
  job class used for multiple queues.
* Given the above, remove the RefreshLinksJob 'prioritize' flag.
  This worked by overriding getType() so that the job went to a
  different queue. This required both the special type *and* the
  flag to be set if using JobSpecification or either ack() would
  route to the wrong queue and fail or the job would go in the
  regular queue. This was too messy and error prone. Cirrus jobs
  using the same pattern also had ack() failures for example.

Change-Id: I5941cb62cdafde203fdee7e106894322ba87b48a
2015-10-24 00:10:12 +00:00
Timo Tijhof
c45305bcd3 poolcounter: Add 'trigger' field to the slow-parse log
To better indicate that these are only triggered by page views.

We don't currently have any slow-parse logging for the parser
invocation that happens during save (which means we're potentially
missing lots of them).

Once we add that, this will help distinguish them.

Bug: T110760
Change-Id: I22be5684ef93efd410d683637e223f770d6c768c
2015-10-23 20:53:37 +00:00
jenkins-bot
682e2b9c6b Merge "Enable users to watch category membership changes #2" 2015-10-20 21:41:16 +00:00
addshore
d40cd42b9f Enable users to watch category membership changes #2
This is part of a chain that reverts:
e412ff5ecc.

NOTE:
- The feature is disabled by default
- User settings default to hiding changes
- T109707 Touching a file on wikisource adds and
      removes it from a category... Even when page
      has no changes.... WTF? See linked issue,
      marked as stalled with a possible way forward
      for this patch.
      @see https://gerrit.wikimedia.org/r/#/c/235467/

Changes since version 1:
- T109604 - Page names in comment are no longer
      url encoded / have _'s
- T109638 & T110338 - Reserved username now used
      when we can't determine a username for the change
      (we could perhaps set the user and id to be blank
      in the RC table, but who knows what this might do)
- T109688 - History links are now disabled in RC....
      (could be fine for the introduction and worked
      on more in the future)
- Categorization changes are now always patrolled
- Touching on T109672: in this change, emails will never
      be sent regarding categorization changes (this
      can of course be changed in a followup).
- Added $wgRCWatchCategoryMembership defaulting to true
      for enabling / disabling the feature
- T109700 - for cases when no revision was retrieved
      for a category change set the bot flag to true.
      This means all changes caused by parser functions
      & Lua will be marked as bot, as will changes that
      can't find their revision due to slave lag.

Bug: T9148
Bug: T109604
Bug: T109638
Bug: T109688
Bug: T109700
Bug: T110338
Bug: T110340
Change-Id: I51c2c1254de862f24a26ef9dbbf027c6c83e9063
2015-10-20 14:23:48 -07:00
Aaron Schulz
7a4348640e Make triggerOpportunisticLinksUpdate() directly use RefreshLinks
This makes the jobs *actually* end up in the 
'refreshLinksPrioritized' queue rather than 'refreshLinks'. 
These jobs can run fast in a higher priority queue since the 
output is already cached (much of the point of 'prioritize').
They won't if they get stuck behind regular 'refreshLinks'.

This also avoids the indirection of an intermediate job. 
The use of lazyPush() is already enough to prevent the user 
from experiencing cross-DC RTT overhead.

Change-Id: I5d0440588db09c299cd70191e5624ffc7ebf04c0
2015-10-20 20:55:51 +00:00
Aaron Schulz
670612f06c Deprecate redundant SquidUpdate::newSimplePurge()
Change-Id: Id6d92fca2a2b87e23930946f054cecd1f6d433be
2015-10-19 08:24:23 -07:00
Kevin Israel
aebdfef5fe Avoid creating lots and lots of cat_id gaps
Currently, INSERT...ON DUPLICATE KEY UPDATE is used to update the page
counts in the category table. However, MySQL 5.1.22 and newer, by default,
increment the counter for cat_id before checking for duplicate key errors.
This creates many gaps in the cat_id sequence.

To avoid this, check for existing category rows, and instead UPDATE any
that were found. It is hoped that the extra queries will not significantly
harm performance.

Change-Id: Ic2ab9ff14f04a0c7ea90a5b6756cade0c78e2885
2015-10-16 23:31:08 -04:00
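
The "check first, then UPDATE or INSERT" approach from aebdfef5fe can be sketched with Python's built-in sqlite3 module. This is a minimal illustration under assumptions: the three-column table is a simplified stand-in for the category table, `add_to_categories` is a hypothetical helper, and SQLite does not reproduce the MySQL auto-increment behaviour that motivated the change; the sketch only shows the query pattern.

```python
import sqlite3


def add_to_categories(conn, titles):
    """Increment cat_pages for existing rows; insert rows only for new titles."""
    titles = list(dict.fromkeys(titles))          # de-duplicate, keep order
    if not titles:
        return
    marks = ",".join("?" * len(titles))
    existing = {row[0] for row in conn.execute(
        f"SELECT cat_title FROM category WHERE cat_title IN ({marks})", titles)}
    for title in titles:
        if title in existing:
            conn.execute(
                "UPDATE category SET cat_pages = cat_pages + 1 WHERE cat_title = ?",
                (title,))
        else:
            # Only genuinely new categories reach INSERT, so no ID is consumed needlessly.
            conn.execute(
                "INSERT INTO category (cat_title, cat_pages) VALUES (?, 1)",
                (title,))
    conn.commit()


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE category ("
    "cat_id INTEGER PRIMARY KEY AUTOINCREMENT, "
    "cat_title TEXT UNIQUE NOT NULL, "
    "cat_pages INTEGER NOT NULL DEFAULT 0)")
add_to_categories(conn, ["Cats", "Dogs"])
add_to_categories(conn, ["Cats"])     # existing row: UPDATE only
add_to_categories(conn, ["Birds"])    # new row: INSERT
rows = conn.execute(
    "SELECT cat_id, cat_title, cat_pages FROM category ORDER BY cat_id").fetchall()
```

The extra SELECT is the cost the commit message refers to when it hopes the additional queries will not significantly harm performance.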
jenkins-bot
91d69e12e1 Merge "More specific @return doc in WikiPage::getDeletionUpdates" 2015-10-16 11:46:27 +00:00