
134 commits

Author SHA1 Message Date
aude
9442adf96d Fix exception in Import, when import of a revision fails
When an import fails for some reason, such as the user not having
permission, a 'notice' is thrown and the reason is reported to the
user.

In this case, $title is false and not a Title object,
as needed by the beforeImportPage callback (which calls
WikiPage::factory).  Likewise, $pageInfo['_title'] is undefined
in pageOutCallback, which also calls WikiPage::factory via
finishImportPage.

Bug: T108544
Change-Id: I55042fdf305cd1198d3a4b28a0ebb5ce31b76a1f
2015-09-22 16:22:39 +02:00
umherirrender
d8821f2b0b Fixed spacing
- Removed space after casts
- Removed spaces in array index
- Added spaces around string concat
- Added space after words: switch, foreach
- else if -> elseif
- Removed parentheses around require_once, because it is not a function
- Added newline at end of file
- Removed double spaces
- Added spaces around operations
- Removed repeated newlines

Bug: T102609
Change-Id: Ib860222b24f8ad8e9062cd4dc42ec88dc63fb49e
2015-06-17 20:22:32 +00:00
Kunal Mehta
f6e5079a69 Use mediawiki/at-ease library for suppressing warnings
wfSuppressWarnings() and wfRestoreWarnings() were split out into a
separate library. All usages in core were replaced with the new
functions, and the wf* global functions are marked as deprecated.

Additionally, some uses of @ were replaced due to composer's autoloader
being loaded even earlier.

Ie1234f8c12693408de9b94bf6f84480a90bd4f8e adds the library to
mediawiki/vendor.

Bug: T100923
Change-Id: I5c35079a0a656180852be0ae6b1262d40f6534c4
2015-06-11 18:49:29 +00:00
daniel
a43af3bc0e Reset Title cache when importing titles.
WikiImporter now uses NaiveImportTitleFactory, which in turn uses Title::makeTitleSafe,
bypassing the internal title cache. To avoid (potentially cached) Title objects obtained
via Title::newFromText getting out of sync, WikiImporter now clears the title
cache in addition to clearing the LinkCache.

NOTE: a test for this is provided by I2be12fa7d439b.

Bug: T89307
Change-Id: Ib50c48d4797fc21c62090c0be69e87f7e7d07428
2015-05-24 17:55:08 +02:00
This, that and the other
1fe98feab0 Make import destination UI more intuitive and clearer
Previously there were two fields: Destination namespace, and Destination
root page. They were both optional, and the "root page" one in particular
was a bit mysterious until you tried it out. In addition, there was a
strange interaction when you set both fields (I still don't quite
understand what used to happen in this case).

Now, there is a set of three clearly described radio buttons, allowing the
user to select whether to import pages into their automatically chosen
locations, into a single namespace, or as subpages of a given page. These
correspond to the three ImportTitleFactory classes available in MediaWiki.

See https://phabricator.wikimedia.org/M28 for a screenshot.

The logic of WikiImporter#setTargetNamespace is tweaked slightly to remove
the interaction between target namespace and target root page, since only
one of these options can now be set. Similarly, the API's import module
is modified in the same way.

Bug: T17908
Change-Id: I11521260a88a7f4a95fbdb71ac50bcf7b4fe5cd1
2015-04-22 18:46:40 +00:00
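
The three destination choices described in 1fe98feab0 can be pictured as three interchangeable title-building strategies. The following is a minimal Python sketch, not MediaWiki code: the function names are hypothetical, and the exact form of the subpage title (root page, slash, full foreign title) is an assumption made for illustration.

```python
def naive_title(ns, name):
    """Keep the automatically chosen location: namespace and name as given."""
    return f"{ns}:{name}" if ns else name


def namespace_title(target_ns):
    """Build a strategy that puts every page into one target namespace."""
    def make(ns, name):
        # The original namespace is discarded; only the page name survives.
        return f"{target_ns}:{name}" if target_ns else name
    return make


def subpage_title(root_page):
    """Build a strategy that imports every page as a subpage of root_page."""
    def make(ns, name):
        full = f"{ns}:{name}" if ns else name
        return f"{root_page}/{full}"
    return make


# One imported page, three possible destinations.
examples = [
    naive_title("Help", "Editing"),
    namespace_title("Project")("Help", "Editing"),
    subpage_title("Imported")("Help", "Editing"),
]
```

Because only one strategy is selected per import, there is no interaction between a target namespace and a target root page to reason about.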
This, that and the other
0f79f04a68 Enable entity loader and handle errors nicely in WikiImporter constructor
Two issues being addressed here:
* Slightly friendlier message (instead of fatal) if libxml is not present
* Need to make sure the entity loader is enabled when opening XML documents

Also provide an error message when XMLReader::open fails, as otherwise,
the user sees cryptic errors from code that tries to use the (unopened)
XMLReader.

Bug: T45868
Bug: T86036
Change-Id: Ibcccce9f09f87b17c3093fd0c3c3ff74d7dc6cb7
2015-04-11 11:40:17 +10:00
jenkins-bot
8a00f2445d Merge "Use XML localName when importing" 2015-04-09 00:46:24 +00:00
This, that and the other
45788085af Add null check in WikiImporter
This is my code, and it caused fatals in production whenever anyone tried
to import anything :(

This should get rid of the fatals, but obviously this won't fix the
underlying issue of WikiPage::getContent() sometimes returning null. See
the task for more info on that issue.

Bug: T94325
Change-Id: I68ce2288d7d209733bceffe42e1876c7afcd73d3
2015-03-30 19:50:59 +00:00
This, that and the other
59bcb425f8 Use XML localName when importing
XMLReader#name gives the qualified name, which was not a good thing to use.

Bug: T6520
Change-Id: I8174fe64791f0e8d0c6677169595201446eab583
2015-03-27 20:37:46 +01:00
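
The difference between a qualified name and a local name in 59bcb425f8 can be shown outside PHP. This is a minimal Python sketch using the standard library's xml.dom.minidom, not the WikiImporter code; the namespace URI and the "mw" prefix are made up for illustration. Matching on the qualified name breaks as soon as a dump uses a prefix, while the local name stays the same.

```python
from xml.dom.minidom import parseString

# Hypothetical dump fragments: the same element with and without a prefix.
PREFIXED = '<mw:page xmlns:mw="http://example.org/export"><mw:title>Foo</mw:title></mw:page>'
DEFAULT = '<page xmlns="http://example.org/export"><title>Foo</title></page>'


def root_names(xml_text):
    """Return (qualified name, local name) of the document's root element."""
    root = parseString(xml_text).documentElement
    return root.tagName, root.localName


prefixed_names = root_names(PREFIXED)
default_names = root_names(DEFAULT)
```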
Aaron Schulz
5085a4b5cf Made wfFindFile/wfLocalFile callers use explicit "latest" flags
* Callers that should not use caches won't
* Aliased the old "bypassCache" param to "latest"

Bug: T89184
Change-Id: I9f79e5942ced4ae13ba4de0b4c62908cc746e777
2015-03-06 04:18:50 +00:00
Chad Horohoe
c33f4de066 Profile all external HTTP requests from MW
Change-Id: Ie980b080da2ef21ec7d9fc32f1accc55710de140
2015-03-03 20:54:30 -08:00
physikerwelt
0a6912f20f Avoid access to array key that does not exist
Accessing an array element that is not set
causes a PHP notice. This change first checks whether the
array key is present.

Bug: T91127
Change-Id: I468a95851e6acdb8186a06b0a2ac73499cc4611f
2015-02-28 12:28:15 +00:00
Legoktm
eca35903a2 Merge "Cache countable statistics to prevent multiple counting on import" 2015-02-13 19:54:17 +00:00
daniel
766cb52048 Clean up state of libxml on failed import.
Make sure we always call XMLReader::close() to clean up libxml's internal state,
even if import fails with an exception. Otherwise, any subsequent attempt at importing
(or otherwise using an XMLReader) will fail with:

  XMLReader::open(): Unable to open source data

This is particularly annoying for unit tests which should be allowed to fail
without dragging down subsequent tests. Even more importantly, they may
explicitly be testing a failure case, which should not cause subsequent tests
to fail.

NOTE: Wikibase patch Id035ecebebb67 is blocked on this,
please re-check once this is merged.

Change-Id: I31c014df39aa11c11ded70050ef12a8e2c5fefc5
2015-02-11 11:27:21 +00:00
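
The cleanup guarantee described in 766cb52048 is the usual try/finally pattern. Below is a minimal Python sketch of the idea, not the actual patch: FakeReader is a hypothetical stand-in for XMLReader, and its read() always fails so that the failure path is exercised.

```python
class FakeReader:
    """Hypothetical stand-in for an XML reader that holds shared parser state."""

    def __init__(self):
        self.closed = False

    def read(self):
        # Simulate an import that fails part-way through.
        raise ValueError("malformed dump")

    def close(self):
        self.closed = True


def do_import(reader):
    """Run the import and always release the reader, even on failure."""
    try:
        reader.read()
    finally:
        # Runs whether read() returned or raised, so later imports start clean.
        reader.close()


reader = FakeReader()
try:
    do_import(reader)
    failed = False
except ValueError:
    failed = True
```

The exception still propagates to the caller; only the reader's state is cleaned up, which is what lets a failing test leave subsequent tests unaffected.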
daniel
891cc28a97 Common interface for ImportStreamSource and ImportStringSource.
ImportStringSource is handy for testing, but was unusable due to type hints
against ImportStreamSource. Introducing a common interface implemented by both
fixes this.

Change-Id: I820ffd8312789c26f55c18b6c46be191a550870a
2015-02-10 11:35:55 +01:00
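
The idea in 891cc28a97, type-hinting against a shared interface rather than one concrete source class, can be sketched as follows. This is an illustrative Python version under stated assumptions: the class names and the at_end/read_chunk method names are hypothetical and are not taken from the commit.

```python
import io
from typing import Protocol


class ImportSource(Protocol):
    """Hypothetical common interface that both source types satisfy."""

    def at_end(self) -> bool: ...
    def read_chunk(self) -> str: ...


class StringSource:
    """Serves a whole string as a single chunk; convenient in tests."""

    def __init__(self, text: str):
        self._text = text
        self._done = False

    def at_end(self) -> bool:
        return self._done

    def read_chunk(self) -> str:
        if self._done:
            return ""
        self._done = True
        return self._text


class StreamSource:
    """Reads fixed-size chunks from an open text stream."""

    def __init__(self, handle, chunk_size: int = 8192):
        self._handle = handle
        self._chunk_size = chunk_size
        self._done = False

    def at_end(self) -> bool:
        return self._done

    def read_chunk(self) -> str:
        chunk = self._handle.read(self._chunk_size)
        if chunk == "":
            self._done = True
        return chunk


def read_all(source: ImportSource) -> str:
    """Consumer written against the interface, so either source works."""
    parts = []
    while not source.at_end():
        parts.append(source.read_chunk())
    return "".join(parts)


from_string = read_all(StringSource("<mediawiki/>"))
from_stream = read_all(StreamSource(io.StringIO("<mediawiki/>"), chunk_size=4))
```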
This, that and the other
341dfa2587 Cache countable statistics to prevent multiple counting on import
At the moment, when $wgArticleCountMethod = 'link' (as it is on the WMF
cluster), we are querying the slave database before each individual
revision is imported, in order to find out whether the page is countable
at that time. This is not sensible, as (1) the slave lags behind the
master, but (2) even the master may not be up to date, since page link
updates take place through the job queue.

This change sets up a cache to hold countable values for pages where import
activity has already occurred. That way, we aren't hitting the DB on every
revision, only to get an incorrect response back.

Bug: T42009
Change-Id: I99189c82672d7790cda5036b6aa9883ce6e566b0
2015-02-04 18:00:36 +11:00
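
The caching approach described in 341dfa2587 amounts to remembering the countable value per page once import activity has touched it. A minimal Python sketch, assuming a hypothetical lookup callable in place of the database query; the class and method names are illustrative only.

```python
class CountableCache:
    """Remembers per-page countable values so the DB is asked at most once."""

    def __init__(self, lookup):
        self._lookup = lookup  # hypothetical stand-in for the DB query
        self._values = {}

    def is_countable(self, page_id):
        # Only the first request for a page reaches the lookup.
        if page_id not in self._values:
            self._values[page_id] = self._lookup(page_id)
        return self._values[page_id]

    def set_countable(self, page_id, value):
        # After importing a revision, record the new state directly.
        self._values[page_id] = value


calls = []


def slow_lookup(page_id):
    calls.append(page_id)
    return False


cache = CountableCache(slow_lookup)
first = cache.is_countable(7)
cache.set_countable(7, True)
second = cache.is_countable(7)
```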
jenkins-bot
31d239f9d9 Merge "Import: Fix error reporting" 2015-01-28 13:37:59 +00:00
jenkins-bot
9581677601 Merge "Proper namespace handling for WikiImporter" 2015-01-05 22:40:15 +00:00
Evan McIntire
d17ca39f15 Documented the Classes ImportStringSource and ImportStreamSource
Added short descriptions for each class

Change-Id: I28d3dea76ab70326a1e16b7c41b1f3758f8648b8
2014-12-30 20:51:28 -05:00
Kevin Israel
74faccfa26 Change case of class names to match declarations
Found by running tests under a version of PHP patched to report
case mismatches as E_STRICT errors.

User classes:
* MIMEsearchPage
* MostlinkedTemplatesPage
* SpecialBookSources
* UnwatchedpagesPage

Internal classes:
* DOMXPath
* stdClass
* XMLReader

Did not change:
* testautoLoadedcamlCLASS
* testautoloadedserializedclass

Change-Id: Idc8caa82cd6adb7bab44b142af2b02e15f0a89ee
2014-12-19 16:01:26 +00:00
Aaron Schulz
e369f66d00 Replace wfRunHooks calls with direct Hooks::run calls
* This avoids the overhead of an extra function call

Change-Id: I8ee996f237fd111873ab51965bded3d91e61e4dd
2014-12-10 12:26:59 -08:00
This, that and the other
37b4cd5da2 Proper namespace handling for WikiImporter
Up until now, the import backend has tried to resolve titles in the XML
data using the regular Title class. This is a disastrous idea, as local
namespace names often do not match foreign namespace titles.

There is enough metadata present in XML dumps generated by modern MW
versions for the target namespace ID and name to be reliably determined.
This metadata is contained in the <siteinfo> and <ns> tags, which
(unbelievably enough) was totally ignored by WikiImporter until now.
Fallbacks are provided for older XML dump versions which may be missing
some or all of this metadata.

The ForeignTitle class is introduced. This is intended specifically for
the resolution of titles on foreign wikis. In the future, an
InterwikiTitle class could be added, which would inherit ForeignTitle
and add members for the interwiki prefix and fragment.

Factory classes to generate ForeignTitle objects from string data, and
Title objects from ForeignTitle objects, are also added.

The 'AfterImportPage' hook has been modified so the second argument is a
ForeignTitle object instead of a Title (the documentation was wrong,
it was never a string). LiquidThreads, SMW and FacetedSearch all use this
hook but none of them use the $origTitle parameter.

Bug: T32723
Bug: T42192
Change-Id: Iaa58e1b9fd7287cdf999cef6a6f3bb63cd2a4778
2014-12-10 22:24:47 +11:00
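
The resolution strategy described in 37b4cd5da2, which trusts the namespace ID from the dump metadata instead of parsing the foreign namespace name locally, can be sketched like this. It is an illustrative Python version, not the ForeignTitle implementation: the namespace tables and the function name are hypothetical, and falling back to the main namespace for unknown IDs is an assumption.

```python
# Hypothetical <siteinfo> data from a foreign dump: namespace ID -> foreign name.
FOREIGN_NAMESPACES = {0: "", 4: "Wikipédia", 10: "Modèle"}
# Hypothetical local wiki configuration: namespace ID -> local name.
LOCAL_NAMESPACES = {0: "", 4: "Project", 10: "Template"}


def resolve_title(full_title, ns_id=None):
    """Map a foreign title to a local one via namespace ID, not namespace name."""
    if ns_id is None:
        # Fallback for older dumps without <ns>: match the prefix against
        # the namespace names listed in the dump's own siteinfo.
        prefix, sep, _ = full_title.partition(":")
        ns_id = next(
            (i for i, name in FOREIGN_NAMESPACES.items() if sep and name == prefix),
            0,
        )
    foreign_name = FOREIGN_NAMESPACES[ns_id]
    # Strip the foreign prefix if present, leaving the bare page name.
    if foreign_name and full_title.startswith(foreign_name + ":"):
        page_name = full_title[len(foreign_name) + 1:]
    else:
        page_name = full_title
    local_name = LOCAL_NAMESPACES.get(ns_id, "")
    return f"{local_name}:{page_name}" if local_name else page_name


with_ns_tag = resolve_title("Modèle:Infobox", ns_id=10)
without_ns_tag = resolve_title("Wikipédia:Accueil")
main_namespace = resolve_title("Paris")
```

Resolving "Modèle:Infobox" with the regular local title parser would treat "Modèle" as part of a main-namespace page name, which is the mismatch the commit describes.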
Kunal Mehta
4a2ecaa046 Import.php: Use Config instead of globals
Change-Id: I4d1a8c443cfa360c5d388364c580d48fa7124099
2014-10-22 10:35:29 -07:00
Jeff Janes
a9074bef81 Import: Fix error reporting
FileRepoStatus does not have a getXml method.  Make the import
routine invoke getHTML instead.

Change-Id: I571cfe7165b92397f205c8710d260feeec5cc2ca
2014-09-12 21:18:22 -07:00
Stephan Gambke
f09e458d39 Fix for Ia9baaf0b: Make previously public variables public again
Change Ia9baaf0b changed the visibility of member variables (many of which are not
otherwise exposed, e.g. by a method) and by that introduced a major API change
breaking extensions.

This patch explicitly marks affected variables as public again, keeping the intent
of the original patch of making phpcs-strict pass on includes/ directory.

Bug: 67522
Bug: 67984
Change-Id: I498512b2a1e615365bb477c1fd210aaa3241ca03
2014-08-29 23:01:53 +02:00
umherirrender
b409008ca5 Remove wrong @return from doc blocks
These functions do not actually return anything, so the @return is
wrong here. '@return void' is ignored.

Change-Id: I11495ee05b943c16c1c4715d617c8b50de22276c
2014-08-25 13:50:05 +00:00
Thiemo Mättig
bf4d36e29e Drop "left in" debugging var_dump from WikiImporter
I found this by accident when searching for a var_dump I forgot
somewhere in my own code. We are maintaining production code here,
right? Debugging and testing should be somewhere else.

Also note the stray print before the var_dump.

Change-Id: I98725b277039f55db9ff95399e9559a477b43c26
2014-08-22 16:11:30 +02:00
jenkins-bot
d55911358e Merge "Remove unused XMLReader2 class" 2014-07-27 00:19:08 +00:00
This, that and the other
0afc858296 Use master DB to check for page existence during import
By default, slaves are used for the existence check. However, in the case
of importing many revisions of the one page, the chances are that they
won't have caught up to the fact that that page has just been created,
causing site statistics to be incorrectly updated. We need to use the
master DB for this check.

Bug: 40009
Change-Id: I301353fb976a982f58635b87d9960d81fc541d14
2014-07-26 06:54:50 +00:00
This, that and the other
13ea23c484 Remove unused XMLReader2 class
Undocumented and unused within core. Was previously used in WikiImporter,
but that use was removed in r81437.

Change-Id: I45f4ff3fae19a7d9c1a0dacb2e02d53ee4bdaefb
2014-07-26 11:53:51 +10:00
umherirrender
1c68a1ee86 Cleanup some docs (includes/*.php)
- Swap "$variable type" to "type $variable"
- Added missing types
- Fixed spacing inside docs
- Makes beginning of @param/@return/@var/@throws in capital
- Changed some types to match the more common spelling

Change-Id: I783e4dbfe5f6f98b32b9a03ccf6439e13e132bcc
2014-07-24 19:42:24 +02:00
umherirrender
4ee680a8b3 Fixed spacing
- Removed spaces after not operator (!)
- Removed spaces inside array index
- use tab as indent instead of spaces
- Add newline at end of file
- Removed spaces after casts

Change-Id: I9ba17c4385fcb43d38998d45f89cf42952bc791b
2014-07-24 11:53:04 +02:00
umherirrender
2b021dc48a Fixed spacing
- Added/removed spaces around parentheses
- Added space after switch/if/foreach
- changed else if to elseif

Change-Id: I99cda543e0e077320091addd75c188cb6e3a42c2
2014-07-19 23:12:10 +02:00
Adrian Lang
5d8fd152f3 Correct doc of WikiImporter::__construct parameter
Change-Id: I0c61bb4f8d1e51f3b58ff99a9c632561dfd5134d
2014-06-02 12:09:15 +02:00
jenkins-bot
487823ba89 Merge "Correctly parse 'redirect' XML tag during Special:Import." 2014-05-28 18:56:40 +00:00
Sebastian Brückner
85b695c2ee Correctly parse 'redirect' XML tag during Special:Import.
Bug: 65481
Change-Id: Id9b3b7878b2e7b6fc7a06b163e5bac60e700490e
2014-05-28 16:47:49 +02:00
jenkins-bot
36e4a83640 Merge "Introduce ContentHandler::importTransform." 2014-05-27 18:26:19 +00:00
daniel
5ca37ababd Introduce ContentHandler::importTransform.
ContentHandler::importTransform allows ContentHandler
implementations to apply transformations on page content
upon import. Such transformations may be useful to update
from legacy formats, apply ID rewriting, etc.

Note that the transformation is done on the serialized content.
This allows for a "raw" import implementation that writes
imported blobs directly into a blob store without unserializing
them into an intermediary representation. Implementations may
choose to unserialize, transform, and then re-serialize.

Bug: 65256
Change-Id: I290fdf5589af43def8b3eddb68b5e1c23f6124e8
2014-05-20 19:12:35 +02:00
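
The contract described in 5ca37ababd, a transformation that takes and returns serialized content, can be sketched as below. This is an illustrative Python version and not the ContentHandler API: the class names, the JSON format, and the legacy "name" key are all hypothetical.

```python
import json


class ContentHandler:
    """Base handler: serialized content is imported unchanged by default."""

    def import_transform(self, blob: str) -> str:
        return blob


class LegacyJsonHandler(ContentHandler):
    """Hypothetical handler that upgrades a legacy key while importing."""

    def import_transform(self, blob: str) -> str:
        data = json.loads(blob)                   # unserialize
        if "name" in data:                        # hypothetical legacy key
            data["label"] = data.pop("name")      # transform
        return json.dumps(data, sort_keys=True)   # re-serialize


raw = ContentHandler().import_transform('{"name": "Q1"}')
upgraded = LegacyJsonHandler().import_transform('{"name": "Q1"}')
```

Since both input and output are serialized strings, a handler that needs no change never has to unserialize at all, which is what makes a "raw" blob import possible.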
Alexander Lehmann
ba16bbd0a9 Inserted test whether the resource 'uploadsource' is already registered.
Bug: 65530

Change-Id: I1b82d6dc6a37792d4e7b7d01316802ea4d38a88b
2014-05-20 13:12:02 +00:00
Siebrand Mazeland
a7fbdd6503 Make phpcs-strict pass on includes/ (7/7)
Change-Id: Ia9baaf0b3cdbe1a3c6b50ef8c4fe86fead88f909
2014-05-15 20:07:09 +02:00
Alexandre Emsenhuber
5c4bf6b9bd Fix coding style from Ie40c0721ec (e9f01c9)
- Opening brace goes on the same line as the function definition
- No backslash before class name in @return

Change-Id: I4c43e047c36d0ce6e9c2344f6ee98786b2b8eac4
2014-05-15 16:44:25 +02:00
Alexander Lehmann
e9f01c9324 Inserted getter for the XMLReader and changed the visibility of some
functions for use in hooks.
Bug: 64657

Change-Id: Ie40c0721ec32935294756d60ea6686ebeefa61af
2014-05-15 14:17:36 +02:00
umherirrender
192a9d021c Fixed some @params documentation (includes/[Export.php|Import.php])
Swapped some "$var type" to "type $var" or added missing types
before the $var. Changed some other types to match the more common
spelling. Capitalized the beginning of some text.

Change-Id: I32ed752d23088f6203462134b2af57c6f06a4de5
2014-04-23 11:30:40 +02:00
umherirrender
b9cd789fce docs: closure -> Closure; callback -> callable
Changed closure to capital word Closure in doc and type hint,
also changed callback in docs to callable

Change-Id: I52c8e8f13d38a837052101c38b9986be780ca057
2014-04-19 08:43:31 +02:00
Ladsgroup
1ba0445c12 Changing URLs of mediawiki.org in scripts to the SSL-based website
http://www.mediawiki.org --> https://www.mediawiki.org

Part 2

Change-Id: I3be61fe3dfb502cc20180486eb1a8016eac151df
2014-03-12 23:24:03 +00:00
jenkins-bot
70ae276db1 Merge "(bug 47070) check content model namespace on import." 2014-01-28 20:56:02 +00:00
umherirrender
65a4ae9fe9 Change Title::getInterwiki() in conditions to Title::isExternal()
Change-Id: Icce26e6194ae96f262029554e05b49117d5e112e
2014-01-02 11:59:10 +01:00
daniel
4cc9407fe9 (bug 47070) check content model namespace on import.
When importing, we need to check that the kind of content we are about to
import is actually allowed in the specified location on the local wiki.

This change does two things:

* Introduce the ContentModelCanBeUsedOn hook which provides control over
which kind of content can be used where.

* Introduce a check against ContentHandler::canBeUsedOn in the importer,
along with an appropriate error message.

Change-Id: Ia2ff0b0474f4727c9ebbba3c0a2a111495055f61
2013-12-17 16:58:57 +00:00
umherirrender
cbc4fd7a5b print is not a function
Removed parentheses after print

Change-Id: I1343872de7aa7c64952a3d86a63aaa091e46bda3
2013-05-09 20:06:03 +02:00
Max Semenik
1f8a7dc2d2 Import: Fix incorrect wfRunHooks usage
af125df519 broke importing
because Import.php was calling wfRunHooks() improperly.
Fixing the calls and making the wfRunHooks() signature
match Hooks::run() to make call errors immediately detectable.

Change-Id: If44292fedf6917cde1dae7f0391231a18d414610
2013-05-02 18:06:19 +00:00