wiki.techinc.nl/includes/title/ForeignTitle.php
This, that and the other 37b4cd5da2 Proper namespace handling for WikiImporter
Up until now, the import backend has tried to resolve titles in the XML
data using the regular Title class. This is a disastrous idea, as local
namespace names often do not match foreign namespace names.

There is enough metadata present in XML dumps generated by modern MW
versions for the target namespace ID and name to be reliably determined.
This metadata is contained in the <siteinfo> and <ns> tags, which
(unbelievably enough) were totally ignored by WikiImporter until now.
Fallbacks are provided for older XML dump versions which may be missing
some or all of this metadata.

The ForeignTitle class is introduced. This is intended specifically for
the resolution of titles on foreign wikis. In the future, an
InterwikiTitle class could be added, which would inherit ForeignTitle
and add members for the interwiki prefix and fragment.

Factory classes to generate ForeignTitle objects from string data, and
Title objects from ForeignTitle objects, are also added.

The 'AfterImportPage' hook has been modified so the second argument is a
ForeignTitle object instead of a Title (the documentation was wrong;
it was never a string). LiquidThreads, SMW and FacetedSearch all use this
hook but none of them use the $origTitle parameter.

Bug: T32723
Bug: T42192
Change-Id: Iaa58e1b9fd7287cdf999cef6a6f3bb63cd2a4778
2014-12-10 22:24:47 +11:00

<?php
/**
 * A structure to hold the title of a page on a foreign MediaWiki installation
 *
 * This program is free software; you can redistribute it and/or modify
 * it under the terms of the GNU General Public License as published by
 * the Free Software Foundation; either version 2 of the License, or
 * (at your option) any later version.
 *
 * This program is distributed in the hope that it will be useful,
 * but WITHOUT ANY WARRANTY; without even the implied warranty of
 * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
 * GNU General Public License for more details.
 *
 * You should have received a copy of the GNU General Public License along
 * with this program; if not, write to the Free Software Foundation, Inc.,
 * 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301, USA.
 * http://www.gnu.org/copyleft/gpl.html
 *
 * @file
 * @author This, that and the other
 */

/**
 * A simple, immutable structure to hold the title of a page on a foreign
 * MediaWiki installation.
 */
class ForeignTitle {
	/**
	 * @var int|null
	 * Null if we don't know the namespace ID (e.g. interwiki links)
	 */
	protected $namespaceId;

	/** @var string */
	protected $namespaceName;

	/** @var string */
	protected $pageName;

	/**
	 * Creates a new ForeignTitle object.
	 *
	 * @param int|null $namespaceId Null if the namespace ID is unknown (e.g.
	 * interwiki links)
	 * @param string $namespaceName
	 * @param string $pageName
	 */
	public function __construct( $namespaceId, $namespaceName, $pageName ) {
		if ( is_null( $namespaceId ) ) {
			$this->namespaceId = null;
		} else {
			$this->namespaceId = intval( $namespaceId );
		}
		$this->namespaceName = str_replace( ' ', '_', $namespaceName );
		$this->pageName = str_replace( ' ', '_', $pageName );
	}

	/**
	 * Do we know the namespace ID of the page on the foreign wiki?
	 * @return bool
	 */
	public function isNamespaceIdKnown() {
		return !is_null( $this->namespaceId );
	}

	/**
	 * @return int
	 * @throws MWException If isNamespaceIdKnown() is false, it does not make
	 * sense to call this function.
	 */
	public function getNamespaceId() {
		if ( is_null( $this->namespaceId ) ) {
			throw new MWException(
				"Attempted to call getNamespaceId when the namespace ID is not known" );
		}
		return $this->namespaceId;
	}

	/** @return string */
	public function getNamespaceName() {
		return $this->namespaceName;
	}

	/** @return string */
	public function getText() {
		return $this->pageName;
	}

	/** @return string */
	public function getFullText() {
		$result = '';
		if ( $this->namespaceName ) {
			$result .= $this->namespaceName . ':';
		}
		$result .= $this->pageName;
		return $result;
	}

	/**
	 * Returns a string representation of the title, for logging. This is purely
	 * informative and must not be used programmatically. Use the appropriate
	 * ImportTitleFactory to generate the correct string representation for a
	 * given use.
	 *
	 * @return string
	 */
	public function __toString() {
		$name = '';
		if ( $this->isNamespaceIdKnown() ) {
			$name .= '{ns' . $this->namespaceId . '}';
		} else {
			$name .= '{ns??}';
		}
		$name .= $this->namespaceName . ':' . $this->pageName;
		return $name;
	}
}
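
A minimal usage sketch of the class above, written to run outside MediaWiki. Two things in it are assumptions and not part of the original file: the stub `MWException` (a stand-in for the core exception class) and the helper `describeNamespace()`, which is a hypothetical name used only for illustration. The condensed `ForeignTitle` copy keeps the behaviour of the methods it includes but omits the getters `getNamespaceName()` and `getText()` and shortens the exception message; it is only there so the snippet is self-contained.

```php
<?php
// Stand-in so the sketch runs outside MediaWiki (assumption: the real
// MWException ships with MediaWiki core and is not reproduced here).
if ( !class_exists( 'MWException' ) ) {
	class MWException extends Exception {
	}
}

// Condensed copy of the ForeignTitle class above; skipped if the real one is loaded.
if ( !class_exists( 'ForeignTitle' ) ) {
	class ForeignTitle {
		protected $namespaceId;
		protected $namespaceName;
		protected $pageName;

		public function __construct( $namespaceId, $namespaceName, $pageName ) {
			$this->namespaceId = is_null( $namespaceId ) ? null : intval( $namespaceId );
			// Spaces are normalised to underscores in both parts of the title.
			$this->namespaceName = str_replace( ' ', '_', $namespaceName );
			$this->pageName = str_replace( ' ', '_', $pageName );
		}

		public function isNamespaceIdKnown() {
			return !is_null( $this->namespaceId );
		}

		public function getNamespaceId() {
			if ( is_null( $this->namespaceId ) ) {
				throw new MWException( 'Namespace ID is not known' );
			}
			return $this->namespaceId;
		}

		public function getFullText() {
			return ( $this->namespaceName ? $this->namespaceName . ':' : '' ) . $this->pageName;
		}

		public function __toString() {
			$ns = $this->isNamespaceIdKnown() ? $this->namespaceId : '??';
			return '{ns' . $ns . '}' . $this->namespaceName . ':' . $this->pageName;
		}
	}
}

// Hypothetical helper: check isNamespaceIdKnown() before calling
// getNamespaceId(), because the getter throws when the ID is null.
function describeNamespace( ForeignTitle $title ) {
	return $title->isNamespaceIdKnown()
		? 'namespace ID ' . $title->getNamespaceId()
		: 'namespace ID unknown';
}

$known = new ForeignTitle( 4, 'Project talk', 'Main Page' );
$unknown = new ForeignTitle( null, 'Help', 'Editing pages' );
$main = new ForeignTitle( 0, '', 'Sandbox' );

echo $known->getFullText(), "\n";      // Project_talk:Main_Page
echo $main->getFullText(), "\n";       // Sandbox
echo $unknown, "\n";                   // {ns??}Help:Editing_pages
echo describeNamespace( $known ), "\n";   // namespace ID 4
echo describeNamespace( $unknown ), "\n"; // namespace ID unknown
```

Note that `__toString()` always emits the colon, even for an empty namespace name, so `$main` is logged as `{ns0}:Sandbox` while `getFullText()` returns plain `Sandbox`. This matches the docblock's warning that the string form is for logging only.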