Commit graph

154 commits

Author SHA1 Message Date
Erik Bernhardson
83bae78c3a Silently drop unknown titles in completion search
This mimics how full text works by silenty dropping results returned from
search that no longer exist. This could be because the search index is slightly
out of sync with reality, or the search engine could simply be broken.

Only silent from the users perspective. We maintain a count in statsd of
the number of titles dropped. This can be monitored over time to
recognize any increases.

Bug: T115756
Change-Id: I2f29d73e258cd448a14d35a2b4902a4fb6f61c68
2018-06-11 13:56:19 -07:00
Erik Bernhardson
2a43939ffb Push pagination decision for search into SearchEngine
Various code using the search engine shouldn't need to implement it's
own methods, such as over-fetching, to determine if there are more
results available. This should be knowledge internal to search that is
exposed by a boolean.

Change-Id: Ica094428700637dfdedb723b03f6aeadfe12b9f4
2018-06-11 13:35:44 -07:00
jenkins-bot
cc0fe6c4a7 Merge "Deprecate overriding SearchEngine::search*" 2018-05-16 13:31:56 +00:00
Erik Bernhardson
fdc133ef1a Deprecate overriding SearchEngine::search*
The plan is to convert these methods into final, considering
it a removal under the deprecation policy. By making entry
points into the search engine final we provide a guaranteed
point where generic handling can be applied to all search engines.

The first use case for this generic handling is pushing pagination
via overfetch into the SearchEngine class instead of re-implementing
an overfetch in individual parts of the code that perform searches.

Change-Id: I3426d6a2f32d8b368b044b154e1cb70dac007c62
2018-05-15 08:28:28 -07:00
David Causse
39cbbab021 Allow 'all:' on all wikis in addition to 'searchall' translation
This allows to have a common syntax useable everywhere.

Bug: T165110
Change-Id: If71fe5df045fb754925946088f8f793197bc8301
2018-05-11 15:20:27 +02:00
James D. Forrester
5f018dd6a4 Drop OpenSearch::getOpenSearchTemplate(), deprecated in 1.25
Change-Id: Ib76b96cf392b7f9fa38d28173dd2cd170e08a881
2018-03-08 10:43:28 +00:00
Stanislav Malyshev
030da07b18 Use getSize since SearchSuggestionSet does not implement Countable
Bug: T184934
Change-Id: I39459352399e2023149b715b049084826df22935
2018-01-22 12:13:34 -08:00
Huji Lee
e74bfe13f6 Require indentation of CASE statements in PHP code
Bug: T182546
Change-Id: I91a9555893a08e4ec58da97c6cc4d1e70000ff6b
2017-12-10 22:07:50 -05:00
Umherirrender
f739a8f368 Improve some parameter docs
Add missing @return and @param to function docs and fixed some @param

Change-Id: I810727961057cfdcc274428b239af5975c57468d
2017-09-10 20:32:31 +02:00
Umherirrender
9b8b314992 Fix spacing for @param and indent of function comments
In phpcs.xml rename renamed sniffs and add the failing sniffs,
because now the whole sniff is no longer excluded.

Change-Id: If5b0bd16028761abc2c47ace9e97d37ad14bb36f
2017-08-15 14:33:29 +00:00
jenkins-bot
cc4115e87f Merge "Add missing @param and @return documentation" 2017-08-11 21:32:58 +00:00
WMDE-Fisch
6df9ed1ad6 update mediawiki-codesniffer to 0.11.0 and fix issues
- mostly auto fixes
- some too long lines fixed
- ignore amp space in one case  passing by reference

Change-Id: I6472f83bc3cbf4bd629d83050cc3319b19ec465c
2017-08-11 22:27:51 +02:00
Umherirrender
718e63694d Add missing @param and @return documentation
Change-Id: I1d1098eec3933df6561cceef646576013ddc08c8
2017-08-11 22:17:01 +02:00
David Causse
187ada1cf4 Fix phrase search
Partially revert I61dc536 that broke phrase search support.

Fix phrase search by making explicit that there are two
kind of legalSearchChars() usecases :

- the chars allowed to be part of the search query (including special
  syntax chars such as " and *). Used by SearchDatabase::filter() to
  cleanup the whole query string (the default).

- the chars allowed to be part of a search term (excluding special
  syntax chars) Used by search engine implementaions when parsing with
  a regex.

For future reference:
Originally this distinction was made "explicit" by calling directly
SearchEngine::legalSearchChars() during the parsing stage. This was
broken by Iaabc10c by enabling inheritance.
This patch adds a new optional param to legalSearchChars to make this
more explicit.

Also remove the function I introduced in I61dc536 (I wrongly assumed
that the disctinction made between legalSearchChars usecases was due
to a difference in behavior between indexing and searching).

Added more tests to prevent this from happening in the future.

Bug: T167798
Change-Id: Ibdc796bb2881a2ed8194099d8c9f491980010f0f
2017-07-03 11:44:48 +02:00
jenkins-bot
01c3bf3431 Merge "Fix highlighting for phrase queries" 2017-06-27 23:44:27 +00:00
Umherirrender
be42e09aa8 build: Prepare for mediawiki/mediawiki-codesniffer to 0.9.0
The used phpcs has a bug, so the version 0.9.0 could not be enforced at the moment.
Will be fixed in next version, see T167168

Changed:
- Remove duplicate newline at end of file
- Add space between function and ( for closures
- and -> &&, or -> ||

Change-Id: I4172fb08861729bccd55aecbd07e029e2638d311
2017-06-26 17:14:31 +00:00
David Causse
f230f5dcc7 Fix highlighting for phrase queries
I think the bug was introduced during a cleanup in Iaabc10c.
I don't think that " should be part of the legalSearchChars at query
time, it seems to break the regex.
The strategy here is to distinguish legalSearchChars used query time vs
the ones used at index time by introducing:
SearchEngine::legalSearchCharsForUpdate()

Bug: T167798
Change-Id: I61dc53665e26d3c6c48caed78dd3bbde9a33def7
2017-06-26 09:53:13 +02:00
Paladox
54c56da85a Fix php code style
Preparation change for updating mediawiki code sniffer to 0.8.0

Change-Id: Ib0b3fe4afea9096ffa3a1347b4f7e07d3398b0b2
2017-05-05 12:03:54 +00:00
Stanislav Malyshev
1e412aabdf Add deleted archive titles search
Allows search engine to suggest deleted titles for undelete search.
Note that the titles are still verified against the archive table,
to ensure search engine is not out-of-date.

Bug: T109561
Change-Id: Id6099fe9fbf18481068a6f0a329bbde0d218135f
2017-04-05 12:02:35 -07:00
jenkins-bot
0a2297c406 Merge "Fixes for more robust dealing with content handlers." 2017-01-30 16:06:25 +00:00
David Causse
cc61a14751 Allow SearchEngine users to access features data
Useful in case the client wants to re-evaluate what was set
here, or if the SearchEngine implementation wants to expose
some of its states.
In our case it allows CirrusSearch to inform SpecialSearch
that we prefer to display search results with a new experimental
layout.

Bug: T156299
Change-Id: I7f661c852ef70ea7bc9ae2959f7d6e48776a9877
2017-01-27 15:12:10 +01:00
Stanislav Malyshev
60ffe51c79 Fixes for more robust dealing with content handlers.
Change-Id: I12a02da005f4b2bceaa850bd1f41a90ac4e1754a
2017-01-26 12:20:07 -08:00
Max Semenik
d4f3e554d7 Decrease the number of 'function says it should return something' errors
Change-Id: Ib5115fe5bbaa67d8a6e54cc3ba1ba7020e239e11
2016-12-15 16:05:52 -08:00
dcausse
bd7df68603 Do not run exact db match if offset is > 0
When scrolling results on prefix search api the exact Title
match is always at pos 0 even if we want to scroll the results
by setting offset to > 0.

Change-Id: Ib02c9d3e479d739e6fe79014d962db50b6fd9de8
2016-09-27 15:28:14 +02:00
jenkins-bot
82dcfbe20e Merge "Pass User to SearchEngine::getProfiles" 2016-09-20 20:30:36 +00:00
dcausse
16e2491a73 Pass User to SearchEngine::getProfiles
Useful for search engines that allow users to customize search profiles.

Depends-On: Icd577c8ebc6e162befe30bde4fe276e633d2e434
Change-Id: I471cd090730d2a25cb70d622ec3bebbe9583118c
2016-09-20 20:22:23 +00:00
dcausse
8b890bbe1b Extract replacePrefixes into a static method
Useful for some search engines that have a keyword that wants
to reuse this logic without building a new SearchEngine object.

Change-Id: Iee5bfd1da70b8339a98555ba062bd33b21f0b761
2016-09-19 16:54:06 +02:00
Stanislav Malyshev
7e18cfc3b5 Infrastructure for augmenting search results
Bug: T117493
Change-Id: Ia5413a7846cc961026a2dc3542b619493bc76a23
2016-09-15 15:44:03 -07:00
Stanislav Malyshev
add1ebe2ab Make content handlers assemble content for search
Bug: T89733
Change-Id: Ie45de496ecc826211d98eea3a410c7639b4be0a4
2016-07-26 13:08:45 -07:00
Stanislav Malyshev
86b55ad03c Create API to allow content handlers to handle structured data definitions
Change-Id: Ia1738803c42f6114575587c1c838fec62b6f54aa
Bug: T89733
2016-07-06 13:41:20 -07:00
dcausse
31680aaddc Expose SearchEngine specific profiles
This patch introduces a way for SearchEngine implementations to expose
specific search profiles useful to fine-tune the various behaviors related to
search.

A SearchEngine can expose a list of profiles by overriding
SearchEngine::getProfiles( $profileType ), profileType stands for the type of
profile being customized. Two types are added in this patch:
- completion: exposed by ApiQueryPrefixSearch and ApiOpenSearch to control
  the behavior of the algorithm behind "search as you type" suggestions.
- fulltext query independent profiles: exposed by ApiQuerySearch to customize
  query indpendent ranking profiles (e.g. boost by templates/incoming
  links/popularity/...)

This patch allows api consumers that might have been confused by fuzzy
suggestions to switch to stricter profiles and to officialize the behavior
behind the hidden param cirrusUseCompletionSuggester. Or to control the
fulltext ranking behaviors like cirrusBoostLinks=(yes|no).

The list of profiles can be discovered by using ApiSandbox/ApiHelp and is totally
controlled by search engine implementations.

Bug: T132477
Change-Id: I66be724d8975976c98c91badbf421f237e014f89
2016-05-30 20:43:53 +02:00
dcausse
257c023666 Fix Undefined variable: namespaces in includes/search/SearchEngineConfig.php on line 109
Bug: T134305
Change-Id: I220886e12a6d083ac34a8a75bc77871e89dbf747
2016-05-03 21:41:15 +02:00
Ricordisamoa
e64035522d Fix and standardize Doxygen tags
* Use "@param datatype $paramname description" format

* String → string, Integer → int etc.

* @return $string → @return string

Change-Id: I860d222382cb4c5699d313b0600bd22503c8c385
2016-04-30 12:10:17 +02:00
Stanislav Malyshev
82c8c00ce2 Move wgContLang from config to injectable
Change-Id: Iffdc39f2de7d38ee9ef882bb796e8969e95e75c6
2016-04-27 11:55:17 -07:00
Stanislav Malyshev
34b02d87ac Convert SearchEngine to service containers
Change-Id: Icef1ecbed3d831557e0256fdfa53743b194007cc
2016-04-25 16:25:17 -07:00
dcausse
4c2f0e3940 completionSearch: try an exact match even if the backend returns no result
Unfortunately some backends (like Cirrus) are not able to do proper crossnamespace
lookup. If the backend returns no result we should still try an exact title match.

Bug:T129575
Change-Id: Ic08ec0aac0fd8a289f21753d74ae62ba509de841
2016-03-11 09:40:40 +01:00
aude
3a5931c3e5 Fix comment in SearchEngine.php
Change-Id: Ib00b42d2210be5bc1125fa7ba74c27a5d7fbf36c
2016-03-01 10:48:55 +01:00
Kunal Mehta
6e9b4f0e9c Convert all array() syntax to []
Per wikitech-l consensus:
 https://lists.wikimedia.org/pipermail/wikitech-l/2016-February/084821.html

Notes:
* Disabled CallTimePassByReference due to false positives (T127163)

Change-Id: I2c8ce713ce6600a0bb7bf67537c87044c7a45c4b
2016-02-17 01:33:00 -08:00
Stanislav Malyshev
027972a20f Include completion search into SearchEngine
By default it still uses PrefixSearch and supports PrefixSearchBackend
but it can be deprecated and phased out and SearchEngine extensions used
instead.

New APIs:
- SearchEngine
	public function defaultPrefixSearch( $search );
	public function completionSearch( $search );
	public function completionSearchWithVariants( $search );

Search engines should override:
protected function completionSearchBackend( $search );

Bug: T121430
Change-Id: Ie78649591dff94d21b72fad8e4e5eab010a461df
2016-02-03 23:41:49 +00:00
nomoa
33e5c11e85 De-duplicate near match query terms when generating variants
zhwiki generates 9 variants: we should not run these queries if the term is unchanged.

Change-Id: If23d19761dea33bf4bdcf6495becc8e983915fde
2016-01-11 17:33:00 -08:00
Erik Bernhardson
28abb2620a Document namespaces member as nullable
This variable is set to null in the replacePrefixes method, document
it as such to prevent errors in consuming code.

Bug: T98082
Change-Id: I78880ffe590ed7193b8482c0ae41c8c69e495878
2015-05-05 11:11:32 -07:00
Nik Everett
74b1c0d256 Add a sort parameter to SearchEngine
SearchEngine grows a method to list valid sort orders one to set the
sort for the next search, and one to read it.  The default implemenation
only supports 'relevance' and the documenation strongly urges that all
other implemenations leave that as the default.  Other implementations
can support other orders. Cirrus already supports title_asc and title_desc.

Change-Id: Ie946150c6796139201221dfa6f7750c210e97166
2015-01-07 13:41:18 -08:00
Aaron Schulz
e369f66d00 Replace wfRunHooks calls with direct Hooks::run calls
* This avoids the overhead of an extra function call

Change-Id: I8ee996f237fd111873ab51965bded3d91e61e4dd
2014-12-10 12:26:59 -08:00
Brad Jorsch
28e37f55c9 Merge OpenSearchXml extension into core
There's really no reason for the extension to exist separately from
core, and merging it reduces the risks of bitrot in both the extension
(lots of deprecated functions there) and core (missing integration with
PageImages and TextExtracts, for example).

Change-Id: Ie0ab90902ede9499879402290006466efba479e9
2014-11-26 21:07:22 -08:00
umherirrender
c2de24efcd Add missing @param to function docs
Change-Id: Ib7ac94d05a04490f25dfd40b46b27973cbab582c
2014-08-14 19:38:57 +00:00
jenkins-bot
a30854859e Merge "Remove SearchEngineReplacePrefixesComplete hook" 2014-06-24 07:20:18 +00:00
Chad Horohoe
da0070e9d1 Remove SearchEngineReplacePrefixesComplete hook
This hook is poorly thought out. The only extension that uses it
can't possibly think it works how they're expecting.

Change-Id: I853a01afc8e922f22e949321a2f2343d264632a6
2014-06-23 08:48:45 -07:00
Chad Horohoe
cd9dc7ef3f Database search fixes:
- Move filter() function and make it protected, nothing uses it
  outside database-backed searching
- Use per-backend legal search characters rather than assuming the
  static implementation is right

Change-Id: Ic2b830b56137b2dfe68b9b9c3de012151e716952
2014-06-20 13:43:36 -07:00
Chad Horohoe
9643a0a92d Remove SearchResultTooMany
Only worked for Postgres, and only worked halfway at that. Result
sets could be false for many reasons. No results, bad query, database
went to Mars. We shouldn't assume they're always "too many results."

Leave it up to the normal query error logging and move on.

Change-Id: Ieddd163e440ae54b152541d727c1afdbc4ab4fbd
2014-06-20 10:36:47 -07:00
Nemo bis
5dc4dc099d Save advanced search namespace prefs on Special:Search itself
* Checkbox on own row below power search checkboxes per MatmaRex;
  avoiding a mw-search-ns* id leaves it untouched by All/None JS.
* The option searcheverything is removed: a "shortcut" which is no
  longer necessary now that options can be (un)selected at once
  with All/None buttons on search page itself.
* Require a token for saving: no accidental preferences changes.
* Keep the searchoptions/advancedsearchoptions prefs section in case
  something is using it (no known extension does though); options
  are converted to "api" type so it's empty and hidden by default.
* Add minimal documentation for saveSettings() and friends
  (@todo since 155ddf6de, 2009!).

Bug: 52817
Change-Id: I514cee835988600cc013658049e88a10b670e64a
2014-05-30 14:33:47 -07:00