Commit graph

129 commits

Author SHA1 Message Date
Robert Stojnić
abf726ea02 Re-commit r34072 with some modifications:
* turned off by default (set $wgAdvancedSearchHighlighting to turn on)
* reverted r26269, \b doesn't interact well with unicode data,
  so it broke highlighting of words that end/begin in nonascii chars
  completely
* small bugfixes in unicode handling, tested in more languages
* $wgSearchHighlightBoundaries needs to be set to "" for CJK wikis
* benchmarking: on typical simplewiki data, the code is around 4-5 times slower
  (according to noc.wikimedia.org the old code profiles to about 0.8%),
  but can be up to 20 times slower on featured-size articles
* update release notes (also for r33400)
* fix profiling errors in SpecialSearch
2008-05-04 15:31:03 +00:00
Brion Vibber
723fe778bc Revert for now:
* r34072 -- new highlighter code; looks a bit expensive, not fully tested yet.
* r33489 -- broke search result highlighting all around
* Part of r32350 -- bring the color back to search highlighting so we can see our results again. Why was this removed without comment?
2008-05-01 20:55:03 +00:00
Robert Stojnić
54dfb7b2ca New class SearchHighlighter handles highlighting of search terms and
snippet extraction:
* prefer text hits over matches on images/templates/tables, making the 
  snippets more readable and relevant
* cleanup wikitext
* prefer snippets with exact query match - works only for whole phrases
* drop the old context calculation and replace it with a more flexible one
  that does a better job keeping snippets of constant width
* if the first line of the article matches whole query show only one snippet
* manually lower/uppercase non-ascii chars so that words in e.g. cyrillic 
  are also case-insensitive
* workaround for PHP's limited UTF-8 support so that snippets end up being of
  constant char-size over single and multiple byte text
* if there is no text match for some reason, show beginning of the article
Warning:
* haven't done performance testing, might not be safe to go live, although 
  I don't see any immediate problems with it
2008-05-01 13:36:29 +00:00
River Tarnell
d426f7c5c0 Notice: Undefined property: PostgresSearchResult::$mRevision in /data/home/river/www/wiki/includes/SearchEngine.php on line 518 2008-04-30 09:49:58 +00:00
Aryeh Gregor
675f1701c9 The problem also applies to all the other regex special chars: try it out with ., |, etc. Use preg_quote(). 2008-04-17 15:59:49 +00:00
Greg Sabino Mullane
377f79b5b8 Escape forward slashes in search terms, otherwise PHP thinks the regex has ended and starts treating the rest of the characters in the string as PCRE modifiers. 2008-04-17 15:38:26 +00:00
Robert Stojnić
724def7bef Use content language for search prefixes. 2008-04-17 15:11:42 +00:00
Robert Stojnić
d6fd8e7c13 Ajax suggestions:
* check in a new ajax suggestion engine (mwsuggest.js) which uses 
  OpenSearch to fetch results (by default via API), this should
  deprecate the old ajaxsearch thingy
* extend PrefixSearchBackend hook to accept multiple namespaces for
  future lucene use (default implementation however can still 
  process only one)
* Added to preferences, also a feature to turn it on/off for every 
  input (disabled atm until I work out browser issues completely)
* WMF wikis probably won't be using API to fetch results, but a 
  custom php wrapper that just forwards the request to appropriate
  lucene daemon, added support for that

SpecialSearch:
* moved stuff out of SpecialSearch to SearchEngine, like snippet
  highlighting and such
* support for additional interwiki results, e.g. title matches
  from other projects shown in a separate box on the right
* todo: interwiki box doesn't have standard prev/next links to 
  avoid clutter and unintuitive interface
* support for related articles
2008-04-15 23:06:28 +00:00
Siebrand Mazeland
79d5225c0e * remove end of line whitespace
* remove empty lines at end of file
* remove "?>" where still present
2008-04-14 07:45:50 +00:00
Robert Stojnić
7532064d27 Search backend:
* add "all:" prefix that searches all namespaces (port from LuceneSearch)
* added a simplistic replacePrefixes so that now image:something will
  always search the image namespace
2008-03-23 17:29:43 +00:00
Robert Stojnić
82d7b2216b Search frontend:
* let the backend provide snippets and other info, fill only what is not 
  provided
* wrap textual results in a div, should make the snippets look more 
  compact and consistent over hits
* added a did you mean.. container
* show total number of hits if available
* added messages for "redirects to article", and "relevant section" hits
2008-03-23 13:43:11 +00:00
Brion Vibber
850cc98cf0 * (bug 11563) Deprecated SearchMySQL4 class; merged code to SearchMySQL
Some general cleanup on search backend code style :)
2008-03-18 23:50:05 +00:00
Greg Sabino Mullane
c07e337da6 Fix for bug 13004, in which the Postgres full-text search has too many results,
so it throws an error. Created a "too many" class as an alternate search result 
to return, and consider any error in SearchPostgres when running the actual search as a "too many" 
problem. Not an ideal solution, but I'm not sure how to get at the error message 
without requiring a newer version of PHP.
2008-02-17 14:11:55 +00:00
Brion Vibber
6e63f4cbad Add SearchGetNearMatch hook, have TitleKey provide a case-insensitive exact match for "go" searches. 2008-01-31 20:51:42 +00:00
Huji
2d8a62941c (bug 12608) Unifying the spelling of getDBkey() in the code. 2008-01-14 09:13:04 +00:00
Aryeh Gregor
a15c419b3d Remove ?>'s from files. They're pointless, and just asking for people to mess with the files and add trailing whitespace. (Yes, I looked over every one and reverted those that were bogus. Slash-enter a million times in less worked well enough, although it was a bit mind-numbing.) 2007-06-29 01:19:14 +00:00
Brion Vibber
b08f93c1db Add a free() function on SearchResultSet class, so the underlying result set can be freed 2007-06-06 18:36:11 +00:00
Tim Starling
ed4303922f Merged filerepo-work branch:
* Added support for configuration of an arbitrary number of commons-style file repositories.
* Split Image.php into filerepo/File.php and filerepo/LocalFile.php
* Renamed Image::getImagePath() to File::getPath()
* Added initial support for timestamp-based file fetching (OldLocalFile), to be expanded upon by aaron.
* Changed the interface for Image/File object creation: use wfFindFile() or wfLocalFile() depending on semantics
* ImageGallery::add() now accepts a title object as the first parameter
* Moved file handling operations on upload from SpecialUpload to File
* Removed path-related functions from ImageFunctions.php. Removed static path accessors from File. 
* Added a Content-Disposition header to thumb.php output
* Improved thumb.php error handling
* Updated the unit test suite to kind of partially work with modern computers. RunTests.php doesn't work just yet. Fixed an actual regression that the test suite detected -- moved some defines to Defines.php where they will be loaded consistently.
2007-05-30 21:02:32 +00:00
Brion Vibber
9b06c5355c E_STRICT fixlets: more static method markers
Also fixed a public function which was listed as private in comments for some reason
2007-05-02 16:02:23 +00:00
Antoine Musso
343420d0ad Convert whitespaces to tabulations 2007-04-21 14:44:56 +00:00
Brion Vibber
44c6db416f * (bug 5439) "Go" title search will now jump to shared/foreign Image: and
MediaWiki: pages that have not been locally edited.
2007-04-20 15:22:41 +00:00
Nick Jenkins
f9619da3f0 Yet more doc tweaks:
* Add @addtogroup tags to various classes, to try and group conceptually-related classes together.
* Add brief descriptions to various Special pages, thanks to Phil Boswell.
* Moving some docs to be right above the classes they represent, so that they are picked up.
2007-04-20 08:55:14 +00:00
River Tarnell
0b2f7f7ea4 full-text search for oracle using Oracle Text 2007-03-11 04:41:02 +00:00
Antoine Musso
c771fc9c96 Use Doxygen @addtogroup instead of phpdoc @package && @subpackage 2007-01-20 15:09:52 +00:00
Antoine Musso
eaf2cb74c1 Some static functions. Fix strict errors when running updateSearchIndex.php 2007-01-09 19:56:23 +00:00
Nick Jenkins
14c53b728f Code housekeeping stuff (and barring any stuff-ups on my behalf, there should be no changes in behaviour whatsoever after this) -
* removing some unused global declarations.
* removing or commenting out or adding comments for unused local vars.
* Adding one or two local var declarations.
* Declaring $matches array passed to preg_match() / preg_match_all() as array() before using [not required, just have a slight preference for the explicitness].
* remove one or two pass-by-reference function declarations where the value is not modified.
* Adding some braces to if-else blocks.
* In Parser.php, stripstate is now an object rather than an array as per r17820, so we no longer need ask for a reference to it (as in "$x =& $this->mStripState;"), and in fact it's probably just simpler to get rid of $x altogether.
* Moving some preg regexes from "" quoting to '' quoting to stop static analyzer whinging about bad escape sequences.

... up to "LinksUpdate.php" in the includes/ directory.
2006-11-23 08:25:56 +00:00
Tim Starling
a3b490d2c4 * Made special page names case-insensitive and localisable. Care has been taken to maintain backwards compatibility.
* Used special page subpages in a few more places, instead of query parameters
2006-10-30 06:25:31 +00:00
Brion Vibber
85b6e95bea merge r16576 from SerbianVariants branch; should fix 'go' variant search on non-default search (Lucene) 2006-09-20 14:38:32 +00:00
Brion Vibber
61b04a3e95 * Updates to language variant code for Serbian et al 2006-09-20 10:22:12 +00:00
Brion Vibber
a38c8add84 reverting SerbianVariants check-in, not ready to go live yet on Wikimedia 2006-09-15 21:02:35 +00:00
Brion Vibber
1657774656 Merge from SerbianVariants branch, trunk 16500 vs SerbianVariants 16523 2006-09-15 20:08:21 +00:00
Brion Vibber
ba78b052b0 Revert 15733 and 15719 for the moment; I see some eval'd string code and other things which make me nervous and I don't think anybody's reviewed this 2006-07-19 20:13:39 +00:00
Robert Stojnić
83da52c540 Enable UTF-8 lower/upper case operations in SearchEngine,
search in different variants (if needed).
Minor bug fixes for LanguageConverter: do not convert 
roman numbers and text between <code></code> into
variants (e.g. cyrillic).
2006-07-19 19:17:36 +00:00
Greg Sabino Mullane
4346921176 Add SearchPostgres.php 2006-07-05 03:54:01 +00:00
Greg Sabino Mullane
b9a642673a Standardize name to simply "postgres" 2006-06-27 15:08:08 +00:00
Antoine Musso
5a5cc201b1 having some fun with doxygen error log 2006-06-10 18:28:50 +00:00
Domas Mituzas
ca5f647c09 AutoLoad search classes 2006-06-06 10:33:23 +00:00
Brion Vibber
0a26267688 Revert to r14512; domas introduced massive breakage with incomplete experimental changes. They will be recommitted when they work. :) 2006-06-01 08:19:02 +00:00
Domas Mituzas
bda0b8e104 Use AutoLoader to load classes:
* remove require_once() throughout the whole code, though it is still left in a few places
* move global functions in HttpUtils, ProxyTools, Credits to class methods
* php5 only: __autoload() now used, combined with class->file map and require()
* move initialization of $wgValidSkinNames to Skin::getSkinNames()
* few more changes that will surely break stuff.
2006-06-01 07:22:49 +00:00
Ævar Arnfjörð Bjarmason
6c5d3c8c6a * Adding a trailing ?> 2006-03-07 13:32:27 +00:00
Ævar Arnfjörð Bjarmason
a26d5a49d7 * s~\t+$~~ 2006-01-07 13:31:29 +00:00
Brion Vibber
12e8fede56 * (bug 3562) for go search, try Caps-Variants-Broken-At-Non-Whitespace 2005-11-10 07:46:56 +00:00
Ævar Arnfjörð Bjarmason
5d036b41ad * 0 => NS_MAIN 2005-09-19 12:54:45 +00:00
Brion Vibber
b1a49e2d5b Drop MySQL 3.23.x support; 4.0 or greater required. 2005-08-26 23:02:54 +00:00
Brion Vibber
af2177edfd Code cleanup: normalize case for intval(), strval(), floatval() calls. 2005-08-16 23:36:16 +00:00
River Tarnell
b817c0c15f merge ORACLE_WORK. sorry, this may break some parts of MySQL, i did not test extensively. 2005-08-02 13:35:19 +00:00
Brion Vibber
e761358be7 * With $wgCapitalLinks off, accept off-by-first-letter-case in 'go' match
Requested by Wiktionary folks dealing with conversion issues.
2005-07-13 06:47:17 +00:00
Antoine Musso
157861bc31 fix some issues with phpdoc 2005-07-05 21:22:25 +00:00
River Tarnell
f688256922 check null title 2005-06-28 17:42:47 +00:00
Brion Vibber
aa99b80d7f Change the SearchEngine interface around:
* Reduce some duplicated code between MySQL 3 and 4 classes
* Generalize some things to better support Lucene search plugin
2005-05-23 08:42:20 +00:00