Numeric range queries using Zend_Search_Lucene, working around the bugs

Zend_Search_Lucene is a PHP implementation of the Lucene full-text search engine. Some of it works well, but other parts, such as range queries, don’t quite. The most common error for range queries is a Zend_Search_Lucene_Search_QueryParserException that states "Range query boundary terms…"

As you may have guessed, the problem is that queries that work fine in the Java implementation of Lucene don’t work in the Zend PHP implementation.

The first workaround is to build the range terms for the query yourself. If you’ve tried something like:

$zendSearchLucene = Zend_Search_Lucene::open($directory);
$zendSearchLucene->find('date:[0000000000 TO 2147485547]');

You’ll probably get a parser error. Another way to code this (which works for me, at least) is:

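// Pass the bounds as strings: unquoted, 0000000000 is an octal literal that PHP reduces to 0.
$from = new Zend_Search_Lucene_Index_Term('0000000000', 'date');
$to = new Zend_Search_Lucene_Index_Term('2147485547', 'date');
// The third argument makes the range inclusive of both bounds.
$query = new Zend_Search_Lucene_Search_Query_Range($from, $to, true);
$zendSearchLucene->find($query);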
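
As far as I can tell, the range boundaries are compared as strings rather than as numbers, which is why the lower bound above is written with leading zeros. Below is a minimal sketch of a padding helper, assuming your values are non-negative integers of at most 10 digits. The function name padNumericTerm and the default width are my own invention for illustration and are not part of Zend_Search_Lucene.

```php
<?php
// Hypothetical helper (not part of Zend_Search_Lucene): left-pads a
// non-negative integer with zeros so that every term has the same width
// and string ordering matches numeric ordering.
function padNumericTerm($value, $width = 10)
{
    $digits = (string) $value;

    // Accept only plain runs of digits (no sign, no decimal point).
    if (!ctype_digit($digits)) {
        throw new InvalidArgumentException('Expected a non-negative integer');
    }
    if (strlen($digits) > $width) {
        throw new InvalidArgumentException('Value does not fit in ' . $width . ' digits');
    }

    return str_pad($digits, $width, '0', STR_PAD_LEFT);
}

// Unpadded, the string "9" sorts after "10"; padded, the order is numeric.
$unpaddedIsWrong = strcmp('9', '10') > 0;
$paddedIsRight   = strcmp(padNumericTerm(9), padNumericTerm(10)) < 0;
```

The same helper would need to be applied both when the field value is written to the index and when the range terms are built, otherwise the stored terms and the boundaries won’t line up.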

Another problem you may encounter is that the default analyser for Zend_Search_Lucene doesn’t understand numbers, so a range query between two numbers won’t work.

You need to switch to an analyser that will interpret the numbers when you populate or repopulate the index. Before you add documents to the index, set the default analyser to something like this:

$analyser = new Zend_Search_Lucene_Analysis_Analyzer_Common_Utf8Num_CaseInsensitive();
Zend_Search_Lucene_Analysis_Analyzer::setDefault($analyser);
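
To see why the choice of analyser matters, here is a rough simulation of the difference between a letters-only tokeniser and one that also keeps digits. The two functions and their regular expressions are my own stand-ins for illustration, assuming ASCII input. They are not the code Zend_Search_Lucene actually runs.

```php
<?php
// Hypothetical stand-in for a letters-only analyser: anything that is not
// a letter acts as a separator, so numbers never become terms.
function tokeniseLettersOnly($text)
{
    preg_match_all('/[a-z]+/', strtolower($text), $matches);
    return $matches[0];
}

// Hypothetical stand-in for a letters-and-digits analyser: numbers survive
// as terms and can therefore be matched by a range query.
function tokeniseLettersAndDigits($text)
{
    preg_match_all('/[a-z0-9]+/', strtolower($text), $matches);
    return $matches[0];
}

$sample = 'Created 1234567890';

$withoutNumbers = tokeniseLettersOnly($sample);      // the number is dropped
$withNumbers    = tokeniseLettersAndDigits($sample); // the number is kept
```

If the number never makes it into the index as a term, there is nothing for the range query to match, however the query itself is built.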

To figure out which analyser is most appropriate, take a look at the text analysers available for Zend_Search_Lucene.
