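PhraseQuery: a phrase query checks whether a document contains the specified term or sequence of terms. When there are several terms, the allowed gap between them can be set with the slop parameter. The official API description is shown in the figure:
[Figure: official API description of PhraseQuery]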

Example usage:
package com.yida.framework.lucene5.query;

import java.io.IOException;

import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.document.TextField;
import org.apache.lucene.index.DirectoryReader;
import org.apache.lucene.index.IndexReader;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.IndexWriterConfig;
import org.apache.lucene.index.IndexWriterConfig.OpenMode;
import org.apache.lucene.index.Term;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.PhraseQuery;
import org.apache.lucene.search.ScoreDoc;
import org.apache.lucene.search.TopDocs;
import org.apache.lucene.store.Directory;
import org.apache.lucene.store.RAMDirectory;

public class PhraseQueryTest {
    public static void main(String[] args) throws IOException {
        // Build a small in-memory index holding three documents
        Directory dir = new RAMDirectory();
        Analyzer analyzer = new StandardAnalyzer();
        IndexWriterConfig iwc = new IndexWriterConfig(analyzer);
        iwc.setOpenMode(OpenMode.CREATE);
        IndexWriter writer = new IndexWriter(dir, iwc);

        // Document 1: does not contain "dog"
        Document doc = new Document();
        doc.add(new TextField("text", "quick brown fox", Field.Store.YES));
        writer.addDocument(doc);

        // Document 2
        doc = new Document();
        doc.add(new TextField("text", "jumps over lazy broun dog", Field.Store.YES));
        writer.addDocument(doc);

        // Document 3
        doc = new Document();
        doc.add(new TextField("text", "jumps over extremely very lazy broxn dog", Field.Store.YES));
        writer.addDocument(doc);
        writer.close();

        IndexReader reader = DirectoryReader.open(dir);
        IndexSearcher searcher = new IndexSearcher(reader);
        String term1 = "dog";
        String term2 = "jumps";

        // Query phrase "dog jumps", with the terms allowed to move up to 5 positions
        PhraseQuery phraseQuery = new PhraseQuery();
        phraseQuery.add(new Term("text", term1));
        phraseQuery.add(new Term("text", term2));
        phraseQuery.setSlop(5);

        TopDocs results = searcher.search(phraseQuery, null, 100);
        ScoreDoc[] scoreDocs = results.scoreDocs;
        for (int i = 0; i < scoreDocs.length; ++i) {
            //System.out.println(searcher.explain(phraseQuery, scoreDocs[i].doc));
            int docID = scoreDocs[i].doc;
            Document document = searcher.doc(docID);
            String text = document.get("text");
            System.out.println("text:" + text);
        }
        reader.close();
    }
}
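PhraseQuery.add(Term) always appends the term to the end of the phrase. You can also use add(Term, position) to state explicitly which position the term goes to. The example code adds two terms, so our query phrase is "dog jumps", with a gap of 0 between the two terms, and we then set the slop to 5.
In the 2nd indexed document, jumps is at position 0 and dog at position 4, so moving the word jumps 5 positions to the right gives exactly our query phrase "dog jumps"; the document therefore qualifies and is returned. The 1st indexed document does not contain the word dog at all, so it does not qualify. The 3rd indexed document would need 7 moves to produce "dog jumps", so in the end only the 2nd indexed document is returned.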
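The move counting above can be checked with a few lines of plain Java. This is only a rough sketch of the arithmetic, not Lucene's actual scorer: the class SlopSketch and the method requiredSlop are made-up names, tokens are split on spaces instead of going through an analyzer, and it assumes every phrase term occurs exactly once in the document.

```java
import java.util.Arrays;
import java.util.List;

class SlopSketch {
    // Hypothetical helper: smallest slop at which the phrase would match the text.
    // Phrase terms sit at consecutive query positions 0, 1, 2, ...
    // Assumes each term occurs once in the text; returns -1 if a term is missing.
    public static int requiredSlop(String text, String... phrase) {
        List<String> tokens = Arrays.asList(text.split(" "));
        int min = Integer.MAX_VALUE;
        int max = Integer.MIN_VALUE;
        for (int queryPos = 0; queryPos < phrase.length; queryPos++) {
            int docPos = tokens.indexOf(phrase[queryPos]);
            if (docPos < 0) {
                return -1; // term not in the document, no slop can help
            }
            // Where the phrase would have to start for this term to line up
            int shifted = docPos - queryPos;
            min = Math.min(min, shifted);
            max = Math.max(max, shifted);
        }
        // The spread between the implied start positions is the number of moves
        return max - min;
    }

    public static void main(String[] args) {
        System.out.println(requiredSlop("quick brown fox", "dog", "jumps"));                          // -1
        System.out.println(requiredSlop("jumps over lazy broun dog", "dog", "jumps"));                // 5
        System.out.println(requiredSlop("jumps over extremely very lazy broxn dog", "dog", "jumps")); // 7
    }
}
```

With a slop of 5, only the document whose required slop is between 0 and 5 qualifies, which is the 2nd one.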
If I change the code a little, like this:
String term1 = "dog";
String term2 = "jumps";
PhraseQuery phraseQuery = new PhraseQuery();
phraseQuery.add(new Term("text", term1), 0);
phraseQuery.add(new Term("text", term2), 2);
phraseQuery.setSlop(6);
TopDocs results = searcher.search(phraseQuery, null, 100);
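Now our query phrase is "dog xxx jumps": we are looking for documents that contain the terms dog and jumps with a gap of one term between them (stop words not included). The slop therefore has to go up by 1, that is, we need one more move, so this time the slop value should be 6.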
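The effect of the explicit positions can be worked through the same way. Again this is just a sketch under assumptions: PositionedSlopSketch, requiredSlop and matches are hypothetical names, the text is split on spaces rather than analyzed, and each term is assumed to occur once in the document. It models the arithmetic only and does not call Lucene.

```java
import java.util.Arrays;
import java.util.List;

class PositionedSlopSketch {
    // Hypothetical helper: smallest slop needed when each term has an explicit
    // query position, as with add(Term, position). Returns -1 if a term is missing.
    public static int requiredSlop(String text, String[] terms, int[] queryPositions) {
        List<String> tokens = Arrays.asList(text.split(" "));
        int min = Integer.MAX_VALUE;
        int max = Integer.MIN_VALUE;
        for (int i = 0; i < terms.length; i++) {
            int docPos = tokens.indexOf(terms[i]);
            if (docPos < 0) {
                return -1;
            }
            // Document position minus the position requested in the query
            int shifted = docPos - queryPositions[i];
            min = Math.min(min, shifted);
            max = Math.max(max, shifted);
        }
        return max - min;
    }

    // True when the given slop is enough for the text to match
    public static boolean matches(String text, String[] terms, int[] queryPositions, int slop) {
        int needed = requiredSlop(text, terms, queryPositions);
        return needed >= 0 && needed <= slop;
    }

    public static void main(String[] args) {
        String[] terms = {"dog", "jumps"};
        int[] withGap = {0, 2}; // "dog xxx jumps"
        String doc2 = "jumps over lazy broun dog";
        String doc3 = "jumps over extremely very lazy broxn dog";

        System.out.println(requiredSlop(doc2, terms, withGap)); // 6
        System.out.println(requiredSlop(doc3, terms, withGap)); // 8
        System.out.println(matches(doc2, terms, withGap, 6));   // true
        System.out.println(matches(doc3, terms, withGap, 6));   // false
    }
}
```

Moving jumps from query position 1 to query position 2 pushes it one step further from where it sits in the document, which is why each required slop is 1 larger than before.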
Reposted from: http://iamyida.iteye.com/blog/2195838