Lucene.Net: Creating an Index and Searching

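The core of full-text search with Lucene.Net is creating an index and searching it. Index creation is the foundation for everything that follows, because every search runs against the index. Lucene.Net provides dedicated classes for this: Lucene.Net.Index.IndexWriter builds the index and writes it to files, and the corresponding Lucene.Net.Index.IndexReader reads the index back from its folder so that it can be modified or otherwise worked on.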

Creating the index:

IndexWriter indexwriter = new IndexWriter("index", new StandardAnalyzer(), true);

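This first defines an index writer, indexwriter. The first argument, "index", is the folder in which the index is stored. The second is an analyzer, which extracts from the text the terms that should be indexed and discards the rest: it drops common words such as "a" and "the" and decides whether matching is case-sensitive. Choosing a different analyzer changes this behavior. The third argument controls whether an existing index is overwritten: true creates a new index that replaces the old one, while false opens the existing index and adds to it, keeping what is already there.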
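To make the analyzer's role concrete, here is a minimal sketch in plain C# of what an analyzer conceptually does: split text into tokens, lowercase them, and drop stop words. It does not use Lucene.Net. ToyAnalyzer, its stop-word list, and the Analyze method are hypothetical names invented for illustration, and the real StandardAnalyzer applies more elaborate tokenization rules.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

// Hypothetical illustration only; this is not Lucene.Net's StandardAnalyzer.
public static class ToyAnalyzer
{
    // A tiny, assumed stop-word list; real analyzers ship a longer one.
    private static readonly HashSet<string> StopWords =
        new HashSet<string> { "a", "an", "the", "of", "and" };

    // Splits text on non-letter/digit characters, lowercases each token,
    // and discards stop words. Returns the remaining terms in order.
    public static List<string> Analyze(string text)
    {
        var terms = new List<string>();
        var current = new StringBuilder();
        foreach (char c in text + " ") // the trailing space flushes the last token
        {
            if (char.IsLetterOrDigit(c))
            {
                current.Append(char.ToLowerInvariant(c));
            }
            else if (current.Length > 0)
            {
                string term = current.ToString();
                if (!StopWords.Contains(term)) terms.Add(term);
                current.Clear();
            }
        }
        return terms;
    }
}
```

Calling ToyAnalyzer.Analyze("The quick Brown fox") keeps quick, brown, and fox: "The" is lowercased and then dropped as a stop word, which is the kind of filtering the second constructor argument controls.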

Document doc = new Document();

This creates a document object.

doc.Add(Field.UnStored("text", context));
doc.Add(Field.Keyword("path", path));
doc.Add(Field.Text("filename", filename));

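These lines add fields to the document. The Add method adds one field to doc: "text" is the name of the field being added, and context is the content to index, which can be any readable data source. Pay attention to the factory methods on Field; there are four of them: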

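1. Keyword: as the name suggests, a keyword. The value is not analyzed, but it is indexed and stored verbatim in the index. It suits string constants such as good, filename, or teacher. The values can also come from a string array, for example:

string[] contex = { "doc", "xls", "ppt", "pdf", "html", "txt" };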

foreach (string strcontex in contex)
{
    doc.Add(Field.Keyword("text", strcontex));
}

This is another way to add your keywords to the document.

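2. UnIndexed: not analyzed and not indexed, but stored in the index.

3. UnStored: the exact opposite of UnIndexed. The value is analyzed and indexed, but not stored.

4. Text: similar to UnStored in that the value is analyzed and indexed. If the value is a string it is also stored; if the value is a Reader it is not stored, just as with UnStored.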
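The four factory methods differ only in three yes/no properties: whether the value is analyzed, indexed, and stored. The sketch below records that table in plain C#, following the descriptions in the list above. FieldKind, FieldTraits, and FieldTable are hypothetical names for illustration and are not part of Lucene.Net.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical names for illustration; these types are not part of Lucene.Net.
public enum FieldKind { Keyword, UnIndexed, UnStored, Text }

public struct FieldTraits
{
    public readonly bool Analyzed, Indexed, Stored;

    public FieldTraits(bool analyzed, bool indexed, bool stored)
    {
        Analyzed = analyzed;
        Indexed = indexed;
        Stored = stored;
    }
}

public static class FieldTable
{
    // Text is listed for a string value; with a Reader value it is not stored.
    public static readonly Dictionary<FieldKind, FieldTraits> Traits =
        new Dictionary<FieldKind, FieldTraits>
        {
            { FieldKind.Keyword,   new FieldTraits(false, true,  true)  },
            { FieldKind.UnIndexed, new FieldTraits(false, false, true)  },
            { FieldKind.UnStored,  new FieldTraits(true,  true,  false) },
            { FieldKind.Text,      new FieldTraits(true,  true,  true)  },
        };

    // A field can be searched only if it is indexed.
    public static bool CanSearch(FieldKind kind) { return Traits[kind].Indexed; }

    // A field's original value can be read back only if it is stored.
    public static bool CanRetrieve(FieldKind kind) { return Traits[kind].Stored; }
}
```

Under this table, FieldTable.CanRetrieve(FieldKind.UnStored) is false: a field added with Field.UnStored can be searched, but its original text cannot be read back from a search hit.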

indexwriter.AddDocument(doc);

This adds doc to the index.

indexwriter.Optimize();

This optimizes the newly built index.

indexwriter.Close();

This closes the writer.

That completes a simple index.

Here is another index-building example:

private String[] keywords = { "20001895", "20001896" };
private String[] unindexed = { "Red star", "good morning" };
private String[] unstored = { "I am a programer", "you are programmer ,too" };
private String[] text1 = { " programer ", "morning" };
private String[] text2 = { "200606", "200609" };
private String[] text3 = { "/Computers/red", "/Computers/star" };
private Directory dir;

protected void AddDocuments()
{
    string indexDir = "index";
    dir = FSDirectory.GetDirectory(indexDir, true);
    IndexWriter writer = new IndexWriter(dir, GetAnalyzer(), true);
    for (int i = 0; i < keywords.Length; i++)
    {
        Document doc = new Document();
        doc.Add(Field.Keyword("isbn", keywords[i]));
        doc.Add(Field.UnIndexed("title", unindexed[i]));
        doc.Add(Field.UnStored("contents", unstored[i]));
        doc.Add(Field.Text("subject", text1[i]));
        doc.Add(Field.Text("pubmonth", text2[i]));
        doc.Add(Field.Text("category", text3[i]));
        writer.AddDocument(doc);
    }
    writer.Optimize();
    writer.Close();
}

Searching the data:

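Once the index is built, how do you use it to retrieve data? This is where the Lucene.Net.Search.IndexSearcher class comes in: it reads the index files and returns the results in a Hits object. Hits is a collection, loosely comparable to a DataSet: where a DataSet holds Tables, Hits holds Documents. What you then do with the data in Hits is not the focus here; I will write about that part when I have time.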

IndexSearcher searcher = new IndexSearcher(indexDirectory);

This creates a searcher; the argument is the path of the index that was built.

Query query = QueryParser.Parse(condition, "text", new StandardAnalyzer());

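This defines a query object. condition is the search condition, "text" is the name of the field that was analyzed when the index was built and is the field searched by default, and the third argument is an analyzer. It is usually best to use the same kind of analyzer as at indexing time, so that the query terms are normalized the same way as the indexed terms.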

Hits hits = searcher.Search(query);

The searcher puts the matching Documents into Hits.

int total = hits.Length();

This gets the number of Documents in hits.

for (int i = 0; i < total; i++) // loop over the hits
{
    Document doc = hits.Doc(i);
    // Get extracts the stored value of a field from the matching document
    string path = doc.Get("path");
    // "text" was added with Field.UnStored earlier, so it is not stored and Get returns null
    string plainText = doc.Get("text");
    string str = doc.Get("filename");
}

searcher.Close(); // close the searcher
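Why a search runs against the index rather than against the documents themselves can be seen in a toy inverted index. The sketch below is plain C# and does not use Lucene.Net. ToyIndex, AddDocument, and Search are hypothetical names, and real Lucene query parsing, scoring, and on-disk storage are all left out.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical toy inverted index; Lucene.Net's real index format, query
// parsing, and scoring are far richer than this.
public class ToyIndex
{
    // term -> ids of the documents containing that term, in ascending order
    private readonly Dictionary<string, SortedSet<int>> postings =
        new Dictionary<string, SortedSet<int>>();

    // original texts, playing the role of stored fields
    private readonly List<string> stored = new List<string>();

    // Indexes one document and returns its id.
    public int AddDocument(string text)
    {
        int id = stored.Count;
        stored.Add(text);
        char[] separators = { ' ', ',', '.', '\t', '\n' };
        foreach (string raw in text.Split(separators, StringSplitOptions.RemoveEmptyEntries))
        {
            string term = raw.ToLowerInvariant();
            SortedSet<int> ids;
            if (!postings.TryGetValue(term, out ids))
            {
                ids = new SortedSet<int>();
                postings[term] = ids;
            }
            ids.Add(id);
        }
        return id;
    }

    // Looks up a single term in the postings; no document text is scanned.
    public List<string> Search(string term)
    {
        var hits = new List<string>();
        SortedSet<int> ids;
        if (postings.TryGetValue(term.ToLowerInvariant(), out ids))
        {
            foreach (int id in ids) hits.Add(stored[id]);
        }
        return hits;
    }
}
```

All tokenizing work happens in AddDocument, so Search is a single dictionary lookup. The list it returns plays roughly the role that Hits plays in the walkthrough: a collection of the matching documents.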
