Configuring clustering in Solr

First, add the following to solrconfig.xml:

  <searchComponent
    name="clusteringComponent"
    enable="${solr.clustering.enabled:true}"
    class="org.apache.solr.handler.clustering.ClusteringComponent" >
    <!-- Declare an engine -->
    <lst name="engine">
      <!-- The name, only one can be named "default" -->
      <str name="name">default</str>
      <!--
           Class name of Carrot2 clustering algorithm. Currently available algorithms are:
           * org.carrot2.clustering.lingo.LingoClusteringAlgorithm
           * org.carrot2.clustering.stc.STCClusteringAlgorithm
           See http://project.carrot2.org/algorithms.html for the algorithm's characteristics.
        -->
      <str name="carrot.algorithm">org.carrot2.clustering.lingo.LingoClusteringAlgorithm</str>
      <!--
           Overriding values for Carrot2 default algorithm attributes. For a description
           of all available attributes, see: http://download.carrot2.org/stable/manual/#chapter.components.
           Use attribute key as name attribute of str elements below. These can be further
           overridden for individual requests by specifying attribute key as request
           parameter name and attribute value as parameter value.
        -->
      <str name="LingoClusteringAlgorithm.desiredClusterCountBase">20</str>
    </lst>
    <lst name="engine">
      <str name="name">stc</str>
      <str name="carrot.algorithm">org.carrot2.clustering.stc.STCClusteringAlgorithm</str>
    </lst>
  </searchComponent>

  <requestHandler name="/clustering" class="solr.SearchHandler">
    <lst name="defaults">
      <bool name="clustering">true</bool>
      <str name="clustering.engine">default</str>
      <bool name="clustering.results">true</bool>
      <!-- The title field -->
      <str name="carrot.title">name</str>
      <str name="carrot.url">id</str>
      <!-- The field to cluster on -->
      <str name="carrot.snippet">features</str>
      <!-- produce summaries -->
      <bool name="carrot.produceSummary">true</bool>
      <!-- the maximum number of labels per cluster -->
      <!--<int name="carrot.numDescriptions">5</int>-->
      <!-- produce sub clusters -->
      <bool name="carrot.outputSubClusters">false</bool>
    </lst>
    <arr name="last-components">
      <str>clusteringComponent</str>
    </arr>
  </requestHandler>
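The comments in the configuration say that engine attributes can be overridden per request by sending the attribute key as a request parameter. A minimal sketch of how such a request URL could be assembled is shown here. The helper name `build_clustering_url` is made up for illustration, and the host and port are simply the ones used in this article's example.

```python
from urllib.parse import urlencode

# Base URL of the /clustering handler. Host and port come from this article's
# example and will differ on other installations.
CLUSTERING_URL = "http://localhost:8080/solr/clustering"


def build_clustering_url(query, rows=10, engine=None, overrides=None,
                         base_url=CLUSTERING_URL):
    """Return a request URL for the /clustering handler (hypothetical helper).

    engine    -- value for clustering.engine ("default" or "stc" in this config)
    overrides -- dict mapping a Carrot2 attribute key to the value to send
    """
    params = {"q": query, "rows": rows}
    if engine is not None:
        # Selects one of the engines declared in the searchComponent.
        params["clustering.engine"] = engine
    # Attribute keys are passed through unchanged as request parameter names.
    params.update(overrides or {})
    return base_url + "?" + urlencode(params)
```

For example, `build_clustering_url("*:*", engine="stc")` would select the second engine, and `build_clustering_url("*:*", overrides={"LingoClusteringAlgorithm.desiredClusterCountBase": 10})` would ask the default engine for a smaller cluster count base than the configured 20.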

Next, add the extension JARs to the %solr_home%/lib directory.

From the downloaded Solr distribution, copy

dist/apache-solr-clustering-*.jar,

all JARs under contrib/clustering,

all JARs under contrib/clustering/downloads

into %solr_home%/lib.
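The copy step can also be scripted. The following is only a sketch: the function name `copy_clustering_jars` is hypothetical, and it searches contrib/clustering recursively (an assumption, so that the downloads subdirectory is picked up wherever it sits in your distribution).

```python
import shutil
from pathlib import Path


def copy_clustering_jars(solr_dist, solr_home):
    """Copy the clustering JARs from a Solr distribution into <solr_home>/lib.

    Returns the sorted names of the files that were copied.
    """
    dist_root = Path(solr_dist)
    lib_dir = Path(solr_home) / "lib"
    lib_dir.mkdir(parents=True, exist_ok=True)

    # dist/apache-solr-clustering-*.jar
    sources = list((dist_root / "dist").glob("apache-solr-clustering-*.jar"))
    # Every JAR under contrib/clustering, including the downloads subdirectory.
    # The recursive search is an assumption so the exact layout does not matter.
    sources += list((dist_root / "contrib" / "clustering").rglob("*.jar"))

    for jar in sources:
        shutil.copy2(jar, lib_dir / jar.name)
    return sorted({jar.name for jar in sources})
```

Calling `copy_clustering_jars(r"C:\solr-dist", r"C:\solr-home")` with your own two paths (the ones shown are placeholders) returns the list of copied JAR names, which is handy for checking that nothing was missed.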

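While adding the extension JARs I ran into one problem: in the downloaded Solr distribution, the contrib/clustering/downloads directory contains no JARs. They are fetched by running the build.xml in the contrib/clustering directory.

So install Ant first, then run cmd to open a DOS prompt, change into the contrib/clustering directory, and run the ant command.

This downloads the required JARs, namely these four:

simple-xml-1.7.3.jar, pcj-1.2.jar, colt-1.2.0.jar, nni.jar

However, the download path that build.xml specifies for nni.jar appears to be wrong, so that download did not succeed. You have to search for nni.jar online and download it yourself.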
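Because one of the downloads can fail without being noticed, it may help to check the lib directory before starting Solr. A small sketch follows; the function name `missing_jars` is hypothetical, and the JAR names are the four listed in this article.

```python
from pathlib import Path

# JAR names as listed in this article.
REQUIRED_JARS = ("simple-xml-1.7.3.jar", "pcj-1.2.jar", "colt-1.2.0.jar", "nni.jar")


def missing_jars(lib_dir, required=REQUIRED_JARS):
    """Return the required JAR names that are not present in lib_dir."""
    lib = Path(lib_dir)
    return [name for name in required if not (lib / name).is_file()]
```

If the nni.jar download failed as described, `missing_jars` on your lib directory would return `["nni.jar"]`; an empty list means all four files are in place.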

Run Solr and request:

http://localhost:8080/solr/clustering?q=*:*&rows=10
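To inspect the result from a script rather than a browser, something like the following could be used. It is a sketch under assumptions: it adds `wt=json` to the request, and it assumes the JSON response carries a top-level `clusters` list whose entries have `labels` and `docs` keys, which you should verify against the output of your own Solr version. Both function names are hypothetical.

```python
import json
from urllib.request import urlopen


def extract_clusters(payload):
    """Return (labels, doc_ids) pairs from a decoded JSON response.

    Assumes a top-level "clusters" list whose entries have "labels" and
    "docs" keys; adjust if the installed version responds differently.
    """
    return [(cluster.get("labels", []), cluster.get("docs", []))
            for cluster in payload.get("clusters", [])]


def fetch_clusters(url="http://localhost:8080/solr/clustering?q=*:*&rows=10&wt=json"):
    """Fetch the response from a running Solr and return its cluster list."""
    with urlopen(url) as response:
        return extract_clusters(json.load(response))
```

With Solr running, `for labels, docs in fetch_clusters(): print(labels, len(docs))` would print each cluster's labels and how many documents it holds.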