Learning Solr 5 with Yida: solrconfig.xml Configuration Explained

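The solrconfig.xml file holds many parameters that configure Solr itself. A sample solrconfig.xml can be found under the directory where you unpacked Solr:

[Figure: location of the sample solrconfig.xml in the unpacked Solr directory]

Open solrconfig.xml in a text editor and you will see the following configuration: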

<?xml version="1.0" encoding="UTF-8" ?>
<!--
 Licensed to the Apache Software Foundation (ASF) under one or more
 contributor license agreements.  See the NOTICE file distributed with
 this work for additional information regarding copyright ownership.
 The ASF licenses this file to You under the Apache License, Version 2.0
 (the "License"); you may not use this file except in compliance with
 the License.  You may obtain a copy of the License at

     http://www.apache.org/licenses/LICENSE-2.0

 Unless required by applicable law or agreed to in writing, software
 distributed under the License is distributed on an "AS IS" BASIS,
 WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 See the License for the specific language governing permissions and
 limitations under the License.
-->

<!--
     For more details about configurations options that may appear in
     this file, see http://wiki.apache.org/solr/SolrConfigXml.
-->
<config>
  <!-- In all configuration below, a prefix of "solr." for class names
       is an alias that causes solr to search appropriate packages,
       including org.apache.solr.(search|update|request|core|analysis)

       You may also specify a fully qualified Java classname if you
       have your own custom plugins.
    -->

  <!-- Controls what version of Lucene various components of Solr
       adhere to.  Generally, you want to use the latest version to
       get all bug fixes and improvements. It is highly recommended
       that you fully re-index after changing this setting as it can
       affect both how text is indexed and queried.
    -->

  <!-- Data Directory

       Used to specify an alternate directory to hold all index data
       other than the default ./data under the Solr home.  If
       replication is in use, this should match the replication
       configuration.
    -->

  <!-- The DirectoryFactory to use for indexes.

       solr.StandardDirectoryFactory is filesystem
       based and tries to pick the best implementation for the current
       JVM and platform.  solr.NRTCachingDirectoryFactory, the default,
       wraps solr.StandardDirectoryFactory and caches small files in memory
       for better NRT performance.

       One can force a particular implementation via solr.MMapDirectoryFactory,
       solr.NIOFSDirectoryFactory, or solr.SimpleFSDirectoryFactory.

       solr.RAMDirectoryFactory is memory based, not
       persistent, and doesn't work with replication.
    -->
  <directoryFactory name="DirectoryFactory"
                    class="${solr.directoryFactory:solr.NRTCachingDirectoryFactory}">
  </directoryFactory>

  <!-- The CodecFactory for defining the format of the inverted index.
       The default implementation is SchemaCodecFactory, which is the official Lucene
       index format, but hooks into the schema to provide per-field customization of
       the postings lists and per-document values in the fieldType element
       (postingsFormat/docValuesFormat). Note that most of the alternative implementations
       are experimental, so if you choose to customize the index format, it's a good
       idea to convert back to the official format e.g. via IndexWriter.addIndexes(IndexReader)
       before upgrading to a newer version to avoid unnecessary reindexing.
    -->

  <!-- ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
       Index Config - These settings control low-level behavior of indexing
       Most example settings here show the default value, but are commented
       out, to more easily see where customizations have been made.

       Note: This replaces <indexDefaults> and <mainIndex> from older versions
       ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ -->
  <indexConfig>
    <!-- LockFactory

         This option specifies which Lucene LockFactory implementation
         to use.

         single = SingleInstanceLockFactory - suggested for a
                  read-only index or when there is no possibility of
                  another process trying to modify the index.
         native = NativeFSLockFactory - uses OS native file locking.
                  Do not use when multiple solr webapps in the same
                  JVM are attempting to share a single index.
         simple = SimpleFSLockFactory - uses a plain file for locking

         Defaults: 'native' is default for Solr3.6 and later, otherwise
                   'simple' is the default

         More details on the nuances of each LockFactory...
         http://wiki.apache.org/lucene-java/AvailableLockFactories
      -->
    <lockType>${solr.lock.type:native}</lockType>

    <!-- Lucene Infostream

         To aid in advanced debugging, Lucene provides an "InfoStream"
         of detailed information when indexing.

         Setting the value to true will instruct the underlying Lucene
         IndexWriter to write its info stream to solr's log. By default,
         this is enabled here, and controlled through log4j.properties.
      -->
  </indexConfig>

  <!-- JMX

       This example enables JMX if and only if an existing MBeanServer
       is found, use this if you want to configure JMX through JVM
       parameters. Remove this to disable exposing Solr configuration
       and statistics to JMX.

       For more details see http://wiki.apache.org/solr/SolrJmx
    -->
  <jmx />
  <!-- If you want to connect to a particular server, specify the
       agentId
    -->
  <!-- <jmx serviceUrl="service:jmx:rmi:///jndi/rmi://localhost:9999/solr"/>
    -->

  <updateHandler class="solr.DirectUpdateHandler2">

    <!-- Enables a transaction log, used for real-time get, durability,
         and solr cloud replica recovery.  The log can grow as big as
         uncommitted changes to the index, so use of a hard autoCommit
         is recommended (see below).
         "dir" - the target directory for transaction logs, defaults to the
                 solr data directory.  -->
    <updateLog>
    </updateLog>

    <!-- AutoCommit

         Perform a hard commit automatically under certain conditions.
         Instead of enabling autoCommit, consider using "commitWithin"
         when adding documents.

         http://wiki.apache.org/solr/UpdateXmlMessages

         maxDocs - Maximum number of documents to add since the last
                   commit before automatically triggering a new commit.

         maxTime - Maximum amount of time in ms that is allowed to pass
                   since a document was added before automatically
                   triggering a new commit.

         openSearcher - if false, the commit causes recent index changes
                   to be flushed to stable storage, but does not cause a new
                   searcher to be opened to make those changes visible.

         If the updateLog is enabled, then it's highly recommended to
         have some sort of hard autoCommit to limit the log size.
      -->
    <autoCommit>
      <maxTime>${solr.autoCommit.maxTime:15000}</maxTime>
      <openSearcher>false</openSearcher>
    </autoCommit>

    <!-- softAutoCommit is like autoCommit except it causes a
         'soft' commit which only ensures that changes are visible
         but does not ensure that data is synced to disk.  This is
         faster and more near-realtime friendly than a hard commit.
      -->
    <autoSoftCommit>
      <maxTime>${solr.autoSoftCommit.maxTime:-1}</maxTime>
    </autoSoftCommit>

  </updateHandler>

  <!-- Query section - these settings control query time things like caches -->
  <query>
    <!-- Max Boolean Clauses

         Maximum number of clauses in each BooleanQuery, an exception
         is thrown if exceeded.

         ** WARNING **

         This option actually modifies a global Lucene property that
         will affect all SolrCores.  If multiple solrconfig.xml files
         disagree on this property, the value at any given moment will
         be based on the last SolrCore to be initialized.
      -->

    <!-- Solr Internal Query Caches

         There are two implementations of cache available for Solr,
         LRUCache, based on a synchronized LinkedHashMap, and
         FastLRUCache, based on a ConcurrentHashMap.

         FastLRUCache has faster gets and slower puts in single
         threaded operation and thus is generally faster than LRUCache
         when the hit ratio of the cache is high (> 75%), and may be
         faster under other scenarios on multi-cpu systems.
      -->

    <!-- Filter Cache

         Cache used by SolrIndexSearcher for filters (DocSets),
         unordered sets of *all* documents that match a query.  When a
         new searcher is opened, its caches may be prepopulated or
         "autowarmed" using data from caches in the old searcher.
         autowarmCount is the number of items to prepopulate.  For
         LRUCache, the autowarmed items will be the most recently
         accessed items.

         Parameters:
           class - the SolrCache implementation
               (LRUCache or FastLRUCache)
           size - the maximum number of entries in the cache
           initialSize - the initial capacity (number of entries) of
               the cache.  (see java.util.HashMap)
           autowarmCount - the number of entries to prepopulate from
               an old cache.
      -->
    <filterCache class="solr.FastLRUCache"
                 size="512"
                 initialSize="512"
                 autowarmCount="0"/>

    <!-- Query Result Cache

         Caches results of searches - ordered lists of document ids
         (DocList) based on a query, a sort, and the range of documents requested.
      -->
    <queryResultCache class="solr.LRUCache"
                      size="512"
                      initialSize="512"
                      autowarmCount="0"/>

    <!-- Document Cache

         Caches Lucene Document objects (the stored fields for each
         document).  Since Lucene internal document ids are transient,
         this cache will not be autowarmed.
      -->
    <documentCache class="solr.LRUCache"
                   size="512"
                   initialSize="512"
                   autowarmCount="0"/>

    <cache name="perSegFilter"
           class="solr.search.LRUCache"
           size="10"
           initialSize="0"
           autowarmCount="10"
           regenerator="solr.NoOpRegenerator" />

    <!-- Lazy Field Loading

         If true, stored fields that are not requested will be loaded
         lazily.  This can result in a significant speed improvement
         if the usual case is to not load all stored fields,
         especially if the skipped fields are large compressed text
         fields.
      -->

    <!-- Result Window Size

         An optimization for use with the queryResultCache.  When a search
         is requested, a superset of the requested number of document ids
         are collected.  For example, if a search for a particular query
         requests matching documents 10 through 19, and queryResultWindowSize is 50,
         then documents 0 through 49 will be collected and cached.  Any further
         requests in that range can be satisfied via the cache.
      -->

    <!-- Maximum number of documents to cache for any entry in the
         queryResultCache.
      -->

    <!-- Use Cold Searcher

         If a search request comes in and there is no current
         registered searcher, then immediately register the still
         warming searcher and use it.  If "false" then all requests
         will block until the first searcher is done warming.
      -->
    <useColdSearcher>false</useColdSearcher>

    <!-- Max Warming Searchers

         Maximum number of searchers that may be warming in the
         background concurrently.  An error is returned if this limit
         is exceeded.

         Recommend values of 1-2 for read-only slaves, higher for
         masters w/o cache warming.
      -->

  </query>

  <!-- Request Dispatcher

       This section contains instructions for how the SolrDispatchFilter
       should behave when processing requests for this SolrCore.

       handleSelect is a legacy option that affects the behavior of requests
       such as /select?qt=XXX

       handleSelect="true" will cause the SolrDispatchFilter to process
       the request and dispatch the query to a handler specified by the
       "qt" param, assuming "/select" isn't already registered.

       handleSelect="false" will cause the SolrDispatchFilter to
       ignore "/select" requests, resulting in a 404 unless a handler
       is explicitly registered with the name "/select"

       handleSelect="true" is not recommended for new users, but is the default
       for backwards compatibility
    -->
  <requestDispatcher handleSelect="false" >
    <!-- Request Parsing

         These settings indicate how Solr Requests may be parsed, and
         what restrictions may be placed on the ContentStreams from
         those requests

         enableRemoteStreaming - enables use of the stream.file
         and stream.url parameters for specifying remote streams.

         multipartUploadLimitInKB - specifies the max size (in KiB) of
         Multipart File Uploads that Solr will allow in a Request.

         formdataUploadLimitInKB - specifies the max size (in KiB) of
         form data (application/x-www-form-urlencoded) sent via
         POST. You can use POST to pass request parameters not
         fitting into the URL.

         addHttpRequestToContext - if set to true, it will instruct
         the requestParsers to include the original HttpServletRequest
         object in the context map of the SolrQueryRequest under the
         key "httpRequest". It will not be used by any of the existing
         Solr components, but may be useful when developing custom
         plugins.

         *** WARNING ***
         The settings below authorize Solr to fetch remote files, You
         should make sure your system has some authentication before
         using enableRemoteStreaming="true"
      -->
    <requestParsers enableRemoteStreaming="true"
                    multipartUploadLimitInKB="2048000"
                    formdataUploadLimitInKB="2048"
                    addHttpRequestToContext="false"/>

    <!-- HTTP Caching

         Set HTTP caching related parameters (for proxy caches and clients).

         The options below instruct Solr not to output any HTTP Caching
         related headers
      -->
  </requestDispatcher>

  <!-- Request Handlers

       http://wiki.apache.org/solr/SolrRequestHandler

       Incoming queries will be dispatched to a specific handler by name
       based on the path specified in the request.

       Legacy behavior: If the request path uses "/select" but no Request
       Handler has that name, and if handleSelect="true" has been specified in
       the requestDispatcher, then the Request Handler is dispatched based on
       the qt parameter.  Handlers without a leading '/' are accessed this way
       like so: http://host/app/[core/]select?qt=name  If no qt is
       given, then the requestHandler that declares default="true" will be
       used or the one named "standard".

       If a Request Handler is declared with startup="lazy", then it will
       not be initialized until the first request that uses it.
    -->

  <!-- SearchHandler

       http://wiki.apache.org/solr/SearchHandler

       For processing Search Queries, the primary Request Handler
       provided with Solr is "SearchHandler".  It delegates to a sequence
       of SearchComponents (see below) and supports distributed
       queries across multiple shards
    -->

  <!--
      <str name="config">solr-data-config.xml</str>
    </lst>
  </requestHandler>
  -->
      <str name="config">data-config.xml</str>
      <!-- default values for query parameters can be specified, these
           will be overridden by parameters in the request
        -->
      <str name="echoParams">explicit</str>
    </lst>
  </requestHandler>
  <!-- The export request handler is used to export full sorted result sets.
       Do not change these defaults.
    -->
      <str name="rq">{!xport}</str>
      <str name="wt">xsort</str>
      <str name="distrib">false</str>
      <str>query</str>
    </arr>
  </initParams>

  <!-- Field Analysis Request Handler

       RequestHandler that provides much the same functionality as
       analysis.jsp. Provides the ability to specify multiple field
       types and field names in the same request and outputs
       index-time and query-time analysis for each of them.

       Request parameters are:
       analysis.fieldname - field name whose analyzers are to be used
       analysis.fieldtype - field type whose analyzers are to be used
       analysis.fieldvalue - text for index-time analysis
       q (or analysis.q) - text for query time analysis
       analysis.showmatch (true|false) - When set to true and when
           query analysis is performed, the produced tokens of the
           field value analysis will be marked as "matched" for every
           token that is produced by the query analysis
    -->
  <requestHandler name="/analysis/field"
                  startup="lazy"
                  class="solr.FieldAnalysisRequestHandler" />

  <!-- Document Analysis Handler

       http://wiki.apache.org/solr/AnalysisRequestHandler

       An analysis handler that provides a breakdown of the analysis
       process of provided documents. This handler expects a (single)
       content stream with the following format:

       <docs>
         <doc>
           <field name="text">the text value</field>
         </doc>
         ...
       </docs>

       Note: Each document must contain a field which serves as the
       unique key. This key is used in the returned response to associate
       an analysis breakdown to the analyzed document.

       Like the FieldAnalysisRequestHandler, this handler also supports
       query analysis by sending either an "analysis.query" or "q"
       request parameter that holds the query text to be analyzed. It
       also supports the "analysis.showmatch" parameter which when set to
       true, all field tokens that match the query tokens will be marked
       as a "match".
    -->
  <requestHandler name="/analysis/document"
                  class="solr.DocumentAnalysisRequestHandler"
                  startup="lazy" />

      <str name="echoParams">explicit</str>

  <!-- Search Components

       Search components are registered to SolrCore and used by
       instances of SearchHandler (which can access them by name)

       By default, the following components are available:
    -->

  <!-- Terms Component

       http://wiki.apache.org/solr/TermsComponent

       A component to return terms and document frequency of those
       terms
    -->

<bool name="distrib">false</bool>

</lst>

<str>terms</str>

<admin>

</admin>

</config>

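Below I explain the key parts of this file.

lib

The <lib> directive tells Solr how to load the jars that Solr plugins depend on. The comments in solrconfig.xml contain sample directives.

The dir attribute is a path to a directory of jars, resolved relative to the root directory of the current core. The regex attribute is a regular expression used to filter file names; only jar files whose names match it are loaded.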
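
For reference, the stock Solr 5 example config loads contrib jars with directives along the following lines. This is only a sketch: the solr.install.dir property and its relative fallback path depend on how your installation is laid out, so treat both as assumptions to adapt.

```xml
<!-- Load every jar under contrib/extraction/lib; dir is resolved relative to the core directory -->
<lib dir="${solr.install.dir:../../../..}/contrib/extraction/lib" regex=".*\.jar" />

<!-- Load only the jars in dist/ whose file name matches the regex -->
<lib dir="${solr.install.dir:../../../..}/dist/" regex="solr-cell-\d.*\.jar" />
```

If a dir matches no files, Solr typically just logs a warning, so a wrong path tends to show up later as a ClassNotFoundException when the plugin is first used.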

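dataDir parameter

dataDir specifies the directory that holds Solr's index data; the index itself is written to the data/index directory. By default dataDir is relative to the current core directory (if a core exists under solr_home); if there is no core under solr_home, it is relative to solr_home instead. In practice dataDir is usually configured in core.properties.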
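
As a small sketch, the element in the stock example config looks like the following. The solr.data.dir system property is optional; the empty default after the colon means "fall back to ./data under the core directory".

```xml
<!-- Use -Dsolr.data.dir=/some/path to override; otherwise ./data under the core is used -->
<dataDir>${solr.data.dir:}</dataDir>
```

Setting dataDir in core.properties instead keeps the location per core without editing solrconfig.xml.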

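The codecFactory element sets the codec factory class for Lucene's inverted index. The default implementation is the official SchemaCodecFactory class.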
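
A minimal sketch of the default declaration, as it appears in the stock example config:

```xml
<!-- SchemaCodecFactory writes the official Lucene format and lets field types
     in the schema choose their own postingsFormat / docValuesFormat -->
<codecFactory class="solr.SchemaCodecFactory"/>
```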

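Inside the <indexConfig> tag of solrconfig.xml there are many comments about its settings:

<!-- maxFieldLength was removed in 4.0. To get similar behavior, include a
     LimitTokenCountFilterFactory in your fieldType definition. E.g.
 <filter class="solr.LimitTokenCountFilterFactory" maxTokenCount="10000"/>
-->

This tells us that maxFieldLength was removed as of 4.0 and that a filter can be configured to get a similar effect. maxTokenCount means that when a field is analyzed, only the first 10000 tokens are kept and the rest of the field value is discarded. Likewise, a maxFieldLength of 1000 meant that only the first 1000 tokens of the field value were indexed.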

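writeLockTimeout is the maximum time an IndexWriter instance waits to acquire the write lock. If the lock still has not been acquired when the timeout expires, the IndexWriter's write operation throws an exception.

maxIndexingThreads is the maximum number of threads used for indexing. The default is 8.

<useCompoundFile>false</useCompoundFile>

useCompoundFile turns the compound file format on or off. With compound files enabled, fewer index files are created and so fewer file descriptors are used, at some cost in performance. Lucene enables it by default, while Solr has disabled it by default since version 3.6.

ramBufferSizeMB is the size of the in-memory buffer used while indexing, in MB. The default is 100 MB.

maxBufferedDocs is the maximum number of documents buffered before they are written to disk. Exceeding it triggers a flush of the index.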
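
Put together, the settings described above could be written as in the sketch below. The values are the defaults mentioned in the text or shown (commented out) in the Solr 5 example config; later Solr releases deprecate some of these elements, so check them against the version you actually run.

```xml
<indexConfig>
  <!-- Wait at most 1000 ms for the write lock before failing -->
  <writeLockTimeout>1000</writeLockTimeout>
  <!-- Upper bound on concurrent indexing threads -->
  <maxIndexingThreads>8</maxIndexingThreads>
  <!-- Fewer files and file descriptors when true, at some cost in speed -->
  <useCompoundFile>false</useCompoundFile>
  <!-- Flush when the buffer reaches 100 MB ... -->
  <ramBufferSizeMB>100</ramBufferSizeMB>
  <!-- ... or when 1000 documents are buffered, whichever limit is hit first -->
  <maxBufferedDocs>1000</maxBufferedDocs>
</indexConfig>
```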

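<mergePolicy class="org.apache.lucene.index.TieredMergePolicy">
  <int name="maxMergeAtOnce">10</int>
  <int name="segmentsPerTier">10</int>
</mergePolicy>

mergePolicy configures Lucene's segment merge policy. It takes two parameters:

maxMergeAtOnce: the maximum number of segments merged at one time.

segmentsPerTier: the number of segments allowed per tier. In the source below, the segment size of each successive tier grows geometrically (multiplied by maxMergeAtOnce), and segsPerTier is how many segments each tier contributes to the allowed total: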

// compute max allowed segs in the index
long levelSize = minSegmentBytes;
long bytesLeft = totIndexBytes;
double allowedSegCount = 0;
while (true) {
  final double segCountLevel = bytesLeft / (double) levelSize;
  if (segCountLevel < segsPerTier) {
    allowedSegCount += Math.ceil(segCountLevel);
    break;
  }
  allowedSegCount += segsPerTier;
  bytesLeft -= segsPerTier * levelSize;
  levelSize *= maxMergeAtOnce;
}
int allowedSegCountInt = (int) allowedSegCount;

To understand what mergeFactor means, start with the explanation given in Lucene in Action:

IndexWriter’s mergeFactor lets you control how many documents to store in memory before writing them to the disk, as well as how often to merge multiple index segments together. (Index segments are covered in appendix B.) With the default value of 10, Lucene stores 10 documents in memory before writing them to a single segment on the disk. The mergeFactor value of 10 also means that once the number of segments on the disk has reached the power of 10, Lucene merges these segments into a single segment.

For instance, if you set mergeFactor to 10, a new segment is created on the disk for every 10 documents added to the index. When the tenth segment of size 10 is added, all 10 are merged into a single segment of size 100. When 10 such segments of size 100 have been added, they’re merged into a single segment containing 1,000 documents, and so on. Therefore, at any time, there are no more than 9 segments in the index, and the size of each merged segment is the power of 10.

There is a small exception to this rule that has to do with maxMergeDocs, another IndexWriter instance variable: while merging segments, Lucene ensures that no segment with more than maxMergeDocs documents is created. For instance, suppose you set maxMergeDocs to 1,000. When you add the ten-thousandth document, instead of merging multiple segments into a single segment of size 10,000, Lucene creates the tenth segment of size 1,000 and keeps adding new segments of size 1,000 for every 1,000 documents added.

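In other words, IndexWriter's mergeFactor controls both how many documents are buffered in memory before a segment is written to disk and how often segments are merged. The default is 10: Lucene buffers 10 documents, writes them out as one segment, and once 10 segments of the same size exist on disk it merges them into one. So with mergeFactor set to 10, every 10 added documents produce a new segment; when the tenth such segment appears, the 10 are merged into one segment of 100 documents; ten of those are merged into a segment of 1,000 documents, and so on. At any moment the index therefore holds no more than 9 segments of each size, and every merged segment holds a power-of-10 number of documents.

The exception is maxMergeDocs. While merging, Lucene makes sure that no segment ends up with more than maxMergeDocs documents, which keeps any single segment from growing too large. Suppose maxMergeDocs is set to 1000. When the tenth 1,000-document segment is created, no merge is triggered. (Without the limit, those ten segments would be merged into a single segment of 10,000 documents; because maxMergeDocs caps a segment at 1,000 documents, the merge does not happen.) Other parameters also affect segment merging, for example:

mergeFactor: merging starts when the number of segments of roughly equal size reaches this value.

minMergeSize: all segments smaller than this value are treated as roughly equal in size and are merged together.

maxMergeSize: a segment larger than this value no longer takes part in merging.

maxMergeDocs: a segment containing more documents than this value no longer takes part in merging.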
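
As a hedged sketch for Solr versions of this era, mergeFactor can be set directly inside indexConfig. With TieredMergePolicy it acts as a convenience setting that applies the same value to both maxMergeAtOnce and segmentsPerTier; later Solr releases configure merging differently, so verify this against your version.

```xml
<indexConfig>
  <!-- 10 is the default discussed above; larger values mean fewer merges
       during indexing but more segments to search -->
  <mergeFactor>10</mergeFactor>
</indexConfig>
```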

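Segment merging happens in two steps:

1. First, the segments that need merging are selected. This is decided by the MergePolicy class.

2. Then the merge itself is carried out. This step is handed to the MergeScheduler, which mainly does two things:

a. merges stored fields, term vectors, normalization factors (norms) and similar data

b. merges the inverted index data

That was quite a digression. Back to the parameters in solrconfig.xml that affect indexing.

mergeScheduler, mentioned above, configures the class that carries out segment merges. The default implementation is Lucene's own ConcurrentMergeScheduler.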
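
A minimal sketch of spelling out the default explicitly; it goes inside indexConfig, and leaving it out has the same effect:

```xml
<indexConfig>
  <!-- Runs merges in background threads; this is already the default -->
  <mergeScheduler class="org.apache.lucene.index.ConcurrentMergeScheduler"/>
</indexConfig>
```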

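<lockType>${solr.lock.type:native}</lockType>

lockType selects the Lucene LockFactory implementation. The possible values are:

single: SingleInstanceLockFactory, suggested for a read-only index or when no other process can possibly modify the index.

native: Lucene's NativeFSLockFactory, which uses the operating system's native file locking.

simple: Lucene's SimpleFSLockFactory, which works by creating a write.lock file on disk.

Note that the "Defaults" line in the comment is not a fourth value; it states the default. From Solr 3.6 on the default is native, and in earlier versions it is simple, so the lock implementation you get when lockType is omitted depends on your Solr version.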

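<unlockOnStartup>false</unlockOnStartup>

If this is set to true, any locks held by an IndexWriter or a commit are released when Solr starts. This defeats Lucene's locking mechanism, so use it with care. If lockType is single, the setting has no effect either way.

deletionPolicy configures the index deletion policy. The default is Solr's SolrDeletionPolicy. For a custom policy, implement Lucene's org.apache.lucene.index.IndexDeletionPolicy interface.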
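
As a sketch, the default policy with the parameters used in the stock example config is shown below. The values are those defaults and are a starting point, not a recommendation.

```xml
<indexConfig>
  <deletionPolicy class="solr.SolrDeletionPolicy">
    <!-- Number of commit points to keep -->
    <str name="maxCommitsToKeep">1</str>
    <!-- Number of optimized commit points to keep -->
    <str name="maxOptimizedCommitsToKeep">0</str>
  </deletionPolicy>
</indexConfig>
```

A custom implementation would be referenced the same way, with class set to its fully qualified class name.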

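<jmx />

This element enables JMX in Solr. For details, see the official Solr wiki:

http://wiki.apache.org/solr/SolrJmx

updateHandler specifies the class that handles index updates. DirectUpdateHandler2 is a high-performance update handler that supports soft commits.

<updateLog>
  <str name="dir">${solr.ulog.dir:}</str>
</updateLog>

<updateLog> specifies where the update handler above stores its transaction log. The default is Solr's data directory, that is, the directory configured by dataDir.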

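The <query> tag holds the settings related to querying the index.

maxBooleanClauses is the maximum number of clauses a BooleanQuery may contain. When the solrconfig.xml files of different cores set different values, the value from the last core to be initialized wins.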
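
A minimal sketch, using the value from the stock example config:

```xml
<query>
  <!-- Global Lucene setting: keep it identical across all cores,
       because the last core to initialize decides the effective value -->
  <maxBooleanClauses>1024</maxBooleanClauses>
</query>
```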

<filterCache class="solr.FastLRUCache"
             size="512"
             initialSize="512"
             autowarmCount="0"/>

Configures the cache used for filters.

<queryResultCache class="solr.LRUCache"
                  size="512"
                  initialSize="512"
                  autowarmCount="0"/>

Configures the cache for the result sets (TopDocs) returned by queries.

<documentCache class="solr.LRUCache"
               size="512"
               initialSize="512"
               autowarmCount="0"/>

Configures the cache for the stored fields of documents, since loading stored field values from disk every time is expensive. Stored fields here means the fields declared with Store.YES.

<fieldValueCache class="solr.FastLRUCache"
                 size="512"
                 autowarmCount="128"
                 showItems="32" />

This cache holds field values so that they can be accessed quickly by document id. It is created by default even if it is not configured explicitly.

<cache name="myUserCache"
       class="solr.LRUCache"
       size="4096"
       initialSize="1024"
       autowarmCount="1024"
       regenerator="com.mycompany.MyRegenerator"
       />

This configures a custom cache of your own. Your regenerator class must implement Solr's CacheRegenerator interface.

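enableLazyFieldLoading turns on lazy loading of stored fields: a stored field is loaded lazily when the query did not explicitly request it in the returned fields.

useFilterForSortedQuery controls whether a filter is used in place of the query when the query is not sorted by score.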
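
The two settings above could be declared as in this sketch. The values mirror the stock example config, where useFilterForSortedQuery is shipped commented out, so enabling it is a choice rather than the default.

```xml
<query>
  <!-- Load stored fields that were not requested only when they are accessed -->
  <enableLazyFieldLoading>true</enableLazyFieldLoading>
  <!-- Use the filterCache for queries that are not sorted by score -->
  <useFilterForSortedQuery>true</useFilterForSortedQuery>
</query>
```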

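<listener event="newSearcher" class="solr.QuerySenderListener">
  <arr name="queries">
    <!--
       <lst><str name="q">solr</str><str name="sort">price asc</str></lst>
       <lst><str name="q">rocks</str><str name="sort">weight asc</str></lst>
      -->
  </arr>
</listener>

QuerySenderListener listens for searcher events and, when a new searcher is opened, sends the queries listed under queries to it so that its caches are warmed before it serves real requests. In the sample above, each warming query is given by a q keyword and a sort rule.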

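The handleSelect option exists only for compatibility with older versions, and its use is no longer recommended.

never304="true" means the Solr server never returns a 304. What does HTTP status code 304 mean? The server is telling the client that the requested resource has not been modified, so the client should use its cached copy. With never304 enabled, Solr always returns a fresh response and skips HTTP caching, whether or not the resource has changed. This is basic HTTP; read up on the protocol if it is unfamiliar.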
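
As a sketch of where these two options live, following the layout of the stock example config (the commented alternative shows how HTTP caching could be switched on; its values are illustrative):

```xml
<requestDispatcher handleSelect="false">
  <!-- Never answer with 304 Not Modified and emit no caching headers -->
  <httpCaching never304="true" />

  <!-- Alternative: let clients and proxies cache responses for 30 seconds
  <httpCaching never304="false" lastModFrom="openTime" etagSeed="Solr">
    <cacheControl>max-age=30, public</cacheControl>
  </httpCaching>
  -->
</requestDispatcher>
```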

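<str name="echoParams">explicit</str>

I will skip the other requestHandler entries, since they are all much alike: each one maps a request URL to the class that handles it, much like the mapping between a request URL and a controller class in Spring MVC.

searchComponent configures query components such as SpellCheckComponent for spell checking. The detailed spell-check configuration is left for a later post on spellcheck.

The terms component returns the terms in the index together with their document frequency, that is, the number of documents each term appears in.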
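
A sketch of how the terms component is registered and exposed, following the stock example config (the /terms path and the lazy startup are that example's choices):

```xml
<searchComponent name="terms" class="solr.TermsComponent"/>

<!-- A request handler that runs only the terms component -->
<requestHandler name="/terms" class="solr.SearchHandler" startup="lazy">
  <lst name="defaults">
    <bool name="terms">true</bool>
    <bool name="distrib">false</bool>
  </lst>
  <arr name="components">
    <str>terms</str>
  </arr>
</requestHandler>
```

A request such as /terms?terms.fl=name would then list terms from the name field, assuming the schema defines a field with that name.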

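The highlighting component configures keyword highlighting. I will skip the details of Solr's highlighting configuration for now; this post only aims at a rough understanding of what each setting means, and actual usage is left for later posts.

I will not go through the remaining searchComponent settings one by one, because there are too many. Read the English comments in the file, and ask me if something is still unclear.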

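<queryResponseWriter name="json" class="solr.JSONResponseWriter">
  <!-- For the purposes of the tutorial, JSON responses are written as
       plain text so that they are easy to read in *any* browser.
       If you expect a MIME type of "application/json" just remove this override.
    -->
  <str name="content-type">text/plain; charset=UTF-8</str>
</queryResponseWriter>

This configures the class that converts Solr's response data. JSONResponseWriter renders the HTTP response as JSON. content-type is the Content-Type response header; here it tells the client that the MIME type of the returned data is text/plain and the character set is UTF-8.

The other built-in response writers include Velocity and XSLT. To write your own, say one based on FreeMarker, implement Solr's QueryResponseWriter interface, using the existing implementations as a model, and then add a similar <queryResponseWriter> entry to solrconfig.xml.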
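
Registering such a custom writer might look like the sketch below. The name freemarker and the class com.example.FreemarkerResponseWriter are hypothetical placeholders for your own implementation of QueryResponseWriter; neither ships with Solr.

```xml
<!-- Hypothetical custom writer: replace the class with your own implementation -->
<queryResponseWriter name="freemarker" class="com.example.FreemarkerResponseWriter"/>
```

Clients would then select it per request with the wt parameter, for example wt=freemarker.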

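One last thing to explain: solrconfig.xml contains many custom tags such as <arr>, <lst>, <str> and <int>. Here is a combined explanation.

[Figure from the book Solr in Action: the parameter elements used in solrconfig.xml]

The figure is taken from the book Solr in Action. Since it is in English, a short explanation follows:

arr is short for array and denotes an array; name is the variable name of the array parameter.

lst is short for list, but note that it holds key-value pairs.

bool denotes a boolean variable; name is the variable name.

int, long, float, str and so on work the same way.

str is short for string. The one thing to watch is that a str child of arr has no name attribute, whereas a str inside lst does.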
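
The sketch below shows these elements side by side inside a request handler. The parameter values are illustrative only, and the spellcheck entry assumes a search component with that name is registered elsewhere in the file.

```xml
<requestHandler name="/select" class="solr.SearchHandler">
  <!-- lst: named key-value pairs, so every child carries a name attribute -->
  <lst name="defaults">
    <str name="echoParams">explicit</str>
    <int name="rows">10</int>
    <bool name="hl">false</bool>
  </lst>
  <!-- arr: a plain array, so its str children have no name attribute -->
  <arr name="last-components">
    <str>spellcheck</str>
  </arr>
</requestHandler>
```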

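To sum up:

The settings in solrconfig.xml fall into the following main groups:

1. The Lucene version to match. It determines the structure of the Lucene index you create, and because index structures are not fully compatible across Lucene versions, this deserves your attention.

2. Index creation settings, such as the index directory and the IndexWriterConfig-related settings (which determine indexing performance).

3. Load paths for the external jars that solrconfig.xml depends on.

4. JMX settings.

5. Cache settings, covering the filter cache, query result cache, document cache, custom caches and so on.

6. updateHandler settings, that is, settings for index update operations.

7. requestHandler settings, that is, the classes that handle incoming client HTTP requests.

8. Search component settings such as highlighting and spellchecker.

9. responseWriter settings, that is, the response converters that decide the format in which data is returned to the client.

10. Custom ValueSourceParser settings, used to influence document boosting, scoring and sorting.

That is all for solrconfig.xml. Understanding these settings clears the way for the rest of your Solr learning. Whatever I did not cover, or skipped on purpose, is left for you to read and understand yourselves. The file runs to more than 1000 lines, explaining every line would take too long, and many of the settings are similar, so I expect you will be able to follow them.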


Reposted from: http://iamyida.iteye.com/blog/2211728
