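I've spent the last two days on a Sphinx full-text indexing project and finally got it working, so here is a summary.
1. Install Sphinx (I'm on macOS here; most of the commands below also work on Linux)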
mkdir /usr/local/sphinx
cd /usr/local/sphinx
wget http://sphinxsearch.com/files/sphinx-2.2.11-release.tar.gz
tar -zvxf sphinx-2.2.11-release.tar.gz
cd sphinx-2.2.11-release
./configure
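make && sudo make install
Check that the installation succeeded
searchd -h   # if the usage help is printed, the install succeeded
Error encountered during installation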
configuring Sphinx
checking for CFLAGS needed for pthreads... none
checking for LIBS needed for pthreads... -lpthread
checking for pthreads... found
checking whether to compile with MySQL support... yes
checking for mysql_config... mysql_config
checking for mysql_real_connect... no
checking for mysql_real_connect... no
checking MySQL include files... configure: error: missing include files.
**
ERROR: cannot find MySQL include files.
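Fix (Debian/Ubuntu): sudo apt-get install libmysql++-dev, then re-run ./configure. What configure actually needs is the MySQL client headers, which the libmysqlclient-dev package provides.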
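Before re-running ./configure it can help to confirm that mysql.h is really visible. The following is a minimal sketch, not part of the original setup: find_mysql_header is a hypothetical helper name, and it assumes mysql_config (which the configure output above did find) reports the include directory through its --include flag.

```shell
# Report whether the MySQL client header that ./configure looks for exists.
# $1 (optional): a directory to check directly instead of asking mysql_config.
find_mysql_header() {
    dir="$1"
    if [ -z "$dir" ] && command -v mysql_config >/dev/null 2>&1; then
        # mysql_config --include prints flags such as -I/usr/include/mysql;
        # keep only the first directory.
        dir=$(mysql_config --include | sed -e 's/^-I//' -e 's/ .*$//')
    fi
    if [ -n "$dir" ] && [ -f "$dir/mysql.h" ]; then
        echo "found: $dir/mysql.h"
        return 0
    fi
    echo "mysql.h not found"
    return 1
}
```

Calling find_mysql_header with no argument checks the directory mysql_config reports; passing a directory checks that location instead. If the header lives somewhere unusual, Sphinx's configure script can, as far as I know, be pointed at it with --with-mysql-includes.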
2. sphinx.conf configuration (see the official Sphinx site for details on each option)
# Minimal Sphinx configuration sample (clean, simple, functional)
#
source main_src
{
type = mysql
sql_host = 192.168.1.221
sql_user = root
sql_pass =root.remote
sql_db = caiban
sql_port = 3306 # optional, default is 3306
sql_sock =/tmp/mysql.sock
sql_query_pre =SET NAMES utf8
sql_query_pre =SET SESSION query_cache_type=OFF
sql_query_pre =replace into sph_counter select 1,max(id) from register_enterprise_extends
sql_query = \
SELECT id,company_name,trademark,legal_person_name, UNIX_TIMESTAMP(created_at) AS created_at,reg_address,reg_number,business_scope,linkman,reg_organs,operating_period,views,reg_capital FROM register_enterprise_extends where id<=(select max_doc_id from sph_counter where counter_id=1)
sql_attr_uint =id
sql_field_string =company_name
sql_field_string =trademark
sql_field_string =legal_person_name
sql_attr_timestamp =created_at
sql_field_string =reg_address
sql_field_string =reg_number
sql_field_string =business_scope
sql_field_string =reg_organs
sql_field_string =operating_period
sql_field_string =reg_capital
sql_field_string =views
}
source delta_src: main_src{
sql_ranged_throttle=100
sql_query_pre=SET NAMES utf8
sql_query_pre=SET SESSION query_cache_type=OFF
sql_query= SELECT id,company_name,trademark,legal_person_name, UNIX_TIMESTAMP(created_at) AS created_at,reg_address,reg_number,business_scope,linkman,reg_organs,operating_period,views,reg_capital FROM register_enterprise_extends where id>(select max_doc_id from sph_counter where counter_id=1)
sql_attr_uint =id
sql_field_string =company_name
sql_field_string =trademark
sql_field_string =legal_person_name
sql_attr_timestamp =created_at
sql_field_string =reg_address
sql_field_string =reg_number
sql_field_string =business_scope
sql_field_string =reg_organs
sql_field_string =operating_period
sql_field_string =reg_capital
sql_field_string =views
}
index main{
source =main_src
path=/usr/local/sphinx/main
docinfo =extern
min_word_len =1
charset_type=utf-8
min_prefix_len=0
min_infix_len =1
ngram_len =1
charset_table = U+FF10..U+FF19->0..9, 0..9, U+FF41..U+FF5A->a..z, U+FF21..U+FF3A->a..z,A..Z->a..z, a..z, U+0149, U+017F, U+0138, U+00DF, U+00FF, U+00C0..U+00D6->U+00E0..U+00F6,U+00E0..U+00F6, U+00D8..U+00DE->U+00F8..U+00FE, U+00F8..U+00FE, U+0100->U+0101, U+0101,U+0102->U+0103, U+0103, U+0104->U+0105, U+0105, U+0106->U+0107, U+0107, U+0108->U+0109,U+0109, U+010A->U+010B, U+010B, U+010C->U+010D, U+010D, U+010E->U+010F, U+010F,U+0110->U+0111, U+0111, U+0112->U+0113, U+0113, U+0114->U+0115, U+0115,U+0116->U+0117,U+0117, U+0118->U+0119, U+0119, U+011A->U+011B, U+011B, U+011C->U+011D,U+011D,U+011E->U+011F, U+011F, U+0130->U+0131, U+0131, U+0132->U+0133, U+0133,U+0134->U+0135,U+0135, U+0136->U+0137, U+0137, U+0139->U+013A, U+013A, U+013B->U+013C,U+013C,U+013D->U+013E, U+013E, U+013F->U+0140, U+0140, U+0141->U+0142, U+0142,U+0143->U+0144,U+0144, U+0145->U+0146, U+0146, U+0147->U+0148, U+0148, U+014A->U+014B,U+014B,U+014C->U+014D, U+014D, U+014E->U+014F, U+014F, U+0150->U+0151, U+0151,U+0152->U+0153,U+0153, U+0154->U+0155, U+0155, U+0156->U+0157, U+0157, U+0158->U+0159,U+0159,U+015A->U+015B, U+015B, U+015C->U+015D, U+015D, U+015E->U+015F, U+015F,U+0160->U+0161,U+0161, U+0162->U+0163, U+0163, U+0164->U+0165, U+0165, U+0166->U+0167,U+0167,U+0168->U+0169, U+0169, U+016A->U+016B, U+016B, U+016C->U+016D, U+016D,U+016E->U+016F,U+016F, U+0170->U+0171, U+0171, U+0172->U+0173, U+0173, U+0174->U+0175,U+0175,U+0176->U+0177, U+0177, U+0178->U+00FF, U+00FF, U+0179->U+017A, U+017A, U+017B->U+017C,U+017C, U+017D->U+017E, U+017E, U+0410..U+042F->U+0430..U+044F,U+0430..U+044F,U+05D0..U+05EA, U+0531..U+0556->U+0561..U+0586, U+0561..U+0587, U+0621..U+063A, U+01B9,U+01BF, U+0640..U+064A, U+0660..U+0669, U+066E, U+066F, U+0671..U+06D3, U+06F0..U+06FF,U+0904..U+0939, U+0958..U+095F, U+0960..U+0963, U+0966..U+096F, U+097B..U+097F,U+0985..U+09B9, U+09CE, U+09DC..U+09E3, U+09E6..U+09EF,U+0A05..U+0A39, U+0A59..U+0A5E,U+0A66..U+0A6F, U+0A85..U+0AB9, U+0AE0..U+0AE3,U+0AE6..U+0AEF, \
U+0B05..U+0B39,U+0B5C..U+0B61, U+0B66..U+0B6F, U+0B71, U+0B85..U+0BB9,U+0BE6..U+0BF2, U+0C05..U+0C39,U+0C66..U+0C6F, U+0C85..U+0CB9, U+0CDE..U+0CE3,U+0CE6..U+0CEF, U+0D05..U+0D39, U+0D60,U+0D61, U+0D66..U+0D6F, U+0D85..U+0DC6,U+1900..U+1938, U+1946..U+194F, U+A800..U+A805,U+A807..U+A822, U+0386->U+03B1,U+03AC->U+03B1, U+0388->U+03B5, U+03AD->U+03B5,U+0389->U+03B7, U+03AE->U+03B7,U+038A->U+03B9, U+0390->U+03B9, U+03AA->U+03B9,U+03AF->U+03B9, U+03CA->U+03B9,U+038C->U+03BF, U+03CC->U+03BF, U+038E->U+03C5,U+03AB->U+03C5, U+03B0->U+03C5,U+03CB->U+03C5, U+03CD->U+03C5, U+038F->U+03C9,U+03CE->U+03C9, U+03C2->U+03C3, U+0391..U+03A1->U+03B1..U+03C1,U+03A3..U+03A9->U+03C3..U+03C9, U+03B1..U+03C1,U+03C3..U+03C9, U+0E01..U+0E2E,U+0E30..U+0E3A, U+0E40..U+0E45, U+0E47, U+0E50..U+0E59, U+A000..U+A48F, U+4E00..U+9FBF,U+3400..U+4DBF, U+20000..U+2A6DF, U+F900..U+FAFF,U+2F800..U+2FA1F, U+2E80..U+2EFF,U+2F00..U+2FDF, U+3100..U+312F, U+31A0..U+31BF,U+3040..U+309F, U+30A0..U+30FF,U+31F0..U+31FF, U+AC00..U+D7AF, U+1100..U+11FF, U+3130..U+318F, U+A000..U+A48F,U+A490..U+A4CF
ngram_chars =U+4E00..U+9FBF, U+3400..U+4DBF, U+20000..U+2A6DF, U+F900..U+FAFF,U+2F800..U+2FA1F, U+2E80..U+2EFF, U+2F00..U+2FDF, U+3100..U+312F, U+31A0..U+31BF,U+3040..U+309F, U+30A0..U+30FF, U+31F0..U+31FF, U+AC00..U+D7AF, U+1100..U+11FF, U+3130..U+318F, U+A000..U+A48F, U+A490..U+A4CF
}
index delta: main{
source=delta_src
path =/usr/local/sphinx/delta
}
indexer
{
mem_limit = 128M
}
searchd{
listen = 9312
listen = 9306:mysql41
log = /usr/local/sphinx/log/searchd.log
query_log = /usr/local/sphinx/log/query.log
read_timeout = 5
max_children = 30
pid_file = /usr/local/sphinx/log/searchd.pid
seamless_rotate = 1
preopen_indexes = 1
unlink_old = 1
max_matches = 1000
workers = threads # for RT to work
binlog_path = /usr/local/sphinx/data
}
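The main_src and delta_src queries above read and write a sph_counter table, which has to exist in the caiban database before the first indexer run. The source does not show how it was created; the sketch below assumes the two-column layout implied by the queries (counter_id, max_doc_id), and sph_counter_ddl is a hypothetical helper name.

```shell
# Print the DDL for the counter table used by the sql_query_pre and
# sql_query statements in main_src and delta_src.
# The column types are an assumption; only the column names come from the config.
sph_counter_ddl() {
    cat <<'SQL'
CREATE TABLE IF NOT EXISTS sph_counter (
    counter_id INTEGER PRIMARY KEY NOT NULL,
    max_doc_id INTEGER NOT NULL
);
SQL
}
```

The output can be piped straight into the mysql client, for example: sph_counter_ddl | mysql -h 192.168.1.221 -u root -p caiban. Making counter_id the primary key is what lets the REPLACE INTO statement overwrite row 1 on every main rebuild instead of appending new rows.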
3. Build the indexes and start the Sphinx daemon
/usr/local/bin/indexer -c /usr/local/sphinx/sphinx.conf --all
/usr/local/bin/searchd -c /usr/local/sphinx/sphinx.conf
# Check whether the process is running
ps aux|grep searchd
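ps only shows that the process exists. Because the searchd section above also listens on 9306 with the mysql41 protocol, a query through the ordinary mysql client can confirm that it actually answers searches. This is a minimal sketch: build_match_query is a hypothetical helper name, and the keyword is inserted as-is, so it must not contain single quotes.

```shell
# Build a SphinxQL statement that runs a full-text search against one index.
# $1: index name, $2: keyword (no single quotes), $3: row limit (default 10)
build_match_query() {
    printf "SELECT id FROM %s WHERE MATCH('%s') LIMIT %s;\n" "$1" "$2" "${3:-10}"
}
```

For example: build_match_query main 'test' 5 | mysql -h 127.0.0.1 -P 9306. Use 127.0.0.1 rather than localhost so that the client connects over TCP to searchd instead of the local MySQL socket.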
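4. Use cron jobs to run shell scripts that periodically refresh the delta index (the index of newly added rows) and the main index
vim /usr/local/sphinx/build_delta_index.sh
#!/bin/sh
# Rebuild the delta index and merge it into main; --rotate swaps the new files in without stopping searchd. Output is appended to the log.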
/usr/local/bin/indexer -c /usr/local/sphinx/sphinx.conf delta --rotate >> /usr/local/sphinx/log/deltaindex.log;
/usr/local/bin/indexer --merge main delta --rotate -c /usr/local/sphinx/sphinx.conf >> /usr/local/sphinx/log/deltaindex.log
:wq
vim /usr/local/sphinx/build_main_index.sh
#!/bin/sh
# Rebuild the main index; --rotate swaps it in without stopping the running searchd
/usr/local/bin/indexer -c /usr/local/sphinx/sphinx.conf main --rotate >> /usr/local/sphinx/log/mainindex.log
:wq
crontab -e
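# Add the following entries
*/1 * * * * /bin/sh /usr/local/sphinx/build_delta_index.sh > /dev/null 2>&1
# Replace MINUTE and HOUR with the time of day the full rebuild should run
MINUTE HOUR * * * /bin/sh /usr/local/sphinx/build_main_index.sh > /dev/null 2>&1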
:wq
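Watch out, there is a pitfall here. While writing the script I typed out the commands character for character countless times, yet it never ran properly. The log kept reporting the same error: --merge is an unrecognized argument. I agonized over what could be wrong and spent a whole day without finding it. Eventually I looked at the example commands in indexer --help, copied one of them, and changed nothing except the index names at the end. The result floored me: the two were identical character for character, yet the one I typed failed and the pasted one ran perfectly. My verdict: the vim editor on Ubuntu is cursed.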
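The source does not identify the root cause. One common explanation for a typed command failing while a pasted one works is a look-alike or invisible character, for example an en dash where two hyphens were intended; that is an assumption here, not something the post confirms. A minimal sketch for checking a script for such characters (find_non_ascii is a hypothetical helper name):

```shell
# Print every line of a file, prefixed with its line number, that contains
# a byte outside printable ASCII and ordinary whitespace.
# LC_ALL=C makes grep compare raw bytes, so multi-byte characters such as
# an en dash are flagged.
find_non_ascii() {
    LC_ALL=C grep -n '[^[:print:][:space:]]' "$1"
}
```

Running find_non_ascii /usr/local/sphinx/build_delta_index.sh lists the suspicious lines. Note that it also flags comments written in non-ASCII text, so only hits inside the command lines themselves matter.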
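5. Using the sngrl plugin in Laravel (source: the sngrl\SphinxSearch project)
5.1.1 With the previous steps done, the environment needed for Sphinx indexing is basically in place. What follows is basic usage of the sngrl plugin.
Add the following to the "require" section of composer.json: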
"require": {
/*** Some others packages ***/
"sngrl/sphinxsearch": "dev-master",
},
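Run composer install or composer update. I personally recommend composer install: many of the dependencies are hosted overseas, so installing via update can take a long time and may be interrupted partway.
5.1.2 Alternatively, install directly from the Composer command line
5.2 Add the following to the "providers" array in config/app.php: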
'providers' => array(
/*** Some others providers ***/
'sngrl\SphinxSearch\SphinxSearchServiceProvider',
),
5.3 Generate the configuration file the package needs
5.4 Edit the configuration file
return array (
// Address of the local Sphinx server
'host' => '127.0.0.1',
// Port of the local Sphinx server (the native API port from the searchd "listen" setting in sphinx.conf)
'port' => 9312,
'indexes' => array (
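// my_index_name is the index name defined in sphinx.conf earlier; with my configuration above it would be main. In the array, 'table' is the database table associated with the index, and 'column' is the name of that table's id column that the ids in the search results map to.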
'my_index_name' => array ( 'table' => 'my_keywords_table', 'column' => 'id' ),
// You can also skip the table mapping entirely
//'my_index_name' => FALSE,
)
);
5.5 Basic common usage
// Don't forget to import the SphinxSearch class
use sngrl\SphinxSearch\SphinxSearch;
$sphinx = new SphinxSearch();
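// The first argument to search() is the query keyword; the second is the index name added in the config file (my_index_name)
$results = $sphinx->search('my query', 'index_name')->query(); // returns the raw Sphinx result
$results = $sphinx->search('my query', 'index_name')->get(); // returns the wrapped result array
// Search for the keyword in a specific field (returns the raw Sphinx result array), with a pagination limit
$sphinx->limit($limit, ($page - 1) * $limit);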
$result=$sphinx->search('@title "my query"','index_name')->query();