Syncing MySQL to Elasticsearch with elasticsearch-jdbc: An In-Depth Guide

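1. How do you sync data between MySQL and Elasticsearch?

Converting rows to JSON one at a time is clearly impractical, so you need either a third-party tool or your own implementation. The core requirement is to keep insert, delete, update, and query operations in sync.

2. What methods are there for syncing MySQL with Elasticsearch, and how do they compare?

The better-known tools in this area are:

1) elasticsearch-jdbc. Strictly speaking it is no longer a plugin; it has become a standalone third-party tool.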

https://github.com/jprante/elasticsearch-jdbc

2) The elasticsearch-river-mysql plugin

https://github.com/scharron/elasticsearch-river-mysql

3) go-mysql-elasticsearch (by the Chinese developer siddontang)

https://github.com/siddontang/go-mysql-elasticsearch

Comparison of tools 1 to 3:

go-mysql-elasticsearch is still in an unstable stage of development.

Why choose elasticsearch-jdbc over the elasticsearch-river-mysql plugin? (Reference: http://stackoverflow.com/questions/23658534/using-elasticsearch-river-mysql-to-stream-data-from-mysql-database-to-elasticsea)

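1) Generality: elasticsearch-jdbc is the more general tool, since it works with any database that provides a JDBC driver.

2) Release activity: elasticsearch-jdbc is very active on GitHub. The latest version, 2.3.3.0 (May 28, 2016), is compatible with Elasticsearch 2.3.3.

elasticsearch-river-mysql, by contrast, has not been updated since December 13, 2012.

For these reasons, elasticsearch-jdbc is the natural choice for syncing MySQL to Elasticsearch.

Drawbacks and limitations of elasticsearch-jdbc (as reported by others):

1) siddontang, the author of go-mysql-elasticsearch, writes on his blog:

"elasticsearch-river-jdbc is very powerful, but it does not handle incremental data updates well: it requires the source table to be append-only, which is almost never the case in a real project."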

http://www.jianshu.com/p/05cff717563c

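2) The blogger leotse90 points out another drawback of elasticsearch-jdbc in a post: delete operations (physical deletes) are not synced!

http://leotse90.com/2015/11/11/ElasticSearch與MySQL資料同步以及修改表結構/

As of June 16, 2016 I have not tested this myself, so I will not comment further.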

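3. How do you use elasticsearch-jdbc? Does it need to be installed?

3.1 Differences from earlier versions

elasticsearch-jdbc V2.3.2.0 does not need to be installed. The tests below were also run against Elasticsearch 2.3.2.

Operating system: CentOS release 6.6 (Final)

At this point you may wonder how the earlier versions differ. Quite a lot. From the material I have gathered, the differences are:

1) The early 1.x versions were plugins and had to be installed.

2) The configuration is also different.

3.2 Using elasticsearch-jdbc (sync method one)

Prerequisites:

1) Elasticsearch 2.3.2 is installed and tested.

2) MySQL is installed and supports insert, delete, update, and select.

The database available for testing is test and the table is cc, with the following contents: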

mysql> select * from cc;
+----+------------+
| id | name       |
+----+------------+
|  1 | laoyang    |
|  2 | dluzhang   |
|  3 | dlulaoyang |
+----+------------+
3 rows in set (0.00 sec)

Step 1: Download the tool.

URL:

http://xbib.org/repository/org/xbib/elasticsearch/importer/elasticsearch-jdbc/2.3.2.0/elasticsearch-jdbc-2.3.2.0-dist.zip

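Step 2: Copy the archive to the CentOS machine. The location is up to you; I put it in the root directory and unpacked it there: unzip elasticsearch-jdbc-2.3.2.0-dist.zip

Step 3: Set the environment variable.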

[root@5b9dbaaa148a /]# vi /etc/profile

export JDBC_IMPORTER_HOME=/elasticsearch-jdbc-2.3.2.0

Apply the environment variable:

[root@5b9dbaaa148a /]# source /etc/profile
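Before moving on, it can help to confirm that the variable really points at a complete unpacked distribution. The following is a minimal sketch and not part of elasticsearch-jdbc itself: the function name check_importer_home is made up for illustration, and it assumes only the layout that the import script in this article relies on (a lib directory of jars and bin/log4j2.xml).

```shell
#!/bin/sh
# check_importer_home [DIR]
# Hypothetical helper: verifies that DIR (default: $JDBC_IMPORTER_HOME)
# looks like an unpacked elasticsearch-jdbc distribution, i.e. it contains
# the lib/ directory and the bin/log4j2.xml file used by the import script.
check_importer_home() {
  home=${1:-$JDBC_IMPORTER_HOME}
  if [ -z "$home" ]; then
    echo "JDBC_IMPORTER_HOME is not set" >&2
    return 1
  fi
  if [ ! -d "$home/lib" ]; then
    echo "missing directory: $home/lib" >&2
    return 1
  fi
  if [ ! -f "$home/bin/log4j2.xml" ]; then
    echo "missing file: $home/bin/log4j2.xml" >&2
    return 1
  fi
  echo "ok: $home"
}
```

After sourcing /etc/profile, calling check_importer_home with no argument checks the exported path; a non-zero exit status means the import script would fail to find its classpath or logging configuration.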

Step 4: Configure and run.

1) Create a folder named odbc_es under the root directory, as follows:

[root@5b9dbaaa148a /]# ll /odbc_es/

drwxr-xr-x 2 root root 4096 Jun 16 03:11 logs

-rwxrwxrwx 1 root root 542 Jun 16 04:03 mysql_import_es.sh

2) Create the script mysql_import_es.sh with the following contents:

[root@5b9dbaaa148a odbc_es]# cat mysql_import_es.sh

#!/bin/sh
# Settings to adapt in the JSON below:
#   elasticsearch.cluster  cluster name; see /usr/local/elasticsearch/config/elasticsearch.yml
#   url                    MySQL database address
#   user / password        MySQL user name and password
#   index / type           the new index and type to create
bin=$JDBC_IMPORTER_HOME/bin
lib=$JDBC_IMPORTER_HOME/lib
echo '{
  "type" : "jdbc",
  "jdbc" : {
    "elasticsearch.autodiscover" : true,
    "elasticsearch.cluster" : "my-application",
    "url" : "jdbc:mysql://10.8.5.101:3306/test",
    "user" : "root",
    "password" : "123456",
    "sql" : "select * from cc",
    "elasticsearch" : {
      "host" : "10.8.5.101",
      "port" : 9300
    },
    "index" : "myindex",
    "type" : "mytype"
  }
}' | java \
  -cp "${lib}/*" \
  -Dlog4j.configurationFile=${bin}/log4j2.xml \
  org.xbib.tools.Runner \
  org.xbib.tools.JDBCImporter

3) Make mysql_import_es.sh executable.

[root@5b9dbaaa148a odbc_es]# chmod a+x mysql_import_es.sh

4) Run the script mysql_import_es.sh

[root@5b9dbaaa148a odbc_es]# ./mysql_import_es.sh

Step 5: Verify that the sync succeeded.

Query the data through Elasticsearch search:

[root@5b9dbaaa148a odbc_es]# curl -XGET 'http://10.8.5.101:9200/myindex/mytype/_search?pretty'

{
  "took" : 4,
  "timed_out" : false,
  "_shards" : {
    "total" : 8,
    "successful" : 8,
    "failed" : 0
  },
  "hits" : {
    "total" : 3,
    "max_score" : 1.0,
    "hits" : [ {
      "_index" : "myindex",
      "_type" : "mytype",
      "_id" : "AVVXKgeEun6ksbtikOWH",
      "_score" : 1.0,
      "_source" : {
        "id" : 1,
        "name" : "laoyang"
      }
    }, {
      "_index" : "myindex",
      "_type" : "mytype",
      "_id" : "AVVXKgeEun6ksbtikOWI",
      "_score" : 1.0,
      "_source" : {
        "id" : 2,
        "name" : "dluzhang"
      }
    }, {
      "_index" : "myindex",
      "_type" : "mytype",
      "_id" : "AVVXKgeEun6ksbtikOWJ",
      "_score" : 1.0,
      "_source" : {
        "id" : 3,
        "name" : "dlulaoyang"
      }
    } ]
  }
}

If the response contains the MySQL column data as shown above, the sync succeeded.
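For tables too large to inspect by eye, comparing row counts is a quicker check. The sketch below is only an illustration under stated assumptions: the helper names extract_count and counts_match are made up, the sed pattern is a rough stand-in for a real JSON parser, and the commented curl and mysql commands reuse the host, credentials, index, and table from this article.

```shell
#!/bin/sh
# extract_count: reads an Elasticsearch _count response on stdin and prints
# the value of its "count" field. A sed pattern is not a JSON parser, but it
# is adequate for the small _count response.
extract_count() {
  sed -n 's/.*"count"[[:space:]]*:[[:space:]]*\([0-9][0-9]*\).*/\1/p' | head -n 1
}

# counts_match ES_COUNT MYSQL_COUNT: succeeds when both numbers are equal.
counts_match() {
  [ "$1" -eq "$2" ]
}

# Example use against the article's hosts (not executed here):
#   es=$(curl -s 'http://10.8.5.101:9200/myindex/mytype/_count' | extract_count)
#   db=$(mysql -N -B -h10.8.5.101 -uroot -p123456 test -e 'select count(*) from cc')
#   counts_match "$es" "$db" && echo "in sync" || echo "row counts differ"
```

Equal counts do not prove that every row matches, but a mismatch is a cheap early signal, for example when the import has been run twice and the auto-generated _id values have produced duplicate documents.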

4. elasticsearch-jdbc sync method two

[root@5b9dbaaa148a odbc_es]# cat mysql_import_es_simple.sh

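#!/bin/sh
java \
  -cp "$JDBC_IMPORTER_HOME/lib/*" \
  -Dlog4j.configurationFile=$JDBC_IMPORTER_HOME/bin/log4j2.xml \
  org.xbib.tools.Runner \
  org.xbib.tools.JDBCImporter statefile.json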

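[root@5b9dbaaa148a odbc_es]# cat statefile.json
{ "type" : "jdbc", "jdbc" : {
  "elasticsearch.autodiscover" : true,
  "elasticsearch.cluster" : "my-application",
  "url" : "jdbc:mysql://10.8.5.101:3306/test",
  "user" : "root",
  "password" : "123456",
  "sql" : "select * from cc",
  "elasticsearch" : { "host" : "10.8.5.101", "port" : 9300 },
  "index" : "myindex_2",
  "type" : "mytype_2"
} }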

Here the script and the JSON file are kept separate; the script loads the JSON file when it runs.

To run it, simply execute ./mysql_import_es_simple.sh.
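Keeping the definition in its own file also makes it easier to experiment with the incremental-update limitation quoted in section 2. The sketch below prints a variant definition that re-selects only rows changed since the previous run. It rests on several assumptions that should be checked against the README of the version in use: the schedule, statefile, and metrics settings, the $metrics.lastexecutionstart parameter, and the `_id` column alias are used as I understand them from the elasticsearch-jdbc README; the updated_at column is hypothetical (the cc table in this article has only id and name); and the file name cc_state.json and the function name incremental_definition are made up.

```shell
#!/bin/sh
# incremental_definition: prints an importer definition that selects only
# rows whose (hypothetical) updated_at column is newer than the start of the
# previous run. Aliasing id to _id makes the row id the document id, so a
# re-imported row overwrites its old document instead of adding a new one.
incremental_definition() {
  # The quoted heredoc delimiter stops the shell from expanding $metrics.
  cat <<'EOF'
{
  "type" : "jdbc",
  "jdbc" : {
    "elasticsearch.autodiscover" : true,
    "elasticsearch.cluster" : "my-application",
    "url" : "jdbc:mysql://10.8.5.101:3306/test",
    "user" : "root",
    "password" : "123456",
    "schedule" : "0 0-59 0-23 ? * *",
    "statefile" : "cc_state.json",
    "metrics" : { "enabled" : true },
    "sql" : [ {
      "statement" : "select id as _id, id, name from cc where updated_at > ?",
      "parameter" : [ "$metrics.lastexecutionstart" ]
    } ],
    "elasticsearch" : { "host" : "10.8.5.101", "port" : 9300 },
    "index" : "myindex",
    "type" : "mytype"
  }
}
EOF
}
```

Writing the output to a file (for example incremental_definition > cc_incremental.json) and passing that file to org.xbib.tools.JDBCImporter in place of statefile.json would be the intended use. Even if this works as assumed, it only picks up inserted and updated rows; a row that is physically deleted from MySQL never appears in the result set, which is consistent with the delete limitation reported above.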

5. Equivalent queries in MySQL and Elasticsearch

Goal: look up the name of the row with id=3 in table cc.

1) SQL query in MySQL:

mysql> select * from cc where id=3;
+----+------------+
| id | name       |
+----+------------+
|  3 | dlulaoyang |
+----+------------+
1 row in set (0.00 sec)

2) Elasticsearch search:

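[root@5b9dbaaa148a odbc_es]# curl 'http://10.8.5.101:9200/myindex/mytype/_search?pretty' -d '
{
  "filter" : { "term" : { "id" : "3" } }
}'
{
  "took" : 3,
  "timed_out" : false,
  "_shards" : {
    "total" : 8,
    "successful" : 8,
    "failed" : 0
  },
  "hits" : {
    "total" : 1,
    "max_score" : 1.0,
    "hits" : [ {
      "_index" : "myindex",
      "_type" : "mytype",
      "_id" : "AVVXKgeEun6ksbtikOWJ",
      "_score" : 1.0,
      "_source" : {
        "id" : 3,
        "name" : "dlulaoyang"
      }
    } ]
  }
}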

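The mapping from a SQL WHERE clause to a search body can be wrapped in a small helper. This is a sketch rather than the article's method: the function name term_query_body is made up, and instead of the top-level filter used above it emits the bool query with a filter clause that Elasticsearch 2.x also accepts. The value is inserted unquoted, so it only suits numeric columns such as id.

```shell
#!/bin/sh
# term_query_body FIELD VALUE
# Prints a search body that keeps only documents where FIELD equals VALUE,
# the rough equivalent of `WHERE field = value` in SQL. VALUE is written
# unquoted, so this sketch is limited to numeric values.
term_query_body() {
  printf '{"query":{"bool":{"filter":{"term":{"%s":%s}}}}}\n' "$1" "$2"
}

# Example use against the article's index (not executed here):
#   curl -XGET 'http://10.8.5.101:9200/myindex/mytype/_search?pretty' \
#        -d "$(term_query_body id 3)"
```

With this form, `select * from cc where id=3` corresponds to term_query_body id 3.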
Common errors:

Error log location: /odbc_es/logs

Log contents:

[root@5b9dbaaa148a logs]# tail -f jdbc.log

[04:03:39,570][INFO ][org.xbib.elasticsearch.helper.client.BaseTransportClient][pool-3-thread-1] after auto-discovery connected to [{5b9dbaaa148a}{aksn2ErNRlWjUECnp_8JmA}{10.8.5.101}{10.8.5.101:9300}{master=true}]

Bug 1: [02:46:23,894][ERROR][importer.jdbc ][pool-3-thread-1] error while processing request: cluster state is RED and not YELLOW, from here on, everything will fail!

Cause:

You created an index with replicas but you had only one node in the cluster. One way to solve this problem is by allocating them on a second node. Another way is by turning replicas off.

Solutions:

Option 1: add a second node so that the replicas can be allocated to it.

Option 2: turn replicas off (very practical), as follows:

curl -XPUT 'localhost:9200/_settings' -d '
{
  "index" : {
    "number_of_replicas" : 0
  }
}'
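To see which state the cluster is actually in before and after changing the replica count, the cluster health API can be queried. The helper below is a small sketch: the name extract_status is made up, and the sed pattern is a rough substitute for a JSON parser.

```shell
#!/bin/sh
# extract_status: reads a _cluster/health response on stdin and prints the
# value of its "status" field (green, yellow or red).
extract_status() {
  sed -n 's/.*"status"[[:space:]]*:[[:space:]]*"\([a-z]*\)".*/\1/p' | head -n 1
}

# Example use (not executed here):
#   curl -s 'localhost:9200/_cluster/health' | extract_status
```

Note that unassigned replicas on their own normally leave a cluster yellow, while red means at least one primary shard is unassigned. If the status is still red after setting number_of_replicas to 0, the cause probably lies elsewhere.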

Bug 2: [13:00:37,137][ERROR][importer.jdbc ][pool-3-thread-1] error while processing request: no cluster nodes available, check settings {autodiscover=false, client.transport.ignore_cluster_name=false, client.transport.nodes_sampler_interval=5s, client.transport.ping_timeout=5s, cluster.name=elasticsearch,

org.elasticsearch.client.transport.NoNodeAvailableException: no cluster nodes available, check

Fix: add the following line to the script above:

"elasticsearch.cluster":"my-application",

The cluster name must match the one in /usr/local/elasticsearch/config/elasticsearch.yml.
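The name to copy into the script can be read straight from the configuration file. The following sketch assumes the file contains a plain, uncommented `cluster.name: value` line; the function name configured_cluster_name is made up for illustration.

```shell
#!/bin/sh
# configured_cluster_name FILE
# Prints the value of the first uncommented `cluster.name:` line in an
# elasticsearch.yml file, with trailing whitespace removed.
configured_cluster_name() {
  sed -n 's/^[[:space:]]*cluster\.name[[:space:]]*:[[:space:]]*//p' "$1" |
    sed 's/[[:space:]]*$//' |
    head -n 1
}

# Example use (not executed here):
#   configured_cluster_name /usr/local/elasticsearch/config/elasticsearch.yml
```

If the function prints nothing, the setting is commented out and the node is using the default name, which is the cluster.name=elasticsearch value that appears in the error message above.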

References:

http://stackoverflow.com/questions/11944915/getting-an-elasticsearch-cluster-to-green-cluster-setup-on-os-x
