
Connecting Spark Thrift Server to Elasticsearch

1. Place elasticsearch-hadoop-2.1.0.Beta4.jar in /usr/local/spark/lib. It can be downloaded from: https://www.elastic.co/products/hadoop/

2. Add the required configuration to hive-site.xml in /usr/local/spark/conf

[Image: hive-site.xml configuration]

3. Start the Thrift Server, passing this jar after --jars

./start-thriftserver.sh --master local --driver-class-path /usr/local/spark/postgresql-9.4-1201.jdbc41.jar --jars /usr/local/spark/lib/elasticsearch-hadoop-2.1.0.Beta4.jar

4. Create the artists table, mapping it to the Elasticsearch index named default and the type named artists

[Image: creating the artists table]
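As a rough sketch of step 4, the table could be declared through the Spark SQL data source that ships with elasticsearch-hadoop. The exact statement used in the original is not known, so the DDL below is an assumption; the file name create_artists.sql and the JDBC URL jdbc:hive2://localhost:10000 (the Thrift Server default) are also assumptions. A Hive-style definition using STORED BY 'org.elasticsearch.hadoop.hive.EsStorageHandler' may be what your setup needs instead.

```shell
#!/bin/sh
# Hypothetical file name; any writable path works.
SQL_FILE="create_artists.sql"

# Assumed DDL: maps the Spark SQL table "artists" to the Elasticsearch
# resource default/artists (index "default", type "artists").
cat > "$SQL_FILE" <<'EOF'
CREATE TEMPORARY TABLE artists
USING org.elasticsearch.spark.sql
OPTIONS (resource 'default/artists');
EOF

# Submit the statement only when beeline is on the PATH.
# The JDBC URL is an assumption (Thrift Server default host and port).
if command -v beeline >/dev/null 2>&1; then
  beeline -u jdbc:hive2://localhost:10000 -f "$SQL_FILE"
else
  echo "beeline not found; statement saved to $SQL_FILE"
fi
```

Keeping the statement in a file makes it easy to re-run after the Thrift Server restarts, since a temporary table does not survive a restart.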

5. The data can be viewed by querying the Elasticsearch REST API with curl:

[Image: curl output from the Elasticsearch REST API]
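A request along these lines would list the documents behind the table. It is only a sketch: the host and port (localhost:9200, the Elasticsearch default) are assumptions, and the ES_URL variable is introduced here for illustration.

```shell
#!/bin/sh
# Assumed endpoint: Elasticsearch on its default HTTP port.
# Override by exporting ES_URL before running.
ES_URL="${ES_URL:-http://localhost:9200}"

# index "default", type "artists"; _search without a body returns
# the first matching documents, and ?pretty formats the JSON.
SEARCH_URL="$ES_URL/default/artists/_search?pretty"

echo "GET $SEARCH_URL"
curl -s --max-time 5 "$SEARCH_URL" || echo "Elasticsearch not reachable at $ES_URL"
```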

6. This table can be queried together with existing tables in a join

[Image: result of the join query]
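For illustration, a join of this kind might look like the following. The original does not name the second table or any columns, so albums, artist_id, title, id and name are all hypothetical placeholders, and the JDBC URL is again the assumed Thrift Server default.

```shell
#!/bin/sh
# Hypothetical schema: "albums" stands in for any table already known to the
# Thrift Server; id, name, artist_id and title are placeholder column names.
QUERY="SELECT a.name, b.title FROM artists a JOIN albums b ON a.id = b.artist_id"

# Run the query when beeline is available; otherwise just print it.
if command -v beeline >/dev/null 2>&1; then
  beeline -u jdbc:hive2://localhost:10000 -e "$QUERY"
else
  echo "$QUERY"
fi
```

Spark SQL executes the join, so the Elasticsearch-backed table behaves like any other table in the query.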

This approach neatly works around the fact that Elasticsearch's own API makes multi-table join queries difficult.
