
Elasticsearch at Work: Fuzzy Search (and Why ES Discourages It)

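【Introduction】

Elasticsearch is a search server built on Lucene: a real-time, distributed search and analytics engine that makes it possible to process big data at speeds that were previously out of reach. It is written in Java and uses Lucene at its core for all indexing and search functionality, but its goal is to hide Lucene's complexity behind a simple RESTful API and so make full-text search easy.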

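【Features】

1. A distributed, real-time document store in which every field is indexed and searchable

2. A distributed, real-time analytics and search engine

3. Scales to hundreds of servers and handles petabytes of structured or unstructured data

4. Document-oriented

Feature 4 makes it worth comparing the structure of a relational database with that of ES: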

Relational DB -> Databases -> Tables -> Rows -> Columns

Elasticsearch -> Indices -> Types -> Documents -> Fields

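【Request Format】

VERB: the HTTP method, one of GET, POST, PUT, HEAD, DELETE

1. PROTOCOL: http or https (https is available only when an https proxy sits in front of Elasticsearch)

2. HOST: the host name of any node in the Elasticsearch cluster, or localhost for a node on the local machine

3. PORT: the port of the Elasticsearch HTTP service, 9200 by default

4. QUERY_STRING: optional query parameters, for example ?pretty, which makes the response return more readable JSON

5. BODY: a JSON request body (if the request needs one)

I won't go into more detail pitching ES here...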
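The request parts listed above can be assembled into a URL with the JDK alone. The following is a minimal sketch, not part of the original project: the class name `EsRequestUrl` is hypothetical, and the path argument (for example `/_cluster/health`, the cluster health endpoint) is an assumption, since the list above does not name a PATH component.

```java
import java.net.URI;
import java.net.URISyntaxException;

// Hypothetical helper: combines PROTOCOL, HOST, PORT, a path and QUERY_STRING into one URI.
class EsRequestUrl {
    static URI build(String protocol, String host, int port, String path, String queryString) {
        try {
            // userInfo and fragment are not used by Elasticsearch requests, so they stay null
            return new URI(protocol, null, host, port, path, queryString, null);
        } catch (URISyntaxException e) {
            throw new IllegalArgumentException("Invalid request parts: " + e.getMessage(), e);
        }
    }
}
```

For instance, `EsRequestUrl.build("http", "localhost", 9200, "/_cluster/health", "pretty")` yields a URL that could be requested with GET; the BODY, when needed, is sent separately as the JSON payload.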

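【Requirement Scenario】

The project has a fuzzy-query feature, and the data volume is very large (80 million rows in a single table, growing by 500,000 rows per day). Frankly, this is a big pit left behind by the early architecture. MySQL fuzzy queries are slow to begin with, and with this much data the query performance is nothing to be proud of. Unsurprisingly, this glorious task landed on me, and integrating ES and splitting the data was inevitable.

In any case, I stepped into quite a few pits along the way...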

【Implementation】

1: Kibana view query

2: Dependencies (make sure the versions match)

dependencies {

compile('org.springframework.boot:spring-boot-starter-data-elasticsearch')

compile('org.springframework.data:spring-data-elasticsearch')

compile('io.searchbox:jest:5.3.3')

compile('com.sun.jna:jna:3.0.9')

}

3: Configuration

#es

# Local environment

#spring.elasticsearch.jest.uris=http://192.168.90.201:9200

# Test server environment

spring.elasticsearch.jest.uris=http://estest.data.autohome.com.cn:80

spring.elasticsearch.jest.read-timeout=10000

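4: Utility classes

4.1 ESDoc document interface

public interface ESDoc {

    // Custom document ID. The implementing class must add a document-id field annotated with @JestId;
    // if this returns null, the ID generated by ES is used.
    String getDocId();

    // Name of the index the document belongs to, usually XX-XXX-yyyy.MM.dd
    String getIndex();

    // ES allows each index to contain documents of several types
    String getType();

}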

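4.2 ESHits.java

@Data
@NoArgsConstructor
public class ESHits {

    private Integer total;

    @SerializedName(value = "maxScore", alternate = "max_score")
    private Double maxScore;

    // Each hit is kept as a generic JSON object (element type assumed to be Map<String, Object>)
    private List<Map<String, Object>> hits;

    // Getters and setters are generated by @Data

}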

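4.3 ESShards.java

@Data
@NoArgsConstructor
public class ESShards {

    private Integer total;

    private Integer successful;

    private Integer skipped;

    private Integer failed;

    // Each failure is kept as a generic JSON object (element type assumed to be Map<String, Object>)
    private List<Map<String, Object>> failures;

    // Getters and setters are generated by @Data

}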

4.4 Pagination.java

@Data
public class Pagination<T> {

    private int totalSize;

    private List<T> list; // data list

    private int pageIndex;

    private int pageSize;

    public Pagination() {}

    public Pagination(int totalSize, List<T> list, int pageIndex, int pageSize) {
        this.totalSize = totalSize;
        this.list = list;
        this.pageIndex = pageIndex;
        this.pageSize = pageSize;
    }

    // Getters and setters are generated by @Data

}

4.5 ElasticSearchResult.java

@Data
@NoArgsConstructor
public class ElasticSearchResult {

    private Integer took;

    // Elasticsearch returns this flag as "timed_out"
    @SerializedName(value = "timeOut", alternate = {"timed_out", "time_out"})
    private Boolean timeOut;

    @SerializedName(value = "shards", alternate = {"_shards"})
    private ESShards shards;

    private ESHits hits;

    // Getters and setters are generated by @Data

}

4.6 ES business entity class ESNlpInfoDoc.java

public class ESNlpInfoDoc implements ESDoc {

@JestId

private String docId;

private String username;

private String title;

private String content;

private Long ctime;

public ESNlpInfoDoc(){}

public ESNlpInfoDoc(String docId,String username,String title,String content){

this.docId=docId;

this.username=username;

this.title=title;

this.content=content;

}

public ESNlpInfoDoc(String docId,String username,String title,String content,Long ctime){

this.docId=docId;

this.username=username;

this.title=title;

this.content=content;

this.ctime=ctime;

}

@Override

public String getDocId() {

return docId;

}

@Override

public String getIndex() {

//return String.format(ESConstants.INDEX_FORMAT, ESConstants.INDEX_TYPE_MOXIE, this.date);

return ESConstants.INDEX_NAME_NLP;

}

@Override

public String getType() {

return ESConstants.INDEX_TYPE_NLPINFO;

}

// getters and setters

}

4.7 ES constants ESConstants.java

public class ESConstants {

    public static String INDEX_FORMAT = "loan-%s-%s"; // loan-debt-yyyy.MM.dd

    public static String INDEX_NAME_NLP = "nlpindex_"; // overall index

    public static String INDEX_NAME_NLP_NEWLY = "nlpindex_newly"; // data from the last week (corresponds to the info table, excluding the his table)

    public static String INDEX_NAME_NLP_ALL_CLM = "nlpindex"; // overall index

    public static String INDEX_TYPE_NLPINFO = "nlpinfo";

    public static String INDEX_TYPE_NLPLOG = "nlplog";

    public static String INDEX_TYPE_NLPDICT = "nlpdict";

    public static String INDEX_TYPE_NLPUSER = "nlpuser";

    public static Long BATCH_FAILURE_ID = -1L; // document ID recorded when a whole batch fails, to tell it apart from an ordinary document failure

    public static Integer INDEX_FAILURE = 1; // a scheduled task reprocesses failed data and backfills it into the index

}
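The `INDEX_FORMAT` constant and its `loan-debt-yyyy.MM.dd` comment suggest date-stamped index names. Here is a minimal sketch of how such a name might be produced; the class `IndexNames`, the method `dailyIndex`, and the category value "debt" are illustrative assumptions rather than code from the project.

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;

// Hypothetical helper that fills the "loan-%s-%s" pattern with a category and a day.
class IndexNames {
    // Same pattern as ESConstants.INDEX_FORMAT, copied here so the sketch is self-contained
    static final String INDEX_FORMAT = "loan-%s-%s";

    private static final DateTimeFormatter DAY = DateTimeFormatter.ofPattern("yyyy.MM.dd");

    static String dailyIndex(String category, LocalDate day) {
        return String.format(INDEX_FORMAT, category, DAY.format(day));
    }
}
```

A document class could return such a value from `getIndex()` if one index per day were wanted; the listing in this article returns a fixed index name instead.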

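4.8 Jest utility class JestService.java

import java.io.IOException;

import java.util.ArrayList;

import java.util.List;

import com.google.gson.Gson;

import io.searchbox.client.JestClient;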

import io.searchbox.client.JestResult;

import io.searchbox.cluster.Health;

import io.searchbox.cluster.NodesStats;

import io.searchbox.core.*;

import io.searchbox.core.search.sort.Sort;

import io.searchbox.indices.ClearCache;

import io.searchbox.indices.CreateIndex;

import io.searchbox.indices.CreateIndex.Builder;

import io.searchbox.indices.IndicesExists;

import io.searchbox.indices.Optimize;

import io.searchbox.indices.mapping.PutMapping;

import io.searchbox.params.Parameters;

import org.apache.commons.lang3.StringUtils;

import org.elasticsearch.action.search.SearchRequestBuilder;

import org.elasticsearch.common.settings.Settings;

import org.elasticsearch.index.query.*;

import org.elasticsearch.search.builder.SearchSourceBuilder;

import org.slf4j.Logger;

import org.slf4j.LoggerFactory;

import org.springframework.beans.factory.annotation.Autowired;

import org.springframework.stereotype.Service;

@Service

public class JestService {

private Logger logger = LoggerFactory.getLogger(getClass());

@Autowired

private JestClient jestClient;

private final static String SCROLL = "1m";

private final static int BATCH_SIZE = 5000;

private final static int MAX_SIZE = 10000;

public JestResult health() throws IOException {

Health health = new Health.Builder().build();

JestResult result = jestClient.execute(health);

return result;

}

public JestResult nodesStats() throws IOException {

NodesStats nodesStats = new NodesStats.Builder().build();

JestResult result = jestClient.execute(nodesStats);

return result;

}

public JestResult createIndex(String indexName, String settings) throws IOException {

Builder builder = new Builder(indexName);

if(StringUtils.isNotBlank(settings)){

builder.settings(Settings.builder().loadFromSource(settings));

}

CreateIndex createIndex = builder.build();

JestResult result = jestClient.execute(createIndex);

return result;

}

public JestResult putMapping(String indexName, String type, String mappings)

throws IOException {

PutMapping putMapping = new PutMapping.Builder(indexName, type, mappings).build();

JestResult result = jestClient.execute(putMapping);

return result;

}

public JestResult isExists(String indexName) throws IOException {

IndicesExists indicesExists = new IndicesExists.Builder(indexName).build();

JestResult result = jestClient.execute(indicesExists);

return result;

}

public JestResult optimizeIndex() {

Optimize optimize = new Optimize.Builder().build();

JestResult result = null ;

try {

result = jestClient.execute(optimize);

} catch (IOException e) {

e.printStackTrace();

}

return result ;

}

public JestResult clearCache() {

ClearCache closeIndex = new ClearCache.Builder().build();

JestResult result = null ;

try {

result = jestClient.execute(closeIndex);

} catch (IOException e) {

e.printStackTrace();

}

return result ;

}

public JestResult insertDocument(String indexName, String type, ESDoc doc) throws IOException {

Index index = new Index.Builder(doc).index(indexName).type(type).build();

JestResult result = jestClient.execute(index);

return result;

}

public JestResult bulkIndex(String indexName, String type, List<? extends ESDoc> docs)

throws IOException {

Bulk.Builder builder = new Bulk.Builder().defaultIndex(indexName).defaultType(type);

for(ESDoc doc: docs){

builder.addAction(new Index.Builder(doc).build());

}

JestResult result = jestClient.execute(builder.build());

return result;

}

public JestResult bulkIndex(List<? extends ESDoc> docs)

throws IOException {

Bulk.Builder builder = new Bulk.Builder();

for(ESDoc doc: docs){

builder.addAction(new Index.Builder(doc).index(doc.getIndex()).type(doc.getType()).build());

}

JestResult result = jestClient.execute(builder.build());

return result;

}

public JestResult deleteDocument(String indexName, String type, String id) throws Exception {

Delete delete = new Delete.Builder(id).index(indexName).type(type).build();

JestResult result = jestClient.execute(delete);

return result;

}

public JestResult updateDocument(String indexName, String type, ESDoc doc)

throws IOException {

Update update = new Update.Builder(doc).index(indexName).type(type).id(doc.getDocId()).build();

JestResult result = jestClient.execute(update);

return result;

}

public JestResult getDocument(String indexName, String type, String docId) throws IOException {

Get get = new Get.Builder(indexName, docId).type(type).build();

JestResult result = jestClient.execute(get);

return result;

}

public SearchResult simpleSearch(String indexName, String type, String query)

throws IOException {

Search search = new Search.Builder(query)

// multiple index or types can be added.

.addIndex(indexName)

.addType(type)

.build();

SearchResult result = jestClient.execute(search);

return result;

}

public SearchResult searchAll ( String indexName , String type ) throws IOException {

SearchSourceBuilder searchSourceBuilder = new SearchSourceBuilder();

searchSourceBuilder.query(QueryBuilders.matchAllQuery());

Search search = new Search.Builder(

searchSourceBuilder.toString())

.addIndex(indexName)

.addType(type).build();

SearchResult result = jestClient.execute(search);

return result;

}

public SearchResult searchAllDesc ( String indexName , String type , String sortField ) throws IOException {

SearchSourceBuilder searchSourceBuilder = new SearchSourceBuilder();

searchSourceBuilder.query(QueryBuilders.matchAllQuery());

Search search = new Search.Builder(

searchSourceBuilder.toString()).addSort(new Sort(sortField,Sort.Sorting.DESC))

.addIndex(indexName)

.addType(type).build();

SearchResult result = jestClient.execute(search);

return result;

}

public SearchResult searchAllAsc ( String indexName , String type , String sortField ) throws IOException {

SearchSourceBuilder searchSourceBuilder = new SearchSourceBuilder();

searchSourceBuilder.query(QueryBuilders.matchAllQuery());

Search search = new Search.Builder(

searchSourceBuilder.toString()).addSort(new Sort(sortField,Sort.Sorting.ASC))

.addIndex(indexName)

.addType(type).build();

SearchResult result = jestClient.execute(search);

return result;

}

public SearchResult searchMax ( String indexName , String type , String field ) throws IOException {

// Max aggregation over the given field
String query = "{\n" +
    " \"size\": 0,\n" +
    " \"aggs\": {\n" +
    " \"maxCtime\": {\n" +
    " \"max\": {\n" +
    " \"field\": \"" + field + "\"\n" +
    " }\n" +
    " }\n" +
    " }\n" +
    " }";

Search search = new Search.Builder(query)

.addIndex(indexName)

.addType(type)

.build();

SearchResult result = jestClient.execute(search);

return result;

}

public SearchResult searchInfoByField(String indexName , String type ,String field,Object keyword) throws Exception{

SearchSourceBuilder searchSourceBuilder = new SearchSourceBuilder();

// QueryBuilder queryBuilder = QueryBuilders.termQuery(field+".keyword", keyword); // exact match on a single value

QueryBuilder queryBuilder = QueryBuilders.termQuery(field, keyword); // exact match on a single value

searchSourceBuilder.query(queryBuilder).size(MAX_SIZE);

String query = searchSourceBuilder.toString();

System.out.println(query);

Search search = new Search.Builder(query)

.addIndex(indexName)

.addType(type)

.build();

SearchResult result = jestClient.execute(search);

return result;

}

public SearchResult blurSearch ( String indexName , String type , String field , String keyWord,Long startTime,Long endTime) throws IOException {

// Approach 5: build the query body (the API query mirrors the JSON sent from the Kibana view tool)

SearchSourceBuilder searchSourceBuilder = new SearchSourceBuilder();

// Time filter

QueryBuilder timeQuery = QueryBuilders.rangeQuery("ctime").from(startTime).to(endTime);

// Text filter

QueryBuilder contentBuilder = QueryBuilders.wildcardQuery(field, "*"+keyWord+"*");

QueryBuilder boolQuery = QueryBuilders.boolQuery().must(timeQuery).must(contentBuilder);

searchSourceBuilder.query(boolQuery).size(MAX_SIZE);

String query = searchSourceBuilder.toString();

System.out.println(query);

Search search = new Search.Builder(query)

.addIndex(indexName)

.addType(type)

.build();

SearchResult result = jestClient.execute(search);

return result;

}

public SearchResult getlteByCtime ( String indexName , String type ,Long startTime) throws IOException {

SearchSourceBuilder searchSourceBuilder = new SearchSourceBuilder();

// Time filter

QueryBuilder timeQuery = QueryBuilders.rangeQuery("ctime").lte(startTime);

QueryBuilder boolQuery = QueryBuilders.boolQuery().must(timeQuery);

searchSourceBuilder.query(boolQuery).size(MAX_SIZE);

String query = searchSourceBuilder.toString();

System.out.println(query);

Search search = new Search.Builder(query)

.addIndex(indexName)

.addType(type)

.build();

SearchResult result = jestClient.execute(search);

return result;

}

public SearchResult getgteByField ( String indexName , String type ,String field,Long startTime) throws IOException {

SearchSourceBuilder searchSourceBuilder = new SearchSourceBuilder();

// Time filter

QueryBuilder timeQuery = QueryBuilders.rangeQuery(field).gte(startTime);

QueryBuilder boolQuery = QueryBuilders.boolQuery().must(timeQuery);

searchSourceBuilder.query(boolQuery).size(MAX_SIZE);

String query = searchSourceBuilder.toString();

System.out.println(query);

Search search = new Search.Builder(query)

.addIndex(indexName)

.addType(type)

.build();

SearchResult result = jestClient.execute(search);

return result;

}

public SearchResult getlteByField ( String indexName , String type , String field, Long startTime) throws IOException {

SearchSourceBuilder searchSourceBuilder = new SearchSourceBuilder();

// Time filter

QueryBuilder timeQuery = QueryBuilders.rangeQuery(field).lte(startTime);

QueryBuilder boolQuery = QueryBuilders.boolQuery().must(timeQuery);

searchSourceBuilder.query(boolQuery).size(MAX_SIZE);

String query = searchSourceBuilder.toString();

System.out.println(query);

Search search = new Search.Builder(query)

.addIndex(indexName)

.addType(type)

.build();

SearchResult result = jestClient.execute(search);

return result;

}

public SearchResult blurSearch ( String indexName , String type , String field , String keyWord) throws IOException {

// Approach 1
// QueryBuilder queryBuilder = QueryBuilders.fuzzyQuery(field+".keyword", keyWord);
// Search search = new Search.Builder(queryBuilder.toString()).addIndex(indexName).addType(type).build();

// Approach 2
// Term term=new Term(field+".keyword", "*"+keyWord+"*");
// WildcardQuery query=new WildcardQuery(term);
// Search search = new Search.Builder(query.toString()).addIndex(indexName).addType(type).build();

// Approach 3
// QueryBuilder queryBuilder = QueryBuilders.constantScoreQuery(QueryBuilders.termQuery(field+".keyword", "*"+keyWord+"*"));
// Search search = new Search.Builder(
// queryBuilder.toString())
// .addIndex(indexName)
// .addType(type).build();

// Approach 4
// WildcardQueryBuilder queryBuilder = QueryBuilders.wildcardQuery(field+".keyword", "*"+keyWord+"*");
// Search search = new Search.Builder(queryBuilder.toString())
// .addIndex(indexName)
// .addType(type).build();

// Approach 5: build the query body (the API query mirrors the JSON sent from the Kibana view tool)
SearchSourceBuilder searchSourceBuilder = new SearchSourceBuilder();

// QueryBuilder queryBuilder = QueryBuilders.wildcardQuery(field+".keyword", "*"+keyWord+"*");

QueryBuilder queryBuilder = QueryBuilders.wildcardQuery(field, "*"+keyWord+"*");

// QueryBuilder queryBuilder = QueryBuilders.matchQuery(field,keyWord);

searchSourceBuilder.query(queryBuilder).size(MAX_SIZE);

String query = searchSourceBuilder.toString();

System.out.println(query);

Search search = new Search.Builder(query)

.addIndex(indexName)

.addType(type)

.build();

SearchResult result = jestClient.execute(search);

return result;

}

public <T> List<T> scanAndScrollSearch(String indexName, String type, String query, Class<T> clazz)

throws IOException {

List<T> ret = new ArrayList<>();

Search search = new Search.Builder(query)

.addIndex(indexName)

.addType(type)

.setParameter(Parameters.SIZE, BATCH_SIZE)

.setParameter(Parameters.SCROLL, SCROLL)

.build();

SearchResult searchResult = jestClient.execute(search);

if(!searchResult.isSucceeded()){

logger.error(searchResult.getErrorMessage());

return ret;

}

String scrollId = searchResult.getJsonObject().get("_scroll_id").getAsString();

ElasticSearchResult esr = new Gson().fromJson(searchResult.getJsonString(), ElasticSearchResult.class);

logger.info("ES search took {} ms, timed out: {}, total shards: {}, successful shards: {}, skipped shards: {}, failed shards: {}, total hits: {}, max score: {}, fetched {} in this batch", esr.getTook(), esr.getTimeOut(), esr.getShards().getTotal(), esr.getShards().getSuccessful(), esr.getShards().getSkipped(), esr.getShards().getFailed(), esr.getHits().getTotal(), esr.getHits().getMaxScore(), esr.getHits().getHits().size());

List<T> curList = searchResult.getSourceAsObjectList(clazz, false);

int curPageSize = curList.size();

ret.addAll(curList);

while(curPageSize != 0 && ret.size() < MAX_SIZE) {

SearchScroll scrollSearch = new SearchScroll.Builder(scrollId, SCROLL).build();

JestResult scrollResult = jestClient.execute(scrollSearch);

scrollId = scrollResult.getJsonObject().get("_scroll_id").getAsString();

esr = new Gson().fromJson(scrollResult.getJsonString(), ElasticSearchResult.class);

logger.info("ES search took {} ms, timed out: {}, total shards: {}, successful shards: {}, skipped shards: {}, failed shards: {}, total hits: {}, max score: {}, fetched {} in this batch", esr.getTook(), esr.getTimeOut(), esr.getShards().getTotal(), esr.getShards().getSuccessful(), esr.getShards().getSkipped(), esr.getShards().getFailed(), esr.getHits().getTotal(), esr.getHits().getMaxScore(), esr.getHits().getHits().size());

curList = scrollResult.getSourceAsObjectList(clazz, false);

curPageSize = curList.size();

ret.addAll(curList);

}

return ret;

}

public <T> T getESDoc(String index, String type, String id, Class<T> clazz) {

try {

JestResult result = getDocument(index, type, id);

if (result.isSucceeded()) {

T doc = result.getSourceAsObject(clazz);

return doc;

}

} catch (IOException e) {

logger.error("Failed to read {}:{}:{} from ES {}[{}]", index, type, id, e.getMessage(), e.getStackTrace());

}

return null;

}

public <T> Pagination<T> search(String index, String type, String query, Class<T> clazz){

Pagination<T> ret = null;

try {

SearchResult result = simpleSearch(index, type, query);

if(result.isSucceeded()) {

ElasticSearchResult esr = new Gson().fromJson(result.getJsonString(), ElasticSearchResult.class);

logger.info("ES search took {} ms, timed out: {}, total shards: {}, successful shards: {}, skipped shards: {}, failed shards: {}, total hits: {}, max score: {}", esr.getTook(), esr.getTimeOut(), esr.getShards().getTotal(), esr.getShards().getSuccessful(), esr.getShards().getSkipped(), esr.getShards().getFailed(), esr.getHits().getTotal(), esr.getHits().getMaxScore());

if(esr.getShards().getFailed() == 0) {

ret = new Pagination<>();

ret.setList(result.getSourceAsObjectList(clazz, false));

ret.setTotalSize(esr.getHits().getTotal());

}else{

logger.error("Search error details: {}", new Gson().toJson(esr.getShards().getFailures()));

}

}else{

logger.error("ES query failed {}[{}]", result.getErrorMessage(), result.getJsonString());

}

} catch (IOException e) {

logger.error("ES search failed {}[{}]", e.getMessage(), e.getStackTrace());

}

return ret;

}

public String generateQuery(){

SearchSourceBuilder builder = new SearchSourceBuilder();

TermQueryBuilder termQuery = QueryBuilders.termQuery("beat.hostname", "bjzw_99_138");

TermsQueryBuilder termsQuery = QueryBuilders

.termsQuery("response_code", "301", "302", "404", "500");

RangeQueryBuilder rangeQuery = QueryBuilders.rangeQuery("response_duration").gte(0).lte(1);

RangeQueryBuilder rangeQuery1 = QueryBuilders.rangeQuery("response_duration").gte(2).lte(3);

RangeQueryBuilder rangeQuery2 = QueryBuilders.rangeQuery("response_duration").gte(4).lte(5);

RangeQueryBuilder rangeQuery3 = QueryBuilders.rangeQuery("response_duration").gte(8).lte(10);

BoolQueryBuilder shouldQuery = QueryBuilders.boolQuery().should(rangeQuery1).should(rangeQuery2).should(rangeQuery3);

BoolQueryBuilder boolQuery = QueryBuilders.boolQuery().filter(termQuery).filter(termsQuery).mustNot(rangeQuery).filter(shouldQuery);

return builder.query(boolQuery).toString();

}

}
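The `blurSearch` methods above wrap the user's keyword in `*...*` and pass it to a wildcard query. Two points are easy to miss: a keyword that itself contains `*` or `?` changes the pattern, and a leading wildcard forces Elasticsearch to scan many terms, which is a large part of why this kind of search is discouraged. The sketch below builds the equivalent JSON body with the JDK only. `WildcardQueryJson` and its methods are hypothetical names, the backslash escaping relies on Lucene treating `\` as the wildcard escape character, and the JSON escaping covers only quotes and backslashes.

```java
// Hypothetical helper: builds the JSON body of a "contains" style wildcard search.
class WildcardQueryJson {

    // Escapes wildcard metacharacters so the user's keyword is matched literally.
    static String escapeWildcard(String keyword) {
        StringBuilder sb = new StringBuilder();
        for (char c : keyword.toCharArray()) {
            if (c == '*' || c == '?' || c == '\\') {
                sb.append('\\');
            }
            sb.append(c);
        }
        return sb.toString();
    }

    // Minimal JSON string escaping (quotes and backslashes only; control characters are not handled).
    static String jsonEscape(String s) {
        return s.replace("\\", "\\\\").replace("\"", "\\\"");
    }

    // Produces {"size":N,"query":{"wildcard":{"<field>":{"value":"*<keyword>*"}}}}
    static String build(String field, String keyword, int size) {
        String pattern = "*" + escapeWildcard(keyword) + "*";
        return "{\"size\":" + size
                + ",\"query\":{\"wildcard\":{\"" + jsonEscape(field)
                + "\":{\"value\":\"" + jsonEscape(pattern) + "\"}}}}";
    }
}
```

The resulting string could be handed to `simpleSearch(indexName, type, query)` in place of the `SearchSourceBuilder` output.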

5: Business implementation

private List<Long> getESInfoByContent(String index, String type, String content, Long startTime, Long endTime) {
    List<Long> idList = new ArrayList<>();
    try {
        if (StringUtils.isNotBlank(content)) {
            // Run the ES fuzzy search and collect the ids of the matching documents
            JestResult jestResult = jestService.blurSearch(index, type, "content", content, startTime, endTime);
            List<ESNlpInfoDoc> list = jestResult.getSourceAsObjectList(ESNlpInfoDoc.class);
            for (ESNlpInfoDoc doc : list) {
                idList.add(Long.parseLong(doc.getDocId()));
            }
        }
    } catch (Exception e) {
        logger.info("ES query by content error : " + e.getMessage());
    }
    // Sort the ids in descending order
    Collections.sort(idList);
    Collections.reverse(idList);
    return idList;
}

private List<Long> getESInfoByTitle(String index, String type, String title, Long startTime, Long endTime) {
    List<Long> idList = new ArrayList<>();
    try {
        if (StringUtils.isNotBlank(title)) {
            // Exact match on title
            // JestResult jestResult = jestService.searchInfoByField(ESConstants.INDEX_NAME_NLP, ESConstants.INDEX_TYPE_NLPINFO,"title",title);
            // Fuzzy match on title
            JestResult jestResult = jestService.blurSearch(index, type, "title", title, startTime, endTime);
            List<ESNlpInfoDoc> list = jestResult.getSourceAsObjectList(ESNlpInfoDoc.class);
            for (ESNlpInfoDoc doc : list) {
                idList.add(Long.parseLong(doc.getDocId()));
            }
        }
    } catch (Exception e) {
        logger.info("ES query by title error : " + e.getMessage());
    }
    // Sort the ids in descending order
    Collections.sort(idList);
    Collections.reverse(idList);
    return idList;
}

//------------------------------------- business usage -------------------------------------

public RcASMsgVo infoRequestV2(Integer function,

String businessId,

String uName,

String curStatus,

String manualConfirm,

String contentType,

String stime,

String etime,

String type,

String searchName, // 5 = content, 6 = title

String searchContent,

int pageSize,

int pageNum) throws Exception {

RcASMsgVo rcASMsgVo = new RcASMsgVo(Constants.CODE_FAIL, Constants.MSG_FAIL, null);

// Check that the user name and user id match; if the check fails, return an empty result

if (!checkAdmin(uName, businessId)) {

rcASMsgVo.setMsg(Constants.MSG_NAME_ID_NO_ALIKE);

return rcASMsgVo;

}

Long startTime = Long.parseLong(stime);
Long endTime = Long.parseLong(etime);

// Convert the time format.
// Long.parseLong above throws NumberFormatException for a null stime or etime,
// so the else branch below only acts as a safeguard.
if (stime != null && etime != null) {
    stime = timeTransform(stime);
    etime = timeTransform(etime);
} else {
    stime = null;
    etime = null;
}

// Decide which table to query from the date range (one week is the threshold)
String isHis = getIsHis(stime, etime);

// If a fuzzy-query parameter is given, run the fuzzy query in ES
List<Long> idList;

// ES usage starts
// (1) Fuzzy search on content
if ("5".equals(searchName)) {
    // (optimization v2) query the split ES indices
    if ("0".equals(isHis)) {
        idList = getESInfoByContent(ESConstants.INDEX_NAME_NLP_NEWLY, ESConstants.INDEX_TYPE_NLPINFO, searchContent, startTime, endTime);
    } else {
        idList = getESInfoByContent(ESConstants.INDEX_NAME_NLP, ESConstants.INDEX_TYPE_NLPINFO, searchContent, startTime, endTime);
    }
    // (optimization v1) single ES query
    // idList = getESInfoByContent(ESConstants.INDEX_NAME_NLP, ESConstants.INDEX_TYPE_NLPINFO,searchContent, startTime, endTime);
    System.out.println("List size after ES search: " + idList.size());
// (2) Search on title
} else if ("6".equals(searchName)) {
    if ("0".equals(isHis)) {
        idList = getESInfoByTitle(ESConstants.INDEX_NAME_NLP_NEWLY, ESConstants.INDEX_TYPE_NLPINFO, searchContent, startTime, endTime);
    } else {
        idList = getESInfoByTitle(ESConstants.INDEX_NAME_NLP, ESConstants.INDEX_TYPE_NLPINFO, searchContent, startTime, endTime);
    }
    System.out.println("List size after ES search: " + idList.size());
// (3) All other cases
} else {
    // Old query method
    return infoRequest(function, businessId, uName, curStatus, manualConfirm, contentType, startTime.toString(), endTime.toString(), type, searchName, searchContent, pageSize, pageNum);
}
// ES usage ends

// Total count

int totalNum = 0;

// Offset needed for LIMIT paging

int rowStartNum = pageSize * pageNum;

// Relational database query, filtered by the ids returned from ES

List rcAdminNlpInfoList = new ArrayList<>();

if (idList.size() > 0) {

totalNum = rcAdminNlpInfoDao.countTotalNumNew(idList, businessId, curStatus, manualConfirm, contentType, stime, etime, type, isHis);

rcAdminNlpInfoList = rcAdminNlpInfoDao.selectByParamNew(idList, businessId, curStatus, manualConfirm, contentType, stime, etime, type, rowStartNum, pageSize, isHis);

}

logger.info("total num is : " + totalNum);

// Use the dictionary cache to replace codes with text, and add highlight tags to the text

List list = rcAdminNlpInfoFormat(rcAdminNlpInfoList);

// Wrap the data

RcASPageVo rcASPage = new RcASPageVo(totalNum, pageSize, pageNum, list);

rcASMsgVo.setCode(Constants.CODE_SUCCESS);

rcASMsgVo.setMsg(Constants.MSG_SUCCESS);

rcASMsgVo.setData(rcASPage);

// logger.info("Query result: {}", rcASPage);

// Return the data object for the front end

return rcASMsgVo;

}

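As the code shows, the old approach ran the fuzzy query directly in MySQL. With ES integrated, we first run the fuzzy query in ES as a filter to get the list of matching ids, and then run an IN query on those ids in MySQL. This avoids fuzzy text matching in MySQL and improves efficiency. We can also split ES into several indices, mirroring the table split in MySQL.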
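The second phase of that flow, an IN query over the ids returned by ES, can be sketched as follows. The DAO methods in the listing (`countTotalNumNew`, `selectByParamNew`) are not shown in this article, so the table name, the `ORDER BY id DESC` clause and the class `IdInQuery` are assumptions made purely for illustration; only the offset formula `pageSize * pageNum` comes from the code above.

```java
import java.util.Collections;

// Hypothetical helper: builds a parameterized "id IN (...)" statement for the ids found by ES.
class IdInQuery {

    // One "?" placeholder per id, followed by two placeholders for LIMIT offset and row count.
    static String buildSql(String table, int idCount) {
        if (idCount <= 0) {
            throw new IllegalArgumentException("at least one id is required");
        }
        String placeholders = String.join(", ", Collections.nCopies(idCount, "?"));
        return "SELECT * FROM " + table + " WHERE id IN (" + placeholders + ")"
                + " ORDER BY id DESC LIMIT ?, ?";
    }

    // Zero-based page offset, matching rowStartNum = pageSize * pageNum in the listing.
    static int rowStart(int pageSize, int pageNum) {
        return pageSize * pageNum;
    }
}
```

Because ES is asked for up to 10000 hits (`MAX_SIZE`), the IN list can become long; batching the ids would be one way to keep individual statements small.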

【Optimization Report】 Of course, this is not the final optimization plan.

| Time range | Field | Keyword | Original (ms) | v1: ES (ms) | v2: ES split indices (ms) | v3: MySQL single table, no UNION (ms) | v4: MySQL removed | Tables spanned / rows | Estimated speedup (x) |
|---|---|---|---|---|---|---|---|---|---|
| Today | content | 交警 | 5079 | 8457 | 879 | 923 | | 1 / 5M | 3 |
| Three days | content | 交警 | 23212 | 8359 | 942 | 1052 | | 1 / 5M | 22 |
| Last week | content | 交警 | 42380 | 9055 | 1069 | 1063 | | 1 / 5M | 39 |
| Older than a week | content | 交警 | 113900 | 9607 | 8508 | 8826 | | 1 / 75M | 13 |
| Last month | content | 交警 | 448306 | 214725 | 170833 | 44213 | | 2 / 80M | 10 |
| Today | content | 歪果仁 | 31973 | 8053 | 659 | 665 | | 1 / 5M | 48 |
| Three days | content | 歪果仁 | 25086 | 8587 | 652 | 753 | | 1 / 5M | 33 |
| Last week | content | 歪果仁 | 49043 | 7883 | 759 | 704 | | 1 / 5M | 69 |
| Older than a week | content | 歪果仁 | 116818 | 8156 | 7999 | 8613 | | 1 / 75M | 14 |
| Last month | content | 歪果仁 | 456657 | 8579 | 8223 | 8113 | | 2 / 80M | 56 |
| Today | title | 提車 | 4280 | 1408 | 1021 | 1139 | | 1 / 5M | 4 |
| Three days | title | 提車 | 11150 | 3869 | 3417 | 3190 | | 1 / 5M | 3 |
| Last week | title | 提車 | 18370 | 3921 | 3500 | 3288 | | 1 / 5M | 5 |
| Older than a week | title | 提車 | 84029 | 42576 | 41153 | 40660 | | 1 / 75M | 2 |
| Last month | title | 提車 | 263730 | 177561 | 173627 | 42564 | | 2 / 80M | 6 |
| Today | title | 保養 | 4503 | 966 | 912 | 975 | | 1 / 5M | 5 |
| Three days | title | 保養 | 10832 | 3336 | 3142 | 3094 | | 1 / 5M | 3 |
| Last week | title | 保養 | 19063 | 3813 | 3642 | 3169 | | 1 / 5M | 7 |
| Older than a week | title | 保養 | 79402 | 41638 | 40708 | 40718 | | 1 / 75M | 2 |
| Last month | title | 保養 | 265663 | 173113 | 179834 | 44493 | | 2 / 80M | 6 |

Keywords: 交警 (traffic police), 歪果仁 (slang for "foreigner"), 提車 (picking up a new car), 保養 (maintenance).
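The estimated speedup figures appear to be roughly the original time divided by the latest measured optimized time; for the three-day content query, 23212 ms against 1052 ms gives about 22. That reading is an inference from the numbers, not something the report states, and the class `Speedup` below is a hypothetical illustration of the arithmetic.

```java
// Hypothetical helper: ratio between a baseline timing and an optimized timing.
class Speedup {
    static double ratio(long baselineMs, long optimizedMs) {
        if (optimizedMs <= 0) {
            throw new IllegalArgumentException("optimized time must be positive");
        }
        return (double) baselineMs / optimizedMs;
    }
}
```

Not every row matches this formula exactly, so the published figures are best treated as rounded estimates.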

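How the approach evolved:

Original version to optimization v1 (ES integration)

The goal was to replace MySQL fuzzy queries with ES in order to cut fuzzy-query time and improve efficiency. At this stage a single ES query takes between 10 s and 20 s, so queries over the current day or other recent ranges usually do not get faster and can even get slower; the main cause is the cost of the ES query itself.

Optimization v1 to optimization v2 (split ES indices)

The goal was to avoid fuzzy matching against the entire data set on every ES query. A separate index is built for the most frequently searched time range (the last week), which speeds up queries over that hot range. This stage improves queries within one week noticeably.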
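The routing idea behind v2 can be sketched as a small function that picks the hot index when the query starts inside the last week. The project's own `getIsHis` method is not shown, so the logic below is only an assumed equivalent: it treats timestamps as epoch milliseconds and uses a hypothetical class `IndexRouter`; the two index names are copied from `ESConstants`.

```java
import java.time.Duration;

// Hypothetical router: chooses between the one-week hot index and the overall index.
class IndexRouter {
    static final String INDEX_NAME_NLP = "nlpindex_";            // overall index
    static final String INDEX_NAME_NLP_NEWLY = "nlpindex_newly"; // last week of data

    static final long HOT_WINDOW_MILLIS = Duration.ofDays(7).toMillis();

    // If the requested range starts within the hot window, the smaller index is enough.
    static String chooseIndex(long startTimeMillis, long nowMillis) {
        boolean withinHotWindow = nowMillis - startTimeMillis <= HOT_WINDOW_MILLIS;
        return withinHotWindow ? INDEX_NAME_NLP_NEWLY : INDEX_NAME_NLP;
    }
}
```

Passing the current time in as a parameter keeps the function deterministic and easy to test.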

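Optimization v2 to optimization v3 (MySQL single-table handling)

The goal was to remove the MySQL UNION across tables and query each table on its own, then combine the result sets in application code, so that the program makes up for MySQL's query bottleneck. This stage improves queries older than one week and queries spanning both tables noticeably.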
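Combining the per-table results in the program could look like the sketch below: two id lists, each already sorted in descending order (as the `getESInfoBy...` methods return them), are merged and then paged in memory. The class `SingleTableMerge` and both method names are hypothetical, and the real v3 code may merge full rows rather than ids.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical replacement for a UNION: merge two descending lists, then page the result.
class SingleTableMerge {

    // Classic two-pointer merge that keeps the descending order.
    static List<Long> mergeDesc(List<Long> recent, List<Long> history) {
        List<Long> merged = new ArrayList<>(recent.size() + history.size());
        int i = 0;
        int j = 0;
        while (i < recent.size() && j < history.size()) {
            if (recent.get(i) >= history.get(j)) {
                merged.add(recent.get(i++));
            } else {
                merged.add(history.get(j++));
            }
        }
        while (i < recent.size()) {
            merged.add(recent.get(i++));
        }
        while (j < history.size()) {
            merged.add(history.get(j++));
        }
        return merged;
    }

    // Zero-based paging, using the same offset rule as rowStartNum = pageSize * pageNum.
    static <T> List<T> page(List<T> all, int pageSize, int pageNum) {
        int from = Math.min(pageSize * pageNum, all.size());
        int to = Math.min(from + pageSize, all.size());
        return new ArrayList<>(all.subList(from, to));
    }
}
```

Merging in memory trades database work for application memory, which is reasonable here because the ES stage already caps the result size.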

Optimization v3 to optimization v4 (removing MySQL)

To be continued...

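Notes:

At present the query bottleneck lies mainly in the table split. A single MySQL table generally should not hold more than about 5 million rows. Given the current business requirements, the excessive amount of data stored in MySQL is one of the main reasons queries are slow; MySQL could be replaced with HBase or ES.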
