ES Interview Notes: Principles

2023-06-28 13:32:32

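Indexing and Searching

By default:

At index time, the string being indexed goes through character filtering, tokenization, and token filtering.

At search time, the search keywords go through the same character filters, tokenizer, and token filters, unless a different search analyzer is configured.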

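Character filter: tidies up the raw characters before tokenization, for example converting & to and.

Tokenizer: splits the string into individual terms according to a set of rules.

Token filter: processes the resulting terms, for example removing stop words that carry no meaning, such as a, an, the.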
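The three stages can be sketched in plain Python to show how they chain together, and why running the same pipeline at index time and at search time lets terms match. This is only an illustration, not how Elasticsearch implements analysis: the function names, the regex-based tokenizer, and the lowercasing step are assumptions made for the sketch.

```python
import re

# Stop words taken from the token filter description above.
STOPWORDS = {"a", "an", "the"}

def char_filter(text):
    # Character filter: rewrite characters before tokenizing (& becomes and).
    return text.replace("&", " and ")

def tokenizer(text):
    # Tokenizer: split into terms on anything that is not a word character.
    # (A crude stand-in for a real tokenizer.)
    return re.findall(r"\w+", text)

def token_filter(tokens):
    # Token filter: lowercase each term (an assumption of this sketch),
    # then drop stop words.
    lowered = [t.lower() for t in tokens]
    return [t for t in lowered if t not in STOPWORDS]

def analyze(text):
    # The stages always run in this order.
    return token_filter(tokenizer(char_filter(text)))

# The same pipeline is applied to the document and to the query,
# so differences in case, "&" and stop words no longer prevent a match.
doc_terms = analyze("The Quick & Brown fox")
query_terms = analyze("QUICK & brown")
all_query_terms_found = all(t in doc_terms for t in query_terms)
print(doc_terms, query_terms, all_query_terms_found)
```

If the query were analyzed with a different pipeline (for example, without lowercasing), "QUICK" would not equal the indexed term "quick", which is why the default is to reuse the index-time analyzer for searches.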

Custom character filters, tokenizers, and analyzers

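# Elasticsearch 7+ has no mapping types; on 6.x and earlier, nest "properties" under the type name (my_type).
# my_analyzer must also be defined under settings.analysis in this same request (see below).
PUT /my_index
{
	"settings" : {
		"analysis" : {
			"analyzer" : {
				"default" : { "type" : "standard" }	# index-level default analyzer
			}
		}
	},
	"mappings" : {
		"properties" : {
			"product_no" : {
				"type" : "text",
				"analyzer" : "my_analyzer"		# use a custom analyzer for this field only
			}
		}
	}
}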

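# Custom analyzer

PUT /my_index
{
	"settings" : {
		"analysis" : {		# analysis-related configuration
			"char_filter" : { },	# custom character filters
			"tokenizer" : { },	# custom tokenizers
			"filter" : { },		# custom token filters
			"analyzer" : { }	# custom analyzers, built from the components defined above
		}
	}
}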
# Example (these fragments go inside the "analysis" object):
"char_filter": {
    "&_to_and": {
        "type":       "mapping",
        "mappings": [ "&=> and "]
    }
},
"filter": {
    "my_stopwords": {
        "type":        "stop",
        "stopwords": [ "the", "a" ]
    }
},
"analyzer": {
    "my_analyzer": {
        "type":           "custom",
        "char_filter":  [ "html_strip", "&_to_and" ],
        "tokenizer":      "standard",
        "filter":       [ "lowercase", "my_stopwords" ]
    }
}
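To see how the fragments nest into one request body, the sketch below assembles them as a Python dict and serializes it to JSON. It does not talk to a cluster. The helper `undefined_components` and the `BUILT_IN` set are hypothetical, introduced only for this illustration; `BUILT_IN` lists just the built-in names this example uses, not everything Elasticsearch ships.

```python
import json

# Component definitions copied from the example fragments.
analysis = {
    "char_filter": {
        "&_to_and": {"type": "mapping", "mappings": ["&=> and "]}
    },
    "filter": {
        "my_stopwords": {"type": "stop", "stopwords": ["the", "a"]}
    },
    "analyzer": {
        "my_analyzer": {
            "type": "custom",
            "char_filter": ["html_strip", "&_to_and"],
            "tokenizer": "standard",
            "filter": ["lowercase", "my_stopwords"],
        }
    },
}

# Assumption: only the built-in names referenced in this example.
BUILT_IN = {"html_strip", "lowercase", "standard"}

def undefined_components(analysis):
    """List components an analyzer references that are neither built-in nor defined."""
    missing = []
    for name, spec in analysis.get("analyzer", {}).items():
        refs = [(kind, ref)
                for kind in ("char_filter", "filter")
                for ref in spec.get(kind, [])]
        refs.append(("tokenizer", spec["tokenizer"]))
        for kind, ref in refs:
            if ref not in BUILT_IN and ref not in analysis.get(kind, {}):
                missing.append(f"{name}: {kind} '{ref}'")
    return missing

# The analysis block lives under "settings" in the index-creation body.
body = {"settings": {"analysis": analysis}}
request_json = json.dumps(body, indent=2)

print(undefined_components(analysis))
print(request_json)
```

The printed JSON is what would be sent as the body of PUT /my_index. Checking references before sending is a small convenience: an analyzer that names a filter which was never defined would otherwise only be rejected when the index is created.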

Testing an analyzer

GET /_analyze
{
	"analyzer" : "standard",
	"text" : "中文试一试？"
}
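For Chinese text like this, the standard analyzer does not segment words: it emits each Han character as its own token and drops the punctuation. The sketch below is a rough approximation of that behavior in Python, not the actual tokenizer; the function name `approx_standard_tokens` and the regular expression (which covers only the basic CJK Unified Ideographs range) are assumptions made for illustration.

```python
import re

# One Han character at a time, or a run of other word characters.
# Only U+4E00..U+9FFF is treated as Han here, which is a simplification.
_TOKEN = re.compile(r"[\u4e00-\u9fff]|[^\W\u4e00-\u9fff]+")

def approx_standard_tokens(text):
    # Lowercasing mirrors the lowercase filter that the standard analyzer applies.
    return [m.group(0).lower() for m in _TOKEN.finditer(text)]

tokens = approx_standard_tokens("中文试一试？")
print(tokens)
```

Because every character becomes a separate term, a search for a two-character word matches any document containing both characters, which is worth keeping in mind when reading the real _analyze output.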
