es-04 Search and Query

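Contents

  • Search and Query
    • Query context
  • Query DSL (Domain Specific Language)
      • 1 Query context
      • 2 Relevance score: _score
      • 3 Metadata: _source
      • 4 Query String
        • Query all:
        • With parameters:
        • Pagination:
        • Exact value matching
        • _all search: equivalent to searching every indexed field
      • 5 Full-text search: Fulltext query
        • match: matches documents containing any term of the query
        • match_all: matches all documents
        • multi_match: one query over several fields
        • match_phrase: phrase query
      • 6 Exact-value queries: Term query
        • term: matches documents containing exactly the given term
        • terms: matches documents containing any term in the list
        • range: range query
      • 7 Filters: Filter
      • 8 Compound queries: Bool query
    • Sample data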

Search and Query

Query context

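{
  "took" : 722,                 # time the request took, in ms
  "timed_out" : false,          # whether the request timed out
  "_shards" : {                 # shards involved in this request
    "total" : 1,                # total number of shards
    "successful" : 1,           # shards that succeeded
    "skipped" : 0,              # shards that were skipped
    "failed" : 0                # shards that failed
  },
  "hits" : {                    # the actual results
    "total" : {                 # total number of results
      "value" : 2,              # number of matching documents, not the number returned for display
      "relation" : "eq"         # how "value" relates to the true count
    },
    "max_score" : 1.0,          # highest score among the results
    "hits" : [                  # the individual results
      {
        "_index" : "product",   # index
        "_type" : "_doc",       # type
        "_id" : "1",            # id
        "_score" : 1.0,         # relevance score
        "_source" : {           # the document itself (metadata)
          "name" : "xiami phone",
          "desc" : "shouji zhong de jianjiji",
          "price" : 4999,
          "tags" : ["xingjiabi", "fashao", "buka"]
        }
      },
      {
        "_index" : "product",
        "_type" : "_doc",
        "_id" : "2",
        "_score" : 1.0,
        "_source" : {
          "name" : "xiami nfc phone",
          "desc" : "zhichi quangongneng nfc,shouji zhong de jianjiji",
          "price" : 49995,
          "tags" : ["xingjiabi", "fashao", "gongjiaoka"]
        }
      }
    ]
  }
}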
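To make the structure of the response concrete, here is a minimal Python sketch that reads the fields described above. It assumes the response has already been parsed from JSON into a dict; the `summarize` helper is a hypothetical name used only for illustration, and the dict is a trimmed copy of the sample response.

```python
# Trimmed copy of the sample response, as a parsed dict (assumption: JSON already decoded).
response = {
    "took": 722,
    "timed_out": False,
    "_shards": {"total": 1, "successful": 1, "skipped": 0, "failed": 0},
    "hits": {
        "total": {"value": 2, "relation": "eq"},
        "max_score": 1.0,
        "hits": [
            {"_index": "product", "_id": "1", "_score": 1.0,
             "_source": {"name": "xiami phone", "price": 4999}},
            {"_index": "product", "_id": "2", "_score": 1.0,
             "_source": {"name": "xiami nfc phone", "price": 49995}},
        ],
    },
}


def summarize(resp):
    """Return (total match count, [(id, score, name), ...]) for a search response."""
    hits = resp["hits"]
    rows = [(h["_id"], h["_score"], h["_source"]["name"]) for h in hits["hits"]]
    return hits["total"]["value"], rows


total, rows = summarize(response)
print(total)   # number of matching documents
print(rows)    # one tuple per returned hit
```

Note that `hits.total.value` counts all matches, while `hits.hits` only holds the documents actually returned, so the two can differ once pagination is involved.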

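Query DSL (Domain Specific Language)

1 Query context

Searching with the query keyword is relevance-oriented, so a score has to be computed for every hit. Search is the most important part of Elasticsearch.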

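2 Relevance score: _score

Concept: the relevance score is used to rank search results. The higher the score, the more closely the result matches what the search was looking for. Before version 5.0 the score was computed with TF/IDF by default; since 5.0 the default has been BM25. In this core-concepts part of the series you do not need to know how the score is computed, only what it means.

Sorting: the relevance score is the default sort key. Unless another sort is specified, results with a higher score come first.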
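For readers who want a feel for what a BM25-style score looks like, here is a textbook-style sketch for a single term. It is not Elasticsearch's actual implementation: the function name and parameters are illustrative, `k1=1.2` and `b=0.75` are the commonly quoted defaults, and constant factors may differ between implementations without changing the ranking.

```python
import math


def bm25_term_score(freq, doc_len, avg_doc_len, doc_count, docs_with_term,
                    k1=1.2, b=0.75):
    """Textbook-style BM25 score of one term in one document (illustrative only)."""
    # Rare terms get a higher inverse document frequency.
    idf = math.log(1 + (doc_count - docs_with_term + 0.5) / (docs_with_term + 0.5))
    # Term frequency saturates, and long documents are penalised via b.
    norm = k1 * (1 - b + b * doc_len / avg_doc_len)
    tf = freq * (k1 + 1) / (freq + norm)
    return idf * tf


# Hypothetical numbers: 5 documents, the term occurs in 2 of them.
scores = {
    "short doc, 1 occurrence": bm25_term_score(1, 2, 3, 5, 2),
    "long doc, 1 occurrence": bm25_term_score(1, 6, 3, 5, 2),
    "short doc, 2 occurrences": bm25_term_score(2, 2, 3, 5, 2),
}

# Default ordering of search results: highest score first.
for label, score in sorted(scores.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{score:.3f}  {label}")
```

The sketch shows the three intuitions behind the score: more occurrences score higher, rarer terms score higher, and the same occurrence counts for less in a longer field.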

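3 Metadata: _source

  1. Disabling _source:

    1. Benefit: saves storage.

    2. Drawbacks: the following are no longer available:

      • The update, update_by_query, and reindex APIs.
      • Highlighting.
      • Reindexing from one index to another, whether to change the mapping or analyzers or to upgrade to a new version.
      • Debugging queries or aggregations by looking at the original document used at index time.
      • Possible automatic repair of index corruption in the future.

  Summary: if the only goal is to save disk space, compressing the index is a better choice than disabling _source.

  With _source enabled, a search can still return only selected fields of it:

GET product2/_search
{
  "_source": ["owner.name", "owner.sex"],
  "query": {
    "match_all": {}
  }
}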
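  2. Source filtering:

    Including: which fields to return in the result.

    Excluding: which fields to leave out of the result. A field that is not returned can still be searched, because being absent from _source does not mean being absent from the index.

    1. Filtering defined in the mapping: wildcards are supported, but this approach is not recommended because the mapping cannot be changed afterwards.

      PUT product
      {
        "mappings": {
          "_source": {
            "includes": ["name", "price"],
            "excludes": ["desc", "tags"]
          }
        }
      }

    2. Common filter rules in a search request

      • "_source": false
      • "_source": "obj.*"
      • "_source": [ "obj1.*", "obj2.*" ]
      • "_source": {
          "includes": [ "obj1.*", "obj2.*" ],
          "excludes": [ "*.description" ]
        }

        # "name" appears in both lists; excludes take precedence, so only owner.* is returned
        GET product2/_search
        {
          "_source": {
            "includes": ["owner.*", "name"],
            "excludes": ["name", "desc", "price"]
          },
          "query": {
            "match_all": {}
          }
        }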
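The include/exclude rules can be approximated in a few lines of Python. This is only a sketch of the idea, not Elasticsearch's code: `flatten`, `_matches` and `filter_source` are hypothetical helpers, wildcard matching is delegated to `fnmatch`, and the result is returned as flat dotted paths rather than a nested document.

```python
from fnmatch import fnmatchcase


def flatten(doc, prefix=""):
    """Flatten nested dicts into {"a.b": value} leaf paths."""
    out = {}
    for key, value in doc.items():
        path = prefix + key
        if isinstance(value, dict):
            out.update(flatten(value, path + "."))
        else:
            out[path] = value
    return out


def _matches(path, patterns):
    # A pattern applies to a leaf if it matches the leaf path or any parent path.
    parts = path.split(".")
    prefixes = [".".join(parts[:i]) for i in range(1, len(parts) + 1)]
    return any(fnmatchcase(p, pat) for p in prefixes for pat in patterns)


def filter_source(doc, includes=(), excludes=()):
    """Keep fields matched by includes (all if empty), then drop those matched by excludes."""
    kept = {}
    for path, value in flatten(doc).items():
        if includes and not _matches(path, includes):
            continue
        if _matches(path, excludes):
            continue
        kept[path] = value
    return kept


# The product2 sample document from this article.
doc = {
    "owner": {"name": "zhangsan", "sex": "男", "age": 18},
    "name": "hongmi erji",
    "desc": "erji zhong de kendeji",
    "price": 399,
    "tags": ["lowbee", "xuhangduan", "zhiliangx"],
}

result = filter_source(doc,
                       includes=["owner.*", "name"],
                       excludes=["name", "desc", "price"])
print(sorted(result))
```

Under these assumptions only the `owner.*` fields survive: `name` is included and excluded at the same time, and the exclusion wins.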
        

4 Query String

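5 Full-text search: Fulltext query

GET index/_search
{
  "query": {***}
}
  • match: analyzes the search text and matches documents containing any of the resulting terms
GET product/_search
{
  "query": {
    "match": {
      "name": "xiaomi nfc phone"
    }
  }
}
  • match_all: matches all documents
GET product/_search
{
  "query": {
    "match_all": {}
  }
}
  • multi_match: runs the same search text against several fields
GET product/_search
{
  "query": {
    "multi_match": {
      "query": "phone huangmenji",
      "fields": ["name", "desc"]
    }
  }
}
  • match_phrase: phrase query; the terms must appear together and in the same order
GET product/_search
{
  "query": {
    "match_phrase": {
      "name": "xiaomi nfc"
    }
  }
}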

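6 Exact-value queries: Term query

  • term: matches documents containing exactly the given term
    • Difference between term and match_phrase:
GET product/_search
{
  "query": {
    "match": {
      "name": "xiaomi phone"
    }
  }
}
GET product/_search
{
  "query": {
    "term": {
      "name": "xiaomi phone"
    }
  }
}
GET product/_search
{
  "query": {
    "match_phrase": {
      "name": "xiaomi phone"
    }
  }
}
match_phrase analyzes the search text into terms. Every one of those terms must be present among the terms of the searched field, in the same order and, by default, adjacent to each other. term does not analyze the search text at all, so with the default mapping "xiaomi phone" is looked up as one single term in the analyzed name field and finds nothing.
  • Difference between term and keyword

    term is a query type: the search text is not analyzed.

    keyword is a field type: the field value in the source data is not analyzed.

GET product/_search
{
  "query": {
    "term": {
      "name": "xiaomi phone"
    }
  }
}
GET product/_search
{
  "query": {
    "term": {
      "name.keyword": "xiaomi phone"
    }
  }
}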
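The differences between match, term, match_phrase and a keyword field can be imitated in plain Python on the five sample product names. This is a rough model under simplifying assumptions: `analyze` stands in for the default analyzer by lowercasing and splitting on whitespace, and all function names are made up for the illustration.

```python
# name field of the five sample documents in the product index
docs = {
    1: "xiaomi phone",
    2: "xiaomi nfc phone",
    3: "nfc phone",
    4: "xiaomi erji",
    5: "hongmi erji",
}


def analyze(text):
    """Simplified stand-in for the default analyzer: lowercase, split on whitespace."""
    return text.lower().split()


def match(query):
    """match: the query is analyzed; a document matches if it contains ANY term."""
    terms = set(analyze(query))
    return [i for i, name in docs.items() if terms & set(analyze(name))]


def term(query, keyword=False):
    """term: the query is NOT analyzed.
    keyword=False compares it with each indexed token of the text field;
    keyword=True compares it with the whole unanalyzed value (name.keyword)."""
    if keyword:
        return [i for i, name in docs.items() if name == query]
    return [i for i, name in docs.items() if query in analyze(name)]


def match_phrase(query):
    """match_phrase: all analyzed terms must appear adjacent and in order."""
    q = analyze(query)
    found = []
    for i, name in docs.items():
        tokens = analyze(name)
        if any(tokens[s:s + len(q)] == q for s in range(len(tokens) - len(q) + 1)):
            found.append(i)
    return found


print(match("xiaomi phone"))               # any of the two terms
print(term("xiaomi phone"))                # no single token equals "xiaomi phone"
print(term("xiaomi phone", keyword=True))  # whole value compared
print(match_phrase("xiaomi phone"))        # adjacent, in order
```

In this model document 2 ("xiaomi nfc phone") matches the match query but not the phrase query, because "nfc" sits between the two terms.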
  • terms: matches documents containing any of the terms in the list
GET product/_search
{
  "query": {
    "terms": {
      "tags": ["lowbee", "gongjiaoka"],
      "boost": 1.0
    }
  }
}
  • range: range query
GET /_search
{
  "query": {
    "range": {
      "age": {
        "gte": 10,
        "lte": 20,
        "boost": 2.0
      }
    }
  }
}
GET product/_search
{
  "query": {
    "range": {
      "date": {
        "gte": "2021-04-15",
        "lt": "2021-04-16"
      }
    }
  }
}
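As a quick sanity check of what the terms query and the date range query above should select from the sample data, the same conditions can be written in plain Python. This is only a model of the matching logic; `terms_query` and `date_range` are hypothetical helper names.

```python
from datetime import date

# tags and date of the five sample documents in the product index
products = [
    {"id": 1, "date": date(2021, 6, 1), "tags": ["xingjiabi", "fashao", "buka"]},
    {"id": 2, "date": date(2021, 6, 2), "tags": ["xingjiabi", "fashao", "gongjiaoka"]},
    {"id": 3, "date": date(2021, 6, 3), "tags": ["xingjiabi", "fashao", "menjinka"]},
    {"id": 4, "date": date(2021, 4, 15), "tags": ["low", "bufangshui", "yinzhicha"]},
    {"id": 5, "date": date(2021, 4, 16), "tags": ["lowbee", "xuhangduan", "zhiliangx"]},
]


def terms_query(values):
    """terms: a document matches if it has ANY of the listed tags."""
    wanted = set(values)
    return [p["id"] for p in products if wanted & set(p["tags"])]


def date_range(gte, lt):
    """range: gte is inclusive, lt is exclusive."""
    return [p["id"] for p in products if gte <= p["date"] < lt]


print(terms_query(["lowbee", "gongjiaoka"]))
print(date_range(date(2021, 4, 15), date(2021, 4, 16)))
```

Because `lt` is exclusive, the document dated 2021-04-16 is not part of the range result.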

7 Filters: Filter

GET _search
{
  "query": {
    "constant_score": {
      "filter": {
        "term": {
          "status": "active"
        }
      }
    }
  }
}
  • filter: the main difference between query and filter is that a query asks "how well does this document match the search?", while a filter asks "does this document satisfy the condition or not?". A query therefore computes a relevance score for every result, and a filter does not. Filters also have a caching mechanism, which can make searches faster.
GET product/_search
{
  "query": {
    "constant_score": {
      "filter": {
        "term": {
          "name": "phone"
        }
      },
      "boost": 1.2
    }
  }
}
GET product/_search
{
  "query": {
    "bool": {
      "filter": {
        "term": {
          "name": "phone"
        }
      }
    }
  }
}
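The scoring side of the two filter forms above can be modelled like this. It is a sketch of the observable behaviour only, with made-up function names: a constant_score query gives every matching document the same score (its boost), while a bool query that contains nothing but filter clauses scores every match 0.

```python
# name field of the five sample documents in the product index
names = {
    1: "xiaomi phone",
    2: "xiaomi nfc phone",
    3: "nfc phone",
    4: "xiaomi erji",
    5: "hongmi erji",
}


def term_filter(term):
    """Filter context: a yes/no test, no relevance score involved."""
    return [doc_id for doc_id, name in names.items() if term in name.split()]


def constant_score(term, boost=1.0):
    """constant_score: every document passing the filter gets score == boost."""
    return {doc_id: boost for doc_id in term_filter(term)}


def bool_filter_only(term):
    """bool with only a filter clause: matches are returned with score 0."""
    return {doc_id: 0.0 for doc_id in term_filter(term)}


print(constant_score("phone", boost=1.2))
print(bool_filter_only("phone"))
```

Both forms return the same documents; only the reported score differs.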

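8 Compound queries: Bool query

bool: combines several query clauses. A bool query follows the more_matches_is_better approach, so the scores of the matching must and should clauses are combined into the document's final score.

  • must: the clause must appear in matching documents and contributes to the score.
# bool query: compound query
# must: computes a relevance score
# condition 1: name contains "xiaomi" or "phone"
# condition 2: desc contains the phrase "shouji zhong"
GET product/_search
{
  "query": {
    "bool": {
      "must": [
        {"match": {"name": "xiaomi phone"}},
        {"match_phrase": {"desc": "shouji zhong"}}
      ]
    }
  }
}
  • filter: the clause must appear in matching documents, but unlike must its score is ignored. Filter clauses run in filter context, which means no score is computed and the clause is a candidate for caching.
# filter: no relevance score
GET product/_search
{
  "query": {
    "bool": {
      "filter": [
        {"match": {"name": "xiaomi phone"}},
        {"match_phrase": {"desc": "shouji zhong"}}
      ]
    }
  }
}
  • should: the clause should appear in matching documents (logical OR).
# should
GET product/_search
{
  "query": {
    "bool": {
      "should": [
        {"match_phrase": {"name": "xiaomi nfc"}},
        {"range": {"price": {"lte": 500}}}
      ]
    }
  }
}
  • must_not: the clause must not appear in matching documents (logical NOT). It runs in filter context, which means no score is computed and the clause is a candidate for caching. Because scoring is ignored, every document is returned with a score of 0.
# must_not: no relevance score
# condition 1: exclude documents whose name contains xiaomi or nfc (neither may appear)
# condition 2: exclude documents whose price is 500 or more
GET product/_search
{
  "query": {
    "bool": {
      "must_not": [
        {"match": {"name": "xiaomi nfc"}},
        {"range": {"price": {"gte": 500}}}
      ]
    }
  }
}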

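minimum_should_match: the number or percentage of should clauses a document must match in order to be returned. If the bool query contains at least one should clause and no must or filter clause, the default is 1. Otherwise the default is 0.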
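The default rule for minimum_should_match is easy to state as a small function. This sketch only encodes the rule as described in the paragraph above; `default_minimum_should_match` is a hypothetical helper that takes the body of a bool query as a Python dict.

```python
def default_minimum_should_match(bool_query):
    """Default for minimum_should_match, given the body of a bool query."""
    has_should = bool(bool_query.get("should"))
    has_must_or_filter = bool(bool_query.get("must")) or bool(bool_query.get("filter"))
    # should alone: at least one clause has to match.
    # should next to must/filter: should clauses only influence the score.
    return 1 if has_should and not has_must_or_filter else 0


only_should = {"should": [{"match": {"name": "erji"}}]}
with_filter = {
    "filter": [{"range": {"price": {"lte": 10000}}}],
    "should": [{"match": {"name": "erji"}}],
}

print(default_minimum_should_match(only_should))
print(default_minimum_should_match(with_filter))
```

In other words, once a must or filter clause is present, should clauses become optional unless minimum_should_match is set explicitly.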

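# filter combined with must
# the range is a filter here: it restricts the results but does not affect _score
GET product/_search
{
  "_source": false,
  "query": {
    "bool": {
      "filter": [
        {"range": {"price": {"lte": 1000}}}
      ],
      "must": [
        {"match": {"name": "xiaomi"}}
      ]
    }
  }
}
# the same conditions with the range inside must: it now also contributes to _score
GET product/_search
{
  "_source": false,
  "query": {
    "bool": {
      "must": [
        {"match": {"name": "xiaomi"}},
        {"range": {"price": {"lte": 1000}}}
      ]
    }
  }
}
# (must or filter) combined with should
# condition 1: price is at most 10000
# condition 2: at least 2 of the 3 should clauses match:
#   name contains the phrase "nfc phone", name contains "erji", price is between 900 and 3000
GET product/_search
{
  "_source": false,
  "query": {
    "bool": {
      "filter": [
        {"range": {"price": {"lte": 10000}}}
      ],
      "should": [
        {"match_phrase": {"name": "nfc phone"}},
        {"match": {"name": "erji"}},
        {"bool": {"must": [{"range": {"price": {"gte": 900, "lte": 3000}}}]}}
      ],
      "minimum_should_match": 2
    }
  }
}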
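To see which sample documents the last query (filter plus three should clauses with minimum_should_match set to 2) should return, the conditions can be replayed in plain Python. This is a model of the boolean logic only, with simplified whitespace tokenisation; `has_phrase` and `search` are illustrative names.

```python
# name and price of the five sample documents in the product index
products = [
    {"id": 1, "name": "xiaomi phone", "price": 3999},
    {"id": 2, "name": "xiaomi nfc phone", "price": 4999},
    {"id": 3, "name": "nfc phone", "price": 2999},
    {"id": 4, "name": "xiaomi erji", "price": 999},
    {"id": 5, "name": "hongmi erji", "price": 399},
]


def has_phrase(name, phrase):
    """True if the words of phrase occur adjacent and in order in name."""
    tokens, q = name.split(), phrase.split()
    return any(tokens[i:i + len(q)] == q for i in range(len(tokens) - len(q) + 1))


def search(minimum_should_match=2):
    result = []
    for p in products:
        # filter clause: price <= 10000 (mandatory, not scored)
        if not p["price"] <= 10000:
            continue
        # the three should clauses
        should = [
            has_phrase(p["name"], "nfc phone"),
            "erji" in p["name"].split(),
            900 <= p["price"] <= 3000,
        ]
        if sum(should) >= minimum_should_match:
            result.append(p["id"])
    return result


print(search())                         # at least 2 should clauses
print(search(minimum_should_match=1))   # at least 1 should clause
print(search(minimum_should_match=0))   # should clauses optional
```

With the threshold at 2, only documents 3 (phrase and price range) and 4 ("erji" and price range) qualify; lowering it widens the result set.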

Sample data

PUT /product/_doc/1
{"name" : "xiaomi phone","desc" :  "shouji zhong de zhandouji","date": "2021-06-01","price" :  3999,"tags": [ "xingjiabi", "fashao", "buka" ]
}
PUT /product/_doc/2
{"name" : "xiaomi nfc phone","desc" :  "zhichi quangongneng nfc,shouji zhong de jianjiji","date": "2021-06-02","price" :  4999,"tags": [ "xingjiabi", "fashao", "gongjiaoka" ]
}
PUT /product/_doc/3
{"name" : "nfc phone","desc" :  "shouji zhong de hongzhaji","date": "2021-06-03","price" :  2999,"tags": [ "xingjiabi", "fashao", "menjinka" ]
}
PUT /product/_doc/4
{"name" : "xiaomi erji","desc" :  "erji zhong de huangmenji","date": "2021-04-15","price" :  999,"tags": [ "low", "bufangshui", "yinzhicha" ]
}
PUT /product/_doc/5
{"name" : "hongmi erji","desc" :  "erji zhong de kendeji 2021-06-01","date": "2021-04-16","price" :  399,"tags": [ "lowbee", "xuhangduan", "zhiliangx" ]
}
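PUT /product2/_doc/1
{"owner":{"name":"zhangsan","sex":"男","age":18},"name": "hongmi erji","desc": "erji zhong de kendeji","price": 399,"tags": ["lowbee","xuhangduan","zhiliangx"]
}
# Alternative: create product2 with source filtering defined in its mapping.
# This only works while product2 does not exist yet, so run it before indexing
# the document above (or after DELETE product2).
PUT product2
{"mappings": {"_source": {"includes": ["name","price"],"excludes": ["desc","tags"]}}
}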

