The best way to check whether a field exists in an Elasticsearch document (Elasticsearch: query all documents)

25-03-11 8


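This article covers the best way to check whether a field exists in an Elasticsearch document and touches on querying all documents in Elasticsearch. It also collects related material on querying documents between two range fields on a string, comparing a nested field with another field in the same document, checking whether a document exists, and checking whether an index exists. We hope you find it helpful.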

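Contents:

The best way to check whether a field exists in an Elasticsearch document (Elasticsearch: query all documents)
Elasticsearch: querying documents between two range fields on a string
elasticsearch: comparing a nested field with another field in the document
elasticsearch: checking whether a document exists
elasticsearch: checking whether an index exists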

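The best way to check whether a field exists in an Elasticsearch document (Elasticsearch: query all documents)

This may be a very silly question, but what is the best way to check whether a field exists in a document in Elasticsearch? I could not find anything in the documentation.

For example, if a document does not have the field/key "price", I do not want it returned in the results.

{
  "updated": "2015/09/17 11:27:27",
  "name": "Eye Shadow",
  "format": "1.5 g / 0.05 oz"
}

What can I do?

Thanks

Answer 1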

小编典典

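You can combine the exists filter with a bool/must filter as follows. Note that this is the filtered query syntax of Elasticsearch 1.x; the filtered query was deprecated in 2.0 and removed in 5.0, where the exists clause goes in the filter list of a bool query instead.

{
  "query": {
    "filtered": {
      "filter": {
        "bool": {
          "must": [
            {
              "exists": {
                "field": "price"
              }
            },
            ...     <-- your other constraints, if any
          ]
        }
      }
    }
  }
}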

Deprecated (the missing filter was removed in ES 5):
You can also combine the missing filter with a bool/must_not filter:

{
  "query": {
    "filtered": {
      "filter": {
        "bool": {
          "must_not": [
            {
              "missing": {
                "field": "price"
              }
            }
          ]
        }
      }
    }
  }
}
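
On newer Elasticsearch versions the same check is usually written as an exists clause in the filter list of a bool query. The following is a minimal Python sketch that only builds the request body as a dictionary and does not talk to a cluster. The helper name build_exists_query and its extra_filters parameter are made up for illustration and are not part of any Elasticsearch client.

```python
import json


def build_exists_query(field, extra_filters=None):
    """Build a search body that matches only documents having a value for `field`.

    `extra_filters` is an optional list of further filter clauses
    (hypothetical parameter, standing in for "your other constraints").
    """
    filters = [{"exists": {"field": field}}]
    if extra_filters:
        filters.extend(extra_filters)
    # Clauses in bool/filter must all match and do not affect scoring.
    return {"query": {"bool": {"filter": filters}}}


# Print the body for the "price" example from the question.
print(json.dumps(build_exists_query("price"), indent=2))
```

The resulting dictionary can be sent as the JSON body of a search request with whatever HTTP or Elasticsearch client you already use.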

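Elasticsearch: querying documents between two range fields on a string

How do you query documents between two range fields on a string in Elasticsearch?

I have a log file stored in Elasticsearch, where each document is one line of the file. Blocks of messages start and end with certain keywords. I want to get all the documents that lie between the documents containing those keywords. Is there a way to use a range query/range filter in Elasticsearch on a text field?

Sample log file: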
...
...
xyz foo "keyword1" .....
..
....
...
xyz bar "keyword2" .....
..
..

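I want to query all documents between "keyword1" and "keyword2", including the documents that contain the keywords themselves. Assume there are multiple such blocks, each with a "keyword1" and a "keyword2".

In addition, I am updating the documents that contain these keywords with a new field, test_field, whose value is the keyword. Can this new field be used in a range filter to accomplish the task above?

Elasticsearch fields: _source: { "log_line", "test_field" }

Solution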

I assume you also have some identifier that defines the order of these documents. Suppose you have a field line_number.

You can first run two searches that match all documents containing the keywords. Then, for each pair of keywords, you have a start line number and an end line number. For each pair, you can search for all documents between the two line numbers (using line_number). This is not a pure ES solution; it needs some scripting, for example in Python or any other language. Let me know if you need help with the queries.

But before doing something like this, if I were you I would critically question the requirement. Why read the log file into ES line by line? Why not use Logstash/Filebeat to load the data in a pattern of your choosing, so that you end up with one document containing the whole block? That makes querying and analysis much easier :)
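
The client-side part of that approach might look like the sketch below. It assumes the two keyword searches have already returned lists of line_number values. The function names pair_blocks and build_range_query, and the sample line numbers, are hypothetical; only the range query with gte/lte is standard Elasticsearch query DSL.

```python
def pair_blocks(start_lines, end_lines):
    """Pair each block start with the first block end at or after it.

    Starts that fall inside an already paired block are ignored,
    and a start without any later end is dropped.
    """
    pairs = []
    ends = sorted(end_lines)
    j = 0
    for start in sorted(start_lines):
        if pairs and start <= pairs[-1][1]:
            continue  # this start lies inside the previous block
        while j < len(ends) and ends[j] < start:
            j += 1  # skip ends that come before this start
        if j == len(ends):
            break  # no closing keyword left
        pairs.append((start, ends[j]))
        j += 1
    return pairs


def build_range_query(start, end, field="line_number"):
    """Build a search body for all lines in [start, end], both ends included."""
    return {"query": {"range": {field: {"gte": start, "lte": end}}}}


# Made-up line numbers standing in for the hits of the two keyword searches.
for start, end in pair_blocks([3, 20], [9, 31]):
    print(build_range_query(start, end))
```

Each generated body is then sent as a separate search, one per block.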

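elasticsearch: comparing a nested field with another field in the document

How do you compare a nested field with another field in the document in Elasticsearch?

With a nested search, you are searching the nested objects without the parent object. Unfortunately, there is no hidden join that you can apply with nested objects.

At least for now, this means you do not receive both the "parent" document and the nested document in the script. You can confirm this by replacing your script with each of the following and testing the result: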

# Parent Document does not exist
"script": {
  "script": "doc['primary_content_type_id'].value == 12"
}

# nested Document should exist
"script": {
  "script": "doc['content.content_type_id'].value == 12"
}

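You can do this in a less performant way by looping over the objects yourself (rather than having ES do it for you natively with nested). This means you would have to reindex the document and its nested documents as a single document for it to work. Given the way you are trying to use it, this is probably not much different and may even perform better (especially in the absence of an alternative).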

# This assumes that your default scripting language is Groovy (default in 1.4)
# Note1: "find" will loop across all of the values, but it will
#  appropriately short circuit if it finds any!
# Note2: It would be preferable to use doc throughout, but since we need the
#  arrays (plural!) to be in the _same_ order, then we need to parse the
#  _source. This inherently means that you must _store_ the _source, which
#  is the default. Parsing the _source only happens on the first touch.
"script": {
  "script": "_source.content.find { it.content_type_id == _source.primary_content_type_id && ! it.assigned } != null",
  "_cache" : true
}

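I cached the result because nothing dynamic is happening here (for instance, no dates are being compared to now), so it is quite safe to cache, which makes future lookups much faster. Most filters are cached by default, but scripts are one of the few exceptions.

Because both values have to be compared to be sure the correct inner object was found, you are duplicating some work, but that is practically unavoidable. Keeping the term filter will most likely outperform running this check without it.

Original question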
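
To make the logic of the Groovy one-liner easier to follow, here is the same predicate written as plain Python over an already parsed _source dictionary. This only illustrates what the script checks and is not something Elasticsearch runs. The function name and the sample document are made up for this sketch, with field names taken from the question.

```python
def has_unassigned_primary_content(source):
    """Return True if any content item matches the primary type and is not assigned.

    Mirrors the Groovy expression:
    _source.content.find { it.content_type_id == _source.primary_content_type_id
                           && ! it.assigned } != null
    """
    primary = source["primary_content_type_id"]
    # any() stops at the first match, like Groovy's find.
    return any(
        item["content_type_id"] == primary and not item["assigned"]
        for item in source.get("content", [])
    )


# Sample document shaped like the one in the question.
sample = {
    "id": 123,
    "primary_content_type_id": 12,
    "content": [
        {"id": 4, "content_type_id": 1, "assigned": True},
        {"id": 5, "content_type_id": 12, "assigned": False},
    ],
}
print(has_unassigned_primary_content(sample))
```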

I need to compare two fields in the same document; the actual values do not matter. Consider the following document:

_source: {
    id: 123,
    primary_content_type_id: 12,
    content: [
        {
            id: 4,
            content_type_id: 1,
            assigned: true
        },
        {
            id: 5,
            content_type_id: 12,
            assigned: false
        }
    ]
}

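I need to find all documents whose primary content is not assigned. I cannot find a way to compare primary_content_type_id with the nested content.content_type_id to make sure they hold the same value. Here is what I tried using a script. I do not think I understand scripts well, but this may be one way to solve the problem: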

{
    "filter": {
        "nested": {
            "path": "content",
            "filter": {
                "bool": {
                    "must": [
                        {
                            "term": {
                                "content.assigned": false
                            }
                        },{
                            "script": {
                                "script": "primary_content_type_id==content.content_type_id"
                            }
                        }
                    ]
                }
            }
        }
    }
}

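Note that it works fine if I remove the script part of the filter, replace it with another term filter on content_type_id = 12, and add a further filter on primary_content_id = 12. The problem is that I will not know (nor does it matter for my use case) what the values of primary_content_type_id or content.content_type_id are. All that matters is that assigned is false for the content whose content_type_id matches primary_content_type_id.

Is this check possible with Elasticsearch?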

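elasticsearch: checking whether a document exists

Checking whether a document exists

If all you want to do is check whether a document exists, and you are not interested in its content at all, use the HEAD method instead of GET. A HEAD request returns no response body, only the HTTP headers: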

curl -i -XHEAD http://localhost:9200/website/blog/123

Elasticsearch returns a 200 OK status if the document exists:

HTTP/1.1 200 OK
Content-Type: text/plain; charset=UTF-8
Content-Length: 0

If it does not exist, it returns 404 Not Found:

curl -i -XHEAD http://localhost:9200/website/blog/124

HTTP/1.1 404 Not Found
Content-Type: text/plain; charset=UTF-8
Content-Length: 0

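Of course, this only tells you that the document did not exist at the moment you checked; it does not mean it will still be missing a few milliseconds later. Another process may create the document in the meantime.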
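
The same check can be scripted. Below is a minimal sketch using only Python's standard library: it sends a HEAD request and maps the status code to a boolean. The function names are made up for illustration, and the URL in the usage comment is simply the one from the curl example, so adjust it for your own cluster.

```python
import urllib.error
import urllib.request


def status_means_exists(status):
    """Translate the HTTP status of a HEAD request into True or False."""
    if status == 200:
        return True
    if status == 404:
        return False
    # Anything else (401, 500, ...) says nothing about existence.
    raise ValueError(f"unexpected HTTP status: {status}")


def document_exists(url, timeout=5.0):
    """Send a HEAD request to `url` and report whether the resource exists."""
    request = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return status_means_exists(response.status)
    except urllib.error.HTTPError as error:
        # urllib raises HTTPError for 4xx/5xx responses, including 404.
        return status_means_exists(error.code)


# Usage, assuming a node is listening on localhost:9200:
# document_exists("http://localhost:9200/website/blog/123")
```

Raising on unexpected status codes is a deliberate choice here, so that an authentication or server error is not mistaken for a missing document.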

elasticsearch: checking whether an index exists

1. Checking whether an index exists
Given an index name, check whether that index exists in the cluster


/**
 * Checks whether the index with the given name exists.
 * @param indexName index name
 * @return true if it exists, false otherwise
 */
public boolean isExistsIndex(String indexName) {
    IndicesExistsResponse response =
            getClient().admin().indices().exists(
                    new IndicesExistsRequest().indices(new String[]{indexName})).actionGet();
    return response.isExists();
}

2. Checking whether a type exists in an index

/**
 * Checks whether the given type exists in the given index.
 * @param indexName index name
 * @param indexType index type
 * @return true if it exists, false otherwise
 */
public boolean isExistsType(String indexName, String indexType) {
    TypesExistsResponse response =
            getClient().admin().indices()
            .typesExists(new TypesExistsRequest(new String[]{indexName}, indexType)
            ).actionGet();
    // Print the serialized response for inspection
    System.out.println(FastJSONHelper.serialize(response));
    return response.isExists();
}

JSON output:
{
    "context":{
        "empty":true
    },
    "contextEmpty":true,
    "exists":true,
    "headers":[]
}
