Skip to content

SQL: implement COUNT(ALL|DISTINCT expression) to handle non-null values #30285

@astefan

Description

@astefan

At the moment, there is no difference in handling between COUNT(*) and COUNT(expression) when it comes to non-null values (ie all values are considered). Whereas the COUNT(ALL|DISTINCT expression) should consider only non null values.

The current implementation generates and uses a query like the following to perform this select select author,count(ip_addr) from ip_index group by author having count(ip_addr)>1:

{
  "size": 0,
  "_source": false,
  "stored_fields": "_none_",
  "aggregations": {
    "groupby": {
      "composite": {
        "size": 1000,
        "sources": [
          {
            "1562": {
              "terms": {
                "field": "author.keyword",
                "order": "asc"
              }
            }
          }
        ]
      },
      "aggregations": {
        "1572": {
          "bucket_selector": {
            "buckets_path": {
              "a0": "_count"
            },
            "script": {
              "source": "params.a0 > params.v0",
              "lang": "painless",
              "params": {
                "v0": 1
              }
            },
            "gap_policy": "skip"
          }
        }
      }
    }
  }
}

where the relevant part is "buckets_path": { "a0": "_count" } which simply counts the documents. To also consider non-null values something like this should work:

{
  "size": 0,
  "_source": false,
  "stored_fields": "_none_",
  "aggregations": {
    "groupby": {
      "composite": {
        "size": 1000,
        "sources": [
          {
            "1395": {
              "terms": {
                "field": "author.keyword",
                "order": "asc"
              }
            }
          }
        ]
      },
      "aggregations": {
        "non_null_values": {
          "filter": {
            "exists": {
              "field": "ip_addr"
            }
          }
        },
        "1406": {
          "bucket_selector": {
            "buckets_path": {
              "a0": "non_null_values._count"
            },
            "script": {
              "source": "params.a0 == params.v0",
              "lang": "painless",
              "params": {
                "v0": 1
              }
            },
            "gap_policy": "skip"
          }
        }
      }
    }
  }
}

Metadata

Metadata

Assignees

Type

No type

Projects

No projects

Milestone

No milestone

Relationships

None yet

Development

No branches or pull requests

Issue actions