  1. Calcite
  2. CALCITE-5124

LIMIT does not work when grouping by two or more columns in the Elasticsearch adapter


Details

    • Type: Bug
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 1.30.0
    • Fix Version/s: None
    • Component/s: elasticsearch-adapter
    • Labels: None

    Description

      Add one more document (such as doc4 below) to AggregationTest:

      String doc4 = "{val1:1, cat4:'2018-01-02'}";
      

      Then run the following test case:

      @Test void dateCat2() {
        CalciteAssert.that()
            .with(AggregationTest::createConnection)
            .query("select val1, cat4 from view group by val1, cat4 limit 2")
            .returnsUnordered("val1=1; cat4=1514764800000",
                "val1=1; cat4=1514851200000",
                "val1=null; cat4=1576108800000");
      }
      

      The query returns three rows, so the LIMIT 2 in the SQL has no effect. The generated Elasticsearch query is:

      {
        "_source": false,
        "size": 0,
        "stored_fields": "_none_",
        "aggregations": {
          "g_val1": {
            "terms": {
              "field": "val1",
              "missing": -9223372036854775808,
              "size": 2
            },
            "aggregations": {
              "g_cat4": {
                "terms": {
                  "field": "cat4",
                  "missing": 253402214400000,
                  "size": 2
                }
              }
            }
          }
        }
      }
      

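      The query contains two nested terms aggregations, each with size 2. However, size only caps the number of buckets returned by its own aggregation. When one terms aggregation is nested inside another, each outer bucket can contribute up to size inner buckets, so the total number of result rows can reach the product of the sizes (here 2 × 2 = 4) and the overall LIMIT is not enforced.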
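
The effect can be reproduced without Elasticsearch. The sketch below is only an illustration, not the adapter's code. It assumes three hypothetical documents chosen to match the rows reported above, represents the missing val1 as the string "null", and keeps buckets in insertion order (real terms aggregations order by document count by default). The class and method names are made up for this example. It applies size 2 at each nesting level and then shows that truncating the flattened rows is one possible way to honor the limit.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical, self-contained simulation of two nested "terms" buckets.
class NestedTermsLimitSketch {

  /** Illustrative documents as {val1, cat4}; "null" stands for a missing val1. */
  public static List<String[]> sampleDocs() {
    List<String[]> docs = new ArrayList<>();
    docs.add(new String[] {"1", "1514764800000"});
    docs.add(new String[] {"1", "1514851200000"});
    docs.add(new String[] {"null", "1576108800000"});
    return docs;
  }

  /** Groups by val1, then by cat4, keeping at most {@code size} buckets per level. */
  public static List<String> nestedTerms(List<String[]> docs, int size) {
    // Outer level: val1 -> (inner level: cat4 -> document count).
    Map<String, Map<String, Integer>> outer = new LinkedHashMap<>();
    for (String[] doc : docs) {
      outer.computeIfAbsent(doc[0], k -> new LinkedHashMap<>())
          .merge(doc[1], 1, Integer::sum);
    }
    // The size cap is applied per aggregation, never to the flattened result.
    List<String> rows = new ArrayList<>();
    outer.entrySet().stream().limit(size).forEach(o ->
        o.getValue().keySet().stream().limit(size).forEach(c ->
            rows.add("val1=" + o.getKey() + "; cat4=" + c)));
    return rows;
  }

  /** One possible remedy: truncate the flattened rows to the SQL limit. */
  public static List<String> limitRows(List<String> rows, int limit) {
    return new ArrayList<>(rows.subList(0, Math.min(limit, rows.size())));
  }

  public static void main(String[] args) {
    List<String> rows = nestedTerms(sampleDocs(), 2);
    System.out.println("rows with per-bucket size 2: " + rows.size());
    System.out.println("rows after truncation: " + limitRows(rows, 2).size());
  }
}
```

With this sample data the per-bucket cap yields three rows (two under val1=1, one under the missing val1), while truncating the flattened list yields two. Whether the adapter should truncate after flattening or generate a different aggregation is left open here.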

          People

            Assignee: Unassigned
            Reporter: VAE ZheHu