  1. Jackrabbit Oak
  2. OAK-9899

Optimize in(x, y, z) constraints to use the terms ES query instead of a union of range queries


Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Component/s: indexing

    Description

      The Elastic plugin converts the in(x, y, z) constraint of SQL2 queries into a series of Elasticsearch range queries, one for each element in the set. For instance, this query:

      select [jcr:path] from [nt:base] where [propa] in('2', '3', '5', '7')
      

      is implemented with the following Elasticsearch query:

      "query": {
        "bool": {
          "filter": [{
            "bool": {
              "should": [
                {"range": {"propa": {"gte": "2","lte": "2"}}},
                {"range": {"propa": {"gte": "3","lte": "3"}}},
                {"range": {"propa": {"gte": "5","lte": "5"}}},
                {"range": {"propa": {"gte": "7","lte": "7"}}}
              ]
            }
          }]
        }
      }
      

      The resulting Elasticsearch query can become very large when the set in the in constraint is large. This produces a large JSON request, which may be slow to generate, send, and parse, and potentially a long execution time. In the worst case, the query can hit Elasticsearch size limits and fail.
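
      To get a feel for how the request grows, here is a minimal Python sketch that builds the range-based query shape shown above for an arbitrary set of values and measures the serialized JSON. The helper name in_as_range_union is hypothetical, and this is not the Elastic plugin's actual code; it only mirrors the query structure from this description.

```python
import json


def in_as_range_union(field, values):
    """Build one range clause per value, as in the query shown above.

    Hypothetical helper for illustration only; not the Oak Elastic plugin's code.
    """
    should = [{"range": {field: {"gte": v, "lte": v}}} for v in values]
    return {"query": {"bool": {"filter": [{"bool": {"should": should}}]}}}


small = in_as_range_union("propa", ["2", "3", "5", "7"])
large = in_as_range_union("propa", [str(i) for i in range(10_000)])

# The number of should-clauses equals the number of elements in the in(...) set.
small_clauses = small["query"]["bool"]["filter"][0]["bool"]["should"]
large_clauses = large["query"]["bool"]["filter"][0]["bool"]["should"]
print(len(small_clauses), len(large_clauses))

# Size of the serialized request body, in characters.
print(len(json.dumps(small)), len(json.dumps(large)))
```

      Both the clause count and the request size grow linearly with the number of values, and every value is repeated twice inside its own JSON object.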

      The same in constraint can be expressed much more efficiently using the Elasticsearch terms query:

      {
        "bool": {
          "filter": {
            "terms": {
              "propa": ["2", "3", "5", "7"]
            }
          }
        }
      }
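
      As a rough illustration of the proposed shape, the following Python sketch builds the terms-based filter for a set of values and compares the serialized cost of one value in each form. The helper name in_as_terms is hypothetical and is not taken from the Elastic plugin; the sketch only reproduces the JSON structure shown above.

```python
import json


def in_as_terms(field, values):
    """Build a single terms filter holding all values in one flat list.

    Hypothetical helper for illustration only; not the Oak Elastic plugin's code.
    """
    return {"bool": {"filter": {"terms": {field: list(values)}}}}


query = in_as_terms("propa", ["2", "3", "5", "7"])
print(json.dumps(query, indent=2))

# Serialized cost of a single value: a bare string in the terms list
# versus a complete range clause that repeats the value twice.
per_value_terms = len(json.dumps("7"))
per_value_range = len(json.dumps({"range": {"propa": {"gte": "7", "lte": "7"}}}))
print(per_value_terms, per_value_range)
```

      With the terms form, the field name appears once and each additional value adds only its own serialized string to the request.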
      

            People

              Assignee: Unassigned
              Reporter: Nuno Santos (nuno.santos)