  1. Apache Drill
  2. DRILL-5105

Query time increases exponentially with increasing nested levels

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 1.9.0
    • Fix Version/s: None
    • Component/s: Storage - JSON
    • Labels:
    • Environment:

      3 Node Cluster with default memory and configurations.

      Description

      The time taken to query a JSON dataset depends on the number of nested levels within the dataset. Increasing the complexity of the dataset further increases the execution time.

      The table below lists cached query execution times for a simple select * query over two simple forms of JSON datasets:

      No. Levels   Time (s), Dataset 1   Time (s), Dataset 2
      1            0.22                  0.27
      2            0.23                  0.25
      4            0.24                  0.22
      8            0.22                  0.23
      16           0.34                  0.48
      24           25.76                 72.51
      26           103.48                289.6
      28           336.12                1151.94
      30           1342.22               4586.79
      32           5360.2                Expected: ~20k

      The table above lists query times for 20 JSON files, 10 belonging to Dataset 1 and 10 to Dataset 2. Each file has one record, but the number of nested levels varies as given in the "No. Levels" column.

      It appears that the query time almost doubles with each additional nested level. (In the table above, rows from level 24 onward are two levels apart, so this shows up as an almost 4x increase from one row to the next.)
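
      The doubling estimate can be checked against the table with a short script. This is only a sketch: the timings are copied from the table above, and the helper name per_level_growth is made up for illustration. It takes the ratio between consecutive measured levels and converts it to a per-level factor.

```python
# Query times in seconds, copied from the table above, keyed by nesting level.
# Only the measured values from level 24 onward are used.
dataset1 = {24: 25.76, 26: 103.48, 28: 336.12, 30: 1342.22, 32: 5360.2}
dataset2 = {24: 72.51, 26: 289.6, 28: 1151.94, 30: 4586.79}


def per_level_growth(times):
    """Return {level: growth factor per added level} for consecutive rows."""
    levels = sorted(times)
    factors = {}
    for lo, hi in zip(levels, levels[1:]):
        ratio = times[hi] / times[lo]
        # Rows are (hi - lo) levels apart, so take the matching root
        # to get the average factor for a single added level.
        factors[hi] = ratio ** (1.0 / (hi - lo))
    return factors


for name, times in (("Dataset 1", dataset1), ("Dataset 2", dataset2)):
    for level, factor in per_level_growth(times).items():
        print(f"{name}: up to level {level}: ~{factor:.2f}x per added level")
```

      A per-level factor close to 2 for every row would be consistent with the doubling described above.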

      The two representative datasets below show simple JSON structures with nested levels.

      Structure of Dataset 1:

      {
        "level1": {
          "field1": "a",
          "level2": {
            "field1": "b",
            ...
          }
        }
      }
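
      To reproduce the measurements, a file with this shape can be generated for any depth. The following is a minimal sketch, not the script used for this report: the function name build_dataset1 is hypothetical, and the field1 values beyond "a" and "b" (cycling through the alphabet) are an assumption, since the sample above elides the deeper levels.

```python
import json


def build_dataset1(levels):
    """Build one record nested `levels` deep, shaped like Dataset 1."""
    record = {}
    node = record
    for depth in range(1, levels + 1):
        # Assumed value pattern: "a", "b", "c", ... wrapping after "z".
        child = {"field1": chr(ord("a") + (depth - 1) % 26)}
        node[f"level{depth}"] = child
        node = child  # descend so the next level nests inside this one
    return record


# Print a two-level record; json.dumps can be redirected to a file as needed.
print(json.dumps(build_dataset1(2), indent=2))
```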
      

      Structure of Dataset 2:

      {
        "level1": {
          "field1": "a",
          "field2": {
            "nfield1": true,
            "nfield2": 1.1
          },
          "level2": {
            "field1": "b",
            "field2": {
              "nfield1": false,
              "nfield2": 2.2
            },
            ...
          }
        }
      }
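
      A record with this richer shape can be generated the same way. Again this is only a sketch under stated assumptions: build_dataset2 is a hypothetical name, and the values past level 2 (alternating nfield1, nfield2 growing by 1.1 per level, field1 cycling through the alphabet) are guesses extrapolated from the two levels shown.

```python
import json


def build_dataset2(levels):
    """Build one record nested `levels` deep, shaped like Dataset 2."""
    record = {}
    node = record
    for depth in range(1, levels + 1):
        child = {
            # Assumed value patterns, extrapolated from levels 1 and 2.
            "field1": chr(ord("a") + (depth - 1) % 26),
            "field2": {
                "nfield1": depth % 2 == 1,         # true, false, true, ...
                "nfield2": round(depth * 1.1, 1),  # 1.1, 2.2, 3.3, ...
            },
        }
        node[f"level{depth}"] = child
        node = child  # the next level is added after field1 and field2
    return record


print(json.dumps(build_dataset2(2), indent=2))
```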
      

              People

              • Assignee:
                cshi Chunhui Shi
                Reporter:
                agirish Abhishek Girish
                Reviewer:
                Paul Rogers
              • Votes:
                0
                Watchers:
                3

                Dates

                • Created:
                  Updated:
                  Resolved: