Apache Drill / DRILL-2290

Very slow performance for a query involving nested map

Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Resolved
    • Affects Version/s: 0.8.0
    • Fix Version/s: Future
    • Component/s: Execution - Data Types
    • Labels: None

    Description

      #Thu Feb 19 18:40:10 EST 2015
      git.commit.id.abbrev=1ceddff

      This query took about 17 minutes (1020 seconds) to complete, which is far too long. I think the slowdown was introduced by the fix dealing with nested maps.

      0: jdbc:drill:schema=dfs.drillTestDirComplexJ> select b.id, a.ooa[1].fl.f1, b.oooi, a.ooof.oa.oab.oabc from `complex.json` a inner join `complex.json` b on a.ooa[1].fl.f1=b.ooa[1].fl.f1 order by b.id limit 20;
      +------------+------------+------------+------------+
      |     id     |   EXPR$1   |    oooi    |   EXPR$3   |
      +------------+------------+------------+------------+
      | 1          | 1.6789     | {"oa":{"oab":{"oabc":1}}} | 1.5678     |
      | 3          | 3.6789     | {"oa":{"oab":{"oabc":3}}} | 3.5678     |
      | 4          | 4.6789     | {"oa":{"oab":{"oabc":4}}} | 4.5678     |
      | 5          | 5.6789     | {"oa":{"oab":{"oabc":5}}} | 5.5678     |
      | 7          | 7.6789     | {"oa":{"oab":{"oabc":7}}} | 7.5678     |
      | 9          | 9.6789     | {"oa":{"oab":{"oabc":9}}} | 9.5678     |
      | 10         | 10.6789    | {"oa":{"oab":{"oabc":10}}} | 10.5678    |
      | 11         | 11.6789    | {"oa":{"oab":{"oabc":11}}} | 11.5678    |
      | 13         | 13.6789    | {"oa":{"oab":{"oabc":13}}} | 13.5678    |
      | 14         | 14.6789    | {"oa":{"oab":{"oabc":14}}} | 14.5678    |
      | 15         | 15.6789    | {"oa":{"oab":{"oabc":15}}} | 15.5678    |
      | 16         | 16.6789    | {"oa":{"oab":{"oabc":16}}} | 16.5678    |
      | 17         | 17.6789    | {"oa":{"oab":{"oabc":17}}} | 17.5678    |
      | 18         | 18.6789    | {"oa":{"oab":{"oabc":18}}} | 18.5678    |
      | 19         | 19.6789    | {"oa":{"oab":{"oabc":19}}} | 19.5678    |
      | 20         | 20.6789    | {"oa":{"oab":{"oabc":20}}} | 20.5678    |
      | 21         | 21.6789    | {"oa":{"oab":{"oabc":21}}} | 21.5678    |
      | 22         | 22.6789    | {"oa":{"oab":{"oabc":22}}} | 22.5678    |
      | 24         | 24.6789    | {"oa":{"oab":{"oabc":24}}} | 24.5678    |
      | 25         | 25.6789    | {"oa":{"oab":{"oabc":25}}} | 25.5678    |
      +------------+------------+------------+------------+
      20 rows selected (1020.036 seconds)
      

      The query deals with just under 1 million records (the join produces 900,190 rows, as counted below), so it should not be this slow.

      0: jdbc:drill:schema=dfs.drillTestDirComplexJ> select count(*) from (select b.id, a.ooa[1].fl.f1, b.oooi, a.ooof.oa.oab.oabc from `complex.json` a inner join `complex.json` b on a.ooa[1].fl.f1=b.ooa[1].fl.f1) c;
      +------------+
      |   EXPR$0   |
      +------------+
      | 900190     |
      +------------+
      1 row selected (700.516 seconds)
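
      One way to narrow this down, offered as a sketch and not part of the original report: Drill's EXPLAIN PLAN FOR prints the plan for a statement without executing it, which may show which join operator was chosen and where the nested-map expressions are evaluated. That the plan output is detailed enough in this build to locate the slowdown is an assumption.

      ```sql
      -- Sketch only: same statement as the slow query above, prefixed with
      -- EXPLAIN PLAN FOR so that Drill returns the plan instead of running it.
      explain plan for
      select b.id, a.ooa[1].fl.f1, b.oooi, a.ooof.oa.oab.oabc
      from `complex.json` a
      inner join `complex.json` b
        on a.ooa[1].fl.f1 = b.ooa[1].fl.f1
      order by b.id
      limit 20;
      ```

      Comparing this plan with one from a build that predates the nested-map fix, if one is available, would help confirm or rule out that fix as the cause.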
      

          People

            Assignee: Unassigned
            Reporter: Chun Chang (cchang@maprtech.com)