IMPALA-2376

Scan of array value with 100m elements with reasonable mem limit hits DCHECK.


Details

    • Type: Bug
    • Status: Resolved
    • Priority: Blocker
    • Resolution: Fixed
    • Impala 2.3.0
    • None
    • None

    Description

      When run without a mem limit, the query below needs roughly 2.4g of memory in the scan.
      I expected a "mem limit exceeded" error when running the same query with a mem limit below 2.4g. Instead, we hit a DCHECK in the scanner.

      Repro:
      1. Grab Parquet file from here:
      vd0212.halxg.cloudera.com:/data/1/huge_array_parquet/100m_array.parq
      2. Copy the file to HDFS and create the table from it with CREATE TABLE ... LIKE PARQUET
      3. The query below runs fine without a mem limit:

      select cnt from huge_array_table t, (select count(item) cnt from t.f) v;
      

      4. Set the mem limit to 1g and run the query again. You will hit this DCHECK:

      hdfs-parquet-scanner.cc:1299] Check failed: !parse_status_.ok()
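
      For reference, steps 2 to 4 can be sketched roughly as the impala-shell session below. This is only a sketch: the HDFS path is a hypothetical placeholder (use wherever the file was actually copied), and it assumes the file's array column is named f, as the query above implies.

```sql
-- Sketch of repro steps 2-4. The HDFS path is a hypothetical placeholder.

-- Step 2: derive the table schema from the Parquet file,
-- then move the file into the table's directory.
CREATE TABLE huge_array_table
  LIKE PARQUET '/tmp/huge_array_parquet/100m_array.parq'
  STORED AS PARQUET;
LOAD DATA INPATH '/tmp/huge_array_parquet/100m_array.parq'
  INTO TABLE huge_array_table;

-- Step 3: no mem limit (0 means unlimited); the query runs fine.
SET MEM_LIMIT=0;
select cnt from huge_array_table t, (select count(item) cnt from t.f) v;

-- Step 4: with a 1g limit the same query is expected to fail with a
-- "mem limit exceeded" error, but hits the DCHECK instead.
SET MEM_LIMIT=1g;
select cnt from huge_array_table t, (select count(item) cnt from t.f) v;
```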
      


          People

            skye Skye Wanderman-Milne
            alex.behm Alexander Behm