  1. Apache Drill
  2. DRILL-7735

Query against empty parquet file fails with: IndexOutOfBoundsException: Index: 0, Size: 0


    Details

    • Type: Bug
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 1.17.0
    • Fix Version/s: None
    • Component/s: Server, Storage - Parquet
    • Labels:
      None
    • Environment:

      64 GB machine running on AWS.

      Description

      Running a `SELECT *` query against an empty Parquet file (i.e. one with correct column metadata written, but no rows) triggers an `IndexOutOfBoundsException`.

      I have an empty Parquet file with the following schema:

      $ parquet-tools schema dispute.parquet
      message parquet_go_root {
        required int32 dispute_id (INT_32) = 0;
        required binary title (UTF8) = 0;
        optional int32 start_date (DATE) = 0;
        optional int32 end_date (DATE) = 0;
        optional binary docket_number (UTF8) = 0;
        required binary route (UTF8) = 0;
        required binary jurisdiction (UTF8) = 0;
      }
      

      If I then run the following query via the Drill web UI:

      SELECT * FROM dfs.`/data/dispute.parquet`
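
For anyone who wants to reproduce this without the web UI, the same query can probably be submitted through Drill's REST endpoint. The following is a minimal sketch, not taken from the original report: it assumes the drillbit's web port is the default 8047 on localhost, and the helper names `build_query_request` and `run_query` are made up for illustration.

```python
import json
import urllib.request

# Assumption: the drillbit's web/REST port is the default 8047 on localhost.
DRILL_URL = "http://localhost:8047/query.json"


def build_query_request(sql, url=DRILL_URL):
    """Build (but do not send) a POST request for Drill's REST query endpoint."""
    payload = json.dumps({"queryType": "SQL", "query": sql}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def run_query(sql, url=DRILL_URL):
    """Send the query and return the decoded JSON response body.

    urllib raises urllib.error.HTTPError if the server answers with an
    error status, which is what a failing query is expected to produce.
    """
    with urllib.request.urlopen(build_query_request(sql, url)) as resp:
        return json.load(resp)


# Usage (requires a running drillbit, so it is left commented out):
# print(run_query("SELECT * FROM dfs.`/data/dispute.parquet`"))
```

Building the request separately from sending it makes it possible to inspect the payload without a running drillbit.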
      

      then I get the following error from Drill:

      org.apache.drill.common.exceptions.UserRemoteException: SYSTEM ERROR: IndexOutOfBoundsException: Index: 0, Size: 0 Please, refer to logs for more information. 
      
      [Error Id: a93e1aa1-a7e6-4bc9-9f11-c42b9f6fe108 on e531a6492cf4:31010]
      

      The expected result is an empty result set (i.e. 0 rows).

       

      I've attached the Parquet file in question and the relevant entries from drillbit.log.

            People

            • Assignee: Unassigned
            • Reporter: Dave Challis (suicas)
