Apache Drill / DRILL-844

java.lang.IndexOutOfBoundsException while querying a large data set

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Blocker
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.4.0
    • Component/s: Execution - Flow
    • Labels: None

      Description

      0: jdbc:drill:schema=dfs> SELECT columns[13] from `/user/root/cust-d1.tsv` where columns[13] like '%noticias%';
      ------------

      EXPR$0

      ------------
      java.lang.IndexOutOfBoundsException
      at io.netty.buffer.EmptyByteBuf.checkIndex(EmptyByteBuf.java:857)
      at io.netty.buffer.EmptyByteBuf.getBytes(EmptyByteBuf.java:321)
      at org.apache.drill.exec.vector.VarCharVector$Accessor.get(VarCharVector.java:325)
      at org.apache.drill.exec.vector.VarCharVector$Accessor.getObject(VarCharVector.java:345)
      at org.apache.drill.exec.vector.accessor.VarCharAccessor.getObject(VarCharAccessor.java:94)
      at org.apache.drill.jdbc.AvaticaDrillSqlAccessor.getObject(AvaticaDrillSqlAccessor.java:136)
      at net.hydromatic.avatica.AvaticaResultSet.getObject(AvaticaResultSet.java:336)
      at sqlline.SqlLine$Rows$Row.<init>(SqlLine.java:2388)
      at sqlline.SqlLine$IncrementalRows.hasNext(SqlLine.java:2504)
      at sqlline.SqlLine$TableOutputFormat.print(SqlLine.java:2148)
      at sqlline.SqlLine.print(SqlLine.java:1809)
      at sqlline.SqlLine$Commands.execute(SqlLine.java:3766)
      at sqlline.SqlLine$Commands.sql(SqlLine.java:3663)
      at sqlline.SqlLine.dispatch(SqlLine.java:889)
      at sqlline.SqlLine.begin(SqlLine.java:763)
      at sqlline.SqlLine.start(SqlLine.java:498)
      at sqlline.SqlLine.main(SqlLine.java:460)

      No error is logged in Lilith.

      sqlline hangs if you run just: select columns[13] from `/user/root/cust-d1.tsv` ;

      I have the data set; just ask me for its location.

        Activity

        Ramana Inukonda Nagaraj added a comment -

        Existing length tests should cover this.

        Vivian Summers added a comment -

        Verified that it works in the latest build.

        Jacques Nadeau added a comment -

        Fixed in the latest build.

        Vivian Summers added a comment -

        maxlen, minlen for each column:
        c0, 1, 1
        c1, 9, 9
        c2, 8, 1
        c3, 3, 1
        c4, 10, 8
        c5, 4, 2
        c6, 2097, 84
        c7, 1, 1
        c8, 84, 3
        c9, 38, 0
        c10, 36, 0
        c11, 6, 0
        c12, 29, 0
        c13, 1828, 0
        c14, 186, 0
        c15, 1096, 0
        c16, 4, 0
        c17, 0,
        c18, 0,
        c19, 0,
        c20, 0,
        c21, 0,
        c22, 0
        c23, 1, 1
        c24, 10, 4
        c25, 36, 36
        c26, 23, 23
        c27, 11, 11
        c28, 32, 32
        c29, 9, 1
        c30, 3, 1
        c31, 2, 0
        c32, 7, 0
        c33, 5, 4

        I'm not able to get the average length of each column; this doesn't work:
        0: jdbc:drill:schema=dfs> SELECT avg(length(columns[0])) as L1 from `/user/root/cust-d1.tsv` ;
        Query failed: org.apache.drill.exec.rpc.RpcException: Remote failure while running query.[error_id: "1213a1bb-df7e-47e6-914d-52e6fa838009"
        endpoint

        { address: "mfs101.qa.lab" user_port: 31010 control_port: 31011 data_port: 31012 }

        error_type: 0
        message: "Failure while setting up Foreman. < AssertionError:[ Internal error: while converting `length`(`columns`[0]) ] < InvocationTargetException < UnsupportedOperationException:[ class org.eigenbase.sql.SqlUnresolvedFunction: length ]"
        ]
        Error: exception while executing query (state=,code=0)

        Neither does max:
        0: jdbc:drill:schema=dfs> SELECT max(length(columns[0])) from `/user/root/cust-d1.tsv`;
        Query failed: org.apache.drill.exec.rpc.RpcException: Remote failure while running query.[error_id: "77b11ab6-950f-479c-838f-ae92fdd6e4c7"
        endpoint

        { address: "mfs101.qa.lab" user_port: 31010 control_port: 31011 data_port: 31012 }

        error_type: 0
        message: "Failure while setting up Foreman. < AssertionError:[ Internal error: while converting `length`(`columns`[0]) ] < InvocationTargetException < UnsupportedOperationException:[ class org.eigenbase.sql.SqlUnresolvedFunction: length ]"
        ]
        Error: exception while executing query (state=,code=0)

        Jacques Nadeau added a comment -

        Can you please give a length distribution of the file you are querying? E.g., what is min, max, median and avg length of each column.
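
        One way to gather the requested distribution without going through Drill is a small offline script. The following is a minimal sketch, assuming the file is plain tab-separated text with no quoting; the function name `column_length_stats` is hypothetical, and lengths are counted in characters, which may differ from byte lengths for non-ASCII data.

        ```python
        import statistics


        def column_length_stats(lines, delimiter="\t"):
            """Return {column_index: (min, max, median, mean)} of field lengths.

            Assumption: fields are split on a bare delimiter, with no quoting
            or escaping rules applied.
            """
            lengths = {}
            for line in lines:
                fields = line.rstrip("\r\n").split(delimiter)
                for i, field in enumerate(fields):
                    lengths.setdefault(i, []).append(len(field))
            return {
                i: (min(v), max(v), statistics.median(v), statistics.mean(v))
                for i, v in lengths.items()
            }


        # Tiny in-memory sample; a real run would pass an open file object instead.
        sample = ["a\tbb\tccc\n", "aaa\t\tc\n"]
        for index, (lo, hi, med, avg) in sorted(column_length_stats(sample).items()):
            print(f"c{index}: min={lo} max={hi} median={med} avg={avg}")
        ```

        Any iterable of lines works as input, so a file opened in text mode can be passed directly. Rows with fewer fields simply contribute nothing to the missing columns.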

        Vivian Summers added a comment -

        The data file is about 80M and has about 33 columns. Column 13 is a variable-length string. I'm not sure what its maximum length is, but I've seen values as long as 1000 characters.

        Yash Sharma added a comment -

        I tried the query on a 26 GB file and it works for me. What is the size of the data you are dealing with? Could you upload the file to a file-sharing site and send me the link?
        Also, the file might have fewer than 14 columns of data. Could you please verify?
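
        The column-count check suggested here could be done with a short script along these lines. This is only a sketch, assuming a plain tab-separated file without quoting; the helper names `field_count_histogram` and `rows_missing_column` are hypothetical.

        ```python
        from collections import Counter


        def field_count_histogram(lines, delimiter="\t"):
            """Count how many rows have each number of fields."""
            counts = Counter()
            for line in lines:
                counts[len(line.rstrip("\r\n").split(delimiter))] += 1
            return counts


        def rows_missing_column(lines, index, delimiter="\t"):
            """Return 1-based line numbers of rows that have no field at `index`."""
            return [
                number
                for number, line in enumerate(lines, start=1)
                if len(line.rstrip("\r\n").split(delimiter)) <= index
            ]


        # Tiny in-memory sample; the second row has only two fields.
        sample = ["a\tb\tc\n", "a\tb\n", "x\ty\tz\n"]
        print(dict(field_count_histogram(sample)))
        print(rows_missing_column(sample, 2))
        ```

        For the query in this report, `index` would be 13, so any row reported by `rows_missing_column` has fewer than 14 fields.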


          People

          • Assignee: DrillCommitter
          • Reporter: Vivian Summers
          • Votes: 0
          • Watchers: 4
