Details
- Bug
- Status: Resolved
- Critical
- Resolution: Duplicate
- 1.2.0
- None
Description
A query ran out of memory during CTAS, and after that the drillbit was rendered unusable:
0: jdbc:drill:schema=dfs> create table lineitem as select
. . . . . . . . . . . . > cast(columns[0] as int) l_orderkey,
. . . . . . . . . . . . > cast(columns[1] as int) l_partkey,
. . . . . . . . . . . . > cast(columns[2] as int) l_suppkey,
. . . . . . . . . . . . > cast(columns[3] as int) l_linenumber,
. . . . . . . . . . . . > cast(columns[4] as double) l_quantity,
. . . . . . . . . . . . > cast(columns[5] as double) l_extendedprice,
. . . . . . . . . . . . > cast(columns[6] as double) l_discount,
. . . . . . . . . . . . > cast(columns[7] as double) l_tax,
. . . . . . . . . . . . > cast(columns[8] as varchar(200)) l_returnflag,
. . . . . . . . . . . . > cast(columns[9] as varchar(200)) l_linestatus,
. . . . . . . . . . . . > cast(columns[10] as date) l_shipdate,
. . . . . . . . . . . . > cast(columns[11] as date) l_commitdate,
. . . . . . . . . . . . > cast(columns[12] as date) l_receiptdate,
. . . . . . . . . . . . > cast(columns[13] as varchar(200)) l_shipinstruct,
. . . . . . . . . . . . > cast(columns[14] as varchar(200)) l_shipmode,
. . . . . . . . . . . . > cast(columns[15] as varchar(200)) l_comment
. . . . . . . . . . . . > from `lineitem.dat`;
Error: RESOURCE ERROR: One or more nodes ran out of memory while executing the query.
Fragment 1:10
[Error Id: 11084315-5388-4500-b165-642a5f595ebf on atsqa4-133.qa.lab:31010] (state=,code=0)
Here is Drill's behavior after that:
1. Tried to run "select * from sys.options" in the same sqlline session: it hangs.
2. Was able to start a new sqlline and connect to the drillbit:
- Anything you run on this connection hangs.
- If you issue ^C and are lucky, you get a result (these queries appear as "CANCELLATION_REQUESTED" in the WebUI). I only tried querying sys.memory and sys.options, which possibly take a different code path than queries on actual user data.
- If you are not lucky, you get the error below:
0: jdbc:drill:schema=dfs> show files;
java.lang.RuntimeException: java.sql.SQLException: Unexpected RuntimeException: java.lang.IllegalArgumentException: Buffer has negative reference count.
	at sqlline.IncrementalRows.hasNext(IncrementalRows.java:73)
	at sqlline.TableOutputFormat$ResizingRowsProvider.next(TableOutputFormat.java:87)
	at sqlline.TableOutputFormat.print(TableOutputFormat.java:118)
	at sqlline.SqlLine.print(SqlLine.java:1583)
	at sqlline.Commands.execute(Commands.java:852)
	at sqlline.Commands.sql(Commands.java:751)
	at sqlline.SqlLine.dispatch(SqlLine.java:738)
	at sqlline.SqlLine.begin(SqlLine.java:612)
	at sqlline.SqlLine.start(SqlLine.java:366)
	at sqlline.SqlLine.main(SqlLine.java:259)
or maybe something like this:
0: jdbc:drill:schema=dfs> select count(*) from nation group by n_regionkey;
Error: CONNECTION ERROR: Exceeded timeout (5000) while waiting send intermediate work fragments to remote nodes. Sent 1 and only heard response back from 0 nodes.
[Error Id: 6abce8e9-78a1-4b3d-bcec-503930482b40 on atsqa4-133.qa.lab:31010] (state=,code=0)
I'm attaching the jstack output and drillbit.log. So far I have not been able to reproduce this problem (working on it).
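For anyone trying to reproduce this, the three outcomes described above (query returns, query fails with an exception, query hangs) can be told apart from a client without blocking forever. The following is only a minimal sketch, not part of Drill: the class `DrillbitProbe`, its `probe` method and the `Result` enum are hypothetical names, and the `Callable` arguments are stand-ins for whatever JDBC call you would actually issue against the drillbit.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical helper for illustration; not a Drill or sqlline class.
public class DrillbitProbe {

    /** Outcome of a single probe attempt. */
    public enum Result { OK, FAILED, TIMED_OUT }

    /**
     * Runs the given query stand-in on a separate daemon thread and waits
     * at most timeoutMs for it to finish.
     */
    public static <T> Result probe(Callable<T> query, long timeoutMs) {
        ExecutorService executor = Executors.newSingleThreadExecutor(r -> {
            Thread t = new Thread(r, "drillbit-probe");
            t.setDaemon(true); // a stuck probe must not keep the JVM alive
            return t;
        });
        Future<T> future = executor.submit(query);
        try {
            future.get(timeoutMs, TimeUnit.MILLISECONDS);
            return Result.OK;
        } catch (TimeoutException e) {
            future.cancel(true); // interrupts the worker thread
            return Result.TIMED_OUT;
        } catch (ExecutionException e) {
            return Result.FAILED; // the query itself threw
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            future.cancel(true);
            return Result.FAILED;
        } finally {
            executor.shutdownNow();
        }
    }

    public static void main(String[] args) {
        // Stand-ins only: a healthy query, a hung query, and a failing query.
        Callable<Integer> healthy = () -> 1;
        Callable<Void> hung = () -> { Thread.sleep(60_000); return null; };
        Callable<Void> failing = () -> {
            throw new IllegalArgumentException("Buffer has negative reference count.");
        };

        System.out.println(probe(healthy, 1000)); // OK
        System.out.println(probe(hung, 200));     // TIMED_OUT
        System.out.println(probe(failing, 1000)); // FAILED
    }
}
```

Note that cancelling the future only interrupts the client-side thread; whether the drillbit actually cancels the query is a separate question, and in the state described here it may well not.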
Attachments
Issue Links
- duplicates DRILL-5599 Notify StatusHandlerListener that batch sending has failed even if channel is still open (Resolved)