[DRILL-3208] Hive : Tpch (SF 0.01) query 10 fails with a system error when the data is backed by hive tables - ASF JIRA

Details

Type: Bug
Status: Resolved
Priority: Major
Resolution: Invalid
Affects Version/s: None
Fix Version/s: 1.1.0
Component/s: Storage - Hive
Labels: None

Description

git.commit.id.abbrev=6f54223

I created Hive tables on top of the TPC-H Parquet data (Hive DDL script attached). Because Hive does not support the date type in its Parquet SerDe, I regenerated the Parquet files for orders and lineitem with the date fields stored as strings. The remaining files have no date columns.

When I executed query 10 in the tpch suite, it failed with a system error.

0: jdbc:drill:schema=dfs_eea> use hive.tpch01_parquet_nodate;
+-------+---------------------------------------------------------+
|  ok   |                         summary                         |
+-------+---------------------------------------------------------+
| true  | Default schema changed to [hive.tpch01_parquet_nodate]  |
+-------+---------------------------------------------------------+
1 row selected (0.091 seconds)
0: jdbc:drill:schema=dfs_eea>
select
  c.c_custkey,
  c.c_name,
  sum(l.l_extendedprice * (1 - l.l_discount)) as revenue,
  c.c_acctbal,
  n.n_name,
  c.c_address,
  c.c_phone,
  c.c_comment
from
  customer c,
  orders o,
  lineitem l,
  nation n
where
  c.c_custkey = o.o_custkey
  and l.l_orderkey = o.o_orderkey
  and cast(o.o_orderdate as date) >= date '1994-03-01'
  and cast(o.o_orderdate as date) < date '1994-03-01' + interval '3' month
  and l.l_returnflag = 'R'
  and c.c_nationkey = n.n_nationkey
group by
  c.c_custkey,
  c.c_name,
  c.c_acctbal,
  c.c_phone,
  n.n_name,
  c.c_address,
  c.c_comment
order by
  revenue desc
limit 20;

Error: SYSTEM ERROR: 

Fragment 0:0

[Error Id: 1d327ae0-1cf2-4776-acd3-8eef6cca4b6a on qa-node191.qa.lab:31010] (state=,code=0)

I tried running the above query using dfs instead of hive and it worked as expected.

I have attached the newly generated Parquet files and the Hive DDL used to create the Hive tables. Let me know if you need anything else.
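For readers following the date predicate in the query above: it casts the string-typed o_orderdate back to a date and keeps a half-open three-month window starting at 1994-03-01. A minimal Python sketch of that filter follows. It assumes the regenerated files store dates as ISO "YYYY-MM-DD" strings, which the report does not state. The helper names `add_months` and `in_window` are hypothetical and are not part of Drill or Hive.

```python
from datetime import date


def add_months(d: date, months: int) -> date:
    """Shift a date by whole months (assumes the day exists in the target month)."""
    total = d.year * 12 + (d.month - 1) + months
    return date(total // 12, total % 12 + 1, d.day)


# Mirrors: date '1994-03-01' and date '1994-03-01' + interval '3' month
WINDOW_START = date(1994, 3, 1)
WINDOW_END = add_months(WINDOW_START, 3)  # exclusive upper bound


def in_window(o_orderdate: str) -> bool:
    """Parse an ISO date string (assumed storage format) and test the half-open range."""
    parsed = date.fromisoformat(o_orderdate)
    return WINDOW_START <= parsed < WINDOW_END
```

The lower bound is inclusive and the upper bound exclusive, matching the `>=` and `<` comparisons in the query.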

Attachments

- customer.parquet, 130 kB, 29/May/15 16:47, Rahul Kumar Challapalli
- error.log, 8 kB, 28/May/15 22:45, Rahul Kumar Challapalli
- lineitem_nodate.parquet, 2.92 MB, 28/May/15 22:45, Rahul Kumar Challapalli
- nation.parquet, 2 kB, 29/May/15 16:47, Rahul Kumar Challapalli
- orders_nodate.parquet, 663 kB, 28/May/15 22:45, Rahul Kumar Challapalli
- tpch.ddl, 3 kB, 28/May/15 22:45, Rahul Kumar Challapalli

Issue Links

duplicates: DRILL-3217 System error not being propogated to sqlline in multi-node cluster (Open)

People

Assignee: Venki Korukanti

Reporter: Rahul Kumar Challapalli

Votes: 0

Watchers: 2

Dates

Created: 28/May/15 22:43

Updated: 01/Jul/15 01:01

Resolved: 29/May/15 21:55