Description
The scenario: Create a Hive table, populate it with some rows, then run UPDATE STATISTICS on it in Trafodion. Next, in Hive, drop the table and create it again with fewer columns. Finally, in Trafodion, execute "select count(*)" against the table. Trafodion then dumps core, because the old histograms still exist and are no longer consistent with the new Hive DDL.
The following scripts will reproduce it:
In Hive, do:
create table hive1
( a int,
b int,
c int )
stored as textfile;
In Trafodion sqlci do:
insert into hive.hive.hive1 values (1,2,3),(2,2,3),(2,3,4),(3,3,5),(3,4,6),(3,6,8),(4,1,1);
update statistics for table hive.hive.hive1 on every column;
Then in Hive do:
drop table if exists hive1;
create table hive1
( a int,
b int )
stored as textfile;
Then in Trafodion sqlci do:
select count(*) from hive.hive.hive1;
You will now get a sqlci core, with a stack trace like:
(gdb) bt
#0 0x00007fe5b6976495 in raise () from /lib64/libc.so.6
#1 0x00007fe5b6977bfd in abort () from /lib64/libc.so.6
#2 0x00007fe5b63a93ea in assert_botch_abend (f=
0x7fe5b06459e8 "../common/Collections.cpp", l=903, m=
0x7fe5b06459c8 "List index exceeds # of entries", c=0x0)
at ../export/NAAbort.cpp:285
#3 0x00007fe5b63a9107 in NAAbort (filename=
0x7fe5b06459e8 "../common/Collections.cpp", lineno=903, msg=
0x7fe5b06459c8 "List index exceeds # of entries")
at ../export/NAAbort.cpp:207
#4 0x00007fe5b0455371 in NAList<NAColumn*>::operator[] (this=0x7fe5a3a45e88,
i=2) at ../common/Collections.cpp:903
#5 0x00007fe5afd95ca6 in HSHistogrmCursor::fetch (this=0x7fff381773a0, cs=
..., cursor2=..., colmap=0x7fe594704428, fakeHistogram=0x7fe5947043a8,
emptyHistogram=0x7fe5947043c8, smallSampleHistogram=0x7fe5947043e8,
smallSampleSize=0x7fe594704408, fakeRowCount=@0x7fff38177f78, statsTime=
@0x7fe5a3a45fd8, allFakeStats=@0x7fff38178794, preFetch=1, offset=0,
tabDef=0x7fe594705238, cmpContextSwitched=1) at ../ustat/hs_read.cpp:1537
#6 0x00007fe5afd954ad in readHistograms (tabDef=0x7fe594705238, fullQualName=
..., histogramTableName=..., histintsTableName=..., specialTable=0, type=
ExtendedQualName::NORMAL_TABLE, colArray=..., statsTime=@0x7fe5a3a45fd8,
allFakeStat=@0x7fff38178794, preFetch=1, fakeHistogram=0x7fe5947043a8,
emptyHistogram=0x7fe5947043c8, smallSampleHistogram=0x7fe5947043e8,
smallSampleSize=0x7fe594704408, colmap=0x7fe594704428, histogramRowCount=
@0x7fff38177f78, cs=0x7fff38177d10, offset=0) at ../ustat/hs_read.cpp:1330
#7 0x00007fe5afd93d3f in FetchHistograms (qualifiedName=..., type=
ExtendedQualName::NORMAL_TABLE, colArray=..., colStatsList=...,
isSQLMPTable=0, heap=0x7fe5946e3228, statsTime=@0x7fe5a3a45fd8,
allFakeStat=@0x7fff38178794, preFetch=1, createStatsSize=0)
at ../ustat/hs_read.cpp:962
#8 0x00007fe5ae795d23 in HistogramCache::createColStatsList (this=
0x7fe5a3a6cac8, table=..., cachedHistograms=0x0)
at ../optimizer/NATable.cpp:497
#9 0x00007fe5ae7958ec in HistogramCache::getHistograms (this=0x7fe5a3a6cac8,
table=...) at ../optimizer/NATable.cpp:327
#10 0x00007fe5ae7a8d6d in NATable::getStatistics (this=0x7fe5a3a45be0)
at ../optimizer/NATable.cpp:5980
#11 0x00007fe5aea95bae in TableDesc::getTableColStats (this=0x7fe5946eb4d8)
at ../optimizer/TableDesc.cpp:373
#12 0x00007fe5b0631eda in TableDesc::tableColStats (this=0x7fe5946eb4d8)
at ../optimizer/TableDesc.h:134
#13 0x00007fe5ae892f6e in Scan::synthLogProp (this=0x7fe5946c86b0, normWAPtr=
0x7fff3817b270) at ../optimizer/OptLogRelExpr.cpp:5193
#14 0x00007fe5ae880279 in RelExpr::synthLogProp (this=0x7fe5946f70f0,
normWAPtr=0x7fff3817b270) at ../optimizer/OptLogRelExpr.cpp:622
#15 0x00007fe5ae88ff09 in GroupByAgg::synthLogProp (this=0x7fe5946f70f0,
Note: Trafodion can execute Hive DDL itself, so the CREATE TABLE and DROP TABLE statements could be issued through sqlci directly. (The table name has to be qualified as hive.hive.hive1, however.) When the DDL is done through sqlci, any Trafodion histograms for the table are cleaned up and this issue does not occur.
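To make the failure in frames #4 and #5 of the stack trace easier to follow, here is a small Python model of what appears to be happening. It is only a sketch: it assumes the stored histograms refer to columns by position, which the trace suggests (i=2 against a column list for a two-column table) but this report does not confirm. All names in it are hypothetical and none are taken from the Trafodion source.

```python
# Illustrative model only. The function and class names are hypothetical
# and do not correspond to anything in the Trafodion code base.


class ColumnListIndexError(Exception):
    """Stands in for the "List index exceeds # of entries" assertion."""


def column_for_histogram(columns, column_number):
    """Look up the column a histogram refers to, failing like the assertion
    in the trace when the position is outside the current column list."""
    if not 0 <= column_number < len(columns):
        raise ColumnListIndexError("List index exceeds # of entries")
    return columns[column_number]


def usable_histograms(columns, histogram_column_numbers):
    """Defensive alternative: keep only the histograms whose column
    position still exists in the current table definition."""
    return [n for n in histogram_column_numbers if 0 <= n < len(columns)]


# Histograms collected while hive1 had columns a, b, c (positions 0, 1, 2).
old_histograms = [0, 1, 2]
# Column list after hive1 was dropped and recreated in Hive with a, b only.
new_columns = ["a", "b"]
```

In this model, `column_for_histogram(new_columns, 2)` raises, which corresponds to the abort above, while `usable_histograms(new_columns, old_histograms)` skips the stale entry for the dropped column. Whether the real fix should skip stale histograms or detect the changed Hive DDL and discard them all is a separate design question.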