Details
-
Bug
-
Status: Closed
-
Major
-
Resolution: Fixed
-
2.1-incubating
-
None
-
Likely all
Description
The scenario is the following:
1. Create a Hive text format table that has string columns with values 1048576 bytes long.
2. Create a Trafodion external table with varchar(1048576) columns for these string columns.
3. Do an UPDATE STATISTICS PERSISTENT on the table. This succeeds.
4. Do an UPDATE STATISTICS REMOVE SAMPLE on the table. This succeeds.
5. Repeat the UPDATE STATISTICS PERSISTENT on the table. This fails with the following errors:
-
-
- ERROR[9200] UPDATE STATISTICS for table HIVE.MYSCH.MYTABLE encountered an error (8609) from statement Process_Query.
-
-
-
- ERROR[8609] Waited rollback performed without starting a transaction.
-
-
-
- ERROR[9200] UPDATE STATISTICS for table HIVE.MYSCH.MYTABLE encountered an error (8609) from statement Process_Query.
-
-
-
- ERROR[8609] Waited rollback performed without starting a transaction.
-
-
-
- ERROR[9200] UPDATE STATISTICS for table HIVE.MYSCH.MYTABLE encountered an error (8609) from statement Process_Query.
-
-
-
- ERROR[8609] Waited rollback performed without starting a transaction.
-
-
-
- ERROR[9200] UPDATE STATISTICS for table HIVE.MYSCH.MYTABLE encountered an error (8609) from statement Process_Query.
-
-
-
- ERROR[8609] Waited rollback performed without starting a transaction.
-
-
-
- ERROR[9214] Object TRAFODION."HIVESTATS".TRAF_SAMPLE_93_1485453261_202793 could not be created.
-
-
-
- ERROR[8609] Waited rollback performed without starting a transaction.
-
To reproduce the problem, create 3 files datagen.py, hive.sql, mytest.sql with the following contents and run them in this order:
chmod 755 datagen.py
./datagen.py data.10rows_1MB.txt 10 2 1048576
hive -f hive.sql
sqlci -i mytest.sql
$ cat datagen.py
#! /usr/bin/env python
import sys
if len(sys.argv) != 5 or \
sys.argv[1].lower() == '-h' or \
sys.argv[1].lower() == '-help':
print 'Usage: ' + sys.argv[0] + ' <file> <num of rows> <num of varchar columns> <varchar column length>'
sys.exit()
f = open(sys.argv[1], "w+")
marker=list('ABCDEFGHIJKLMNOPQRSTUVWXYZ')
for num_rows in range(0, int(sys.argv[2])):
f.write(str(num_rows) + '|')
for num_cols in range(0, int(sys.argv[3])):
f.write(marker[num_rows%len(marker)])
for i in range (1, int(sys.argv[4])):
f.write(str(i % 10))
f.write('|')
f.write(str(num_rows))
f.write('\n')
f.close()
$ cat hive.sql
drop database if exists mysch cascade;
create database mysch;
use mysch;
create table base_lgvarchar_table(c_int int, c_string1 string, c_string2 string , p_int int) row format delimited fields terminated by '|';
load data local inpath './data.10rows_10MB.txt' overwrite into table base_lgvarchar_table;
create table mytable (c_int int, c_string1 string, c_string2 string ,p_int int) row format delimited fields terminated by '|';
from base_lgvarchar_table insert overwrite table mytable select *;
$ cat mytest.sql
log mytest.log clear;
control query default HIVE_MAX_STRING_LENGTH '1048576';
set schema hive.mysch;
drop external table if exists mytable cascade;
create external table mytable (c_int int, c_string1 varchar(1048576), c_string2 varchar(1048576), p_int int) for hive.mysch.mytable;
showddl mytable;
update statistics for table mytable on every column sample random 90 percent persistent;
update statistics for table mytable remove sample;
update statistics for table mytable on existing column sample random 90 percent persistent;
update statistics for table mytable remove sample;
Attachments
Issue Links
- links to