Details
- Type: Bug
- Status: Closed
- Priority: Major
- Resolution: Fixed
Description
This is due to an issue in
initializeFromFilesystem(), which checks whether an MDT partition needs to be initialized based on the absence of its partition type. For a functional index, however, the partition type actually stores the prefix (func_index_), so the check always fails and we try to re-initialize the same functional index partition on every write.
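The flawed check can be sketched in isolation. This is a minimal, self-contained illustration, not actual Hudi code: the names (needsInitBuggy, needsInitFixed, FuncIndexPrefix) are hypothetical, and the recorded partition type is modeled as an Option[String]. The point is that an exact-match "is it absent?" test never matches a functional-index partition, which records only the prefix.

```scala
object MdtInitCheckSketch {
  // For functional indexes, only this prefix is recorded as the partition type.
  val FuncIndexPrefix = "func_index_"

  // Buggy check: anything other than an exact match is treated as "absent",
  // so a functional-index partition looks uninitialized on every write.
  def needsInitBuggy(recordedType: Option[String], expectedType: String): Boolean =
    !recordedType.contains(expectedType)

  // Fixed check: recognize a functional-index partition by its prefix.
  def needsInitFixed(recordedType: Option[String], expectedType: String): Boolean =
    recordedType match {
      case Some(t) => !(t == expectedType || t.startsWith(FuncIndexPrefix))
      case None    => true
    }

  def main(args: Array[String]): Unit = {
    // A functional index records just the prefix, not the full index name.
    val recorded = Some(FuncIndexPrefix)
    println(needsInitBuggy(recorded, "func_index_idx_datestr")) // true: spurious re-init
    println(needsInitFixed(recorded, "func_index_idx_datestr")) // false: already initialized
  }
}
```

With the prefix-aware check, the second and subsequent writes skip re-initialization of the functional index partition.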
Simple test:
spark.sql(
  s"""
     |create table $tableName (
     |  id int, name string, price double, ts long
     |) using hudi
     |options (
     |  primaryKey = 'id',
     |  type = '$tableType',
     |  preCombineField = 'ts',
     |  hoodie.metadata.record.index.enable = 'true',
     |  hoodie.datasource.write.recordkey.field = 'id'
     |)
     |partitioned by (ts)
     |location '$basePath'
     |""".stripMargin)
spark.sql(s"insert into $tableName values(1, 'a1', 10, 1000)")
spark.sql(s"insert into $tableName values(2, 'a2', 10, 1001)")
spark.sql(s"insert into $tableName values(3, 'a3', 10, 1002)")
val createIndexSql = s"create index idx_datestr on $tableName using column_stats(ts) options(func='from_unixtime', format='yyyy-MM-dd')"
spark.sql(createIndexSql)
// This insert throws a NullPointerException
spark.sql(s"insert into $tableName values(4, 'a4', 10, 1004)")