Details
Type: Bug
Status: Resolved
Priority: Major
Resolution: Fixed
Description
There is a scenario in which different SplitGenerator instances cover the delta-only buckets (those having no base file) more than once, so multiple OrcSplit instances can be generated for the same delta file. More than one task then reads the same delta file, causing duplicate records in a simple select-star query.
File structure for a 256-bucket table:
drwxrwxrwx   - hive hadoop    0 2019-11-29 15:55 /apps/hive/warehouse/naresh.db/test1/base_0000013
-rw-r--r--   3 hive hadoop  353 2019-11-29 15:55 /apps/hive/warehouse/naresh.db/test1/base_0000013/bucket_00012
-rw-r--r--   3 hive hadoop 1642 2019-11-29 15:55 /apps/hive/warehouse/naresh.db/test1/base_0000013/bucket_00140
drwxrwxrwx   - hive hadoop    0 2019-11-29 15:55 /apps/hive/warehouse/naresh.db/test1/delta_0000014_0000014_0000
-rwxrwxrwx   3 hive hadoop  348 2019-11-29 15:55 /apps/hive/warehouse/naresh.db/test1/delta_0000014_0000014_0000/bucket_00012
-rwxrwxrwx   3 hive hadoop 1635 2019-11-29 15:55 /apps/hive/warehouse/naresh.db/test1/delta_0000014_0000014_0000/bucket_00140
drwxrwxrwx   - hive hadoop    0 2019-11-29 16:04 /apps/hive/warehouse/naresh.db/test1/delta_0000015_0000015_0000
-rwxrwxrwx   3 hive hadoop  348 2019-11-29 16:04 /apps/hive/warehouse/naresh.db/test1/delta_0000015_0000015_0000/bucket_00012
-rwxrwxrwx   3 hive hadoop 1808 2019-11-29 16:04 /apps/hive/warehouse/naresh.db/test1/delta_0000015_0000015_0000/bucket_00140
drwxrwxrwx   - hive hadoop    0 2019-11-29 16:06 /apps/hive/warehouse/naresh.db/test1/delta_0000016_0000016_0000
-rwxrwxrwx   3 hive hadoop  348 2019-11-29 16:06 /apps/hive/warehouse/naresh.db/test1/delta_0000016_0000016_0000/bucket_00043
-rwxrwxrwx   3 hive hadoop 1633 2019-11-29 16:06 /apps/hive/warehouse/naresh.db/test1/delta_0000016_0000016_0000/bucket_00171
In this case, bucket_00171 contains a record but has no base file, so a select with the ETL split strategy can generate 2 splits for the same delta bucket.
The scenario of the issue:
1. ETLSplitStrategy contains a covered[] array which is shared between the SplitInfo instances to be created
2. a SplitInfo instance is created for every base file (2 in this case)
3. for every SplitInfo, a SplitGenerator is created, and in the constructor, parent's getSplit is called, which tries to take care of the deltas
I'm not sure at the moment what the intention of this is, but this way a duplicated delta split can be generated, which can cause a duplicated read later (note that both tasks read the same delta file, bucket_00171):
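The steps above can be illustrated with a minimal, self-contained Java model. This is not Hive's actual code: the class, method, and variable names (generateSplits, covered, deltaOnlyBuckets) are hypothetical, and the sketch only shows why a per-base-file pass over delta-only buckets duplicates splits unless each covered bucket is marked in the shared covered[] array as soon as its split is created.

```java
import java.util.ArrayList;
import java.util.List;

public class DeltaCoverDemo {

    // Simplified model of the reported scenario: 2 base-file buckets,
    // 2 delta-only buckets, and one pass over the deltas per SplitInfo
    // (i.e. per base file), all sharing one covered[] array.
    static List<String> generateSplits(boolean markCovered) {
        int numBuckets = 256;
        boolean[] covered = new boolean[numBuckets]; // shared between passes
        int[] baseBuckets = {12, 140};               // buckets with a base file
        int[] deltaOnlyBuckets = {43, 171};          // buckets with deltas only
        List<String> splits = new ArrayList<>();

        for (int b : baseBuckets) {
            covered[b] = true; // base-file buckets are handled by their own split
        }
        // One delta-covering pass per base-file SplitInfo, as described above.
        for (int ignored : baseBuckets) {
            for (int b : deltaOnlyBuckets) {
                if (!covered[b]) {
                    splits.add("delta-split-bucket_" + b);
                    if (markCovered) {
                        covered[b] = true; // mark immediately -> no duplicates
                    }
                }
            }
        }
        return splits;
    }

    public static void main(String[] args) {
        // Without marking, each pass re-emits both delta-only buckets: 4 splits.
        System.out.println(generateSplits(false));
        // With marking, each delta-only bucket is emitted exactly once: 2 splits.
        System.out.println(generateSplits(true));
    }
}
```

With two base files and two delta-only buckets, the unmarked variant yields four splits (bucket_00043 and bucket_00171 twice each), matching the duplicated read observed in the logs below, while the marking variant yields the expected two.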
2019-12-01T16:24:53,669 INFO [TezTR-127843_16_30_0_171_0 (1575040127843_0016_30_00_000171_0)] orc.ReaderImpl: Reading ORC rows from hdfs://c3351-node2.squadron.support.hortonworks.com:8020/apps/hive/warehouse/naresh.db/test1/delta_0000016_0000016_0000/bucket_00171 with {include: [true, true, true, true, true, true, true, true, true, true, true, true], offset: 0, length: 9223372036854775807, schema: struct<idp_warehouse_id:bigint,idp_audit_id:bigint,batch_id:decimal(9,0),source_system_cd:varchar(500),insert_time:timestamp,process_status_cd:varchar(20),business_date:date,last_update_time:timestamp,report_date:date,etl_run_time:timestamp,etl_run_nbr:bigint>}
2019-12-01T16:24:53,672 INFO [TezTR-127843_16_30_0_171_0 (1575040127843_0016_30_00_000171_0)] lib.MRReaderMapred: Processing split: org.apache.hadoop.hive.ql.io.orc.OrcInputFormat:OrcSplit [hdfs://c3351-node2.squadron.support.hortonworks.com:8020/apps/hive/warehouse/naresh.db/test1, start=171, length=0, isOriginal=false, fileLength=9223372036854775807, hasFooter=false, hasBase=false, deltas=[{ minTxnId: 14 maxTxnId: 14 stmtIds: [0] }, { minTxnId: 15 maxTxnId: 15 stmtIds: [0] }, { minTxnId: 16 maxTxnId: 16 stmtIds: [0] }]]
2019-12-01T16:24:55,807 INFO [TezTR-127843_16_30_0_425_0 (1575040127843_0016_30_00_000425_0)] orc.ReaderImpl: Reading ORC rows from hdfs://c3351-node2.squadron.support.hortonworks.com:8020/apps/hive/warehouse/naresh.db/test1/delta_0000016_0000016_0000/bucket_00171 with {include: [true, true, true, true, true, true, true, true, true, true, true, true], offset: 0, length: 9223372036854775807, schema: struct<idp_warehouse_id:bigint,idp_audit_id:bigint,batch_id:decimal(9,0),source_system_cd:varchar(500),insert_time:timestamp,process_status_cd:varchar(20),business_date:date,last_update_time:timestamp,report_date:date,etl_run_time:timestamp,etl_run_nbr:bigint>}
2019-12-01T16:24:55,813 INFO [TezTR-127843_16_30_0_425_0 (1575040127843_0016_30_00_000425_0)] lib.MRReaderMapred: Processing split: org.apache.hadoop.hive.ql.io.orc.OrcInputFormat:OrcSplit [hdfs://c3351-node2.squadron.support.hortonworks.com:8020/apps/hive/warehouse/naresh.db/test1, start=171, length=0, isOriginal=false, fileLength=9223372036854775807, hasFooter=false, hasBase=false, deltas=[{ minTxnId: 14 maxTxnId: 14 stmtIds: [0] }, { minTxnId: 15 maxTxnId: 15 stmtIds: [0] }, { minTxnId: 16 maxTxnId: 16 stmtIds: [0] }]]
It seems this issue doesn't affect AcidV2, as getSplits() returns an empty collection or throws an exception in case of unexpected deltas (here the deltas were not unexpected):
https://github.com/apache/hive/blob/8ee3497f87f81fa84ee1023e891dc54087c2cd5e/ql/src/java/org/apache/hadoop/hive/ql/io/orc/OrcInputFormat.java#L1178-L1197