  1. Hive
  2. HIVE-24020

Automatic Compaction not working in existing partitions for Streaming Ingest with Dynamic Partition

    Details

      Description

      This issue occurs during streaming ingest with dynamic partitioning when the target partitions already exist. Looking at the code, AbstractRecordWriter contains the following check:

       

      PartitionInfo partitionInfo = conn.createPartitionIfNotExists(partitionValues);
      // collect the newly added partitions. connection.commitTransaction() will report the dynamically added
      // partitions to TxnHandler
      if (!partitionInfo.isExists()) {
        addedPartitions.add(partitionInfo.getName());
      } else {
        if (LOG.isDebugEnabled()) {
          LOG.debug("Partition {} already exists for table {}",
              partitionInfo.getName(), fullyQualifiedTableName);
        }
      }
      

      The addedPartitions collection above is passed to addDynamicPartitions during the TransactionBatch commit. When the partitions already exist, addedPartitions is empty, so addDynamicPartitions does not move the entries from TXN_COMPONENTS to COMPLETED_TXN_COMPONENTS. As a result, the Initiator is unable to trigger automatic compaction.
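
The behaviour described above can be reproduced in isolation with a small model. This is only a sketch: `PartitionTrackingSketch`, `write`, `reportedOnCommitCurrent`, `reportedOnCommitAlternative` and `touchedPartitions` are hypothetical names invented for illustration, not Hive classes or methods, and the "alternative" bookkeeping is one possible direction rather than the fix adopted for this issue.

```java
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;

// Hypothetical stand-in for a streaming writer; NOT Hive's AbstractRecordWriter.
public class PartitionTrackingSketch {
    // Partitions already present in the simulated metastore.
    private final Set<String> existingPartitions = new HashSet<>();
    // Mirrors the reported behaviour: only partitions created by this writer.
    private final Set<String> addedPartitions = new HashSet<>();
    // Assumed alternative: every partition written to, new or pre-existing.
    private final Set<String> touchedPartitions = new HashSet<>();

    public PartitionTrackingSketch(Set<String> preExisting) {
        existingPartitions.addAll(preExisting);
    }

    public void write(String partitionName) {
        // Set.add returns false when the partition was already known.
        boolean existed = !existingPartitions.add(partitionName);
        if (!existed) {
            addedPartitions.add(partitionName);
        }
        touchedPartitions.add(partitionName);
    }

    // What would be reported at commit under the behaviour described in this issue.
    public Set<String> reportedOnCommitCurrent() {
        return new HashSet<>(addedPartitions);
    }

    // What would be reported if all written partitions were tracked instead.
    public Set<String> reportedOnCommitAlternative() {
        return new HashSet<>(touchedPartitions);
    }

    public static void main(String[] args) {
        PartitionTrackingSketch writer =
            new PartitionTrackingSketch(Collections.singleton("dt=2020-08-01"));
        writer.write("dt=2020-08-01"); // streaming write into an existing partition
        System.out.println(writer.reportedOnCommitCurrent());     // prints []
        System.out.println(writer.reportedOnCommitAlternative()); // prints [dt=2020-08-01]
    }
}
```

In the model, a write into a pre-existing partition leaves the reported set empty, which corresponds to the case where nothing is handed to addDynamicPartitions at commit.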

      A second issue was also observed: addedPartitions is not cleared when the writer is closed, so partition information leaks from one transaction into the next.
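
The effect of not resetting per-transaction state can be illustrated with a minimal sketch. `WriterCloseSketch`, `recordNewPartition`, `getPartitions` and the `clearOnClose` flag are hypothetical names used only for this illustration; the sketch assumes the same writer object is reused after `close()`, and it does not reflect the actual Hive writer lifecycle or the committed patch.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical writer that keeps per-transaction partition state.
public class WriterCloseSketch {
    private final Set<String> addedPartitions = new HashSet<>();
    // Toggle used here only to compare the two behaviours side by side.
    private final boolean clearOnClose;

    public WriterCloseSketch(boolean clearOnClose) {
        this.clearOnClose = clearOnClose;
    }

    public void recordNewPartition(String name) {
        addedPartitions.add(name);
    }

    // Defensive copy so callers cannot mutate the writer's internal state.
    public Set<String> getPartitions() {
        return new HashSet<>(addedPartitions);
    }

    public void close() {
        if (clearOnClose) {
            addedPartitions.clear(); // drop state belonging to the finished transaction
        }
    }

    public static void main(String[] args) {
        WriterCloseSketch leaky = new WriterCloseSketch(false);
        leaky.recordNewPartition("dt=2020-08-01"); // transaction 1
        leaky.close();
        leaky.recordNewPartition("dt=2020-08-02"); // transaction 2
        System.out.println(leaky.getPartitions().size()); // prints 2

        WriterCloseSketch clean = new WriterCloseSketch(true);
        clean.recordNewPartition("dt=2020-08-01"); // transaction 1
        clean.close();
        clean.recordNewPartition("dt=2020-08-02"); // transaction 2
        System.out.println(clean.getPartitions().size()); // prints 1
    }
}
```

Without the clear, the second transaction would also report the partition recorded by the first.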

       

              People

              • Assignee: Vipin Vishvkarma (vpnvishv)
              • Reporter: Vipin Vishvkarma (vpnvishv)
                  Time Tracking

                  Estimated: Not Specified
                  Remaining: 0h
                  Logged: 1h