  1. Apache Hudi
  2. HUDI-4893

More than one split is created for a single log file in a MOR table

Details

    • Type: Bug
    • Status: Closed
    • Priority: Blocker
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.12.2
    • Component/s: reader-core
    • Labels: None
    • 2

    Description

      While debugging a flaky test, I found that more than one split is generated for a single log file. The root cause is HoodieRealtimePath.isSplitable(), which returns true even when the path is a log file.
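
To illustrate why a `true` result matters, here is a minimal sketch of how a Hadoop-style input format decides the number of splits for one file. It is a simplified model, not the actual Hadoop or Hudi code: the class and method names are hypothetical, and the 1.1 slop factor is assumed to mirror the behavior of Hadoop's `FileInputFormat`.

```java
// Simplified model of split sizing in a Hadoop-style FileInputFormat.
// Hypothetical names; this is an illustration, not code from Hadoop or Hudi.
class SplitCountSketch {
    // Assumption: the last split may exceed the target size by up to 10%.
    static final double SPLIT_SLOP = 1.1;

    // Returns how many splits would be produced for a file of the given length.
    static int countSplits(long fileLength, long splitSize, boolean splitable) {
        // A non-splitable (or empty) file is always read as a single split.
        if (!splitable || fileLength <= 0) {
            return 1;
        }
        int splits = 0;
        long remaining = fileLength;
        // Carve off full-size splits while the remainder is clearly larger than one split.
        while ((double) remaining / splitSize > SPLIT_SLOP) {
            splits++;
            remaining -= splitSize;
        }
        // Whatever is left becomes the final split.
        if (remaining != 0) {
            splits++;
        }
        return splits;
    }

    public static void main(String[] args) {
        long mb = 1024L * 1024L;
        // A 300 MB log file with a 128 MB target split size.
        System.out.println(countSplits(300 * mb, 128 * mb, true));
        System.out.println(countSplits(300 * mb, 128 * mb, false));
    }
}
```

Under these assumptions, a log file larger than the target split size is cut into several splits whenever the path reports itself as splitable, and stays whole otherwise.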

       

      https://github.com/apache/hudi/blob/6dbe2960f2eaf0408dc0ef544991cad0190050a9/hudi-hadoop-mr/src/main/java/org/apache/hudi/hadoop/realtime/HoodieRealtimePath.java#L91

       

      I made a quick fix locally and verified that only one split is generated per log file. 

       

      git diff hudi-hadoop-mr/src/main/java/org/apache/hudi/hadoop/realtime/HoodieRealtimePath.java
      diff --git a/hudi-hadoop-mr/src/main/java/org/apache/hudi/hadoop/realtime/HoodieRealtimePath.java b/hudi-hadoop-mr/src/main/java/org/apache/hudi/hadoop/realtime/HoodieRealtimePath.java
      index bba44d5c66..d09dfdf753 100644
      --- a/hudi-hadoop-mr/src/main/java/org/apache/hudi/hadoop/realtime/HoodieRealtimePath.java
      +++ b/hudi-hadoop-mr/src/main/java/org/apache/hudi/hadoop/realtime/HoodieRealtimePath.java
      @@ -89,7 +89,7 @@ public class HoodieRealtimePath extends Path {
         }
       
         public boolean isSplitable() {
      -    return !toString().isEmpty() && !includeBootstrapFilePath();
      +    return !toString().contains(".log") && !includeBootstrapFilePath();
         }
       
         public PathWithBootstrapFileStatus getPathWithBootstrapFileStatus() { 
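
The effect of the change can be seen in isolation with a small stand-in class. This is only a sketch: `RealtimePathSketch`, its fields, and the sample file names are hypothetical and stand in for `HoodieRealtimePath`; only the two boolean expressions are taken from the diff above.

```java
// Stand-in for HoodieRealtimePath that compares the predicate before and after the patch.
// Hypothetical class; only the two return expressions come from the diff.
class RealtimePathSketch {
    private final String path;
    private final boolean includeBootstrapFilePath;

    RealtimePathSketch(String path, boolean includeBootstrapFilePath) {
        this.path = path;
        this.includeBootstrapFilePath = includeBootstrapFilePath;
    }

    // Before the patch: any non-empty path without a bootstrap file is splitable.
    boolean isSplitableBefore() {
        return !path.isEmpty() && !includeBootstrapFilePath;
    }

    // After the patch: paths containing ".log" are never splitable.
    boolean isSplitableAfter() {
        return !path.contains(".log") && !includeBootstrapFilePath;
    }

    public static void main(String[] args) {
        // Illustrative file names only.
        RealtimePathSketch log = new RealtimePathSketch("/tbl/p1/.f1_001.log.1_0-1-0", false);
        RealtimePathSketch base = new RealtimePathSketch("/tbl/p1/f1_0-1-0_001.parquet", false);
        System.out.println(log.isSplitableBefore() + " -> " + log.isSplitableAfter());
        System.out.println(base.isSplitableBefore() + " -> " + base.isSplitableAfter());
    }
}
```

Note that `contains(".log")` is a substring match on the whole path string, so it would also mark a base file as non-splitable if any directory in its path happened to contain `.log`.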

       

       

          People

            Assignee: shivnarayan sivabalan narayanan
            Reporter: shivnarayan sivabalan narayanan