Apache Hudi / HUDI-1442

Simplify clustering executor SparkRunClusteringCommitActionExecutor

Details

    Description

      readRecordsForGroup in SparkRunClusteringCommitActionExecutor has two implementations:

      1) readRecordsForGroupWithLogs, which reads records from file slices that have log files
      2) readRecordsForGroupBaseFiles, which reads records from file slices that don't have log files

      If there is no performance impact from using #1, we can use the same approach for file slices that don't have log files.

      Measure the performance of both paths and remove #2 if there is no significant difference.
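
      To illustrate the proposed simplification, here is a minimal, self-contained sketch of a single read path that covers both cases. It does not use Hudi's actual classes: SliceSketch and this readRecordsForGroup are hypothetical stand-ins, and records are modeled as plain key/value strings. It only shows why a slice with no log files can go through the log-merging path (the merge loop does nothing). It says nothing about relative performance, which still has to be measured on the real executor.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class ClusteringReadSketch {

    /** Hypothetical stand-in for a file slice: base-file records plus zero or more log blocks. */
    static final class SliceSketch {
        final Map<String, String> baseRecords;      // record key -> value from the base file
        final List<Map<String, String>> logBlocks;  // ordered updates; empty when the slice has no log files

        SliceSketch(Map<String, String> baseRecords, List<Map<String, String>> logBlocks) {
            this.baseRecords = baseRecords;
            this.logBlocks = logBlocks;
        }
    }

    /** Single read path: start from base records, then apply log blocks in order. */
    static Map<String, String> readRecordsForGroup(List<SliceSketch> slices) {
        Map<String, String> merged = new LinkedHashMap<>();
        for (SliceSketch slice : slices) {
            merged.putAll(slice.baseRecords);
            // With no log files this loop body never runs, so the result is the base records only.
            for (Map<String, String> block : slice.logBlocks) {
                merged.putAll(block); // a later entry overwrites the earlier value for the same key
            }
        }
        return merged;
    }

    public static void main(String[] args) {
        Map<String, String> base = new LinkedHashMap<>();
        base.put("k1", "v1");
        base.put("k2", "v2");
        Map<String, String> log = new LinkedHashMap<>();
        log.put("k2", "v2-updated");

        SliceSketch withLogs = new SliceSketch(base, Collections.singletonList(log));
        SliceSketch baseOnly = new SliceSketch(base, Collections.emptyList());

        System.out.println(readRecordsForGroup(Collections.singletonList(withLogs)));
        System.out.println(readRecordsForGroup(Collections.singletonList(baseOnly)));
    }
}
```

      In this simplified model the base-only case is just the degenerate case of the log-merging case. Whether that also holds cost-wise in the real executor is exactly what the measurement above is meant to settle.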

          People

            Assignee: Unassigned
            Reporter: satish (satishkotha)