  1. Apache Hudi
  2. HUDI-6892

ExternalSpillableMap may cause data duplication during Flink compaction


Details

    Description

      Steps to reproduce:

      1. Fill the in-memory map with records until this.inMemoryMap.size() % NUMBER_OF_RECORDS_TO_ESTIMATE_PAYLOAD_SIZE == 0.

      2. Insert a record with key1 into the ExternalSpillableMap. This triggers a size estimate; make sure currentInMemoryMapSize is still greater than or equal to maxInMemorySizeInBytes afterwards.
         The record is spilled to disk.

      3. Put key1 again with a smaller record, so that currentInMemoryMapSize becomes less than maxInMemorySizeInBytes during the put.
         The record is put into the in-memory map, while the copy spilled in step 2 remains on disk.

      Result: key1 is present both in memory and on disk, so iterating over the map at the end returns duplicate data.
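
The three steps above can be sketched with a small toy model. This is not Hudi's ExternalSpillableMap: the class SpillableMapSketch, the constant ESTIMATE_EVERY (standing in for NUMBER_OF_RECORDS_TO_ESTIMATE_PAYLOAD_SIZE), the use of plain maps for the disk side, and the rule "estimated payload size = length of the record being put" are all assumptions made for illustration. It only shows how a re-estimate that shrinks currentInMemoryMapSize can route the same key to both stores.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical toy model of a spillable map; NOT the Hudi implementation.
public class SpillableMapSketch {
    // Stand-in for NUMBER_OF_RECORDS_TO_ESTIMATE_PAYLOAD_SIZE (assumed value).
    static final int ESTIMATE_EVERY = 2;

    final long maxInMemorySizeInBytes;
    long estimatedPayloadSize = 0;
    long currentInMemoryMapSize = 0;
    final Map<String, String> inMemoryMap = new LinkedHashMap<>();
    final Map<String, String> diskMap = new LinkedHashMap<>(); // plays the role of the disk-based map

    SpillableMapSketch(long maxInMemorySizeInBytes) {
        this.maxInMemorySizeInBytes = maxInMemorySizeInBytes;
    }

    void put(String key, String value) {
        // Re-estimate when the map looks full or on every ESTIMATE_EVERY-th record.
        // Simplification: the estimate is just the size of the record being put.
        if (currentInMemoryMapSize >= maxInMemorySizeInBytes
                || inMemoryMap.size() % ESTIMATE_EVERY == 0) {
            estimatedPayloadSize = value.length();
            currentInMemoryMapSize = inMemoryMap.size() * estimatedPayloadSize;
        }
        if (currentInMemoryMapSize < maxInMemorySizeInBytes || inMemoryMap.containsKey(key)) {
            if (!inMemoryMap.containsKey(key)) {
                currentInMemoryMapSize += estimatedPayloadSize;
            }
            // The key is never checked against diskMap here, so a stale
            // spilled copy of the same key can survive.
            inMemoryMap.put(key, value);
        } else {
            diskMap.put(key, value);
        }
    }

    // Models iteration: in-memory entries first, then spilled entries.
    List<String> keysSeenByIterator() {
        List<String> keys = new ArrayList<>(inMemoryMap.keySet());
        keys.addAll(diskMap.keySet());
        return keys;
    }

    // Builds a record of n characters (works on Java 8+).
    static String payload(int n) {
        return new String(new char[n]);
    }

    public static List<String> reproduce() {
        SpillableMapSketch map = new SpillableMapSketch(100);
        map.put("a", payload(50));    // step 1: fill the in-memory map ...
        map.put("b", payload(50));    // ... size is now 2, and 2 % ESTIMATE_EVERY == 0
        map.put("key1", payload(50)); // step 2: estimate stays at the limit, key1 spills to disk
        map.put("key1", payload(10)); // step 3: smaller record shrinks the estimate, key1 goes in memory
        return map.keysSeenByIterator();
    }

    public static void main(String[] args) {
        System.out.println(reproduce()); // prints [a, b, key1, key1]
    }
}
```

In this model the duplicate arises because the in-memory branch of put never looks at the spilled entries; whether the real class behaves identically in every detail is not confirmed here.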

            People

              Assignee: Unassigned
              Reporter: Linleicheng (llch__)