  1. CarbonData
  2. CARBONDATA-3985

Optimize the segment-timestamp file clean up

    Details

    • Type: Improvement
    • Status: Open
    • Priority: Minor
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: core, spark-integration
    • Labels: None

      Description

      During a data update (the CarbonProjectForUpdateCommand flow), the status of each segment is checked after the delete delta files are generated. If a segment's status is not successful, all segment directories are traversed to clean up the .carbondata, .carbonindex and .deletedelta files that carry the timestamp of this update.

      If a large number of segments have been generated in the partition directory, this traversal is very time-consuming.

      In fact, when cleaning up timestamp files, only the files in the segment directories involved in this update need to be cleaned up.

      Proposal: while generating the delete delta, record the paths of the segments involved in this update. Then, in checkAndUpdateStatusFiles(), if a segment's status is found to be not successful, clean up directly from the segment path list recorded during delete delta generation, without searching all segment directories.
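
      The proposal can be illustrated with a minimal, standalone Java sketch. It is not CarbonData code: the class SegmentCleanupSketch, the method cleanRecordedSegments() and the rule "file name contains the update timestamp and ends with one of the three extensions" are all assumptions made for illustration, and it uses java.nio on a local file system rather than CarbonData's own file abstractions. It only shows the idea of visiting the recorded segment directories instead of every segment directory.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical helper, not part of CarbonData.
final class SegmentCleanupSketch {

    // The three file types named in the issue description.
    private static final List<String> SUFFIXES =
            Arrays.asList(".carbondata", ".carbonindex", ".deletedelta");

    // Assumption: a file belongs to the failed update if its name contains
    // the update timestamp and has one of the suffixes above.
    static boolean isStaleFile(Path file, String timestamp) {
        String name = file.getFileName().toString();
        return name.contains(timestamp) && SUFFIXES.stream().anyMatch(name::endsWith);
    }

    // Cleans only the segment directories recorded while the delete delta
    // was generated; no other segment directory is listed or scanned.
    // Returns the files that were deleted.
    static List<Path> cleanRecordedSegments(Collection<Path> recordedSegmentDirs,
                                            String timestamp) throws IOException {
        List<Path> deleted = new ArrayList<>();
        for (Path segmentDir : recordedSegmentDirs) {
            if (!Files.isDirectory(segmentDir)) {
                continue; // recorded path no longer exists, nothing to clean
            }
            List<Path> stale;
            // Close the directory stream before deleting anything.
            try (Stream<Path> files = Files.list(segmentDir)) {
                stale = files.filter(f -> isStaleFile(f, timestamp))
                             .collect(Collectors.toList());
            }
            for (Path f : stale) {
                Files.delete(f);
                deleted.add(f);
            }
        }
        return deleted;
    }
}
```

      With this shape, the cost of the cleanup depends on the number of segments touched by the update rather than on the total number of segments under the partition directory.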

            People

            • Assignee: Unassigned
            • Reporter: su-article suwen
            • Votes: 0
            • Watchers: 1

              Dates

              • Created:
                Updated:

                Time Tracking

                Estimated: Not Specified
                Remaining: 0h
                Logged: 2h 10m