  1. CarbonData
  2. CARBONDATA-2895

[Batch-sort]Query result mismatch with Batch-sort in save to disk (sort temp files) scenario.

Details

    • Bug
    • Status: Resolved
    • Major
    • Resolution: Fixed

    Description

      Problem: Query result mismatch with batch sort in the save-to-disk (sort temp files) scenario.

      Scenario:
      a) Configure batch sort with a batch size larger than UnsafeMemoryManager.INSTANCE.getUsableMemory().
      b) Load data that is larger than the batch size. Because one batch cannot be processed in unsafe memory, the sort saves to disk (sort temp files).
      c) The load therefore happens in 2 batches.
      d) Query the table. The result contains more rows than expected.

      Root cause:

      createSortDataRows() is called for each batch.
      Files saved to disk while sorting the previous batch were also considered for the current batch.
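The effect of this root cause can be illustrated with a small in-memory simulation. This is only a sketch, not CarbonData code: the class `StaleSortTempFilesDemo`, the `TempFile` type, and both methods are hypothetical, and the sketch assumes that temp files from earlier batches are still present when a later batch selects its files to merge.

```java
import java.util.ArrayList;
import java.util.List;

public class StaleSortTempFilesDemo {

    /** Hypothetical stand-in for a spilled sort temp file: only its batch and row count. */
    static final class TempFile {
        final int batchId;
        final int rowCount;

        TempFile(int batchId, int rowCount) {
            this.batchId = batchId;
            this.rowCount = rowCount;
        }
    }

    /** Unfiltered selection: every file in the shared temp location is merged. */
    static int rowsMergedWithoutFilter(List<TempFile> tempDir) {
        int rows = 0;
        for (TempFile file : tempDir) {
            rows += file.rowCount;
        }
        return rows;
    }

    /** Rows written in total when each batch merges without filtering by batch. */
    static int totalRowsLoaded(int[] rowsPerBatch) {
        List<TempFile> tempDir = new ArrayList<>(); // shared across batches (assumption)
        int total = 0;
        for (int batchId = 0; batchId < rowsPerBatch.length; batchId++) {
            // The current batch spills its rows to disk.
            tempDir.add(new TempFile(batchId, rowsPerBatch[batchId]));
            // The merge step then also picks up files left by earlier batches.
            total += rowsMergedWithoutFilter(tempDir);
        }
        return total;
    }

    public static void main(String[] args) {
        System.out.println(totalRowsLoaded(new int[] {100, 50}));
    }
}
```

With two batches of 100 and 50 rows, the simulated load writes 250 rows instead of 150, because the first batch's rows are merged a second time.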

      Solution:
      Files saved to disk while sorting the previous batch should not be considered for the current batch.
      Hence, use the batch ID as the rangeID field of the sort temp files,
      so that getFilesToMergeSort() selects only the files of the current batch.
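The idea behind the fix can be sketched as follows. This is a minimal illustration, not the actual patch: the class `BatchScopedTempFiles`, the file naming scheme `<table>_<rangeId>_<sequence>.sorttemp`, and the signature of `filesToMergeSort` are assumptions made for the example, and it considers the temp files of a single table only.

```java
import java.util.ArrayList;
import java.util.List;

public class BatchScopedTempFiles {

    static final String SUFFIX = ".sorttemp";

    /** Hypothetical naming scheme: the batch ID is stored in the range ID position. */
    static String tempFileName(String table, int rangeId, int sequence) {
        return table + "_" + rangeId + "_" + sequence + SUFFIX;
    }

    /** Selects only the temp files whose range ID equals the given batch ID. */
    static List<String> filesToMergeSort(List<String> allFiles, String table, int batchId) {
        // The trailing underscore keeps batch 1 from matching batch 10, 11, ...
        String prefix = table + "_" + batchId + "_";
        List<String> selected = new ArrayList<>();
        for (String name : allFiles) {
            if (name.startsWith(prefix) && name.endsWith(SUFFIX)) {
                selected.add(name);
            }
        }
        return selected;
    }

    public static void main(String[] args) {
        List<String> tempDir = new ArrayList<>();
        tempDir.add(tempFileName("t1", 0, 0)); // spilled by batch 0
        tempDir.add(tempFileName("t1", 0, 1)); // spilled by batch 0
        tempDir.add(tempFileName("t1", 1, 0)); // spilled by batch 1
        // Only the file written by batch 1 is selected for batch 1.
        System.out.println(filesToMergeSort(tempDir, "t1", 1));
    }
}
```

Because the selection is keyed on the batch ID, files left over from an earlier batch no longer contribute rows to a later batch, even if they are still on disk.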

            People

              Ajantha_Bhat Ajantha Bhat
              Ajantha_Bhat Ajantha Bhat

                Time Tracking

                  Estimated: Not Specified
                  Remaining: 0h
                  Logged: 4h 40m