  1. CarbonData
  2. CARBONDATA-3490

Concurrent data load failure with carbondata FileNotFound exception

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 1.6.1
    • Component/s: None
    • Labels:
      None

      Description

      Caused by: org.apache.carbondata.core.datastore.exception.CarbonDataWriterException: Problem while copying file from local store to carbon store
      at org.apache.carbondata.core.util.CarbonUtil.copyCarbonDataFileToCarbonStorePath(CarbonUtil.java:2750)
      at org.apache.carbondata.processing.store.writer.AbstractFactDataWriter.commitCurrentFile(AbstractFactDataWriter.java:283)
      at org.apache.carbondata.processing.store.writer.v3.CarbonFactDataWriterImplV3.closeWriter(CarbonFactDataWriterImplV3.java:393)
      ... 11 more
      Caused by: java.io.FileNotFoundException: /tmp/carbon865982118689228_1/Fact/Part0/Segment_6/1/part-0-1_batchno0-0-6-1565329654844.carbondata (No such file or directory)
      at java.io.FileInputStream.open0(Native Method)
      at java.io.FileInputStream.open(FileInputStream.java:195)
      at java.io.FileInputStream.<init>(FileInputStream.java:138)
      at java.io.FileInputStream.<init>(FileInputStream.java:93)
      at org.apache.carbondata.core.datastore.filesystem.LocalCarbonFile.getDataInputStream(LocalCarbonFile.java:309)
      at org.apache.carbondata.core.datastore.filesystem.LocalCarbonFile.getDataInputStream(LocalCarbonFile.java:299)
      at org.apache.carbondata.core.datastore.impl.FileFactory.getDataInputStream(FileFactory.java:179)
      at org.apache.carbondata.core.datastore.impl.FileFactory.getDataInputStream(FileFactory.java:175)
      at org.apache.carbondata.core.util.CarbonUtil.copyLocalFileToCarbonStore(CarbonUtil.java:2781)
      at org.apache.carbondata.core.util.CarbonUtil.copyCarbonDataFileToCarbonStorePath(CarbonUtil.java:2746)
      ... 13 more

      problem: When two loads run concurrently, one load cleans up the temp directory of the other load.

      cause: The temp directory that stores the carbon files is named using System.nanoTime(). Because of this, two loads can end up with the same store location. When one load completes, it cleans the temp directory, which causes the data load to fail for the other load.

      solution:
      Use a UUID instead of nano time when creating the temp directory, so that each load gets a unique directory.
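
The difference between the two naming schemes can be sketched as follows. This is a minimal illustration, not the actual patch: the class and method names are hypothetical, and the "carbon<id>_<n>" name layout is only assumed from the path shown in the stack trace above.

```java
import java.io.File;
import java.util.UUID;

// Hypothetical sketch; none of these names come from the CarbonData code base.
class TempDirNaming {

  // Old scheme (assumed): the directory name is derived from a nano-time value.
  // The timestamp is passed in so that a collision can be reproduced on demand.
  static String nanoTimeDirName(String baseDir, long nanoTime, int suffix) {
    return baseDir + File.separator + "carbon" + nanoTime + "_" + suffix;
  }

  // Proposed scheme: the directory name is derived from a random UUID,
  // so two loads do not share a name even if they start at the same instant.
  static String uuidDirName(String baseDir, int suffix) {
    return baseDir + File.separator + "carbon" + UUID.randomUUID() + "_" + suffix;
  }

  public static void main(String[] args) {
    // Two loads that happen to observe the same nano-time value.
    long sameInstant = 865982118689228L;
    String loadA = nanoTimeDirName("/tmp", sameInstant, 1);
    String loadB = nanoTimeDirName("/tmp", sameInstant, 1);
    // Both loads point at one directory; cleanup by one breaks the other.
    System.out.println("nano time names collide: " + loadA.equals(loadB));

    String loadC = uuidDirName("/tmp", 1);
    String loadD = uuidDirName("/tmp", 1);
    // Each load gets its own directory, so cleanup is isolated.
    System.out.println("UUID names collide: " + loadC.equals(loadD));
  }
}
```

With the UUID-based name, a load that finishes first only removes its own directory, so the file copy in the other load still finds its local files.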

              People

              • Assignee: Unassigned
              • Reporter: Ajantha Bhat
              • Votes: 0
              • Watchers: 1

                  Time Tracking

                  • Estimated: Not Specified
                  • Remaining: 0h
                  • Logged: 4h 50m