  1. CarbonData
  2. CARBONDATA-3490

Concurrent data load failure with carbondata FileNotFound exception

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 1.6.1
    • Component/s: None
    • Labels: None

    Description

      Caused by: org.apache.carbondata.core.datastore.exception.CarbonDataWriterException: Problem while copying file from local store to carbon store
      at org.apache.carbondata.core.util.CarbonUtil.copyCarbonDataFileToCarbonStorePath(CarbonUtil.java:2750)
      at org.apache.carbondata.processing.store.writer.AbstractFactDataWriter.commitCurrentFile(AbstractFactDataWriter.java:283)
      at org.apache.carbondata.processing.store.writer.v3.CarbonFactDataWriterImplV3.closeWriter(CarbonFactDataWriterImplV3.java:393)
      ... 11 more
      Caused by: java.io.FileNotFoundException: /tmp/carbon865982118689228_1/Fact/Part0/Segment_6/1/part-0-1_batchno0-0-6-1565329654844.carbondata (No such file or directory)
      at java.io.FileInputStream.open0(Native Method)
      at java.io.FileInputStream.open(FileInputStream.java:195)
      at java.io.FileInputStream.<init>(FileInputStream.java:138)
      at java.io.FileInputStream.<init>(FileInputStream.java:93)
      at org.apache.carbondata.core.datastore.filesystem.LocalCarbonFile.getDataInputStream(LocalCarbonFile.java:309)
      at org.apache.carbondata.core.datastore.filesystem.LocalCarbonFile.getDataInputStream(LocalCarbonFile.java:299)
      at org.apache.carbondata.core.datastore.impl.FileFactory.getDataInputStream(FileFactory.java:179)
      at org.apache.carbondata.core.datastore.impl.FileFactory.getDataInputStream(FileFactory.java:175)
      at org.apache.carbondata.core.util.CarbonUtil.copyLocalFileToCarbonStore(CarbonUtil.java:2781)
      at org.apache.carbondata.core.util.CarbonUtil.copyCarbonDataFileToCarbonStorePath(CarbonUtil.java:2746)
      ... 13 more

      problem: When two loads run concurrently, one load cleans up the temp directory that the other load is still using.

      cause: The temp directory that stores the carbon files is named using System.nanoTime(). Because of this, two concurrent loads can end up with the same store location. When one load completes, it cleans the temp directory, which makes the other load fail with the FileNotFoundException shown above.

      solution:
      Use a UUID instead of the nano time when creating the temp directory, so that each load gets its own unique directory.
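
      A minimal sketch of the idea, not the actual CarbonData patch: the class TempDirNaming and the helpers uniqueLoadDir and createLoadDir are hypothetical names used only for illustration, and the "carbon" prefix is assumed from the path in the stack trace. It shows two loads that each get a UUID-named directory, so cleaning up one leaves the other in place.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.UUID;

// Hypothetical illustration; not taken from the CarbonData code base.
class TempDirNaming {

    // Build a per-load directory path from a random UUID instead of
    // System.nanoTime(), so concurrent loads do not share a location.
    static Path uniqueLoadDir(Path base) {
        return base.resolve("carbon" + UUID.randomUUID());
    }

    // Create the directory for one load and return its path.
    static Path createLoadDir(Path base) throws IOException {
        return Files.createDirectories(uniqueLoadDir(base));
    }

    public static void main(String[] args) throws IOException {
        Path base = Paths.get(System.getProperty("java.io.tmpdir"));

        // Two "concurrent" loads each get their own directory.
        Path loadA = createLoadDir(base);
        Path loadB = createLoadDir(base);

        // Load A finishes and cleans up only its own directory.
        Files.delete(loadA);

        // Load B's directory is untouched, so its files can still be copied.
        System.out.println("load B dir still exists: " + Files.exists(loadB));

        Files.delete(loadB);
    }
}
```

      A random UUID does not depend on the clock, so two loads that start at nearly the same time still get different directory names.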

            People

              Assignee: Unassigned
              Reporter: Ajantha Bhat

                Time Tracking

                  Original Estimate: Not Specified
                  Remaining Estimate: 0h
                  Time Spent: 4h 50m