Apache Gobblin / GOBBLIN-1379

Distcp hides the real exception when a retry happens

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.16.0
    • Component/s: gobblin-core
    • Labels: None

    Description

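      When folder creation in FileAwareInputStreamDataWriter fails with a permission error, the operation is retried. However, the original permission error is not logged or surfaced anywhere. Instead, users see a misleading error about incorrect writer state: in the trace below, the innermost cause is the writer's "can only process one file" IOException rather than the permission failure.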

      
      2021-01-14 20:04:21,173 ERROR [main] org.apache.gobblin.runtime.fork.Fork-0: Fork 0 of task task_HiveDistcpForDatabasesTier0_1610654490216_2004 failed to process data records. Set throwable in holder org.apache.gobblin.runtime.ForkThrowableHolder@567cfbdd
      java.io.IOException: com.github.rholder.retry.RetryException: Retrying failed to complete successfully after 5 attempts.
      at org.apache.gobblin.writer.RetryWriter.callWithRetry(RetryWriter.java:144)
      at org.apache.gobblin.writer.RetryWriter.writeEnvelope(RetryWriter.java:124)
      at org.apache.gobblin.runtime.fork.Fork.processRecord(Fork.java:520)
      at org.apache.gobblin.runtime.fork.AsynchronousFork.processRecord(AsynchronousFork.java:103)
      at org.apache.gobblin.runtime.fork.AsynchronousFork.processRecords(AsynchronousFork.java:86)
      at org.apache.gobblin.runtime.fork.Fork.run(Fork.java:250)
      at org.apache.gobblin.util.executors.MDCPropagatingRunnable.run(MDCPropagatingRunnable.java:39)
      at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
      at java.util.concurrent.FutureTask.run(FutureTask.java:266)
      at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
      at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
      at java.lang.Thread.run(Thread.java:748)
      Caused by: com.github.rholder.retry.RetryException: Retrying failed to complete successfully after 5 attempts.
      at com.github.rholder.retry.Retryer.call(Retryer.java:174)
      at com.github.rholder.retry.Retryer$RetryerCallable.call(Retryer.java:318)
      at org.apache.gobblin.writer.RetryWriter.callWithRetry(RetryWriter.java:142)
      ... 11 more
      Caused by: java.io.IOException: org.apache.gobblin.data.management.copy.writer.FileAwareInputStreamDataWriter can only process one file.
      at org.apache.gobblin.data.management.copy.writer.FileAwareInputStreamDataWriter.writeImpl(FileAwareInputStreamDataWriter.java:199)
      at org.apache.gobblin.data.management.copy.writer.FileAwareInputStreamDataWriter.writeImpl(FileAwareInputStreamDataWriter.java:83)
      at org.apache.gobblin.instrumented.writer.InstrumentedDataWriterBase.write(InstrumentedDataWriterBase.java:158)
      at org.apache.gobblin.instrumented.writer.InstrumentedDataWriter.write(InstrumentedDataWriter.java:38)
      at org.apache.gobblin.writer.DataWriter.writeEnvelope(DataWriter.java:106)
      at org.apache.gobblin.writer.CloseOnFlushWriterWrapper.writeEnvelope(CloseOnFlushWriterWrapper.java:97)
      at org.apache.gobblin.instrumented.writer.InstrumentedDataWriterDecorator.writeEnvelope(InstrumentedDataWriterDecorator.java:76)
      at org.apache.gobblin.writer.PartitionedDataWriter.writeEnvelope(PartitionedDataWriter.java:239)
      at org.apache.gobblin.writer.RetryWriter$2.call(RetryWriter.java:119)
      at org.apache.gobblin.writer.RetryWriter$2.call(RetryWriter.java:116)
      at com.github.rholder.retry.AttemptTimeLimiters$NoAttemptTimeLimit.call(AttemptTimeLimiters.java:78)
      at com.github.rholder.retry.Retryer.call(Retryer.java:160)
      ... 13 more
      2021-01-14 20:04:21,173 INFO [main] org.apache.gobblin.runtime.Task: Task shutdown: Fork future reaped in 15358 millis
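
The masking pattern in the trace above can be reproduced with a small, self-contained sketch. This is not Gobblin's code and not the fix that was applied for this issue: `RetryWithFirstCause`, `OneFileWriter`, and `callWithRetry` are hypothetical names, and the writer is a stand-in that assumes the real writer marks itself as "in use" before the folder creation fails. The sketch shows one possible way to keep the first failure visible, by logging each attempt and attaching the first exception as a suppressed exception on the final one.

```java
import java.io.IOException;
import java.util.concurrent.Callable;

final class RetryWithFirstCause {

    /** Hypothetical stand-in for a writer that accepts exactly one file. */
    public static final class OneFileWriter {
        private boolean started = false;

        public void write(String path) throws IOException {
            if (started) {
                // Every retry lands here, which hides the original failure.
                throw new IOException("OneFileWriter can only process one file.");
            }
            started = true; // state changes before the operation that fails
            throw new IOException("Permission denied: " + path);
        }
    }

    /**
     * Runs the task up to maxAttempts times. If all attempts fail, the last
     * failure becomes the cause and the first failure (when different) is
     * attached as a suppressed exception so it still appears in stack traces.
     */
    public static <T> T callWithRetry(Callable<T> task, int maxAttempts) throws IOException {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        Exception first = null;
        Exception last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return task.call();
            } catch (Exception e) {
                if (first == null) {
                    first = e;
                }
                last = e;
                // Log every failed attempt, not only the final one.
                System.err.println("Attempt " + attempt + " failed: " + e);
            }
        }
        IOException failure = new IOException(
            "Retrying failed to complete successfully after " + maxAttempts + " attempts.", last);
        if (first != last) {
            failure.addSuppressed(first);
        }
        throw failure;
    }

    public static void main(String[] args) {
        OneFileWriter writer = new OneFileWriter();
        try {
            callWithRetry(() -> { writer.write("/data/out"); return null; }, 5);
        } catch (IOException e) {
            System.out.println("Cause: " + e.getCause().getMessage());
            for (Throwable t : e.getSuppressed()) {
                System.out.println("First failure: " + t.getMessage());
            }
        }
    }
}
```

Without the per-attempt logging and the `addSuppressed` call, a caller of this sketch would see only the "can only process one file" message, which mirrors the report above.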
      

          People

            abti Abhishek Tiwari
            aplex Alexander Prokofiev
              Time Tracking

                Original Estimate: Not Specified
                Remaining Estimate: 0h
                Time Spent: 1h