Apache Avro / AVRO-1215

AvroMultipleOutputs not working when specifying baseOutputPath


Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 1.7.2
    • Fix Version/s: 1.7.4
    • Component/s: java
    • Labels: avro

    Description

      I'm calling the write() method of AvroMultipleOutputs that takes a baseOutputPath. The reducer appears to hang as soon as it tries to write to a baseOutputPath value it has not already encountered. It then fails with:

      org.apache.hadoop.ipc.RemoteException: org.apache.hadoop.hdfs.protocol.AlreadyBeingCreatedException: failed to create file ... because current leaseholder is trying to recreate file.

      I think the problem has to do with this line in AvroMultipleOutputs:

      // get the record writer from context output format
      //FileOutputFormat.setOutputName(taskContext, baseFileName);
      

      In the similar code from Hadoop, this line is not commented out, so I think the baseOutputPath is being ignored. As a result, each record writer is created with the same path, which leads to the exception.
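
The suspected failure mode can be illustrated with a small stand-alone model. This is only a sketch under stated assumptions, not Avro or Hadoop code: the class `OutputNameCollision`, its methods, and the `part-r-00000.avro` file name are hypothetical, and a plain `Set` stands in for whatever tracks files the task already has open. It shows that if the base output path is ignored, every record writer resolves to the same file, so the second create attempt collides.

```java
import java.util.HashSet;
import java.util.Set;

public class OutputNameCollision {
    // Hypothetical stand-in for the files this task currently holds open.
    private final Set<String> openFiles = new HashSet<>();

    // Models the suspected bug: baseOutputPath is accepted but never used.
    static String resolveIgnoringBase(String baseOutputPath) {
        return "part-r-00000.avro";
    }

    // Models the intended behaviour: the base path becomes part of the file name.
    static String resolveUsingBase(String baseOutputPath) {
        return baseOutputPath + "-r-00000.avro";
    }

    // Returns false when the same path is created a second time.
    boolean create(String path) {
        return openFiles.add(path);
    }

    public static void main(String[] args) {
        String[] bases = {"typeA/out", "typeB/out"};

        OutputNameCollision ignoring = new OutputNameCollision();
        for (String base : bases) {
            String path = resolveIgnoringBase(base);
            System.out.println(base + " -> " + path + " created=" + ignoring.create(path));
        }

        OutputNameCollision using = new OutputNameCollision();
        for (String base : bases) {
            String path = resolveUsingBase(base);
            System.out.println(base + " -> " + path + " created=" + using.create(path));
        }
    }
}
```

In the first loop the second create is rejected because both base paths map to one file; in the second loop each base path gets its own file.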

      Uncommenting this line does not work because the method is not visible from AvroMultipleOutputs. All the method does is set "mapreduce.output.basename", but setting that property directly doesn't work either.

      After digging through the Avro code, I found that AvroOutputFormatBase uses "avro.mo.config.namedOutput" to create the path. If I replace the commented-out line with the following, it seems to work:

      taskContext.getConfiguration().set("avro.mo.config.namedOutput", baseFileName);  
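
The effect of that one-line workaround can be sketched in isolation. This is a hedged model rather than the actual Avro implementation: a `Map` stands in for the Hadoop `Configuration`, the class `NamedOutputWorkaround` and its methods are hypothetical, and both the `"part"` default and the `-r-00000.avro` suffix are assumptions made for illustration. Only the key `"avro.mo.config.namedOutput"` comes from the report.

```java
import java.util.HashMap;
import java.util.Map;

public class NamedOutputWorkaround {
    // Configuration key named in the report.
    static final String NAMED_OUTPUT_KEY = "avro.mo.config.namedOutput";

    // Hypothetical model of an output format deriving a file name from the
    // configuration; the "part" default and the suffix are assumptions.
    static String pathFor(Map<String, String> conf) {
        return conf.getOrDefault(NAMED_OUTPUT_KEY, "part") + "-r-00000.avro";
    }

    // Models the workaround: record the base file name in the configuration
    // before the record writer (and therefore the path) is created.
    static String pathForBase(Map<String, String> conf, String baseFileName) {
        conf.put(NAMED_OUTPUT_KEY, baseFileName);
        return pathFor(conf);
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        System.out.println("no key set: " + pathFor(conf));
        System.out.println("typeA/out:  " + pathForBase(conf, "typeA/out"));
        System.out.println("typeB/out:  " + pathForBase(conf, "typeB/out"));
    }
}
```

Because the key is set before each path is derived, each base file name yields a distinct path in this model, which matches the behaviour the workaround is reported to restore.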
      

      Attachments

        1. AVRO-1215_final.patch
          9 kB
          Ashish Nagavaram
        2. AVRO-1215.patch
          9 kB
          Ashish Nagavaram
        3. AVRO-1215.patch
          10 kB
          Ashish Nagavaram
        4. AVRO-1215.patch
          29 kB
          Ashish Nagavaram
        5. AVRO-1215-v3.patch
          7 kB
          Ashish Nagavaram

          People

            Assignee: Ashish Nagavaram (nagav.ashish)
            Reporter: Matthew Hayes (mhayes)
            Votes: 0
            Watchers: 8
