  1. Hadoop Common
  2. HADOOP-424

mapreduce jobs fail when no split is returned via inputFormat.getSplits


Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 0.5.0
    • Fix Version/s: 0.6.0
    • Component/s: None
    • Labels: None

    Description

      I'm using a MapReduce job to process data that is logged and timestamped into files.
      Each run does not process all of the data: it selects only the data logged since the previous run.
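
      To make the setup concrete, here is a minimal sketch of that selection step. It assumes each log file carries a timestamp; LogFile, IncrementalInput and newSince are hypothetical names used only for illustration and are not Hadoop classes.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a timestamped log file; not a Hadoop type.
final class LogFile {
    final String path;
    final long loggedAtMillis;

    LogFile(String path, long loggedAtMillis) {
        this.path = path;
        this.loggedAtMillis = loggedAtMillis;
    }
}

final class IncrementalInput {
    // Returns only the files logged strictly after the previous run.
    // The result is empty when nothing new has been logged.
    static List<LogFile> newSince(List<LogFile> all, long lastRunMillis) {
        List<LogFile> selected = new ArrayList<LogFile>();
        for (LogFile f : all) {
            if (f.loggedAtMillis > lastRunMillis) {
                selected.add(f);
            }
        }
        return selected;
    }
}
```

      When every file is older than the previous run, the selection is empty, which is the situation in which the job receives no input at all.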

      However, when no new data has been logged, the getSplits method of InputFormat returns no splits, so the number of map tasks is 0. This case is not handled, and the job fails at the reduce step, apparently because it finds no data to process:

      java.io.FileNotFoundException: /local/home/hadoop/var/mapred/local/task_0030_r_000000_3/all.2
          at org.apache.hadoop.fs.LocalFileSystem.openRaw(LocalFileSystem.java:121)
          at org.apache.hadoop.fs.FSDataInputStream$Checker.<init>(FSDataInputStream.java:47)
          at org.apache.hadoop.fs.FSDataInputStream.<init>(FSDataInputStream.java:221)
          at org.apache.hadoop.fs.FileSystem.open(FileSystem.java:150)
          at org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:259)
          at org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:253)
          at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:241)
          at org.apache.hadoop.mapred.TaskTracker$Child.main(TaskTracker.java:1013)

      What should Hadoop's behaviour be in such a case?

      IMHO, the job should be considered successful: this is not a job failure, just a lack of input data. WDYT?
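
      The behaviour proposed above could look roughly like the following. This is only a sketch of the idea, not the content of the attached patches: JobOutcome, EmptyInputGuard and tasksLaunched are hypothetical names, and the list of strings stands in for whatever getSplits returned.

```java
import java.util.List;

// Hypothetical outcome type for this sketch; not part of Hadoop.
enum JobOutcome { SUCCEEDED, FAILED }

final class EmptyInputGuard {
    // Counts the map tasks this sketch would have launched.
    int tasksLaunched = 0;

    // Decides the outcome before any task is scheduled.
    JobOutcome run(List<String> splits) {
        if (splits == null || splits.isEmpty()) {
            // No input: nothing to map and nothing to reduce, so report
            // success instead of starting a reduce task with no map output.
            return JobOutcome.SUCCEEDED;
        }
        // Placeholder for the real map and reduce phases: one map task per split.
        tasksLaunched += splits.size();
        return JobOutcome.SUCCEEDED;
    }
}
```

      With this check in place, an empty split list ends the job immediately and no reduce task is started, so the missing-file error shown above cannot occur for that reason.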

      Attachments

        1. emptyJobTest.patch
          5 kB
          Frédéric Bertin
        2. hadoop-424.patch
          0.7 kB
          Frédéric Bertin
        3. hadoop-424-2.patch
          0.8 kB
          Frédéric Bertin

          People

            Assignee: Unassigned
            Reporter: Frédéric Bertin (fred.bertin)
