MAPREDUCE-6996 (Hadoop Map/Reduce)

FileInputFormat#getBlockIndex should include file name in the exception.

Details

    • Type: Bug
    • Status: Open
    • Priority: Minor
    • Resolution: Unresolved
    • Affects Version/s: 2.6.0
    • None
    • None

    Description

      FileInputFormat.java
      // Some comments here
      protected int getBlockIndex(BlockLocation[] blkLocations,
                                  long offset) {
        ...
        ...
        BlockLocation last = blkLocations[blkLocations.length - 1];
        // Last valid byte offset; becomes -1 when the last block reports offset 0 and length 0.
        long fileLength = last.getOffset() + last.getLength() - 1;
        throw new IllegalArgumentException("Offset " + offset +
                                           " is outside of file (0.." +
                                           fileLength + ")");
      }
      

      When the file is still open for writing, last.getLength() and last.getOffset() both return zero, so fileLength evaluates to -1 and we see the following exception stack trace.

      org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher$1.run(MapReduceLauncher.java:288)
      Caused by: java.lang.IllegalArgumentException: Offset 0 is outside of file (0..-1)
      at org.apache.hadoop.mapreduce.lib.input.FileInputFormat.getBlockIndex(FileInputFormat.java:453)
      at org.apache.hadoop.mapreduce.lib.input.FileInputFormat.getSplits(FileInputFormat.java:413)
      at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigInputFormat.getSplits(PigInputFormat.java:265)
      ... 18 more
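
To see where the "(0..-1)" in the message comes from, here is a minimal, self-contained sketch of the arithmetic. It does not use Hadoop: `Block` is a hypothetical stand-in for BlockLocation, and the lookup loop is an assumption about what the elided part of getBlockIndex does.

```java
/** Simplified stand-in for the end of getBlockIndex; this is not Hadoop code. */
final class BlockIndexSketch {

    /** Hypothetical substitute for org.apache.hadoop.fs.BlockLocation. */
    static final class Block {
        final long offset;
        final long length;

        Block(long offset, long length) {
            this.offset = offset;
            this.length = length;
        }
    }

    /** Returns the index of the block containing offset, or throws if none does. */
    static int getBlockIndex(Block[] blocks, long offset) {
        // Assumed lookup: find the block whose byte range contains the offset.
        for (int i = 0; i < blocks.length; i++) {
            Block b = blocks[i];
            if (b.offset <= offset && offset < b.offset + b.length) {
                return i;
            }
        }
        // Same arithmetic as the snippet in the description.
        Block last = blocks[blocks.length - 1];
        long fileLength = last.offset + last.length - 1;
        throw new IllegalArgumentException(
            "Offset " + offset + " is outside of file (0.." + fileLength + ")");
    }
}
```

With a single block of offset 0 and length 0, no range can contain offset 0, and fileLength is 0 + 0 - 1 = -1, which matches the message in the stack trace.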
      

      It's difficult to tell from this message which file was open.
      I am creating this ticket to include the file name in the exception.
      Since FileInputFormat#getBlockIndex is protected, subclasses may already call or override it, so we can't change its signature to add the file name as an argument.
      The only fix I can think of is:

      FileInputFormat.java
      public InputSplit[] getSplits(JobConf job, int numSplits)
          throws IOException {
      ...
      ...
         for (FileStatus file: files) {
            Path path = file.getPath();
            long length = file.getLen();
            if (length != 0) {
              FileSystem fs = path.getFileSystem(job);
              BlockLocation[] blkLocations;
              if (file instanceof LocatedFileStatus) {
                blkLocations = ((LocatedFileStatus) file).getBlockLocations();
              } else {
                blkLocations = fs.getFileBlockLocations(file, 0, length);
              }
              if (isSplitable(fs, path)) {
                long blockSize = file.getBlockSize();
                long splitSize = computeSplitSize(goalSize, minSize, blockSize);
      
                long bytesRemaining = length;
                while (((double) bytesRemaining)/splitSize > SPLIT_SLOP) {
                  String[][] splitHosts = getSplitHostsAndCachedHosts(blkLocations,
                      length-bytesRemaining, splitSize, clusterMap);
                  splits.add(makeSplit(path, length-bytesRemaining, splitSize,
                      splitHosts[0], splitHosts[1]));
                  bytesRemaining -= splitSize;
                }
      
                if (bytesRemaining != 0) {
                  String[][] splitHosts = getSplitHostsAndCachedHosts(blkLocations, length
                      - bytesRemaining, bytesRemaining, clusterMap);
                  splits.add(makeSplit(path, length - bytesRemaining, bytesRemaining,
                      splitHosts[0], splitHosts[1]));
                }
              } else {
                String[][] splitHosts = getSplitHostsAndCachedHosts(blkLocations,0,length,clusterMap);
                splits.add(makeSplit(path, 0, length, splitHosts[0], splitHosts[1]));
              }
            } else { 
              //Create empty hosts array for zero length files
              splits.add(makeSplit(path, 0, length, new String[0]));
            }
          }
      

      Wrap the code above in a try-catch block, catch IllegalArgumentException, and check whether its message is "Offset 0 is outside of file (0..-1)".
      If it is, rethrow an IllegalArgumentException whose message also includes the file name.
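
The proposal could be sketched roughly as follows. This is only an illustration under stated assumptions, not a patch: `RethrowWithPathSketch`, `SplitWork`, and `withFileName` are hypothetical names, the path is passed as a plain String instead of a Hadoop Path, and the per-file split computation is represented by a callback so the example compiles without Hadoop on the classpath.

```java
import java.io.IOException;

/** Hypothetical helper illustrating the catch-and-rethrow idea; not Hadoop code. */
final class RethrowWithPathSketch {

    /** The message produced when the last block reports offset 0 and length 0. */
    static final String OPEN_FILE_MESSAGE = "Offset 0 is outside of file (0..-1)";

    /** Stand-in for the per-file body of the loop in getSplits. */
    @FunctionalInterface
    interface SplitWork<T> {
        T run() throws IOException;
    }

    /**
     * Runs the per-file work and, if it fails with the known message,
     * rethrows the exception with the file name appended.
     */
    static <T> T withFileName(String path, SplitWork<T> work) throws IOException {
        try {
            return work.run();
        } catch (IllegalArgumentException e) {
            if (OPEN_FILE_MESSAGE.equals(e.getMessage())) {
                // Keep the original exception as the cause so the stack trace is preserved.
                throw new IllegalArgumentException(
                    e.getMessage() + " for file " + path, e);
            }
            // Any other IllegalArgumentException is passed through unchanged.
            throw e;
        }
    }
}
```

Matching on the exact message text is brittle: if the wording in getBlockIndex ever changes, the check silently stops matching. Appending the file name to every IllegalArgumentException raised inside the loop would avoid that dependency, at the cost of also touching unrelated failures.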

          People

            Assignee: Unassigned
            Reporter: Rushabh Shah (shahrs87)
            Votes: 0
            Watchers: 4