Hadoop Common / HADOOP-1911

infinite loop in dfs -cat command.

Details

    • Type: Bug
    • Status: Closed
    • Priority: Blocker
    • Resolution: Fixed
    • Affects Version/s: 0.13.1, 0.14.3
    • Fix Version/s: 0.17.0
    • Component/s: None
    • Labels: None
    • Hadoop Flags: Reviewed

    Description

      [knoguchi]$ hadoop dfs -cat fileA
      07/09/13 17:36:02 INFO fs.DFSClient: Could not obtain block 0 from any node:
      java.io.IOException: No live nodes contain current block
      07/09/13 17:36:20 INFO fs.DFSClient: Could not obtain block 0 from any node:
      java.io.IOException: No live nodes contain current block
      [repeats forever]

      After I changed one of the debug statements to WARN, the client kept logging:

       WARN org.apache.hadoop.fs.DFSClient: Failed to connect
      to /99.99.999.9 :11111:java.io.IOException: Recorded block size is 7496, but
      datanode reports size of 0
      	at org.apache.hadoop.dfs.DFSClient$DFSInputStream.blockSeekTo(DFSClient.java:690)
      	at org.apache.hadoop.dfs.DFSClient$DFSInputStream.read(DFSClient.java:771)
      	at org.apache.hadoop.fs.FSDataInputStream$PositionCache.read(FSDataInputStream.java:41)
      	at java.io.BufferedInputStream.fill(BufferedInputStream.java:218)
      	at java.io.BufferedInputStream.read1(BufferedInputStream.java:258)
      	at java.io.BufferedInputStream.read(BufferedInputStream.java:317)
      	at java.io.DataInputStream.readFully(DataInputStream.java:178)
      	at java.io.DataInputStream.readFully(DataInputStream.java:152)
      	at org.apache.hadoop.fs.ChecksumFileSystem$FSInputChecker.<init>(ChecksumFileSystem.java:123)
      	at org.apache.hadoop.fs.ChecksumFileSystem.open(ChecksumFileSystem.java:340)
      	at org.apache.hadoop.fs.FileSystem.open(FileSystem.java:259)
      	at org.apache.hadoop.util.CopyFiles$FSCopyFilesMapper.map(CopyFiles.java:466)
      	at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:48)
      	at org.apache.hadoop.mapred.MapTask.run(MapTask.java:186)
      	at org.apache.hadoop.mapred.TaskTracker$Child.main(TaskTracker.java:1707)
      

      It turns out fileA was corrupted. fsck showed a crc file of 7496 bytes, but when I searched for the blocks on each node, all 3 replicas were size 0.

      I'm not sure how it got corrupted, but it would be nice if the dfs command failed instead of getting into an infinite loop.
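
      The behaviour requested above (fail after a bounded number of attempts rather than retry forever) could be sketched roughly as follows. This is only an illustration and not the attached patch or the DFSClient code: the class BoundedBlockRead, the BlockFetcher interface, and the readWithLimit method are hypothetical names invented for this example, and the limit of 3 attempts is an arbitrary assumption.

```java
import java.io.IOException;

/** Hypothetical sketch: give up after a fixed number of failed block fetches. */
public class BoundedBlockRead {

    /** Hypothetical stand-in for one attempt to fetch a block from some datanode. */
    public interface BlockFetcher {
        byte[] fetch() throws IOException;
    }

    /**
     * Tries the fetch up to maxAttempts times. If every attempt fails, throws an
     * IOException wrapping the last failure so the caller can exit with an error
     * instead of looping forever.
     */
    public static byte[] readWithLimit(BlockFetcher fetcher, int maxAttempts)
            throws IOException {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        IOException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return fetcher.fetch();
            } catch (IOException e) {
                // Remember the failure and try again until the limit is reached.
                last = e;
                System.err.println("attempt " + attempt + " of " + maxAttempts
                        + " failed: " + e.getMessage());
            }
        }
        throw new IOException(
                "Could not obtain block after " + maxAttempts + " attempts", last);
    }

    public static void main(String[] args) {
        // Simulates a block whose replicas are all empty: every attempt fails.
        BlockFetcher alwaysFails = () -> {
            throw new IOException(
                    "Recorded block size is 7496, but datanode reports size of 0");
        };
        try {
            readWithLimit(alwaysFails, 3);
        } catch (IOException e) {
            System.err.println("giving up: " + e.getMessage());
        }
    }
}
```

      With a limit like this, a file whose replicas are all unreadable would produce an error for the user after a few attempts, at the cost of also giving up on failures that might have cleared with more retries.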

      Attachments

        1. 1911-0.patch
          0.8 kB
          Christopher Douglas

            People

              Assignee: cdouglas Christopher Douglas
              Reporter: knoguchi Koji Noguchi