Hadoop HDFS
HDFS-3067

NPE in DFSInputStream.readBuffer if read is repeated on corrupted block

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 2.0.0-alpha
    • Fix Version/s: 0.23.3, 2.0.2-alpha
    • Component/s: hdfs-client
    • Labels:
      None
    • Hadoop Flags:
      Reviewed

      Description

      With a singly-replicated block that's corrupted, issuing a read against it twice in succession (e.g. if ChecksumException is caught by the client) gives a NullPointerException.

      Here's the body of a test that reproduces the problem:

      
          final short REPL_FACTOR = 1;
          final long FILE_LENGTH = 512L;
          cluster.waitActive();
          FileSystem fs = cluster.getFileSystem();
      
          Path path = new Path("/corrupted");
      
          DFSTestUtil.createFile(fs, path, FILE_LENGTH, REPL_FACTOR, 12345L);
          DFSTestUtil.waitReplication(fs, path, REPL_FACTOR);
      
          ExtendedBlock block = DFSTestUtil.getFirstBlock(fs, path);
          int blockFilesCorrupted = cluster.corruptBlockOnDataNodes(block);
          assertEquals("All replicas not corrupted", REPL_FACTOR, blockFilesCorrupted);
      
          InetSocketAddress nnAddr =
              new InetSocketAddress("localhost", cluster.getNameNodePort());
          DFSClient client = new DFSClient(nnAddr, conf);
          DFSInputStream dis = client.open(path.toString());
          byte[] arr = new byte[(int)FILE_LENGTH];
          boolean sawException = false;
          try {
            dis.read(arr, 0, (int)FILE_LENGTH);
          } catch (ChecksumException ex) {     
            sawException = true;
          }
          
          assertTrue(sawException);
          sawException = false;
    try {
      dis.read(arr, 0, (int)FILE_LENGTH); // <-- NPE thrown here
    } catch (ChecksumException ex) {
      sawException = true;
    }
    assertTrue(sawException);
      

      The stack:

      java.lang.NullPointerException
      	at org.apache.hadoop.hdfs.DFSInputStream.readBuffer(DFSInputStream.java:492)
      	at org.apache.hadoop.hdfs.DFSInputStream.read(DFSInputStream.java:545)
              [snip test stack]
      

      The problem is that currentNode is null: it is left null after the first read fails, and it is never refreshed, because the guard in read() that triggers blockSeekTo() only fires when the current position is outside the current block's range.
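      A minimal sketch of the retry pattern the fix needs (this is an illustration, not the actual DFSInputStream code; the class and method names below are hypothetical stand-ins): re-run the block-seek logic not only when the position leaves the block's range, but also when the datanode reference was cleared by a failed read.

      ```java
      // Hypothetical illustration of the guard fix described above.
      // RetrySketch and seekToBlock() are stand-ins, not Hadoop APIs.
      import java.io.IOException;

      public class RetrySketch {
          private String currentNode;   // null after a failed read, as in the bug
          private long pos = 0;         // current file position
          private long blockEnd = 511;  // end of the current block's range

          // Stand-in for blockSeekTo(): chooses a datanode for the target offset.
          private String seekToBlock(long target) {
              return "datanode-1";
          }

          // Simulates a ChecksumException clearing the current datanode.
          public void simulateFailedRead() {
              currentNode = null;
          }

          public int read(byte[] buf, int off, int len) throws IOException {
              // The buggy guard checked only (pos > blockEnd); the fix must
              // also re-seek when currentNode is null, otherwise the second
              // read dereferences null and throws the NPE from the report.
              if (pos > blockEnd || currentNode == null) {
                  currentNode = seekToBlock(pos);
              }
              // ... read len bytes from currentNode; on a checksum failure,
              // clear currentNode so the next call re-seeks instead of NPE-ing.
              return len;
          }

          public static void main(String[] args) throws IOException {
              RetrySketch s = new RetrySketch();
              s.read(new byte[512], 0, 512);  // first read seeks normally
              s.simulateFailedRead();          // first read hit a corrupt replica
              s.read(new byte[512], 0, 512);  // re-seeks instead of throwing NPE
          }
      }
      ```

      With the original guard, the second read() call would dereference the null currentNode; with the extra null check, it simply re-runs the seek and retries.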

      1. HDFS-3607.patch
        4 kB
        Henry Robinson
      2. HDFS-3067.1.patch
        4 kB
        Henry Robinson

        Activity

        No work has yet been logged on this issue.

          People

          • Assignee:
            Henry Robinson
          • Reporter:
            Henry Robinson
          • Votes:
            0
          • Watchers:
            6
