Hadoop HDFS / HDFS-1835

DataNode.setNewStorageID pulls entropy from /dev/random

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 0.20.2
    • Fix Version/s: 0.23.0
    • Component/s: datanode
    • Labels:
      None
    • Hadoop Flags:
      Reviewed

      Description

      DataNode.setNewStorageID uses SecureRandom.getInstance("SHA1PRNG") which always pulls fresh entropy.

      It wouldn't be so bad if this were only the 120 bits needed by sha1, but the default impl of SecureRandom actually uses a BufferedInputStream around /dev/random and pulls 1024 bits of entropy for this one call.

      If you are on a system without much entropy coming in, this call can block, and it can cause other consumers of /dev/random to block as well.

      Can we just change this to use "new SecureRandom().nextInt(Integer.MAX_VALUE)" instead?
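
      For illustration, here is a minimal standalone sketch of the two behaviors. This is an assumption-laden demo, not the DataNode code itself (that appears in the patches below); the blocking/non-blocking distinction described holds for the Sun/Oracle JRE on Linux, where SHA1PRNG self-seeds from /dev/random while the platform-default NativePRNG reads /dev/urandom:

      import java.security.NoSuchAlgorithmException;
      import java.security.SecureRandom;

      public class EntropySketch {
        public static void main(String[] args) throws NoSuchAlgorithmException {
          // Current behavior: an explicit SHA1PRNG instance self-seeds on
          // first use, and the default seed generator pulls from the
          // blocking /dev/random on Linux. On an entropy-starved host this
          // call can stall.
          int current = SecureRandom.getInstance("SHA1PRNG").nextInt(Integer.MAX_VALUE);

          // Proposed behavior: the platform-default SecureRandom on Linux
          // is NativePRNG, which reads the non-blocking /dev/urandom.
          int proposed = new SecureRandom().nextInt(Integer.MAX_VALUE);

          System.out.println(current + " / " + proposed);
        }
      }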

      Attachments

      1. hdfs-1835.txt (0.8 kB) - Todd Lipcon
      2. DataNode.patch (0.9 kB) - John Carrino

        Issue Links

          This issue is related to HDFS-2335

          Activity

          John Carrino created issue -
          John Carrino added a comment -

          By 120 bits, I mean 160 bits (20 bytes).

          John Carrino added a comment -

          Index: src/java/org/apache/hadoop/hdfs/server/datanode/DataNode.java
          ===================================================================
          --- src/java/org/apache/hadoop/hdfs/server/datanode/DataNode.java (revision 1094852)
          +++ src/java/org/apache/hadoop/hdfs/server/datanode/DataNode.java (working copy)
          @@ -654,7 +654,7 @@
               int rand = 0;
               try {
          -      rand = SecureRandom.getInstance("SHA1PRNG").nextInt(Integer.MAX_VALUE);
          +      rand = new SecureRandom().nextInt(Integer.MAX_VALUE);
               } catch (NoSuchAlgorithmException e) {
                 LOG.warn("Could not use SecureRandom");
                 rand = R.nextInt(Integer.MAX_VALUE);
          John Carrino made changes -
          Field Original Value New Value
          Status Open [ 1 ] Patch Available [ 10002 ]
          Fix Version/s 0.22.0 [ 12314241 ]
          John Carrino made changes -
          Status Patch Available [ 10002 ] Open [ 1 ]
          John Carrino added a comment -

          Index: src/java/org/apache/hadoop/hdfs/server/datanode/DataNode.java
          ===================================================================
          --- src/java/org/apache/hadoop/hdfs/server/datanode/DataNode.java (revision 1094852)
          +++ src/java/org/apache/hadoop/hdfs/server/datanode/DataNode.java (working copy)
          @@ -652,13 +652,7 @@
                 LOG.warn("Could not find ip address of \"default\" inteface.");
               }

          -    int rand = 0;
          -    try {
          -      rand = SecureRandom.getInstance("SHA1PRNG").nextInt(Integer.MAX_VALUE);
          -    } catch (NoSuchAlgorithmException e) {
          -      LOG.warn("Could not use SecureRandom");
          -      rand = R.nextInt(Integer.MAX_VALUE);
          -    }
          +    int rand = new SecureRandom().nextInt(Integer.MAX_VALUE);
               dnReg.storageID = "DS-" + rand + "-" + ip + "-" + dnReg.getPort() + "-" +
                 System.currentTimeMillis();
             }

          John Carrino made changes -
          Status Open [ 1 ] Patch Available [ 10002 ]
          John Carrino added a comment -

          I don't understand the "create patch" tool, so I'm just going to upload a patch file.

          John Carrino made changes -
          Attachment DataNode.patch [ 12476682 ]
          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12476682/DataNode.patch
          against trunk revision 1094748.

          +1 @author. The patch does not contain any @author tags.

          -1 tests included. The patch doesn't appear to include any new or modified tests.
          Please justify why no new tests are needed for this patch.
          Also please list what manual steps were performed to verify this patch.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          -1 core tests. The patch failed these core unit tests:
          org.apache.hadoop.hdfs.TestFileAppend4
          org.apache.hadoop.hdfs.TestLargeBlock
          org.apache.hadoop.hdfs.TestWriteConfigurationToDFS

          -1 contrib tests. The patch failed contrib unit tests.

          +1 system test framework. The patch passed system test framework compile.

          Test results: https://hudson.apache.org/hudson/job/PreCommit-HDFS-Build/382//testReport/
          Findbugs warnings: https://hudson.apache.org/hudson/job/PreCommit-HDFS-Build/382//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
          Console output: https://hudson.apache.org/hudson/job/PreCommit-HDFS-Build/382//console

          This message is automatically generated.

          John Carrino added a comment -

          The difference between my patch and the original exists only in the Linux version of the JRE; on Windows the behavior is the same.

          The original behavior was to seed a SHA1PRNG with randomness from /dev/random.
          The new behavior is to just read from /dev/urandom. Writing a test to tell these two cases apart would be quite difficult.

          I think the best way would involve turning off all entropy generation on Linux, consuming all the remaining entropy, and then running DataNode.setNewStorageID to make sure it doesn't block.

          John Carrino added a comment -

          I have a test that can go into the org/apache/hadoop/hdfs/server/datanode package:

          public void testRate() throws Exception {
            DatanodeRegistration reg = new DatanodeRegistration();
            reg.setName("name");
            long startTime = System.currentTimeMillis();
            long count = 0;
            // If setNewStorageID blocks waiting on /dev/random, far fewer
            // than 10 calls will complete within the 20 ms window.
            while (System.currentTimeMillis() - startTime < 20) {
              DataNode.setNewStorageID(reg);
              count++;
            }
            assertTrue("count was: " + count, count > 10);
          }

          Todd Lipcon added a comment -

          +1. Since this isn't for anything truly cryptographic, and we have other parts of the string to make it unique (i.e. IP, port, time), I don't see any reason to specify which PRNG to use.

          The proposed test case, though, is far too likely to be flaky, so I won't include it in the commit.

          Todd Lipcon added a comment -

          Rebased John's patch on trunk (the federation work made an unrelated change that caused it not to apply, but the code is the same).

          Todd Lipcon made changes -
          Attachment hdfs-1835.txt [ 12479213 ]
          Todd Lipcon added a comment -

          Committed to trunk. Thanks, John!

          Todd Lipcon made changes -
          Status Patch Available [ 10002 ] Resolved [ 5 ]
          Hadoop Flags [Reviewed]
          Fix Version/s 0.23.0 [ 12315571 ]
          Fix Version/s 0.22.0 [ 12314241 ]
          Resolution Fixed [ 1 ]
          Hudson added a comment -

          Integrated in Hadoop-Hdfs-trunk-Commit #658 (See https://builds.apache.org/hudson/job/Hadoop-Hdfs-trunk-Commit/658/)

          Hudson added a comment -

          Integrated in Hadoop-Hdfs-trunk #673 (See https://builds.apache.org/hudson/job/Hadoop-Hdfs-trunk/673/)

          Eli Collins added a comment -

          Shouldn't we make this change to DataNodeCluster and NNStorage as well? They're not truly cryptographic uses either. We should also factor this out into a utility method, since the three uses are slightly different, e.g. one uses DFSUtil.getRandom and another creates a new Random object.

          On a somewhat related note, a workaround for the blocking is to use urandom; however, according to this bug (http://bugs.sun.com/view_bug.do?bug_id=6202721), setting securerandom.source to "/dev/urandom" doesn't actually work when using SHA1PRNG.
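
          For illustration, a minimal sketch of the kind of shared helper described above; the class and method names (RandomIds.newRandomId) are hypothetical, not the refactoring that was actually committed:

          import java.security.SecureRandom;

          // Hypothetical utility consolidating the non-cryptographic random
          // ID generation used by DataNode, DataNodeCluster, and NNStorage.
          public final class RandomIds {
            // Platform-default SecureRandom; on Linux this is NativePRNG,
            // which reads /dev/urandom and does not block on entropy.
            // SecureRandom is thread-safe, so a shared instance is fine.
            private static final SecureRandom RANDOM = new SecureRandom();

            private RandomIds() {}

            // Returns a non-negative random int for building IDs such as
            // "DS-<rand>-<ip>-<port>-<time>".
            public static int newRandomId() {
              return RANDOM.nextInt(Integer.MAX_VALUE);
            }
          }

          Each call site could then build its ID from RandomIds.newRandomId() rather than choosing its own PRNG.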

          John Carrino added a comment -

          One issue we saw was that even processes merely reading from /dev/urandom would still consume entropy; then, when a process actually needed /dev/random, there wouldn't be any entropy left and it would block. Also, I think the securerandom.source setting is system-wide, which seems a bit heavy-handed.

          Eli Collins made changes -
          Assignee John Carrino [ johnyoh ]
          Eli Collins made changes -
          Link This issue is related to HDFS-2335 [ HDFS-2335 ]
          Eli Collins added a comment -

          Perhaps because it wasn't actually reading from urandom (per the bug above)?

          Filed HDFS-2335 for propagating this change to DataNodeCluster and NNStorage.

          John Carrino added a comment -

          If you just 'cat /dev/urandom > /dev/null', it will eat entropy. /dev/urandom prefers fresh entropy but doesn't require it. So even if you are only using urandom, you can push the entropy pool low enough that if someone then reads /dev/random, it may block.

          Arun C Murthy made changes -
          Status Resolved [ 5 ] Closed [ 6 ]

  People

    • Assignee: John Carrino
    • Reporter: John Carrino
    • Votes: 0
    • Watchers: 6


  Time Tracking

    Estimated: 10m
    Remaining: 10m
    Logged: Not Specified
