  1. Hadoop HDFS
  2. HDFS-14305

Serial number in BlockTokenSecretManager could overlap between different namenodes

Details

    • Bug
    • Status: Resolved
    • Major
    • Resolution: Fixed
    • namenode, security
    • Reviewed
      NameNodes rely on independent block token key ranges to communicate block token identities to DataNodes and clients in a way that does not create conflicts between the tokens issued by multiple NameNodes. HDFS-6440 introduced the potential for overlaps in key ranges; this fixes the issue by creating 64 possible key ranges that NameNodes assign themselves to, allowing for up to 64 NameNodes to run safely. This limitation only applies within a single Namespace; there may be more than 64 NameNodes total spread among multiple federated Namespaces.
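      One way to picture 64 disjoint key ranges is to reserve the top 6 bits of the 32-bit serial number for the NameNode index and leave the low 26 bits for the rotating counter. The sketch below is an illustration under that assumption only; the class, constants, and method names are hypothetical and are not taken from the committed patch.

```java
/** Hypothetical sketch: 64 disjoint serial-number ranges keyed by NameNode index. */
final class KeyRangeSketch {
    static final int NUM_RANGES = 64;                       // from the release note
    static final int INDEX_BITS = 6;                        // 2^6 = 64 ranges (assumed layout)
    static final int LOW_BITS = Integer.SIZE - INDEX_BITS;  // 26 bits left for the counter
    static final int LOW_MASK = (1 << LOW_BITS) - 1;

    /** Maps any int, including negative ones, into the range owned by nnIndex. */
    static int toRange(int serialNo, int nnIndex) {
        if (nnIndex < 0 || nnIndex >= NUM_RANGES) {
            throw new IllegalArgumentException("nnIndex must be in [0, 63]: " + nnIndex);
        }
        // Masking discards the sign bit and any old index bits before the new index is set.
        return (nnIndex << LOW_BITS) | (serialNo & LOW_MASK);
    }

    /** Recovers the owning range index from a serial number. */
    static int rangeOf(int serialNo) {
        return serialNo >>> LOW_BITS; // unsigned shift, so the result is always in [0, 63]
    }
}
```

      Because the mask is applied before the index bits are set, re-applying toRange with the same index never moves a serial number into another NameNode's range, whatever the sign of the input.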

    Description

      Currently, a BlockTokenSecretManager starts with a random integer as the initial serial number, and then uses this formula to rotate it:

          this.intRange = Integer.MAX_VALUE / numNNs;
          this.nnRangeStart = intRange * nnIndex;
          this.serialNo = (this.serialNo % intRange) + (nnRangeStart);
       

      where numNNs is the total number of NameNodes in the cluster, and nnIndex is the index of the current NameNode in the list specified by the configuration dfs.ha.namenodes.<nameservice>.

      However, with this approach, different NameNodes could have overlapping serial number ranges. For simplicity, assume Integer.MAX_VALUE is 100 and the configuration has 2 NameNodes, nn1 and nn2. Then the ranges for these two are:

      nn1 -> [-49, 49]
      nn2 -> [1, 99]
      

      This is because the initial serial number can be negative: Java's % operator keeps the sign of the dividend, so serialNo % intRange can be as low as -(intRange - 1).
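      The two ranges can be reproduced with a small toy program. This is a sketch for illustration only: TOY_MAX stands in for Integer.MAX_VALUE as in the simplified example, and the class and method names are hypothetical, not part of BlockTokenSecretManager.

```java
/** Toy reproduction of the range computation, with Integer.MAX_VALUE replaced by 100. */
final class SerialRangeDemo {
    static final int TOY_MAX = 100; // stand-in for Integer.MAX_VALUE (assumption of the example)

    /** Applies the formula from the description to one serial number. */
    static int rotate(int serialNo, int numNNs, int nnIndex) {
        int intRange = TOY_MAX / numNNs;
        int nnRangeStart = intRange * nnIndex;
        // Java's % keeps the sign of serialNo, so this term may be negative.
        return (serialNo % intRange) + nnRangeStart;
    }

    /** Returns {min, max} of rotate() over every serial number in [-TOY_MAX, TOY_MAX]. */
    static int[] reachableRange(int numNNs, int nnIndex) {
        int min = Integer.MAX_VALUE;
        int max = Integer.MIN_VALUE;
        for (int s = -TOY_MAX; s <= TOY_MAX; s++) {
            int r = rotate(s, numNNs, nnIndex);
            min = Math.min(min, r);
            max = Math.max(max, r);
        }
        return new int[] {min, max};
    }

    public static void main(String[] args) {
        int[] nn1 = reachableRange(2, 0);
        int[] nn2 = reachableRange(2, 1);
        System.out.println("nn1 -> [" + nn1[0] + ", " + nn1[1] + "]"); // nn1 -> [-49, 49]
        System.out.println("nn2 -> [" + nn2[0] + ", " + nn2[1] + "]"); // nn2 -> [1, 99]
    }
}
```

      Every value in [1, 49] is reachable by both NameNodes, which is the overlap described above.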

      Moreover, when the keys are updated, the serial number will again be updated with the formula:

      this.serialNo = (this.serialNo % intRange) + (nnRangeStart);
      

      which means the new serial number could be updated to a range that belongs to a different NameNode, thus increasing the chance of collision again.

      When a collision happens, DataNodes could overwrite an existing key, which causes clients to fail with an InvalidToken error.

      Attachments

        1. HDFS-14305.001.patch
          2 kB
          Xiaoqiao He
        2. HDFS-14305.002.patch
          2 kB
          Xiaoqiao He
        3. HDFS-14305.003.patch
          5 kB
          Xiaoqiao He
        4. HDFS-14305.004.patch
          5 kB
          Xiaoqiao He
        5. HDFS-14305.005.patch
          4 kB
          Xiaoqiao He
        6. HDFS-14305.006.patch
          4 kB
          Xiaoqiao He
        7. HDFS-14305-007.patch
          3 kB
          Konstantin Shvachko
        8. HDFS-14305-008.patch
          5 kB
          Konstantin Shvachko

            People

              shv Konstantin Shvachko
              csun Chao Sun