Hadoop HDFS / HDFS-14090

RBF: Improved isolation for downstream name nodes. {Static}

    Details

    • Type: New Feature
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 3.4.0
    • Component/s: None
    • Labels:
      None
    • Hadoop Flags:
      Reviewed

      Description

      The Router is a gateway to the underlying name nodes. A gateway architecture should minimize the impact that unhealthy clusters have on clients connecting to healthy ones.

      For example, if there are two name nodes downstream and one of them is heavily loaded, with calls spiking its RPC queue times, back pressure causes the same slowdown to show up at the router. As a result, clients connecting to healthy, faster name nodes also slow down, because the router maintains a single RPC queue for all calls. Essentially, the router uses the same IPC thread pool to connect to all name nodes.
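
      The effect described above can be reproduced outside Hadoop with a small sketch. It is not router code: the two-thread pool stands in for the router's shared handler pool, the latch stands in for an overloaded name node, and the names `SharedPoolStarvation`, `ns-slow`, and `ns-fast` are made up for illustration.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

public class SharedPoolStarvation {

  /** Returns true if a call for a healthy name node is stuck behind slow calls. */
  static boolean fastCallIsStarved() {
    // Hypothetical stand-in for the router's single, shared handler pool.
    ExecutorService handlers = Executors.newFixedThreadPool(2);
    // Stand-in for an overloaded name node: calls block until it "recovers".
    CountDownLatch slowNameNode = new CountDownLatch(1);
    try {
      // Two calls to the slow name node occupy every handler thread.
      for (int i = 0; i < 2; i++) {
        handlers.submit(() -> {
          slowNameNode.await();
          return "ns-slow";
        });
      }
      // A call to a healthy name node now waits in the shared queue.
      Future<String> fast = handlers.submit(() -> "ns-fast");
      try {
        fast.get(200, TimeUnit.MILLISECONDS);
        return false; // completed in time, so it was not starved
      } catch (TimeoutException e) {
        return true; // no handler was free to run it
      } catch (Exception e) {
        throw new IllegalStateException(e);
      }
    } finally {
      slowNameNode.countDown(); // let the blocked calls finish
      handlers.shutdown();
    }
  }

  public static void main(String[] args) {
    System.out.println("fast call starved: " + fastCallIsStarved());
  }
}
```

      The fast call does no work of its own, yet it cannot complete until a handler thread is released by the slow name node. That is the coupling this ticket aims to remove.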

      Currently the router uses a single RPC queue for all calls. Let's discuss how we can change the architecture and add some throttling logic for unhealthy, slow, or overloaded name nodes.

      One way could be to read from the current call queue, immediately identify the downstream name node, and maintain a separate queue for each underlying name node. Another, simpler way is to maintain some sort of rate limiter configured for each name node and let the router drop or reject requests, or return an error, once a certain threshold is exceeded.
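
      The second, simpler option can be sketched as a fixed number of permits per name service. This is a minimal illustration under assumptions and is not taken from the attached patches: the class `NameserviceLimiter`, its methods, and the name service IDs `ns0` and `ns1` are hypothetical, and a real router would still have to decide how to report a rejected call to the client.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.Semaphore;

/** Hypothetical per-name-service limiter: a fixed number of in-flight calls each. */
public class NameserviceLimiter {
  private final Map<String, Semaphore> permits = new ConcurrentHashMap<>();

  /** Gives every listed name service the same number of permits. */
  public NameserviceLimiter(int permitsEach, String... nameservices) {
    for (String ns : nameservices) {
      permits.put(ns, new Semaphore(permitsEach));
    }
  }

  /** Non-blocking: returns false when the name service is saturated or unknown. */
  public boolean tryAcquire(String nsId) {
    Semaphore s = permits.get(nsId);
    return s != null && s.tryAcquire();
  }

  /** Must be called exactly once after each successful tryAcquire. */
  public void release(String nsId) {
    Semaphore s = permits.get(nsId);
    if (s != null) {
      s.release();
    }
  }

  public static void main(String[] args) {
    NameserviceLimiter limiter = new NameserviceLimiter(1, "ns0", "ns1");
    // The only ns0 permit is taken by a call that has not finished yet.
    boolean first = limiter.tryAcquire("ns0");
    // A second ns0 call is rejected immediately instead of queueing.
    boolean second = limiter.tryAcquire("ns0");
    // ns1 has its own permits, so it is unaffected by ns0 being busy.
    boolean other = limiter.tryAcquire("ns1");
    System.out.println(first + " " + second + " " + other);
  }
}
```

      Because `tryAcquire` never blocks, a saturated name service costs the caller one failed check rather than a handler thread, so calls to other name services keep flowing.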

      This won’t be a simple change, as the router’s ‘Server’ layer would need to be redesigned and reimplemented. Currently this layer is the same as the name node’s.

      Opening this ticket to discuss, design and implement this feature.

       

        Attachments

        1. RBF_ Isolation design.pdf
          230 kB
          CR Hota
        2. HDFS-14090-HDFS-13891.001.patch
          10 kB
          CR Hota
        3. HDFS-14090-HDFS-13891.002.patch
          40 kB
          CR Hota
        4. HDFS-14090-HDFS-13891.003.patch
          43 kB
          CR Hota
        5. HDFS-14090-HDFS-13891.004.patch
          44 kB
          CR Hota
        6. HDFS-14090-HDFS-13891.005.patch
          44 kB
          CR Hota
        7. HDFS-14090.006.patch
          48 kB
          CR Hota
        8. HDFS-14090.007.patch
          48 kB
          CR Hota
        9. HDFS-14090.008.patch
          49 kB
          CR Hota
        10. HDFS-14090.009.patch
          49 kB
          CR Hota
        11. HDFS-14090.010.patch
          51 kB
          CR Hota
        12. HDFS-14090.011.patch
          51 kB
          CR Hota
        13. HDFS-14090.012.patch
          49 kB
          CR Hota
        14. HDFS-14090.013.patch
          50 kB
          CR Hota
        15. HDFS-14090.014.patch
          50 kB
          CR Hota
        16. HDFS-14090.015.patch
          46 kB
          Fengnan Li
        17. HDFS-14090.016.patch
          46 kB
          Fengnan Li
        18. HDFS-14090.017.patch
          47 kB
          Fengnan Li
        19. HDFS-14090.018.patch
          47 kB
          Fengnan Li
        20. HDFS-14090.019.patch
          47 kB
          Fengnan Li
        21. HDFS-14090.020.patch
          47 kB
          Fengnan Li
        22. HDFS-14090.021.patch
          47 kB
          Fengnan Li
        23. HDFS-14090.022.patch
          48 kB
          Fengnan Li
        24. HDFS-14090.023.patch
          48 kB
          Fengnan Li
        25. HDFS-14090.024.patch
          49 kB
          Fengnan Li
        26. HDFS-14090.025.patch
          50 kB
          Fengnan Li

              People

              • Assignee:
                fengnanli Fengnan Li
                Reporter:
                crh CR Hota
              • Votes:
                0
                Watchers:
                20
