HDFS-11334 (sub-task of HDFS-10285: Storage Policy Satisfier in HDFS; project: Hadoop HDFS)

[SPS]: NN switch and rescheduling movements can lead to have more than one coordinator for same file blocks

    Details

    • Type: Sub-task
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: HDFS-10285
    • Fix Version/s: HDFS-10285, 3.2.0
    • Component/s: datanode, namenode
    • Labels:
      None
    • Hadoop Flags:
      Reviewed

      Description

      This summarizes the scenarios that Rakesh and I discussed offline.
      We need to consider the following cases:

      1. NN switch - after a switch, NN will start scheduling afresh for all files.
        Meanwhile, the old co-ordinators may continue their movement work and send results back. NN SPS then cannot tell which result is the right one.
        NEED TO HANDLE
      2. DN disconnected beyond heartbeat expiry - if a DN is disconnected for a long time (more than the heartbeat expiry), NN will remove this node. After the SPS Monitor times out, it may retry the files that were scheduled to that DN by finding a new co-ordinator. But if the DN reconnects after NN has rescheduled, NN may receive different results from different co-ordinators.
        NEED TO HANDLE
      3. NN restart - should be the same as point 1.
      4. DN disconnect - when a DN is disconnected briefly and reconnects immediately (before heartbeat expiry), there should not be any issues.
        NEED NOT HANDLE, but we can think of more scenarios in case anything is missing
      5. DN restart - if a DN is restarted, it cannot send any results because it loses all of its state. After the NN SPS Monitor timeout, NN will retry.
        NEED NOT HANDLE, but we can think of more scenarios in case anything is missing
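
      One possible way to reason about cases 1 and 2 is sketched below. It is not taken from the attached patches and does not use any real HDFS class: `MovementResultTracker`, `schedule`, `acceptResult`, the `epoch` value and the string DN names are all hypothetical. The assumption is that NN tags every scheduling attempt with a token built from an epoch (some value that increases on each NN switch or restart) and a per-NN sequence number, and accepts a result only if it carries the token of the latest attempt for that file.

```java
import java.util.HashMap;
import java.util.Map;

/**
 * Hypothetical sketch, not the HDFS SPS implementation: the NN side keeps
 * only the latest scheduling attempt per file and drops results that
 * belong to any older attempt or to a different co-ordinator.
 */
class MovementResultTracker {

  /** Latest attempt for one file: which DN co-ordinates it and under which token. */
  private static final class Attempt {
    final String coordinator;
    final long token;

    Attempt(String coordinator, long token) {
      this.coordinator = coordinator;
      this.token = token;
    }
  }

  private final long epoch;   // assumed to increase on every NN switch/restart
  private long nextSeq = 1;   // per-NN sequence number
  private final Map<Long, Attempt> latest = new HashMap<>();

  MovementResultTracker(long epoch) {
    this.epoch = epoch;
  }

  /** Schedules (or reschedules) a file; any earlier attempt is superseded. */
  synchronized long schedule(long fileId, String coordinator) {
    // Pack the epoch into the high 32 bits and the sequence into the low 32 bits.
    long token = (epoch << 32) | (nextSeq++ & 0xFFFFFFFFL);
    latest.put(fileId, new Attempt(coordinator, token));
    return token;
  }

  /** Returns true only for a result from the latest attempt of that file. */
  synchronized boolean acceptResult(long fileId, String coordinator, long token) {
    Attempt a = latest.get(fileId);
    if (a == null || a.token != token || !a.coordinator.equals(coordinator)) {
      return false; // stale co-ordinator, old NN epoch, or unknown file
    }
    latest.remove(fileId); // the attempt is finished
    return true;
  }
}
```

      Under these assumptions, a DN that reconnects after NN has rescheduled (case 2) reports an old token and is ignored, and a co-ordinator that was started by the previous NN (cases 1 and 3) reports a token with an older epoch and is ignored as well.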

        Attachments

        1. HDFS-11334-HDFS-10285-04.patch
          43 kB
          Rakesh Radhakrishnan
        2. HDFS-11334-HDFS-10285-03.patch
          43 kB
          Rakesh Radhakrishnan
        3. HDFS-11334-HDFS-10285-02.patch
          42 kB
          Rakesh Radhakrishnan
        4. HDFS-11334-HDFS-10285-01.patch
          37 kB
          Rakesh Radhakrishnan
        5. HDFS-11334-HDFS-10285-00.patch
          33 kB
          Rakesh Radhakrishnan

            People

            • Assignee:
              rakeshr Rakesh Radhakrishnan
              Reporter:
              umamaheswararao Uma Maheswara Rao G
            • Votes:
              0
              Watchers:
              3