Details

    • Type: Improvement
    • Status: Closed
    • Priority: Major
    • Resolution: Duplicate
    • Affects Version/s: 0.2.0
    • Fix Version/s: 0.2.0
    • Component/s: None
    • Labels:
      None
    • Environment:

      large clusters

      Description

      When dfs.data.dir has multiple values, we currently start a separate DataNode for each directory (all in the same JVM). Instead, we should run a single DataNode that stores block files across the different directories. This will reduce the number of connections to the namenode. We cannot simply hash block ids to directories, because the devices may be filled to different degrees. So the datanode will need to keep a table mapping each block id to its file location, and place new blocks on the less full devices.
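      The bookkeeping described above could look roughly like the following. This is a minimal sketch, not the DataNode implementation: the class name BlockLocationTable, its methods, and the in-memory free-space counters are all hypothetical and chosen for illustration. A real datanode would query the file system for free space and persist block files on disk.

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch: one table for all data directories of a single DataNode.
final class BlockLocationTable {
    // Free bytes per data directory. Assumption: tracked in memory for the sketch;
    // a real datanode would ask the file system instead.
    private final Map<String, Long> freeBytes = new LinkedHashMap<>();
    // The table from the description: block id -> directory holding the block file.
    private final Map<Long, String> blockToDir = new HashMap<>();

    BlockLocationTable(Map<String, Long> initialFreeBytes) {
        freeBytes.putAll(initialFreeBytes);
    }

    // Places a new block in the directory with the most free space and records it.
    synchronized String addBlock(long blockId, long blockSize) {
        String best = null;
        for (Map.Entry<String, Long> e : freeBytes.entrySet()) {
            if (best == null || e.getValue() > freeBytes.get(best)) {
                best = e.getKey();
            }
        }
        if (best == null || freeBytes.get(best) < blockSize) {
            throw new IllegalStateException("no directory has room for block " + blockId);
        }
        freeBytes.put(best, freeBytes.get(best) - blockSize);
        blockToDir.put(blockId, best);
        return best;
    }

    // Looks up where a block was stored; returns null for an unknown block id.
    synchronized String locate(long blockId) {
        return blockToDir.get(blockId);
    }
}
```

      Choosing the directory with the most free space is only one possible placement policy. The point of the sketch is that an explicit table, rather than a hash of the block id, lets the datanode place blocks unevenly and still find them later.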

          Activity

          There are no comments yet on this issue.

            People

            • Assignee:
              Konstantin Shvachko
              Reporter:
              Doug Cutting
            • Votes:
              0
              Watchers:
              0

              Dates

              • Created:
                Updated:
                Resolved:
