Details

    • Type: Sub-task
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: HDFS-6581
    • Fix Version/s: 2.6.0
    • Component/s: datanode
    • Labels:
      None
    • Hadoop Flags:
      Reviewed

      Description

      The current replica eviction scheme is inefficient since it performs multiple file operations in the context of block allocation.

      A better implementation would evict replicas asynchronously when free space on the RAM disk falls below a low watermark, so that block allocation no longer has to wait on eviction.

      1. HDFS-6930.01.patch
        16 kB
        Arpit Agarwal
      2. HDFS-6930.02.patch
        17 kB
        Arpit Agarwal

          Activity

          Arpit Agarwal added a comment -

          The lazyWriter also removes replicas from RAM disk when RAM disk free space falls below either of two watermarks:

          1. Less than 10% free space
          2. Insufficient space for 3 default length blocks.

          Eviction is always done immediately after saving blocks, since only saved blocks become eligible for eviction.

          Saving/eviction could potentially still be bursty since it is timer-based. We could make it behave more smoothly in the future.
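The two watermarks described in this comment can be sketched as a single predicate. This is a minimal illustration, not code from the attached patch: the class name `EvictionWatermarks`, its method, and its constants are hypothetical, and the default block length is passed in by the caller.

```java
/** Hypothetical helper; names are illustrative and not taken from the patch. */
final class EvictionWatermarks {
    // Watermark 1: evict when less than 10% of the RAM disk is free.
    static final int MIN_FREE_PERCENT = 10;
    // Watermark 2: evict when fewer than 3 default-length blocks would fit.
    static final int MIN_FREE_BLOCKS = 3;

    private EvictionWatermarks() {}

    /**
     * Returns true when free space is below either watermark.
     * All sizes are in bytes.
     */
    static boolean shouldEvict(long freeBytes, long capacityBytes, long defaultBlockSize) {
        if (capacityBytes <= 0) {
            // Assumption: with no RAM disk capacity there is nothing to evict from.
            return false;
        }
        boolean belowPercent = freeBytes * 100 / capacityBytes < MIN_FREE_PERCENT;
        boolean belowBlocks = freeBytes < MIN_FREE_BLOCKS * defaultBlockSize;
        return belowPercent || belowBlocks;
    }
}
```

In this sketch a caller would run the check right after each save pass, since only saved replicas are eligible for eviction.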

          Colin Patrick McCabe added a comment -

          Eviction is done when we have Less than 10% free space or Insufficient space for 3 default length blocks.

          One thing that might be suboptimal here is that we're using the dfs.blocksize configuration key on the DataNode and assuming it will have the same value the client uses. The client could use 256 MB blocks while the DN is configured for 128 MB blocks, for example.

          Also, we don't really know how big the ramdisks are going to be. I can easily see a 300 GB ramdisk being used in a few years. Just defaulting to keeping 10% free seems like too much.

          So, why not just have a minimum free space configuration key? It could be specified as a number of bytes rather than as a percentage, and we could default it to 128 MB * 3 to keep your current default of leaving space for 3 blocks. This would work better for bigger ramdisks (unlike a percentage-based scheme) and wouldn't assume that the client's and the DN's block size configurations are the same.
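A byte-based threshold like the one proposed here could look roughly like the sketch below. The class `MinFreeSpacePolicy` and its members are hypothetical, and no configuration key name is assumed; the only value taken from the comment is the suggested default of 128 MB * 3.

```java
/** Hypothetical policy object; not part of the patch or of HDFS-6988. */
final class MinFreeSpacePolicy {
    // Suggested default from the discussion: room for three 128 MB blocks.
    static final long DEFAULT_MIN_FREE_BYTES = 3L * 128 * 1024 * 1024;

    private final long minFreeBytes;

    MinFreeSpacePolicy(long minFreeBytes) {
        if (minFreeBytes < 0) {
            throw new IllegalArgumentException("minFreeBytes must be non-negative");
        }
        this.minFreeBytes = minFreeBytes;
    }

    /** True when the RAM disk has less free space than the configured minimum. */
    boolean shouldEvict(long freeBytes) {
        return freeBytes < minFreeBytes;
    }

    /** Number of bytes that must be evicted to get back to the minimum. */
    long bytesToFree(long freeBytes) {
        return Math.max(0L, minFreeBytes - freeBytes);
    }
}
```

With this default, a 300 GB ramdisk would keep 384 MB free, where the percentage-based rule would reserve about 30 GB.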

          Xiaoyu Yao added a comment -

          +1

          Can you check if capacity > 0? It can be 0 when RAM_DISK volumes are allowed to be added or removed dynamically:

           int percentFree = (int) (free * 100 / capacity);
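One way to guard the division is sketched below. The helper `FreeSpaceMath.percentFree` is hypothetical and is not the code from the updated patch; returning an empty `OptionalInt` for a zero capacity is an assumption made here so that the caller has to decide what to do when no RAM disk volume is present.

```java
import java.util.OptionalInt;

/** Hypothetical helper showing a capacity check before the division. */
final class FreeSpaceMath {
    private FreeSpaceMath() {}

    /**
     * Returns the free space as a whole-number percentage of capacity,
     * or an empty result when capacity is not positive (for example,
     * when no RAM_DISK volume is currently attached).
     */
    static OptionalInt percentFree(long free, long capacity) {
        if (capacity <= 0) {
            return OptionalInt.empty();
        }
        return OptionalInt.of((int) (free * 100 / capacity));
    }
}
```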
          
          Arpit Agarwal added a comment -

          Thanks for the reviews.

          Colin, I filed HDFS-6988 to make the thresholds configurable.

          Xiaoyu, updated patch attached to check capacity before division.

          Colin Patrick McCabe added a comment -

          Colin, I filed HDFS-6988 to make the thresholds configurable.

          thanks

          updated patch attached to check capacity before division.

          +1

          Arpit Agarwal added a comment -

          Thanks Colin and Xiaoyu. Committed to the feature branch.

          Hudson added a comment -

          FAILURE: Integrated in Hadoop-trunk-Commit #6163 (See https://builds.apache.org/job/Hadoop-trunk-Commit/6163/)
          HDFS-6930. Improve replica eviction from RAM disk. (Arpit Agarwal) (arp: rev cb9b485075ce773f2d6189aa2f54bbc69aad4dab)

          • hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/impl/FsDatasetImpl.java
          • hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/impl/LazyWriteReplicaTracker.java
          • hadoop-hdfs-project/hadoop-hdfs/CHANGES-HDFS-6581.txt
          Hudson added a comment -

          FAILURE: Integrated in Hadoop-Yarn-trunk #698 (See https://builds.apache.org/job/Hadoop-Yarn-trunk/698/)
          HDFS-6930. Improve replica eviction from RAM disk. (Arpit Agarwal) (arp: rev cb9b485075ce773f2d6189aa2f54bbc69aad4dab)

          • hadoop-hdfs-project/hadoop-hdfs/CHANGES-HDFS-6581.txt
          • hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/impl/FsDatasetImpl.java
          • hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/impl/LazyWriteReplicaTracker.java
          Hudson added a comment -

          FAILURE: Integrated in Hadoop-Hdfs-trunk #1889 (See https://builds.apache.org/job/Hadoop-Hdfs-trunk/1889/)
          HDFS-6930. Improve replica eviction from RAM disk. (Arpit Agarwal) (arp: rev cb9b485075ce773f2d6189aa2f54bbc69aad4dab)

          • hadoop-hdfs-project/hadoop-hdfs/CHANGES-HDFS-6581.txt
          • hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/impl/FsDatasetImpl.java
          • hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/impl/LazyWriteReplicaTracker.java
          Hudson added a comment -

          FAILURE: Integrated in Hadoop-Mapreduce-trunk #1914 (See https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1914/)
          HDFS-6930. Improve replica eviction from RAM disk. (Arpit Agarwal) (arp: rev cb9b485075ce773f2d6189aa2f54bbc69aad4dab)

          • hadoop-hdfs-project/hadoop-hdfs/CHANGES-HDFS-6581.txt
          • hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/impl/LazyWriteReplicaTracker.java
          • hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/impl/FsDatasetImpl.java

            People

            • Assignee:
              Arpit Agarwal
              Reporter:
              Arpit Agarwal
            • Votes:
              0
              Watchers:
              4
