Details

    • Type: Sub-task
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: HDFS-6581
    • Fix Version/s: 2.6.0
    • Component/s: datanode
    • Labels: None
    • Hadoop Flags: Reviewed

      Description

      The current replica eviction scheme is inefficient since it performs multiple file operations in the context of block allocation.

      A better implementation would evict replicas asynchronously when free space on the RAM disk falls below a low watermark, keeping file operations out of the block allocation path and making allocation faster.

      1. HDFS-6930.01.patch
        16 kB
        Arpit Agarwal
      2. HDFS-6930.02.patch
        17 kB
        Arpit Agarwal

          Activity

          Transition                Time In Source Status   Execution Times   Last Executer    Last Execution Date
          Open -> In Progress       6d 1h 33m               1                 Arpit Agarwal    30/Aug/14 00:01
          In Progress -> Resolved   4d 21h 52m              1                 Arpit Agarwal    03/Sep/14 21:54
          Resolved -> Closed        88d 6h 13m              1                 Arun C Murthy    01/Dec/14 03:07
          Arun C Murthy made changes -
          Status Resolved [ 5 ] Closed [ 6 ]
          Jitendra Nath Pandey made changes -
          Fix Version/s 2.6.0 [ 12327181 ]
          Fix Version/s 3.0.0 [ 12320356 ]
          Hudson added a comment -

          FAILURE: Integrated in Hadoop-Mapreduce-trunk #1914 (See https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1914/)
          HDFS-6930. Improve replica eviction from RAM disk. (Arpit Agarwal) (arp: rev cb9b485075ce773f2d6189aa2f54bbc69aad4dab)

          • hadoop-hdfs-project/hadoop-hdfs/CHANGES-HDFS-6581.txt
          • hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/impl/LazyWriteReplicaTracker.java
          • hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/impl/FsDatasetImpl.java
          Hudson added a comment -

          FAILURE: Integrated in Hadoop-Hdfs-trunk #1889 (See https://builds.apache.org/job/Hadoop-Hdfs-trunk/1889/)
          HDFS-6930. Improve replica eviction from RAM disk. (Arpit Agarwal) (arp: rev cb9b485075ce773f2d6189aa2f54bbc69aad4dab)

          • hadoop-hdfs-project/hadoop-hdfs/CHANGES-HDFS-6581.txt
          • hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/impl/FsDatasetImpl.java
          • hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/impl/LazyWriteReplicaTracker.java
          Hudson added a comment -

          FAILURE: Integrated in Hadoop-Yarn-trunk #698 (See https://builds.apache.org/job/Hadoop-Yarn-trunk/698/)
          HDFS-6930. Improve replica eviction from RAM disk. (Arpit Agarwal) (arp: rev cb9b485075ce773f2d6189aa2f54bbc69aad4dab)

          • hadoop-hdfs-project/hadoop-hdfs/CHANGES-HDFS-6581.txt
          • hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/impl/FsDatasetImpl.java
          • hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/impl/LazyWriteReplicaTracker.java
          Arpit Agarwal made changes -
          Fix Version/s 3.0.0 [ 12320356 ]
          Fix Version/s HDFS-6581 [ 12327854 ]
          Hudson added a comment -

          FAILURE: Integrated in Hadoop-trunk-Commit #6163 (See https://builds.apache.org/job/Hadoop-trunk-Commit/6163/)
          HDFS-6930. Improve replica eviction from RAM disk. (Arpit Agarwal) (arp: rev cb9b485075ce773f2d6189aa2f54bbc69aad4dab)

          • hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/impl/FsDatasetImpl.java
          • hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/impl/LazyWriteReplicaTracker.java
          • hadoop-hdfs-project/hadoop-hdfs/CHANGES-HDFS-6581.txt
          Arpit Agarwal made changes -
          Status In Progress [ 3 ] Resolved [ 5 ]
          Hadoop Flags Reviewed [ 10343 ]
          Fix Version/s HDFS-6581 [ 12327854 ]
          Resolution Fixed [ 1 ]
          Arpit Agarwal added a comment -

          Thanks Colin and Xiaoyu. Committed to the feature branch.

          Colin Patrick McCabe added a comment -

          "Colin, I filed HDFS-6988 to make the thresholds configurable."

          Thanks.

          "Updated patch attached to check capacity before division."

          +1

          Arpit Agarwal made changes -
          Link This issue is duplicated by HDFS-6991 [ HDFS-6991 ]
          Arpit Agarwal made changes -
          Link This issue is required by HDFS-6950 [ HDFS-6950 ]
          Arpit Agarwal made changes -
          Link This issue is required by HDFS-6950 [ HDFS-6950 ]
          Arpit Agarwal made changes -
          Link This issue is required by HDFS-6977 [ HDFS-6977 ]
          Arpit Agarwal made changes -
          Link This issue is required by HDFS-6991 [ HDFS-6991 ]
          Arpit Agarwal made changes -
          Attachment HDFS-6930.02.patch [ 12666143 ]
          Arpit Agarwal made changes -
          Attachment HDFS-6950.1.patch [ 12666141 ]
          Arpit Agarwal made changes -
          Attachment HDFS-6950.1.patch [ 12666141 ]
          Arpit Agarwal made changes -
          Attachment HDFS-6930.02.patch [ 12666137 ]
          Arpit Agarwal made changes -
          Attachment HDFS-6930.02.patch [ 12666137 ]
          Arpit Agarwal made changes -
          Attachment HDFS-6930.02.patch [ 12666134 ]
          Arpit Agarwal made changes -
          Attachment HDFS-6930.02.patch [ 12666134 ]
          Arpit Agarwal added a comment -

          Thanks for the reviews.

          Colin, I filed HDFS-6988 to make the thresholds configurable.

          Xiaoyu, updated patch attached to check capacity before division.

          Xiaoyu Yao added a comment -

          +1

          Can you check if capacity > 0? It can be 0 when RAM_DISK volumes can be added or removed dynamically:

           int percentFree = (int) (free * 100 / capacity);
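
A minimal sketch of the guard requested above, not the code from the patch. The class and method names are hypothetical, and returning 0 for a non-positive capacity is an assumption: it treats a missing RAM disk volume as having no free space.

```java
final class PercentFree {
    private PercentFree() {}

    /**
     * Returns free space as a whole percentage of capacity.
     * When capacity is not positive (for example, no RAM_DISK volume is
     * currently attached), returns 0 instead of dividing by zero.
     * The 0 fallback is an assumption made for this sketch.
     */
    static int percentFree(long free, long capacity) {
        if (capacity <= 0) {
            return 0;
        }
        return (int) (free * 100 / capacity);
    }
}
```

With this guard, a caller comparing the result against a percentage threshold never triggers an ArithmeticException when the last RAM disk volume has been removed.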
          
          Colin Patrick McCabe added a comment -

          Eviction is done when we have less than 10% free space or insufficient space for 3 default-length blocks.

          One thing that might be suboptimal here is that we're using the dfs.blocksize configuration key on the DataNode and assuming that will be the same value used by the client. Clearly, the client could use 256 MB blocks, whereas the DN could use 128 MB blocks. Etc.

          Also, we don't really know how big the ramdisks are going to be. I can easily see a 300 GB ramdisk being used in a few years. Just defaulting to keeping 10% free seems like too much.

          So, why not just have a minimum free space configuration key? It could be specified as a number of bytes, rather than as a percentage. So we could default it to 128 MB * 3 to get your current default of leaving space for 3 blocks. This would work better for bigger ramdisks (unlike a percentage-based scheme) and wouldn't make assumptions about the client's and DN's block size configuration being the same.
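
The arithmetic behind this suggestion can be sketched as follows. This is an illustration only: the class, constants and method names are hypothetical, and no real Hadoop configuration key is implied. It compares how much space a 10% reserve and a fixed 128 MB * 3 reserve would hold back on a 300 GB ramdisk.

```java
final class FreeSpaceThresholds {
    static final long MB = 1024L * 1024L;
    static final long GB = 1024L * MB;

    // Hypothetical default for a byte-based minimum: room for three 128 MB blocks.
    static final long DEFAULT_MIN_FREE_BYTES = 3 * 128 * MB;

    private FreeSpaceThresholds() {}

    /** Bytes held back by a percentage-based reserve. */
    static long percentReserve(long capacityBytes, int percent) {
        return capacityBytes * percent / 100;
    }

    /** Byte-based check: evict only when free space drops below the minimum. */
    static boolean shouldEvict(long freeBytes, long minFreeBytes) {
        return freeBytes < minFreeBytes;
    }

    public static void main(String[] args) {
        long capacity = 300 * GB;
        System.out.println("10% reserve:  " + percentReserve(capacity, 10) / GB + " GB");
        System.out.println("byte reserve: " + DEFAULT_MIN_FREE_BYTES / MB + " MB");
    }
}
```

On a 300 GB ramdisk the percentage scheme keeps 30 GB idle, while the byte-based default keeps 384 MB, and the byte-based check does not depend on the DataNode's block size setting at all.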

          Arpit Agarwal made changes -
          Status Open [ 1 ] In Progress [ 3 ]
          Arpit Agarwal made changes -
          Assignee Arpit Agarwal [ arpitagarwal ]
          Arpit Agarwal made changes -
          Attachment HDFS-6930.01.patch [ 12665428 ]
          Arpit Agarwal made changes -
          Attachment HDFS-6930.01.patch [ 12665424 ]
          Arpit Agarwal added a comment -

          The lazyWriter also removes replicas from RAM disk when RAM disk free space falls below one of the two watermarks:

          1. Less than 10% free space
          2. Insufficient space for 3 default length blocks.

          Eviction is always done immediately after saving blocks, since only saved blocks become eligible for eviction.

          Saving/eviction could potentially still be bursty since it is timer-based. We could make it behave more smoothly in the future.
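
The behaviour described in this comment can be modelled with a small in-memory sketch. It is not the LazyWriteReplicaTracker or FsDatasetImpl code. The class, fields and the oldest-first eviction order are assumptions made for illustration, and "saving" is reduced to setting a flag.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.Iterator;

final class LazyWriterSketch {
    static final int LOW_WATERMARK_PERCENT = 10; // watermark 1 from the comment
    static final int LOW_WATERMARK_BLOCKS = 3;   // watermark 2 from the comment

    /** Hypothetical stand-in for a replica held on RAM disk. */
    static final class Replica {
        final long bytes;
        boolean persisted; // true once a copy has been saved to disk
        Replica(long bytes) { this.bytes = bytes; }
    }

    final long capacity;
    final long blockSize;
    long free;
    final Deque<Replica> replicas = new ArrayDeque<>(); // oldest first (assumed order)

    LazyWriterSketch(long capacity, long blockSize) {
        if (capacity <= 0) {
            throw new IllegalArgumentException("capacity must be positive");
        }
        this.capacity = capacity;
        this.blockSize = blockSize;
        this.free = capacity;
    }

    void addReplica(long bytes) {
        replicas.addLast(new Replica(bytes));
        free -= bytes;
    }

    /** True when either of the two watermarks is breached. */
    boolean belowLowWatermark() {
        long percentFree = free * 100 / capacity;
        return percentFree < LOW_WATERMARK_PERCENT
            || free < LOW_WATERMARK_BLOCKS * blockSize;
    }

    /** One timer tick: save first, then evict saved replicas. Returns the number evicted. */
    int runOnce() {
        for (Replica r : replicas) {
            r.persisted = true; // placeholder for the real save-to-disk step
        }
        int evicted = 0;
        Iterator<Replica> it = replicas.iterator();
        while (belowLowWatermark() && it.hasNext()) {
            Replica r = it.next();
            if (r.persisted) { // only saved replicas are eligible for eviction
                it.remove();
                free += r.bytes;
                evicted++;
            }
        }
        return evicted;
    }
}
```

Because eviction runs inside the same timer tick as saving, a burst of writes between ticks is handled in one batch, which is the burstiness the comment mentions.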

          Arpit Agarwal made changes -
          Attachment HDFS-6930.01.patch [ 12665424 ]
          Arpit Agarwal made changes -
          Field Original Value New Value
          Component/s datanode [ 12312927 ]
          Arpit Agarwal created issue -

            People

            • Assignee: Arpit Agarwal
            • Reporter: Arpit Agarwal
            • Votes: 0
            • Watchers: 4
