Uploaded image for project: 'Hadoop HDFS'
  1. Hadoop HDFS
  2. HDFS-10690

Optimize insertion/removal of replica in ShortCircuitCache

    XMLWordPrintableJSON

Details

    • Improvement
    • Status: Resolved
    • Major
    • Resolution: Fixed
    • 3.0.0-alpha2
    • 2.8.0, 3.0.0-alpha2
    • hdfs-client
    • None
    • Reviewed
    • ShortCircuitCache

    Description

      Currently in ShortCircuitCache, two TreeMap objects are used to track the cached replicas.

      private final TreeMap<Long, ShortCircuitReplica> evictable = new TreeMap<>();
      private final TreeMap<Long, ShortCircuitReplica> evictableMmapped = new TreeMap<>();

      TreeMap employs Red-Black tree for sorting. This isn't an issue when using traditional HDD. But when using high-performance SSD/PCIe Flash, the cost inserting/removing an entry becomes considerable.

      To mitigate it, we designed a new list-based for replica tracking.

      The list is a double-linked FIFO. FIFO is time-based, thus insertion is a very low cost operation. On the other hand, list is not lookup-friendly. To address this issue, we introduce two references into ShortCircuitReplica object.

      ShortCircuitReplica next = null;
      ShortCircuitReplica prev = null;

      In this way, lookup is not needed when removing a replica from the list. We only need to modify its predecessor's and successor's references in the lists.

      Our tests showed up to 15-50% performance improvement when using PCIe flash as storage media.

      The original patch is against 2.6.4, now I am porting to Hadoop trunk, and patch will be posted soon.

      Attachments

        1. HDFS-10690.001.patch
          22 kB
          Fenghua Hu
        2. HDFS-10690.002.patch
          23 kB
          Fenghua Hu
        3. HDFS-10690.003.patch
          7 kB
          Fenghua Hu
        4. HDFS-10690.004.patch
          8 kB
          Fenghua Hu
        5. HDFS-10690.005.patch
          8 kB
          Fenghua Hu
        6. HDFS-10690.006.patch
          8 kB
          Fenghua Hu
        7. HDFS-10690.007.patch
          11 kB
          Fenghua Hu
        8. HDFS-10690.008.patch
          12 kB
          Fenghua Hu
        9. ShortCircuitCache_LinkedMap.patch
          7 kB
          Xiaoyu Yao

        Issue Links

          Activity

            People

              fenghua_hu Fenghua Hu
              fenghua_hu Fenghua Hu
              Votes:
              0 Vote for this issue
              Watchers:
              11 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved:

                Time Tracking

                  Estimated:
                  Original Estimate - 336h
                  336h
                  Remaining:
                  Remaining Estimate - 336h
                  336h
                  Logged:
                  Time Spent - Not Specified
                  Not Specified