Uploaded image for project: 'Hadoop HDFS'
  1. Hadoop HDFS
  2. HDFS-10690

Optimize insertion/removal of replica in ShortCircuitCache

    Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 3.0.0-alpha2
    • Fix Version/s: 2.8.0, 3.0.0-alpha2
    • Component/s: hdfs-client
    • Labels:
      None
    • Target Version/s:
    • Hadoop Flags:
      Reviewed
    • Tags:
      ShortCircuitCache

      Description

      Currently in ShortCircuitCache, two TreeMap objects are used to track the cached replicas.

      private final TreeMap<Long, ShortCircuitReplica> evictable = new TreeMap<>();
      private final TreeMap<Long, ShortCircuitReplica> evictableMmapped = new TreeMap<>();

      TreeMap employs Red-Black tree for sorting. This isn't an issue when using traditional HDD. But when using high-performance SSD/PCIe Flash, the cost inserting/removing an entry becomes considerable.

      To mitigate it, we designed a new list-based for replica tracking.

      The list is a double-linked FIFO. FIFO is time-based, thus insertion is a very low cost operation. On the other hand, list is not lookup-friendly. To address this issue, we introduce two references into ShortCircuitReplica object.

      ShortCircuitReplica next = null;
      ShortCircuitReplica prev = null;

      In this way, lookup is not needed when removing a replica from the list. We only need to modify its predecessor's and successor's references in the lists.

      Our tests showed up to 15-50% performance improvement when using PCIe flash as storage media.

      The original patch is against 2.6.4, now I am porting to Hadoop trunk, and patch will be posted soon.

        Attachments

        1. ShortCircuitCache_LinkedMap.patch
          7 kB
          Xiaoyu Yao
        2. HDFS-10690.008.patch
          12 kB
          Fenghua Hu
        3. HDFS-10690.007.patch
          11 kB
          Fenghua Hu
        4. HDFS-10690.006.patch
          8 kB
          Fenghua Hu
        5. HDFS-10690.005.patch
          8 kB
          Fenghua Hu
        6. HDFS-10690.004.patch
          8 kB
          Fenghua Hu
        7. HDFS-10690.003.patch
          7 kB
          Fenghua Hu
        8. HDFS-10690.002.patch
          23 kB
          Fenghua Hu
        9. HDFS-10690.001.patch
          22 kB
          Fenghua Hu

          Issue Links

            Activity

              People

              • Assignee:
                fenghua_hu Fenghua Hu
                Reporter:
                fenghua_hu Fenghua Hu
              • Votes:
                0 Vote for this issue
                Watchers:
                10 Start watching this issue

                Dates

                • Created:
                  Updated:
                  Resolved:

                  Time Tracking

                  Estimated:
                  Original Estimate - 336h
                  336h
                  Remaining:
                  Remaining Estimate - 336h
                  336h
                  Logged:
                  Time Spent - Not Specified
                  Not Specified