  1. Hadoop HDFS
  2. HDFS-13891 HDFS RBF stabilization phase I
  3. HDFS-14440

RBF: Optimize the file write process in case of multiple destinations.

Details

    • Type: Sub-task
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 3.3.0, HDFS-13891
    • Component/s: None
    • Labels: None
    • Hadoop Flags: Reviewed

    Description

      When a mount point has multiple destinations, we need to check whether the file already exists in one of the subclusters. For this we use the existing getBlockLocation() API, which by default is invoked sequentially.

      In the common case, where the file does not exist yet and needs to be created, every subcluster is checked one after another. These checks can be done concurrently to save time.

      In the other case, where the file is found but its last block is null, we need to call getFileInfo on all the locations to find the one where the file exists. This can also be avoided by using ConcurrentCall, since we would then already have the remoteLocation for which getBlockLocation returned a non-null entry.
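
      The idea above can be sketched in plain Java. This is only an illustration under stated assumptions, not the code from the attached patches: the class ConcurrentExistenceCheck, the method findExistingLocation, the subcluster names, and the exists predicate (a stand-in for the per-subcluster RPC) are all hypothetical. It shows the two points from the description: the probes run in parallel, and the result already identifies which location holds the file, so no second lookup across all locations is needed.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.BiPredicate;

/** Illustrative only: none of these names come from the Hadoop code base. */
public class ConcurrentExistenceCheck {

  /**
   * Probes every subcluster in parallel and returns the first one (in the
   * order given) that reports the file as existing, or empty if none does.
   *
   * @param subclusters hypothetical subcluster identifiers
   * @param path        file path to look up
   * @param exists      stand-in for the per-subcluster existence RPC
   */
  public static Optional<String> findExistingLocation(
      List<String> subclusters, String path, BiPredicate<String, String> exists)
      throws InterruptedException, ExecutionException {
    ExecutorService pool =
        Executors.newFixedThreadPool(Math.max(1, subclusters.size()));
    try {
      List<Callable<Boolean>> probes = new ArrayList<>();
      for (String sc : subclusters) {
        probes.add(() -> exists.test(sc, path));
      }
      // invokeAll blocks until every probe has finished; the returned
      // futures are in the same order as the submitted probes.
      List<Future<Boolean>> results = pool.invokeAll(probes);
      for (int i = 0; i < results.size(); i++) {
        if (results.get(i).get()) {
          // The index tells us which location answered positively.
          return Optional.of(subclusters.get(i));
        }
      }
      return Optional.empty();
    } finally {
      pool.shutdown();
    }
  }
}
```

      With this shape, the total wait is roughly that of the slowest subcluster rather than the sum over all of them. An empty result means the file can be created at the default location.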

       

      Attachments

        1. HDFS-14440-HDFS-13891-06.patch
          3 kB
          Ayush Saxena
        2. HDFS-14440-HDFS-13891-05.patch
          3 kB
          Ayush Saxena
        3. HDFS-14440-HDFS-13891-04.patch
          3 kB
          Ayush Saxena
        4. HDFS-14440-HDFS-13891-03.patch
          3 kB
          Ayush Saxena
        5. HDFS-14440-HDFS-13891-02.patch
          3 kB
          Ayush Saxena
        6. HDFS-14440-HDFS-13891-01.patch
          3 kB
          Ayush Saxena

          People

            Assignee: Ayush Saxena (ayushtkn)
            Reporter: Ayush Saxena (ayushtkn)
            Votes: 0
            Watchers: 5

            Dates

              Created:
              Updated:
              Resolved: