Uploaded image for project: 'HBase'
  1. HBase
  2. HBASE-6339

Bulkload call to RS should begin holding write lock only after the file has been transferred

    Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Major
    • Resolution: Invalid
    • Affects Version/s: 0.90.0
    • Fix Version/s: None
    • Component/s: Client, regionserver
    • Labels:
      None

      Description

      I noticed that right now, under a bulkLoadHFiles call to an RS, we grab the HRegion write lock as soon as we determine that it is a multi-family bulk load we'll be attempting. The file copy from the caller's source FS is done after holding the lock.

      This doesn't seem right. For instance, we had a recent use-case where the bulk load running cluster is a separate HDFS instance/cluster than the one that runs HBase and the transfers between these FSes can get slower than an intra-cluster transfer. Hence I think we should begin to hold the write lock only after we've got a successful destinationFS copy of the requested file, and thereby allow more write throughput to pass.

      Does this sound reasonable to do?

        Attachments

          Activity

            People

            • Assignee:
              qwertymaniac Harsh J
              Reporter:
              qwertymaniac Harsh J
            • Votes:
              0 Vote for this issue
              Watchers:
              3 Start watching this issue

              Dates

              • Created:
                Updated:
                Resolved: