  1. HBase
  2. HBASE-20526

Multithreaded bulkload performance

Details

    • Improvement
    • Status: Patch Available
    • Minor
    • Resolution: Unresolved
    • 1.2.5, 1.3.2
    • None
    • mapreduce, Zookeeper
    • hbase-server-1.2.0-cdh5.12.1 

      spark version 1.6

    Description

      During a bulkload, the interactions with ZooKeeper needed to get the region key ranges can take a long time.

      In a multithreaded environment, this step may take 5 minutes or more.

      In the executor log, entries like 'Reading reply sessionid:0x262fb37f4a07080 , packet:: clientPath:null server ...' appear many times.

       

      It would be useful to provide a new bulkload method that lets the key ranges be cached outside of it.
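The caching idea can be illustrated with a minimal sketch. This is not the patch attached to this issue: `CachedKeyRanges`, its `loader` argument, and `regionIndexFor` are hypothetical names used only for illustration. The loader stands in for whatever expensive lookup returns the sorted region start keys (in HBase this might wrap something like `RegionLocator.getStartEndKeys()`, which is an assumption here). The point is that the lookup runs once and every bulkload thread reuses the result.

```java
import java.util.Arrays;
import java.util.function.Supplier;

/** Hypothetical helper: loads region start keys once and shares them across threads. */
public class CachedKeyRanges {
    // Stands in for the expensive ZooKeeper/meta lookup (an assumption, not an HBase API).
    private final Supplier<byte[][]> loader;
    // Sorted region start keys; loaded lazily on first use.
    private volatile byte[][] startKeys;

    public CachedKeyRanges(Supplier<byte[][]> loader) {
        this.loader = loader;
    }

    /** Returns the cached start keys, invoking the loader at most once. */
    public byte[][] startKeys() {
        byte[][] keys = startKeys;
        if (keys == null) {
            synchronized (this) {
                if (startKeys == null) {
                    startKeys = loader.get();
                }
                keys = startKeys;
            }
        }
        return keys;
    }

    /**
     * Index of the region whose range contains rowKey, or -1 if rowKey sorts
     * before the first start key (not expected when the first key is empty).
     */
    public int regionIndexFor(byte[] rowKey) {
        byte[][] keys = startKeys();
        int idx = Arrays.binarySearch(keys, rowKey, CachedKeyRanges::compareUnsigned);
        // On a miss, binarySearch returns -(insertionPoint) - 1; the owning
        // region is the one just before the insertion point.
        return idx >= 0 ? idx : -(idx + 2);
    }

    /** Lexicographic comparison treating bytes as unsigned values. */
    static int compareUnsigned(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int diff = (a[i] & 0xff) - (b[i] & 0xff);
            if (diff != 0) {
                return diff;
            }
        }
        return a.length - b.length;
    }
}
```

A single `CachedKeyRanges` instance would be created before the worker threads start and passed to each of them. The cache is never refreshed in this sketch, so it assumes regions do not split or move during the load.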

       

          People

            Assignee: Key Hutu
            Reporter: Key Hutu
            Votes: 0
            Watchers: 2

              Time Tracking

                Estimated: 96h
                Remaining: 96h
                Logged: Not Specified