HBASE-20526

Multithreaded bulkload performance

    Details

    • Type: Improvement
    • Status: Patch Available
    • Priority: Minor
    • Resolution: Unresolved
    • Affects Version/s: 1.2.5, 1.3.2
    • Fix Version/s: None
    • Component/s: mapreduce, Zookeeper
    • Labels:
    • Environment:

      hbase-server-1.2.0-cdh5.12.1 

      spark version 1.6

      Description

      During a bulkload, the interactions with ZooKeeper needed to get the region key ranges can take a long time.

      In a multithreaded environment, this step can take 5 minutes or more.

      In the executor log, entries like 'Reading reply sessionid:0x262fb37f4a07080 , packet:: clientPath:null server ...' appear many times.

      It would help to provide a new bulkload method that lets the key ranges be cached outside of it.
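
      As a rough illustration of the idea (not the attached patch), the sketch below loads the region start keys once and lets every worker thread reuse them instead of repeating the lookup. `RegionKeyRangeCache` and its `loader` are hypothetical names introduced here; in a real job the loader would presumably wrap an HBase call that returns the region start keys (for example `RegionLocator.getStartKeys()`), which is an assumption about how this would be wired up.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.Supplier;

/** Hypothetical helper: loads region start keys once and shares them across threads. */
final class RegionKeyRangeCache {
    private final Supplier<List<byte[]>> loader; // stands in for the expensive region lookup
    private volatile byte[][] startKeys;         // sorted start keys, filled on first use

    RegionKeyRangeCache(Supplier<List<byte[]>> loader) {
        this.loader = loader;
    }

    /** Returns the cached start keys; the loader runs at most once. */
    private byte[][] startKeys() {
        byte[][] keys = startKeys;
        if (keys == null) {
            synchronized (this) {
                keys = startKeys;
                if (keys == null) {
                    keys = loader.get().toArray(new byte[0][]);
                    Arrays.sort(keys, RegionKeyRangeCache::compareUnsigned);
                    startKeys = keys;
                }
            }
        }
        return keys;
    }

    /**
     * Index of the region whose start key is the greatest one not above rowKey,
     * or -1 if rowKey sorts before every start key. In HBase the first region
     * has an empty start key, so -1 should not occur with real region data.
     */
    int regionIndexFor(byte[] rowKey) {
        byte[][] keys = startKeys();
        int idx = Arrays.binarySearch(keys, rowKey, RegionKeyRangeCache::compareUnsigned);
        return idx >= 0 ? idx : -idx - 2; // insertion point minus one
    }

    /** Lexicographic comparison treating bytes as unsigned values. */
    static int compareUnsigned(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int diff = (a[i] & 0xff) - (b[i] & 0xff);
            if (diff != 0) {
                return diff;
            }
        }
        return a.length - b.length;
    }
}
```

      One instance would be created per job and passed to each thread, so only the first call to `regionIndexFor` pays for the lookup. The cache is never refreshed in this sketch, so it assumes regions do not split or merge while the bulkload runs.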

       

            People

            • Assignee: Key Hutu
            • Reporter: Key Hutu
            • Votes: 0
            • Watchers: 2

                Time Tracking

                Estimated: 96h
                Remaining: 96h
                Logged: Not Specified