HBase / HBASE-13011

TestLoadIncrementalHFiles is flaky when using AsyncRpcClient as client implementation

Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 1.1.0, 2.0.0
    • Fix Version/s: 1.1.0, 2.0.0
    • Component/s: None
    • Labels: None
    • Hadoop Flags: Reviewed

    Description

      The test sometimes fails with a timeout.
      https://builds.apache.org/job/PreCommit-HBASE-Build/12769/testReport/junit/org.apache.hadoop.hbase.mapreduce/TestLoadIncrementalHFiles/testSimpleLoad/

      Digging into it, I found this in the client log:

      2015-02-11 02:01:47,304 INFO  [LoadIncrementalHFiles-1] mapreduce.LoadIncrementalHFiles(563): Trying to load hfile=hdfs://localhost:59736/user/jenkins/test-data/d964a632-8db5-4f3a-966f-89746947294b/testSimpleLoad/myfam/hfile_1 first=ddd last=ooo
      2015-02-11 02:01:47,308 INFO  [LoadIncrementalHFiles-0] mapreduce.LoadIncrementalHFiles(563): Trying to load hfile=hdfs://localhost:59736/user/jenkins/test-data/d964a632-8db5-4f3a-966f-89746947294b/testSimpleLoad/myfam/hfile_0 first=aaaa last=cccc
      2015-02-11 02:01:47,317 DEBUG [LoadIncrementalHFiles-2] mapreduce.LoadIncrementalHFiles$3(664): Going to connect to server region=bulkNS:mytable_testSimpleLoad,,1423620104753.fdcbd21e43683c753bae40f1d890daa6., hostname=asf910.gq1.ygridcore.net,41003,1423620099272, seqNum=2 for row  with hfile group [{[B@7173d25a,hdfs://localhost:59736/user/jenkins/test-data/d964a632-8db5-4f3a-966f-89746947294b/testSimpleLoad/myfam/hfile_0}]
      2015-02-11 02:01:47,320 DEBUG [LoadIncrementalHFiles-3] mapreduce.LoadIncrementalHFiles$3(664): Going to connect to server region=bulkNS:mytable_testSimpleLoad,ddd,1423620104753.ec757ff718ce8ab99f4f6bcca389d67f., hostname=asf910.gq1.ygridcore.net,41003,1423620099272, seqNum=2 for row ddd with hfile group [{[B@7173d25a,hdfs://localhost:59736/user/jenkins/test-data/d964a632-8db5-4f3a-966f-89746947294b/testSimpleLoad/myfam/hfile_1}]
      

      There are two files to commit, but the region server log then shows three validation calls:

      2015-02-11 02:01:47,327 INFO  [B.defaultRpcServer.handler=3,queue=0,port=41003] regionserver.HStore(690): Validating hfile at hdfs://localhost:59736/user/jenkins/test-data/d964a632-8db5-4f3a-966f-89746947294b/testSimpleLoad/myfam/hfile_0 for inclusion in store myfam region bulkNS:mytable_testSimpleLoad,,1423620104753.fdcbd21e43683c753bae40f1d890daa6.
      2015-02-11 02:01:47,330 INFO  [B.defaultRpcServer.handler=1,queue=0,port=41003] regionserver.HStore(690): Validating hfile at hdfs://localhost:59736/user/jenkins/test-data/d964a632-8db5-4f3a-966f-89746947294b/testSimpleLoad/myfam/hfile_1 for inclusion in store myfam region bulkNS:mytable_testSimpleLoad,ddd,1423620104753.ec757ff718ce8ab99f4f6bcca389d67f.
      2015-02-11 02:01:47,330 INFO  [B.defaultRpcServer.handler=4,queue=0,port=41003] regionserver.HStore(690): Validating hfile at hdfs://localhost:59736/user/jenkins/test-data/d964a632-8db5-4f3a-966f-89746947294b/testSimpleLoad/myfam/hfile_1 for inclusion in store myfam region bulkNS:mytable_testSimpleLoad,ddd,1423620104753.ec757ff718ce8ab99f4f6bcca389d67f.
      

      We can see that hfile_1 has been committed twice; the second call fails and causes the test to time out.
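
      The duplicate can also be found mechanically rather than by eye. The following is a minimal sketch, not part of any attached patch: it scans log lines for the "Validating hfile at ... for inclusion in store" message shown above and reports any hfile path that appears more than once. The class name DuplicateValidationFinder and the regular expression are assumptions made for illustration, and the sketch assumes each hfile is expected to be validated exactly once per test run.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for log triage; not an HBase class.
public class DuplicateValidationFinder {

    // Matches the HStore message from the log excerpt and captures the hfile path.
    private static final Pattern VALIDATING =
        Pattern.compile("Validating hfile at (\\S+) for inclusion in store");

    /** Returns each hfile path that was validated more than once, with its count. */
    public static Map<String, Integer> findDuplicates(List<String> logLines) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (String line : logLines) {
            Matcher m = VALIDATING.matcher(line);
            if (m.find()) {
                counts.merge(m.group(1), 1, Integer::sum);
            }
        }
        // Keep only paths seen at least twice.
        counts.values().removeIf(c -> c < 2);
        return counts;
    }

    public static void main(String[] args) throws IOException {
        // args[0] is the path to a saved test log.
        List<String> lines = Files.readAllLines(Paths.get(args[0]));
        findDuplicates(lines).forEach(
            (path, n) -> System.out.println(n + "x " + path));
    }
}
```

      Run against the region server log of a failing build, any path it prints is a candidate for a bulk load request that was sent twice.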

      I'm not sure whether this is an issue in AsyncRpcClient, but if I use RpcClientImpl, the test always passes.

      Attachments

        1. HBASE-13011.patch
          3 kB
          Jurriaan Mous
        2. HBASE-13011_3.patch
          26 kB
          Duo Zhang
        3. HBASE-13011_3.patch
          26 kB
          Michael Stack
        4. HBASE-13011_2.patch
          25 kB
          Duo Zhang
        5. HBASE-13011_1.patch
          25 kB
          Duo Zhang

            People

              Assignee: Duo Zhang (zhangduo)
              Reporter: Duo Zhang (zhangduo)
              Votes: 0
              Watchers: 5
