Uploaded image for project: 'HBase'
  1. HBase
  2. HBASE-13308

Fix flaky TestEndToEndSplitTransaction

    XMLWordPrintableJSON

Details

    • Bug
    • Status: Closed
    • Major
    • Resolution: Fixed
    • 1.1.0, 2.0.0
    • 1.1.0, 2.0.0
    • test
    • None
    • Reviewed

    Description

      https://builds.apache.org/job/HBase-TRUNK-jacoco/24/testReport/junit/org.apache.hadoop.hbase.regionserver/TestEndToEndSplitTransaction/testFromClientSideWhileSplitting/

      First, we split 'e9eb97847340ea7c6b9616d63d62a784' to 'abe1973ea732066b12d8e33fce12a951' and '4940dad7ef9b4b699fd13eede5740d9d'.

      And then, we try to split 'abe1973ea732066b12d8e33fce12a951'.

      2015-03-21 03:58:46,970 INFO  [Thread-191] regionserver.TestEndToEndSplitTransaction(399): Initiating region split for:testFromClientSideWhileSplitting,,1426910324847.abe1973ea732066b12d8e33fce12a951.
      2015-03-21 03:58:46,976 INFO  [PriorityRpcServer.handler=7,queue=1,port=54177] regionserver.RSRpcServices(1596): Splitting testFromClientSideWhileSplitting,,1426910324847.abe1973ea732066b12d8e33fce12a951.
      2015-03-21 03:58:46,977 DEBUG [PriorityRpcServer.handler=7,queue=1,port=54177] regionserver.CompactSplitThread(259): Split requested for testFromClientSideWhileSplitting,,1426910324847.abe1973ea732066b12d8e33fce12a951..  compaction_queue=(0:0), split_queue=1, merge_queue=0
      2015-03-21 03:58:46,978 INFO  [Thread-191] regionserver.TestEndToEndSplitTransaction(399): blocking until region is split:testFromClientSideWhileSplitting,,1426910324847.abe1973ea732066b12d8e33fce12a951.
      2015-03-21 03:58:46,985 DEBUG [RS:0;priapus:54177-splits-1426910324832] lock.ZKInterProcessLockBase(226): Acquired a lock for /hbase/table-lock/testFromClientSideWhileSplitting/read-regionserver:541770000000002
      2015-03-21 03:58:46,988 DEBUG [RS:0;priapus:54177-splits-1426910324832] lock.ZKInterProcessLockBase(328): Released /hbase/table-lock/testFromClientSideWhileSplitting/read-regionserver:541770000000002
      2015-03-21 03:58:46,988 DEBUG [Thread-191] ipc.AsyncRpcClient(163): Use global event loop group NioEventLoopGroup
      2015-03-21 03:58:46,988 INFO  [RS:0;priapus:54177-splits-1426910324832] regionserver.SplitRequest(142): Split transaction journal:
      	STARTED at 1426910326977
      

      We can see that it failed without any error message.
      I think can only happen when the parent is not splittable or we can not find a splitrow.

      2015-03-21 03:58:47,019 INFO  [RS:0;priapus:54177-shortCompactions-1426910324308] regionserver.HStore(1334): Completed major compaction of 2 (all) file(s) in family of testFromClientSideWhileSplitting,,1426910324847.abe1973ea732066b12d8e33fce12a951. into e97bccdc4b014c15a52a579cd49ebb31(size=12.6 K), total size for store is 12.6 K. This selection was in queue for 0sec, and took 0sec to execute.
      2015-03-21 03:58:47,019 INFO  [RS:0;priapus:54177-shortCompactions-1426910324308] regionserver.CompactSplitThread$CompactionRunner(523): Completed compaction: Request = regionName=testFromClientSideWhileSplitting,,1426910324847.abe1973ea732066b12d8e33fce12a951., storeName=family, fileCount=2, fileSize=25.5 K, priority=1, time=14542808784655186; duration=0sec
      2015-03-21 03:58:47,020 DEBUG [RS:0;priapus:54177-shortCompactions-1426910324308] regionserver.CompactSplitThread$CompactionRunner(546): CompactSplitThread Status: compaction_queue=(0:0), split_queue=0, merge_queue=0
      

      We can see that, the compaction was completed at 03:58:47,019, but split was started at 03:58:46,970 which is earlier.
      So we have a reference file and is not splittable.

      I think the problem is 'compactAndBlockUntilDone' is not reliable, it may return before the compaction complete.

      Will try to prepare a patch.

      Attachments

        1. HBASE-13308.patch
          17 kB
          Duo Zhang

        Activity

          People

            zhangduo Duo Zhang
            zhangduo Duo Zhang
            Votes:
            0 Vote for this issue
            Watchers:
            5 Start watching this issue

            Dates

              Created:
              Updated:
              Resolved: