HBase / HBASE-4562

When split doing offlineParentInMeta encounters error, it'll cause data loss

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Blocker
    • Resolution: Fixed
    • Affects Version/s: 0.90.4
    • Fix Version/s: 0.90.5
    • Component/s: regionserver
    • Labels:
      None
    • Hadoop Flags:
      Reviewed

      Description

      Follow the steps below to reproduce the problem:
      1. Change SplitTransaction.java as shown below to mock a timeout error (the call to offlineParentInMeta completes, then an IOException is thrown).

      SplitTransaction.java
            if (!testing) {
              MetaEditor.offlineParentInMeta(server.getCatalogTracker(),
                 this.parent.getRegionInfo(), a.getRegionInfo(), b.getRegionInfo());
              throw new IOException("some unexpected error in split");
            }
         

      2. Rebuild the regionserver with this change and restart it.
      3. Create a table and put some data into it.
      4. Split the table.
      5. Kill the regionserver hosting the table.
      6. Wait until the master's ServerShutdownHandler.process has executed, then scan the table: the data written earlier is lost.

      The attached patch fixes the bug.
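
      The failure mode in the steps above can be sketched as a small, self-contained simulation. This is an illustration under assumptions, not the patch itself: the class SplitJournalSketch, its boolean fields, and the journal entries other than PONR are hypothetical stand-ins. The explanation it encodes (the META edit lands even though the caller sees an error, so a rollback that deletes the daughter files leaves META pointing at data that no longer exists) is one plausible reading of the steps. Only the PONR journal entry, the boolean result of rollback, and the abort message come from the snippets recorded in this issue's history.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

final class SplitJournalSketch {
    // Hypothetical journal entries; only PONR appears in the issue's snippets.
    enum JournalEntry { CREATED_DAUGHTER_FILES, CLOSED_PARENT, PONR }

    private final List<JournalEntry> journal = new ArrayList<>();

    // In-memory stand-ins for cluster state (assumptions, not HBase APIs).
    private boolean daughterFilesExist;
    private boolean parentOpenOnServer = true;
    private boolean parentOfflineInMeta;

    // Runs the split steps; always fails right after the META edit,
    // mirroring the mocked error in the reproduction steps.
    void execute(boolean journalPonrBeforeMetaEdit) throws IOException {
        journal.add(JournalEntry.CREATED_DAUGHTER_FILES);
        daughterFilesExist = true;
        journal.add(JournalEntry.CLOSED_PARENT);
        parentOpenOnServer = false;
        if (journalPonrBeforeMetaEdit) {
            journal.add(JournalEntry.PONR); // point of no return
        }
        parentOfflineInMeta = true; // the META edit actually lands...
        throw new IOException("some unexpected error in split"); // ...but the caller sees a failure
    }

    // Undoes journaled steps newest-first. Returns false, undoing nothing,
    // when the point of no return has already been journaled.
    boolean rollback() {
        for (int i = journal.size() - 1; i >= 0; i--) {
            switch (journal.get(i)) {
                case PONR:
                    return false;
                case CLOSED_PARENT:
                    parentOpenOnServer = true;
                    break;
                case CREATED_DAUGHTER_FILES:
                    daughterFilesExist = false;
                    break;
            }
        }
        return true;
    }

    // True when META says the parent was split but the daughter files are gone.
    boolean metaPointsAtMissingDaughters() {
        return parentOfflineInMeta && !daughterFilesExist;
    }

    // Drives one attempt the way the CompactSplitThread snippet does:
    // roll back if possible, otherwise abort.
    static String run(boolean journalPonrBeforeMetaEdit) {
        SplitJournalSketch st = new SplitJournalSketch();
        try {
            st.execute(journalPonrBeforeMetaEdit);
            return "split completed";
        } catch (IOException e) {
            if (st.rollback()) {
                return st.metaPointsAtMissingDaughters()
                    ? "rolled back, meta inconsistent"
                    : "rolled back cleanly";
            }
            return "abort: error after point-of-no-return";
        }
    }

    public static void main(String[] args) {
        System.out.println("without PONR entry: " + run(false));
        System.out.println("with PONR entry:    " + run(true));
    }
}
```

      In this sketch the variant without the PONR entry rolls back "successfully" while META still marks the parent as split, which is the state a later server shutdown could turn into lost data. The variant with the PONR entry refuses to roll back and reports an abort instead.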

      1. test-4562-trunk.txt
        31 kB
        bluedavy
      2. test-4562-0.92.txt
        30 kB
        bluedavy
      3. test-4562-0.90.txt
        21 kB
        bluedavy
      4. test-4562-0.90.4.txt
        21 kB
        bluedavy
      5. HBASE-4562-trunk.patch
        3 kB
        bluedavy
      6. HBASE-4562-0.92.patch
        3 kB
        bluedavy
      7. HBASE-4562-0.90.patch
        2 kB
        bluedavy
      8. HBASE-4562-0.90.4.patch
        2 kB
        bluedavy

        Activity

        bluedavy created issue -
        bluedavy made changes -
        Field Original Value New Value
        Description
        Original value: the text below, without the CompactSplitThread.java snippet.
        New value:
        Follow the steps below to reproduce the problem:
        1. Change SplitTransaction.java as shown below to mock a timeout error.
           {code:title=SplitTransaction.java|borderStyle=solid}
              if (!testing) {
                MetaEditor.offlineParentInMeta(server.getCatalogTracker(),
                   this.parent.getRegionInfo(), a.getRegionInfo(), b.getRegionInfo());
                throw new IOException("some unexpected error in split");
              }
           {code}
        2. Rebuild the regionserver with this change and restart it.
        3. Create a table and put some data into it.
        4. Split the table.
        5. Kill the regionserver hosting the table.
        6. Wait until the master's ServerShutdownHandler.process has executed, then scan the table: the data written earlier is lost.

        The bug can be fixed with the code below:
        {code:title=SplitTransaction.java|borderStyle=solid}
              // Journal the point of no return before editing META.
              this.journal.add(JournalEntry.PONR);
              if (!testing) {
                MetaEditor.offlineParentInMeta(server.getCatalogTracker(),
                   this.parent.getRegionInfo(), a.getRegionInfo(), b.getRegionInfo());
                throw new IOException("some unexpected error in split");
              }
        {code}
        {code:title=CompactSplitThread.java|borderStyle=solid}
              if (st.rollback(this.server, this.server)) {
                  LOG.info("Successful rollback of failed split of " +
                    parent.getRegionNameAsString());
              } else {
                  this.server.abort("Abort; we got an error after point-of-no-return");
              }
        {code}
        bluedavy made changes -
        Summary "When split occurs error,it'll cause data loss" -> "When split doing offlineParentInMeta occurs error,it'll cause data loss"
        bluedavy made changes -
        Attachment HBASE-4562&4563.patch [ 12498385 ]
        bluedavy made changes -
        Attachment HBASE-4562&4563.patch [ 12498385 ]
        bluedavy made changes -
        Attachment HBASE-4562.patch [ 12498405 ]
        Attachment HBASE-4562-test.report.txt [ 12498406 ]
        bluedavy made changes -
        Description Inline fix snippets (SplitTransaction.java, CompactSplitThread.java) replaced with "We can fix the bug just use the patch."; reproduction steps unchanged.
        bluedavy made changes -
        Attachment HBASE-4562.patch [ 12498405 ]
        bluedavy made changes -
        Attachment HBASE-4562.patch [ 12498521 ]
        bluedavy made changes -
        Attachment HBASE-4562for0.92.patch [ 12498527 ]
        Attachment HBASE-4562fortrunk.patch [ 12498528 ]
        Ted Yu made changes -
        Summary "When split doing offlineParentInMeta occurs error,it'll cause data loss" -> "When split doing offlineParentInMeta encounters error, it'll cause data loss"
        bluedavy made changes -
        Attachment HBASE-4562-test.report.txt [ 12498406 ]
        bluedavy made changes -
        Attachment HBASE-4562.patch [ 12498521 ]
        bluedavy made changes -
        Attachment HBASE-4562for0.92.patch [ 12498527 ]
        bluedavy made changes -
        Attachment HBASE-4562fortrunk.patch [ 12498528 ]
        bluedavy made changes -
        Attachment HBASE-4562-0.90.patch [ 12499176 ]
        Attachment HBASE-4562-0.92.patch [ 12499177 ]
        Attachment HBASE-4562-trunk.patch [ 12499178 ]
        Attachment test-4562-0.90.txt [ 12499179 ]
        Attachment test-4562-0.92.txt [ 12499180 ]
        Attachment test-4562-trunk.txt [ 12499181 ]
        Ted Yu made changes -
        Comment [ In JIRA description:
        bq. 5. kill the regionserver hosted the table;
        ]
        bluedavy made changes -
        Attachment HBASE-4562-0.90.patch [ 12499176 ]
        bluedavy made changes -
        Attachment HBASE-4562-0.92.patch [ 12499177 ]
        bluedavy made changes -
        Attachment HBASE-4562-trunk.patch [ 12499178 ]
        bluedavy made changes -
        Attachment HBASE-4562-0.90.patch [ 12499235 ]
        Attachment HBASE-4562-0.92.patch [ 12499236 ]
        Attachment HBASE-4562-trunk.patch [ 12499237 ]
        Ted Yu made changes -
        Assignee bluedavy [ bluedavy ]
        bluedavy made changes -
        Status Open [ 1 ] In Progress [ 3 ]
        bluedavy made changes -
        Status In Progress [ 3 ] Resolved [ 5 ]
        Resolution Fixed [ 1 ]
        bluedavy made changes -
        Resolution Fixed [ 1 ]
        Status Resolved [ 5 ] Reopened [ 4 ]
        bluedavy made changes -
        Attachment HBASE-4562-0.90.patch [ 12499235 ]
        bluedavy made changes -
        Attachment test-4562-0.90.txt [ 12499179 ]
        bluedavy made changes -
        Attachment HBASE-4562-0.90.4.patch [ 12499262 ]
        Attachment test-4562-0.90.4.txt [ 12499263 ]
        bluedavy made changes -
        Attachment HBASE-4562-0.90.patch [ 12499270 ]
        Attachment test-4562-0.90.txt [ 12499271 ]
        Lars Hofhansl made changes -
        Status Reopened [ 4 ] Resolved [ 5 ]
        Hadoop Flags Reviewed [ 10343 ]
        Resolution Fixed [ 1 ]
        stack made changes -
        Fix Version/s 0.90.6 [ 12319200 ]
        Fix Version/s 0.90.5 [ 12317145 ]
        stack made changes -
        Fix Version/s 0.90.5 [ 12317145 ]
        Fix Version/s 0.90.6 [ 12319200 ]

          People

          • Assignee:
            bluedavy
            Reporter:
            bluedavy
          • Votes:
            0
            Watchers:
            4
