Derby
  1. Derby
  2. DERBY-3896

Assert failure in LogicalUndoOperation.doMe() when starting slave with uncommitted transactions on the master

    Details

    • Type: Bug Bug
    • Status: Open
    • Priority: Major Major
    • Resolution: Unresolved
    • Affects Version/s: 10.4.1.3, 10.5.1.1
    • Fix Version/s: None
    • Component/s: Replication
    • Environment:
    • Urgency:
      Normal

      Description

      I created and froze the master database like this:

      /tmp/master % java -jar /code/derby/trunk1/jars/sane/derbyrun.jar ij
      ij version 10.5
      ij> connect 'jdbc:derby:db;create=true';
      ij> autocommit off;
      ij> create table t1(x int);
      0 rows inserted/updated/deleted
      ij> insert into t1 values 1,2,3,4;
      4 rows inserted/updated/deleted
      ij> connect 'jdbc:derby:db';
      ij(CONNECTION1)> call syscs_util.syscs_freeze_database();
      0 rows inserted/updated/deleted
      ij(CONNECTION1)>

      Then in another terminal window I copied the database with "cp -rp" and booted the copy as a slave:

      /tmp/slave % java -jar /code/derby/trunk1/jars/sane/derbyrun.jar ij
      ij version 10.5
      ij> connect 'jdbc:derby:db;startSlave=true';
      org.apache.derby.shared.common.sanity.AssertFailure: ASSERT FAILED
      at org.apache.derby.shared.common.sanity.SanityManager.ASSERT(SanityManager.java:98)
      at org.apache.derby.impl.store.raw.data.LogicalUndoOperation.doMe(LogicalUndoOperation.java:174)
      at org.apache.derby.impl.store.raw.log.FileLogger.logAndUndo(FileLogger.java:533)
      at org.apache.derby.impl.store.raw.xact.Xact.logAndUndo(Xact.java:374)
      at org.apache.derby.impl.store.raw.log.FileLogger.undo(FileLogger.java:1015)
      at org.apache.derby.impl.store.raw.xact.Xact.abort(Xact.java:949)
      at org.apache.derby.impl.store.raw.xact.XactFactory.rollbackAllTransactions(XactFactory.java:529)
      at org.apache.derby.impl.store.raw.log.LogToFile.recover(LogToFile.java:1213)
      at org.apache.derby.impl.store.raw.RawStore.boot(RawStore.java:334)
      at org.apache.derby.impl.services.monitor.BaseMonitor.boot(BaseMonitor.java:2019)
      at org.apache.derby.impl.services.monitor.TopService.bootModule(TopService.java:291)
      at org.apache.derby.impl.services.monitor.BaseMonitor.startModule(BaseMonitor.java:573)
      at org.apache.derby.iapi.services.monitor.Monitor.bootServiceModule(Monitor.java:427)
      at org.apache.derby.impl.store.access.RAMAccessManager.boot(RAMAccessManager.java:1019)
      at org.apache.derby.impl.services.monitor.BaseMonitor.boot(BaseMonitor.java:2019)
      at org.apache.derby.impl.services.monitor.TopService.bootModule(TopService.java:291)
      at org.apache.derby.impl.services.monitor.BaseMonitor.startModule(BaseMonitor.java:573)
      at org.apache.derby.iapi.services.monitor.Monitor.bootServiceModule(Monitor.java:427)
      at org.apache.derby.impl.db.BasicDatabase.bootStore(BasicDatabase.java:780)
      at org.apache.derby.impl.db.BasicDatabase.boot(BasicDatabase.java:196)
      at org.apache.derby.impl.services.monitor.BaseMonitor.boot(BaseMonitor.java:2019)
      at org.apache.derby.impl.services.monitor.TopService.bootModule(TopService.java:291)
      at org.apache.derby.impl.services.monitor.BaseMonitor.bootService(BaseMonitor.java:1856)
      at org.apache.derby.impl.services.monitor.BaseMonitor.startProviderService(BaseMonitor.java:1722)
      at org.apache.derby.impl.services.monitor.BaseMonitor.findProviderAndStartService(BaseMonitor.java:1602)
      at org.apache.derby.impl.services.monitor.BaseMonitor.startPersistentService(BaseMonitor.java:1021)
      at org.apache.derby.iapi.services.monitor.Monitor.startPersistentService(Monitor.java:550)
      at org.apache.derby.impl.jdbc.EmbedConnection.bootDatabase(EmbedConnection.java:2572)
      at org.apache.derby.impl.jdbc.EmbedConnection.<init>(EmbedConnection.java:365)
      at org.apache.derby.impl.jdbc.EmbedConnection30.<init>(EmbedConnection30.java:73)
      at org.apache.derby.impl.jdbc.EmbedConnection40.<init>(EmbedConnection40.java:54)
      at org.apache.derby.jdbc.Driver40.getNewEmbedConnection(Driver40.java:68)
      at org.apache.derby.jdbc.InternalDriver.connect(InternalDriver.java:238)
      at org.apache.derby.jdbc.AutoloadedDriver.connect(AutoloadedDriver.java:119)
      at java.sql.DriverManager.getConnection(DriverManager.java:582)
      at java.sql.DriverManager.getConnection(DriverManager.java:154)
      at org.apache.derby.impl.tools.ij.ij.dynamicConnection(ij.java:1243)
      at org.apache.derby.impl.tools.ij.ij.ConnectStatement(ij.java:1093)
      at org.apache.derby.impl.tools.ij.ij.ijStatement(ij.java:921)
      at org.apache.derby.impl.tools.ij.utilMain.runScriptGuts(utilMain.java:328)
      at org.apache.derby.impl.tools.ij.utilMain.go(utilMain.java:248)
      at org.apache.derby.impl.tools.ij.Main.go(Main.java:215)
      at org.apache.derby.impl.tools.ij.Main.mainCore(Main.java:181)
      at org.apache.derby.impl.tools.ij.Main.main(Main.java:73)
      at org.apache.derby.tools.ij.main(ij.java:59)
      at org.apache.derby.iapi.tools.run.main(run.java:53)

        Issue Links

          Activity

          Hide
          Knut Anders Hatlen added a comment -

          Since our documentation now (as of DERBY-4196) says the database should be shut down cleanly before it's copied to the slave, we may consider closing this bug as "Won't Fix".

          Show
          Knut Anders Hatlen added a comment - Since our documentation now (as of DERBY-4196 ) says the database should be shut down cleanly before it's copied to the slave, we may consider closing this bug as "Won't Fix".
          Hide
          Dag H. Wanvik added a comment -

          DERBY-4299 is another signature of this underlying problem.

          Show
          Dag H. Wanvik added a comment - DERBY-4299 is another signature of this underlying problem.
          Hide
          Kristian Waagan added a comment -

          Triaged July 3, 2009: Assigned normal urgency.

          Show
          Kristian Waagan added a comment - Triaged July 3, 2009: Assigned normal urgency.
          Hide
          Knut Anders Hatlen added a comment -

          If we could change the documentation from saying that we should freeze the database before copying it to saying that we should shut down the database cleanly before copying it, this problem would go away. (Logged as DERBY-4196.)

          Show
          Knut Anders Hatlen added a comment - If we could change the documentation from saying that we should freeze the database before copying it to saying that we should shut down the database cleanly before copying it, this problem would go away. (Logged as DERBY-4196 .)
          Hide
          V.Narayanan added a comment -

          The decision on when to log ship, is taken everytime a log buffer element becomes full.
          When a log buffer element becomes full, this event is communicated to the log shipper
          via the master controller. The log shipper then takes a decision based on the fill information
          obtained from the log buffer

          I think Knut is right in saying that log shipping does not happen when the freeze happens,
          but I am not sure it is related to the log switch not happening.

          I think I am going to try triggering the log shipping inside freezePersistentStore() of LogToFile and
          see what happens!

          Show
          V.Narayanan added a comment - The decision on when to log ship, is taken everytime a log buffer element becomes full. When a log buffer element becomes full, this event is communicated to the log shipper via the master controller. The log shipper then takes a decision based on the fill information obtained from the log buffer I think Knut is right in saying that log shipping does not happen when the freeze happens, but I am not sure it is related to the log switch not happening. I think I am going to try triggering the log shipping inside freezePersistentStore() of LogToFile and see what happens!
          Hide
          V.Narayanan added a comment -

          Knut said:

          "I think the problem is that if you have uncommitted transactions when you freeze the
          database, the log file is not switched. so you continue for a while writing to the old log
          file on the master, but don't ship log to the slave until the log file is actually switched.
          so you miss the log between freeze and switch on the slave."

          I will try to look at this bug when I find some time, I will not be working on this issue
          regularly hence am not assigning it to myself.

          Show
          V.Narayanan added a comment - Knut said: "I think the problem is that if you have uncommitted transactions when you freeze the database, the log file is not switched. so you continue for a while writing to the old log file on the master, but don't ship log to the slave until the log file is actually switched. so you miss the log between freeze and switch on the slave." I will try to look at this bug when I find some time, I will not be working on this issue regularly hence am not assigning it to myself.
          Hide
          Knut Anders Hatlen added a comment -

          Appeared to work with insane jars, but it didn't actually work. Using the same example as in the description, after the slave has been started, do this in the ij terminal running the master:

          ij(CONNECTION1)> connect 'jdbc:derby:db;startMaster=true;slaveHost=localhost';
          ij(CONNECTION2)> disconnect;
          ij> set connection connection0;
          ij(CONNECTION0)> select * from t1;
          X
          -----------
          1
          2
          3
          4

          4 rows selected
          ij(CONNECTION0)> commit;
          ij(CONNECTION0)> connect 'jdbc:derby:db;failover=true';
          ERROR XRE20: Failover performed successfully for database 'db', the database has been shutdown.
          ij(CONNECTION0)>

          Then, on the slave:

          ij> connect 'jdbc:derby:db';
          ij> select * from t1;
          ERROR 42X05: Table/View 'T1' does not exist.
          ij> select * from app.t1;
          ERROR 42X05: Table/View 'APP.T1' does not exist.
          ij>

          So the transaction that created the table T1 has not been reflected on the slave.

          Show
          Knut Anders Hatlen added a comment - Appeared to work with insane jars, but it didn't actually work. Using the same example as in the description, after the slave has been started, do this in the ij terminal running the master: ij(CONNECTION1)> connect 'jdbc:derby:db;startMaster=true;slaveHost=localhost'; ij(CONNECTION2)> disconnect; ij> set connection connection0; ij(CONNECTION0)> select * from t1; X ----------- 1 2 3 4 4 rows selected ij(CONNECTION0)> commit; ij(CONNECTION0)> connect 'jdbc:derby:db;failover=true'; ERROR XRE20: Failover performed successfully for database 'db', the database has been shutdown. ij(CONNECTION0)> Then, on the slave: ij> connect 'jdbc:derby:db'; ij> select * from t1; ERROR 42X05: Table/View 'T1' does not exist. ij> select * from app.t1; ERROR 42X05: Table/View 'APP.T1' does not exist. ij> So the transaction that created the table T1 has not been reflected on the slave.
          Hide
          Knut Anders Hatlen added a comment -

          I get the same failure on 10.4.

          Things appear to work when I build insane jars.

          Show
          Knut Anders Hatlen added a comment - I get the same failure on 10.4. Things appear to work when I build insane jars.

            People

            • Assignee:
              Unassigned
              Reporter:
              Knut Anders Hatlen
            • Votes:
              0 Vote for this issue
              Watchers:
              1 Start watching this issue

              Dates

              • Created:
                Updated:

                Development