OFBiz
  1. OFBiz
  2. OFBIZ-3557

Enforced sequence does not work with concurrent access

    Details

    • Type: Bug Bug
    • Status: Patch Available
    • Priority: Major Major
    • Resolution: Unresolved
    • Affects Version/s: Release Branch 09.04, SVN trunk
    • Fix Version/s: None
    • Component/s: framework
    • Labels:
      None

      Description

      There is a fundamental issue with enforced sequences (for orders, invoices, etc ..) and concurrency.

      For example if two users are creating an order at the same time one of them will see the creation fail with a PK error. The problem is that the "getNextXXXId" rely on the party accounting preference entity, but there is absolutely no guarantee that the last number in the sequence gets updated before another service can read it.

      This is at best very annoying when used only internally but may be unpractical for e-commerce sites.

      1. OFBIZ-3557-1.patch
        5 kB
        Adrian Crum
      2. OFBIZ-3557-2.patch
        8 kB
        Aleksey Fedorchenko

        Issue Links

          Activity

          Hide
          Markus M. May added a comment - - edited

          We have now implemented the Service "invoiceSequenceEnforced" with a small Java-Method, so that the Transaction starts in the method and ends after committing the change on the DB. Seems to work now

          InvoiceServices.java
            // service to create a new invoiceId in a small transaction, so that the ids are updated correctly
            public static Map<String, Object> invoiceSequenceEnforced(DispatchContext dctx, Map<String, Object> context) throws GenericServiceException {
              Delegator delegator = dctx.getDelegator();
              LocalDispatcher dispatcher = dctx.getDispatcher();
              Locale locale = (Locale) context.get("locale");
          
              String partyId = (String) context.get("partyId");
              Transaction parentTransaction = null;
              Long lastInvoiceNumber = null;
          
              try {
                // Enforce a new Transaction
                parentTransaction = TransactionUtil.suspend();
                if (TransactionUtil.isTransactionInPlace()) {
                  throw new GenericTransactionException("In service " + module + " transaction is still in place after suspend, status is " + TransactionUtil.getStatusString());
                }
          
                // now start a new transaction
                boolean beganTrans = TransactionUtil.begin(0);
          
                GenericValue partyAcctgPreference = delegator.findOne("PartyAcctgPreference", false, UtilMisc.toMap("partyId", partyId));
          
                lastInvoiceNumber = partyAcctgPreference.getLong("lastInvoiceNumber");
          
                if (UtilValidate.isNotEmpty(lastInvoiceNumber)) {
                  lastInvoiceNumber = lastInvoiceNumber + 1L;
                } else {
                  lastInvoiceNumber = 1L;
                }
          
                Debug.logInfo("generated new InvoiceNumber: " + lastInvoiceNumber, module);
                partyAcctgPreference.set("lastInvoiceNumber", lastInvoiceNumber);
                partyAcctgPreference.store();
          
                TransactionUtil.commit();
          
              } catch (GenericEntityException e) {
                Debug.logError (e, "Entity/data problem creating invoiceId: " + e.toString(), module);
                return ServiceUtil.returnError(UtilProperties.getMessage(resource,
                  "AccountingEntityDataProblemCreatingInvoiceFromOrderItems",
                  UtilMisc.toMap("reason", e.toString()), locale));
              } finally {
                // resume the parent transaction
                if (parentTransaction != null) {
                  try {
                    TransactionUtil.resume(parentTransaction);
                  } catch (GenericTransactionException ite) {
                    Debug.logWarning(ite, "Transaction error, not resumed", module);
                    throw new GenericServiceException("Resume transaction exception, see logs");
                  }
                }
              }
          
              Map<String, Object> result = ServiceUtil.returnSuccess();
              result.put("invoiceId", lastInvoiceNumber);
              return result;
            }
          

          And in the service-defition:

          sales_invoice.xml
              <service name="invoiceSequence-enforced" engine="java"
                       location="org.ofbiz.accounting.invoice.InvoiceServices" invoke="invoiceSequenceEnforced">
                <attribute name="partyAcctgPreference" type="org.ofbiz.entity.GenericValue" mode="IN"/>
          
                <attribute name="partyId" type="String" mode="IN" optional="false"/>
                <attribute name="invoiceId" type="Long" mode="OUT"/>
              </service>
          

          Up until now, this seems to work fine. It still has the above mentioned problem, that a sequence number is wasted if an invoice creation is rolled back, though. Therefor this issue is still marked as open, this is just a minor workaround.

          Show
          Markus M. May added a comment - - edited We have now implemented the Service "invoiceSequenceEnforced" with a small Java-Method, so that the Transaction starts in the method and ends after committing the change on the DB. Seems to work now InvoiceServices.java // service to create a new invoiceId in a small transaction, so that the ids are updated correctly public static Map< String , Object > invoiceSequenceEnforced(DispatchContext dctx, Map< String , Object > context) throws GenericServiceException { Delegator delegator = dctx.getDelegator(); LocalDispatcher dispatcher = dctx.getDispatcher(); Locale locale = (Locale) context.get( "locale" ); String partyId = ( String ) context.get( "partyId" ); Transaction parentTransaction = null ; Long lastInvoiceNumber = null ; try { // Enforce a new Transaction parentTransaction = TransactionUtil.suspend(); if (TransactionUtil.isTransactionInPlace()) { throw new GenericTransactionException( "In service " + module + " transaction is still in place after suspend, status is " + TransactionUtil.getStatusString()); } // now start a new transaction boolean beganTrans = TransactionUtil.begin(0); GenericValue partyAcctgPreference = delegator.findOne( "PartyAcctgPreference" , false , UtilMisc.toMap( "partyId" , partyId)); lastInvoiceNumber = partyAcctgPreference.getLong( "lastInvoiceNumber" ); if (UtilValidate.isNotEmpty(lastInvoiceNumber)) { lastInvoiceNumber = lastInvoiceNumber + 1L; } else { lastInvoiceNumber = 1L; } Debug.logInfo( "generated new InvoiceNumber: " + lastInvoiceNumber, module); partyAcctgPreference.set( "lastInvoiceNumber" , lastInvoiceNumber); partyAcctgPreference.store(); TransactionUtil.commit(); } catch (GenericEntityException e) { Debug.logError (e, "Entity/data problem creating invoiceId: " + e.toString(), module); return ServiceUtil.returnError(UtilProperties.getMessage(resource, "AccountingEntityDataProblemCreatingInvoiceFromOrderItems" , UtilMisc.toMap( "reason" , e.toString()), locale)); } finally { // resume the parent transaction if (parentTransaction != null ) { try { TransactionUtil.resume(parentTransaction); } catch (GenericTransactionException ite) { Debug.logWarning(ite, "Transaction error, not resumed" , module); throw new GenericServiceException( "Resume transaction exception, see logs" ); } } } Map< String , Object > result = ServiceUtil.returnSuccess(); result.put( "invoiceId" , lastInvoiceNumber); return result; } And in the service-defition: sales_invoice.xml <service name= "invoiceSequence-enforced" engine= "java" location= "org.ofbiz.accounting.invoice.InvoiceServices" invoke= "invoiceSequenceEnforced" > <attribute name= "partyAcctgPreference" type= "org.ofbiz.entity.GenericValue" mode= "IN" /> <attribute name= "partyId" type= " String " mode= "IN" optional= " false " /> <attribute name= "invoiceId" type= " Long " mode= "OUT" /> </service> Up until now, this seems to work fine. It still has the above mentioned problem, that a sequence number is wasted if an invoice creation is rolled back, though. Therefor this issue is still marked as open, this is just a minor workaround.
          Hide
          Markus M. May added a comment -

          Are we the only one experiencing this problem?

          We still experience this problem, even with the suggested solution to use require-new-transaction="true" on the getNextInvoiceId-Service. There are a couple of solutions in my head (like e.g. use the database sequence directly), but these are not really a solution for such a "common" problem.

          Any help on this one is greatly appreciated.

          Show
          Markus M. May added a comment - Are we the only one experiencing this problem? We still experience this problem, even with the suggested solution to use require-new-transaction="true" on the getNextInvoiceId-Service. There are a couple of solutions in my head (like e.g. use the database sequence directly), but these are not really a solution for such a "common" problem. Any help on this one is greatly appreciated.
          Hide
          Markus M. May added a comment -

          We are running into the same issue on the 12.04 branch. Any hints on this one?

          Show
          Markus M. May added a comment - We are running into the same issue on the 12.04 branch. Any hints on this one?
          Hide
          Jacques Le Roux added a comment -

          Not sure there will be dependence on OFBIZ-2353 but I prefer to put a such link because the UtilSequence code will certainly be modified.

          Show
          Jacques Le Roux added a comment - Not sure there will be dependence on OFBIZ-2353 but I prefer to put a such link because the UtilSequence code will certainly be modified.
          Hide
          Bruno Brandmeier added a comment -

          Hi all,

          why not using a lock table like this (example in c#):

          c_Lock mylock = new c_Lock("rechnungs_nr");
          mylock.use_lock();
          int nur_rechnungs_nr = mylock.increment_counter();
          mylock.open_lock();

          class c_Lock
          {

          string name = "";
          string lock_closed_by = "";
          int lock_counter = 0;
          bool can_use_lock_now = false;
          string unique_id = "";
          MySqlConnection conn_shop = null;
          c_BB_Shop bb_shop = new c_BB_Shop();
          string sql_string = "";
          MySqlDataAdapter da = null;
          System.Data.DataTable dt_remote = null;
          DataRow row_remote = null;
          MySqlCommand cmd = null;
          BB_Gen.c_BB_Gen bb_gen = new BB_Gen.c_BB_Gen();

          public c_Lock(string feldname)
          {
          name = feldname;
          if (name == "")

          { MessageBox.Show("Feldname wurde nicht übergeben"); }

          else

          { conn_shop = bb_gen.conn_open(bb_shop.g_connectionstring_shop); read_lock(); }

          Random zufallszahlen = new Random();
          unique_id = zufallszahlen.NextDouble().ToString().Substring(2).PadRight(18, Convert.ToChar("0"));
          }

          public bool use_lock()
          {
          int start_zeit = 0;
          int schloss_geschlossen_seit = 0;
          int max_lock_lease = 10;
          int aktuelle_zeit = 0;
          bool result = false;
          if (row_remote["lock_type"].ToString() != "undefined")
          {
          start_zeit = bb_gen.unix_timestamp();
          while ((result = close_lock()) == false)
          {
          aktuelle_zeit = bb_gen.unix_timestamp();
          schloss_geschlossen_seit = int.Parse(row_remote["lock_closed_since"].ToString());
          if (((aktuelle_zeit - start_zeit) > max_lock_lease) && ((aktuelle_zeit - schloss_geschlossen_seit) > max_lock_lease))

          { crack_lock(); break; }

          else

          { Thread.Sleep(1000); read_lock(); }

          }
          }
          else

          { result = false; }

          return result;
          }

          public void open_lock()
          {
          if (can_use_lock_now == true)
          {

          sql_string = "UPDATE locks SET lock_status='unlocked',lock_closed_by='',lock_closed_since='0' WHERE Lock_ID='" + row_remote["Lock_ID"].ToString() + "';";
          cmd = new MySqlCommand(sql_string, conn_shop);
          int ergebnis = cmd.ExecuteNonQuery();
          cmd.Dispose();
          if (ergebnis == 1)

          { lock_closed_by = ""; can_use_lock_now = false; }

          }
          conn_shop.Close();
          conn_shop.Dispose();
          }

          private bool crack_lock()
          {
          bool result_var = false;
          sql_string = "UPDATE locks SET lock_status='locked',lock_closed_by='" + unique_id + "' WHERE Lock_ID='" + row_remote["Lock_ID"].ToString() + "';";
          cmd = new MySqlCommand(sql_string, conn_shop);
          int ergebnis = cmd.ExecuteNonQuery();
          cmd.Dispose();
          if (ergebnis == 1)

          { lock_closed_by = unique_id; can_use_lock_now = true; result_var = true; }

          else

          { result_var = false; }
          return result_var;
          }


          public int increment_counter()
          {
          int return_var = 0;
          if (can_use_lock_now == true && row_remote["lock_type"].ToString() == "counter")
          {
          sql_string = "SELECT lock_counter FROM locks WHERE Lock_ID='" + row_remote["Lock_ID"].ToString() + "';";
          cmd = new MySqlCommand(sql_string, conn_shop);
          int result = int.Parse(cmd.ExecuteScalar().ToString());
          cmd.Dispose();
          if (result == 0)
          { return_var = 0; }
          else
          {
          int lock_counter_incremented = result;
          lock_counter_incremented++;
          sql_string = "UPDATE locks SET lock_counter='" + lock_counter_incremented.ToString() + "' WHERE Lock_ID='" + row_remote["Lock_ID"].ToString() + "';";
          cmd = new MySqlCommand(sql_string, conn_shop);
          int ergebnis = cmd.ExecuteNonQuery();
          cmd.Dispose();
          if (ergebnis == 0)
          { return_var = 0; }
          else
          { lock_counter = lock_counter_incremented; return_var = lock_counter_incremented; }
          }
          }
          else
          { return_var = 0; }
          return return_var;
          }


          private bool close_lock()
          {
          bool result_var = false;
          string object_identifier = unique_id;
          sql_string = "UPDATE locks SET lock_status='locked',lock_closed_by='" + unique_id + "', lock_closed_since='" + bb_gen.unix_timestamp().ToString() + "' WHERE Lock_ID='" + row_remote["Lock_ID"].ToString() + "' AND lock_status='unlocked';";
          cmd = new MySqlCommand(sql_string, conn_shop);
          int ergebnis = cmd.ExecuteNonQuery();
          cmd.Dispose();
          if (ergebnis == 0)
          { result_var = false; }

          else
          {
          sql_string = "SELECT lock_closed_by FROM locks WHERE Lock_ID='" + row_remote["Lock_ID"].ToString() + "';";
          cmd = new MySqlCommand(sql_string, conn_shop);
          string object_identifier_aus_db = cmd.ExecuteScalar().ToString();
          cmd.Dispose();
          if (object_identifier_aus_db != object_identifier)

          { read_lock(); result_var = false; }

          else

          { lock_closed_by = object_identifier; can_use_lock_now = true; result_var = true; }

          }
          return result_var;
          }

          private bool read_lock()
          {
          bool return_val = false;
          sql_string = "SELECT * FROM locks where name='" + name + "';";
          da = new MySqlDataAdapter(sql_string, conn_shop);
          dt_remote = new System.Data.DataTable();
          da.Fill(dt_remote);
          if (dt_remote.Rows.Count == 1)

          { row_remote = dt_remote.Rows[0]; return_val = true; }

          return return_val;
          }

          }

          Bruno

          Show
          Bruno Brandmeier added a comment - Hi all, why not using a lock table like this (example in c#): c_Lock mylock = new c_Lock("rechnungs_nr"); mylock.use_lock(); int nur_rechnungs_nr = mylock.increment_counter(); mylock.open_lock(); class c_Lock { string name = ""; string lock_closed_by = ""; int lock_counter = 0; bool can_use_lock_now = false; string unique_id = ""; MySqlConnection conn_shop = null; c_BB_Shop bb_shop = new c_BB_Shop(); string sql_string = ""; MySqlDataAdapter da = null; System.Data.DataTable dt_remote = null; DataRow row_remote = null; MySqlCommand cmd = null; BB_Gen.c_BB_Gen bb_gen = new BB_Gen.c_BB_Gen(); public c_Lock(string feldname) { name = feldname; if (name == "") { MessageBox.Show("Feldname wurde nicht übergeben"); } else { conn_shop = bb_gen.conn_open(bb_shop.g_connectionstring_shop); read_lock(); } Random zufallszahlen = new Random(); unique_id = zufallszahlen.NextDouble().ToString().Substring(2).PadRight(18, Convert.ToChar("0")); } public bool use_lock() { int start_zeit = 0; int schloss_geschlossen_seit = 0; int max_lock_lease = 10; int aktuelle_zeit = 0; bool result = false; if (row_remote ["lock_type"] .ToString() != "undefined") { start_zeit = bb_gen.unix_timestamp(); while ((result = close_lock()) == false) { aktuelle_zeit = bb_gen.unix_timestamp(); schloss_geschlossen_seit = int.Parse(row_remote ["lock_closed_since"] .ToString()); if (((aktuelle_zeit - start_zeit) > max_lock_lease) && ((aktuelle_zeit - schloss_geschlossen_seit) > max_lock_lease)) { crack_lock(); break; } else { Thread.Sleep(1000); read_lock(); } } } else { result = false; } return result; } public void open_lock() { if (can_use_lock_now == true) { sql_string = "UPDATE locks SET lock_status='unlocked',lock_closed_by='',lock_closed_since='0' WHERE Lock_ID='" + row_remote ["Lock_ID"] .ToString() + "';"; cmd = new MySqlCommand(sql_string, conn_shop); int ergebnis = cmd.ExecuteNonQuery(); cmd.Dispose(); if (ergebnis == 1) { lock_closed_by = ""; can_use_lock_now = false; } } conn_shop.Close(); conn_shop.Dispose(); } private bool crack_lock() { bool result_var = false; sql_string = "UPDATE locks SET lock_status='locked',lock_closed_by='" + unique_id + "' WHERE Lock_ID='" + row_remote ["Lock_ID"] .ToString() + "';"; cmd = new MySqlCommand(sql_string, conn_shop); int ergebnis = cmd.ExecuteNonQuery(); cmd.Dispose(); if (ergebnis == 1) { lock_closed_by = unique_id; can_use_lock_now = true; result_var = true; } else { result_var = false; } return result_var; } public int increment_counter() { int return_var = 0; if (can_use_lock_now == true && row_remote ["lock_type"] .ToString() == "counter") { sql_string = "SELECT lock_counter FROM locks WHERE Lock_ID='" + row_remote ["Lock_ID"] .ToString() + "';"; cmd = new MySqlCommand(sql_string, conn_shop); int result = int.Parse(cmd.ExecuteScalar().ToString()); cmd.Dispose(); if (result == 0) { return_var = 0; } else { int lock_counter_incremented = result; lock_counter_incremented++; sql_string = "UPDATE locks SET lock_counter='" + lock_counter_incremented.ToString() + "' WHERE Lock_ID='" + row_remote ["Lock_ID"] .ToString() + "';"; cmd = new MySqlCommand(sql_string, conn_shop); int ergebnis = cmd.ExecuteNonQuery(); cmd.Dispose(); if (ergebnis == 0) { return_var = 0; } else { lock_counter = lock_counter_incremented; return_var = lock_counter_incremented; } } } else { return_var = 0; } return return_var; } private bool close_lock() { bool result_var = false; string object_identifier = unique_id; sql_string = "UPDATE locks SET lock_status='locked',lock_closed_by='" + unique_id + "', lock_closed_since='" + bb_gen.unix_timestamp().ToString() + "' WHERE Lock_ID='" + row_remote ["Lock_ID"] .ToString() + "' AND lock_status='unlocked';"; cmd = new MySqlCommand(sql_string, conn_shop); int ergebnis = cmd.ExecuteNonQuery(); cmd.Dispose(); if (ergebnis == 0) { result_var = false; } else { sql_string = "SELECT lock_closed_by FROM locks WHERE Lock_ID='" + row_remote ["Lock_ID"] .ToString() + "';"; cmd = new MySqlCommand(sql_string, conn_shop); string object_identifier_aus_db = cmd.ExecuteScalar().ToString(); cmd.Dispose(); if (object_identifier_aus_db != object_identifier) { read_lock(); result_var = false; } else { lock_closed_by = object_identifier; can_use_lock_now = true; result_var = true; } } return result_var; } private bool read_lock() { bool return_val = false; sql_string = "SELECT * FROM locks where name='" + name + "';"; da = new MySqlDataAdapter(sql_string, conn_shop); dt_remote = new System.Data.DataTable(); da.Fill(dt_remote); if (dt_remote.Rows.Count == 1) { row_remote = dt_remote.Rows[0]; return_val = true; } return return_val; } } Bruno
          Hide
          Jacopo Cappellato added a comment -

          Bilgin,

          yes you are right... copy and paste error.
          I just wanted to illustrate a situation where a gap in the sequence is generated: invoice B could fail for another reason. All I wanted to show is that use case #2 could lead to gaps.

          Thanks for the detailed review.

          Jacopo

          Show
          Jacopo Cappellato added a comment - Bilgin, yes you are right... copy and paste error. I just wanted to illustrate a situation where a gap in the sequence is generated: invoice B could fail for another reason. All I wanted to show is that use case #2 could lead to gaps. Thanks for the detailed review. Jacopo
          Hide
          Bilgin Ibryam added a comment -

          Jacopo,

          your example cases seems logical to me, one question though:

          You said "createInvoice for B assigns InvoiceId 12 and when attempts to store the invoice it gets a duplicated PK error;"

          Why there would be duplicate PK error? the previous invoice had id 11?

          I think in the second case there wouldn't be any duplicated PK errors. There might be other errors preventing invoice creation and lead to invoiceId gaps?

          Show
          Bilgin Ibryam added a comment - Jacopo, your example cases seems logical to me, one question though: You said "createInvoice for B assigns InvoiceId 12 and when attempts to store the invoice it gets a duplicated PK error;" Why there would be duplicate PK error? the previous invoice had id 11? I think in the second case there wouldn't be any duplicated PK errors. There might be other errors preventing invoice creation and lead to invoiceId gaps?
          Hide
          Jacopo Cappellato added a comment -

          I have spent some time studying the problem and doing some research.

          First of all, I agree that the current solution has major flaws and it should be completely redesigned: my preference would be to keep the default id creation for the invoiceId pk as is and add a new non-pk field to Invoice: "invoiceName" or "invoiceNumber" or "invoiceProtocolNumber" etc... The idea is that, based on PartyAcctgPreference, the system will set the invoiceName only right after the invoice has been successfully created (based on the invoiceName of the previous invoice).

          That said, I would like to share my thoughts about the current issue.
          In my opinion, even if it is definitely true that the issue is caused by a race condition on PartyAcctgPreference, I don't think this is caused by the fact that the getNextInvoiceId runs in a separate transaction; it shouldn't be the case because OFBiz uses the same transaction for all the services unless a new transaction is explicitly required (require-new-transaction="true").
          I think that the problem is caused by the default isolation level used by OFBiz: "ReadCommitted" (this is set in entityengine.xml). Here is the description from Wikipedia:
          ======================
          READ COMMITTED
          In this isolation level, a lock-based concurrency control DBMS implementation keeps write locks (acquired on selected data) till the end of the transaction, but read locks are released as soon as the SELECT operations is performed (so non-repeatable reads phenomenon can occur in this isolation level see below). As in the previous level, range-locks are not managed.
          ======================
          This means that if I run "createInvoice" for invoice A and B and PartyAcctgPreference.lastInvoiceNumber is initially 10:
          createInvoice for A starts transaction A
          createInvoice for B starts transaction B
          createInvoice for A calls getNextInvoiceId using transaction A
          createInvoice for B calls getNextInvoiceId using transaction B
          getNextInvoiceId for A reads lastInvoiceNumber = 10, increment it (11) and store (not committed), the record becomes write locked
          getNextInvoiceId for B reads lastInvoiceNumber = 10 (because of ReadCommitted isolation level), increment it (11) but the record is write locked by transaction A and so it waits to store the value (not committed)
          createInvoice for A assigns InvoiceId 11 and store the invoice; transaction A is committed and PartyAcctgPreference and Invoice are persisted and unlocked; now we have invoice A with invoiceId 11 and PartyAcctgPreference.lastInvoiceNumber=11
          getNextInvoiceId for B can continue now, and it stores the PartyAcctgPreference.lastInvoiceNumber=11 (first problem)
          createInvoice for B assigns InvoiceId 11 and when attempts to store the invoice it gets a duplicated PK error;
          transaction B is rolled back: invoice B is not stored and also PartyAcctgPreference is rolled back: I suspect that it could roll back to the original version read by B instead of the newly committed version A ***

          The end result is that in the system we will have: Invoice for A with id 11; PartyAcctgPreference.lastInvoiceNumber=10

              • = this is the step I am not 100% sure about

          How can we fix this?
          You could try with the following hack: set require-new-transaction="true" in the service definition for getNextInvoiceId.
          This should cause the following behaviour:

          createInvoice for A starts transaction A
          createInvoice for B starts transaction B
          createInvoice for A calls getNextInvoiceId using new transaction A1
          createInvoice for B calls getNextInvoiceId using new transaction B1
          getNextInvoiceId for A reads lastInvoiceNumber = 10, increment it (11) and store; transaction A1 is committed getNextInvoiceId for B reads lastInvoiceNumber = 11, increment it (12) and store;
          transaction B1 is committed createInvoice for A assigns InvoiceId 11 and store the invoice;
          transaction A is committed and Invoice is persisted and unlocked;
          now we have invoice A with invoiceId 11 and PartyAcctgPreference.lastInvoiceNumber=12
          createInvoice for B assigns InvoiceId 12 and when attempts to store the invoice it gets a duplicated PK error;
          transaction B is rolled back: invoice B is not stored but PartyAcctgPreference is NOT rolled back The end result is that in the system we will have: Invoice for A with id 11; PartyAcctgPreference.lastInvoiceNumber=12

          With this solution we will get gaps (Invoice C will get invoiceId=13) but not errors (errors could still happen but it will be very rare because it is very unlikely that we will get the race condition because getNextInvoiceId is very quick, while createInvoice is a slow service).

          Show
          Jacopo Cappellato added a comment - I have spent some time studying the problem and doing some research. First of all, I agree that the current solution has major flaws and it should be completely redesigned: my preference would be to keep the default id creation for the invoiceId pk as is and add a new non-pk field to Invoice: "invoiceName" or "invoiceNumber" or "invoiceProtocolNumber" etc... The idea is that, based on PartyAcctgPreference, the system will set the invoiceName only right after the invoice has been successfully created (based on the invoiceName of the previous invoice). That said, I would like to share my thoughts about the current issue. In my opinion, even if it is definitely true that the issue is caused by a race condition on PartyAcctgPreference, I don't think this is caused by the fact that the getNextInvoiceId runs in a separate transaction; it shouldn't be the case because OFBiz uses the same transaction for all the services unless a new transaction is explicitly required (require-new-transaction="true"). I think that the problem is caused by the default isolation level used by OFBiz: "ReadCommitted" (this is set in entityengine.xml). Here is the description from Wikipedia: ====================== READ COMMITTED In this isolation level, a lock-based concurrency control DBMS implementation keeps write locks (acquired on selected data) till the end of the transaction, but read locks are released as soon as the SELECT operations is performed (so non-repeatable reads phenomenon can occur in this isolation level see below ). As in the previous level, range-locks are not managed. ====================== This means that if I run "createInvoice" for invoice A and B and PartyAcctgPreference.lastInvoiceNumber is initially 10: createInvoice for A starts transaction A createInvoice for B starts transaction B createInvoice for A calls getNextInvoiceId using transaction A createInvoice for B calls getNextInvoiceId using transaction B getNextInvoiceId for A reads lastInvoiceNumber = 10, increment it (11) and store (not committed), the record becomes write locked getNextInvoiceId for B reads lastInvoiceNumber = 10 (because of ReadCommitted isolation level), increment it (11) but the record is write locked by transaction A and so it waits to store the value (not committed) createInvoice for A assigns InvoiceId 11 and store the invoice; transaction A is committed and PartyAcctgPreference and Invoice are persisted and unlocked; now we have invoice A with invoiceId 11 and PartyAcctgPreference.lastInvoiceNumber=11 getNextInvoiceId for B can continue now, and it stores the PartyAcctgPreference.lastInvoiceNumber=11 (first problem) createInvoice for B assigns InvoiceId 11 and when attempts to store the invoice it gets a duplicated PK error; transaction B is rolled back: invoice B is not stored and also PartyAcctgPreference is rolled back: I suspect that it could roll back to the original version read by B instead of the newly committed version A *** The end result is that in the system we will have: Invoice for A with id 11; PartyAcctgPreference.lastInvoiceNumber=10 = this is the step I am not 100% sure about How can we fix this? You could try with the following hack: set require-new-transaction="true" in the service definition for getNextInvoiceId. This should cause the following behaviour: createInvoice for A starts transaction A createInvoice for B starts transaction B createInvoice for A calls getNextInvoiceId using new transaction A1 createInvoice for B calls getNextInvoiceId using new transaction B1 getNextInvoiceId for A reads lastInvoiceNumber = 10, increment it (11) and store; transaction A1 is committed getNextInvoiceId for B reads lastInvoiceNumber = 11, increment it (12) and store; transaction B1 is committed createInvoice for A assigns InvoiceId 11 and store the invoice; transaction A is committed and Invoice is persisted and unlocked; now we have invoice A with invoiceId 11 and PartyAcctgPreference.lastInvoiceNumber=12 createInvoice for B assigns InvoiceId 12 and when attempts to store the invoice it gets a duplicated PK error; transaction B is rolled back: invoice B is not stored but PartyAcctgPreference is NOT rolled back The end result is that in the system we will have: Invoice for A with id 11; PartyAcctgPreference.lastInvoiceNumber=12 With this solution we will get gaps (Invoice C will get invoiceId=13) but not errors (errors could still happen but it will be very rare because it is very unlikely that we will get the race condition because getNextInvoiceId is very quick, while createInvoice is a slow service).
          Hide
          Aleksey Fedorchenko added a comment -

          Adrian,

          It seems I have not understand your vision properly, sorry. As for me, it is not possible as those getNextXXXId methods located in the different services, they have different body and they are in use in different locations over the system (I found no intersections in the tree of dependencies from those methods). From my standpoint my fix is the only quick-n-dirty solution for the problem. At least, our production servers are telling me about this – problem has gone.

          Could you please set me an example implementation of your idea? Even the pseudo code would be acceptable in this case.

          Thanks!

          Show
          Aleksey Fedorchenko added a comment - Adrian, It seems I have not understand your vision properly, sorry. As for me, it is not possible as those getNextXXXId methods located in the different services, they have different body and they are in use in different locations over the system (I found no intersections in the tree of dependencies from those methods). From my standpoint my fix is the only quick-n-dirty solution for the problem. At least, our production servers are telling me about this – problem has gone. Could you please set me an example implementation of your idea? Even the pseudo code would be acceptable in this case. Thanks!
          Hide
          Adrian Crum added a comment -

          Aleksey,

          Instead of a named lock, couldn't you call the individual ID generation methods from a single method and then synchronize that method? It seems to me it would have the same effect as a named lock.

          Show
          Adrian Crum added a comment - Aleksey, Instead of a named lock, couldn't you call the individual ID generation methods from a single method and then synchronize that method? It seems to me it would have the same effect as a named lock.
          Hide
          Aleksey Fedorchenko added a comment - - edited

          Hi!

          I've attached another kind of a fix of the problem above. It is a OFBIZ-3557-2.patch but let me describe the problem first.

          On the production system we're meeting this problem periodically: either invoice or order identifiers kept by the PartyAcctgPreference loose their integrity from time to time and are starting to generate duplicated values that violate the database constraints and prevent from the creation of invoices and orders.

          OFBIZ-3557-1.patch have not provided any feasible solution to our problem except for the fact that it pointed us to the problematic code there in the OFBiz core and was a great example of the way minilang may be extended. Thanks to Adrian Crum!

          The problem we have in the system may be described as a heavy access to the PartyAcctgPreference from the concurrent requests that are creating both orders and invoices simultaneously for the same party. In this case we need to serialize not only the getNextInvoiceId / getNextOrderId / getNextQuoteId separately but rather provide a synchronous access to the group of those operations to avoid concurrent update of a different counter fields of the same entity.

          In this perspective we're extending the "synchronized" attribute from OFBIZ-3557-1.patch with the "synchronized-lock-name" attribute that is a name of a mutex object being applied to a group of different operations.

          Another finding was in the field of the transaction scope getNextInvoiceId / getNextOrderId / getNextQuoteId operations are working in: they belong to the same transaction as the service of invoice / order / quote creation which would be considered as a wrong behavior for the multi-tasking environments. We introduced an atomic transaction scope for the getNextInvoiceId / getNextOrderId / getNextQuoteId operations that is basically suspending / resuming outer transaction to generate and store in the database an incremental identifier regardless of the possible failure of the outer business transaction.

          Please note that the OFBIZ-3557-2.patch might be considered as a replacement of OFBIZ-3557-1.patch and it provides changes against the 09.04 release source code.

          Please note as well that it is suitable for the single instance setups only. No clustering support was provided.

          Let me have your comments. Thanks!

          Show
          Aleksey Fedorchenko added a comment - - edited Hi! I've attached another kind of a fix of the problem above. It is a OFBIZ-3557 -2.patch but let me describe the problem first. On the production system we're meeting this problem periodically: either invoice or order identifiers kept by the PartyAcctgPreference loose their integrity from time to time and are starting to generate duplicated values that violate the database constraints and prevent from the creation of invoices and orders. OFBIZ-3557 -1.patch have not provided any feasible solution to our problem except for the fact that it pointed us to the problematic code there in the OFBiz core and was a great example of the way minilang may be extended. Thanks to Adrian Crum! The problem we have in the system may be described as a heavy access to the PartyAcctgPreference from the concurrent requests that are creating both orders and invoices simultaneously for the same party. In this case we need to serialize not only the getNextInvoiceId / getNextOrderId / getNextQuoteId separately but rather provide a synchronous access to the group of those operations to avoid concurrent update of a different counter fields of the same entity. In this perspective we're extending the "synchronized" attribute from OFBIZ-3557 -1.patch with the "synchronized-lock-name" attribute that is a name of a mutex object being applied to a group of different operations. Another finding was in the field of the transaction scope getNextInvoiceId / getNextOrderId / getNextQuoteId operations are working in: they belong to the same transaction as the service of invoice / order / quote creation which would be considered as a wrong behavior for the multi-tasking environments. We introduced an atomic transaction scope for the getNextInvoiceId / getNextOrderId / getNextQuoteId operations that is basically suspending / resuming outer transaction to generate and store in the database an incremental identifier regardless of the possible failure of the outer business transaction. Please note that the OFBIZ-3557 -2.patch might be considered as a replacement of OFBIZ-3557 -1.patch and it provides changes against the 09.04 release source code. Please note as well that it is suitable for the single instance setups only. No clustering support was provided. Let me have your comments. Thanks!
          Hide
          BJ Freeman added a comment -

          I have not gone through all 15 to see if they are specific to getNextInvoiceId
          but they have getNextInvoiceId in them
          r558532 this one was 7/27/2007
          r646697
          r667748
          r684546
          r730092 [1/3] and a bunch righ after.

          Show
          BJ Freeman added a comment - I have not gone through all 15 to see if they are specific to getNextInvoiceId but they have getNextInvoiceId in them r558532 this one was 7/27/2007 r646697 r667748 r684546 r730092 [1/3] and a bunch righ after.
          Hide
          Adrian Crum added a comment -

          I'm not sure record locking is needed. I suggested a solution that I think will work - the generated IDs are kept in an entity where they are part of the primary key, and you write a loop to keep trying incremental ID values until a store is successful. Plus, the code needs to be rewritten so the ID generation and invoice/order generation are atomic - they both succeed or they both fail.

          Show
          Adrian Crum added a comment - I'm not sure record locking is needed. I suggested a solution that I think will work - the generated IDs are kept in an entity where they are part of the primary key, and you write a loop to keep trying incremental ID values until a store is successful. Plus, the code needs to be rewritten so the ID generation and invoice/order generation are atomic - they both succeed or they both fail.
          Hide
          Scott Gray added a comment -

          http://svn.ofbiz.org/viewcvs
          that's the viewvc link for the old repo

          About the lack of synchronization, shouldn't be possible to issue a lock on the row retrieved from PartyAcctgPreference so that no other transactions can read from it until the current one is committed?

          Show
          Scott Gray added a comment - http://svn.ofbiz.org/viewcvs that's the viewvc link for the old repo About the lack of synchronization, shouldn't be possible to issue a lock on the row retrieved from PartyAcctgPreference so that no other transactions can read from it until the current one is committed?
          Hide
          Adrian Crum added a comment -

          By the way, this issue is NOT a duplicate of OFBIZ-2353. The problem described here is not due to any alleged flaws in SequenceUtil.

          Show
          Adrian Crum added a comment - By the way, this issue is NOT a duplicate of OFBIZ-2353 . The problem described here is not due to any alleged flaws in SequenceUtil.
          Hide
          Adrian Crum added a comment -

          Here is some more information that might help anyone wanting to tackle this problem.

          Originally, OFBiz followed this pattern for generating invoices/orders:

          1. Begin transaction
          2. Create a new value
          3. Get an ID from SequenceUtil
          4. Store the ID in the new value
          5. Store the new value
          6. Commit transaction

          The process was atomic and reliable. SequenceUtil would guarantee a unique ID. The entity receiving the ID had the ID in the primary key, so even in the unlikely event SequenceUtil produced a duplicate ID, the store operation would fail because of a duplicate key violation.

          Somewhere along the way this process was changed (I can't find out where because it was before OFBiz joined the ASF). If an OFBiz instance is set up for sequential invoice/order IDs, this pattern is used for generating invoices/orders:

          1. Begin transaction
          2. Create a new value
          3. Get an ID from the getNextInvoiceId service:
          4. Begin transaction
          5. Get last invoice/order number from PartyAcctgPreference
          6. Increment number
          7. Store new number in PartyAcctgPreference
          8. Commit transaction
          9. Store the ID in the new value
          10. Store the new value
          11. Commit transaction

          The getNextInvoiceId service is in a simple method. Simple methods are not synchronized.

          This new pattern is a bad design. It is not atomic and the generated IDs are not guaranteed to be unique. Not only is this pattern likely to fail, it is guaranteed to fail. The whole thing needs to be redesigned.

          In the meantime, the patch I provided fixes the getNextInvoiceId service by making the simple method synchronized. It should help in single-server setups. There is still a possibility that a valid ID will be generated, but the invoice/order creation will fail (since the two operations are in separate transactions). The end result from the user's perspective is that there will be random gaps in invoice/order IDs.

          Show
          Adrian Crum added a comment - Here is some more information that might help anyone wanting to tackle this problem. Originally, OFBiz followed this pattern for generating invoices/orders: 1. Begin transaction 2. Create a new value 3. Get an ID from SequenceUtil 4. Store the ID in the new value 5. Store the new value 6. Commit transaction The process was atomic and reliable. SequenceUtil would guarantee a unique ID. The entity receiving the ID had the ID in the primary key, so even in the unlikely event SequenceUtil produced a duplicate ID, the store operation would fail because of a duplicate key violation. Somewhere along the way this process was changed (I can't find out where because it was before OFBiz joined the ASF). If an OFBiz instance is set up for sequential invoice/order IDs, this pattern is used for generating invoices/orders: 1. Begin transaction 2. Create a new value 3. Get an ID from the getNextInvoiceId service: 4. Begin transaction 5. Get last invoice/order number from PartyAcctgPreference 6. Increment number 7. Store new number in PartyAcctgPreference 8. Commit transaction 9. Store the ID in the new value 10. Store the new value 11. Commit transaction The getNextInvoiceId service is in a simple method. Simple methods are not synchronized. This new pattern is a bad design. It is not atomic and the generated IDs are not guaranteed to be unique. Not only is this pattern likely to fail, it is guaranteed to fail. The whole thing needs to be redesigned. In the meantime, the patch I provided fixes the getNextInvoiceId service by making the simple method synchronized. It should help in single-server setups. There is still a possibility that a valid ID will be generated, but the invoice/order creation will fail (since the two operations are in separate transactions). The end result from the user's perspective is that there will be random gaps in invoice/order IDs.
          Hide
          Adrian Crum added a comment -

          OFBIZ-3557-1.patch fixes PART of the problem. It adds synchronization capability to simple methods, and the problematic methods are flagged as synchronized.

          This should fix the duplicate ID problem in a single-server environment. Multi-server setups will require a completely different ID generation algorithm.

          Show
          Adrian Crum added a comment - OFBIZ-3557 -1.patch fixes PART of the problem. It adds synchronization capability to simple methods, and the problematic methods are flagged as synchronized. This should fix the duplicate ID problem in a single-server environment. Multi-server setups will require a completely different ID generation algorithm.
          Hide
          Adrian Crum added a comment -

          Another solution would be to have a separate entity for just the party ID and the invoice number. The primary key includes both fields. Then you can guarantee a unique invoice number this way:

          1. Get current value
          2. Add increment
          3. Store new value
          4. If an exception is thrown, go to step 2

          Show
          Adrian Crum added a comment - Another solution would be to have a separate entity for just the party ID and the invoice number. The primary key includes both fields. Then you can guarantee a unique invoice number this way: 1. Get current value 2. Add increment 3. Store new value 4. If an exception is thrown, go to step 2
          Hide
          Jacques Le Roux added a comment -

          Thanks for comment Jeremy,

          I will sleep on that...

          Show
          Jacques Le Roux added a comment - Thanks for comment Jeremy, I will sleep on that...
          Hide
          Wickersheimer Jeremy added a comment -

          The above fix is not relative to sequence issues. But i've been able to use the semaphore locking on storeOrder for example which actually fixes the problem of ID collisions.

          Obviously this is not a clean solution, because it involves changing the service definition to use the semaphore for people who use strict sequencing.

          I guess the real solution could be in between, but i am not sure how it should be implemented. The getNextOrderId service should do the locking inside the simple method and only if the strict sequencing is used. And then the lock should be kept until the transaction is committed, and not just until the end of the service.

          Show
          Wickersheimer Jeremy added a comment - The above fix is not relative to sequence issues. But i've been able to use the semaphore locking on storeOrder for example which actually fixes the problem of ID collisions. Obviously this is not a clean solution, because it involves changing the service definition to use the semaphore for people who use strict sequencing. I guess the real solution could be in between, but i am not sure how it should be implemented. The getNextOrderId service should do the locking inside the simple method and only if the strict sequencing is used. And then the lock should be kept until the transaction is committed, and not just until the end of the service.
          Hide
          Wickersheimer Jeremy added a comment -

          While looking for a possible solution using SemaphoreService, i noticed that the semaphore / semaphoreWait / semaphoreSleep attributes are not copied in the ModelService clone constructor.
          The result is that they are never used.

          This happens in trunk and 09.04, i do not attach a patch all that is needed in ModelService is to add:

          this.semaphore = model.semaphore;
          this.semaphoreWait = model.semaphoreWait;
          this.semaphoreSleep = model.semaphoreSleep;

          in the clone constructor.

          Then i think we can work around this JIRA issue by marking the storeOrder service (for example) to use the semaphores.

          Show
          Wickersheimer Jeremy added a comment - While looking for a possible solution using SemaphoreService, i noticed that the semaphore / semaphoreWait / semaphoreSleep attributes are not copied in the ModelService clone constructor. The result is that they are never used. This happens in trunk and 09.04, i do not attach a patch all that is needed in ModelService is to add: this.semaphore = model.semaphore; this.semaphoreWait = model.semaphoreWait; this.semaphoreSleep = model.semaphoreSleep; in the clone constructor. Then i think we can work around this JIRA issue by marking the storeOrder service (for example) to use the semaphores.

            People

            • Assignee:
              Unassigned
              Reporter:
              Wickersheimer Jeremy
            • Votes:
              4 Vote for this issue
              Watchers:
              9 Start watching this issue

              Dates

              • Created:
                Updated:

                Development