Derby
  1. Derby
  2. DERBY-3900

SELECT ... FOR UPDATE cannot be used in many queries

    Details

    • Type: Improvement Improvement
    • Status: Open
    • Priority: Major Major
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: SQL
    • Urgency:
      Normal

      Description

      The documentation in http://db.apache.org/derby/docs/10.4/ref/rrefsqlj41360.html#rrefsqlj41360__sqlj15384 says that SELECT ... FOR UPDATE cannot be used in many situations (e.g. when ORDER BY is present or when JOINs are used). I can very well understand that the current implementation using updatable cursors is very hard to implement when multiple tables are used and therefore these restrictions are probably necessary.

      However, besides that functionality, "FOR UPDATE" is extremely useful for transactional integrity: For example, we - http://www.jfire.org - use transaction isolation level read committed, because it provides good transaction safety combined with good performance. When modifying records, we first select the appropriate table rows with a SELECT FOR UPDATE in order to guarantee that the data we just read cannot be manipulated by another transaction while we are working with it.

      I do not see any reason why this locking behaviour should not be possible for certain queries. Therefore, I recommend to introduce a configuration setting (maybe a system property? or an option passed to the JDBC-URL?) that disables updatable queries completely (we don't need them anyway and probably it improves performance when not using them). With this option set, the SELECT ... FOR UPDATE should solely affect locks on rows - and work with all SELECT expressions - no matter whether they use JOIN, UNION, ORDER BY etc..

        Activity

        Hide
        Dag H. Wanvik added a comment -

        Have a look at the "WITH

        {RR|RS|CS|UR}

        " clause of Derby's SELECT statement, cf.
        http://db.apache.org/derby/docs/10.4/ref/rrefsqlj41360.html
        http://db.apache.org/derby/docs/10.4/ref/rrefsqlj41180.html

        I think this will let you achieve what you need, by using SELECT .. WITH RS (or for serializable: WITH RR).

        Show
        Dag H. Wanvik added a comment - Have a look at the "WITH {RR|RS|CS|UR} " clause of Derby's SELECT statement, cf. http://db.apache.org/derby/docs/10.4/ref/rrefsqlj41360.html http://db.apache.org/derby/docs/10.4/ref/rrefsqlj41180.html I think this will let you achieve what you need, by using SELECT .. WITH RS (or for serializable: WITH RR).
        Hide
        Dag H. Wanvik added a comment -

        I think this can ble closed; Derby has a way to achieve this functionality already.

        Show
        Dag H. Wanvik added a comment - I think this can ble closed; Derby has a way to achieve this functionality already.
        Hide
        Knut Anders Hatlen added a comment - - edited

        I think we should leave this issue open.

        SELECT ... WITH RS doesn't acquire update locks, so it doesn't have the exact same transactional properties as FOR UPDATE. If you have a transaction that first reads a set of records, then processes the information, and finally updates the records, you could end up with deadlocks and rollbacks if other, similar transactions are executed concurrently. If you additionally use FOR UPDATE when you read the records in the beginning of the transaction, no other transaction can read the same records with FOR UPDATE, and you don't get deadlocks. This could in some cases give significant performance benefits. Setting an exclusive table lock before reading the rows could in some cases give you a similar benefit, but that could have a much higher concurrency penalty if only a small part of the table is actually touched.

        GlassFish is one application that uses SELECT ... FOR UPDATE WITH RS to improve the performance of certain transactions. See https://glassfish.dev.java.net/issues/show_bug.cgi?id=4523. However, since Derby doesn't support the FOR UPDATE clause in joins, it needs to fall back to a less efficient approach for some queries. With the functionality suggested by the reporter, higher performance could also be achieved for those queries that involve joins.

        Show
        Knut Anders Hatlen added a comment - - edited I think we should leave this issue open. SELECT ... WITH RS doesn't acquire update locks, so it doesn't have the exact same transactional properties as FOR UPDATE. If you have a transaction that first reads a set of records, then processes the information, and finally updates the records, you could end up with deadlocks and rollbacks if other, similar transactions are executed concurrently. If you additionally use FOR UPDATE when you read the records in the beginning of the transaction, no other transaction can read the same records with FOR UPDATE, and you don't get deadlocks. This could in some cases give significant performance benefits. Setting an exclusive table lock before reading the rows could in some cases give you a similar benefit, but that could have a much higher concurrency penalty if only a small part of the table is actually touched. GlassFish is one application that uses SELECT ... FOR UPDATE WITH RS to improve the performance of certain transactions. See https://glassfish.dev.java.net/issues/show_bug.cgi?id=4523 . However, since Derby doesn't support the FOR UPDATE clause in joins, it needs to fall back to a less efficient approach for some queries. With the functionality suggested by the reporter, higher performance could also be achieved for those queries that involve joins.
        Hide
        Kristian Waagan added a comment -

        I agree with Knut Anders, there are scenarios where using the SELECT ... FOR UPDATE with more complex queries would be beneficial.
        The benchmark SPECjAppServer2004 is one such example, and it seems the reporter also has a concrete example.

        We should probably read up on the standards, and try to introduce the extended functionality in a way that doesn't break standards compliance (having a switch has already been suggested).

        Show
        Kristian Waagan added a comment - I agree with Knut Anders, there are scenarios where using the SELECT ... FOR UPDATE with more complex queries would be beneficial. The benchmark SPECjAppServer2004 is one such example, and it seems the reporter also has a concrete example. We should probably read up on the standards, and try to introduce the extended functionality in a way that doesn't break standards compliance (having a switch has already been suggested).
        Hide
        Dag H. Wanvik added a comment -

        You are right of course, that only using WITH RS could cause more deadlocks (although it would solve
        the basic integrity issue of another transaction "manipulating the data while we are working with it"). So having
        non-sharabled locks a priori could be useful for some apps. A global switch would not add any non-standard
        syntax but isn't very flexible. SQL doesn't deal with locks, only isolation level of course.. How do you see the switch working?

        Show
        Dag H. Wanvik added a comment - You are right of course, that only using WITH RS could cause more deadlocks (although it would solve the basic integrity issue of another transaction "manipulating the data while we are working with it"). So having non-sharabled locks a priori could be useful for some apps. A global switch would not add any non-standard syntax but isn't very flexible. SQL doesn't deal with locks, only isolation level of course.. How do you see the switch working?
        Hide
        Kristian Waagan added a comment -

        I haven't given the switch much thought, I only noticed that the reporter mentioned it.
        The idea is to keep the current behavior when the switch is "off" I guess, and then many queries would be denied. It has been speculated this is because of the difficulty of implementing updatable result sets for joins etc.
        If the switch is turned "on", the result sets won't be updatable, but exclusive locks will be set on the rows touched.

        I have no idea how easily such behavior can be achieved. I believe other people are in a better position to say something about the implementation cost/difficulty.
        Your point about the lack of flexibility of a global switch is absolutely valid. Another option is to add (non-standard?) per query SQL syntax, or maybe something for a connection/session.

        It seems other DBMSs are using SELECT ... FOR UPDATE [NOWAIT] for the purpose described by this Jira issue.

        Show
        Kristian Waagan added a comment - I haven't given the switch much thought, I only noticed that the reporter mentioned it. The idea is to keep the current behavior when the switch is "off" I guess, and then many queries would be denied. It has been speculated this is because of the difficulty of implementing updatable result sets for joins etc. If the switch is turned "on", the result sets won't be updatable, but exclusive locks will be set on the rows touched. I have no idea how easily such behavior can be achieved. I believe other people are in a better position to say something about the implementation cost/difficulty. Your point about the lack of flexibility of a global switch is absolutely valid. Another option is to add (non-standard?) per query SQL syntax, or maybe something for a connection/session. It seems other DBMSs are using SELECT ... FOR UPDATE [NOWAIT] for the purpose described by this Jira issue.
        Hide
        Marco added a comment -

        Thanks a lot for your replies! I have opened an issue in the DataNucleus JIRA: http://www.jpox.org/servlet/jira/browse/NUCRDBMS-88

        @Dag H. Wanvik
        > A global switch ... isn't very flexible.
        Today, many people use persistence layers (JDO, JPA, Hibernate, etc.). When using such a framework, updatable ResultSets are normally not necessary at all, since the persistence engines perform explicit UPDATE/INSERT/DELETE commands. Thus, it's not a problem at all that the flexibility is reduced and IMHO it's not required to modify the behaviour on a per-query-base.

        > You are right of course, that only using WITH RS could cause more deadlocks
        I have the impression that Derby seems to need quite some time to detect deadlocks (why?). Therefore, getting even more of them would not be desirable. Thus, I'd favour a modification of the "FOR UPDATE" behaviour, if it would lead to the same result with less deadlocks.

        Show
        Marco added a comment - Thanks a lot for your replies! I have opened an issue in the DataNucleus JIRA: http://www.jpox.org/servlet/jira/browse/NUCRDBMS-88 @Dag H. Wanvik > A global switch ... isn't very flexible. Today, many people use persistence layers (JDO, JPA, Hibernate, etc.). When using such a framework, updatable ResultSets are normally not necessary at all, since the persistence engines perform explicit UPDATE/INSERT/DELETE commands. Thus, it's not a problem at all that the flexibility is reduced and IMHO it's not required to modify the behaviour on a per-query-base. > You are right of course, that only using WITH RS could cause more deadlocks I have the impression that Derby seems to need quite some time to detect deadlocks (why?). Therefore, getting even more of them would not be desirable. Thus, I'd favour a modification of the "FOR UPDATE" behaviour, if it would lead to the same result with less deadlocks.
        Hide
        Dag H. Wanvik added a comment -

        Kristian: > Another option is to add (non-standard?) per query SQL syntax, or maybe something for a connection/session.

        A third option may be to always allow FOR UPDATE, but just deny actual updates if the FROM expression
        is anything but a single basetable (thanks to Knut for suggesting this).

        Show
        Dag H. Wanvik added a comment - Kristian: > Another option is to add (non-standard?) per query SQL syntax, or maybe something for a connection/session. A third option may be to always allow FOR UPDATE, but just deny actual updates if the FROM expression is anything but a single basetable (thanks to Knut for suggesting this).
        Hide
        Kristian Waagan added a comment -

        Yes, that third option sound like the best approach to me.
        Hopefully the locking is already implemented, and we just have to change the place where the query/update is denied?

        Well, there's probably more work to be done, but at least it's a plan

        Show
        Kristian Waagan added a comment - Yes, that third option sound like the best approach to me. Hopefully the locking is already implemented, and we just have to change the place where the query/update is denied? Well, there's probably more work to be done, but at least it's a plan
        Hide
        Knut Anders Hatlen added a comment -

        I found this comment in CursorNode.determineUpdateMode():

        // The FOR UPDATE clause has two uses:
        //
        // for positioned cursor updates
        //
        // to change locking behaviour of the select
        // to reduce deadlocks on subsequent updates
        // in the same transaction.
        //
        // We now support this latter case, without requiring
        // that the source of the rows be able to implement
        // a positioned update.

        So it seems there is (or has been) support in the code for getting this locking behaviour in statements that don't allow positioned updates.

        Show
        Knut Anders Hatlen added a comment - I found this comment in CursorNode.determineUpdateMode(): // The FOR UPDATE clause has two uses: // // for positioned cursor updates // // to change locking behaviour of the select // to reduce deadlocks on subsequent updates // in the same transaction. // // We now support this latter case, without requiring // that the source of the rows be able to implement // a positioned update. So it seems there is (or has been) support in the code for getting this locking behaviour in statements that don't allow positioned updates.

          People

          • Assignee:
            Unassigned
            Reporter:
            Marco
          • Votes:
            0 Vote for this issue
            Watchers:
            1 Start watching this issue

            Dates

            • Created:
              Updated:

              Development