OpenJPA / OPENJPA-1896

OpenJPA cannot store POJOs if a corresponding record already exists

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 2.0.1, 2.1.0, 2.2.0
    • Fix Version/s: 2.1.0, 2.2.0
    • Component/s: None
    • Labels: None
    • Environment:
      Eclipse, Sun Java 1.6, Ubuntu Lucid, Guice JPAPersistModule, build-time enhance.xml

      Description

      If a POJO is created using a Java constructor, merge() cannot store the newly constructed object's data if doing so means updating a pre-existing record with a matching identity.

      This is a major bug since it means applications where the objects have a natural key cannot use OpenJPA. In my case the example was a filesystem; each crawl of the filesystem generates its own data objects with file path as the natural key. These objects then need to be stored into the database. Previous crawls may have encountered the same files, and the merge operation should cause the latest data from the POJO to be stored in the pre-existing record.

      Instead, any attempt to execute either merge() or persist() on an independently constructed object with a matching record identity in the database triggers the same error in the database layer, since OpenJPA attempts to execute an insert for a pre-existing primary key, throwing...
      org.apache.openjpa.lib.jdbc.ReportingSQLException: ERROR: duplicate key value violates unique constraint "file_pkey"

      {prepstmnt 32879825 INSERT INTO file (locationString, location, version, folder) VALUES (?, ?, ?, ?) [params=?, ?, ?, ?]}

      [code=0, state=23505]

      From discussion with Rick Curtis on the users@openjpa.apache.org list, this is because the version field on an unmanaged POJO is not yet set.

      OpenJPA appears to ASSUME that no such record exists in the database, since the object wasn't loaded from the database in the first place, so a persist is attempted. Instead, I recommend that the database is QUERIED TO FIND OUT whether such a record already exists, and that the version field is set correspondingly before attempting the merge().
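The query-then-merge approach recommended above can be sketched in plain Java. Everything here is illustrative — CrawledFile and its fields are hypothetical, and the two functional parameters stand in for em.find(EntityClass, id) and em.merge(entity) so the logic can be shown without a JPA provider:

```java
import java.util.function.Function;
import java.util.function.UnaryOperator;

// Hypothetical POJO modelling the reporter's scenario: the file path is the
// natural key, and the version field is a wrapper type so "never set" is null.
class CrawledFile {
    final String path;
    Integer version;

    CrawledFile(String path) {
        this.path = path;
    }
}

class MergeWorkaround {
    // Query-first merge: if a record with the same natural key already exists,
    // copy its version onto the freshly constructed POJO before merging, so the
    // provider issues an UPDATE rather than an INSERT that violates the primary
    // key. In real code `find` would be id -> em.find(CrawledFile.class, id)
    // and `merge` would be em::merge.
    static CrawledFile mergeLatest(CrawledFile fresh,
                                   Function<String, CrawledFile> find,
                                   UnaryOperator<CrawledFile> merge) {
        CrawledFile existing = find.apply(fresh.path);
        if (existing != null) {
            fresh.version = existing.version;
        }
        return merge.apply(fresh);
    }
}
```

With a real EntityManager this costs an extra lookup per object; a provider-level fix, as eventually implemented for this issue, avoids pushing that burden onto application code.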

      Here is the corresponding thread containing Rick's comments and links to an example on GitHub which can recreate the problem.

      http://bit.ly/hfPjTI

      1. openjpa-1896.jar (122 kB, attached by Rick Curtis)

        Activity

        Albert Lee made changes -
        Status Resolved [ 5 ] Closed [ 6 ]
        Albert Lee added a comment -

        Close issue in preparation for 2.2.0 release.

        Rick Curtis added a comment -

        Committed changes to 2.1.x and trunk.

        Rick Curtis made changes -
        Status Open [ 1 ] Resolved [ 5 ]
        Resolution Fixed [ 1 ]
        Rick Curtis made changes -
        Fix Version/s 1.0.5 [ 12313777 ]
        Affects Version/s 2.1.0 [ 12314542 ]
        Affects Version/s 2.2.0 [ 12315910 ]
        Rick Curtis added a comment -

        This required two changes to get everything working:

        1.] I updated the enhancer (PCEnhancer) so that when we are writing pcIsDetached(), if we can't give a definitive answer, we return null. When merging an Entity and we don't know whether it is detached, we will query for the Entity to see if we need to issue an UPDATE or an INSERT.

        2.] *Note - This is the part that I'm not very excited about....

        This change is required for an Entity which has a primitive @Version field. After applying fix [1] we now query the DB to see if these Entities exist. Once we know that we're going to need to do an update, we look at the version field to see if its value differs from the one in the DB. In the case of a non-primitive version field, it will be null and we know that it wasn't set.

        In the case of a primitive field, it will be initialized to the type's default value. If the current version is greater than the default for the type we will incorrectly throw an optimistic lock exception. To fix the problem I was able to peer inside the Entity and check the value of the pcVersionInit field (added by the enhancer to detect whether the version field was set, or whether it holds a default value). Unfortunately the value of this field isn't exposed on the PersistenceCapable interface, so I had to use reflection.

        After thinking some more about this, I have an edge case that I can't fix...

        Let's say I have an Entity with an int version column and for whatever reason the value is 0. Now we find an Entity and stream it out to client 1. Next, another client modifies that row, so the version gets incremented to 1. Client 1 updates that data and sends it back to the server. At this time we have no state manager and we try to merge it back into the persistence context. This is the problem. We should get an OLE here because the current Entity has a version of 0 but the DB has a version of 1. We don't get an OLE because we will assume that the value of the version column is a defaulted value, not one that was actually set. Make sense?

        The net of this is that we should tell users either to make sure their version columns start with a value greater than 0, or to use non-primitive types.
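The ambiguity Rick describes comes down to Java field defaults. A minimal illustration (class names are mine, and the real entities would carry @Version annotations):

```java
// A primitive version field is initialized to 0, which is indistinguishable
// from a version that was genuinely set to 0 — hence the pcVersionInit flag
// and the advice above. A wrapper version field defaults to null, which
// unambiguously means "never set".
public class VersionDefaults {
    static class PrimitiveVersioned {
        int version;       // defaults to 0
    }

    static class WrapperVersioned {
        Integer version;   // defaults to null
    }
}
```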

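Fix [1] above effectively gives pcIsDetached() a three-valued answer; a plain-Java sketch of the resulting decision (the helper name and the String return are illustrative — the real logic lives in the enhanced class and OpenJPA's broker):

```java
public class DetachDecision {
    // Boolean.TRUE  -> known detached: merge issues an UPDATE
    // Boolean.FALSE -> known new: merge issues an INSERT
    // null          -> unknown (e.g. a POJO built with `new`): query the
    //                  database by identity and decide from the result
    static String operationFor(Boolean isDetached, boolean existsInDb) {
        if (isDetached == null) {
            return existsInDb ? "UPDATE" : "INSERT";
        }
        return isDetached ? "UPDATE" : "INSERT";
    }
}
```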
        Rick Curtis made changes -
        Fix Version/s 1.0.5 [ 12313777 ]
        Fix Version/s 2.1.0 [ 12314542 ]
        Fix Version/s 2.2.0 [ 12315910 ]
        Rick Curtis added a comment -

        Cefn -

        Well, that sucks that you're leaving for Hibernate... but I understand that you have a job that needs to get done. I'll be sure to have this one fixed so things are working when you give us another try in the future.

        Thanks,
        Rick

        Cefn Hoile added a comment -

        Sorry to let you down, Rick. I can't easily verify the patch. I no longer have a codebase incorporating OpenJPA as I switched to Hibernate. This bug just seemed too major to me, and made me wonder what else I would discover if I relied on OpenJPA.

        I appreciate the work you've put in, and I hope it helps support other OpenJPA users who want to use POJOs with natural keys as the project gets wider adoption. I like a lot of the features which OpenJPA adds on top of JPA, but standards conformance is more important for the future of our project (switching easily between persistence frameworks).

        Focusing just on JPA functionality, I found the OpenJPA enhancement process really complex and likely to introduce issues in deployment (Hibernate just worked). However, this merge bug was the clincher because of the domain we're working in.

        Rick Curtis made changes -
        Attachment openjpa-1896.jar [ 12466659 ]
        Rick Curtis added a comment -

        Cefn -

        Please try to merge this patch on top of your current openjpa jar. Please let me know if this fixes your problem... I'm not very excited about this approach, but I'll think about it while you try it out.

        Thanks,
        Rick

        Rick Curtis made changes -
        Assignee Rick Curtis [ curtisr7 ]
        Cefn Hoile made changes -
        Description: updated the mailing-list thread link (the old and new URLs are quoted in the comment that follows)
        Cefn Hoile added a comment -

        I've changed the link to the mailing-list thread above to something which Jira's smart formatting doesn't break through incompetent parsing.

        So...

        http://bit.ly/hfPjTI

        ...was previously...

        http://mail-archives.apache.org/mod_mbox/openjpa-users/201011.mbox/%3CAANLkTi=tzs-nB6+NQQJmUcfbpCF_kzPnhOtLkYHvH+bM@mail.gmail.com%3E

        Cefn Hoile created issue -

          People

          • Assignee: Rick Curtis
          • Reporter: Cefn Hoile
          • Votes: 0
          • Watchers: 0
