Uploaded image for project: 'OpenJPA'
  1. OpenJPA
  2. OPENJPA-2539

JPQL interpret wrongly for inner join table (without mapped relation)

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Critical
    • Resolution: Fixed
    • Affects Version/s: 2.1.0
    • Fix Version/s: 2.1.2, 2.2.1.1, 2.2.3, 2.4.1
    • Component/s: sql
    • Labels:
      None
    • Environment:
      WebSphere v8.0

      Description

      I found the following issue after a few months of production of my client.

      I write a java batch job to process some account data for my client.

      And I found out that the JPQL interpret wrong randomly and it is not always having the same result for the same of data and code.

      JPQL statement:
      SELECT B
      FROM EntityA_PO A, EntityB_PO B
      WHERE A.deletingDateTime IS NULL and A.statusCode= :app and A.someId in (:ids)
      and B.deletingDateTime IS NULL and B.statusCode= :app2 and A.headerId = B.headerId
      ORDER BY B.someId

      translated to unexpected SQL:

      SELECT t1.HEADER_ID, t1.SOME_ID
      FROM EntityB t0 JOIN EntityB t1 ON (1 = 1)
      WHERE (t0.DELETING_DATE_TIME IS NULL AND t0.STATUS_CODE = ? AND t0.SOME_ID IN
      AND t1.DELETING_DATE_TIME IS NULL AND t1.STATUS_CODE = ? AND t1.HEADER_ID = t0.HEADER_ID)
      ORDER BY t1.SOME_ID ASC

      the correct translated SQL:

      SELECT t1.HEADER_ID, t1.SOME_ID
      FROM EntityA t0 JOIN EntityB t1 ON (1 = 1)
      WHERE (t0.DELETING_DATE_TIME IS NULL AND t0.STATUS_CODE = ? AND t0.SOME_ID IN
      AND t1.DELETING_DATE_TIME IS NULL AND t1.STATUS_CODE = ? AND t1.HEADER_ID = t0.HEADER_ID)
      ORDER BY t1.SOME_ID ASC

      Detail:
      The two EntityA_PO and EntityB_PO do not have any mapped relation. They are joined by a column called headerId in where clause only.

      The problem and correct generated native SQL is different by one is to=Entity A , and other is to=Entity B.

      I don't know why the wrong one joining Entity B itself without any Entity A involvement. This situation seems to be existed randomly. I can't catch the pattern of it.

      If there is any fix or workaround for this case, please let me know. Thanks

      1. OPENJPA-2539-2.1.x_v2.test
        17 kB
        Heath Thomann
      2. OPENJPA-2539-2.1.x_v3.test
        17 kB
        Heath Thomann
      3. OPENJPA-2539-2.1.x.test
        16 kB
        Heath Thomann
      4. schema_and_testing_sql.txt
        7 kB
        Heath Thomann

        Activity

        Hide
        curtisr7 Rick Curtis added a comment -

        For giggles, can you try to disable the QuerySqlCache?

        <property name="openjpa.jdbc.QuerySQLCache" value="false">

        If that doesn't work, I'm going to suggest to open a PMR.

        Show
        curtisr7 Rick Curtis added a comment - For giggles, can you try to disable the QuerySqlCache? <property name="openjpa.jdbc.QuerySQLCache" value="false"> If that doesn't work, I'm going to suggest to open a PMR.
        Hide
        zechs Benson So added a comment - - edited

        PMR has been opened. I am here to see whether there is any more information
        Is it really related to cached of querySQLCache ?

        Show
        zechs Benson So added a comment - - edited PMR has been opened. I am here to see whether there is any more information Is it really related to cached of querySQLCache ?
        Hide
        curtisr7 Rick Curtis added a comment -

        Honestly I didn't have much time to dig into your issue... The reason I suggested that property is that I seem to remember seeing a number of issues when using 'IS NULL' JPQL and getting different SQL on different executions.

        Did you try the property see if it helped your scenario?

        Show
        curtisr7 Rick Curtis added a comment - Honestly I didn't have much time to dig into your issue... The reason I suggested that property is that I seem to remember seeing a number of issues when using 'IS NULL' JPQL and getting different SQL on different executions. Did you try the property see if it helped your scenario?
        Hide
        zechs Benson So added a comment -

        Since my client in banking field, I can't remote the testing machine to test the property easily. I would like to gather more info so that I am confident enough to do more test on the problem.

        You have mentioned that 'IS NULL' , can you evaluate more on that ? why it is a problem to use "is null" ? do there any example or fix on that ? Thank in advance.

        Show
        zechs Benson So added a comment - Since my client in banking field, I can't remote the testing machine to test the property easily. I would like to gather more info so that I am confident enough to do more test on the problem. You have mentioned that 'IS NULL' , can you evaluate more on that ? why it is a problem to use "is null" ? do there any example or fix on that ? Thank in advance.
        Hide
        curtisr7 Rick Curtis added a comment -

        After looking a bit closer at the JPQL, I am not certain that this is related to the QuerySQLCache. It's hard to say without more data.

        Is this problem fairly easily repeatable? If so, posting a unit-ish test will make it much easier to diagnose the problem.

        Show
        curtisr7 Rick Curtis added a comment - After looking a bit closer at the JPQL, I am not certain that this is related to the QuerySQLCache. It's hard to say without more data. Is this problem fairly easily repeatable? If so, posting a unit-ish test will make it much easier to diagnose the problem.
        Hide
        zechs Benson So added a comment -

        This problem is not repeatable easily. I need to loop this SQL about 200k+ times to reproduce it on client machine.
        Since I am using "in" in the where clause, this sql can't be cached as seem in the log of openjpa.
        I am not sure the root clause of this behavior and how to reproduce it in easier way.

        Show
        zechs Benson So added a comment - This problem is not repeatable easily. I need to loop this SQL about 200k+ times to reproduce it on client machine. Since I am using "in" in the where clause, this sql can't be cached as seem in the log of openjpa. I am not sure the root clause of this behavior and how to reproduce it in easier way.
        Hide
        jpaheath Heath Thomann added a comment -

        Hey Rick! Benson sent to me a "test" by other means than this JIRA. I've ran his test as-is (a simply JSE test packaged with necessary jars). I was NOT able to recreate the issue described on my Windows machine. As I understand it, Benson is not able to recreate on Windows either, rather he can occasionally recreate on AIX. I've taken Benson's test and put it into a standard OpenJPA test which you can apply. I'm also including an SQL script which you can run to set up the tables and necessary data. I've ran the SQL against my DB2. This is what Benson is using as well.

        Benson - can you verify that you can only recreate on AIX??

        Thanks,

        Heath

        Show
        jpaheath Heath Thomann added a comment - Hey Rick! Benson sent to me a "test" by other means than this JIRA. I've ran his test as-is (a simply JSE test packaged with necessary jars). I was NOT able to recreate the issue described on my Windows machine. As I understand it, Benson is not able to recreate on Windows either, rather he can occasionally recreate on AIX. I've taken Benson's test and put it into a standard OpenJPA test which you can apply. I'm also including an SQL script which you can run to set up the tables and necessary data. I've ran the SQL against my DB2. This is what Benson is using as well. Benson - can you verify that you can only recreate on AIX?? Thanks, Heath
        Hide
        curtisr7 Rick Curtis added a comment -

        I successfully ran the test last night against trunk on osx and mysql. I also modified the test slightly to have multiple threads execute this query in parallel. I'll try to find some time this evening to run the same scenario against 2.1.x.

        Benson – Have you tried to disable the QuerySQLCache as I mentioned before? That might help us narrow in on where the problem is.

        Show
        curtisr7 Rick Curtis added a comment - I successfully ran the test last night against trunk on osx and mysql. I also modified the test slightly to have multiple threads execute this query in parallel. I'll try to find some time this evening to run the same scenario against 2.1.x. Benson – Have you tried to disable the QuerySQLCache as I mentioned before? That might help us narrow in on where the problem is.
        Hide
        zechs Benson So added a comment -

        Rick:
        I have not tried to disable QuerySQLCache.

        As the JPQL has "in" statement in where clause, the SQL interpreted can not be cached as stated in openJPA log.

        Show
        zechs Benson So added a comment - Rick: I have not tried to disable QuerySQLCache. As the JPQL has "in" statement in where clause, the SQL interpreted can not be cached as stated in openJPA log.
        Hide
        zechs Benson So added a comment -

        Hi Rick, Health,

        Do you have update for me ? Please let me know your progress. Thanks

        Benson

        Show
        zechs Benson So added a comment - Hi Rick, Health, Do you have update for me ? Please let me know your progress. Thanks Benson
        Hide
        curtisr7 Rick Curtis added a comment -

        Benson – Sorry but I have not been able to recreate your issue.

        Show
        curtisr7 Rick Curtis added a comment - Benson – Sorry but I have not been able to recreate your issue.
        Hide
        jpaheath Heath Thomann added a comment -

        I have gotten to the bottom of this very tricky issue! For history sake, or for the curious reader, I will post in great detail the means by which this issue occurs. The details are very long, so first I'd like to provide a quick summary and get right to what I think is a solution. For my quick summary, and your convenience, please look at method 'compilationFromCache() ' in QueryImpl here:

        http://svn.apache.org/viewvc/openjpa/branches/2.1.x/openjpa-kernel/src/main/java/org/apache/openjpa/kernel/QueryImpl.java?revision=1035834&view=markup

        Notice in this method we do a 'compCache.get(key)'. In other words, we are getting a Compilation from the cache with a "key". If we do not find a Compilation with the given key, we can see that in this method we will create a Compilation and at the end of the method put it into the 'compCache' using the key. Now, imagine the case where two threads access the cache at the same time, with the same key, and there is no Compilation for the key. BOTH threads will create one and put it into the compCache, where the last thread to put its Compilation "wins". This is potentially bad because now you have two threads each with its own Compilation for the same key. Later on when the "losing" thread accesses the cache to get a Compilation, it will get the one from the "winning" thread, not the one it had created. As I understand it, there should only ever be one Compilation per key. I will explain below why this becomes a problem.
        My thinking is that 'compCache' in 'compilationFromCache' needs to be synchronized.

        With that shorter introduction, let me dive into the details on why this cache issue would have an effect on generated SQL. Apply the provided test (OPENJPA-2539-2.1.x_v2.test), and test data (schema_and_testing_sql.txt). Let me start by looking at the spot where we have a potential race condition. First, In QueryImpl, put a break point here:

        compCache.put(key, comp)

        Run the test in debug mode, and with two threads, and you will notice that both threads will break here. Notice that in the containing method (compilationFromCache()) that we have this code before the compCache.put:

        CompilationKey key = new CompilationKey();
        ......
        Compilation comp = (Compilation) compCache.get(key);

        As you can see, if one thread is running slower than the other, the second thread should get the Compilation from the cache for the same key. However, since I've deliberately set a break point at compCache.put, both threads break here and they will both put a Compilation into compCache with THE SAME key. So, the thread which executes the 'put' last effectively wins.

        Now, under most circumstances I don't think this is so bad (although ultimately I think the intentions are that only one Compilation per key would ever be added). If it was bad we'd likely see more issues. However, I think we need to have one more 'timing'/race condition to occur, as I'll explain, along with a few other specific things. To help explain, I would suggest putting the following println after compCache.put:

        // START - ALLOW PRINT STATEMENTS
        System.out.println(Thread.currentThread() + ": " + this.getClass() + ".compilationFromCache(), put a comp into the cache."
        + " comp = " + comp + " and comp.datastore = " + comp.datastore);
        // STOP - ALLOW PRINT STATEMENTS

        This will tell us the Compilation instance added per thread which we will use later. After adding this println, rerun the test (in debug mode) and take note of which thread added to the cache LAST.

        Next, put a break point on this line in QueryImpl, method 'compileForDataStore':

        (#1) if (comp.datastore == null){
        (#2) comp.datastore = createExecutor(false);

        At this time, both threads should be still stopped at compCache.put. Allow each thread to run, which should allow them both to break on the break point at the 'if' block in 'compileForDataStore' (i.e. #1). Notice that comp.datastore will be assigned a value returned by 'createExecutor' (keep this in mind for later).

        Next, add the following to compileForExecutor in QueryImpl (likely you'll need to re-run the test and get back to the break point in the 'if' block in 'compileForDataStore'):

        private StoreQuery.Executor compileForExecutor() {
        Compilation comp = compileForCompilation();
        // START - ALLOW PRINT STATEMENTS
        (#3) System.out.println(Thread.currentThread() + ": " + this.getClass()
        + ".compileForExecutor: comp = " + comp + ", comp.datastore = " + comp.datastore);
        // STOP - ALLOW PRINT STATEMENTS

        Previously I mentioned that we needed to take note of the thread which added to the compCache LAST (call it T0). Don't run this thread, allow the OTHER thread to run (call it T1). When you run this thread, 'comp.datastore' will be assigned a value (#2 above), and the println (#3) we just added to 'compileForExecutor' will show that T1 gets from 'compileForCompilation()' (which calls to get the Compilation from the cache) the Compilation that it put into the Compilation cache. So at this point, T1 has worked with the Compilation it created, and the Compilation.datastore (#2) has been assigned a value.
        Next, continue to allow thread T1 to run again. Notice that flow will come back to #1 above, thus the break point hit. Also notice that, the println in #3 will have made a print, meaning we executed 'compileForCompilation()', and thus got a Compilation from the cache. Comparing the println from this thread (T1) should show that it is using the Compilation from T0! And we know that thread T0 has just been sitting (breaking) at #1 above, so its 'comp.datastore' is null! That means, we have to go through the 'createExecutor'!! If thread T1 got its own Compilation, 'comp.datastore' would be non-null, and we simply would have returned! This seems to be at the root of the issue.

        Now, at this point you might be wondering how in the world this would effect something as far "downstream" as SQL generation. Well, to see this I'm going to skip way ahead in the test to the point where the test causes 'DBDictionary.toSQL92Join'. Oh, you must run this test with a DB that supports SLQ92 (like DB2). Put a break point here and let thread T1 run to this point. In this method you will see that 'join.getTable1()' returns a 'CFP_PLAN_HEADER', but it should be 'CTR_CFP_CURRENT'. Because of this, we generate 'FROM CFP_PLAN_HEADER t0 JOIN CFP_PLAN_HEADER t1'.

        The latter paragraph described the end results, that is, the point at which we generate the wrong SQL. The prior part of this post describes the problem at its root with the Compilation cache. To tie these two together would require far more details. However, it became an exercise to walk backwards in the code to determine where the value returned by 'join.getTable1()' came from. Basically this is set as the "candidate type" in the PCPath. In this case, because the test uses a TypedQuery, at the very end of creating a TypedQuery, we set the type. I don't think this issue exists when the test changes to Query from TypedQuery. Anyway, the fact that this type was set, and the fact that 'createExecutor' is executed a second time (as described above) ultimately caused the issue.

        My proposed fix would be to synchronize the compCache as follows:

        protected Compilation compilationFromCache() {
        + Compilation comp = null;
        Map compCache = _broker.getConfiguration().getQueryCompilationCacheInstance();

        if (compCache == null || !isParsedQuery())

        { return newCompilation(); }

        else {
        + synchronized (compCache)

        { .......<remained of method removed for brevity>...... + }

        }
        return comp;
        }

        Thanks,

        Heath

        Show
        jpaheath Heath Thomann added a comment - I have gotten to the bottom of this very tricky issue! For history sake, or for the curious reader, I will post in great detail the means by which this issue occurs. The details are very long, so first I'd like to provide a quick summary and get right to what I think is a solution. For my quick summary, and your convenience, please look at method 'compilationFromCache() ' in QueryImpl here: http://svn.apache.org/viewvc/openjpa/branches/2.1.x/openjpa-kernel/src/main/java/org/apache/openjpa/kernel/QueryImpl.java?revision=1035834&view=markup Notice in this method we do a 'compCache.get(key)'. In other words, we are getting a Compilation from the cache with a "key". If we do not find a Compilation with the given key, we can see that in this method we will create a Compilation and at the end of the method put it into the 'compCache' using the key. Now, imagine the case where two threads access the cache at the same time, with the same key, and there is no Compilation for the key. BOTH threads will create one and put it into the compCache, where the last thread to put its Compilation "wins". This is potentially bad because now you have two threads each with its own Compilation for the same key. Later on when the "losing" thread accesses the cache to get a Compilation, it will get the one from the "winning" thread, not the one it had created. As I understand it, there should only ever be one Compilation per key. I will explain below why this becomes a problem. My thinking is that 'compCache' in 'compilationFromCache' needs to be synchronized. With that shorter introduction, let me dive into the details on why this cache issue would have an effect on generated SQL. Apply the provided test ( OPENJPA-2539 -2.1.x_v2.test), and test data (schema_and_testing_sql.txt). Let me start by looking at the spot where we have a potential race condition. First, In QueryImpl, put a break point here: compCache.put(key, comp) Run the test in debug mode, and with two threads, and you will notice that both threads will break here. Notice that in the containing method (compilationFromCache()) that we have this code before the compCache.put: CompilationKey key = new CompilationKey(); ...... Compilation comp = (Compilation) compCache.get(key); As you can see, if one thread is running slower than the other, the second thread should get the Compilation from the cache for the same key. However, since I've deliberately set a break point at compCache.put, both threads break here and they will both put a Compilation into compCache with THE SAME key. So, the thread which executes the 'put' last effectively wins. Now, under most circumstances I don't think this is so bad (although ultimately I think the intentions are that only one Compilation per key would ever be added). If it was bad we'd likely see more issues. However, I think we need to have one more 'timing'/race condition to occur, as I'll explain, along with a few other specific things. To help explain, I would suggest putting the following println after compCache.put: // START - ALLOW PRINT STATEMENTS System.out.println(Thread.currentThread() + ": " + this.getClass() + ".compilationFromCache(), put a comp into the cache." + " comp = " + comp + " and comp.datastore = " + comp.datastore); // STOP - ALLOW PRINT STATEMENTS This will tell us the Compilation instance added per thread which we will use later. After adding this println, rerun the test (in debug mode) and take note of which thread added to the cache LAST. Next, put a break point on this line in QueryImpl, method 'compileForDataStore': (#1) if (comp.datastore == null){ (#2) comp.datastore = createExecutor(false); At this time, both threads should be still stopped at compCache.put. Allow each thread to run, which should allow them both to break on the break point at the 'if' block in 'compileForDataStore' (i.e. #1). Notice that comp.datastore will be assigned a value returned by 'createExecutor' (keep this in mind for later). Next, add the following to compileForExecutor in QueryImpl (likely you'll need to re-run the test and get back to the break point in the 'if' block in 'compileForDataStore'): private StoreQuery.Executor compileForExecutor() { Compilation comp = compileForCompilation(); // START - ALLOW PRINT STATEMENTS (#3) System.out.println(Thread.currentThread() + ": " + this.getClass() + ".compileForExecutor: comp = " + comp + ", comp.datastore = " + comp.datastore); // STOP - ALLOW PRINT STATEMENTS Previously I mentioned that we needed to take note of the thread which added to the compCache LAST (call it T0). Don't run this thread, allow the OTHER thread to run (call it T1). When you run this thread, 'comp.datastore' will be assigned a value (#2 above), and the println (#3) we just added to 'compileForExecutor' will show that T1 gets from 'compileForCompilation()' (which calls to get the Compilation from the cache) the Compilation that it put into the Compilation cache. So at this point, T1 has worked with the Compilation it created, and the Compilation.datastore (#2) has been assigned a value. Next, continue to allow thread T1 to run again. Notice that flow will come back to #1 above, thus the break point hit. Also notice that, the println in #3 will have made a print, meaning we executed 'compileForCompilation()', and thus got a Compilation from the cache. Comparing the println from this thread (T1) should show that it is using the Compilation from T0! And we know that thread T0 has just been sitting (breaking) at #1 above, so its 'comp.datastore' is null! That means, we have to go through the 'createExecutor'!! If thread T1 got its own Compilation, 'comp.datastore' would be non-null, and we simply would have returned! This seems to be at the root of the issue. Now, at this point you might be wondering how in the world this would effect something as far "downstream" as SQL generation. Well, to see this I'm going to skip way ahead in the test to the point where the test causes 'DBDictionary.toSQL92Join'. Oh, you must run this test with a DB that supports SLQ92 (like DB2). Put a break point here and let thread T1 run to this point. In this method you will see that 'join.getTable1()' returns a 'CFP_PLAN_HEADER', but it should be 'CTR_CFP_CURRENT'. Because of this, we generate 'FROM CFP_PLAN_HEADER t0 JOIN CFP_PLAN_HEADER t1'. The latter paragraph described the end results, that is, the point at which we generate the wrong SQL. The prior part of this post describes the problem at its root with the Compilation cache. To tie these two together would require far more details. However, it became an exercise to walk backwards in the code to determine where the value returned by 'join.getTable1()' came from. Basically this is set as the "candidate type" in the PCPath. In this case, because the test uses a TypedQuery, at the very end of creating a TypedQuery, we set the type. I don't think this issue exists when the test changes to Query from TypedQuery. Anyway, the fact that this type was set, and the fact that 'createExecutor' is executed a second time (as described above) ultimately caused the issue. My proposed fix would be to synchronize the compCache as follows: protected Compilation compilationFromCache() { + Compilation comp = null; Map compCache = _broker.getConfiguration().getQueryCompilationCacheInstance(); if (compCache == null || !isParsedQuery()) { return newCompilation(); } else { + synchronized (compCache) { .......<remained of method removed for brevity>...... + } } return comp; } Thanks, Heath
        Hide
        jpaheath Heath Thomann added a comment -

        For history sake, I'm adding a final version of the test (OPENJPA-2539-2.1.x_v3.test). Before running the test, run the schema_and_testing_sql.txt file to create the tables/data.

        Thanks,

        Heath Thomann

        Show
        jpaheath Heath Thomann added a comment - For history sake, I'm adding a final version of the test ( OPENJPA-2539 -2.1.x_v3.test). Before running the test, run the schema_and_testing_sql.txt file to create the tables/data. Thanks, Heath Thomann
        Hide
        jira-bot ASF subversion and git services added a comment -

        Commit 1686894 from Heath Thomann in branch 'openjpa/branches/2.1.x'
        [ https://svn.apache.org/r1686894 ]

        OPENJPA-2539: Query Compilation causing inner join table to be randomly generated incorrectly.

        Show
        jira-bot ASF subversion and git services added a comment - Commit 1686894 from Heath Thomann in branch 'openjpa/branches/2.1.x' [ https://svn.apache.org/r1686894 ] OPENJPA-2539 : Query Compilation causing inner join table to be randomly generated incorrectly.
        Hide
        jira-bot ASF subversion and git services added a comment -

        Commit 1686910 from Heath Thomann in branch 'openjpa/branches/2.2.1.x'
        [ https://svn.apache.org/r1686910 ]

        OPENJPA-2539: Query Compilation causing inner join table to be randomly generated incorrectly - ported changes to 2.2.1.x.

        Show
        jira-bot ASF subversion and git services added a comment - Commit 1686910 from Heath Thomann in branch 'openjpa/branches/2.2.1.x' [ https://svn.apache.org/r1686910 ] OPENJPA-2539 : Query Compilation causing inner join table to be randomly generated incorrectly - ported changes to 2.2.1.x.
        Hide
        jira-bot ASF subversion and git services added a comment -

        Commit 1686911 from Heath Thomann in branch 'openjpa/branches/2.2.x'
        [ https://svn.apache.org/r1686911 ]

        OPENJPA-2539: Query Compilation causing inner join table to be randomly generated incorrectly - ported changes to 2.2.x.

        Show
        jira-bot ASF subversion and git services added a comment - Commit 1686911 from Heath Thomann in branch 'openjpa/branches/2.2.x' [ https://svn.apache.org/r1686911 ] OPENJPA-2539 : Query Compilation causing inner join table to be randomly generated incorrectly - ported changes to 2.2.x.
        Hide
        jira-bot ASF subversion and git services added a comment -

        Commit 1686913 from Heath Thomann in branch 'openjpa/trunk'
        [ https://svn.apache.org/r1686913 ]

        OPENJPA-2539: Query Compilation causing inner join table to be randomly generated incorrectly - ported changes to trunk.

        Show
        jira-bot ASF subversion and git services added a comment - Commit 1686913 from Heath Thomann in branch 'openjpa/trunk' [ https://svn.apache.org/r1686913 ] OPENJPA-2539 : Query Compilation causing inner join table to be randomly generated incorrectly - ported changes to trunk.

          People

          • Assignee:
            jpaheath Heath Thomann
            Reporter:
            zechs Benson So
          • Votes:
            0 Vote for this issue
            Watchers:
            4 Start watching this issue

            Dates

            • Created:
              Updated:
              Resolved:

              Development