OpenJPA
  1. OpenJPA
  2. OPENJPA-1483

count (Distinct e) in JPQL gives wrong result when the id field is a compound primary key

    Details

    • Type: Bug Bug
    • Status: Closed
    • Priority: Major Major
    • Resolution: Fixed
    • Affects Version/s: 2.0.0-beta
    • Fix Version/s: 1.2.3, 1.3.0, 2.0.0-beta2
    • Component/s: None
    • Labels:
      None
    • Patch Info:
      Patch Available

      Description

      This is a fundamental problem with count when compound primary key is involved.

      (1) If no relation navigation is involved:
      String jpql = "SELECT COUNT (DISTINCT e) FROM G2 e";

      With the property below:
      <property name="openjpa.jdbc.DBDictionary"
      value="db2(useWildCardForCount=true)" />

      Openjpa will generate the following sql and return the correct count:

      SELECT COUNT FROM G2 t0 optimize for 1 row

      (2) If there is relation navigation invloved:
      String jpql = "SELECT COUNT (DISTINCT f1.g2) FROM F1 f1";

      The property of useWildCardForCount will not generate correct sql with right result. However, given the object-relational impedance mismatch, there is no corresponding SQL construct for count of multiple primary keys, and there is no clean and generic solution to solve this problem. The only workaround is to use native SQL with table expression:

      SELECT COUNT
      FROM (SELECT DISTINCT G1.G1PK, G1.G2PK FROM F1 t0 INNER JOIN G2 t1 ON t0.G1PK = t1.G1PK AND t0.G2PK = t1.G2PK)) TX;

      Rather than giving a wrong answer, OpenJPA should give an Unsupported exception.

      1. OPENJPA-1483-2.patch
        17 kB
        Fay Wang
      2. OPENJPA-1483-1.3.x.patch.txt
        25 kB
        Heath Thomann
      3. OPENJPA-1483-1.2.x.patch.txt
        26 kB
        Heath Thomann
      4. OPENJPA-1483.patch
        13 kB
        Fay Wang

        Activity

        Hide
        Fay Wang added a comment -

        Limited support for count(distinct compound key) will be supported when count(compound primary key) appears in the projection list. This projection list will have only one projection item, ie, count, in it. OpenJPA will use generate the following SQL:

        (1)
        String jpql = "SELECT COUNT (DISTINCT e) FROM G2 e";
        generated SQL:
        SELECT COUNT FROM G2 t0

        (2)
        String jpql = "SELECT COUNT (DISTINCT f1.g2) FROM F1 f1";
        generated SQL:
        SELECT COUNT FROM (SELECT DISTINCT G1.G1PK, G1.G2PK FROM F1 t0 INNER JOIN G2 t1 ON t0.G1PK = t1.G1PK AND t0.G2PK = t1.G2PK)

        For count(compound key), OpenJPA will generate count in the SQL.

        An unsupported exception will be thrown in all other situations.

        Show
        Fay Wang added a comment - Limited support for count(distinct compound key) will be supported when count(compound primary key) appears in the projection list. This projection list will have only one projection item, ie, count, in it. OpenJPA will use generate the following SQL: (1) String jpql = "SELECT COUNT (DISTINCT e) FROM G2 e"; generated SQL: SELECT COUNT FROM G2 t0 (2) String jpql = "SELECT COUNT (DISTINCT f1.g2) FROM F1 f1"; generated SQL: SELECT COUNT FROM (SELECT DISTINCT G1.G1PK, G1.G2PK FROM F1 t0 INNER JOIN G2 t1 ON t0.G1PK = t1.G1PK AND t0.G2PK = t1.G2PK) For count(compound key), OpenJPA will generate count in the SQL. An unsupported exception will be thrown in all other situations.
        Hide
        Fay Wang added a comment -

        The patch will produce the following sql:

        (1) "select count (distinct f1) from F1 f1",
        SELECT COUNT FROM (SELECT DISTINCT t0.F1PK, t0.F2PK FROM F1 t0 ) s optimize for 1 row

        (2) "select count (distinct f1.pk) from F1 f1",
        SELECT COUNT FROM (SELECT DISTINCT t0.F1PK, t0.F2PK FROM F1 t0 ) s optimize for 1 row

        (3)"SELECT COUNT (DISTINCT f1.g2) FROM F1 f1",
        SELECT COUNT FROM (SELECT DISTINCT t1.G1PK, t1.G2PK FROM F1 t0 INNER JOIN G2 t1 ON t0.G1PK = t1.G1PK AND t0.G2PK = t1.G2PK ) s optimize for 1 row

        (4) "SELECT COUNT (DISTINCT f1.g2.pk) FROM F1 f1",
        SELECT COUNT FROM (SELECT DISTINCT t1.G1PK, t1.G2PK FROM F1 t0 INNER JOIN G2 t1 ON t0.G1PK = t1.G1PK AND t0.G2PK = t1.G2PK ) s optimize for 1 row

        (5)"select count (f1) from F1 f1",
        SELECT COUNT FROM F1 t0 optimize for 1 row

        (6)"select count (f1.pk) from F1 f1",
        SELECT COUNT FROM F1 t0 optimize for 1 row

        (7) "SELECT COUNT (f1.g2) FROM F1 f1",
        SELECT COUNT FROM F1 t0 INNER JOIN G2 t1 ON t0.G1PK = t1.G1PK AND t0.G2PK = t1.G2PK optimize for 1 row

        (8)"SELECT COUNT (f1.g2.pk) FROM F1 f1",
        SELECT COUNT FROM F1 t0 INNER JOIN G2 t1 ON t0.G1PK = t1.G1PK AND t0.G2PK = t1.G2PK optimize for 1 row

        Show
        Fay Wang added a comment - The patch will produce the following sql: (1) "select count (distinct f1) from F1 f1", SELECT COUNT FROM (SELECT DISTINCT t0.F1PK, t0.F2PK FROM F1 t0 ) s optimize for 1 row (2) "select count (distinct f1.pk) from F1 f1", SELECT COUNT FROM (SELECT DISTINCT t0.F1PK, t0.F2PK FROM F1 t0 ) s optimize for 1 row (3)"SELECT COUNT (DISTINCT f1.g2) FROM F1 f1", SELECT COUNT FROM (SELECT DISTINCT t1.G1PK, t1.G2PK FROM F1 t0 INNER JOIN G2 t1 ON t0.G1PK = t1.G1PK AND t0.G2PK = t1.G2PK ) s optimize for 1 row (4) "SELECT COUNT (DISTINCT f1.g2.pk) FROM F1 f1", SELECT COUNT FROM (SELECT DISTINCT t1.G1PK, t1.G2PK FROM F1 t0 INNER JOIN G2 t1 ON t0.G1PK = t1.G1PK AND t0.G2PK = t1.G2PK ) s optimize for 1 row (5)"select count (f1) from F1 f1", SELECT COUNT FROM F1 t0 optimize for 1 row (6)"select count (f1.pk) from F1 f1", SELECT COUNT FROM F1 t0 optimize for 1 row (7) "SELECT COUNT (f1.g2) FROM F1 f1", SELECT COUNT FROM F1 t0 INNER JOIN G2 t1 ON t0.G1PK = t1.G1PK AND t0.G2PK = t1.G2PK optimize for 1 row (8)"SELECT COUNT (f1.g2.pk) FROM F1 f1", SELECT COUNT FROM F1 t0 INNER JOIN G2 t1 ON t0.G1PK = t1.G1PK AND t0.G2PK = t1.G2PK optimize for 1 row
        Hide
        Milosz Tylenda added a comment -

        Hi Fay, does your patch make useWildCardForCount property obsolete? I understand your description that with the patch we would always use wild card in SELECT COUNT. If so, we could remove the useWildCardForCount property in 2.1.0.

        Show
        Milosz Tylenda added a comment - Hi Fay, does your patch make useWildCardForCount property obsolete? I understand your description that with the patch we would always use wild card in SELECT COUNT. If so, we could remove the useWildCardForCount property in 2.1.0.
        Hide
        Fay Wang added a comment -

        Since not all databases support common table expression, a DBDictioinary attribute "supportsCommonTableExpression" is introduced. A test case is also included to run on DB2. For those databases that do not support common table expression, openjpa will throw UnsupportedException instead of giving a misleading result for Count (Distinct compound PK).

        Show
        Fay Wang added a comment - Since not all databases support common table expression, a DBDictioinary attribute "supportsCommonTableExpression" is introduced. A test case is also included to run on DB2. For those databases that do not support common table expression, openjpa will throw UnsupportedException instead of giving a misleading result for Count (Distinct compound PK).
        Hide
        Fay Wang added a comment -

        Hi Milosz, no, we will not always use Count for select count in all scenarios. The Count will be used only when SELECT COUNT(compound PK). Could you please review OPENJPA-1483-2.patch. Thanks!

        Show
        Fay Wang added a comment - Hi Milosz, no, we will not always use Count for select count in all scenarios. The Count will be used only when SELECT COUNT(compound PK). Could you please review OPENJPA-1483 -2.patch. Thanks!
        Hide
        Milosz Tylenda added a comment -

        Hi Fay, of course, my question was wrong. I forgot you stated it clearly even in issue description that the problem is only with compound PKs. Sorry for the confusion.

        As for the reviewing the new patch - yes, I reviewed it but since this is you who is the expert in that matter, I am learning from the patch rather than judging I would only think these small issues:

        1. I think that by "common table expression" people usually mean "WITH cte_name ... SELECT ... FROM cte_name ..." syntax where this cte can also be recursive. What we need for counting compund PKs, is the database ability to execute subqueries in FROM clause. Thus, I would name the new property "supportsSubselectInFrom". What do you think?

        2. We should document the new property in the user manual.

        3. When the patch is applied, it would be good to open a new issue to investigate the value of the new property for other databases. I believe quite many of them support subqueries in FROM clause.

        4. In the test case I would change

        + if (!(dict instanceof DB2Dictionary))
        + return;

        to testing the value of the new property. That way, the test case will automatically be executed also on other databases which have the support.

        Show
        Milosz Tylenda added a comment - Hi Fay, of course, my question was wrong. I forgot you stated it clearly even in issue description that the problem is only with compound PKs. Sorry for the confusion. As for the reviewing the new patch - yes, I reviewed it but since this is you who is the expert in that matter, I am learning from the patch rather than judging I would only think these small issues: 1. I think that by "common table expression" people usually mean "WITH cte_name ... SELECT ... FROM cte_name ..." syntax where this cte can also be recursive. What we need for counting compund PKs, is the database ability to execute subqueries in FROM clause. Thus, I would name the new property "supportsSubselectInFrom". What do you think? 2. We should document the new property in the user manual. 3. When the patch is applied, it would be good to open a new issue to investigate the value of the new property for other databases. I believe quite many of them support subqueries in FROM clause. 4. In the test case I would change + if (!(dict instanceof DB2Dictionary)) + return; to testing the value of the new property. That way, the test case will automatically be executed also on other databases which have the support.
        Hide
        Fay Wang added a comment -

        Hi Milosz, thank you for pointing out this. Common table expression and table expression semantically are the same, but syntactically, they are quite different. For cte, you can define once and use it multiple times. it is like a function definition of a program. In this JIRA, it is table expression, but not common table expression that is required:

        SELECT COUNT
        FROM (SELECT DISTINCT G1.G1PK, G1.G2PK
        FROM F1 t0 INNER JOIN G2 t1 ON t0.G1PK = t1.G1PK AND t0.G2PK = t1.G2PK)) TX;

        Show
        Fay Wang added a comment - Hi Milosz, thank you for pointing out this. Common table expression and table expression semantically are the same, but syntactically, they are quite different. For cte, you can define once and use it multiple times. it is like a function definition of a program. In this JIRA, it is table expression, but not common table expression that is required: SELECT COUNT FROM (SELECT DISTINCT G1.G1PK, G1.G2PK FROM F1 t0 INNER JOIN G2 t1 ON t0.G1PK = t1.G1PK AND t0.G2PK = t1.G2PK)) TX;
        Hide
        Catalina Wei added a comment -

        Hi Fay,
        1. for newly added unsupported messages, could you add message id to localizer.properties ?
        2. could you move the block of the code in JDBCStoreQuery that handles the isCountDistinctMultiCols to SelectConstructor.evaluate() so that kernel.exp.Count visibility remains private at package level ?

        Show
        Catalina Wei added a comment - Hi Fay, 1. for newly added unsupported messages, could you add message id to localizer.properties ? 2. could you move the block of the code in JDBCStoreQuery that handles the isCountDistinctMultiCols to SelectConstructor.evaluate() so that kernel.exp.Count visibility remains private at package level ?
        Hide
        Heath Thomann added a comment -

        Added OPENJPA-1483-1.2.x.patch.txt for 1.2.x code and OPENJPA-1483-1.3.x.patch.txt for 1.3.x code, as their names imply.

        Show
        Heath Thomann added a comment - Added OPENJPA-1483 -1.2.x.patch.txt for 1.2.x code and OPENJPA-1483 -1.3.x.patch.txt for 1.3.x code, as their names imply.

          People

          • Assignee:
            Unassigned
            Reporter:
            Fay Wang
          • Votes:
            0 Vote for this issue
            Watchers:
            0 Start watching this issue

            Dates

            • Created:
              Updated:
              Resolved:

              Development