Issue Details (XML | Word | Printable)

Key: OPENJPA-134
Type: Bug Bug
Status: Resolved Resolved
Resolution: Fixed
Priority: Major Major
Assignee: Fay Wang
Reporter: Catalina Wei
Votes: 1
Watchers: 3
Operations

If you were logged in you would be able to see more operations.
OpenJPA

Extra unneeded SQL joins for OneToMany relationship with fetch type EAGER

Created: 08/Feb/07 01:47 AM   Updated: 31/Jul/08 10:10 PM
Return to search
Component/s: sql
Affects Version/s: None
Fix Version/s: 0.9.7

Time Tracking:
Not Specified

File Attachments:
  Size
Text File Licensed for inclusion in ASF works openjpa134_3.patch 2008-07-21 02:40 AM Fay Wang 8 kB

Resolution Date: 11/Apr/07 04:26 PM


 Description  « Hide
Running JPAConfiguration default setting for EagerFetchMode (FetchModeValue.EAGER_PARALLEL),
the SQL generated is sub-optimal.
Consider the following entities:

                                     lineitems( OneToMany )
            Order <===========================> OrderItem
                                    order ( ManyToOne )


Case 1: why not combining 2 SQL to 1 SQL ?
=================================================================================================
Order.lineitmes(EAGER):
OrderItem.order(LAZY):

Executing query: select o from Order o
2173 demo TRACE [main] openjpa.jdbc.SQL - <t 1094730048, conn 1318342292> executing prepstmnt 1299336562
SELECT t0.oid, t0.version, t0.amount, t0.customer_countryCode, t0.customer_id, t0.delivered, t0.shipaddr FROM Order t0
2213 demo TRACE [main] openjpa.jdbc.SQL - <t 1094730048, conn 1318342292> [40 ms] spent
2223 demo TRACE [main] openjpa.jdbc.SQL - <t 1094730048, conn 1318342292> executing prepstmnt 1406424020
SELECT t0.oid, t1.lid, t1.version, t1.cost, t1.order_oid, t1.part_partno, t1.quantity FROM Order t0 INNER JOIN OrderItem t1 ON t0.oid = t1.order_oid ORDER BY t0.oid ASC


Case 2: extra unneeded LEFT OUTER JOIN, if eliminated, the selection aliase t2 should change to t1:
=============================================================================================
Order.lineitmes(EAGER):
OrderItem.order(LAZY):

Executing query: select o from Order o left join fetch o.lineitems
2403 demo TRACE [main] openjpa.jdbc.SQL - <t 1094730048, conn 1314410072> executing prepstmnt 1500797300
SELECT t0.oid, t0.version, t0.amount, t0.customer_countryCode, t0.customer_id, t0.delivered, t0.shipaddr, t2.order_oid, t2.lid, t2.version, t2.cost, t2.part_partno, t2.quantity FROM Order t0 LEFT OUTER JOIN OrderItem t1 ON t0.oid = t1.order_oid LEFT OUTER JOIN OrderItem t2 ON t0.oid = t2.order_oid ORDER BY t2.order_oid ASC


Case 3: why not generating 1 SQL ?
==================================================================================================
Order.lineitmes(EAGER):
OrderItem.order(EAGER):

Executing query: select o from Order o
2343 demo TRACE [main] openjpa.jdbc.SQL - <t 1094730048, conn 1318342292> executing prepstmnt 384833264 SELECT t0.oid, t0.version, t0.amount, t0.customer_countryCode, t0.customer_id, t0.delivered, t0.shipaddr FROM Order t0
2383 demo TRACE [main] openjpa.jdbc.SQL - <t 1094730048, conn 1318342292> [40 ms] spent
2393 demo TRACE [main] openjpa.jdbc.SQL - <t 1094730048, conn 1318342292> executing prepstmnt 1722705582
SELECT t0.oid, t1.lid, t1.version, t1.cost, t2.oid, t2.version, t2.amount, t2.customer_countryCode, t2.customer_id, t2.delivered, t2.shipaddr, t1.part_partno, t1.quantity FROM Order t0 INNER JOIN OrderItem t1 ON t0.oid = t1.order_oid LEFT OUTER JOIN Order t2 ON t1.order_oid = t2.oid ORDER BY t0.oid ASC
2393 demo TRACE [main] openjpa.jdbc.SQL - <t 1094730048, conn 1318342292> [0 ms] spent
3134 demo TRACE [main] openjpa.jdbc.SQL - <t 1094730048, conn 1318342292> executing prepstmnt 950548648
SELECT t0.lid, t0.version, t0.cost, t1.oid, t1.version, t1.amount, t1.customer_countryCode, t1.customer_id, t1.delivered, t1.shipaddr, t0.part_partno, t0.quantity FROM OrderItem t0 LEFT OUTER JOIN Order t1 ON t0.order_oid = t1.oid WHERE t0.order_oid = ? [params=(int) 88]
3134 demo TRACE [main] openjpa.jdbc.SQL - <t 1094730048, conn 1318342292> [0 ms] spent


Case 4: duplicate selections and redundant joins
==================================================================================================
Order.lineitmes(EAGER):
OrderItem.order(EAGER):

Executing query: select o from Order o left join fetch o.lineitems
2273 demo TRACE [main] openjpa.jdbc.SQL - <t 1094730048, conn 1307463150> executing prepstmnt 1565154634
SELECT t0.oid, t0.version, t0.amount, t0.customer_countryCode, t0.customer_id, t0.delivered, t0.shipaddr, t2.order_oid, t2.lid, t2.version, t2.cost, t3.oid, t3.version, t3.amount, t3.customer_countryCode, t3.customer_id, t3.delivered, t3.shipaddr, t2.part_partno, t2.quantity FROM Order t0 LEFT OUTER JOIN OrderItem t1 ON t0.oid = t1.order_oid LEFT OUTER JOIN OrderItem t2 ON t0.oid = t2.order_oid LEFT OUTER JOIN Order t3 ON t2.order_oid = t3.oid ORDER BY t2.order_oid ASC

 All   Comments   Work Log   Change History   Subversion Commits      Sort Order: Ascending order - Click to sort in descending order
John Stecher added a comment - 26/Feb/07 10:43 PM
Just curious if there has been any movement on this JIRA. This problem actually presents itself in the test bucket that IBM has been running trying to work through the performance issues of OpenJPA and puts the OpenJPA runtime behind the competitive implementations in the scenarios above. I am interested in getting some feedback from those that understand the code a little more about this bug and getting a possible fix generated for it.

In my mind there is no reason that the the Orderlines->Orders Eager,Eager relationship should ever issues a 2 join query. It should be a single left outer join with the eager back pointer filled by the initial orderline query.

Thoughts?

Patrick Linskey added a comment - 26/Feb/07 10:54 PM
Could you attach a test case that demonstrates this problem?

Kevin Sutter added a comment - 27/Feb/07 01:48 AM
The following conversation has also been recorded in the dev mailing list concerning this OPENJPA-134 Issue:

http://www.nabble.com/extraneous-joins-OPENJPA-134-tf3230933.html#a8977476

John Stecher added a comment - 27/Feb/07 05:17 PM
I think the links to the mailing list post that Kevin has shows a good example of the code causing the problem. In reality any one to many relationship with a back pointer generates two joins when only one is necessary to pull in all the information needed for the query.

Dave's example in the mailing list should be something you can run pretty quick and see the problem. Catalina's original JIRA entry contains a pretty good example of the SQL.

Dan Galvin added a comment - 07/Mar/07 09:45 PM
We find that this case also occurs on the getReference() call as well, I assume because associated fetch is eager?

John Stecher added a comment - 29/Mar/07 04:10 PM
Any update on where we stand with getting this issue addressed?

Abe White added a comment - 11/Apr/07 04:26 PM
Fixed the most egregious issue, which was the cyclic fetching of eager bidirectional relations. Changed to cut off SELECTs when we're traversing the back-ptr to the owning side of a relation we've already fetched. I'm not convinced all the other issues mentioned are bugs given the eager fetch settings used. Please open new JIRAs for any individual issues that you believe remain. Fixed in revision 527565.

Rob Wisniewski added a comment - 12/Apr/07 10:15 PM
I did some performance testing and things look good here. My 'one to many eager' scenario, which would be affected the most by this looping query behavior, improved in performance by 6%, and verified by sql dumps.

I did notice that my lazy scenarios are almost 50% slower now, but looking at the sql dumps it appears that we were fetching eagerly even in those scenarios and this (or another JIRA?) seems to have fixed that functional error. Does that seem like something your changes would resolve? Did you run into anything that would cause an eager fetch even when lazy was specified?

Rob Wisniewski added a comment - 12/Apr/07 10:19 PM
To illustrate the 'eager despite lazy' behavior... here's the before and after sql:

svn 04/03/2007:

executed once:

SELECT t0.HOLDINGID, t1.ACCOUNTID, t1.PROFILE_USERID, t1.BALANCE, t1.CREATIONDATE, t1.LASTLOGIN, t1.LOGINCOUNT, t1.LOGOUTCOUNT, t1.OPENBALANCE, t0.PURCHASEDATE, t0.PURCHASEPRICE, t0.QUANTI
TY, t2.SYMBOL, t2.CHANGE1, t2.COMPANYNAME, t2.HIGH, t2.LOW, t2.OPEN1, t2.PRICE, t2.VOLUME FROM HOLDINGEJB t0 LEFT OUTER JOIN ACCOUNTEJB t1 ON t0.ACCOUNT_ACCOUNTID = t1.ACCOUNTID LEFT OUTER JOIN QUOTEEJB t2 ON t0.QUOTE_SYMBOL =
 t2.SYMBOL WHERE t0.ACCOUNT_ACCOUNTID = ?

executed once:

SELECT t0.PROFILE_USERID, t0.BALANCE, t0.CREATIONDATE, t0.LASTLOGIN, t0.LOGINCOUNT, t0.LOGOUTCOUNT, t0.OPENBALANCE FROM ACCOUNTEJB t0 WHERE t0.ACCOUNTID = ? optimize for 1 row

CORRECT svn 04/12/2007:

executed once:

SELECT t0.PROFILE_USERID, t0.BALANCE, t0.CREATIONDATE, t0.LASTLOGIN, t0.LOGINCOUNT, t0.LOGOUTCOUNT, t0.OPENBALANCE FROM ACCOUNTEJB t0 WHERE t0.ACCOUNTID = ? FOR READ ONLY optimize for 1
row

executed 4 times:

SELECT t1.ACCOUNTID, t1.PROFILE_USERID, t1.BALANCE, t1.CREATIONDATE, t1.LASTLOGIN, t1.LOGINCOUNT, t1.LOGOUTCOUNT, t1.OPENBALANCE FROM HOLDINGEJB t0 INNER JOIN ACCOUNTEJB t1 ON t0.ACCOUNT_A
CCOUNTID = t1.ACCOUNTID WHERE t0.HOLDINGID = ? FOR READ ONLY optimize for 1 row

Fay Wang added a comment - 21/Jul/08 02:40 AM
I have tested the following scenario:

@Entity
public class Customer {
..
   @OneToMany(fetch=FetchType.EAGER, mappedBy="customer")
    private Collection<Order> orders = new ArrayList<Order>();
...
}

@Entity
public class Order {
@Id
@GeneratedValue(strategy=GenerationType.IDENTITY)
int oid;

double amount;
boolean delivered;

@ManyToOne (fetch=FetchType.EAGER)
Customer customer;

...

}

A JPQL :
   Select c from Customer c

generates the following SQL:

(1) SELECT t0.countryCode, t0.id, t0.version, t0.city, t0.state, t0.street, t0.zip
      FROM PdqCustomer t0

(2) SELECT t0.countryCode, t0.id, t1.oid, t1.version, t1.amount, t1.delivered
      FROM PdqCustomer t0
      INNER JOIN PdqOrder t1 ON t0.countryCode = t1.CUSTOMER_COUNTRYCODE AND
                             t0.id = t1.CUSTOMER_ID ORDER BY t0.countryCode ASC, t0.id ASC

(3) SELECT t1.countryCode, t1.id, t1.version, t1.city, t1.state, t1.street, t1.zip
      FROM PdqOrder t0
      INNER JOIN PdqCustomer t1 ON t0.CUSTOMER_COUNTRYCODE = t1.countryCode
                             AND t0.CUSTOMER_ID = t1.id WHERE t0.oid = ? optimize for 1 row

(4) SELECT t1.countryCode, t1.id, t1.version, t1.city, t1.state, t1.street, t1.zip
      FROM PdqOrder t0
      INNER JOIN PdqCustomer t1 ON t0.CUSTOMER_COUNTRYCODE = t1.countryCode
                              AND t0.CUSTOMER_ID = t1.id WHERE t0.oid = ? optimize for 1 row

(5) SELECT t1.countryCode, t1.id, t1.version, t1.city, t1.state, t1.street, t1.zip
      FROM PdqOrder t0
      INNER JOIN PdqCustomer t1 ON t0.CUSTOMER_COUNTRYCODE = t1.countryCode
                             AND t0.CUSTOMER_ID = t1.id WHERE t0.oid = ? optimize for 1 row

(6) SELECT t1.countryCode, t1.id, t1.version, t1.city, t1.state, t1.street, t1.zip
      FROM PdqOrder t0
      INNER JOIN PdqCustomer t1 ON t0.CUSTOMER_COUNTRYCODE = t1.countryCode
                              AND t0.CUSTOMER_ID = t1.id WHERE t0.oid = ? optimize for 1 row

Note that the last four sql are the same ((3) -(6)). This is because there are four orders for the customer and each loading of order will load its inverse relationship with customer.

This case is most similar to the case (3) described by Catalina. The attached patch detects the inverse relationship and get rid of sql (3) to (6).





Fay Wang added a comment - 31/Jul/08 10:10 PM
Scenario (2) could no longer be reproduced with r680566:

@Entity
public class Order {
@Id
@GeneratedValue(strategy=GenerationType.IDENTITY)
int oid;

double amount;
boolean delivered;

@ManyToOne (fetch=FetchType.LAZY)
Customer customer;

@OneToMany (fetch=FetchType.EAGER , mappedBy="order")
Collection<OrderItem> lineitems = new ArrayList<OrderItem>();
...
}

@Entity

public class OrderItem {
@Id
@GeneratedValue(strategy=GenerationType.IDENTITY)
int lid;

int quantity;
double cost;

@ManyToOne (fetch=FetchType.LAZY)
Order order;
...
}

JPQL:
--------
select o from Order o left join fetch o.lineitems

Generated SQL:
----------------------
SELECT t0.oid, t0.version, t0.amount, t0.CUSTOMER_COUNTRYCODE, t0.CUSTOMER_ID,
t0.delivered, t1.ORDER_OID, t1.lid, t1.version, t1.cost, t1.PART_PARTNO, t1.quantity
FROM Order t0 LEFT OUTER JOIN OrderItem t1 ON t0.oid = t1.ORDER_OID ORDER BY t1.ORDER_OID ASC

Note that the unneeded LEFT OUTER JOIN is gone .