Uploaded image for project: 'Samza'
  1. Samza
  2. SAMZA-2554

SQL Join Filter Handling is Broken when Table is remote and the condition has more than one expression

    XMLWordPrintableJSON

Details

    • Bug
    • Status: Open
    • Major
    • Resolution: Unresolved
    • None
    • None
    • sql
    • None

    Description

      The current implementation of remote table join can not handle more than one condition.

      Currently this will return wrong results silently by ignoring the filter expression.

      As a first step we need to fail such query to start with and then add the support of 2 or more filter expression as part of the join condition.

      Here is the test case org.apache.samza.test.samzasql.TestSamzaSqlRemoteTable#testSourceEndToEndWithKeyWithNullForeignKeysRightOuterJoin

      String sql =
              "Insert into testavro.enrichedPageViewTopic "
                  + "select pv.pageKey as __key__, pv.pageKey as pageKey, coalesce(null, 'N/A') as companyName,"
                  + "       p.name as profileName, p.address as profileAddress "
                  + " from  "
                  + " testavro.PAGEVIEW as pv left outer join testRemoteStore.Profile.`$table` as p"
                  + " on p.__key__ = pv.profileId and p.__key__ = pv.profileId + 1 and  p.__key__ = pv.profileId - 1 ";
      
       String sql =
              "Insert into testavro.enrichedPageViewTopic "
                  + "select pv.pageKey as __key__, pv.pageKey as pageKey, coalesce(null, 'N/A') as companyName,"
                  + "       p.name as profileName, p.address as profileAddress "
                  + " from  "
                  + " testavro.PAGEVIEW as pv left outer join testRemoteStore.Profile.`$table` as p"
                  + " on p.__key__ = pv.profileId  ";
      
      

      Both queries will return the same result and that is wrong

      Attachments

        Activity

          People

            Unassigned Unassigned
            bslim Slim Bouguerra
            Votes:
            0 Vote for this issue
            Watchers:
            1 Start watching this issue

            Dates

              Created:
              Updated:

              Time Tracking

                Estimated:
                Original Estimate - Not Specified
                Not Specified
                Remaining:
                Remaining Estimate - 0h
                0h
                Logged:
                Time Spent - 40m
                40m