  1. Spark
  2. SPARK-37811

Broadcast Join throws HintErrorLogger for joins with multiple tables

Details

    • Type: Bug
    • Status: Open
    • Priority: Minor
    • Resolution: Unresolved
    • Affects Version/s: 3.1.2
    • Fix Version/s: None
    • Component/s: SQL

    Description

      The following query logs HintErrorLogger warnings in v3.1.2:

       

      // Broadcast hint on the three small tables; L4 is the large table.
      val Query = " SELECT /*+ BROADCASTJOIN(L1, L2, L3) */ " +
                    " L1.v1 AS L1V1, " +
                    " L4.* " +
                    " FROM L1 " +
                    " INNER JOIN L2 ON L2.id = L1.id " +
                    " INNER JOIN L3 ON L3.id = L1.id " +
                    " LEFT JOIN L4 ON L4.id = L1.id AND L4.idx = L2.idx AND L4.time BETWEEN L3.time1 AND L3.time2 "
                  

       

      These are the warnings it logs at runtime:

      WARN HintErrorLogger: Count not find relation 'L1' specified in hint 'BROADCASTJOIN(L1,L2,L3)'
      WARN HintErrorLogger: Count not find relation 'L2' specified in hint 'BROADCASTJOIN(L1,L2,L3)'
      WARN HintErrorLogger: Count not find relation 'L3' specified in hint 'BROADCASTJOIN(L1,L2,L3)'

       

      The same query produced no warnings in v2.4.7. I am not sure whether these warnings mean that the three small tables (L1, L2, L3) are not being broadcast when they are left-joined with the bigger table (L4).

      I have set autoBroadcastJoinThreshold = 4G, which is far larger than the combined size of L1, L2, and L3.
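      One way to check whether the small tables are still broadcast despite the warnings is to inspect the physical plan. The following is a minimal sketch, not a diagnosis of the cause. It assumes a local SparkSession and uses tiny made-up stand-ins for L1 to L4 (the real schemas and data are not part of this report); the object name BroadcastHintCheck is hypothetical. It also shows the DataFrame-level broadcast() function as another way to mark a table for broadcast.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.broadcast

// Hypothetical helper object, for illustration only.
object BroadcastHintCheck {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("broadcast-hint-check")
      .master("local[*]") // assumption: local run, not the reporter's cluster
      .getOrCreate()
    import spark.implicits._

    // Threshold from the report (4G), expressed in bytes.
    spark.conf.set("spark.sql.autoBroadcastJoinThreshold",
      (4L * 1024 * 1024 * 1024).toString)

    // Made-up stand-in tables; only the column names come from the query.
    Seq((1, "a")).toDF("id", "v1").createOrReplaceTempView("L1")
    Seq((1, 10)).toDF("id", "idx").createOrReplaceTempView("L2")
    Seq((1, 0L, 100L)).toDF("id", "time1", "time2").createOrReplaceTempView("L3")
    Seq((1, 10, 50L)).toDF("id", "idx", "time").createOrReplaceTempView("L4")

    val query =
      """SELECT /*+ BROADCASTJOIN(L1, L2, L3) */
        |  L1.v1 AS L1V1,
        |  L4.*
        |FROM L1
        |INNER JOIN L2 ON L2.id = L1.id
        |INNER JOIN L3 ON L3.id = L1.id
        |LEFT JOIN L4 ON L4.id = L1.id AND L4.idx = L2.idx
        |  AND L4.time BETWEEN L3.time1 AND L3.time2""".stripMargin

    val df = spark.sql(query)

    // Print the physical plan; look for BroadcastHashJoin / BroadcastExchange nodes.
    df.explain()

    // Coarse programmatic check on the plan text.
    val planText = df.queryExecution.executedPlan.toString
    println(s"Plan mentions BroadcastHashJoin: ${planText.contains("BroadcastHashJoin")}")

    // Alternative: mark a table for broadcast through the DataFrame API
    // instead of a SQL hint comment.
    val joined = spark.table("L1").join(broadcast(spark.table("L2")), "id")
    joined.explain()

    spark.stop()
  }
}
```

      If the plan for the SQL query still shows broadcast joins, the warnings would only indicate that the hint was not matched, with the broadcast coming from the threshold setting instead. With tables this small the threshold alone may trigger a broadcast, so running the same check against the real tables would be more telling.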

       

      Let me know if you need more info.

          People

            Assignee: Unassigned
            Reporter: tsarangi