  1. Hive
  2. HIVE-24016

Share bloom filter construction branch in multi column semijoin reducers


Details

    • Type: Improvement
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved

    Description

      HIVE-21196 added a transformation that merges multiple single-column semijoin reducers into a single multi-column semijoin reducer.

      Currently, it transforms the subplan SB0 into the subplan SB1.

      SB0

                                            / RS -> TS_1[Editor] 
               / SEL[fname] - GB - RS - GB -  RS -> TS_0[Author] 
       SOURCE 
               \ SEL[lname] - GB - RS - GB -  RS -> TS_0[Author]
                                            \ RS -> TS_1[Editor]
      
      TS_0[Author] - FIL[in_bloom(fname) ^ in_bloom(lname)]
      TS_1[Editor] - FIL[in_bloom(fname) ^ in_bloom(lname)]  
      

      SB1

               / SEL[fname,lname] - GB - RS - GB - RS -> TS[Author] - FIL[in_bloom(hash(fname,lname))]
       SOURCE  
               \ SEL[fname,lname] - GB - RS - GB - RS -> TS[Editor] - FIL[in_bloom(hash(fname,lname))]
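
      The difference between the filters in SB0 and SB1 can be illustrated with a toy model. The sketch below is not Hive code: the BloomFilter class, the helper names, and the sample rows are all hypothetical, and hash(fname,lname) is modeled simply by using the (fname, lname) pair as the filter key. It shows that a pair-keyed filter can reject a combination that two per-column filters accept.

```python
import hashlib


class BloomFilter:
    """Minimal Bloom filter: no false negatives, rare false positives."""

    def __init__(self, num_bits=1 << 16, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(num_bits // 8)

    def _positions(self, key):
        # Derive num_hashes bit positions from seeded SHA-256 digests of the key.
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(f"{seed}:{key!r}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, key):
        for pos in self._positions(key):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, key):
        return all(self.bits[pos // 8] & (1 << (pos % 8))
                   for pos in self._positions(key))


def build_single_column_filters(rows):
    """SB0 style (assumed model): one filter per column."""
    fname_bf, lname_bf = BloomFilter(), BloomFilter()
    for fname, lname in rows:
        fname_bf.add(fname)
        lname_bf.add(lname)
    return fname_bf, lname_bf


def build_multi_column_filter(rows):
    """SB1 style (assumed model): one filter keyed on the (fname, lname) pair."""
    pair_bf = BloomFilter()
    for fname, lname in rows:
        pair_bf.add((fname, lname))
    return pair_bf


if __name__ == "__main__":
    source = [("Ada", "Lovelace"), ("Alan", "Turing")]  # made-up SOURCE rows
    fname_bf, lname_bf = build_single_column_filters(source)
    pair_bf = build_multi_column_filter(source)
    probe = ("Ada", "Turing")  # each value exists in SOURCE, the pair does not
    print(probe[0] in fname_bf and probe[1] in lname_bf)  # True
    print(probe in pair_bf)  # False, barring a very unlikely false positive
```

      In this model the same pair-keyed filter is built once per target table in SB1, which is the redundancy the rest of the description addresses.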
      

      Observe that in SB1 both branches contain an identical path that builds the bloom filter (SEL - GB - RS - GB). Sharing this common path yields a plan like SB2.

      SB2

                                                 / RS -> TS[Author] - FIL[in_bloom(hash(fname,lname))]
       SOURCE - SEL[fname,lname] - GB - RS - GB -
                                                 \ RS -> TS[Editor] - FIL[in_bloom(hash(fname,lname))]
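
      The SB1 to SB2 rewrite can be sketched on a toy operator tree. This is only an illustration under simplifying assumptions, not Hive's optimizer code: the Op class and all helper names are hypothetical, and two operators are treated as equivalent whenever their labels match, whereas a real implementation would have to compare full operator configurations. The final RS of each branch carries its target table in the label, so those operators stay separate.

```python
from dataclasses import dataclass, field


@dataclass
class Op:
    """A toy plan operator: a label plus its downstream operators."""
    label: str
    children: list = field(default_factory=list)


def chain(*labels):
    """Build a linear pipeline from labels and return its first operator."""
    head = tail = Op(labels[0])
    for label in labels[1:]:
        nxt = Op(label)
        tail.children.append(nxt)
        tail = nxt
    return head


def build_sb1():
    """SB1: two identical bloom-filter branches, one per target table."""
    branch = ["SEL[fname,lname]", "GB", "RS", "GB"]
    return Op("SOURCE", [
        chain(*branch, "RS -> TS[Author]"),
        chain(*branch, "RS -> TS[Editor]"),
    ])


def share_common_prefix(op):
    """Merge sibling operators with equal labels, top-down, in place."""
    merged = {}
    for child in op.children:
        if child.label in merged:
            # Same operator already kept: adopt this duplicate's children.
            merged[child.label].children.extend(child.children)
        else:
            merged[child.label] = child
    op.children = list(merged.values())
    for child in op.children:
        share_common_prefix(child)
    return op


def count_ops(op):
    """Count the operators in the tree rooted at op."""
    return 1 + sum(count_ops(c) for c in op.children)


def render(op):
    """Return the plan as nested tuples for easy comparison."""
    return (op.label, [render(c) for c in op.children])


if __name__ == "__main__":
    plan = build_sb1()
    before = count_ops(plan)
    after = count_ops(share_common_prefix(plan))
    print(before, "->", after)  # 11 -> 7
```

      After the rewrite the tree has a single SEL - GB - RS - GB path under SOURCE, followed by one RS per target table, which matches the shape of SB2.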
      

            People

              Assignee: Stamatis Zampetakis (zabetak)
              Reporter: Stamatis Zampetakis (zabetak)