Hive / HIVE-15676

Remove Bloom filters from semi join reduction if they are too big.


Details

    • Type: Improvement
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved

    Description

      Bloom filters can themselves become very large when the row count is high, and aggregating them in reducers can be even more expensive. For example, a Bloom filter for 100M rows can be as large as 170MB; aggregating 100 such filters in a reducer could take up to 17GB of memory.
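The size concern can be illustrated with a rough sketch. It uses the textbook sizing formula for an optimally sized Bloom filter, m = -n ln(p) / (ln 2)^2 bits, which is not necessarily how Hive sizes its filters, so it will not reproduce the 170MB figure exactly (that depends on the false-positive probability and the implementation). The function names and the `max_bytes` threshold are hypothetical and do not correspond to any Hive API or configuration property.

```python
import math


def bloom_filter_bytes(expected_entries: int, fpp: float) -> int:
    """Estimate the size in bytes of an optimally sized Bloom filter.

    Uses the standard formula m = -n * ln(p) / (ln 2)^2 bits.
    This is an assumption for illustration, not Hive's sizing logic.
    """
    if expected_entries <= 0:
        raise ValueError("expected_entries must be positive")
    if not 0.0 < fpp < 1.0:
        raise ValueError("fpp must be strictly between 0 and 1")
    bits = math.ceil(-expected_entries * math.log(fpp) / (math.log(2) ** 2))
    return (bits + 7) // 8  # round up to whole bytes


def aggregation_bytes(filter_bytes: int, num_filters: int) -> int:
    """Worst-case memory if a reducer holds all partial filters at once."""
    return filter_bytes * num_filters


def should_drop_filter(expected_entries: int, fpp: float, max_bytes: int) -> bool:
    """Hypothetical check: skip semi join reduction when the filter is too big."""
    return bloom_filter_bytes(expected_entries, fpp) > max_bytes


if __name__ == "__main__":
    # The figures quoted in the description: 100 filters of 170MB each.
    print(aggregation_bytes(170_000_000, 100))  # 17_000_000_000 bytes, i.e. 17GB
    # Estimated size for 100M rows at an assumed 5% false-positive probability.
    print(bloom_filter_bytes(100_000_000, 0.05))
```

A planner-side check along the lines of `should_drop_filter` would let the optimizer fall back to a plan without semi join reduction before any filter is built, rather than discovering the memory cost in the reducer.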


          People

            Assignee: djaiswal Deepak Jaiswal
            Reporter: djaiswal Deepak Jaiswal
            Votes: 0
            Watchers: 1
