Hive / HIVE-1721

Use bloom filters to improve the performance of joins

    Details

      Description

      In a map-join, most rows of the big table are likely to have no matching row in the small table.
      Currently, we perform a hash-map lookup for every row in the big table, which can be expensive.

      It might be useful to try a Bloom filter containing all the join keys of the small table.
      Each key from the big table is first checked against the Bloom filter, and the small-table hash table
      is probed only on a positive match. A Bloom filter has no false negatives, so skipping the probe on a
      negative answer never drops a matching row.
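      The proposal above could be sketched roughly as follows. This is a minimal, self-contained illustration and not the code in the attached patch: the class and method names (BloomMapJoinSketch, BloomFilter, JoinResult, join), the integer keys, and the sizing of about 10 bits per key with 3 hash functions are all assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.BitSet;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch; none of these names come from Hive or the attached patch.
class BloomMapJoinSketch {

    /** Minimal Bloom filter; sizing and hashing are illustrative, not tuned. */
    static final class BloomFilter {
        private final BitSet bits;
        private final int numBits;
        private final int numHashes;

        BloomFilter(int numBits, int numHashes) {
            this.bits = new BitSet(numBits);
            this.numBits = numBits;
            this.numHashes = numHashes;
        }

        // Double hashing: the i-th position is h1 + i * h2, reduced modulo numBits.
        private int position(Object key, int i) {
            int h1 = key.hashCode() * 0x9E3779B9;
            int h2 = Integer.rotateLeft(h1, 15) | 1;
            return Math.floorMod(h1 + i * h2, numBits);
        }

        void add(Object key) {
            for (int i = 0; i < numHashes; i++) {
                bits.set(position(key, i));
            }
        }

        // False means "definitely absent"; true means "possibly present".
        boolean mightContain(Object key) {
            for (int i = 0; i < numHashes; i++) {
                if (!bits.get(position(key, i))) {
                    return false;
                }
            }
            return true;
        }
    }

    /** Joined rows plus the number of hash-table probes actually performed. */
    static final class JoinResult {
        final List<String> rows = new ArrayList<>();
        int hashLookups = 0;
    }

    static JoinResult join(Map<Integer, String> smallTable, int[] bigTableKeys) {
        // Assumed sizing: about 10 bits per small-table key and 3 hash functions.
        BloomFilter filter = new BloomFilter(Math.max(64, smallTable.size() * 10), 3);
        for (Integer key : smallTable.keySet()) {
            filter.add(key);
        }

        JoinResult result = new JoinResult();
        for (int key : bigTableKeys) {
            if (!filter.mightContain(key)) {
                continue; // no false negatives, so skipping the probe is safe
            }
            result.hashLookups++;
            String value = smallTable.get(key); // may still miss on a false positive
            if (value != null) {
                result.rows.add(key + ":" + value);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        Map<Integer, String> small = new HashMap<>();
        small.put(1, "a");
        small.put(2, "b");
        JoinResult r = join(small, new int[] {1, 5, 2, 9, 7});
        System.out.println(r.rows + " with " + r.hashLookups + " hash lookups for 5 rows");
    }
}
```

      Whether this pays off depends on the match rate and on how cheap the filter check is relative to the hash-map probe. When most big-table rows do match, the filter only adds work, so any real implementation would likely need to make it optional.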

        Attachments

        1. hive-1721.patch.txt
          13 kB
          Siddhartha Gunda

              People

              • Assignee: Unassigned
              • Reporter: Namit Jain
              • Votes: 5
              • Watchers: 23