Uploaded image for project: 'Hive'
  1. Hive
  2. HIVE-18362

Introduce a parameter to control the max row number for map join convertion

    XMLWordPrintableJSON

Details

    • Bug
    • Status: Patch Available
    • Minor
    • Resolution: Unresolved
    • None
    • None
    • Query Processor
    • None

    Description

      The compression ratio of the Orc compressed file will be very high in some cases.

      The test table has three Int columns, with twelve million records, but the compressed file size is only 4M. Hive will automatically converts the Join to Map join, but this will cause memory overflow. So I think it is better to have a parameter to limit to the total number of table records in the Map Join convertion, and if the total number of records is larger than that, it can not be converted to Map join.

      hive.auto.convert.join.max.number = 2500000L

      The default value for this parameter is 2500000, because so many records occupy about 700M memory in clint JVM, and 2500000 records for Map Join are also large tables.

      Attachments

        Activity

          People

            gopalv Gopal Vijayaraghavan
            wankunde wan kun
            Votes:
            0 Vote for this issue
            Watchers:
            3 Start watching this issue

            Dates

              Created:
              Updated: