Hive / HIVE-3593

Output files of SMB join grow indefinitely

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Not A Problem
    • Affects Version/s: 0.10.0
    • Fix Version/s: None
    • Component/s: Query Processor
    • Labels: None

    Description

      The output files of an SMB (sort-merge-bucket) join are prefixed with the partition spec of the big table that was used to create them. The bucket-number portion of each file name is padded to the same length as the task ID. Since the task ID is the name of the file, if the output of one SMB join is used as the big table of another SMB join, the output file names grow by the length of the original partition spec. Compound this across chained joins and the file name length can grow indefinitely.
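
      The growth described above can be illustrated with a toy model. This is only a sketch, not Hive's actual naming code: the function names, the parenthesized prefix format, the partition spec "ds=2012-10-17" and the starting file name "000000_0" are all assumptions made for illustration. It shows that if each join prefixes the previous file name with the partition spec, the name length increases by a fixed amount per chained join.

```python
# Toy model of the reported behavior; NOT Hive's real implementation.
# All names and the prefix format below are hypothetical.

def smb_output_name(partition_spec: str, big_table_file: str) -> str:
    """Prefix the big table's file name (treated as the task ID) with the partition spec."""
    return f"({partition_spec}){big_table_file}"


def chain_joins(partition_spec: str, initial_file: str, rounds: int) -> list[str]:
    """Feed each join's output file back in as the big table of the next join."""
    names = [initial_file]
    for _ in range(rounds):
        names.append(smb_output_name(partition_spec, names[-1]))
    return names


names = chain_joins("ds=2012-10-17", "000000_0", 3)
lengths = [len(name) for name in names]
# Each chained join adds the same number of characters to the file name.
growth = [b - a for a, b in zip(lengths, lengths[1:])]

if __name__ == "__main__":
    for name, length in zip(names, lengths):
        print(length, name)
```

      Under these assumptions the per-join growth is constant, so the name length is linear in the number of chained joins and has no upper bound.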

            People

              Assignee: Kevin Wilfong
              Reporter: Kevin Wilfong