  1. Hive
  2. HIVE-1635

Ability to check partition size for dynamic partitions


Details

    • Type: New Feature
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: Query Processor
    • Labels: None

    Description

      With dynamic partitions, it becomes very easy to create partitions.

      We have seen scenarios where corrupt data causes a lot of partitions and files to be created: a single
      corrupt row can end up creating a partition and a lot of files (the number of mappers, if merge is false).

      This puts a lot of load on the cluster, and is a debugging nightmare.

      It would be good to have a configuration parameter for the minimum number of rows in a partition.
      If a partition has fewer rows than the threshold, it need not be created. The default value
      of this parameter can be zero for backward compatibility.
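
      A minimal sketch of the proposed check, written in plain Python rather than against Hive itself. The
      function name `partitions_to_create` and the `min_rows` argument are hypothetical stand-ins for the
      requested configuration parameter; no such Hive setting is implied to exist. The sketch only counts
      rows per dynamic partition value and keeps the partitions that meet the threshold.

```python
from collections import Counter


def partitions_to_create(rows, partition_key, min_rows=0):
    """Return {partition value: row count} for partitions meeting the threshold.

    min_rows is a hypothetical stand-in for the proposed configuration
    parameter. The default of 0 keeps every partition, which matches the
    backward-compatible behavior described in the issue.
    """
    counts = Counter(row[partition_key] for row in rows)
    # A partition is skipped only when it has fewer rows than the threshold.
    return {value: n for value, n in counts.items() if n >= min_rows}


# Illustrative data: three valid rows and one row with a corrupt partition value.
rows = [
    {"id": 1, "ds": "2010-09-01"},
    {"id": 2, "ds": "2010-09-01"},
    {"id": 3, "ds": "2010-09-01"},
    {"id": 4, "ds": "20x0-09-01"},  # corrupt value would create its own partition
]

kept_default = partitions_to_create(rows, "ds")               # threshold 0: keep all
kept_filtered = partitions_to_create(rows, "ds", min_rows=2)  # drop tiny partitions
```

      With the default threshold the corrupt row still produces its own partition; with `min_rows=2` only the
      `2010-09-01` partition remains. How the skipped rows should be handled (dropped, logged, or failed on)
      is not specified in this issue and is left open here.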

      Attachments

        Activity

          People

            Assignee: Ning Zhang (nzhang)
            Reporter: Namit Jain (namit)
            Votes: 0
            Watchers: 0

            Dates

              Created:
              Updated: