Hive / HIVE-7685

Parquet memory manager


Details

    Description

      Similar to HIVE-4248, Parquet tries to write very large "row groups". This causes Hive to run out of memory during dynamic partition writes, when a reducer may have many Parquet files open at the same time.

      As such, we should implement a memory manager which ensures that we don't run out of memory due to writing too many row groups within a single JVM.
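
One way such a manager could work is sketched below. This is not the code from the attached patches; the class name `RowGroupMemoryManager`, its methods, and the proportional-scaling policy are all assumptions made for illustration. The idea is that every open writer registers the row-group size it would like to buffer, and when the sum of those requests exceeds a fixed per-JVM budget, each writer's allowance is scaled down by the same factor.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/**
 * Hypothetical sketch: shares one memory budget among all open writers in a JVM.
 * Names and policy are illustrative assumptions, not the HIVE-7685 patch.
 */
class RowGroupMemoryManager {
    // Total bytes that all row-group buffers together may use.
    private final long poolBytes;
    // Writer id -> row-group size the writer asked for.
    private final Map<String, Long> requested = new LinkedHashMap<>();

    RowGroupMemoryManager(long poolBytes) {
        if (poolBytes <= 0) {
            throw new IllegalArgumentException("poolBytes must be positive");
        }
        this.poolBytes = poolBytes;
    }

    /** Registers a writer, e.g. when a reducer opens a file for a new partition. */
    synchronized void addWriter(String writerId, long requestedRowGroupBytes) {
        if (requestedRowGroupBytes <= 0) {
            throw new IllegalArgumentException("requested size must be positive");
        }
        requested.put(writerId, requestedRowGroupBytes);
    }

    /** Unregisters a writer once its file is closed, freeing its share. */
    synchronized void removeWriter(String writerId) {
        requested.remove(writerId);
    }

    /**
     * Returns how many bytes the writer may buffer before flushing a row group.
     * If all requests fit in the pool, the writer gets what it asked for;
     * otherwise every request is scaled by poolBytes / totalRequested.
     */
    synchronized long allowedRowGroupBytes(String writerId) {
        Long req = requested.get(writerId);
        if (req == null) {
            throw new IllegalStateException("unknown writer: " + writerId);
        }
        long total = 0;
        for (long r : requested.values()) {
            total += r;
        }
        if (total <= poolBytes) {
            return req;
        }
        double scale = (double) poolBytes / total;
        return (long) (req * scale);
    }
}
```

In this sketch a writer would call `allowedRowGroupBytes` whenever it checks whether to flush, so its threshold shrinks as more partitions are opened and grows again as files are closed. The budget itself could be derived from the heap size, for example as a fraction of `Runtime.getRuntime().maxMemory()`; that choice is likewise an assumption.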

      Attachments

        1. HIVE-7685.1.patch
          1 kB
          Dong Chen
        2. HIVE-7685.patch
          2 kB
          Dong Chen
        3. HIVE-7685.1.patch.ready
          3 kB
          Dong Chen
        4. HIVE-7685.patch.ready
          3 kB
          Dong Chen


          People

            dongc Dong Chen
            brocknoland Brock Noland
            Votes:
            0
            Watchers:
            12

