Hive / HIVE-1950

Block merge for RCFile


Details

    • Type: New Feature
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.8.0
    • Component/s: None
    • Labels: None
    • Hadoop Flags: Reviewed

    Description

      In our environment, there are many small files inside a single partition or table. To reduce the NameNode load, we run a dedicated housekeeping job that merges these files. Right now the merge is an 'insert overwrite' in Hive, which requires decompressing the data and then recompressing it. This JIRA adds a command in Hive that does the merge without decompressing and recompressing the data.

      The syntax would be something like "alter table tbl_name [partition ()] concatenate". In this JIRA the new command will support only RCFile, since it needs some new APIs in the file format.
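
      To illustrate the idea of merging without decompressing and recompressing, here is a minimal Python sketch. It is not RCFile and not the code in the attached patches: the file layout (a `TOY1` header followed by length-prefixed zlib blocks) and all function names are hypothetical, chosen only to show that a merge can copy compressed blocks byte for byte when the format exposes a way to read raw blocks.

```python
import io
import struct
import zlib

MAGIC = b"TOY1"  # hypothetical file header, written once per file


def write_file(records_per_block):
    """Build a toy file: header, then length-prefixed compressed blocks."""
    out = io.BytesIO()
    out.write(MAGIC)
    for records in records_per_block:
        payload = zlib.compress("\n".join(records).encode("utf-8"))
        out.write(struct.pack(">I", len(payload)))
        out.write(payload)
    return out.getvalue()


def iter_raw_blocks(data):
    """Yield each block's compressed bytes without decompressing them."""
    if data[:4] != MAGIC:
        raise ValueError("not a toy block file")
    pos = 4
    while pos < len(data):
        (size,) = struct.unpack_from(">I", data, pos)
        pos += 4
        yield data[pos:pos + size]
        pos += size


def concatenate(files):
    """Merge files by copying compressed blocks as-is under one header."""
    out = io.BytesIO()
    out.write(MAGIC)
    for data in files:
        for block in iter_raw_blocks(data):
            out.write(struct.pack(">I", len(block)))
            out.write(block)
    return out.getvalue()


def read_records(data):
    """Decompress every block and return all records in order."""
    records = []
    for block in iter_raw_blocks(data):
        records.extend(zlib.decompress(block).decode("utf-8").split("\n"))
    return records


small_files = [write_file([["a", "b"]]), write_file([["c"], ["d", "e"]])]
merged = concatenate(small_files)
print(read_records(merged))  # ['a', 'b', 'c', 'd', 'e']
```

      In this sketch `concatenate` never calls `zlib`, which is the point: only the per-file header is rewritten, while the compressed payloads are copied unchanged. The raw-block reader (`iter_raw_blocks`) plays the role of the new file format API the description says is needed.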

      Attachments

        1. HIVE-1950.6.patch
          165 kB
          He Yongqiang
        2. HIVE-1950.5.patch
          165 kB
          He Yongqiang
        3. HIVE-1950.4.patch
          162 kB
          He Yongqiang
        4. HIVE-1950.3.patch
          159 kB
          He Yongqiang
        5. HIVE-1950.2.patch
          126 kB
          He Yongqiang
        6. HIVE-1950.1.patch
          113 kB
          He Yongqiang

            People

              Assignee: He Yongqiang
              Reporter: He Yongqiang
              Votes: 0
              Watchers: 5
