Hive / HIVE-25521

Data corruption when concatenating files with different compressions in same table/partition


Details

    Description

      Currently, if files with different compressions are in the same directory, concatenate can fail and cause data corruption. This happens because one task can move a file aside as an incompatible file, after which the other tasks working on that file fail.

      This Jira addresses the issue by processing each file in only one task, the one where offset 0 is processed, and ignoring the file in all other tasks.
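
The offset-0 rule described above can be sketched as follows. This is a minimal illustration, not the actual Hive patch: the class `ConcatSplitFilter`, the `FileSplit` stand-in, and the methods `ownsFile` and `filesToHandle` are hypothetical names introduced here, and the sketch assumes each task can see the starting byte offset of the split it was given.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch: decide which task is allowed to touch a file.
class ConcatSplitFilter {

    /** Hypothetical stand-in for an input split: a byte range within one file. */
    static final class FileSplit {
        final String path;
        final long start;
        final long length;

        FileSplit(String path, long start, long length) {
            this.path = path;
            this.start = start;
            this.length = length;
        }
    }

    /** A split owns its file only if it begins at byte offset 0. */
    static boolean ownsFile(FileSplit split) {
        return split.start == 0L;
    }

    /** Returns the paths whose offset-0 split is in the given list; other splits are ignored. */
    static List<String> filesToHandle(List<FileSplit> splits) {
        List<String> owned = new ArrayList<>();
        for (FileSplit split : splits) {
            if (ownsFile(split)) {
                owned.add(split.path);
            }
        }
        return owned;
    }

    public static void main(String[] args) {
        // part-00000 is split in two; only its first split (offset 0) owns the file.
        List<FileSplit> splits = Arrays.asList(
                new FileSplit("part-00000", 0L, 100L),
                new FileSplit("part-00000", 100L, 50L),
                new FileSplit("part-00001", 0L, 10L));
        System.out.println(filesToHandle(splits));
    }
}
```

Because exactly one split of any file starts at offset 0, at most one task ever moves or merges a given file, so no other task can find the file missing partway through.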


            People

              Assignee: Harish JP (harishjp)
              Reporter: Harish JP (harishjp)
              Votes: 0
              Watchers: 2


                Time Tracking

                  Estimated: Not Specified
                  Remaining: 0h
                  Logged: 1h 10m