  1. Hive
  2. HIVE-12877

Hive queries that use an index lose data if the queried file is compressed

Details

    • Type: Bug
    • Status: Patch Available
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 1.2.1
    • Fix Version/s: None
    • Component/s: Indexing
    • Labels: None
    • Environment: This problem exists in all Hive versions, regardless of platform.

    Description

      When a file is compressed, Hive builds the index using the extracted (uncompressed) file length.
      However, when Hive divides the data into splits for MapReduce, it compares the actual file length with the extracted file length recorded in the index.
      If the two lengths do not match, Hive filters out the file, so the query loses some data.
      I modified the source code so that the Hive index can be used when the files are compressed. Please test it.
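
      The length mismatch described above can be illustrated with a minimal Python sketch. This is not Hive's code: the function names build_index_entry and keep_file_for_split and the indexed_length field are hypothetical, and gzip stands in for whatever codec the table uses. It only shows why a strict length comparison drops a compressed file.

```python
import gzip


def build_index_entry(raw: bytes) -> dict:
    """Hypothetical index entry: records the extracted (uncompressed) length."""
    return {"indexed_length": len(raw)}


def keep_file_for_split(on_disk_length: int, index_entry: dict) -> bool:
    """Hypothetical split-time check: keep the file only if the lengths match."""
    return on_disk_length == index_entry["indexed_length"]


# Sample data, assumed for illustration only.
raw = b"key\tvalue\n" * 1000
compressed = gzip.compress(raw)

# The index is built from the uncompressed content in both cases.
entry = build_index_entry(raw)

# An uncompressed file has the same length on disk as in the index.
plain_kept = keep_file_for_split(len(raw), entry)

# A compressed file is shorter on disk, so the check rejects it.
gz_kept = keep_file_for_split(len(compressed), entry)

print(plain_kept, gz_kept)
```

      In this sketch the uncompressed file passes the check and the gzip file is rejected, which matches the data loss reported here. Any fix has to compare like with like, for example by not applying the length filter to compressed files; the attached patches should be consulted for the approach actually taken.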

      Attachments

        1. 19-index_compressed_file.gz
          14.41 MB
          yangfang
        2. HIVE-12877.1.patch
          2 kB
          yangfang
        3. HIVE-12877.patch
          1 kB
          yangfang
        4. index_query_compressed_file_failure.q
          2 kB
          yangfang

          People

            Assignee: Unassigned
            Reporter: yangfang
            Votes: 0
            Watchers: 1

            Dates

              Created:
              Updated: