Hive / HIVE-6979

Hadoop-2 test failures related to quick stats not being populated correctly

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 0.14.0
    • Fix Version/s: 0.14.0
    • Component/s: None
    • Labels: None

      Description

      The test failures currently reported by Hive QA running on hadoop-2 (https://issues.apache.org/jira/browse/HIVE-6968?focusedCommentId=13980570&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13980570) are related to a difference in the way the Hadoop FileSystem.globStatus() API behaves between versions. Consider a directory structure like the one below:

      dir1/file1
      dir1/file2
      

      A two-level path pattern like dir1/*/* returns both files in hadoop 1.x but returns an empty result in hadoop 2.x (in fact it reports "no such file or directory" and returns an empty file status array). Hadoop 2.x appears to be consistent with Linux behaviour (ls dir1/*/*), whereas hadoop 1.x is not.

      As a result, the fast statistics (NUM_FILES and TOTAL_SIZE) are populated incorrectly, causing diffs in the qfile test outputs between hadoop-1 and hadoop-2.
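
      The effect on the quick stats can be illustrated with a small self-contained sketch. It does not use Hadoop: it relies on the JDK's java.nio glob matcher, whose `*` does not cross directory boundaries, as a stand-in for the Linux-like semantics attributed to hadoop 2.x above. It also assumes the two-level pattern in question has the form dir1/*/*. The class name, helper methods and file sizes are hypothetical and chosen only for illustration; this is not the code from the attached patches.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.FileSystems;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.PathMatcher;
import java.util.stream.Stream;

public class QuickStatsGlobSketch {

    /** Creates a temp root containing dir1/file1 (3 bytes) and dir1/file2 (5 bytes). */
    static Path createSampleLayout() {
        try {
            Path root = Files.createTempDirectory("quickstats");
            Path dir1 = Files.createDirectory(root.resolve("dir1"));
            Files.write(dir1.resolve("file1"), new byte[3]);
            Files.write(dir1.resolve("file2"), new byte[5]);
            return root;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /**
     * Returns {numFiles, totalSize} over the regular files under root whose
     * path relative to root matches the glob (hypothetical stand-in for the
     * NUM_FILES and TOTAL_SIZE quick stats).
     */
    static long[] quickStats(Path root, String glob) {
        PathMatcher matcher = FileSystems.getDefault().getPathMatcher("glob:" + glob);
        long numFiles = 0;
        long totalSize = 0;
        try (Stream<Path> paths = Files.walk(root)) {
            for (Path p : (Iterable<Path>) paths::iterator) {
                if (Files.isRegularFile(p) && matcher.matches(root.relativize(p))) {
                    numFiles++;
                    totalSize += Files.size(p);
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return new long[] {numFiles, totalSize};
    }

    public static void main(String[] args) {
        Path root = createSampleLayout();
        long[] oneLevel = quickStats(root, "dir1/*");
        long[] twoLevel = quickStats(root, "dir1/*/*");
        System.out.println("dir1/*   -> files=" + oneLevel[0] + " bytes=" + oneLevel[1]);
        System.out.println("dir1/*/* -> files=" + twoLevel[0] + " bytes=" + twoLevel[1]);
    }
}
```

      With the sample layout, the one-level pattern counts 2 files and 8 bytes, while the two-level pattern counts 0 files and 0 bytes. Stats gathered with a pattern that is one level too deep therefore come out empty under the stricter semantics, which matches the kind of NUM_FILES and TOTAL_SIZE discrepancy described above.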

        Attachments

        1. HIVE-6979.1.patch
          64 kB
          Prasanth Jayachandran
        2. HIVE-6979.2.patch
          197 kB
          Prasanth Jayachandran
        3. HIVE-6979.3.patch
          199 kB
          Prasanth Jayachandran

              People

              • Assignee: Prasanth Jayachandran (prasanth_j)
              • Reporter: Prasanth Jayachandran (prasanth_j)