HIVE-6537: NullPointerException when loading hashtable for MapJoin directly

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.13.0
    • Component/s: None
    • Labels: None

    Description

      We see the following error:

      2014-02-20 23:33:15,743 FATAL [main] org.apache.hadoop.hive.ql.exec.mr.ExecMapper: org.apache.hadoop.hive.ql.metadata.HiveException: java.lang.NullPointerException
              at org.apache.hadoop.hive.ql.exec.mr.HashTableLoader.load(HashTableLoader.java:103)
              at org.apache.hadoop.hive.ql.exec.MapJoinOperator.loadHashTable(MapJoinOperator.java:149)
              at org.apache.hadoop.hive.ql.exec.MapJoinOperator.cleanUpInputFileChangedOp(MapJoinOperator.java:164)
              at org.apache.hadoop.hive.ql.exec.Operator.cleanUpInputFileChanged(Operator.java:1026)
              at org.apache.hadoop.hive.ql.exec.Operator.cleanUpInputFileChanged(Operator.java:1030)
              at org.apache.hadoop.hive.ql.exec.Operator.cleanUpInputFileChanged(Operator.java:1030)
              at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:489)
              at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:177)
              at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:54)
              at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:430)
              at org.apache.hadoop.mapred.MapTask.run(MapTask.java:342)
              at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:168)
              at java.security.AccessController.doPrivileged(Native Method)
              at javax.security.auth.Subject.doAs(Subject.java:396)
              at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548)
              at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:163)
      Caused by: java.lang.NullPointerException
              at java.util.Arrays.fill(Arrays.java:2685)
              at org.apache.hadoop.hive.ql.exec.mr.HashTableLoader.loadDirectly(HashTableLoader.java:155)
              at org.apache.hadoop.hive.ql.exec.mr.HashTableLoader.load(HashTableLoader.java:81)
              ... 15 more
      

      It appears that the tables array passed to the Arrays.fill call is null. I don't fully understand this path, but here is what I have gleaned so far.
      From what I see, tables is set unconditionally in initializeOp of the sink, and nowhere else, so for this code to ever work I assume startForward must call it at least some of the time.
      Here, it doesn't call it, so tables is null.
      The previous loop also uses tables and should have NPE-d before fill was ever called; it didn't, so I assume that loop never executed.
      There is also an inconsistency in the above code: directWorks are added to parents unconditionally, but the sink is only added as a child conditionally. It may be that some of the direct works are not table scans; in fact, given that the loop never executes, they may be null (which is rather strange).
      Regardless, it seems the logic should be fixed; this may be the root cause.
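
To make the reasoning above concrete, here is a minimal standalone sketch. It is not Hive code: the class name, the run method, and the use of a list of positions to stand in for directWorks are all assumptions made for illustration. It shows that when tables is null and the preceding loop has nothing to iterate over, the first failure surfaces in Arrays.fill, which matches the shape of the stack trace.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

public class NullTablesSketch {

    // Hypothetical model of the reported sequence: a loop that reads from
    // tables, followed by Arrays.fill(tables, null).
    // Returns the step that threw the NullPointerException, or "none".
    static String run(List<Integer> directWorks, Object[] tables) {
        try {
            for (int pos : directWorks) {
                // If the loop body ran with tables == null, the NPE would
                // be raised here, before fill is ever reached.
                Object unused = tables[pos];
            }
        } catch (NullPointerException e) {
            return "loop";
        }
        try {
            // Arrays.fill dereferences the array, so a null array
            // throws NullPointerException.
            Arrays.fill(tables, null);
        } catch (NullPointerException e) {
            return "fill";
        }
        return "none";
    }

    public static void main(String[] args) {
        // Empty loop + null tables: the failure shows up in fill.
        System.out.println(run(Collections.<Integer>emptyList(), null));
        // Non-empty loop + null tables: the loop fails first.
        System.out.println(run(Arrays.asList(0), null));
        // Initialized tables: no failure.
        System.out.println(run(Arrays.asList(0), new Object[1]));
    }
}
```

Running main prints fill, loop, and none in that order. The first case is the only one consistent with an NPE raised inside Arrays.fill, which supports the guess that the earlier loop never executed.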

      Attachments

        1. HIVE-6537.01.patch
          3 kB
          Sergey Shelukhin
        2. HIVE-6537.2.patch.txt
          16 kB
          Navis Ryu
        3. HIVE-6537.patch
          4 kB
          Sergey Shelukhin

            People

              Assignee: Sergey Shelukhin (sershe)
              Reporter: Sergey Shelukhin (sershe)
              Votes: 0
              Watchers: 3
