  1. Hive
  2. HIVE-20419

Vectorization: Prevent mutation of VectorPartitionDesc after being used in a hashmap key

Details

    Description

      This goes into a loop because the VectorPartitionDesc is modified after it has been used as a HashMap key, so its hashCode() and equals() results change after it has been placed in the map.

      HiveServer2-Background-Pool: Thread-6049 State: RUNNABLE CPU usage on sample: 621ms
      java.util.HashMap$TreeNode.find(int, Object, Class) HashMap.java:1869  <7 recursive calls>
      java.util.HashMap$TreeNode.putTreeVal(HashMap, HashMap$Node[], int, Object, Object) HashMap.java:1989
      java.util.HashMap.putVal(int, Object, Object, boolean, boolean) HashMap.java:637
      java.util.HashMap.put(Object, Object) HashMap.java:611
      org.apache.hadoop.hive.ql.optimizer.physical.Vectorizer$VectorizationDispatcher.addVectorPartitionDesc(PartitionDesc, VectorPartitionDesc, Map) Vectorizer.java:1272
      org.apache.hadoop.hive.ql.optimizer.physical.Vectorizer$VectorizationDispatcher.verifyAndSetVectorPartDesc(PartitionDesc, boolean, List, Set, Map, Set, ArrayList, Set) Vectorizer.java:1323
      org.apache.hadoop.hive.ql.optimizer.physical.Vectorizer$VectorizationDispatcher.validateInputFormatAndSchemaEvolution(MapWork, String, TableScanOperator, Vectorizer$VectorTaskColumnInfo) Vectorizer.java:1654
      org.apache.hadoop.hive.ql.optimizer.physical.Vectorizer$VectorizationDispatcher.validateAndVectorizeMapWork(MapWork, Vectorizer$VectorTaskColumnInfo, boolean) Vectorizer.java:1865
      org.apache.hadoop.hive.ql.optimizer.physical.Vectorizer$VectorizationDispatcher.convertMapWork(MapWork, boolean) Vectorizer.java:1109
      org.apache.hadoop.hive.ql.optimizer.physical.Vectorizer$VectorizationDispatcher.dispatch(Node, Stack, Object[]) Vectorizer.java:961
      org.apache.hadoop.hive.ql.lib.TaskGraphWalker.dispatch(Node, Stack, TaskGraphWalker$TaskGraphWalkerContext) TaskGraphWalker.java:111
      org.apache.hadoop.hive.ql.lib.TaskGraphWalker.walk(Node) TaskGraphWalker.java:180
      org.apache.hadoop.hive.ql.lib.TaskGraphWalker.startWalking(Collection, HashMap) TaskGraphWalker.java:125
      org.apache.hadoop.hive.ql.optimizer.physical.Vectorizer.resolve(PhysicalContext) Vectorizer.java:2442
      org.apache.hadoop.hive.ql.parse.TezCompiler.optimizeTaskPlan(List, ParseContext, Context) TezCompiler.java:717
      org.apache.hadoop.hive.ql.parse.TaskCompiler.compile(ParseContext, List, HashSet, HashSet) TaskCompiler.java:258
      org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(ASTNode, SemanticAnalyzer$PlannerContextFactory) SemanticAnalyzer.java:12443
      org.apache.hadoop.hive.ql.parse.CalcitePlanner.analyzeInternal(ASTNode) CalcitePlanner.java:358
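
The underlying contract violation can be sketched in isolation. The example below is a minimal illustration, not Hive code: `MutableDesc` and its `inputFormat` field are hypothetical stand-ins for a mutable descriptor such as VectorPartitionDesc. It does not reproduce the spinning in `HashMap$TreeNode.find` shown in the thread dump; it only shows that an entry becomes unreachable once its key's hashCode() changes, and that keying the map on a copy avoids this. Whether the attached patches take this approach is not stated here.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

// Hypothetical stand-in for a mutable descriptor; not a Hive class.
final class MutableDesc {
    private String inputFormat;

    MutableDesc(String inputFormat) { this.inputFormat = inputFormat; }

    // Copy constructor used to take a snapshot before the key goes into a map.
    MutableDesc(MutableDesc other) { this.inputFormat = other.inputFormat; }

    void setInputFormat(String inputFormat) { this.inputFormat = inputFormat; }

    @Override
    public boolean equals(Object o) {
        return o instanceof MutableDesc
            && Objects.equals(inputFormat, ((MutableDesc) o).inputFormat);
    }

    @Override
    public int hashCode() { return Objects.hashCode(inputFormat); }
}

class MutableKeyDemo {
    // Mutates the key object after put(); the map stored the old hash,
    // so a lookup with the now-changed key no longer matches the entry.
    static boolean foundAfterMutation() {
        Map<MutableDesc, String> map = new HashMap<>();
        MutableDesc key = new MutableDesc("orc");
        map.put(key, "value");
        key.setInputFormat("text");
        return map.containsKey(key);
    }

    // Stores a snapshot as the key; later changes to the original
    // object cannot affect the entry already in the map.
    static boolean foundWithCopy() {
        Map<MutableDesc, String> map = new HashMap<>();
        MutableDesc original = new MutableDesc("orc");
        map.put(new MutableDesc(original), "value");
        original.setInputFormat("text");
        return map.containsKey(new MutableDesc("orc"));
    }

    public static void main(String[] args) {
        System.out.println("found after mutation: " + foundAfterMutation());
        System.out.println("found with copy:      " + foundWithCopy());
    }
}
```

In the first method the map still reports one entry, but that entry can no longer be reached through the key that was used to insert it.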
      

      Attachments

        1. HIVE-20419.1.patch
          11 kB
          Teddy Choi
        2. HIVE-20419.2.patch
          16 kB
          Teddy Choi
        3. HIVE-20419.4.patch
          16 kB
          Teddy Choi

        Activity

          People

            teddy.choi Teddy Choi
            gopalv Gopal Vijayaraghavan
            Votes:
            0
            Watchers:
            5

            Dates

              Created:
              Updated:
              Resolved:

              Time Tracking

                Estimated:
                Not Specified
                Remaining:
                0h
                Logged:
                0.5h