  1. Hive
  2. HIVE-20419

Vectorization: Prevent mutation of VectorPartitionDesc after being used in a hashmap key


Details

    Description

      Query compilation goes into a loop in HashMap.put because a VectorPartitionDesc is modified after it has been used as a HashMap key. The mutation changes the results of hashCode() and equals() for a key that is already stored in the map.

      HiveServer2-Background-Pool: Thread-6049 State: RUNNABLE CPU usage on sample: 621ms
      java.util.HashMap$TreeNode.find(int, Object, Class) HashMap.java:1869  <7 recursive calls>
      java.util.HashMap$TreeNode.putTreeVal(HashMap, HashMap$Node[], int, Object, Object) HashMap.java:1989
      java.util.HashMap.putVal(int, Object, Object, boolean, boolean) HashMap.java:637
      java.util.HashMap.put(Object, Object) HashMap.java:611
      org.apache.hadoop.hive.ql.optimizer.physical.Vectorizer$VectorizationDispatcher.addVectorPartitionDesc(PartitionDesc, VectorPartitionDesc, Map) Vectorizer.java:1272
      org.apache.hadoop.hive.ql.optimizer.physical.Vectorizer$VectorizationDispatcher.verifyAndSetVectorPartDesc(PartitionDesc, boolean, List, Set, Map, Set, ArrayList, Set) Vectorizer.java:1323
      org.apache.hadoop.hive.ql.optimizer.physical.Vectorizer$VectorizationDispatcher.validateInputFormatAndSchemaEvolution(MapWork, String, TableScanOperator, Vectorizer$VectorTaskColumnInfo) Vectorizer.java:1654
      org.apache.hadoop.hive.ql.optimizer.physical.Vectorizer$VectorizationDispatcher.validateAndVectorizeMapWork(MapWork, Vectorizer$VectorTaskColumnInfo, boolean) Vectorizer.java:1865
      org.apache.hadoop.hive.ql.optimizer.physical.Vectorizer$VectorizationDispatcher.convertMapWork(MapWork, boolean) Vectorizer.java:1109
      org.apache.hadoop.hive.ql.optimizer.physical.Vectorizer$VectorizationDispatcher.dispatch(Node, Stack, Object[]) Vectorizer.java:961
      org.apache.hadoop.hive.ql.lib.TaskGraphWalker.dispatch(Node, Stack, TaskGraphWalker$TaskGraphWalkerContext) TaskGraphWalker.java:111
      org.apache.hadoop.hive.ql.lib.TaskGraphWalker.walk(Node) TaskGraphWalker.java:180
      org.apache.hadoop.hive.ql.lib.TaskGraphWalker.startWalking(Collection, HashMap) TaskGraphWalker.java:125
      org.apache.hadoop.hive.ql.optimizer.physical.Vectorizer.resolve(PhysicalContext) Vectorizer.java:2442
      org.apache.hadoop.hive.ql.parse.TezCompiler.optimizeTaskPlan(List, ParseContext, Context) TezCompiler.java:717
      org.apache.hadoop.hive.ql.parse.TaskCompiler.compile(ParseContext, List, HashSet, HashSet) TaskCompiler.java:258
      org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(ASTNode, SemanticAnalyzer$PlannerContextFactory) SemanticAnalyzer.java:12443
      org.apache.hadoop.hive.ql.parse.CalcitePlanner.analyzeInternal(ASTNode) CalcitePlanner.java:358
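
      The general hazard described above can be reproduced outside Hive. The following is a minimal sketch, not the Hive code or the attached patch: the class Desc, its inputFormat field, and the method names are hypothetical stand-ins for any key type whose hashCode() and equals() depend on mutable state. It shows the simpler symptom (an entry that can no longer be found), not the treeified-bin loop seen in the trace.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

class MutableKeyDemo {
    // Hypothetical key type: hashCode() and equals() depend on a mutable field.
    static final class Desc {
        private String inputFormat;

        Desc(String inputFormat) { this.inputFormat = inputFormat; }

        void setInputFormat(String inputFormat) { this.inputFormat = inputFormat; }

        @Override
        public boolean equals(Object o) {
            return o instanceof Desc
                && Objects.equals(inputFormat, ((Desc) o).inputFormat);
        }

        @Override
        public int hashCode() { return Objects.hashCode(inputFormat); }
    }

    // Mutates the key AFTER it was inserted: the map still holds the entry
    // under the old hash, so a lookup with the mutated key misses it.
    static boolean foundAfterMutation() {
        Map<Desc, String> map = new HashMap<>();
        Desc key = new Desc("orc");
        map.put(key, "value");
        key.setInputFormat("text"); // hashCode() and equals() change here
        return map.containsKey(key);
    }

    // A fresh key equal to the ORIGINAL state also misses: the hash matches,
    // but equals() now compares against the mutated stored key.
    static boolean foundByOriginalState() {
        Map<Desc, String> map = new HashMap<>();
        Desc key = new Desc("orc");
        map.put(key, "value");
        key.setInputFormat("text");
        return map.containsKey(new Desc("orc"));
    }

    // Safe ordering: finish all mutation first, then use the object as a key.
    static boolean foundWhenMutatedBeforePut() {
        Map<Desc, String> map = new HashMap<>();
        Desc key = new Desc("orc");
        key.setInputFormat("text");
        map.put(key, "value");
        return map.containsKey(key);
    }

    public static void main(String[] args) {
        System.out.println(foundAfterMutation());
        System.out.println(foundByOriginalState());
        System.out.println(foundWhenMutatedBeforePut());
    }
}
```

      In this sketch the entry stays in the map but becomes unreachable through either the mutated key or a copy of its original state. Completing all changes before the put, or using an immutable key, avoids the problem.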
      

      Attachments

        1. HIVE-20419.1.patch
          11 kB
          Teddy Choi
        2. HIVE-20419.2.patch
          16 kB
          Teddy Choi
        3. HIVE-20419.4.patch
          16 kB
          Teddy Choi


          People

            teddy.choi Teddy Choi
            gopalv Gopal Vijayaraghavan
            Votes: 0
            Watchers: 5

            Dates

              Created:
              Updated:
              Resolved:

              Time Tracking

              Estimated: Not Specified
              Remaining: 0h
              Logged: 0.5h
