Uploaded image for project: 'Hive'
  1. Hive
  2. HIVE-7292 Hive on Spark
  3. HIVE-9781

Utilize spark.kryo.classesToRegister [Spark Branch]

    Details

    • Type: Sub-task
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 1.2.0
    • Component/s: Spark
    • Labels:
      None

      Description

      I noticed in several thread dumps that it appears kyro is serializing the class names associated with our keys and values.

      Kyro supports pre-registering classes so that you don't have to serialize the class name and spark supports this via the spark.kryo.registrator property. We should do this so we don't have to serialize class names.

      Thread 12154: (state = BLOCKED)
       - java.lang.Object.hashCode() @bci=0 (Compiled frame; information may be imprecise)
       - com.esotericsoftware.kryo.util.ObjectMap.get(java.lang.Object) @bci=1, line=265 (Compiled frame)
       - com.esotericsoftware.kryo.util.DefaultClassResolver.getRegistration(java.lang.Class) @bci=18, line=61 (Compiled frame)
       - com.esotericsoftware.kryo.Kryo.getRegistration(java.lang.Class) @bci=20, line=429 (Compiled frame)
       - com.esotericsoftware.kryo.util.DefaultClassResolver.readName(com.esotericsoftware.kryo.io.Input) @bci=242, line=148 (Compiled frame)
       - com.esotericsoftware.kryo.util.DefaultClassResolver.readClass(com.esotericsoftware.kryo.io.Input) @bci=65, line=115 (Compiled frame)
       - com.esotericsoftware.kryo.Kryo.readClass(com.esotericsoftware.kryo.io.Input) @bci=20, line=610 (Compiled frame)
       - com.esotericsoftware.kryo.Kryo.readClassAndObject(com.esotericsoftware.kryo.io.Input) @bci=21, line=721 (Compiled frame)
       - com.twitter.chill.Tuple2Serializer.read(com.esotericsoftware.kryo.Kryo, com.esotericsoftware.kryo.io.Input, java.lang.Class) @bci=6, line=41 (Compiled frame)
       - com.twitter.chill.Tuple2Serializer.read(com.esotericsoftware.kryo.Kryo, com.esotericsoftware.kryo.io.Input, java.lang.Class) @bci=4, line=33 (Compiled frame)
       - com.esotericsoftware.kryo.Kryo.readClassAndObject(com.esotericsoftware.kryo.io.Input) @bci=126, line=729 (Compiled frame)
       - org.apache.spark.serializer.KryoDeserializationStream.readObject(scala.reflect.ClassTag) @bci=8, line=142 (Compiled frame)
       - org.apache.spark.serializer.DeserializationStream$$anon$1.getNext() @bci=10, line=133 (Compiled frame)
       - org.apache.spark.util.NextIterator.hasNext() @bci=16, line=71 (Compiled frame)
       - org.apache.spark.util.CompletionIterator.hasNext() @bci=4, line=32 (Compiled frame)
       - scala.collection.Iterator$$anon$13.hasNext() @bci=4, line=371 (Compiled frame)
       - org.apache.spark.util.CompletionIterator.hasNext() @bci=4, line=32 (Compiled frame)
       - org.apache.spark.InterruptibleIterator.hasNext() @bci=22, line=39 (Compiled frame)
       - scala.collection.Iterator$$anon$11.hasNext() @bci=4, line=327 (Compiled frame)
       - org.apache.spark.util.collection.ExternalSorter.insertAll(scala.collection.Iterator) @bci=191, line=217 (Compiled frame)
       - org.apache.spark.shuffle.hash.HashShuffleReader.read() @bci=278, line=61 (Interpreted frame)
       - org.apache.spark.rdd.ShuffledRDD.compute(org.apache.spark.Partition, org.apache.spark.TaskContext) @bci=46, line=92 (Interpreted frame)
       - org.apache.spark.rdd.RDD.computeOrReadCheckpoint(org.apache.spark.Partition, org.apache.spark.TaskContext) @bci=26, line=263 (Interpreted frame)
       - org.apache.spark.rdd.RDD.iterator(org.apache.spark.Partition, org.apache.spark.TaskContext) @bci=33, line=230 (Interpreted frame)
       - org.apache.spark.rdd.MapPartitionsRDD.compute(org.apache.spark.Partition, org.apache.spark.TaskContext) @bci=24, line=35 (Interpreted frame)
       - org.apache.spark.rdd.RDD.computeOrReadCheckpoint(org.apache.spark.Partition, org.apache.spark.TaskContext) @bci=26, line=263 (Interpreted frame)
       - org.apache.spark.rdd.RDD.iterator(org.apache.spark.Partition, org.apache.spark.TaskContext) @bci=33, line=230 (Interpreted frame)
       - org.apache.spark.rdd.MapPartitionsRDD.compute(org.apache.spark.Partition, org.apache.spark.TaskContext) @bci=24, line=35 (Interpreted frame)
       - org.apache.spark.rdd.RDD.computeOrReadCheckpoint(org.apache.spark.Partition, org.apache.spark.TaskContext) @bci=26, line=263 (Interpreted frame)
       - org.apache.spark.rdd.RDD.iterator(org.apache.spark.Partition, org.apache.spark.TaskContext) @bci=33, line=230 (Interpreted frame)
       - org.apache.spark.rdd.UnionRDD.compute(org.apache.spark.Partition, org.apache.spark.TaskContext) @bci=22, line=87 (Interpreted frame)
       - org.apache.spark.rdd.RDD.computeOrReadCheckpoint(org.apache.spark.Partition, org.apache.spark.TaskContext) @bci=26, line=263 (Interpreted frame)
       - org.apache.spark.rdd.RDD.iterator(org.apache.spark.Partition, org.apache.spark.TaskContext) @bci=33, line=230 (Interpreted frame)
       - org.apache.spark.scheduler.ShuffleMapTask.runTask(org.apache.spark.TaskContext) @bci=166, line=68 (Interpreted frame)
       - org.apache.spark.scheduler.ShuffleMapTask.runTask(org.apache.spark.TaskContext) @bci=2, line=41 (Interpreted frame)
       - org.apache.spark.scheduler.Task.run(long) @bci=77, line=56 (Interpreted frame)
       - org.apache.spark.executor.Executor$TaskRunner.run() @bci=310, line=196 (Interpreted frame)
       - java.util.concurrent.ThreadPoolExecutor.runWorker(java.util.concurrent.ThreadPoolExecutor$Worker) @bci=95, line=1145 (Interpreted frame)
       - java.util.concurrent.ThreadPoolExecutor$Worker.run() @bci=5, line=615 (Interpreted frame)
       - java.lang.Thread.run() @bci=11, line=745 (Interpreted frame)
      

        Attachments

        1. HIVE-9781.4.patch
          3 kB
          Jimmy Xiang
        2. HIVE-9781.3.patch
          7 kB
          Jimmy Xiang
        3. HIVE-9781.2.patch
          7 kB
          Jimmy Xiang
        4. HIVE-9781.1.patch
          7 kB
          Jimmy Xiang

          Activity

            People

            • Assignee:
              jxiang Jimmy Xiang
              Reporter:
              brocknoland Brock Noland
            • Votes:
              0 Vote for this issue
              Watchers:
              5 Start watching this issue

              Dates

              • Created:
                Updated:
                Resolved: