Spark / SPARK-2610

When spark.serializer is set to org.apache.spark.serializer.KryoSerializer, importing a method causes multiple Spark applications to be created


Details

    • Type: Bug
    • Status: Resolved
    • Priority: Minor
    • Resolution: Incomplete
    • Affects Version/s: 1.0.1
    • Fix Version/s: None
    • Component/s: Spark Shell

    Description

      To reproduce, set

      spark.serializer        org.apache.spark.serializer.KryoSerializer
      

      in conf/spark-defaults.conf and launch a Spark shell.
      Then execute:

      class X() { println("What!"); def y = 3 }
      val x = new X
      import x.y
      
      case class Person(name: String, age: Int)
      
      val serializer = org.apache.spark.serializer.Serializer.getSerializer(null)
      val kryoSerializer = serializer.newInstance
      
      val value = kryoSerializer.serialize(Person("abc", 1))
      kryoSerializer.deserialize(value): Person
      // Once you execute this line, you will see ...
      // What!
      // What!
      // res1: Person = Person(abc,1)
      

      In short, importing a method from an instance of a class causes that class's constructor to be called twice (the two "What!" lines above) when the deserialize line runs.
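
      The report does not identify the cause. One possible reading, offered purely as an assumption, is that the shell wraps each input line in a generated class, and that deserialization ends up constructing fresh wrapper instances, which re-runs `val x = new X`. The plain-Scala sketch below uses neither Spark nor Kryo; `ReplWrapperSketch`, `LineWrapper`, and `rebuild` are hypothetical names chosen for illustration. It only shows that building a new instance of an enclosing class re-runs its field initializers.

```scala
// Hypothetical stand-in for the shell's generated wrappers; not Spark or REPL code.
object ReplWrapperSketch {
  // Counts constructor runs so the effect can be checked without reading stdout.
  var constructed = 0

  class X() { constructed += 1; println("What!"); def y = 3 }

  // Assumed shape of a wrapper generated around `val x = new X`.
  class LineWrapper {
    val x = new X // evaluated again for every new LineWrapper
    // Nested class: each Person belongs to one enclosing LineWrapper instance.
    case class Person(name: String, age: Int)
  }

  // Builds `times` fresh wrappers and returns how many X instances that created.
  def rebuild(times: Int): Int = {
    val before = constructed
    (1 to times).foreach { _ => new LineWrapper }
    constructed - before
  }

  def main(args: Array[String]): Unit = {
    val extra = rebuild(2)
    println(s"extra constructor calls: $extra")
  }
}
```

      If this reading is right, anything that instantiates the enclosing wrappers again would trigger the side effect once per wrapper. Whether that is what actually happens in the shell is not confirmed by this report.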

      This affects both our branch 1.0 and master.
      On master, use

      val serializer = org.apache.spark.serializer.Serializer.getSerializer(None)
      

      to get the serializer.

          People

            Assignee: Unassigned
            Reporter: Yin Huai (yhuai)