Spark / SPARK-32486

Issue with deserialization and persist API in latest Spark Java versions


Details

    • Type: Bug
    • Status: Reopened
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 2.4.4, 2.4.5, 2.4.6, 3.0.0
    • Fix Version/s: None
    • Component/s: Java API
    • Labels: None
    • Environment: Occurs on all operating systems with Java 8

    Description

      Hi team, one of our classes initializes an object field at class level (in the field declaration). When we load data into a Dataset of this class type, null values for that field are not preserved. The class-level initializer takes precedence instead, so the field shows up as a new object rather than null.

      Example:

      Test.class has the following class-level attributes:

      private Test1 testNumber = new Test1();

      private Test2 testNumber2;
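
For reference, here is a minimal sketch of what such a bean might look like. Only the two field declarations come from the report; the empty `Test1` and `Test2` classes, the getters and setters, and the wrapper class `BeanShapeSketch` are assumptions added so the example compiles on its own. It only shows plain Java behavior: a freshly constructed `Test` already holds a non-null `testNumber`.

```java
import java.io.Serializable;

public class BeanShapeSketch {
    // Hypothetical nested types; the report does not show their contents.
    public static class Test1 implements Serializable {}
    public static class Test2 implements Serializable {}

    // Assumed shape of the reported class: no-arg constructor plus getters/setters.
    public static class Test implements Serializable {
        private Test1 testNumber = new Test1(); // initialized at class level
        private Test2 testNumber2;              // no initializer, so null by default

        public Test1 getTestNumber() { return testNumber; }
        public void setTestNumber(Test1 testNumber) { this.testNumber = testNumber; }
        public Test2 getTestNumber2() { return testNumber2; }
        public void setTestNumber2(Test2 testNumber2) { this.testNumber2 = testNumber2; }
    }

    public static void main(String[] args) {
        Test fresh = new Test();
        // The field initializer runs on every construction.
        System.out.println(fresh.getTestNumber() != null);  // prints true
        System.out.println(fresh.getTestNumber2() == null); // prints true
    }
}
```

The field only becomes null if something calls `setTestNumber(null)` after construction.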

       

      String inputLocation = "src/test/resources/pipeline/test.parquet";

      Dataset<Row> ds = this.session.read().parquet(inputLocation);
      ds.printSchema();
      // Verified: the rows show testNumber and testNumber2 as null.
      // The cast to org.apache.spark.api.java.function.ForeachFunction avoids the ambiguous foreach overload in Java.
      ds.foreach((ForeachFunction<Row>) input -> System.out.println(input));

      Dataset<Test> inputDataSet = ds.as(Encoders.bean(Test.class));
      // Verified: the beans show testNumber as new Test1() and testNumber2 as null.
      inputDataSet.foreach((ForeachFunction<Test>) input -> System.out.println(input));

       

       

      The same issue occurs with the dataset.persist() call as well. It happens in 2.4.4 and all later versions. Can you please fix it?
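
The report does not say why this happens. One possible explanation, stated here purely as an assumption, is that the bean is created through its no-argument constructor and the setter is skipped whenever the column value is null, so the class-level default survives. The sketch below imitates that in plain Java. `NullSkipSketch`, `Bean`, and `populate` are hypothetical names, and this is not Spark code.

```java
import java.util.HashMap;
import java.util.Map;

public class NullSkipSketch {
    // Hypothetical bean with one class-level initializer, mirroring the report.
    public static class Bean {
        private Object testNumber = new Object(); // class-level initializer
        private Object testNumber2;               // null by default

        public Object getTestNumber() { return testNumber; }
        public void setTestNumber(Object v) { this.testNumber = v; }
        public Object getTestNumber2() { return testNumber2; }
        public void setTestNumber2(Object v) { this.testNumber2 = v; }
    }

    // Hypothetical stand-in for a deserializer. With skipNulls=true the setter
    // is not called for null values; with skipNulls=false it always is.
    public static Bean populate(Map<String, Object> row, boolean skipNulls) {
        Bean bean = new Bean();
        Object v1 = row.get("testNumber");
        if (v1 != null || !skipNulls) {
            bean.setTestNumber(v1);
        }
        Object v2 = row.get("testNumber2");
        if (v2 != null || !skipNulls) {
            bean.setTestNumber2(v2);
        }
        return bean;
    }

    public static void main(String[] args) {
        Map<String, Object> row = new HashMap<>();
        row.put("testNumber", null);
        row.put("testNumber2", null);

        // Setter skipped for nulls: the class-level default remains.
        System.out.println(populate(row, true).getTestNumber() == null);  // prints false
        // Setter always called: the null from the row is preserved.
        System.out.println(populate(row, false).getTestNumber() == null); // prints true
    }
}
```

If this assumption holds, the difference between the two modes matches what the report observes: `testNumber` comes back as a new object, while `testNumber2`, which has no initializer, stays null either way.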

       

       

       


          People

            Assignee: Unassigned
            Reporter: Dinesh Kumar (Dineshy534)
            Votes: 0
            Watchers: 2
