SPARK-21316

Dataset Union output is not consistent with the column sequence

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Not A Problem
    • Affects Version/s: 2.1.0
    • Fix Version/s: None
    • Component/s: Optimizer, SQL

    Description

      If I take the union of two Datasets with the same schema, the output should remain the same even if I change the order of the columns on one of the Datasets before the union.

      I am attaching a code snippet that reproduces the behavior.

      // Person.java: bean whose properties (age, name) define the Dataset schema.
      public class Person {
        public String name;
        public String age;

        public Person(String name, String age) {
          this.name = name;
          this.age = age;
        }

        public String getName() { return name; }
        public void setName(String name) { this.name = name; }
        public String getAge() { return age; }
        public void setAge(String age) { this.age = age; }
      }

      // Test.java
      import java.util.ArrayList;
      import java.util.List;

      import org.apache.spark.sql.Dataset;
      import org.apache.spark.sql.Encoders;
      import org.apache.spark.sql.SparkSession;

      public class Test {
        public static void main(String[] args) throws Exception {
          // SparkConnection is the reporter's own helper, defined outside this snippet.
          SparkSession spark = SparkConnection.getSpark();

          List<Person> list1 = new ArrayList<>();
          list1.add(new Person("kaushal", "25"));
          list1.add(new Person("aman", "26"));

          List<Person> list2 = new ArrayList<>();
          list2.add(new Person("sapan", "25"));
          list2.add(new Person("yati", "26"));

          Dataset<Person> ds1 = spark.createDataset(list1, Encoders.bean(Person.class));
          Dataset<Person> ds2 = spark.createDataset(list2, Encoders.bean(Person.class));
          ds1.show();
          ds2.show();

          // Reorder the columns of ds1 only, then union with ds2 (still age, name).
          ds1.select("name", "age").as(Encoders.bean(Person.class)).union(ds2).show();
        }
      }
      

      Output (ds1.show(), ds2.show(), then the union):

      +---+-------+
      |age|   name|
      +---+-------+
      | 25|kaushal|
      | 26|   aman|
      +---+-------+
      
      +---+-----+
      |age| name|
      +---+-----+
      | 25|sapan|
      | 26| yati|
      +---+-----+
      
      +-------+-----+
      |   name|  age|
      +-------+-----+
      |kaushal|   25|
      |   aman|   26|
      |     25|sapan|
      |     26| yati|
      +-------+-----+
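
The mismatch in the third table is consistent with the "Not A Problem" resolution: Dataset.union matches columns by position, not by name. As the first two tables show, both Datasets start with the column order age, name, so calling select("name", "age") on ds1 alone swaps its columns relative to ds2. The following is a minimal plain-Java sketch of that positional behavior, not Spark code; Table, unionByPosition, and select are hypothetical stand-ins introduced here for illustration only.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class UnionSketch {
    /** Hypothetical stand-in for a Dataset: ordered column names plus rows of values. */
    static final class Table {
        final List<String> columns;
        final List<List<String>> rows;

        Table(List<String> columns, List<List<String>> rows) {
            this.columns = columns;
            this.rows = rows;
        }
    }

    /** Positional union: keeps the left header and appends the right rows unchanged. */
    static Table unionByPosition(Table left, Table right) {
        List<List<String>> rows = new ArrayList<>(left.rows);
        rows.addAll(right.rows);
        return new Table(left.columns, rows);
    }

    /** Returns a copy of the table with its columns in the requested order. */
    static Table select(Table t, String... names) {
        List<List<String>> rows = new ArrayList<>();
        for (List<String> row : t.rows) {
            List<String> out = new ArrayList<>();
            for (String name : names) {
                int idx = t.columns.indexOf(name);
                if (idx < 0) {
                    throw new IllegalArgumentException("No such column: " + name);
                }
                out.add(row.get(idx));
            }
            rows.add(out);
        }
        return new Table(Arrays.asList(names), rows);
    }

    public static void main(String[] args) {
        Table ds1 = new Table(Arrays.asList("age", "name"),
                Arrays.asList(Arrays.asList("25", "kaushal"), Arrays.asList("26", "aman")));
        Table ds2 = new Table(Arrays.asList("age", "name"),
                Arrays.asList(Arrays.asList("25", "sapan"), Arrays.asList("26", "yati")));

        // Reordering only one side reproduces the mixed-up rows from the report.
        Table mismatched = unionByPosition(select(ds1, "name", "age"), ds2);
        System.out.println(mismatched.rows);

        // Reordering both sides the same way keeps every value under its own column.
        Table aligned = unionByPosition(select(ds1, "name", "age"), select(ds2, "name", "age"));
        System.out.println(aligned.rows);
    }
}
```

In Spark itself, the corresponding remedy would be to apply the same select("name", "age") to both Datasets before calling union. Spark 2.3.0 and later also provide Dataset.unionByName, which is not available in the affected version 2.1.0.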
      

            People

              Assignee: Unassigned
              Reporter: Kaushal Prajapati (skp33)
              Votes: 1
              Watchers: 3

                Time Tracking

                  Estimated: 168h
                  Remaining: 168h
                  Logged: Not Specified