[SPARK-21316] Dataset Union output is not consistent with the column sequence - ASF JIRA

Details

Type: Bug
Status: Resolved
Priority: Major
Resolution: Not A Problem
Affects Version/s: 2.1.0
Fix Version/s: None
Component/s: Optimizer, SQL
Labels:
- patch

Description

If I take the union of two Datasets with the same schema, the output should stay the same even if I change the order of the columns when creating one of the Datasets.

The code snippet below shows the details.

public class Person{
  public String name;
  public String age;

  public Person(String name, String age) {
    this.name = name;
    this.age = age;
  }

  public String getName() {return name;}
  public void setName(String name) {this.name = name;}
  public String getAge() {return age;}
  public void setAge(String age) {this.age = age;}
}

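import java.util.ArrayList;
import java.util.List;

import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Encoders;
import org.apache.spark.sql.SparkSession;

public class Test {
  public static void main(String[] args) throws Exception {
    // SparkConnection is presumably the reporter's own helper (not shown) that returns a SparkSession.
    SparkSession spark = SparkConnection.getSpark();

    List<Person> list1 = new ArrayList<>();
    list1.add(new Person("kaushal", "25"));
    list1.add(new Person("aman", "26"));

    List<Person> list2 = new ArrayList<>();
    list2.add(new Person("sapan", "25"));
    list2.add(new Person("yati", "26"));

    // Both Datasets come out with the column order (age, name); see the first two tables in the output.
    Dataset<Person> ds1 = spark.createDataset(list1, Encoders.bean(Person.class));
    Dataset<Person> ds2 = spark.createDataset(list2, Encoders.bean(Person.class));
    ds1.show();
    ds2.show();

    // Reorder ds1 to (name, age), then union it with ds2, which is still (age, name).
    ds1.select("name", "age").as(Encoders.bean(Person.class)).union(ds2).show();
  }
}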

Output:

+---+-------+
|age|   name|
+---+-------+
| 25|kaushal|
| 26|   aman|
+---+-------+

+---+-----+
|age| name|
+---+-----+
| 25|sapan|
| 26| yati|
+---+-----+

+-------+-----+
|   name|  age|
+-------+-----+
|kaushal|   25|
|   aman|   26|
|     25|sapan|
|     26| yati|
+-------+-----+
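
The third table is consistent with a union that matches columns by position rather than by name: ds1 was reordered to (name, age) while ds2 kept (age, name), so the rows of ds2 land under the wrong headers. The plain-Java sketch below models that behaviour without Spark. It is a simplified illustration, not Spark's implementation, and the names `UnionSketch`, `unionByPosition` and `alignTo` are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class UnionSketch {
    // Position-based union: rows are appended as-is; column names play no role.
    public static List<List<String>> unionByPosition(List<List<String>> left,
                                                     List<List<String>> right) {
        List<List<String>> out = new ArrayList<>(left);
        out.addAll(right);
        return out;
    }

    // Reorder every row from the column layout `from` to the layout `to`,
    // matching columns by name.
    public static List<List<String>> alignTo(List<String> from, List<String> to,
                                             List<List<String>> rows) {
        List<List<String>> out = new ArrayList<>();
        for (List<String> row : rows) {
            List<String> aligned = new ArrayList<>();
            for (String col : to) {
                int idx = from.indexOf(col);
                if (idx < 0) {
                    throw new IllegalArgumentException("Missing column: " + col);
                }
                aligned.add(row.get(idx));
            }
            out.add(aligned);
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> leftCols = Arrays.asList("name", "age");   // ds1 after select("name", "age")
        List<String> rightCols = Arrays.asList("age", "name");  // ds2 in its original order
        List<List<String>> left = Arrays.asList(
            Arrays.asList("kaushal", "25"), Arrays.asList("aman", "26"));
        List<List<String>> right = Arrays.asList(
            Arrays.asList("25", "sapan"), Arrays.asList("26", "yati"));

        // Mismatched layouts: reproduces the mixed-up rows from the report.
        System.out.println(unionByPosition(left, right));
        // Aligning the right side to the left layout first gives consistent rows.
        System.out.println(unionByPosition(left, alignTo(rightCols, leftCols, right)));
    }
}
```

In Spark terms, the equivalent of `alignTo` would be selecting the columns of ds2 in the same order as ds1 before calling `union`, or using the `unionByName` API tracked in the linked SPARK-21043 on Spark versions that include it.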

Issue Links

relates to
- SPARK-21043 Add unionByName API to Dataset (Resolved)
- SPARK-19615 Provide Dataset union convenience for divergent schema (Resolved)

People

Assignee: Unassigned
Reporter: Kaushal Prajapati
Votes: 1
Watchers: 3

Dates

Created: 05/Jul/17 13:31
Updated: 19/Jul/17 09:08
Resolved: 19/Jul/17 09:08

Time Tracking

Estimated: 168h
Remaining: 168h
Logged: Not Specified