[SPARK-10246] Join in PySpark using a list of column names - ASF JIRA

Details

Type: Bug
Status: Closed
Priority: Major
Resolution: Not A Problem
Affects Version/s: None
Fix Version/s: None
Component/s: PySpark, SQL
Labels: None

Target Version/s: 1.6.0

Description

Currently, join supports two ways of specifying how to join: a join condition or a single column name.

The documentation specifies that the join function can accept a list of conditions or a list of column names, but neither is currently supported. This is also discussed in issue SPARK-7197.

Functionality should match the documentation, which currently contains this example in /spark/python/pyspark/sql/dataframe.py, line 560:

>>> df.join(df4, ['name', 'age']).select(df.name, df.age).collect()
[Row(name=u'Bob', age=5)]
"""

People

Assignee: Unassigned

Reporter: Michal Monselise

Votes: 0

Watchers: 4

Dates

Created: 25/Aug/15 20:34

Updated: 09/Nov/15 18:20

Resolved: 09/Nov/15 18:20