Type | Key | Parent | Summary | Assignee | Reporter | Priority | Status | Resolution
Sub-task | SPARK-9659 | SPARK-6116 | Rename inSet to isin to match Pandas function | Reynold Xin | Reynold Xin | Major | Resolved | Fixed
Sub-task | SPARK-9166 | SPARK-6116 | Hide JVM stack trace for IllegalArgumentException in Python | L. C. Hsieh | Reynold Xin | Major | Resolved | Fixed
Sub-task | SPARK-8926 | SPARK-6116 | Good errors for invalid input to ExpectsInput expressions | Michael Armbrust | Michael Armbrust | Critical | Resolved | Fixed
Sub-task | SPARK-8846 | SPARK-6116 | Maintain binary compatibility for in function | Unassigned | Reynold Xin | Major | Closed | Won't Fix
Sub-task | SPARK-8818 | SPARK-6116 | In should not take Any not Column | Unassigned | Michael Armbrust | Major | Resolved | Duplicate
Sub-task | SPARK-8766 | SPARK-6116 | DataFrame Python API should work with column which has non-ascii character in it | Davies Liu | Davies Liu | Major | Resolved | Fixed
Sub-task | SPARK-8698 | SPARK-6116 | partitionBy in Python DataFrame reader/writer interface should not default to empty tuple | Reynold Xin | Reynold Xin | Major | Resolved | Fixed
Sub-task | SPARK-8685 | SPARK-6116 | dataframe left joins are not working as expected in pyspark | Davies Liu | axel dahl | Major | Resolved | Fixed
Sub-task | SPARK-8668 | SPARK-6116 | expr function to convert SQL expression into a Column | Joseph Batchik | Reynold Xin | Major | Resolved | Fixed
Sub-task | SPARK-8573 | SPARK-6116 | For PySpark's DataFrame API, we need to throw exceptions when users try to use and/or/not | Davies Liu | Yin Huai | Critical | Resolved | Duplicate
Sub-task | SPARK-8568 | SPARK-6116 | Prevent accidental use of "and" and "or" to build invalid expressions in Python | Davies Liu | Reynold Xin | Critical | Resolved | Fixed
Sub-task | SPARK-8434 | SPARK-6116 | Add a "pretty" parameter to show | Shixiong Zhu | Shixiong Zhu | Major | Resolved | Fixed
Sub-task | SPARK-8356 | SPARK-6116 | Reconcile callUDF and callUdf | Benjamin Fradet | Michael Armbrust | Critical | Resolved | Fixed
Sub-task | SPARK-8355 | SPARK-6116 | Python DataFrameReader/Writer should mirror scala | Cheolsoo Park | Michael Armbrust | Critical | Resolved | Fixed
Sub-task | SPARK-8300 | SPARK-6116 | DataFrame hint for broadcast join | Reynold Xin | Reynold Xin | Major | Resolved | Fixed
Sub-task | SPARK-8299 | SPARK-6116 | Improve error message reporting for DataFrame and SQL | Michael Armbrust | Reynold Xin | Major | Resolved | Fixed
Sub-task | SPARK-8146 | SPARK-6116 | DataFrame Python API: Alias replace in DataFrameNaFunctions | Reynold Xin | Reynold Xin | Major | Resolved | Fixed
Sub-task | SPARK-8072 | SPARK-6116 | Better AnalysisException for writing DataFrame with identically named columns | Animesh Baranawal | Reynold Xin | Blocker | Resolved | Fixed
Sub-task | SPARK-8060 | SPARK-6116 | Improve Python reader/writer interface doc and testing | Reynold Xin | Reynold Xin | Major | Resolved | Fixed
Sub-task | SPARK-8056 | SPARK-6116 | Design an easier way to construct schema for both Scala and Python | Ilya Ganelin | Reynold Xin | Major | Resolved | Fixed
Sub-task | SPARK-8026 | SPARK-6116 | Add Column.alias to Scala/Java API | Reynold Xin | Reynold Xin | Major | Resolved | Fixed
Sub-task | SPARK-8021 | SPARK-6116 | DataFrameReader/Writer in Python does not match Scala | Davies Liu | Michael Armbrust | Blocker | Resolved | Fixed
Sub-task | SPARK-7998 | SPARK-6116 | Improve frequent items documentation | Burak Yavuz | Reynold Xin | Major | Resolved | Fixed
Sub-task | SPARK-7993 | SPARK-6116 | Improve DataFrame.show() output | Akhil Thatipamula | Reynold Xin | Blocker | Resolved | Fixed
Sub-task | SPARK-7991 | SPARK-6116 | Python DataFrame: support passing a list into describe | Amey Chaugule | Reynold Xin | Major | Resolved | Fixed
Sub-task | SPARK-7990 | SPARK-6116 | Add methods to facilitate equi-join on multiple join keys | L. C. Hsieh | Reynold Xin | Major | Resolved | Fixed
Sub-task | SPARK-7982 | SPARK-6116 | crosstab should use 0 instead of null for pairs that don't appear | Reynold Xin | Reynold Xin | Major | Resolved | Fixed
Sub-task | SPARK-7981 | SPARK-6116 | Improve DataFrame Python exception | Davies Liu | Reynold Xin | Major | Closed | Duplicate
Sub-task | SPARK-7980 | SPARK-6116 | Support SQLContext.range(end) | Animesh Baranawal | Reynold Xin | Major | Resolved | Fixed
Sub-task | SPARK-7971 | SPARK-6116 | Add JavaDoc style deprecation for deprecated DataFrame methods | Reynold Xin | Reynold Xin | Major | Resolved | Fixed
Sub-task | SPARK-7836 | SPARK-6116 | DataFrame.ntile() should only accept Int as parameter | Davies Liu | Davies Liu | Blocker | Resolved | Fixed
Sub-task | SPARK-7834 | SPARK-6116 | Better error for unresolved window functions. | Michael Armbrust | Michael Armbrust | Critical | Resolved | Fixed
Sub-task | SPARK-7822 | SPARK-6116 | Window function support in Python DataFrame DSL | Davies Liu | Reynold Xin | Major | Resolved | Fixed
Sub-task | SPARK-7783 | SPARK-6116 | Add rollup and cube support to DataFrame Python DSL | Davies Liu | Reynold Xin | Major | Resolved | Fixed
Sub-task | SPARK-7778 | SPARK-6116 | Add standard deviation aggregate expression | Unassigned | Rakesh Chalasani | Major | Closed | Duplicate
Sub-task | SPARK-7742 | SPARK-6116 | Figure out what to do with insertInto w.r.t. DataFrameWriter API | Yin Huai | Reynold Xin | Critical | Closed | Fixed
Sub-task | SPARK-7738 | SPARK-6116 | DataFrame reader/writer API in Python | Davies Liu | Davies Liu | Critical | Resolved | Fixed
Sub-task | SPARK-7734 | SPARK-6116 | make explode support struct type | Unassigned | Wenchen Fan | Major | Closed | Not A Problem
Sub-task | SPARK-7654 | SPARK-6116 | DataFrameReader and DataFrameWriter for input/output API | Reynold Xin | Reynold Xin | Blocker | Resolved | Fixed
Sub-task | SPARK-7606 | SPARK-6116 | Document all PySpark SQL/DataFrame public methods with @since tag | Davies Liu | Nicholas Chammas | Major | Resolved | Fixed
Sub-task | SPARK-7588 | SPARK-6116 | Document all SQL/DataFrame public methods with @since tag | Reynold Xin | Reynold Xin | Blocker | Resolved | Fixed
Sub-task | SPARK-7551 | SPARK-6116 | Don't split by dot if within backticks for DataFrame attribute resolution | Wenchen Fan | Reynold Xin | Critical | Resolved | Fixed
Sub-task | SPARK-7548 | SPARK-6116 | Add explode expression | Michael Armbrust | Reynold Xin | Blocker | Resolved | Fixed
Sub-task | SPARK-7543 | SPARK-6116 | Break dataframe.py into multiple files | Davies Liu | Reynold Xin | Major | Resolved | Fixed
Sub-task | SPARK-7509 | SPARK-6116 | Add drop column to Python DataFrame API | Reynold Xin | Reynold Xin | Major | Resolved | Fixed
Sub-task | SPARK-7507 | SPARK-6116 | pyspark.sql.types.StructType and Row should implement __iter__() | Unassigned | Nicholas Chammas | Minor | Closed | Won't Fix
Sub-task | SPARK-7506 | SPARK-6116 | pyspark.sql.types.StructType.fromJson() is incorrectly named | Unassigned | Nicholas Chammas | Minor | Closed | Won't Fix
Sub-task | SPARK-7462 | SPARK-6116 | By default retain group by columns in aggregate | Reynold Xin | Reynold Xin | Blocker | Resolved | Fixed
Sub-task | SPARK-7460 | SPARK-6116 | Provide DataFrame.zip (analog of RDD.zip) to merge two data frames | Ram Sriharsha | Ram Sriharsha | Minor | Closed | Won't Fix
Sub-task | SPARK-7358 | SPARK-6116 | Move mathfunctions into functions | Burak Yavuz | Reynold Xin | Blocker | Resolved | Fixed
Sub-task | SPARK-7324 | SPARK-6116 | Add DataFrame.dropDuplicates | Reynold Xin | Reynold Xin | Major | Resolved | Fixed
Sub-task | SPARK-7322 | SPARK-6116 | Window function support in Scala/Java DataFrame DSL | Cheng Hao | Reynold Xin | Major | Resolved | Fixed
Sub-task | SPARK-7321 | SPARK-6116 | Add Column expression for conditional statements (if, case) | Chen Song | Reynold Xin | Critical | Resolved | Fixed
Sub-task | SPARK-7320 | SPARK-6116 | Add rollup and cube support to DataFrame Java/Scala DSL | Cheng Hao | Reynold Xin | Major | Resolved | Fixed
Sub-task | SPARK-7319 | SPARK-6116 | Improve the output from DataFrame.show() | Chen Song | Reynold Xin | Major | Resolved | Fixed
Sub-task | SPARK-7295 | SPARK-6116 | Add bitwise operations to DataFrame DSL | Shiti Saxena | Reynold Xin | Major | Resolved | Fixed
Sub-task | SPARK-7294 | SPARK-6116 | Add a between function in Column | Chen Song | Reynold Xin | Major | Resolved | Fixed
Sub-task | SPARK-7285 | SPARK-6116 | Audit missing Hive functions | Reynold Xin | Reynold Xin | Major | Resolved | Fixed
Sub-task | SPARK-7280 | SPARK-6116 | Add a method for dropping a column in Java/Scala | Rakesh Chalasani | Reynold Xin | Major | Resolved | Fixed
Sub-task | SPARK-7276 | SPARK-6116 | withColumn is very slow on dataframe with large number of columns | Wenchen Fan | Alexandre CLEMENT | Major | Resolved | Fixed
Sub-task | SPARK-7274 | SPARK-6116 | Create Column expression for array/struct creation | Reynold Xin | Reynold Xin | Major | Resolved | Fixed
Sub-task | SPARK-7248 | SPARK-6116 | Random number generators for DataFrames | Burak Yavuz | Xiangrui Meng | Major | Resolved | Fixed
Sub-task | SPARK-7247 | SPARK-6116 | Add Pandas' shift method to the Dataframe API | Unassigned | Olivier Girardot | Minor | Closed | Won't Fix
Sub-task | SPARK-7227 | SPARK-6116 | Support fillna / dropna in R DataFrame | Sun Rui | Reynold Xin | Critical | Resolved | Fixed
Sub-task | SPARK-7226 | SPARK-6116 | Support math functions in R DataFrame | Qian Huang | Reynold Xin | Critical | Resolved | Fixed
Sub-task | SPARK-7215 | SPARK-6116 | Make repartition and coalesce a part of the query plan | Burak Yavuz | Burak Yavuz | Critical | Resolved | Fixed
Sub-task | SPARK-7188 | SPARK-6116 | Support math functions in DataFrames in Python | Burak Yavuz | Reynold Xin | Major | Resolved | Fixed
Sub-task | SPARK-7158 | SPARK-6116 | collect and take return different results | Cheng Hao | Reynold Xin | Blocker | Resolved | Fixed
Sub-task | SPARK-7157 | SPARK-6116 | Add approximate stratified sampling to DataFrame | Xiangrui Meng | Joseph K. Bradley | Minor | Resolved | Fixed
Sub-task | SPARK-7156 | SPARK-6116 | Add randomSplit method to DataFrame | Burak Yavuz | Joseph K. Bradley | Minor | Resolved | Fixed
Sub-task | SPARK-7152 | SPARK-6116 | Add a Column expression for partition ID | Reynold Xin | Reynold Xin | Major | Resolved | Fixed
Sub-task | SPARK-7151 | SPARK-6116 | Correlation methods for DataFrame | Burak Yavuz | Joseph K. Bradley | Minor | Closed | Duplicate
Sub-task | SPARK-7150 | SPARK-6116 | SQLContext.range() | Adrian Wang | Joseph K. Bradley | Minor | Resolved | Fixed
Sub-task | SPARK-7135 | SPARK-6116 | Expression for monotonically increasing IDs | Reynold Xin | Reynold Xin | Major | Resolved | Fixed
Sub-task | SPARK-7133 | SPARK-6116 | Implement struct, array, and map field accessor using apply in Scala and __getitem__ in Python | Wenchen Fan | Reynold Xin | Blocker | Resolved | Fixed
Sub-task | SPARK-7118 | SPARK-6116 | Add coalesce Spark SQL function to PySpark API | Olivier Girardot | Olivier Girardot | Minor | Resolved | Fixed
Sub-task | SPARK-7073 | SPARK-6116 | Clean up Python data type hierarchy | Davies Liu | Reynold Xin | Major | Resolved | Fixed
Sub-task | SPARK-7069 | SPARK-6116 | Rename NativeType -> AtomicType | Reynold Xin | Reynold Xin | Major | Resolved | Fixed
Sub-task | SPARK-7068 | SPARK-6116 | Remove PrimitiveType | Reynold Xin | Reynold Xin | Major | Resolved | Fixed
Sub-task | SPARK-7060 | SPARK-6116 | Missing alias function on Python DataFrame | Yin Huai | Yin Huai | Major | Resolved | Fixed
Sub-task | SPARK-7059 | SPARK-6116 | Create a DataFrame join API to facilitate equijoin and self join | Reynold Xin | Reynold Xin | Major | Resolved | Fixed
Sub-task | SPARK-7035 | SPARK-6116 | Drop __getattr__ on pyspark.sql.DataFrame | Unassigned | Karl-Johan Wettin | Major | Closed | Won't Fix
Sub-task | SPARK-6970 | SPARK-6116 | Document what the options: Map[String, String] does on DataFrame.save and DataFrame.saveAsTable | Unassigned | John Muller | Trivial | Closed | Won't Fix
Sub-task | SPARK-6876 | SPARK-6116 | DataFrame.na.replace value support for Python | Adrian Wang | Reynold Xin | Major | Resolved | Fixed
Sub-task | SPARK-6865 | SPARK-6116 | Decide on semantics for string identifiers in DataFrame API | Reynold Xin | Michael Armbrust | Blocker | Resolved | Fixed
Sub-task | SPARK-6829 | SPARK-6116 | Support math functions in DataFrames | Burak Yavuz | Xiangrui Meng | Blocker | Resolved | Fixed
Sub-task | SPARK-6623 | SPARK-6116 | Alias DataFrame.na.fill/drop in Python | Reynold Xin | Reynold Xin | Major | Resolved | Fixed
Sub-task | SPARK-6608 | SPARK-6116 | Make DataFrame.rdd a lazy val | Cheng Lian | Cheng Lian | Minor | Resolved | Fixed
Sub-task | SPARK-6603 | SPARK-6116 | SQLContext.registerFunction -> SQLContext.udf.register | Davies Liu | Reynold Xin | Major | Resolved | Fixed
Sub-task | SPARK-6564 | SPARK-6116 | SQLContext.emptyDataFrame should contain 0 rows, not 1 row | Reynold Xin | Reynold Xin | Blocker | Resolved | Fixed
Sub-task | SPARK-6563 | SPARK-6116 | DataFrame.fillna | Reynold Xin | Reynold Xin | Major | Resolved | Fixed
Sub-task | SPARK-6562 | SPARK-6116 | DataFrame.na.replace value support in Scala/Java | Reynold Xin | Reynold Xin | Major | Resolved | Fixed
Sub-task | SPARK-6293 | SPARK-6116 | SQLContext.implicits should provide automatic conversion for RDD[Row] | Unassigned | Joseph K. Bradley | Major | Closed | Won't Fix
Sub-task | SPARK-6292 | SPARK-6116 | Add RDD methods to DataFrame to preserve schema | Joseph K. Bradley | Joseph K. Bradley | Major | Resolved | Duplicate
Sub-task | SPARK-6231 | SPARK-6116 | Join on two tables (generated from same one) is broken | Reynold Xin | Davies Liu | Critical | Resolved | Fixed
Sub-task | SPARK-6119 | SPARK-6116 | DataFrame.dropna support | Reynold Xin | Reynold Xin | Major | Resolved | Fixed
Sub-task | SPARK-6117 | SPARK-6116 | describe function for summary statistics | Andrey Zagrebin | Reynold Xin | Major | Resolved | Fixed
Sub-task | SPARK-5632 | SPARK-6116 | not able to resolve dot('.') in field name | Wenchen Fan | Lishu Liu | Blocker | Resolved | Fixed
Sub-task | SPARK-5295 | SPARK-6116 | Stabilize data types | Reynold Xin | Reynold Xin | Major | Resolved | Fixed
Sub-task | SPARK-5288 | SPARK-6116 | Stabilize Spark SQL data type API followup | Reynold Xin | Yin Huai | Major | Resolved | Fixed
Sub-task | SPARK-4867 | SPARK-6116 | UDF clean up | Reynold Xin | Michael Armbrust | Blocker | Resolved | Fixed
