Type Patch Info Key Summary Assignee Reporter Priority Status Resolution Created Updated Due Development
Sub-task SPARK-45509

SPARK-41279 Investigate the behavior difference in self-join

Unassigned Allison Wang Major Open Unresolved  
Sub-task SPARK-45001

SPARK-41279 Implement DataFrame.foreachPartition

Hyukjin Kwon Hyukjin Kwon Major Resolved Fixed  
Sub-task SPARK-45000

SPARK-41279 Implement DataFrame.foreach

Hyukjin Kwon Hyukjin Kwon Major Resolved Fixed  
Sub-task SPARK-43879

SPARK-41279 Decouple handle command and send response on server side

Unassigned Jiaan Geng Major Open Unresolved  
Sub-task SPARK-43146

SPARK-41279 Implement eager evaluation.

Takuya Ueshin Takuya Ueshin Major Resolved Fixed  
Sub-task SPARK-42998

SPARK-41279 Fix DataFrame.collect with null struct.

Takuya Ueshin Takuya Ueshin Major Resolved Fixed  
Sub-task SPARK-42985

SPARK-41279 Fix createDataFrame from pandas to respect session timezone.

Takuya Ueshin Takuya Ueshin Major Resolved Fixed  
Sub-task SPARK-42984

SPARK-41279 Fix test_createDataFrame_with_single_data_type.

Takuya Ueshin Takuya Ueshin Major Resolved Fixed  
Sub-task SPARK-42983

SPARK-41279 Fix the error message of createDataFrame from np.array(0)

Takuya Ueshin Takuya Ueshin Major Resolved Fixed  
Sub-task SPARK-42982

SPARK-41279 Fix createDataFrame from pandas with map type

Takuya Ueshin Takuya Ueshin Major Resolved Fixed  
Sub-task SPARK-42969

SPARK-41279 Fix the comparison the result with Arrow optimization enabled/disabled.

Takuya Ueshin Takuya Ueshin Major Resolved Fixed  
Sub-task SPARK-42929

SPARK-41279 make mapInPandas / mapInArrow support "is_barrier"

Weichen Xu Weichen Xu Major Resolved Fixed  
Sub-task SPARK-42889

SPARK-41279 Implement cache, persist, unpersist, and storageLevel

Takuya Ueshin Takuya Ueshin Major Resolved Fixed  
Sub-task SPARK-42875

SPARK-41279 Fix toPandas to handle timezone and map types properly.

Takuya Ueshin Takuya Ueshin Major Resolved Fixed  
Sub-task SPARK-42848

SPARK-41279 Implement DataFrame.registerTempTable

Takuya Ueshin Takuya Ueshin Major Resolved Fixed  
Sub-task SPARK-42615

SPARK-41279 Refactor the AnalyzePlan RPC and add `session.version`

Ruifeng Zheng Ruifeng Zheng Major Resolved Fixed  
Sub-task SPARK-42431

SPARK-41279 Union avoid calling `output` before analysis

Ruifeng Zheng Ruifeng Zheng Major Resolved Fixed  
Sub-task SPARK-42378

SPARK-41279 Make `DataFrame.select` support `a.*`

Ruifeng Zheng Ruifeng Zheng Major Resolved Fixed  
Sub-task SPARK-42367

SPARK-41279 DataFrame.drop should handle duplicated columns properly

Ruifeng Zheng Ruifeng Zheng Major Resolved Fixed  
Sub-task SPARK-42338

SPARK-41279 Different exception in DataFrame.sample

Takuya Ueshin Takuya Ueshin Major Resolved Fixed  
Sub-task SPARK-42267

SPARK-41279 Support left_outer join

Ruifeng Zheng Xinrong Meng Major Resolved Fixed  
Sub-task SPARK-42265

SPARK-41279 DataFrame.createTempView - SparkConnectGrpcException: requirement failed

Takuya Ueshin Xinrong Meng Major Resolved Fixed  
Sub-task SPARK-42213

SPARK-41279 Failed to test ClientE2ETestSuite with maven

Yang Jie Yang Jie Major Resolved Fixed  
Sub-task SPARK-42089

SPARK-41279 Different result in nested lambda function

Ruifeng Zheng Ruifeng Zheng Major Resolved Fixed  
Sub-task SPARK-42085

SPARK-41279 Make `from_arrow_schema` support nested types

Ruifeng Zheng Ruifeng Zheng Major Resolved Fixed  
Sub-task SPARK-42076

SPARK-41279 Factor data conversion `arrow -> rows` out to `conversion.py`

Ruifeng Zheng Ruifeng Zheng Major Resolved Fixed  
Sub-task SPARK-41988

SPARK-41279 Fix map_filter and map_zip_with output order

Jiaan Geng Jiaan Geng Major Resolved Fixed  
Sub-task SPARK-41987

SPARK-41279 createDataFrame supports column with map type.

Unassigned Jiaan Geng Major Resolved Resolved  
Sub-task SPARK-41963

SPARK-41279 Different exception message in DataFrame.unpivot

Takuya Ueshin Hyukjin Kwon Major Resolved Fixed  
Sub-task SPARK-41957

SPARK-41279 Enable the doctest for `DataFrame.hint`

Ruifeng Zheng Ruifeng Zheng Major Resolved Fixed  
Sub-task SPARK-41945

SPARK-41279 Python: connect client lost column data with pyarrow.Table.to_pylist

Jiaan Geng Jiaan Geng Major Resolved Fixed  
Sub-task SPARK-41936

SPARK-41279 Make `withMetadata` reuse the `withColumns` proto

Ruifeng Zheng Ruifeng Zheng Major Resolved Fixed  
Sub-task SPARK-41927

SPARK-41279 Add the unsupported list for `GroupedData`

Ruifeng Zheng Ruifeng Zheng Major Resolved Fixed  
Sub-task SPARK-41924

SPARK-41279 Make StructType support metadata and Implement `DataFrame.withMetadata`

Ruifeng Zheng Ruifeng Zheng Major Resolved Fixed  
Sub-task SPARK-41923

SPARK-41279 Add `DataFrame.writeTo` to the unsupported list

Ruifeng Zheng Ruifeng Zheng Minor Resolved Fixed  
Sub-task SPARK-41922

SPARK-41279 Implement DataFrame `semanticHash`

Unassigned Sandeep Singh Major Resolved Duplicate  
Sub-task SPARK-41907

SPARK-41279 Function `sampleby` return parity

Jiaan Geng Sandeep Singh Major Resolved Fixed  
Sub-task SPARK-41906

SPARK-41279 Handle Function `rand() `

Hyukjin Kwon Sandeep Singh Major Resolved Fixed  
Sub-task SPARK-41905

SPARK-41279 Function `slice` should handle string in params

Hyukjin Kwon Sandeep Singh Major Resolved Fixed  
Sub-task SPARK-41904

SPARK-41279 Fix Function `nth_value` functions output

Unassigned Sandeep Singh Major Resolved Won't Fix  
Sub-task SPARK-41902

SPARK-41279 Parity in String representation of higher_order_function's output

Ruifeng Zheng Sandeep Singh Major Resolved Fixed  
Sub-task SPARK-41901

SPARK-41279 Parity in String representation of Column

Hyukjin Kwon Sandeep Singh Major Resolved Fixed  
Sub-task SPARK-41899

SPARK-41279 DataFrame.createDataFrame converting int to bigint

Ruifeng Zheng Sandeep Singh Major Resolved Fixed  
Sub-task SPARK-41898

SPARK-41279 Window.rowsBetween should handle `float("-inf")` and `float("+inf")` as argument

Sandeep Singh Sandeep Singh Major Resolved Fixed  
Sub-task SPARK-41897

SPARK-41279 Parity in Error types between pyspark and connect functions

Sandeep Singh Sandeep Singh Major Resolved Fixed  
Sub-task SPARK-41888

SPARK-41279 Support StreamingQueryListener for DataFrame.observe

Jiaan Geng Jiaan Geng Major Resolved Fixed  
Sub-task SPARK-41887

SPARK-41279 Support DataFrame hint parameter to be list

Ruifeng Zheng Sandeep Singh Major Resolved Fixed  
Sub-task SPARK-41886

SPARK-41279 `DataFrame.intersect` doctest output has different order

Jiaan Geng Ruifeng Zheng Major Resolved Fixed  
Sub-task SPARK-41884

SPARK-41279 DataFrame `toPandas` parity in return types

Hyukjin Kwon Sandeep Singh Major Resolved Fixed  
Sub-task SPARK-41881

SPARK-41279 `DataFrame.collect` should handle None/NaN properly

Ruifeng Zheng Ruifeng Zheng Major Resolved Fixed  
Sub-task SPARK-41879

SPARK-41279 `DataFrame.collect` should support nested types

Apache Spark Ruifeng Zheng Major Resolved Fixed  
Sub-task SPARK-41876

SPARK-41279 Implement DataFrame `toLocalIterator`

Takuya Ueshin Sandeep Singh Major Resolved Fixed  
Sub-task SPARK-41875

SPARK-41279 Throw proper errors in Dataset.to()

Jiaan Geng Sandeep Singh Major Resolved Fixed  
Sub-task SPARK-41874

SPARK-41279 Implement DataFrame `sameSemantics`

Unassigned Sandeep Singh Major Resolved Fixed  
Sub-task SPARK-41873

SPARK-41279 Implement DataFrame `pandas_api`

Sandeep Singh Sandeep Singh Major Resolved Fixed  
Sub-task SPARK-41871

SPARK-41279 DataFrame hint parameter can be str, float or int

Sandeep Singh Sandeep Singh Major Resolved Fixed  
Sub-task SPARK-41869

SPARK-41279 DataFrame dropDuplicates should throw error on non list argument

Hyukjin Kwon Sandeep Singh Major Resolved Fixed  
Sub-task SPARK-41850

SPARK-41279 Fix `isnan` function

Ruifeng Zheng Sandeep Singh Major Resolved Fixed  
Sub-task SPARK-41847

SPARK-41279 DataFrame mapfield,structlist invalid type

Ruifeng Zheng Sandeep Singh Major Resolved Fixed  
Sub-task SPARK-41846

SPARK-41279 DataFrame windowspec functions : unresolved columns

Ruifeng Zheng Sandeep Singh Major Resolved Fixed  
Sub-task SPARK-41840

SPARK-41279 DataFrame.show(): 'Column' object is not callable

Ruifeng Zheng Sandeep Singh Major Resolved Fixed  
Sub-task SPARK-41838

SPARK-41279 DataFrame.show() fix map printing

Ruifeng Zheng Sandeep Singh Major Resolved Fixed  
Sub-task SPARK-41837

SPARK-41279 DataFrame.createDataFrame datatype conversion error

Ruifeng Zheng Sandeep Singh Major Resolved Fixed  
Sub-task SPARK-41833

SPARK-41279 DataFrame.collect() output parity with pyspark

Ruifeng Zheng Sandeep Singh Major Resolved Fixed  
Sub-task SPARK-41832

SPARK-41279 DataFrame.unionByName output is wrong

Sandeep Singh Sandeep Singh Major Resolved Fixed  
Sub-task SPARK-41831

SPARK-41279 DataFrame.transform: Only Column or String can be used for projections

Ruifeng Zheng Sandeep Singh Major Resolved Fixed  
Sub-task SPARK-41830

SPARK-41279 Fix DataFrame.sample parameters

Sandeep Singh Sandeep Singh Major Resolved Fixed  
Sub-task SPARK-41829

SPARK-41279 Implement Dataframe.sort,sortWithinPartitions Ordering

Ruifeng Zheng Sandeep Singh Major Resolved Fixed  
Sub-task SPARK-41827

SPARK-41279 DataFrame.groupBy requires all cols be Column or str

Ruifeng Zheng Sandeep Singh Major Resolved Fixed  
Sub-task SPARK-41825

SPARK-41279 DataFrame.show formatting int as double

Ruifeng Zheng Sandeep Singh Major Resolved Fixed  
Sub-task SPARK-41824

SPARK-41279 Implement DataFrame.explain format to be similar to PySpark

Jiaan Geng Sandeep Singh Major Resolved Fixed  
Sub-task SPARK-41823

SPARK-41279 DataFrame.join creating ambiguous column names

Ruifeng Zheng Sandeep Singh Major Resolved Duplicate  
Sub-task SPARK-41821

SPARK-41279 Fix DataFrame.describe

Jiaan Geng Sandeep Singh Major Resolved Fixed  
Sub-task SPARK-41820

SPARK-41279 DataFrame.createOrReplaceGlobalTempView - SparkConnectException: requirement failed

Takuya Ueshin Sandeep Singh Major Resolved Fixed  
Sub-task SPARK-41819

SPARK-41279 Implement Dataframe.rdd getNumPartitions

Unassigned Sandeep Singh Major Resolved Invalid  
Sub-task SPARK-41812

SPARK-41279 DataFrame.join: ambiguous column

Ruifeng Zheng Hyukjin Kwon Major Resolved Fixed  
Sub-task SPARK-41785

SPARK-41279 Implement `GroupedData.mean`

Ruifeng Zheng Ruifeng Zheng Major Resolved Fixed  
Sub-task SPARK-41779

SPARK-41279 Make getitem support filter and select

Ruifeng Zheng Ruifeng Zheng Major Resolved Fixed  
Sub-task SPARK-41749

SPARK-41279 Support multiple arguments in groupBy.sum(...)

Apache Spark Hyukjin Kwon Major Resolved Fixed  
Sub-task SPARK-41748

SPARK-41279 Support multiple arguments in groupBy.min(...)

Hyukjin Kwon Hyukjin Kwon Major Resolved Fixed  
Sub-task SPARK-41747

SPARK-41279 Support multiple arguments in groupBy.avg(...)

Hyukjin Kwon Hyukjin Kwon Major Resolved Fixed  
Sub-task SPARK-41744

SPARK-41279 Support multiple arguments in groupBy.max(...)

Hyukjin Kwon Hyukjin Kwon Major Resolved Fixed  
Sub-task SPARK-41743

SPARK-41279 groupBy(...).agg(...).sort does not actually sort the output

Martin Grund Hyukjin Kwon Major Resolved Fixed  
Sub-task SPARK-41742

SPARK-41279 Support star in groupBy.agg()

Ruifeng Zheng Hyukjin Kwon Major Resolved Fixed  
Sub-task SPARK-41737

SPARK-41279 Implement `GroupedData.{min, max, avg, sum}`

Ruifeng Zheng Ruifeng Zheng Major Resolved Fixed  
Sub-task SPARK-41736

SPARK-41279 pyspark_types_to_proto_types should supports ArrayType

Jiaan Geng Jiaan Geng Major Resolved Fixed  
Sub-task SPARK-41717

SPARK-41279 Implement the command logic for print and _repr_html_

Hyukjin Kwon Hyukjin Kwon Major Resolved Fixed  
Sub-task SPARK-41706

SPARK-41279 pyspark_types_to_proto_types should supports MapType

Jiaan Geng Jiaan Geng Major Resolved Fixed  
Sub-task SPARK-41693

SPARK-41279 Implement `GroupedData.pivot`

Ruifeng Zheng Ruifeng Zheng Major Resolved Fixed  
Sub-task SPARK-41692

SPARK-41279 implement `DataFrame.rollup`

Ruifeng Zheng Ruifeng Zheng Major Resolved Fixed  
Sub-task SPARK-41681

SPARK-41279 Factor GroupedData out to group.py

Hyukjin Kwon Hyukjin Kwon Major Resolved Fixed  
Sub-task SPARK-41546

SPARK-41279 pyspark_types_to_proto_types should supports StructType.

Jiaan Geng Jiaan Geng Major Resolved Fixed  
Sub-task SPARK-41527

SPARK-41279 Implement DataFrame.observe

Jiaan Geng Hyukjin Kwon Major Resolved Fixed  
Sub-task SPARK-41464

SPARK-41279 Implement DataFrame.to

Jiaan Geng Ruifeng Zheng Major Resolved Fixed  
Sub-task SPARK-41453

SPARK-41279 Implement DataFrame.subtract

Jiaan Geng Tom van Bussel Major Resolved Fixed  
Sub-task SPARK-41440

SPARK-41279 Implement DataFrame.randomSplit

Jiaan Geng Ruifeng Zheng Major Resolved Fixed  
Sub-task SPARK-41439

SPARK-41279 Implement `DataFrame.melt` and `DataFrame.unpivot`

Ruifeng Zheng Ruifeng Zheng Major Resolved Fixed  
Sub-task SPARK-41438

SPARK-41279 Implement DataFrame.colRegex

Ruifeng Zheng Ruifeng Zheng Major Resolved Fixed  
Sub-task SPARK-41403

SPARK-41279 Implement DataFrame.describe

Jiaan Geng Ruifeng Zheng Major Resolved Fixed  
Sub-task SPARK-41384

SPARK-41279 Should use SQLExpression for str arguments in Projection

Unassigned Rui Wang Major Resolved Duplicate  
Sub-task SPARK-41383

SPARK-41279 Implement `DataFrame.cube`

Ruifeng Zheng Ruifeng Zheng Major Resolved Fixed  
Sub-task SPARK-41372

SPARK-41279 Support DataFrame TempView

Rui Wang Rui Wang Major Resolved Resolved  
Sub-task SPARK-41366

SPARK-41279 DF.groupby.agg() API should be compatible

Martin Grund Martin Grund Major Resolved Fixed  
Sub-task SPARK-41362

SPARK-41279 Better type errors when passing wrong parameters

Unassigned Martin Grund Major In Progress Unresolved  
Sub-task SPARK-41354

SPARK-41279 Implement `DataFrame.repartitionByRange`

Deng Ziming Xinrong Meng Major Resolved Fixed  
Sub-task SPARK-41349

SPARK-41279 Implement `DataFrame.hint`

Deng Ziming Ruifeng Zheng Major Resolved Fixed  
Sub-task SPARK-41333

SPARK-41279 Make `Groupby.{min, max, sum, avg, mean}` compatible with PySpark

Ruifeng Zheng Ruifeng Zheng Major Resolved Fixed  
Sub-task SPARK-41331

SPARK-41279 Add orderBy and drop_duplicates

Ruifeng Zheng Ruifeng Zheng Major Resolved Fixed  
Sub-task SPARK-41330

SPARK-41279 Improve Documentation for Take,Tail, Limit and Offset

Rui Wang Rui Wang Major Resolved Fixed  
Sub-task SPARK-41326

SPARK-41279 Bug in Deduplicate Python transformation

Martin Grund Martin Grund Major Resolved Fixed  
Sub-task SPARK-41325

SPARK-41279 Add missing avg() to DF group

Martin Grund Martin Grund Major Resolved Fixed  
Sub-task SPARK-41315

SPARK-41279 Implement `DataFrame.replace ` and `DataFrame.na.replace `

Ruifeng Zheng Ruifeng Zheng Major Resolved Fixed  
Sub-task SPARK-41312

SPARK-41279 Implement DataFrame.withColumnRenamed

Rui Wang Rui Wang Critical Resolved Fixed  
Sub-task SPARK-41310

SPARK-41279 Implement DataFrame.toDF

Rui Wang Rui Wang Major Resolved Resolved  
Sub-task SPARK-41308

SPARK-41279 Improve `DataFrame.count()`

Rui Wang Rui Wang Major Resolved Fixed  
Sub-task SPARK-41304

SPARK-41279 Add missing docs for DataFrame API

Rui Wang Rui Wang Major Resolved Fixed  
Sub-task SPARK-41297

SPARK-41279 Support string sql expressions in DF.where()

Martin Grund Martin Grund Blocker Resolved Fixed  
Sub-task SPARK-41291

SPARK-41279 `DataFrame.explain` should print and return None

Ruifeng Zheng Ruifeng Zheng Minor Resolved Fixed  
Sub-task SPARK-41256

SPARK-41279 Implement DataFrame.withColumn(s)

Rui Wang Rui Wang Blocker Resolved Fixed  
Sub-task SPARK-41250

SPARK-41279 DataFrame.to_pandas should not return optional pandas dataframe

Rui Wang Rui Wang Major Resolved Fixed  
Sub-task SPARK-41230

SPARK-41279 Remove `str` from Aggregate expression type

Rui Wang Rui Wang Major Resolved Fixed  
Sub-task SPARK-41227

SPARK-41279 Implement DataFrame cross join

Xinrong Meng Xinrong Meng Major Resolved Fixed  
Sub-task SPARK-41216

SPARK-41279 Make AnalyzePlan support multiple analysis tasks

Ruifeng Zheng Ruifeng Zheng Major Resolved Fixed  
Sub-task SPARK-41213

SPARK-41279 Implement `DataFrame.__repr__` and `DataFrame.dtypes`

Ruifeng Zheng Ruifeng Zheng Major Resolved Fixed  
Sub-task SPARK-41212

SPARK-41279 Implement `DataFrame.isEmpty`

Ruifeng Zheng Ruifeng Zheng Major Resolved Fixed  
Sub-task SPARK-41203

SPARK-41279 Dataframe.transform in Python client support

Martin Grund Martin Grund Major Resolved Fixed  
Sub-task SPARK-41201

SPARK-41279 Implement `DataFrame.SelectExpr` in Python client

Rui Wang Rui Wang Major Resolved Fixed  
Sub-task SPARK-41169

SPARK-41279 Implement `DataFrame.drop`

Ruifeng Zheng Ruifeng Zheng Major Resolved Fixed  
Sub-task SPARK-41164

SPARK-41279 Update relations.proto to follow Connect Proto development guidance

Rui Wang Rui Wang Major Resolved Fixed  
Sub-task SPARK-41157

SPARK-41279 Show detailed differences in dataframe comparison

Ruifeng Zheng Ruifeng Zheng Major Resolved Fixed  
Sub-task SPARK-41148

SPARK-41279 Implement `DataFrame.dropna ` and `DataFrame.na.drop `

Ruifeng Zheng Ruifeng Zheng Major Resolved Fixed  
Sub-task SPARK-41128

SPARK-41279 Implement `DataFrame.fillna ` and `DataFrame.na.fill `

Ruifeng Zheng Ruifeng Zheng Major Resolved Fixed  
Sub-task SPARK-41127

SPARK-41279 Implement DataFrame.CreateGlobalView in Python client

Rui Wang Rui Wang Major Resolved Fixed  
Sub-task SPARK-41122

SPARK-41279 Explain API can support different modes

Rui Wang Rui Wang Major Resolved Fixed  
Sub-task SPARK-41116

SPARK-41279 Input relation can be optional for Project in Connect proto

Rui Wang Rui Wang Major Resolved Fixed  
Sub-task SPARK-41115

SPARK-41279 Add ClientType to proto to indicate which client sends a request

Rui Wang Rui Wang Major Resolved Fixed  
Sub-task SPARK-41111

SPARK-41279 Implement `DataFrame.show`

Ruifeng Zheng Ruifeng Zheng Major Resolved Fixed  
Sub-task SPARK-41110

SPARK-41279 Implement `DataFrame.sparkSession` in Python client

Rui Wang Rui Wang Major Resolved Fixed  
Sub-task SPARK-41105

SPARK-41279 Adopt `optional` keyword from proto3 which offers `hasXXX` to differentiate if a field is set or unset

Rui Wang Rui Wang Major Resolved Fixed  
Sub-task SPARK-41078

SPARK-41279 DataFrame `withColumnsRenamed` can be implemented through `RenameColumns` proto

Rui Wang Rui Wang Major Resolved Fixed  
Sub-task SPARK-41069

SPARK-41279 Implement `DataFrame.approxQuantile` and `DataFrame.stat.approxQuantile`

Ruifeng Zheng Ruifeng Zheng Major Resolved Fixed  
Sub-task SPARK-41068

SPARK-41279 Implement `DataFrame.stat.corr`

Ruifeng Zheng Ruifeng Zheng Major Resolved Fixed  
Sub-task SPARK-41067

SPARK-41279 Implement `DataFrame.stat.cov`

Ruifeng Zheng Ruifeng Zheng Major Resolved Fixed  
Sub-task SPARK-41066

SPARK-41279 Implement `DataFrame.sampleBy ` and `DataFrame.stat.sampleBy `

Ruifeng Zheng Ruifeng Zheng Major Resolved Fixed  
Sub-task SPARK-41065

SPARK-41279 Implement `DataFrame.freqItems ` and `DataFrame.stat.freqItems `

Unassigned Ruifeng Zheng Major Resolved Fixed  
Sub-task SPARK-41064

SPARK-41279 Implement `DataFrame.crosstab` and `DataFrame.stat.crosstab`

Ruifeng Zheng Ruifeng Zheng Major Resolved Fixed  
Sub-task SPARK-41061

SPARK-41279 Support SelectExpr which apply Projection by expressions in Strings in Connect DSL

Rui Wang Rui Wang Major Resolved Fixed  
Sub-task SPARK-41058

SPARK-41279 Removing unused code in connect

Deng Ziming Deng Ziming Minor Resolved Fixed  
Sub-task SPARK-41057

SPARK-41279 Support other data type conversion in the DataTypeProtoConverter

Unassigned Rui Wang Major Resolved Duplicate  
Sub-task SPARK-41046

SPARK-41279 Support CreateView in Connect DSL

Rui Wang Rui Wang Major Resolved Fixed  
Sub-task SPARK-41036

SPARK-41279 `columns` API should use `schema` API to avoid data fetching

Rui Wang Rui Wang Major Resolved Fixed  
Sub-task SPARK-41026

SPARK-41279 Support Repartition in Connect DSL

Rui Wang Rui Wang Major Resolved Fixed  
Sub-task SPARK-41010

SPARK-41279 Complete Support for Except and Intersect in Python client

Rui Wang Rui Wang Major Resolved Fixed  
Sub-task SPARK-41002

SPARK-41279 Compatible `take`, `head` and `first` API in Python client

Rui Wang Rui Wang Major Resolved Fixed  
Sub-task SPARK-40992

SPARK-41279 Support toDF(columnNames) in Connect DSL

Rui Wang Rui Wang Major Resolved Fixed  
Sub-task SPARK-40977

SPARK-41279 Complete Support for Union in Python client

Rui Wang Rui Wang Major Resolved Fixed  
Sub-task SPARK-40971

SPARK-41279 Imports more from connect proto package to avoid calling `proto.` for Connect DSL

Rui Wang Rui Wang Major Resolved Fixed  
Sub-task SPARK-40970

SPARK-41279 Support List[Column] for Join's on argument.

Rui Wang Rui Wang Major Resolved Fixed  
Sub-task SPARK-40953

SPARK-41279 Add missing `limit(n)` in DataFrame.head

Ruifeng Zheng Ruifeng Zheng Minor Resolved Fixed  
Sub-task SPARK-40949

SPARK-41279 Implement `DataFrame.sortWithinPartitions`

Ruifeng Zheng Ruifeng Zheng Major Resolved Fixed  
Sub-task SPARK-40938

SPARK-41279 Support Alias for every Relation

Rui Wang Rui Wang Major Resolved Fixed  
Sub-task SPARK-40933

SPARK-41279 Reimplement df.stat.{cov, corr} with built-in sql functions

Ruifeng Zheng Ruifeng Zheng Major Resolved Fixed  
Sub-task SPARK-40930

SPARK-41279 Support Collect() in Python client

Rui Wang Rui Wang Major Resolved Fixed  
Sub-task SPARK-40926

SPARK-41279 Refactor server side tests to only use DataFrame API

Rui Wang Rui Wang Major Resolved Fixed  
Sub-task SPARK-40917

SPARK-41279 Add a dedicated logical plan for `Summary`

Ruifeng Zheng Ruifeng Zheng Major Resolved Workaround  
Sub-task SPARK-40915

SPARK-41279 Improve `on` in Join in Python client

Rui Wang Rui Wang Major Resolved Fixed  
Sub-task SPARK-40900

SPARK-41279 Reimplement `frequentItems` with dataframe operations

Ruifeng Zheng Ruifeng Zheng Major Resolved Fixed  
Sub-task SPARK-40880

SPARK-41279 Reimplement `summary` with dataframe operations

Ruifeng Zheng Ruifeng Zheng Major Resolved Fixed  
Sub-task SPARK-40879

SPARK-41279 Support Join UsingColumns in proto

Rui Wang Rui Wang Major Resolved Fixed  
Sub-task SPARK-40877

SPARK-41279 Reimplement `crosstab` with dataframe operations

Ruifeng Zheng Ruifeng Zheng Major Resolved Fixed  
Sub-task SPARK-40875

SPARK-41279 Add .agg() to Connect DSL

Rui Wang Rui Wang Major Resolved Fixed  
Sub-task SPARK-40854

SPARK-41279 Change default serialization from 'broken' CSV to Spark DF JSON

Martin Grund Martin Grund Major Resolved Fixed  
Sub-task SPARK-40852

SPARK-41279 Implement `DataFrame.summary`

Ruifeng Zheng Ruifeng Zheng Major Resolved Fixed  
Sub-task SPARK-40839

SPARK-41279 [Python] Implement `DataFrame.sample`

Ruifeng Zheng Ruifeng Zheng Major Resolved Fixed  
Sub-task SPARK-40836

SPARK-41279 AnalyzeResult should use struct for schema

Rui Wang Rui Wang Major Resolved Fixed  
Sub-task SPARK-40823

SPARK-41279 Connect Proto should carry unparsed identifiers

Rui Wang Rui Wang Major Resolved Fixed  
Sub-task SPARK-40818

SPARK-41279 Add Intersect to Connect proto and DSL

Unassigned Hyukjin Kwon Major Resolved Duplicate  
Sub-task SPARK-40816

SPARK-41279 Python: rename LogicalPlan.collect to LogicalPlan.to_proto

Rui Wang Rui Wang Major Resolved Fixed  
Sub-task SPARK-40813

SPARK-41279 Add limit and offset to Connect DSL

Rui Wang Rui Wang Major Resolved Fixed  
Sub-task SPARK-40812

SPARK-41279 Add Deduplicate to Connect proto

Rui Wang Rui Wang Major Resolved Fixed  
Sub-task SPARK-40809

SPARK-41279 Add as(alias: String) to connect DSL

Rui Wang Rui Wang Major Resolved Fixed  
Sub-task SPARK-40780

SPARK-41279 Add WHERE to Connect proto and DSL

Rui Wang Rui Wang Major Resolved Fixed  
Sub-task SPARK-40774

SPARK-41279 Add Sample to proto and DSL

Rui Wang Rui Wang Major Resolved Fixed  
Sub-task SPARK-40743

SPARK-41279 StructType should contain a list of StructField and each field should have a name

Rui Wang Rui Wang Major Resolved Fixed  
Sub-task SPARK-40713

SPARK-41279 Improve SET operation support in the proto and the server

Rui Wang Rui Wang Major Resolved Fixed  
Sub-task SPARK-40707

SPARK-41279 Add groupby to connect DSL and test more than one grouping expressions

Rui Wang Rui Wang Major Resolved Fixed  
Sub-task SPARK-40645

SPARK-41279 Throw exception for Collect() and recommend to use toPandas()

Rui Wang Rui Wang Major Resolved Fixed  
Sub-task SPARK-40587

SPARK-41279 SELECT * shouldn't be empty project list in proto.

Rui Wang Rui Wang Major Resolved Fixed  
Sub-task SPARK-40586

SPARK-41279 Decouple plan transformation and validation on server side

Unassigned Rui Wang Major Open Unresolved  
Sub-task SPARK-40534

SPARK-41279 Extend support for Join Relation

Rui Wang Martin Grund Major Resolved Fixed  
Sub-task SPARK-40454

SPARK-41279 Initial DSL framework for protobuf testing

Rui Wang Martin Grund Major Resolved Fixed  
