Version 3.2.2

RELEASED

Start date not set

Released: 15/Jul/22

Release Notes

Priority | Type | Key | Summary | Assignee | Status
Blocker | Bug | SPARK-38204 | All state operators are at a risk of inconsistency between state partitioning and operator partitioning | Jungtaek Lim | Resolved
Blocker | Bug | SPARK-38652 | uploadFileUri should preserve file scheme | Dongjoon Hyun | Resolved
Blocker | Bug | SPARK-38677 | pyspark hangs in local mode running rdd map operation | Ankur Dave | Resolved
Blocker | Bug | SPARK-38684 | Stream-stream outer join has a possible correctness issue due to weakly read consistent on outer iterators | Jungtaek Lim | Resolved
Blocker | Bug | SPARK-38955 | Disable lineSep option in 'from_csv' and 'schema_of_csv' | Hyukjin Kwon | Resolved
Blocker | Bug | SPARK-39293 | The accumulator of ArrayAggregate should copy the intermediate result if string, struct, array, or map | Takuya Ueshin | Resolved
Critical | Bug | SPARK-37554 | Add PyArrow, pandas and plotly to release Docker image dependencies | Hyukjin Kwon | Resolved
Critical | Bug | SPARK-37793 | Invalid LocalMergedBlockData cause task hang | Chandni Singh | Resolved
Critical | Bug | SPARK-38542 | UnsafeHashedRelation should serialize numKeys out | Unassigned | Resolved
Critical | Bug | SPARK-38563 | Upgrade to Py4J 0.10.9.5 | Hyukjin Kwon | Resolved
Critical | Bug | SPARK-38631 | Arbitrary shell command injection via Utils.unpack() | Hyukjin Kwon | Resolved
Critical | Bug | SPARK-39283 | Spark tasks stuck forever due to deadlock between TaskMemoryManager and UnsafeExternalSorter | Sandeep Pal | Resolved
Major | Bug | SPARK-30062 | Add IMMEDIATE statement to the DB2 dialect truncate implementation | Ivan Karol | Resolved
Major | Bug | SPARK-33206 | Spark Shuffle Index Cache calculates memory usage wrong | Attila Zsolt Piros | Resolved
Major | Bug | SPARK-36553 | KMeans fails with NegativeArraySizeException for K = 50000 after issue #27758 was introduced | Ruifeng Zheng | Resolved
Major | Improvement | SPARK-36808 | Upgrade Kafka to 2.8.1 | Kousuke Saruta | Resolved
Major | Bug | SPARK-37290 | Exponential planning time in case of non-deterministic function | Kaya Kupferschmidt | Resolved
Major | Bug | SPARK-37498 | test_reuse_worker_of_parallelize_range is flaky | Yikun Jiang | Resolved
Major | Bug | SPARK-37544 | sequence over dates with month interval is producing incorrect results | Bruce Robbins | Resolved
Major | Bug | SPARK-37643 | when charVarcharAsString is true, char datatype partition table query incorrect | YuanGuanhu | Resolved
Major | Improvement | SPARK-37670 | Support predicate pushdown and column pruning for de-duped CTEs | Wei Xue | Resolved
Major | Sub-task | SPARK-37675 | Prevent overwriting of push shuffle merged files once the shuffle is finalized | Chandni Singh | Resolved
Major | Bug | SPARK-37690 | Recursive view `df` detected (cycle: `df` -> `df`) | Unassigned | Resolved
Major | Bug | SPARK-37730 | plot.hist throws AttributeError on pandas=1.3.5 | Michał Słapek | Resolved
Major | Sub-task | SPARK-37735 | Add appId interface to KubernetesConf | Yikun Jiang | Resolved
Major | Bug | SPARK-37865 | Spark should not dedup the groupingExpressions when the first child of Union has duplicate columns | Karen Feng | Resolved
Major | Sub-task | SPARK-37866 | Set file.encoding to UTF-8 for SBT tests | Dongjoon Hyun | Resolved
Major | Bug | SPARK-37932 | Analyzer can fail when join left side and right side are the same view | Zhixiong Chen | Resolved
Major | Bug | SPARK-37963 | Need to update Partition URI after renaming table in InMemoryCatalog | Gengliang Wang | Resolved
Major | Bug | SPARK-37977 | Upgrade ORC to 1.6.13 | Dongjoon Hyun | Resolved
Major | Sub-task | SPARK-37995 | TPCDS 1TB q72 fails when spark.sql.optimizer.dynamicPartitionPruning.reuseBroadcastOnly is false | XiDuo You | Resolved
Major | Improvement | SPARK-38007 | Update K8s doc to recommend K8s 1.20+ | Dongjoon Hyun | Resolved
Major | Sub-task | SPARK-38013 | AQE can change bhj to smj if no extra shuffle introduce | XiDuo You | Resolved
Major | Bug | SPARK-38018 | Fix ColumnVectorUtils.populate to handle CalendarIntervalType correctly | Cheng Su | Resolved
Major | Sub-task | SPARK-38019 | ExecutorMonitor.timedOutExecutors should be deterministic | Dongjoon Hyun | Resolved
Major | Sub-task | SPARK-38023 | ExecutorMonitor.onExecutorRemoved should handle ExecutorDecommission as finished | Dongjoon Hyun | Resolved
Major | Sub-task | SPARK-38029 | Support docker-desktop K8S integration test in SBT | William Hyun | Resolved
Major | Sub-task | SPARK-38030 | Query with cast containing non-nullable columns fails with AQE on Spark 3.1.1 | Shardul Mahadik | Resolved
Major | Bug | SPARK-38042 | Encoder cannot be found when a tuple component is a type alias for an Array | Johan Nyström-Persson | Resolved
Major | Test | SPARK-38045 | More strict validation on plan check for stream-stream join unit test | Jungtaek Lim | Resolved
Major | Improvement | SPARK-38046 | Fix KafkaSource/KafkaMicroBatch flaky test due to non-deterministic timing | Boyang Jerry Peng | Resolved
Major | Sub-task | SPARK-38048 | Add IntegrationTestBackend.describePods to support all K8s test backends | William Hyun | Resolved
Major | Bug | SPARK-38056 | Structured streaming not working in history server when using LevelDB | wy | Resolved
Major | Sub-task | SPARK-38071 | Support K8s namespace parameter in SBT K8s IT | William Hyun | Resolved
Major | Sub-task | SPARK-38072 | Support K8s imageTag parameter in SBT K8s IT | William Hyun | Resolved
Major | Bug | SPARK-38073 | Update atexit function to avoid issues with late binding | Maciej Szymkiewicz | Resolved
Major | Bug | SPARK-38075 | Hive script transform with order by and limit will return fake rows | Bruce Robbins | Resolved
Major | Test | SPARK-38080 | Flaky test: StreamingQueryManagerSuite: 'awaitAnyTermination with timeout and resetTerminated' | Shixiong Zhu | Resolved
Major | Sub-task | SPARK-38081 | Support cloud-backend in K8s IT with SBT | Dongjoon Hyun | Resolved
Major | Test | SPARK-38084 | Support `SKIP_PYTHON` and `SKIP_R` in `run-tests.py` | Dongjoon Hyun | Resolved
Major | Task | SPARK-38122 | Update App Key of DocSearch | Gengliang Wang | Resolved
Major | Task | SPARK-38144 | Remove unused `spark.storage.safetyFraction` config | Dongjoon Hyun | Resolved
Major | Bug | SPARK-38151 | Handle `Pacific/Kanton` in DateTimeUtilsSuite | Dongjoon Hyun | Resolved
Major | Bug | SPARK-38178 | Correct the logic to measure the memory usage of RocksDB | Yun Tang | Resolved
Major | Sub-task | SPARK-38180 | Allow safe up-cast expressions in correlated equality predicates | Allison Wang | Resolved
Major | Improvement | SPARK-38184 | Fix malformatted ExpressionDescription of `decode` | Xinrong Meng | Resolved
Major | Bug | SPARK-38185 | Fix data incorrect if aggregate function is empty | XiDuo You | Resolved
Major | Task | SPARK-38189 | Add priority scheduling doc for Spark on K8S | Yikun Jiang | Resolved
Major | Bug | SPARK-38221 | Group by a stream of complex expressions fails | Unassigned | Resolved
Major | Bug | SPARK-38236 | Absolute file paths specified in create/alter table are treated as relative | Bo Zhang | Resolved
Major | Bug | SPARK-38271 | PoissonSampler may output more rows than MaxRows | Ruifeng Zheng | Resolved
Major | Sub-task | SPARK-38272 | Use docker-desktop instead of docker-for-desktop for Docker K8S IT deployMode and context name | Yikun Jiang | Resolved
Major | Bug | SPARK-38273 | decodeUnsafeRows's iterators should close underlying input streams | Kevin Sewell | Resolved
Major | Improvement | SPARK-38279 | Pin markupsafe to 2.0.1 fix linter failure | Haejoon Lee | Resolved
Major | Bug | SPARK-38285 | ClassCastException: GenericArrayData cannot be cast to InternalRow | L. C. Hsieh | Resolved
Major | Bug | SPARK-38286 | Union's maxRows and maxRowsPerPartition may overflow | Ruifeng Zheng | Resolved
Major | Test | SPARK-38297 | Fix mypy failure on DataFrame.to_numpy in pandas API on Spark | Hyukjin Kwon | Resolved
Major | Dependency upgrade | SPARK-38303 | Upgrade ansi-regex from 5.0.0 to 5.0.1 in /dev | Bjørn Jørgensen | Resolved
Major | Bug | SPARK-38309 | SHS has incorrect percentiles for shuffle read bytes and shuffle total blocks metrics | Rob Reeves | Resolved
Major | Task | SPARK-38318 | regression when replacing a dataset view | Linhong Liu | Resolved
Major | Bug | SPARK-38320 | (flat)MapGroupsWithState can timeout groups which just received inputs in the same microbatch | Alex Balikov | Resolved
Major | Sub-task | SPARK-38325 | ANSI mode: avoid potential runtime error in HashJoin.extractKeyExprAt() | Gengliang Wang | Resolved
Major | Bug | SPARK-38333 | DPP cause DataSourceScanExec java.lang.NullPointerException | jiahong.li | Resolved
Major | Bug | SPARK-38347 | Nullability propagation in transformUpWithNewOutput | Yingyi Bu | Resolved
Major | Sub-task | SPARK-38363 | Avoid runtime error in Dataset.summary() when ANSI mode is on | Gengliang Wang | Resolved
Major | Bug | SPARK-38379 | Fix Kubernetes Client mode when mounting persistent volume with storage class | Thomas Graves | Resolved
Major | Sub-task | SPARK-38398 | Add `priorityClassName` integration test case | Dongjoon Hyun | Resolved
Major | Sub-task | SPARK-38407 | ANSI Cast: loosen the limitation of casting non-null complex types | Gengliang Wang | Resolved
Major | Bug | SPARK-38411 | Use UTF-8 when doMergeApplicationListingInternal reads event logs | Cheng Pan | Resolved
Major | Bug | SPARK-38412 | `from` and `to` is swapped in the StateSchemaCompatibilityChecker | Jungtaek Lim | Resolved
Major | Sub-task | SPARK-38430 | Add SBT commands to K8s IT readme | William Hyun | Resolved
Major | Bug | SPARK-38446 | Deadlock between ExecutorClassLoader and FileDownloadCallback caused by Log4j | Kent Yao 2 | Resolved
Major | Improvement | SPARK-38487 | Fix docstrings of nlargest/nsmallest of DataFrame | Xinrong Meng | Resolved
Major | Bug | SPARK-38517 | Fix PySpark documentation generation (missing ipython_genutils) | Hyukjin Kwon | Resolved
Major | Bug | SPARK-38528 | NullPointerException when selecting a generator in a Stream of aggregate expressions | Bruce Robbins | Resolved
Major | Bug | SPARK-38579 | Requesting Restful API can cause NullPointerException | Yimin Yang | Resolved
Major | Bug | SPARK-38587 | Validating new location for rename command should use formatted names | Kent Yao 2 | Resolved
Major | Bug | SPARK-38614 | Don't push down limit through window that's using percent_rank | Bruce Robbins | Resolved
Major | Bug | SPARK-38655 | OffsetWindowFunctionFrameBase cannot find the offset row whose input is not null | Jiaan Geng | Resolved
Major | Sub-task | SPARK-38787 | Possible correctness issue on stream-stream join when handling edge case | Anish Shrigondekar | Resolved
Major | Bug | SPARK-38807 | Error when starting spark shell on Windows system | Ming Li | Resolved
Major | Sub-task | SPARK-38809 | Implement option to skip null values in symmetric hash impl of stream-stream joins | Anish Shrigondekar | Resolved
Major | Bug | SPARK-38830 | Warn on corrupted block messages | Dongjoon Hyun | Resolved
Major | Bug | SPARK-38868 | `assert_true` fails unconditionnaly after `left_outer` joins | Bruce Robbins | Resolved
Major | Improvement | SPARK-38892 | Fix the UT of schema equal assert | YuanGuanhu | Resolved
Major | Bug | SPARK-38905 | Upgrade ORC to 1.6.14 | Dongjoon Hyun | Resolved
Major | Bug | SPARK-38922 | TaskLocation.apply throw NullPointerException | Kent Yao 2 | Resolved
Major | Test | SPARK-38927 | Skip NumPy/Pandas tests in `test_rdd.py` if not available | William Hyun | Resolved
Major | Bug | SPARK-38931 | RocksDB File manager would not create initial dfs directory with unknown number of keys on 1st empty checkpoint | Yun Tang | Resolved
Major | Bug | SPARK-38977 | Fix schema pruning with correlated subqueries | Anton Okolnychyi | Resolved
Major | Bug | SPARK-38992 | Avoid using bash -c in ShellBasedGroupsMappingProvider | Hyukjin Kwon | Resolved
Major | Improvement | SPARK-39030 | Rename sum to avoid shading the builtin Python function | Bjørn Jørgensen | Resolved
Major | Bug | SPARK-39060 | Typo in error messages of decimal overflow | Vitalii Li | Resolved
Major | Bug | SPARK-39061 | Incorrect results or NPE when using Inline function against an array of dynamically created structs | Bruce Robbins | Resolved
Major | Bug | SPARK-39083 | Fix FsHistoryProvider race condition between update and clean app data | Vu Tan | Resolved
Major | Bug | SPARK-39084 | df.rdd.isEmpty() results in unexpected executor failure and JVM crash | Ivan Sadikov | Resolved
Major | Dependency upgrade | SPARK-39099 | Add dependencies to Dockerfile for building Spark releases | Max Gekk | Resolved
Major | Bug | SPARK-39104 | Null Pointer Exeption on unpersist call | Cheng Pan | Resolved
Major | Bug | SPARK-39107 | Silent change in regexp_replace's handling of empty strings | Lorenzo Martini | Resolved
Major | Improvement | SPARK-39154 | Remove outdated statements on distributed-sequence default index | Xinrong Meng | Resolved
Major | Improvement | SPARK-39174 | Catalogs loading swallows missing classname for ClassNotFoundException | Kent Yao 2 | Resolved
Major | Documentation | SPARK-39219 | Promote Structured Streaming over Spark Streaming | Jungtaek Lim | Resolved
Major | Test | SPARK-39252 | Flaky Test: pyspark.sql.tests.test_dataframe.DataFrameTests test_df_is_empty | Ivan Sadikov | Resolved
Major | Bug | SPARK-39259 | Timestamps returned by now() and equivalent functions are not consistent in subqueries | Jan-Ole Sasse | Resolved
Major | Test | SPARK-39273 | Make PandasOnSparkTestCase inherit ReusedSQLTestCase | Hyukjin Kwon | Resolved
Major | Bug | SPARK-39340 | DS v2 agg pushdown should allow dots in the name of top-level columns | Wenchen Fan | Resolved
Major | Task | SPARK-39367 | Review and fix issues in Scala/Java API docs of SQL module | Gengliang Wang | Resolved
Major | Test | SPARK-39373 | Recover branch-3.2 build broken by SPARK-39273 and SPARK-39252 | Hyukjin Kwon | Resolved
Major | Bug | SPARK-39376 | Do not output duplicated columns in star expansion of subquery alias of NATURAL/USING JOIN | Karen Feng | Resolved
Major | Bug | SPARK-39393 | Parquet data source only supports push-down predicate filters for non-repeated primitive types | Amin Borjian | Resolved
Major | Bug | SPARK-39419 | When the comparator of ArraySort returns null, it should fail. | Takuya Ueshin | Resolved
Major | Bug | SPARK-39421 | Sphinx build fails with "node class 'meta' is already registered, its visitors will be overridden" | Hyukjin Kwon | Resolved
Major | Bug | SPARK-39437 | normalize plan id separately in PlanStabilitySuite | Wenchen Fan | Resolved
Major | Bug | SPARK-39447 | Only non-broadcast query stage can propagate empty relation | XiDuo You | Resolved
Major | Bug | SPARK-39496 | Inline eval path cannot handle null structs | Bruce Robbins | Resolved
Major | Bug | SPARK-39505 | Escape log content rendered in UI | Sean R. Owen | Resolved
Major | Bug | SPARK-39548 | CreateView Command with a window clause query hit a wrong window definition not found issue | Rui Wang | Resolved
Major | Sub-task | SPARK-39553 | Failed to remove shuffle ${shuffleId} - null when using Scala 2.13 | Yang Jie | Resolved
Major | Bug | SPARK-39570 | inline table should allow expressions with alias | Wenchen Fan | Resolved
Major | Bug | SPARK-39575 | ByteBuffer forget to rewind after get in AvroDeserializer | Frank Wong | Resolved
Major | Sub-task | SPARK-39611 | PySpark support numpy 1.23.X | Yikun Jiang | Resolved
Major | Bug | SPARK-39621 | Make run-tests.py robust by avoiding `rmtree` usage | Dongjoon Hyun | Resolved
Major | Bug | SPARK-39650 | Streaming Deduplication should not check the schema of "value" | Jungtaek Lim | Resolved
Major | Bug | SPARK-39672 | NotExists subquery failed with conflicting attributes | Manu Zhang | Resolved
Major | Documentation | SPARK-39677 | Wrong args item formatting of the regexp functions | Max Gekk | Resolved
Major | Bug | SPARK-39758 | NPE on invalid patterns from the regexp functions | Max Gekk | Resolved
Major | Bug | SPARK-40804 | Missing handling a catalog name in destination tables in `RenameTableExec` | Unassigned | In Progress
Major | Bug | SPARK-41336 | BroadcastExchange does not support the execute() code path. when AQE enabled | Unassigned | Resolved
Minor | Improvement | SPARK-37891 | Add scalastyle check to disable scala.concurrent.ExecutionContext.Implicits.global | Tianhan Hu | Resolved
Minor | Improvement | SPARK-37934 | Upgrade Jetty version to 9.4.44 | Sajith A | Resolved
Minor | Sub-task | SPARK-37998 | Use `rbac.authorization.k8s.io/v1` instead of `v1beta1` | Yikun Jiang | Resolved
Minor | Bug | SPARK-38017 | Fix the API doc for window to say it supports TimestampNTZType too as timeColumn | Kousuke Saruta | Resolved
Minor | Bug | SPARK-38120 | HiveExternalCatalog.listPartitions is failing when partition column name is upper case and dot in partition value | Khalid Mammadov | Resolved
Minor | Test | SPARK-38142 | Move ArrowColumnVectorSuite to org.apache.spark.sql.vectorized | Kazuyuki Tanimura | Resolved
Minor | Bug | SPARK-38198 | Fix `QueryExecution.debug#toFile` use the passed in `maxFields` when `explainMode` is `CodegenMode` | Yang Jie | Resolved
Minor | Improvement | SPARK-38211 | Add SQL migration guide on restoring loose upcast from string | Manu Zhang | Resolved
Minor | Bug | SPARK-38304 | Elt() should return null if index is null under ANSI mode | Gengliang Wang | Resolved
Minor | Improvement | SPARK-38353 | Instrument __enter__ and __exit__ magic methods for pandas API on Spark | Yihong He | Resolved
Minor | Sub-task | SPARK-38392 | Add `spark-` prefix to namespaces and `-driver` suffix to drivers during IT | Martin Tzvetanov Grigorov | Resolved
Minor | Sub-task | SPARK-38538 | Fix driver environment verification in BasicDriverFeatureStepSuite | Dongjoon Hyun | Resolved
Minor | Improvement | SPARK-38570 | Incorrect DynamicPartitionPruning caused by Literal | mcdull_zhang | Resolved
Minor | Test | SPARK-38786 | Test Bug in StatisticsSuite "change stats after add/drop partition command" | Kazuyuki Tanimura | Resolved
Minor | Improvement | SPARK-38816 | Wrong comment in random matrix generator in spark-als algorithm | Sean R. Owen | Resolved
Minor | Bug | SPARK-38990 | date_trunc and trunc both fail with format from column in inline table | Bruce Robbins | Resolved
Minor | Documentation | SPARK-39032 | Incorrectly formatted examples in pyspark.sql.functions.when | Vadim Patsalo | Resolved
Minor | Dependency upgrade | SPARK-39183 | Upgrade Apache Xerces Java to 2.12.2 | Bjørn Jørgensen | Resolved
Minor | Bug | SPARK-39258 | Fix `Hide credentials in show create table` after SPARK-35378 | Yang Jie | Resolved
Minor | Bug | SPARK-39422 | SHOW CREATE TABLE should suggest 'AS SERDE' for Hive tables with unsupported serde configurations | Josh Rosen | Resolved
Minor | Bug | SPARK-39476 | Disable Unwrap cast optimize when casting from Long to Float/ Double or from Integer to Float | EdisonWang | Resolved
Minor | Bug | SPARK-39543 | The option of DataFrameWriterV2 should be passed to storage properties if fallback to v1 | yikaifei | Resolved
Minor | Bug | SPARK-39551 | Add AQE invalid plan check | Wei Xue | Resolved
Trivial | Improvement | SPARK-38100 | Remove unused method in `Decimal` | Yang Jie | Resolved
Trivial | Improvement | SPARK-38305 | Check existence of file before untarring/zipping | Sean R. Owen | Resolved
Trivial | Bug | SPARK-38416 | Change day to month | Bjørn Jørgensen | Resolved
Trivial | Bug | SPARK-38436 | Fix `test_ceil` to test `ceil` | Bjørn Jørgensen | Resolved
Trivial | Documentation | SPARK-38606 | Update document to make a good guide of multiple versions of the Spark Shuffle Service | tonydoen | Resolved
Trivial | Documentation | SPARK-38629 | Two links beneath Spark SQL Guide/Data Sources do not work properly | morvenhuang | Resolved
Trivial | Improvement | SPARK-38936 | Script transform feed thread should have name | dzcxzl | Resolved
Trivial | Improvement | SPARK-39240 | Source and binary releases using different tool to generates hashes for integrity | Kent Yao 2 | Resolved
Trivial | Bug | SPARK-39355 | Single column uses quoted to construct UnresolvedAttribute | dzcxzl | Resolved