[SPARK-24215] Implement eager evaluation for DataFrame APIs - ASF JIRA

Details

Type: Improvement
Status: Resolved
Priority: Major
Resolution: Fixed
Affects Version/s: 2.3.0
Fix Version/s: 2.4.0
Component/s: PySpark, Spark Core, SQL
Labels: None
Target Version/s: 2.4.0

Description

To help people who are new to Spark get feedback more easily, we should implement the repr methods used by Jupyter Python kernels. That way, when users run PySpark in a Jupyter console or notebook, they get useful feedback about the queries they've defined.

This should include an option for eager evaluation (maybe spark.jupyter.eager-eval?). When it is set, the formatting methods would run DataFrames and produce output like show. This strikes a good balance between not hiding Spark's action behavior and getting feedback to users who don't know to call actions.
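As a rough illustration of the proposal, here is a minimal pure-Python sketch of how repr methods could switch between a lazy summary and show-like output. It is not Spark's implementation: the `LazyFrame` class, its `eager_eval` flag, and `max_rows` are hypothetical names made up for this example, and the "query" is just a callable. The only real convention it relies on is that Jupyter looks for `_repr_html_` on an object and falls back to `__repr__` when that method returns None.

```python
from html import escape


class LazyFrame:
    """Toy stand-in for a DataFrame (hypothetical, not a Spark class)."""

    def __init__(self, columns, compute_rows, eager_eval=False, max_rows=20):
        self.columns = list(columns)
        self._compute_rows = compute_rows  # callable that "runs the query"
        self.eager_eval = eager_eval       # hypothetical eager-evaluation option
        self.max_rows = max_rows           # cap on rows shown when eager

    def _rows(self):
        # Calling the function plays the role of a Spark action.
        return list(self._compute_rows())[: self.max_rows]

    def __repr__(self):
        if not self.eager_eval:
            # Lazy mode: describe the frame without running anything.
            return "LazyFrame[" + ", ".join(self.columns) + "]"
        # Eager mode: run the query and print rows, similar in spirit to show.
        lines = [" | ".join(self.columns)]
        lines += [" | ".join(str(v) for v in row) for row in self._rows()]
        return "\n".join(lines)

    def _repr_html_(self):
        # Returning None tells Jupyter to fall back to __repr__.
        if not self.eager_eval:
            return None
        head = "".join(f"<th>{escape(c)}</th>" for c in self.columns)
        body = "".join(
            "<tr>" + "".join(f"<td>{escape(str(v))}</td>" for v in row) + "</tr>"
            for row in self._rows()
        )
        return f"<table><tr>{head}</tr>{body}</table>"


if __name__ == "__main__":
    data = lambda: [(1, "a"), (2, "b")]
    print(repr(LazyFrame(["id", "name"], data)))
    print(repr(LazyFrame(["id", "name"], data, eager_eval=True)))
```

With the flag off, displaying the object never triggers the computation, which keeps the action behavior visible. With it on, every display runs the query, so capping the number of rows matters. In released Spark versions the corresponding setting is `spark.sql.repl.eagerEval.enabled`, rather than the name floated in this description.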

Here's the dev list thread for context: http://apache-spark-developers-list.1001551.n3.nabble.com/eager-execution-and-debuggability-td23928.html

Issue Links

links to

[Github] Pull Request #21370 (xuanyuanking)

[Github] Pull Request #21553 (xuanyuanking)

People

Assignee: Yuanjian Li

Reporter: Ryan Blue

Votes: 0

Watchers: 10

Dates

Created: 08/May/18 23:08

Updated: 12/Dec/22 18:10

Resolved: 05/Jun/18 01:26