SPARK-43270

Implement __dir__() in pyspark.sql.dataframe.DataFrame to include columns


Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 3.5.0
    • Fix Version/s: 3.5.0
    • Component/s: SQL
    • Labels: None

    Description

      Currently, given df.| (cursor after the dot), the Databricks notebook suggests only the methods of the DataFrame (see the attached screenshot of a Databricks notebook).

      However, df.column_name is also legal and runnable.

      Hence we should override the inherited __dir__() method on the Python DataFrame class to include column names. The benefit is that any engine that uses dir() to generate autocomplete suggestions (e.g. the IPython kernel, Databricks notebooks) will also suggest column names when completing df.|
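
For illustration, here is a minimal sketch of the proposed override on a toy class. ToyDataFrame, its select method, and the string returned for a column are hypothetical stand-ins so the example runs without Spark; this is not the PySpark implementation. Skipping names that are not valid Python identifiers is an assumption of this sketch, made because such names cannot be typed after "df." anyway.

```python
class ToyDataFrame:
    """Hypothetical stand-in for pyspark.sql.DataFrame (not the real class)."""

    def __init__(self, columns):
        self.columns = list(columns)

    def select(self, *cols):
        # Placeholder method so dir() has a regular method to report.
        return ToyDataFrame(cols)

    def __getattr__(self, name):
        # Called only when normal attribute lookup fails; mimics df.column_name.
        if name in self.__dict__.get("columns", []):
            return f"Column<{name}>"
        raise AttributeError(name)

    def __dir__(self):
        # Start from the default listing, then add the column names.
        attrs = set(super().__dir__())
        # Assumption: only valid identifiers are useful as df.<name> completions.
        attrs.update(c for c in self.columns if c.isidentifier())
        return sorted(attrs)


df = ToyDataFrame(["name", "age", "zip code"])
suggestions = dir(df)  # what a dir()-based completer would see
```

With this in place, suggestions contains both the regular method select and the columns name and age, while "zip code" is left out because it can only be reached with df["zip code"].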

          People

            Assignee: jarviscao Beishao Cao
            Reporter: jarviscao Beishao Cao
            Votes: 0
            Watchers: 3

              Time Tracking

                Original Estimate: 24h
                Remaining Estimate: 24h
                Time Spent: Not Specified