Spark / SPARK-38988

Pandas API - "PerformanceWarning: DataFrame is highly fragmented." gets printed many times.

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 3.3.0, 3.4.0
    • Fix Version/s: 3.3.0
    • Component/s: PySpark
    • Labels: None

    Description

      I have attached a text file and a notebook showing the messages I get when I run df.info().

      Spark master build from 13.04.22.

      df.shape
      (763300, 224)
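
The warning text quoted in the title is the message plain pandas emits when a DataFrame is built up by adding many columns one at a time. The following is a minimal sketch in plain pandas, not the pandas API on Spark, and it only assumes that the same pandas warning is what surfaces during df.info(). The helper name count_fragmentation_warnings is hypothetical and does not come from the report; the 224 mirrors the column count in df.shape.

```python
import warnings

import pandas as pd
from pandas.errors import PerformanceWarning


def count_fragmentation_warnings(n_columns):
    """Build a frame column by column and count PerformanceWarning emissions.

    Hypothetical helper for illustration only; it uses plain pandas,
    not pyspark.pandas.
    """
    df = pd.DataFrame({"c0": range(3)})
    with warnings.catch_warnings(record=True) as caught:
        # Record every emission instead of letting Python deduplicate them.
        warnings.simplefilter("always")
        for i in range(1, n_columns):
            # Each new column is inserted separately, which fragments the frame.
            df[f"c{i}"] = i
    return sum(issubclass(w.category, PerformanceWarning) for w in caught)


# Same number of columns as the frame in the report.
print(count_fragmentation_warnings(224))
```

With only a handful of columns the count stays at zero, which suggests the repeated output is tied to the width of the frame rather than to its row count.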

      Attachments

        1. info.txt
          44 kB
          Bjørn Jørgensen
        2. Untitled.html
          735 kB
          Bjørn Jørgensen
        3. warning printed.txt
          113 kB
          Bjørn Jørgensen

          People

            Assignee: Xinrong Meng (XinrongM)
            Reporter: Bjørn Jørgensen (bjornjorgensen)
            Votes: 0
            Watchers: 4
