Spark / SPARK-49409

CONNECT_SESSION_PLAN_CACHE_SIZE is too small for certain programming patterns

Details

    Description

      Example:

       

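      ```
      from pyspark.sql.functions import col

      # Two filtered DataFrames that are joined later on.
      df_1 = df_a.filter(col('X').isNotNull())
      df_2 = df_b.filter(col('SAFE_SU_Conv').isNotNull())
      ....
      df_x = ...
      # Every iteration produces a new df_x plan.
      for _ in range(0, 5):
          df_x = df_x.select(...)
      ...
      # df_1 and df_2 are reused only after the loop has run.
      df_3 = df_1.join(df_2, ...)
      ```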

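      => The plans created for df_x in the loop push every existing entry out of the session plan cache, including those for df_1 and df_2, so the cache is of no use by the time df_3 is built.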
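
The eviction pattern can be reproduced without Spark. The sketch below is a toy model only: it assumes a size-bounded cache with least-recently-used eviction, and `PlanCache`, `simulate`, and the string keys are hypothetical names made up for illustration, not Spark Connect's actual implementation.

```python
from collections import OrderedDict


class PlanCache:
    """Toy size-bounded LRU cache (hypothetical; not Spark's implementation)."""

    def __init__(self, max_size):
        self.max_size = max_size
        self._entries = OrderedDict()

    def put(self, key):
        # Insert or refresh the key, then evict the oldest entries over capacity.
        self._entries[key] = True
        self._entries.move_to_end(key)
        while len(self._entries) > self.max_size:
            self._entries.popitem(last=False)

    def contains(self, key):
        return key in self._entries


def simulate(max_size, loop_iterations=5):
    """Replay the pattern from the example: df_1, df_2, then a loop over df_x."""
    cache = PlanCache(max_size)
    cache.put("df_1")
    cache.put("df_2")
    for i in range(loop_iterations):
        cache.put(f"df_x_{i}")  # each iteration yields a distinct plan
    # Report whether df_1 and df_2 are still cached when the join is built.
    return cache.contains("df_1"), cache.contains("df_2")
```

In this model, `simulate(5)` reports that neither df_1 nor df_2 is still cached, while `simulate(7)` keeps both, which is the sense in which a small cache size interacts badly with this programming pattern.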

            People

              changgyoopark-db Changgyoo Park