[SPARK-29048] Query optimizer slow when using Column.isInCollection() with a large collection

Details

    • Type: Improvement
    • Status: In Progress
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 2.4.4, 2.4.5, 2.4.6
    • Fix Version/s: None
    • Component/s: SQL
    • Labels: None

    Description

      The query optimizer becomes very slow when Column.isInCollection() is called with a large collection.

      While the optimizer is running, the Spark UI shows only "Running commands". Depending on how many values the collection contains, this phase can take anywhere from tens of minutes to 11 hours.
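
      A minimal sketch of the pattern described above, next to one possible workaround. It assumes the slowdown grows with the number of literal values that isInCollection() places into the query plan, which this ticket does not confirm. The object name, helper names, column name "id", and collection sizes are hypothetical and chosen only for illustration.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.{broadcast, col}

object IsInCollectionSketch {

  // Pattern from the report: the whole collection is passed to
  // Column.isInCollection(), so every value ends up in the query plan.
  def filterWithIsInCollection(df: DataFrame, ids: Seq[Long]): DataFrame =
    df.filter(col("id").isInCollection(ids))

  // Possible workaround (an assumption, not taken from the ticket):
  // turn the collection into a small DataFrame and use a left semi join,
  // so the plan holds a relation instead of one literal per value.
  def filterWithSemiJoin(spark: SparkSession, df: DataFrame, ids: Seq[Long]): DataFrame = {
    import spark.implicits._
    val idsDf = ids.distinct.toDF("id")
    df.join(broadcast(idsDf), Seq("id"), "left_semi")
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("isInCollection-sketch")
      .getOrCreate()

    // Hypothetical data: one million rows, ten thousand candidate ids.
    val df  = spark.range(0L, 1000000L).toDF("id")
    val ids = (0L until 20000L by 2L).toSeq

    val viaIn   = filterWithIsInCollection(df, ids).count()
    val viaJoin = filterWithSemiJoin(spark, df, ids).count()

    // Both approaches should select the same rows for non-null ids.
    println(s"isInCollection: $viaIn rows, semi join: $viaJoin rows")

    spark.stop()
  }
}
```

      The two filters agree only for non-null values, because IN and a semi join treat nulls differently. Whether the join actually avoids the slow optimizer phase on the affected versions would need to be measured.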

            People

              Assignee: Unassigned
              Reporter: Weichen Xu (weichenxu123)
              Votes: 0
              Watchers: 4
