  1. Spark
  2. SPARK-27761

Make UDF nondeterministic by default

Details

    • Type: Brainstorming
    • Status: Open
    • Priority: Minor
    • Resolution: Unresolved
    • Affects Version/s: 3.0.0
    • Fix Version/s: None
    • Component/s: SQL
    • Labels: None

    Description

      Opening this issue as a follow-up to a discussion on this PR for an optimization involving deterministic UDFs: https://github.com/apache/spark/pull/24593#pullrequestreview-237361795
      "We even should discuss whether all UDFs must be deterministic or non-deterministic by default."

      Today, in Spark 2.4, Scala UDFs are implicitly marked deterministic by default. To mark a UDF as nondeterministic, users must call asNondeterministic() on it.
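
      The current default can be sketched as follows. This is a minimal illustration, assuming a local SparkSession; the object name, the UDF names (plusOne, noisy), and the sample data are hypothetical and not taken from the PR under discussion.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.{col, udf}

// Hypothetical example object; names are for illustration only.
object UdfDeterminismSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[1]")
      .appName("udf-determinism-sketch")
      .getOrCreate()

    // A UDF created with udf(...) is treated as deterministic unless told otherwise.
    val plusOne = udf((x: Long) => x + 1)

    // This UDF returns a different value on each call, so it has to be
    // marked explicitly; asNondeterministic() returns an updated UDF.
    val noisy = udf((x: Long) => x + scala.util.Random.nextInt(10)).asNondeterministic()

    // The flag the optimizer consults is exposed on the UDF itself.
    println(s"plusOne.deterministic = ${plusOne.deterministic}")
    println(s"noisy.deterministic   = ${noisy.deterministic}")

    // Both UDFs are applied the same way; only the optimizer's assumptions differ.
    spark.range(5)
      .select(col("id"), plusOne(col("id")).as("plus_one"), noisy(col("id")).as("noisy"))
      .show()

    spark.stop()
  }
}
```

      If the call to asNondeterministic() were omitted for noisy, Spark would be free to treat it like any deterministic expression, for example by evaluating it more or fewer times than it appears in the query, which is the kind of surprise this issue is concerned with.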

      The concern expressed is that users are not aware of this property and its implications.

            People

              Assignee: Unassigned
              Reporter: Sunitha Kambhampati (ksunitha)
              Votes: 0
              Watchers: 7

              Dates

                Created:
                Updated: