  Apache Arrow / ARROW-9836

[Rust] [DataFusion] Improve API for usage of UDFs

Details

    Description

      TL;DR: currently, users call UDFs through
       
      df.select(scalar_functions("sqrt", vec![col("a")], DataType::Float64))
       
      Proposal:
       
      let f = df.registry();

      df.select(f.udf("sqrt", vec![col("a")])?)
       
      so that they do not have to remember the UDF's return type when using it.
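
      A minimal sketch of the idea, not DataFusion's actual implementation: the registry stores each UDF's return type, so the call site only supplies the name and arguments. `DataType`, `Expr`, `col`, and `Registry` below are simplified, hypothetical stand-ins so the example compiles on its own; only the `udf(name, args)` shape is taken from the proposal.

```rust
use std::collections::HashMap;

// Hypothetical, simplified stand-ins for DataFusion's types (assumption).
#[derive(Debug, Clone, PartialEq)]
enum DataType {
    Float64,
}

#[derive(Debug, Clone, PartialEq)]
enum Expr {
    Column(String),
    ScalarUdf {
        name: String,
        args: Vec<Expr>,
        return_type: DataType,
    },
}

// Builds a column reference expression.
fn col(name: &str) -> Expr {
    Expr::Column(name.to_string())
}

// Hypothetical registry mapping a UDF name to its declared return type.
#[derive(Default)]
struct Registry {
    return_types: HashMap<String, DataType>,
}

impl Registry {
    // Records the return type once, at registration time.
    fn register(&mut self, name: &str, return_type: DataType) {
        self.return_types.insert(name.to_string(), return_type);
    }

    // Builds a UDF call; the return type is looked up rather than passed in.
    // Fails if no UDF with this name has been registered.
    fn udf(&self, name: &str, args: Vec<Expr>) -> Result<Expr, String> {
        let return_type = self
            .return_types
            .get(name)
            .cloned()
            .ok_or_else(|| format!("UDF '{}' is not registered", name))?;
        Ok(Expr::ScalarUdf {
            name: name.to_string(),
            args,
            return_type,
        })
    }
}
```

      With this shape, `f.udf("sqrt", vec![col("a")])?` yields an expression that already carries `Float64`, and a misspelled or unregistered name surfaces as an error when the expression is built.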
       
      In the future, this API will also allow a UDF to be declared as part of planning, as in Spark, instead of requiring it to be registered in the registry beforehand (we just need to check whether the UDF is already registered before doing so).
      See complete proposal here: https://docs.google.com/document/d/1Kzz642ScizeKXmVE1bBlbLvR663BKQaGqVIyy9cAscY/edit?usp=sharing
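
      The "check whether the UDF is already registered" step for declaring a UDF during planning could look roughly like the sketch below. The `declare` method and the policy of rejecting a conflicting re-declaration are assumptions made for illustration; the issue text does not specify how conflicts should be handled.

```rust
use std::collections::hash_map::Entry;
use std::collections::HashMap;

// Hypothetical, simplified stand-in for DataFusion's DataType (assumption).
#[derive(Debug, Clone, PartialEq)]
enum DataType {
    Float64,
    Int64,
}

// Hypothetical registry mapping a UDF name to its declared return type.
#[derive(Default)]
struct Registry {
    return_types: HashMap<String, DataType>,
}

impl Registry {
    // Registers `name` if it is new. A repeated declaration is accepted only
    // when it agrees with the existing one (assumed policy, not from the issue).
    fn declare(&mut self, name: &str, return_type: DataType) -> Result<(), String> {
        match self.return_types.entry(name.to_string()) {
            Entry::Vacant(slot) => {
                slot.insert(return_type);
                Ok(())
            }
            Entry::Occupied(existing) if *existing.get() == return_type => Ok(()),
            Entry::Occupied(existing) => Err(format!(
                "UDF '{}' is already registered with return type {:?}, not {:?}",
                name,
                existing.get(),
                return_type
            )),
        }
    }
}
```

      Declaring the same UDF twice with the same return type is a no-op here, so a plan can declare a UDF at the point of use without first checking the registry itself.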

       

            People

              jorgecarleitao Jorge Leitão

                Time Tracking

                  Estimated: Not Specified
                  Remaining: 0h
                  Logged: 4h 40m