Apache Arrow / ARROW-9815

[Rust] [DataFusion] Deadlock in creation of physical plan with two udfs

Details

    Description

      This one took me some time to understand, but I finally have a reproducible example: when two UDFs are nested, one called on the result of the other (e.g. sqrt(sqrt(c12))), creating the physical plan deadlocks.

      Example test

      #[test]
      fn csv_query_sqrt_sqrt() -> Result<()> {
          let mut ctx = create_ctx()?;
          register_aggregate_csv(&mut ctx)?;
          let sql = "SELECT sqrt(sqrt(c12)) FROM aggregate_test_100 LIMIT 1";
          let actual = execute(&mut ctx, sql);
          // sqrt(sqrt(c12=0.9294097332465232)) = 0.9818650561397431
          let expected = "0.9818650561397431".to_string();
          assert_eq!(actual.join("\n"), expected);
          Ok(())
      }
      

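      I believe this is due to the recursive nature of the physical planner: it locks scalar_functions within a match, so the lock is still held when the planner recurses into the function's argument. When that argument is itself a UDF, the nested call tries to take the same lock again and blocks forever.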
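
      The suspected mechanism can be reproduced outside DataFusion. The sketch below is only an illustration under assumptions: `Expr`, `Registry`, `ScalarFn`, `demo_registry`, `lock_held_in_match_arm` and `eval` are hypothetical names, not the planner's real types or functions. It shows that a `MutexGuard` created in a `match` scrutinee lives until the end of the `match`, and one possible way to avoid holding it across the recursive call. It uses `try_lock` to observe the held lock instead of actually deadlocking.

```rust
use std::collections::HashMap;
use std::sync::Mutex;

// Hypothetical stand-ins for illustration; these are NOT DataFusion's types.
type ScalarFn = fn(f64) -> f64;
type Registry = Mutex<HashMap<String, ScalarFn>>;

enum Expr {
    Column,
    ScalarFunction { name: String, arg: Box<Expr> },
}

/// Builds a registry containing a single function, "sqrt".
fn demo_registry() -> Registry {
    let mut map: HashMap<String, ScalarFn> = HashMap::new();
    map.insert("sqrt".to_string(), f64::sqrt as ScalarFn);
    Mutex::new(map)
}

/// Returns true if the registry is still locked while a match arm runs.
/// The guard is a temporary in the scrutinee, so it is only dropped at
/// the end of the whole `match`. A blocking `lock()` in the arm (for
/// example via a recursive call) would therefore never succeed.
fn lock_held_in_match_arm(registry: &Registry) -> bool {
    match registry.lock().unwrap().get("sqrt") {
        Some(_) => registry.try_lock().is_err(),
        None => false,
    }
}

/// Evaluates an expression, recursing into nested function calls.
/// The function pointer is copied out in its own `let` statement, so the
/// guard is dropped at the end of that statement, before the recursion.
fn eval(expr: &Expr, registry: &Registry, x: f64) -> Option<f64> {
    match expr {
        Expr::Column => Some(x),
        Expr::ScalarFunction { name, arg } => {
            let func: ScalarFn = registry.lock().unwrap().get(name.as_str()).copied()?;
            let inner = eval(arg, registry, x)?; // lock is free here
            Some(func(inner))
        }
    }
}
```

      With this sketch, evaluating the nested expression sqrt(sqrt(Column)) at 16.0 through `eval` returns `Some(2.0)`, while `lock_held_in_match_arm` reports that the lock is held inside the arm. Whether the actual fix should copy or clone the function out of the lock, or restructure the planner differently, is a design choice this sketch does not settle.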

          People

            jorgecarleitao Jorge Leitão
              Time Tracking

                Original Estimate: Not Specified
                Remaining Estimate: 0h
                Time Spent: 5h 50m
