  1. Apache Arrow
  2. ARROW-10173

[Rust][DataFusion] Improve performance of equality to a constant predicate support

Details

    Description

      I noticed this behavior while working on support for DictionaryArrays and wanted to capture it in a ticket in case someone has time to work on it.

      To evaluate an equality predicate against a constant, such as d1 = 'three', DataFusion effectively creates an array with the value 'three' repeated once per row and then applies the array-to-array equality compute kernel. This is suboptimal: the constant is materialized for every row only so that it can be compared.

      Here is what the predicate looks like:

              predicate: BinaryExpr {
                  left: CastExpr {
                      expr: Column {
                          name: "d1",
                      },
                      cast_type: Utf8,
                  },
                  op: Eq,
                  right: Literal {
                      value: Utf8("three"),
                  },
              },
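
      To make the cost concrete, here is a minimal sketch of the two strategies. It is not DataFusion or Arrow code: plain slices of Option<&str> stand in for a nullable string column, and the function names eq_via_repeated_array and eq_scalar are hypothetical, chosen only for this illustration. The first function mirrors the behavior described above (build a full-length column of the constant, then compare element-wise); the second compares each row against the scalar directly and allocates nothing for the constant.

```rust
// Illustrative model only: slices of Option<&str> stand in for Arrow string arrays.
// Function names are hypothetical and are not part of Arrow or DataFusion.

/// Behavior described in the ticket: materialize the constant as a
/// full-length column, then compare the two columns element-wise.
fn eq_via_repeated_array(column: &[Option<&str>], constant: &str) -> Vec<Option<bool>> {
    // One entry per row, all holding the same value.
    let repeated: Vec<Option<&str>> = vec![Some(constant); column.len()];
    column
        .iter()
        .zip(repeated.iter())
        .map(|(l, r)| match (l, r) {
            (Some(l), Some(r)) => Some(l == r),
            _ => None, // a null on either side yields null
        })
        .collect()
}

/// Alternative: compare each row against the scalar directly,
/// without building an intermediate column for the constant.
fn eq_scalar(column: &[Option<&str>], constant: &str) -> Vec<Option<bool>> {
    column.iter().map(|&v| v.map(|s| s == constant)).collect()
}

fn main() {
    let d1 = vec![Some("one"), None, Some("three"), Some("three")];

    let via_array = eq_via_repeated_array(&d1, "three");
    let via_scalar = eq_scalar(&d1, "three");

    // Both strategies produce the same boolean mask.
    assert_eq!(via_array, via_scalar);
    println!("{:?}", via_scalar);
}
```

      Both functions return the same mask; the difference is that the scalar version skips the per-batch allocation and the second pass over a column whose contents are already known.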
      

            People

              yordan-pavlov Yordan Pavlov
              alamb Andrew Lamb
              Votes: 0
              Watchers: 4

                Time Tracking

                  Original Estimate: Not Specified
                  Remaining Estimate: 0h
                  Time Spent: 5.5h