Apache Arrow / ARROW-7047

[C++][Dataset] Filter expressions should not require exact type match

    Details

      Description

      It's not trivial for users to ensure that scalars have exactly the same type as the fields they are compared against in Expressions. For one, FieldExpressions don't carry a type, so when I construct field_ref("col1") > scalar(42), I don't know what type col1 is and can't ensure that scalar(42) matches. Even if the type were available, I couldn't determine what type the scalar should be in an expression like (field_ref("col1") + field_ref("col2")) > scalar(42).

      We should allow CompareExpressions to cast their inputs as necessary. Casting should work among integer types, among floating point types, and between integers and floats. Likewise among date/timestamp types; and when comparing a string scalar against a date/timestamp column, the string should probably be parsed as a datetime. We also need to think about DictionaryTypes (though in practice this is moot until we have comparison kernels that work on strings).
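
      To make the proposal concrete, here is a minimal, self-contained sketch of deferring the cast decision until the expression is bound to a schema. Every name in it (Kind, CommonKind, UnboundComparison, BoundComparison, Bind) is hypothetical and not part of the Arrow API, and the promotion rule (wider type within a family, Float64 across integers and floats) is only an assumption for illustration; date/timestamp, string, and dictionary types are left out.

```cpp
// Hypothetical sketch: none of these names are Arrow APIs.
#include <map>
#include <optional>
#include <string>

// A tiny stand-in for a numeric type system.
enum class Kind { Int32, Int64, Float32, Float64 };

bool IsFloat(Kind k) { return k == Kind::Float32 || k == Kind::Float64; }

int BitWidth(Kind k) {
  return (k == Kind::Int32 || k == Kind::Float32) ? 32 : 64;
}

// Assumed promotion rule for comparisons:
//   identical kinds           -> unchanged
//   int/int or float/float    -> the wider of the two
//   int/float                 -> Float64 (may lose precision for large Int64)
Kind CommonKind(Kind a, Kind b) {
  if (a == b) return a;
  if (IsFloat(a) == IsFloat(b)) return BitWidth(a) >= BitWidth(b) ? a : b;
  return Kind::Float64;
}

// Field name -> type; stands in for a dataset schema.
using Schema = std::map<std::string, Kind>;

// field_ref(name) <op> scalar(...) as the user writes it: the field's type
// is not known yet, only the type the scalar happened to be created with.
struct UnboundComparison {
  std::string field_name;
  Kind scalar_kind;
};

// The same comparison after the schema is known.
struct BoundComparison {
  Kind field_kind;    // type looked up in the schema
  Kind scalar_kind;   // type of the scalar as written
  Kind compare_kind;  // both operands would be cast to this before comparing
};

// Resolve the field against the schema and pick the type to compare in.
// Returns std::nullopt if the field is not in the schema.
std::optional<BoundComparison> Bind(const UnboundComparison& expr,
                                    const Schema& schema) {
  auto it = schema.find(expr.field_name);
  if (it == schema.end()) return std::nullopt;
  return BoundComparison{it->second, expr.scalar_kind,
                         CommonKind(it->second, expr.scalar_kind)};
}
```

      With this shape, the user never has to match types by hand: binding {"col1", Kind::Int64} against a schema where col1 is Float32 yields a compare_kind of Float64, and the casts are inserted at bind time rather than at construction time.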

      Francois Saint-Jacques, Ben Kietzman

              People

              • Assignee: Ben Kietzman (bkietz)
                Reporter: Neal Richardson (npr)
              • Votes: 0
                Watchers: 4

                  Time Tracking

                  Estimated: Not Specified
                  Remaining: 0h
                  Logged: 7h 10m