Apache Arrow / ARROW-11562

[C++][Dataset] Provide more robust handling of comparison guarantees in the presence of implicit casts


Details

    • Type: Bug
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 3.0.0
    • Fix Version/s: None
    • Component/s: C++
    • Labels: None

    Description

      After ARROW-8919 it's possible that a field reference may be wrapped in an implicit cast, which complicates destructuring during expression simplification. In particular, some errors can arise as a result of assuming that casts will preserve numeric ordering:

      int two_28 = 1 << 28;
      
      auto partition_expr = less_equal(field_ref("i32"), literal(two_28 + 1));
      
      auto filter = greater(
        cast(field_ref("i32"), float32()),
        literal(float(two_28)));
      

      Currently, the RHS of the filter and the RHS of the partition expression will be considered equal, since casting two_28+1 to float results in the same value as casting two_28 to float (due to limited FP precision). Since x <= y and x > y are disjoint, the partition will be skipped entirely, including any rows where i32 == two_28+1 (which should be selected by this filter).
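
      The precision loss described above can be checked in isolation with plain C++, without Arrow. The following is a minimal sketch, not Arrow code: the helper names CollideAsFloat and RoundTripsThroughFloat are hypothetical and introduced only for illustration. It assumes float is IEEE 754 binary32 (24-bit significand), so adjacent representable values near 2^28 are 32 apart.

```cpp
#include <cstdint>

// Hypothetical helper (not part of Arrow): true when two distinct int32
// values become indistinguishable after conversion to float.
bool CollideAsFloat(std::int32_t a, std::int32_t b) {
  return static_cast<float>(a) == static_cast<float>(b);
}

// Hypothetical helper (not part of Arrow): true when an int32 value survives
// a conversion to float and back unchanged, i.e. the cast loses no information
// for this particular value. The comparison is done in int64 so that values
// near INT32_MAX, which round up to 2^31 as float, do not overflow.
bool RoundTripsThroughFloat(std::int32_t v) {
  const float f = static_cast<float>(v);
  return static_cast<std::int64_t>(f) == static_cast<std::int64_t>(v);
}
```

      With two_28 = 1 << 28, CollideAsFloat(two_28, two_28 + 1) holds, which is the collision between the two literals in the expressions above. A simplification pass could, under the same assumption, use a check like RoundTripsThroughFloat to decide whether a guarantee on the integer field may safely be compared against a float literal.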

          People

            Assignee: Unassigned
            Reporter: Ben Kietzman (bkietz)
            Votes: 0
            Watchers: 3
