Apache Arrow / ARROW-9083

[R] collect int64, uint32, uint64 as R integer type if not out of bounds

Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 1.0.0
    • Component/s: R

    Description

      bit64::integer64 can be awkward to work with in R (one example: https://github.com/apache/arrow/issues/7385). Often in Arrow we get int64 types from compute methods or other translation methods that auto-promote to the largest integer type, even though the values would fit fine in a 32-bit integer, which is R's native integer type.

      When calling Array__as_vector on an int64 array, we could first call the minmax function on the array, and if the extrema are within the range of a 32-bit int, return a regular R integer vector. This would add a little bit of ambiguity as to what R type you'll get from an Arrow type, but I wonder if the benefits are worth it, since you can't do much with an integer64 in R. (We could also make this optional, similar to ARROW-7657, so you could specify a "strict" mode if you are in a use case where roundtrip fidelity is more important than R usability.)
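
      The range check described above can be sketched roughly as follows. This is only an illustration over a plain `std::vector`, not the Arrow implementation: `FitsInRInteger`, `kRIntMin`, and `kRIntMax` are hypothetical names, nulls are ignored, and the lower bound assumes that R reserves the smallest 32-bit value for `NA_integer_`.

```cpp
#include <algorithm>
#include <cstdint>
#include <limits>
#include <vector>

// Hypothetical bounds, not taken from the Arrow code base.
// R's integer is a 32-bit signed int whose minimum value (INT32_MIN) is
// reserved for NA_integer_, so the usable range starts at INT32_MIN + 1.
constexpr int64_t kRIntMin =
    static_cast<int64_t>(std::numeric_limits<int32_t>::min()) + 1;
constexpr int64_t kRIntMax = std::numeric_limits<int32_t>::max();

// Hypothetical helper: returns true when every value can be stored in an
// R integer vector, judged from the extrema alone (one minmax pass).
bool FitsInRInteger(const std::vector<int64_t>& values) {
  if (values.empty()) return true;  // nothing can be out of bounds
  auto extrema = std::minmax_element(values.begin(), values.end());
  return *extrema.first >= kRIntMin && *extrema.second <= kRIntMax;
}
```

      A caller would return a regular integer vector when the check passes and fall back to integer64 otherwise; a "strict" option could simply skip the check and always take the fallback.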

      Likewise, uint32 and uint64 arrays whose values are in bounds could be returned as R integers, avoiding the conversion to double that is currently implemented.
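
      For the unsigned types only the maximum needs checking, since no value can be negative. A minimal sketch under the same assumptions as any range check of this kind (plain `std::vector` input, nulls ignored); `UnsignedFitsInRInteger` is a hypothetical name, not an Arrow function:

```cpp
#include <algorithm>
#include <cstdint>
#include <limits>
#include <type_traits>
#include <vector>

// Hypothetical helper: true when every unsigned value is small enough to be
// stored in R's 32-bit signed integer type (at most INT32_MAX).
template <typename UInt>
bool UnsignedFitsInRInteger(const std::vector<UInt>& values) {
  static_assert(std::is_unsigned<UInt>::value, "expects uint32_t or uint64_t");
  if (values.empty()) return true;  // nothing can be out of bounds
  const UInt max_value = *std::max_element(values.begin(), values.end());
  // INT32_MAX is non-negative, so converting it to the unsigned type is exact.
  return max_value <= static_cast<UInt>(std::numeric_limits<int32_t>::max());
}
```

      When the check fails, the existing conversion to double would remain the fallback.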

            People

              Assignee: Neal Richardson (npr)
              Reporter: Neal Richardson (npr)
              Votes: 0
              Watchers: 5

                Time Tracking

                  Estimated: Not Specified
                  Remaining: 0h
                  Logged: 40m