Apache Arrow / ARROW-3263

[R] Use R sentinel values for missingness in addition to bitmask

Details

    • Type: Improvement
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: Format, R
    • Labels: None

    Description

      R uses sentinel values to indicate missingness within atomic vectors (that is, arrays in Arrow parlance, AFAIK).

      According to wesmckinn, the value stored in memory for an array slot is currently undefined when the bitmap marks that slot as missing.
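
      For illustration, a minimal Python sketch of how a reader decides whether a slot is missing. It assumes the validity bitmap layout of the Arrow columnar format (least-significant-bit numbering, with a set bit meaning the slot holds a valid value); the helper name `is_valid` is hypothetical and not part of any Arrow API.

```python
def is_valid(bitmap: bytes, i: int) -> bool:
    """Return True if slot i is non-null according to the validity bitmap."""
    # Assumption: bit (i % 8) of byte (i // 8) is set when the slot is valid.
    return ((bitmap[i // 8] >> (i % 8)) & 1) == 1


# Three slots with the middle one missing: bits 0 and 2 are set.
bitmap = bytes([0b00000101])
flags = [is_valid(bitmap, i) for i in range(3)]
```

      Nothing in this check constrains the bytes stored in a missing slot, which is why a consumer cannot rely on them.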

      This forces R to copy and modify the data whenever it adopts Arrow data containing missing values as a native vector.

      If the relevant sentinel values (INT_MIN for 32-bit integers, and NaN with payload 1954 for double-precision floats) were written into the missing slots in addition to setting the bitmask, then R would be able to use Arrow data as intended without breaking any other systems.
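
      A rough Python sketch of what a writer following this proposal might do. The function names are hypothetical, and the high word 0x7FF00000 used for the double-precision NA is an assumption about R's internal bit pattern; the issue itself only specifies the payload 1954. Nulls are written as raw bit patterns so the payload is preserved exactly.

```python
import struct

# R's NA for integers: the minimum 32-bit signed integer (INT_MIN).
NA_INTEGER = -2**31

# R's NA for doubles: a NaN whose low 32-bit word is 1954.
# Assumption: the high word is 0x7FF00000.
NA_REAL_BITS = (0x7FF00000 << 32) | 1954


def fill_int32(values, valid):
    """Pack little-endian int32 data, writing INT_MIN into missing slots."""
    out = bytearray()
    for v, ok in zip(values, valid):
        out += struct.pack("<i", v if ok else NA_INTEGER)
    return bytes(out)


def fill_float64(values, valid):
    """Pack little-endian float64 data, writing the NA bit pattern into missing slots."""
    out = bytearray()
    for v, ok in zip(values, valid):
        # Missing slots are packed from the integer bit pattern, not a float.
        out += struct.pack("<d", v) if ok else struct.pack("<Q", NA_REAL_BITS)
    return bytes(out)


ints = fill_int32([1, None, 3], [True, False, True])
dbls = fill_float64([1.5, None], [True, False])
```

      The bitmask is still the source of truth for Arrow consumers; the sentinel only gives R a value it already recognizes as NA in the same slot.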

            People

              Assignee: Unassigned
              Reporter: Gabriel Becker (gmbecker)
              Votes: 0
              Watchers: 5
