Apache Arrow / ARROW-3547

[R] Protect against Null crash when reading from RecordBatch


Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.12.0
    • Component/s: R
    • Labels: None

    Description

      Reprex:

       

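        library(arrow)

        tbl <- tibble::tibble(
          int = 1:10, dbl = as.numeric(1:10),
          lgl = sample(c(TRUE, FALSE, NA), 10, replace = TRUE),
          chr = letters[1:10]
        )

        # Serialize a single record batch to a raw vector.
        batch <- record_batch(tbl)
        bytes <- write_record_batch(batch, raw())

        # The stream holds one batch, so the first read succeeds.
        stream_reader <- record_batch_stream_reader(bytes)
        batch1 <- read_record_batch(stream_reader)

        # The second read goes past the end of the stream and yields a null batch.
        batch2 <- read_record_batch(stream_reader)

        # Converting the null batch crashes the R session.
        as_tibble(batch2)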

       

      Users can avoid this by checking for a null batch first:

      if (!batch2$is_null()) as_tibble(batch2)

      Even so, crashing the R session is harsh. We should consider protecting every function that takes a RecordBatch pointer so that it returns NULL or raises an R error instead, for instance:

       

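      List RecordBatch__to_dataframe(const std::shared_ptr<arrow::RecordBatch>& batch) {
         // Raise an R error instead of dereferencing a null shared_ptr.
         if (batch.get() == nullptr) Rcpp::stop("Can't read from NULL record batch.");
         // ... existing conversion to a data frame continues here ...
      }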
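Since the proposal is to protect every function that takes a RecordBatch pointer, the check could be factored into one shared helper. The following is only a minimal sketch that compiles without Arrow or Rcpp: `FakeRecordBatch`, `require_non_null`, and `batch_num_rows` are hypothetical names invented for illustration, and `std::runtime_error` stands in for `Rcpp::stop()`.

```cpp
#include <cstddef>
#include <memory>
#include <stdexcept>
#include <string>

// Hypothetical stand-in for arrow::RecordBatch so the sketch compiles alone.
struct FakeRecordBatch {
  std::size_t num_rows;
};

// Hypothetical helper: throws instead of letting a null pointer be
// dereferenced. In the R package, Rcpp::stop() would replace the throw.
template <typename T>
const T& require_non_null(const std::shared_ptr<T>& ptr, const std::string& what) {
  if (ptr == nullptr) {
    throw std::runtime_error("Can't read from NULL " + what + ".");
  }
  return *ptr;
}

// Example of a protected entry point: the guard runs before any member access.
std::size_t batch_num_rows(const std::shared_ptr<FakeRecordBatch>& batch) {
  return require_non_null(batch, "record batch").num_rows;
}
```

With a helper along these lines, each exported function would start with a single guarded dereference, so a null batch would surface as a catchable error rather than a crash.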

       

       


          People

            Assignee: Romain Francois (romainfrancois)
            Reporter: Javier Luraschi (javierluraschi)