Apache Arrow / ARROW-9946

[R] ParquetFileWriter segfaults when `sink` is a string

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version: 1.0.1
    • Fix Version: 2.0.0
    • Component: R
    • Environment: Ubuntu 20.04

    Description

      Hello again! I have another minor R arrow issue.

      The ParquetFileWriter docs say that the sink argument can be a "string which is interpreted as a file path". However, when I pass a string, R segfaults with "memory not mapped" (see the traceback below).

      Maybe this is a separate request, but it would also be helpful to have documentation for the methods of the writer created by ParquetFileWriter$create().

      Docs link: https://arrow.apache.org/docs/r/reference/ParquetFileWriter.html

      library(arrow)
      
      sch = schema(a = float32())
      writer = ParquetFileWriter$create(schema = sch, sink = "test.parquet")
      
      #> *** caught segfault ***
      #> address 0x14100007d, cause 'memory not mapped'
      #> 
      #> Traceback:
      #> 1: parquet___arrow___ParquetFileWriter__Open(schema, sink, properties, arrow_properties)
      #> 2: shared_ptr_is_null(xp)
      #> 3: shared_ptr(ParquetFileWriter, parquet___arrow___ParquetFileWriter__Open(schema, sink, properties, arrow_properties))
      #> 4: ParquetFileWriter$create(schema = sch, sink = "test.parquet")
      
      
      # This works as expected:
      sink = FileOutputStream$create("test.parquet")
      writer = ParquetFileWriter$create(schema = sch, sink = sink)
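
      On affected versions, the workaround above can be wrapped in a small helper so that callers can keep passing a file path. This is only a minimal sketch: `open_parquet_writer` is a hypothetical name, not part of arrow, and it relies solely on the `FileOutputStream$create()` and `ParquetFileWriter$create()` calls already shown in this report.

```r
library(arrow)

# Hypothetical helper (not part of arrow): accepts either a file path or an
# already-open output stream, and always hands ParquetFileWriter a stream.
open_parquet_writer <- function(schema, sink, ...) {
  if (is.character(sink) && length(sink) == 1) {
    # Convert the path into a stream first, which avoids the code path
    # that segfaults when `sink` is a string.
    sink <- FileOutputStream$create(sink)
  }
  ParquetFileWriter$create(schema = schema, sink = sink, ...)
}

sch <- schema(a = float32())
writer <- open_parquet_writer(sch, "test.parquet")
```

      The helper leaves non-string sinks untouched, so code that already passes a FileOutputStream keeps working unchanged.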
      

            People

              Assignee: Karl Dunkle Werner (karldw)
              Reporter: Karl Dunkle Werner (karldw)
              Votes: 0
              Watchers: 3

                Time Tracking

                  Original Estimate: Not Specified
                  Remaining Estimate: 0h
                  Time Spent: 1h 20m