Apache Arrow / ARROW-10172

[Python] pyarrow.concat_arrays segfaults if a resulting StringArray's capacity overflows

Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Duplicate
    • Affects Version/s: 1.0.1, 2.0.0
    • Fix Version/s: None
    • Component/s: Python
    • Labels: None

    Description

      I'm sorry if this was already reported, but there is an overflow issue when concatenating large string arrays:

      In [1]: import pyarrow as pa
      
      In [2]: str_array = pa.array(['a' * 128] * 10**8)
      
      In [3]: large_array = pa.concat_arrays([str_array] * 50)
      Segmentation fault (core dumped)
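
      For scale, here is a rough back-of-the-envelope check of the sizes in the session above. It assumes each 'a' takes one byte in UTF-8 and relies on the fact that Arrow's string type stores 32-bit offsets while large_string stores 64-bit ones. The variable names are only illustrative, and this only shows that the requested data cannot fit behind 32-bit offsets; it says nothing about where exactly the crash happens.

```python
# Largest byte offset representable with the 32-bit offsets of the string type.
INT32_MAX = 2**31 - 1

bytes_per_string = 128        # 'a' * 128, one byte per character in UTF-8
strings_per_array = 10**8     # length of str_array in the session
num_arrays = 50               # number of copies passed to concat_arrays

# Character data held by one input array and by the concatenated result.
bytes_per_array = bytes_per_string * strings_per_array
total_bytes = bytes_per_array * num_arrays

print(f"limit for 32-bit offsets: {INT32_MAX:,} bytes")
print(f"one input array:          {bytes_per_array:,} bytes")
print(f"concatenated result:      {total_bytes:,} bytes")
print("fits in 32-bit offsets:", total_bytes <= INT32_MAX)
```

      With these numbers the concatenated character data is 640,000,000,000 bytes, far above the 2,147,483,647 bytes that 32-bit offsets can address.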
      

      I suppose this should be handled by upcasting to large_string.

            People

              Assignee: Unassigned
              Reporter: Artem KOZHEVNIKOV
              Votes: 2
              Watchers: 4
