Apache Arrow / ARROW-14519

[C++] joins segfault when data contains list column

Details

    Description

      Running the R code below segfaults when one of the joined tables contains a list column.

      library(arrow)
      library(dplyr)
      
      basic_tbl <- arrow_table(
        tibble::tibble(
          x = 1:3,
          y = c("a", "b", "c")
        )
      )
      
      basic_tbl2 <- arrow_table(
        tibble::tibble(
          x = 1:3,
          z = c(TRUE, FALSE, TRUE)
        )
      )
      
      list_tbl <- arrow_table(
        tibble::tibble(
          z = list(c("first", "list", "col", "row"), c("second row ", "here")),
          x = 1:2
        )
      )
      
      # works: neither table contains a list column
      left_join(basic_tbl, basic_tbl2) %>%
        collect()
      
      # segfaults: list_tbl contains a list column (z)
      left_join(basic_tbl, list_tbl) %>%
        collect()
      
      

            People

              lidavidm David Li
              thisisnic Nicola Crane
              Votes: 0
              Watchers: 6

                Time Tracking

                  Estimated: Not Specified
                  Remaining: 0h
                  Logged: 2.5h