[ARROW-17459] [C++] Support nested data conversions for chunked array - ASF JIRA

Details

Type: New Feature
Status: Open
Priority: Blocker
Resolution: Unresolved
Affects Version/s: None
Fix Version/s: None
Component/s: C++
Labels: None

External issue URL:
https://github.com/apache/arrow/issues/32723

Description

`FileReaderImpl::ReadRowGroup` fails with the error "Nested data conversions not implemented for chunked array outputs". The error is raised in [ChunksToSingle](https://github.com/apache/arrow/blob/7f6b074b84b1ca519b7c5fc7da318e8d47d44278/cpp/src/parquet/arrow/reader.cc#L95).
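To make the failure mode concrete, here is a minimal, self-contained sketch of what the linked helper appears to do, as far as I can tell from reading the source: a column that arrives as zero or one chunk is accepted, and anything with more than one chunk is rejected. This is a simplified model, not the Arrow code. `Chunk`, `ChunkedColumn`, `SingleResult`, and `ChunksToSingleSketch` are hypothetical names that stand in for the Arrow types, and the real function returns an `arrow::Result` rather than this struct.

```cpp
#include <cstddef>
#include <optional>
#include <string>
#include <vector>

// Hypothetical stand-ins for arrow::Array and arrow::ChunkedArray.
using Chunk = std::vector<std::string>;
using ChunkedColumn = std::vector<Chunk>;

// Hypothetical result type: `data` is set on success, `error` on failure.
struct SingleResult {
  std::optional<Chunk> data;
  std::string error;
};

// Simplified model (assumption): the nested reader wants the child values
// as a single contiguous array, so more than one chunk is reported as
// "not implemented" instead of being concatenated.
SingleResult ChunksToSingleSketch(const ChunkedColumn& chunked) {
  switch (chunked.size()) {
    case 0:
      // No chunks: return an empty array.
      return {Chunk{}, ""};
    case 1:
      // Exactly one chunk: pass it through unchanged.
      return {chunked[0], ""};
    default:
      // Two or more chunks: this is the path that produces the error.
      return {std::nullopt,
              "Nested data conversions not implemented for chunked array outputs"};
  }
}
```

In this model, a `fields_map` value column that comes back as two or more chunks takes the `default` branch, which matches the message seen from `ReadRowGroup`.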

The data schema is:

  optional group fields_map (MAP) = 217 {
    repeated group key_value {
      required binary key (STRING) = 218;
      optional binary value (STRING) = 219;
    }
  }

fields_map.key_value.value -> Size In Bytes: 13243589, Size In Ratio: 0.20541047
fields_map.key_value.key -> Size In Bytes: 3008860, Size In Ratio: 0.046667963

Is there a way to work around this issue in the C++ library?

In any case, I am willing to implement this, but I need some guidance. I am very new to Parquet (I started reading about it yesterday).

Probably related to: https://issues.apache.org/jira/browse/ARROW-10958

Issue Links

duplicates

ARROW-5030 [Python] read_row_group fails with Nested data conversions not implemented for chunked array outputs

Open

People

Assignee: Arthur Passos

Reporter: Arthur Passos

Votes: 0

Watchers: 4

Dates

Created: 18/Aug/22 12:24

Updated: 11/Jan/23 11:50