Details
- Bug
- Status: Resolved
- Major
- Resolution: Fixed
- 3.0.0
Description
The Parquet file reader crashes in TypedColumnReaderImpl<DType>::Skip while reading boolean columns.
The buffer size calculation in the code below is incorrect. The scratch buffer is sized as batch_size * value_byte_size, and value_byte_size is 1 for booleans. The same buffer is also used for the definition and repetition level data, which requires 2 bytes per value, so for boolean columns the buffer is only half the size the levels need.
// This will be enough scratch space to accommodate 16-bit levels or any
// value type
std::shared_ptr<ResizableBuffer> scratch = AllocateBuffer(
    this->pool_, batch_size * type_traits<DType::type_num>::value_byte_size);

do {
  batch_size = std::min(batch_size, rows_to_skip);
  values_read = ReadBatch(static_cast<int>(batch_size),
                          reinterpret_cast<int16_t*>(scratch->mutable_data()),
                          reinterpret_cast<int16_t*>(scratch->mutable_data()),
                          reinterpret_cast<T*>(scratch->mutable_data()),
                          &values_read);
  rows_to_skip -= values_read;
} while (values_read > 0 && rows_to_skip > 0);
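One way to size such a shared scratch buffer safely is to take the larger of the level width and the value width for each slot. The following is only a minimal sketch of that idea, not necessarily the fix that was committed. The helper names ScratchBytesPerValue and ScratchBufferSize are hypothetical and are not part of the Parquet C++ API.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical helper (not part of the Parquet C++ API).
// Bytes needed per slot when one scratch buffer is reused both for
// 16-bit definition/repetition levels and for values of the given width.
inline std::size_t ScratchBytesPerValue(std::size_t value_byte_size) {
  return std::max(sizeof(std::int16_t), value_byte_size);
}

// Hypothetical helper: total scratch size in bytes for one batch.
inline std::size_t ScratchBufferSize(std::int64_t batch_size,
                                     std::size_t value_byte_size) {
  assert(batch_size >= 0);
  return static_cast<std::size_t>(batch_size) *
         ScratchBytesPerValue(value_byte_size);
}
```

With an illustrative batch size of 1024, a boolean column (value_byte_size of 1) gets 2048 bytes here instead of the 1024 bytes the original expression yields, while an 8-byte value type still gets 8192 bytes.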