Details
-
Bug
-
Status: Resolved
-
Blocker
-
Resolution: Fixed
-
Impala 2.7.0
Description
The parquet scanner does not validate values read in many cases, so it has a number of failure modes.
We don't validate that the string length is non-negative and fits within the buffer:
template<> inline int ParquetPlainEncoder::Decode( uint8_t* buffer, int fixed_len_size, StringValue* v) { memcpy(&v->len, buffer, sizeof(int32_t)); v->ptr = reinterpret_cast<char*>(buffer) + sizeof(int32_t); int bytesize = ByteSize(*v); if (fixed_len_size > 0) v->len = std::min(v->len, fixed_len_size); // we still read bytesize bytes, even if we truncate return bytesize; }
In general, we don't validate when calling ParquetPlainEncoder::Decode() that we're staying within the data page.
Attachments
Issue Links
- is related to
-
IMPALA-5407 Crash SIGSEGV in DeepCopyVarlenData while inserting into sequencefile
- Resolved