Details
- Bug
- Status: Resolved
- Critical
- Resolution: Fixed
- 1.17.0
- None
- Windows 10 single local node.
Description
Reading a Parquet column with the UUID logical type fails. The only workaround is to store the UUID as text, a 125% size penalty (36 characters instead of 16 bytes).
Here is the schema dump for the attached test parquet file. I can read the file okay from R and natively through C++.
3961 $ parquet-dump-schema uuid.parquet
required group field_id=0 schema {
  required fixed_len_byte_array(16) field_id=1 uuid_req1 (UUID);
  optional fixed_len_byte_array(16) field_id=2 uuid_opt1 (UUID);
  required fixed_len_byte_array(16) field_id=3 uuid_req2 (UUID);
}
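For reference, here is a minimal Python sketch of the size arithmetic behind the text workaround, using only the standard `uuid` module. It assumes the 16-byte big-endian layout that the Parquet format specifies for UUID values; the helper names are hypothetical, and the sample value is just the Error Id from the log in this report, reused as a convenient UUID. It does not involve Drill or any Parquet library.

```python
import uuid


def uuid_to_fixed16(u: uuid.UUID) -> bytes:
    """Return the 16-byte big-endian form, as held in a fixed_len_byte_array(16)."""
    return u.bytes


def fixed16_to_uuid(raw: bytes) -> uuid.UUID:
    """Rebuild a UUID from a 16-byte fixed-length value."""
    if len(raw) != 16:
        raise ValueError(f"expected 16 bytes, got {len(raw)}")
    return uuid.UUID(bytes=raw)


def text_overhead_percent() -> float:
    """Extra size of the canonical text form relative to the binary form."""
    binary_len = 16
    text_len = len(str(uuid.UUID(int=0)))  # canonical form with hyphens
    return (text_len - binary_len) / binary_len * 100


# Sample value only: the Error Id from the log in this report.
sample = uuid.UUID("f6fdd477-c208-4a3d-8476-e366921e5787")
raw = uuid_to_fixed16(sample)

print(len(raw), len(str(sample)), text_overhead_percent())
print(fixed16_to_uuid(raw) == sample)
```

The round trip shows that the binary column loses nothing compared with the text form, which is why storing UUIDs as text is purely a size cost.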
UPDATE: I tested with a simple fixed-length binary column (no UUID annotation) and received the following error.
See the second attachment, uuid-simple-fixed-length-array.parquet.
org.apache.drill.common.exceptions.UserRemoteException: INTERNAL_ERROR ERROR: Error in drill parquet reader (complex).
Message: Failure in setting up reader
Parquet Metadata: null
Fragment: 0:0
Please, refer to logs for more information.
[Error Id: f6fdd477-c208-4a3d-8476-e366921e5787 on PWXAA:31010]
    at org.apache.drill.exec.rpc.user.QueryResultHandler.resultArrived(QueryResultHandler.java:125)
    at org.apache.drill.exec.rpc.user.UserClient.handle(UserClient.java:422)
    at org.apache.drill.exec.rpc.user.UserClient.handle(UserClient.java:96)
    at org.apache.drill.exec.rpc.RpcBus$InboundHandler.decode(RpcBus.java:273)
    at org.apache.drill.exec.rpc.RpcBus$InboundHandler.decode(RpcBus.java:243)
    at io.netty.handler.codec.MessageToMessageDecoder.channelRead(MessageToMessageDecoder.java:88)
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:356)
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:342)
    at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:335)
    at io.netty.handler.timeout.IdleStateHandler.channelRead(IdleStateHandler.java:287)
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:356)
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:342)
    at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:335)
    at io.netty.handler.codec.MessageToMessageDecoder.channelRead(MessageToMessageDecoder.java:102)
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:356)
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:342)
    at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:335)
    at io.netty.handler.codec.ByteToMessageDecoder.fireChannelRead(ByteToMessageDecoder.java:312)
    at io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:286)
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:356)
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:342)
    at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:335)
    at io.netty.channel.ChannelInboundHandlerAdapter.channelRead(ChannelInboundHandlerAdapter.java:86)
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:356)
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:342)
    at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:335)
    at io.netty.channel.DefaultChannelPipeline$HeadContext.channelRead(DefaultChannelPipeline.java:1294)
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:356)
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:342)
    at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:911)
    at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:131)
    at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:645)
    at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:580)
    at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:497)
    at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:459)
    at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:131)
I'm new here. I set this to MAJOR after reading the severity definitions, but I gladly defer to those who know better how to classify it.
Attachments
Issue Links
- contains
  - PARQUET-1006 ColumnChunkPageWriter uses only heap memory. (Open)
  - DRILL-7906 Replace ParquetColumnChunkPageWriter with original Parquet class (Open)
  - DRILL-7907 Replace ParquetFileWriter in Drill with parquet version (Open)
  - DRILL-7904 Update to 30-jre Guava version (Resolved)
  - PARQUET-2026 Allow empty row in parquet file (Open)
- Dependent
  - DRILL-7896 Drill Parquet UUID function (Open)
- is blocked by
  - PARQUET-1898 Release parquet-mr 1.12.0 (Resolved)
- is cloned by
  - DRILL-7829 Drill Parquet UUID logical type (Resolved)
- is related to
  - PARQUET-1266 LogicalTypes union in parquet-format doesn't include UUID (Resolved)
  - PARQUET-1827 UUID type currently not supported by parquet-mr (Resolved)
  - PARQUET-1125 Add UUID logical type (Resolved)
- relates to
  - DRILL-7904 Update to 30-jre Guava version (Resolved)
- requires
  - PARQUET-2026 Allow empty row in parquet file (Open)