Details
Type: Bug
Status: Open
Priority: Major
Resolution: Unresolved
Affects Version/s: 1.11.0
None
None
Description
The 'dictionary' command of parquet-cli throws an NPE when the specified column isn't dictionary-encoded.
$ java -cp 'target/classes:target/dependency/*' org.apache.parquet.cli.Main dictionary /work/parquet-mr/data/test.parquet -c binary_field
Unknown error
java.lang.NullPointerException
    at org.apache.parquet.cli.commands.ShowDictionaryCommand.run(ShowDictionaryCommand.java:78)
    at org.apache.parquet.cli.Main.run(Main.java:147)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
    at org.apache.parquet.cli.Main.main(Main.java:177)
The schema and metadata of 'test.parquet' are as follows:
$ java -cp 'target/classes:target/dependency/*' org.apache.parquet.cli.Main meta /work/parquet-mr/data/test.parquet

File path:  /work/parquet-mr/data/test.parquet
Created by: parquet-mr version 1.12.0-SNAPSHOT (build 1e62e2e2ca903d4109480bc87ceec1dc954b6c92)
Properties:
  writer.model.name: example
Schema:
message test {
  required int32 int32_field;
  required int64 int64_field;
  required float float_field;
  required double double_field;
  required binary binary_field;
  required int64 timestamp_field (TIMESTAMP(MILLIS,true));
}

Row group 0:  count: 395  15.87 B records  start: 4  total: 6.120 kB
--------------------------------------------------------------------------------
                 type    encodings  count  avg size  nulls  min / max
int32_field      INT32   _ D        395    0.20 B    0      "32" / "426"
int64_field      INT64   _ D        395    0.20 B    0      "64" / "458"
float_field      FLOAT   _ _        395    4.13 B    0      "1.0" / "395.0"
double_field     DOUBLE  _ _        395    8.13 B    0      "2.0" / "396.0"
binary_field     BINARY  _ D        395    2.98 B    0      "0x6162636465666768696A6B6..." / "0x6162636465666768696A6B6..."
timestamp_field  INT64   _ D        395    0.23 B    0      "2018-11-04T12:41:15.123+0000" / "2018-11-04T12:47:49.123+0000"

Row group 1:  count: 395  15.92 B records  start: 6271  total: 6.142 kB
--------------------------------------------------------------------------------
                 type    encodings  count  avg size  nulls  min / max
int32_field      INT32   _ D        395    0.20 B    0      "427" / "821"
int64_field      INT64   _ D        395    0.20 B    0      "459" / "853"
float_field      FLOAT   _ _        395    4.13 B    0      "396.0" / "790.0"
double_field     DOUBLE  _ _        395    8.13 B    0      "397.0" / "791.0"
binary_field     BINARY  _ D        395    3.03 B    0      "0x6162636465666768696A6B6..." / "0x6162636465666768696A6B6..."
timestamp_field  INT64   _ D        395    0.23 B    0      "2018-11-04T12:47:50.123+0000" / "2018-11-04T12:54:24.123+0000"

Row group 2:  count: 234  16.53 B records  start: 12560  total: 3.777 kB
--------------------------------------------------------------------------------
                 type    encodings  count  avg size  nulls  min / max
int32_field      INT32   _ D        234    0.17 B    0      "822" / "1055"
int64_field      INT64   _ D        234    0.31 B    0      "854" / "1087"
float_field      FLOAT   _ _        234    4.11 B    0      "791.0" / "1024.0"
double_field     DOUBLE  _ _        234    8.21 B    0      "792.0" / "1025.0"
binary_field     BINARY  _ D        234    3.38 B    0      "0x6162636465666768696A6B6..." / "0x6162636465666768696A6B6..."
timestamp_field  INT64   _ D        234    0.35 B    0      "2018-11-04T12:54:25.123+0000" / "2018-11-04T12:58:18.123+0000"
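
A possible direction for a fix, as a minimal sketch only: it assumes the NPE comes from dereferencing a dictionary that is absent for the requested column, which this report does not confirm. The class, the method, and the map-based lookup below are hypothetical stand-ins for illustration and are not the parquet-mr API or the actual code of ShowDictionaryCommand.

```java
import java.util.HashMap;
import java.util.Map;

public class DictionaryGuardSketch {

  // Hypothetical stand-in for a dictionary lookup: maps a column name to its
  // dictionary values. A missing entry models a column with no dictionary page.
  static String describeDictionary(Map<String, String[]> dictionaries, String column) {
    String[] dictionary = dictionaries.get(column); // null when no dictionary exists
    if (dictionary == null) {
      // Report the condition instead of dereferencing null.
      return "No dictionary for column " + column;
    }
    return column + ": " + dictionary.length + " dictionary entries";
  }

  public static void main(String[] args) {
    Map<String, String[]> dictionaries = new HashMap<>();
    // "dict_col" is a made-up column name used only for this illustration.
    dictionaries.put("dict_col", new String[] {"a", "b", "c"});

    System.out.println(describeDictionary(dictionaries, "dict_col"));
    System.out.println(describeDictionary(dictionaries, "binary_field"));
  }
}
```

With a guard of this kind, the command could print a short message for a column without a dictionary rather than failing with "Unknown error".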