Details
Type: Question
Status: Resolved
Priority: Major
Resolution: Not A Bug
Description
Hi,
I am learning Impala on my own and am trying to enable compression for a text table, but the HDFS file written by the INSERT does not get a compression extension.
I am following the example at
https://www.cloudera.com/documentation/enterprise/5-8-x/topics/impala_txtfile.html
but I am not sure how the final compressed files are created.
When I use Sqoop, I do get a compressed file. Please advise.
create table csv_compressed (a string, b string, c string)
row format delimited fields terminated by ",";
insert into csv_compressed values
('one - uncompressed', 'two - uncompressed', 'three - uncompressed'),
('abc - uncompressed', 'xyz - uncompressed', '123 - uncompressed');
...make equivalent .gz, .bz2, and .snappy files and load them into same table directory...
select * from csv_compressed;
+--------------------+--------------------+----------------------+
| a                  | b                  | c                    |
+--------------------+--------------------+----------------------+
| one - snappy       | two - snappy       | three - snappy       |
| one - uncompressed | two - uncompressed | three - uncompressed |
| abc - uncompressed | xyz - uncompressed | 123 - uncompressed   |
| one - bz2          | two - bz2          | three - bz2          |
| abc - bz2          | xyz - bz2          | 123 - bz2            |
| one - gzip         | two - gzip         | three - gzip         |
| abc - gzip         | xyz - gzip         | 123 - gzip           |
+--------------------+--------------------+----------------------+
$ hdfs dfs -ls 'hdfs://127.0.0.1:8020/user/hive/warehouse/file_formats.db/csv_compressed/';
...truncated for readability...
75 hdfs://127.0.0.1:8020/user/hive/warehouse/file_formats.db/csv_compressed/csv_compressed.snappy
79 hdfs://127.0.0.1:8020/user/hive/warehouse/file_formats.db/csv_compressed/csv_compressed_bz2.csv.bz2
80 hdfs://127.0.0.1:8020/user/hive/warehouse/file_formats.db/csv_compressed/csv_compressed_gzip.csv.gz
116 hdfs://127.0.0.1:8020/user/hive/warehouse/file_formats.db/csv_compressed/dd414df64d67d49b_data.0.
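
For context, the elided "make equivalent .gz, .bz2, and .snappy files" step in the example above is presumably done outside Impala. As far as I understand the linked page, Impala can query compressed text files but does not produce them through INSERT, which would explain why the file written by the INSERT has no compression extension. A minimal sketch of the gzip case follows; the local file name and temporary directory are hypothetical, and the cluster commands are left as comments because they need a running HDFS and Impala.

```shell
#!/bin/sh
# Sketch only: build a small CSV locally and gzip it.
# The file name and working directory are illustrative assumptions.
set -eu

workdir=$(mktemp -d)
csv="$workdir/csv_compressed_gzip.csv"

# Same three-column, comma-delimited layout as the csv_compressed table.
printf '%s\n' \
  'one - gzip,two - gzip,three - gzip' \
  'abc - gzip,xyz - gzip,123 - gzip' > "$csv"

# gzip -c writes the compressed stream to stdout and keeps the original file.
gzip -c "$csv" > "$csv.gz"

# Round-trip check: the decompressed bytes must match the source file.
gzip -dc "$csv.gz" | cmp -s - "$csv" && echo "round trip ok: $csv.gz"

# On a cluster (not run here), the compressed file would then be copied
# into the table directory and the table metadata refreshed, e.g.:
#   hdfs dfs -put "$csv.gz" 'hdfs://127.0.0.1:8020/user/hive/warehouse/file_formats.db/csv_compressed/'
#   impala-shell -q 'refresh file_formats.csv_compressed'
```

The .bz2 case would use bzip2 in the same way; the .snappy case needs a separate tool and is not shown.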