IMPALA-4753

Table created like parquet file shows wrong row count

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Not A Bug
    • Affects Version/s: Impala 2.7.0
    • Fix Version/s: None
    • Component/s: Backend

    Description

      A table created with CREATE TABLE LIKE PARQUET returns a wrong row count: count(*) reports 75 rows for a file that contains 100.

      [de:21000] > create table test stored as parquet as select * from functional.alltypessmall;
      Query: create table test stored as parquet as select * from functional.alltypessmall
      Query submitted at: 2017-01-10 20:12:07 (Coordinator: http://de:25000)
      Query progress can be monitored at: http://de:25000/query_plan?query_id=f644c35e488ef9b0:9d4ad3ba00000000
      +---------------------+
      | summary             |
      +---------------------+
      | Inserted 100 row(s) |
      +---------------------+
      Fetched 1 row(s) in 4.04s
      [de:21000] > show files in test;
      Query: show files in test
      +-----------------------------------------------------------------------------------------------------+--------+-----------+
      | Path                                                                                                | Size   | Partition |
      +-----------------------------------------------------------------------------------------------------+--------+-----------+
      | hdfs://localhost:20500/test-warehouse/test/f644c35e488ef9b0-9d4ad3ba00000000_1474207981_data.0.parq | 3.70KB |           |
      +-----------------------------------------------------------------------------------------------------+--------+-----------+
      Fetched 1 row(s) in 0.02s
      [de:21000] > create external table t2 like parquet 'hdfs://localhost:20500/test-warehouse/test/f644c35e488ef9b0-9d4ad3ba00000000_1474207981_data.0.parq' location 'hdfs://localhost:20500/test-warehouse/test/';
      Query: create external table t2 like parquet 'hdfs://localhost:20500/test-warehouse/test/f644c35e488ef9b0-9d4ad3ba00000000_1474207981_data.0.parq' location 'hdfs://localhost:20500/test-warehouse/test/'
      
      Fetched 0 row(s) in 0.21s
      [de:21000] > select count(*) from test;
      Query: select count(*) from test
      Query submitted at: 2017-01-10 20:13:08 (Coordinator: http://de:25000)
      Query progress can be monitored at: http://de:25000/query_plan?query_id=4640a0b136cae9c1:afb7a8f000000000
      +----------+
      | count(*) |
      +----------+
      | 100      |
      +----------+
      Fetched 1 row(s) in 0.15s
      [de:21000] > select count(*) from t2;
      Query: select count(*) from t2
      Query submitted at: 2017-01-10 20:13:38 (Coordinator: http://de:25000)
      Query progress can be monitored at: http://de:25000/query_plan?query_id=1c40c8df788a5982:7a83261d00000000
      +----------+
      | count(*) |
      +----------+
      | 75       |
      +----------+
      Fetched 1 row(s) in 4.70s
      [de:21000] >
      

      Querying the actual data results in read errors and wrong results:

      [de:21000] > select * from t2 limit 10;
      Query: select * from t2 limit 10
      Query submitted at: 2017-01-10 20:19:14 (Coordinator: http://de:25000)
      Query progress can be monitored at: http://de:25000/query_plan?query_id=5b4ffee71956d366:4745f26a00000000
      +------+----------+-------------+--------------+---------+------------+-----------+------------+-----------------+--------------+---------------+------+-------+
      | id   | bool_col | tinyint_col | smallint_col | int_col | bigint_col | float_col | double_col | date_string_col | string_col   | timestamp_col | year | month |
      +------+----------+-------------+--------------+---------+------------+-----------+------------+-----------------+--------------+---------------+------+-------+
      | NULL | NULL     | NULL        | NULL         | NULL    | NULL       | NULL      | NULL       |                 |              | NULL          | NULL | NULL  |
      | NULL | NULL     | NULL        | NULL         | NULL    | NULL       | NULL      | NULL       |                 |              | NULL          | NULL | NULL  |
      | NULL | NULL     | NULL        | NULL         | NULL    | NULL       | NULL      | NULL       |                 | UUUUUU������ | NULL          | NULL | NULL  |
      | NULL | NULL     | NULL        | NULL         | NULL    | NULL       | NULL      | NULL       |                 |              | NULL          | NULL | NULL  |
      | NULL | NULL     | NULL        | NULL         | NULL    | NULL       | NULL      | NULL       |                 |              | NULL          | NULL | NULL  |
      | NULL | NULL     | NULL        | NULL         | NULL    | NULL       | NULL      | NULL       |                 |              | NULL          | NULL | NULL  |
      | NULL | NULL     | NULL        | NULL         | NULL    | NULL       | NULL      | NULL       |                 |              | NULL          | NULL | NULL  |
      | NULL | NULL     | NULL        | NULL         | NULL    | NULL       | NULL      | NULL       |                 |              | NULL          | NULL | NULL  |
      | NULL | NULL     | NULL        | NULL         | NULL    | NULL       | NULL      | NULL       |                 |              | NULL          | NULL | NULL  |
      | NULL | NULL     | NULL        | NULL         | NULL    | NULL       | NULL      | NULL       |                 |              | NULL          | NULL | NULL  |
      +------+----------+-------------+--------------+---------+------------+-----------+------------+-----------------+--------------+---------------+------+-------+
      WARNINGS: Error converting column: 0 to INT
      Error converting column: 1 to BOOLEAN
      Error converting column: 2 to INT
      Error converting column: 3 to INT
      Error parsing row: file: hdfs://localhost:20500/test-warehouse/test/f644c35e488ef9b0-9d4ad3ba00000000_1474207981_data.0.parq, before offset: 3784
      Error converting column: 0 to INT
      Error parsing row: file: hdfs://localhost:20500/test-warehouse/test/f644c35e488ef9b0-9d4ad3ba00000000_1474207981_data.0.parq, before offset: 3784
      Error converting column: 0 to INT
      Error converting column: 1 to BOOLEAN
      Error converting column: 2 to INT
      Error converting column: 3 to INT
      Error converting column: 5 to BIGINT
      Error converting column: 6 to FLOAT
      Error converting column: 7 to DOUBLE
      Error parsing row: file: hdfs://localhost:20500/test-warehouse/test/f644c35e488ef9b0-9d4ad3ba00000000_1474207981_data.0.parq, before offset: 3784
      Error converting column: 0 to INT
      Error converting column: 1 to BOOLEAN
      Error converting column: 2 to INT
      Error converting column: 3 to INT
      Error converting column: 5 to BIGINT
      Error parsing row: file: hdfs://localhost:20500/test-warehouse/test/f644c35e488ef9b0-9d4ad3ba00000000_1474207981_data.0.parq, before offset: 3784
      Error converting column: 0 to INT
      Error parsing row: file: hdfs://localhost:20500/test-warehouse/test/f644c35e488ef9b0-9d4ad3ba00000000_1474207981_data.0.parq, before offset: 3784
      Error converting column: 0 to INT
      Error parsing row: file: hdfs://localhost:20500/test-warehouse/test/f644c35e488ef9b0-9d4ad3ba00000000_1474207981_data.0.parq, before offset: 3784
      Error converting column: 0 to INT
      Error converting column: 1 to BOOLEAN
      Error converting column: 2 to INT
      Error parsing row: file: hdfs://localhost:20500/test-warehouse/test/f644c35e488ef9b0-9d4ad3ba00000000_1474207981_data.0.parq, before offset: 3784
      Error converting column: 0 to INT
      Error converting column: 1 to BOOLEAN
      Error converting column: 2 to INT
      Error converting column: 4 to INT
      Error parsing row: file: hdfs://localhost:20500/test-warehouse/test/f644c35e488ef9b0-9d4ad3ba00000000_1474207981_data.0.parq, before offset: 3784
      Error converting column: 0 to INT
      Error parsing row: file: hdfs://localhost:20500/test-warehouse/test/f644c35e488ef9b0-9d4ad3ba00000000_1474207981_data.0.parq, before offset: 3784
      Error converting column: 0 to INT
      Error parsing row: file: hdfs://localhost:20500/test-warehouse/test/f644c35e488ef9b0-9d4ad3ba00000000_1474207981_data.0.parq, before offset: 3784
      Error converting column: 0 to INT
      Error converting column: 1 to BOOLEAN
      Error converting column: 2 to INT
      Error converting column: 3 to INT
      Error converting column: 4 to INT
      Error parsing row: file: hdfs://localhost:20500/test-warehouse/test/f644c35e488ef9b0-9d4ad3ba00000000_1474207981_data.0.parq, before offset: 3784
      
      Fetched 10 row(s) in 0.01s
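
      The ticket text does not state why it was resolved as Not A Bug, so the following is an assumption rather than the recorded reason. In Impala, the LIKE PARQUET clause only derives column names and types from the file; the table's file format still comes from STORED AS and defaults to text when that clause is omitted. The t2 statement above has no STORED AS PARQUET, so t2 would be scanned as a text table, which would be consistent with the "Error converting column" warnings and the differing row count. A minimal sketch of the corrected statement, reusing the path from the transcript:

      ```sql
      -- Assumption: t2 was created as a text table because STORED AS PARQUET was omitted.
      -- t2 is EXTERNAL, so dropping it leaves the data file in place.
      DROP TABLE IF EXISTS t2;

      -- LIKE PARQUET supplies the column definitions; STORED AS PARQUET sets the file format.
      CREATE EXTERNAL TABLE t2
        LIKE PARQUET 'hdfs://localhost:20500/test-warehouse/test/f644c35e488ef9b0-9d4ad3ba00000000_1474207981_data.0.parq'
        STORED AS PARQUET
        LOCATION 'hdfs://localhost:20500/test-warehouse/test/';

      -- Check the InputFormat reported for the table, then re-run the count.
      DESCRIBE FORMATTED t2;
      SELECT count(*) FROM t2;
      ```

      If the assumption holds, the count from t2 should then agree with the count from test.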
      

            People

              Assignee: Unassigned
              Reporter: Lars Volker