IMPALA-9071

When metastore.warehouse.dir != metastore.warehouse.external.dir, Impala writes to the wrong location for external tables

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Blocker
    • Resolution: Fixed
    • Affects Version/s: Impala 3.4.0
    • Fix Version/s: Impala 3.4.0
    • Component/s: Frontend

    Description

      Hive introduced a translation layer that can convert a normal (managed) table to an external table. When it does so and no location is specified, the translated external table is placed under metastore.warehouse.external.dir rather than metastore.warehouse.dir. Impala does not know about this distinction, so it writes to the location where it expects the table to be (under metastore.warehouse.dir). This means I can do the following:

      [localhost:21000] joetest> select count(*) from functional.alltypes;
      Query: select count(*) from functional.alltypes
      Query submitted at: 2019-10-19 13:08:24 (Coordinator: http://joemcdonnell:25000)
      Query progress can be monitored at: http://joemcdonnell:25000/query_plan?query_id=68434b05e2badd50:a18a2e3000000000
      +----------+
      | count(*) |
      +----------+
      | 7300     |
      +----------+
      Fetched 1 row(s) in 0.14s
      [localhost:21000] joetest> create table testtable as select * from functional.alltypes;
      Query: create table testtable as select * from functional.alltypes
      Query submitted at: 2019-10-19 13:08:36 (Coordinator: http://joemcdonnell:25000)
      Query progress can be monitored at: http://joemcdonnell:25000/query_plan?query_id=794b92fb68f36ab0:910d036400000000
      +----------------------+
      | summary              |
      +----------------------+
      | Inserted 7300 row(s) |
      +----------------------+
      Fetched 1 row(s) in 0.50s
      [localhost:21000] joetest> select count(*) from testtable;
      Query: select count(*) from testtable
      Query submitted at: 2019-10-19 13:08:43 (Coordinator: http://joemcdonnell:25000)
      Query progress can be monitored at: http://joemcdonnell:25000/query_plan?query_id=66423abf016e65af:8362460900000000
      +----------+
      | count(*) |
      +----------+
      | 0        |
      +----------+
      Fetched 1 row(s) in 0.13s
      

      We inserted 7300 rows, but we can't select them back because they were written to the wrong location.
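
To make the mismatch concrete, here is a minimal Python sketch of the two location computations described above. It is an illustration only, not Impala's or Hive's actual code: the warehouse paths are made-up values, the helper default_table_location is hypothetical, and the <db>.db/<table> layout is assumed to follow the usual Hive default for non-default databases.

```python
import posixpath

# Hypothetical configuration values, chosen only for illustration.
CONF = {
    "metastore.warehouse.dir": "/warehouse/managed",
    "metastore.warehouse.external.dir": "/warehouse/external",
}


def default_table_location(conf, db, table, is_external):
    """Hypothetical helper: pick the warehouse root by table type,
    then append the assumed <db>.db/<table> layout."""
    key = (
        "metastore.warehouse.external.dir"
        if is_external
        else "metastore.warehouse.dir"
    )
    return posixpath.join(conf[key], f"{db}.db", table)


# Location recorded for the table after translation to an external table.
hms_location = default_table_location(CONF, "joetest", "testtable", is_external=True)

# Location assumed by a writer that only knows about metastore.warehouse.dir.
assumed_location = default_table_location(CONF, "joetest", "testtable", is_external=False)

print("table location:", hms_location)
print("data written to:", assumed_location)
```

When the two configuration values differ, the two paths differ, so rows written under the assumed path are never seen by a scan of the table's recorded location. That is consistent with the count of 0 in the session above.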

          People

            Assignee: Quanlong Huang (stigahuang)
            Reporter: Joe McDonnell (joemcdonnell)