HIVE-2980

Show a warning or an error when the data directory is empty or does not exist

    Details

    • Type: Improvement
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: Diagnosability, Metastore
    • Labels:
      None

      Description

      It would be a good idea to show a warning or an error when the data directory is missing or empty.

      This would help cut down debugging time, and it is also useful information to have about deleted data.

        Activity

        Ashutosh Chauhan added a comment -

        This is certainly not an error condition. A table with no data (or no data directory) is a valid use case. I am not even sure about a warning. When you run a select on an empty table, you get 0 records back; what else would you expect?

        alex gemini added a comment -

        I think this may be related to table or partition status. When a new table or partition is created, it would store a "NEW" status in the metastore; if it is deleted or dropped, the status would change to "DROPPED"; if the table is populated using load DML, the status would be "LOADED". The desc command would then print the table or partition status to help the user understand the last state of the table.

        alex gemini added a comment -

        This may also help stats gathering: if the table or partition status is "NEW" or "DROPPED", we know there are 0 rows in it; if the status is "LOADED" or "INSERT", we can continue to gather table stats.

        alex gemini added a comment -

        Currently, transient_lastDdlTime is of limited use, since it does not indicate which DDL statement the user ran. I suggest replacing it with two new properties, "update_time" and "status": update_time would record the last time the user ran "insert", "insert overwrite", or "drop", and status would be one of "NEW", "LOADED", "INSERT", or "DROPPED".
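
The status proposal in the comments above could be modelled roughly as follows. This is only an illustrative sketch in Python, not Hive code: the names `TableStatus`, `TableState`, and `known_row_count` are hypothetical, and the metastore has no such fields today.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class TableStatus(Enum):
    """The four states suggested in the comment (hypothetical, not a Hive API)."""
    NEW = "NEW"
    LOADED = "LOADED"
    INSERT = "INSERT"
    DROPPED = "DROPPED"


@dataclass
class TableState:
    status: TableStatus
    update_time: int  # epoch seconds of the last insert / insert overwrite / drop


def known_row_count(state: TableState) -> Optional[int]:
    """Return 0 when the status alone shows the table is empty, else None."""
    if state.status in (TableStatus.NEW, TableStatus.DROPPED):
        return 0
    # LOADED or INSERT: the row count is unknown, so stats must still be gathered.
    return None
```

With such a model, a stats gatherer could skip scanning any table or partition for which `known_row_count` returns 0.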

        sukhendu chakraborty added a comment -

        I tend to disagree with Ashutosh. When a table is created, a corresponding directory is created in /<hive path>/warehouse. So, technically, this directory is part of the table metadata. Therefore, if this directory is erroneously deleted, the table metadata is in an inconsistent state and the user should be notified about it. If you remove all the files inside the directory, then it's the same as an empty table, and a select should pass with 0 rows returned.
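
The distinction drawn here, a missing directory versus an existing but empty one, can be sketched as a small check. This is a minimal illustration that assumes a local filesystem path via Python's `pathlib` as a stand-in for the warehouse location; `check_data_dir` and `warning_for` are hypothetical names, not part of Hive.

```python
from pathlib import Path
from typing import Optional


def check_data_dir(table_dir: Path) -> str:
    """Classify a table's data directory as MISSING, EMPTY, or HAS_DATA."""
    if not table_dir.is_dir():
        # Directory was never created or has been deleted (or is not a directory).
        return "MISSING"
    if not any(table_dir.iterdir()):
        # Directory exists but holds no files: equivalent to an empty table.
        return "EMPTY"
    return "HAS_DATA"


def warning_for(table_dir: Path) -> Optional[str]:
    """Return a warning only for the inconsistent (missing directory) case."""
    if check_data_dir(table_dir) == "MISSING":
        return f"data directory {table_dir} does not exist"
    return None  # empty or populated tables are valid; a select simply returns rows
```

Under this reading, only the missing directory produces a message, while an empty directory behaves like any empty table.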


          People

          • Assignee:
            Unassigned
            Reporter:
            Nitin Pawar
          • Votes:
            0
            Watchers:
            1
