Details

    • Type: New Feature
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 0.6.0
    • Fix Version/s: None
    • Component/s: Query Processor
    • Labels:
      None

      Description

      Implement a shadow metastore that is held in memory and lives for the duration of a session. It can contain definitions of session-specific views, which can be used to implement data flow variables in Hive. It can also be used for testing scripts. We will support the latter use case first: all DDL statements in the session create objects in the session metastore, and all queries are converted to explain internal. Any thoughts on load commands?

      This feature is enabled when

      set hive.session.test = true

      is done in the session.

      1. HIVE-805-1.patch
        30 kB
        Ashish Thusoo
      2. HIVE-805.patch
        28 kB
        Ashish Thusoo

        Issue Links

          Activity

          Zheng Shao added a comment -

          As with my last comment, I think we should interpret the option "hive.metastore.dryrun" at compile time instead of at execution time.
          It should be compiled into the plan.

          The execution code can look at the plan and then decide which metastore to "create" the table in.

          Now with "view" in, we should also make it work for views.
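
A minimal sketch of the compile-time idea, for illustration only. The class names `CreateTablePlan`, `PlanCompiler`, and `PlanExecutor` are hypothetical and are not Hive classes; the only name taken from the comment is the option "hive.metastore.dryrun". The point it shows is that the option is read once while compiling, stored in the plan, and the execution side consults only the plan.

```java
import java.util.Map;

/** Hypothetical plan node; the target metastore is fixed when the plan is built. */
final class CreateTablePlan {
    final String tableName;
    final boolean dryRun; // captured at compile time, never re-read from the conf

    CreateTablePlan(String tableName, boolean dryRun) {
        this.tableName = tableName;
        this.dryRun = dryRun;
    }
}

/** Hypothetical compiler step: the only place that looks at the configuration. */
final class PlanCompiler {
    static CreateTablePlan compile(String tableName, Map<String, String> conf) {
        boolean dryRun = Boolean.parseBoolean(
                conf.getOrDefault("hive.metastore.dryrun", "false"));
        return new CreateTablePlan(tableName, dryRun);
    }
}

/** Hypothetical execution step: decides from the plan alone. */
final class PlanExecutor {
    static String targetMetastore(CreateTablePlan plan) {
        return plan.dryRun ? "session" : "regular";
    }
}
```

With this shape, changing the option after compilation does not affect a plan that has already been built, which is the behavior the comment asks for.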

          Ashish Thusoo added a comment -

          Incorporated Prasad's review comments. I have not yet disabled this for partitioned tables though.

          Prasad Chakka added a comment -
          1. Can you rename 'test' mode to 'temporary' mode or something like that? 'test' here should mean either dry-run or temporary.
          2. This patch tries to allow creation of a partition of a regular table in the temporary store. I am sure that it fails. I don't think there is a good solution at all, since the metastore requires the table to exist before creating a partition. Should we allow this at all? If we need it, then we may have to redesign this.
          3. Once a session table is created, a table parameter should identify it as such. This can be done by adding that parameter before creating the table in the session metastore. alter_table etc. that take in a table object should depend on this table instead of trying to alter both metastores.
          4. If ignoreUnknownTab=true then a NoSuchObjectException will not be thrown, so the code below will be incorrect.
            boolean tableDropped = false;
            if (this.conf.getBoolVar(HiveConf.ConfVars.HIVESESSIONTEST)) {
              try {
                getSessionMSC().dropTable(dbName, tableName, deleteData, ignoreUnknownTab);
                tableDropped = true;
              }
              catch (NoSuchObjectException e) {
                // Ignore if the table is not found
              }
            }
            
            if (!tableDropped)
              getMSC().dropTable(dbName, tableName, deleteData, ignoreUnknownTab);
            

          This pattern can be rewritten as:

          if (this.conf.getBoolVar(HiveConf.ConfVars.HIVESESSIONTEST)) {
            try {
              getSessionMSC().dropTable(dbName, tableName, deleteData, ignoreUnknownTab);
            }
            catch (NoSuchObjectException e) {
              getMSC().dropTable(dbName, tableName, deleteData, ignoreUnknownTab);
            }
          }
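
Both snippets above still rely on NoSuchObjectException to detect that the table is missing from the session metastore, which point 4 says will not happen when ignoreUnknownTab=true. The rewritten form also never touches the regular metastore when the session flag is off. A minimal sketch of one alternative is to check for existence first. `FakeMetastore` and `SessionAwareDrop` are hypothetical in-memory stand-ins written for this example; they are not Hive's metastore client API, and the deleteData argument is left out.

```java
import java.util.HashSet;
import java.util.Set;

/** Hypothetical in-memory stand-in for a metastore client (not Hive's API). */
final class FakeMetastore {
    private final Set<String> tables = new HashSet<>();

    void createTable(String db, String table) {
        tables.add(db + "." + table);
    }

    boolean tableExists(String db, String table) {
        return tables.contains(db + "." + table);
    }

    /** Removes the table and reports whether it was present. */
    boolean dropTable(String db, String table) {
        return tables.remove(db + "." + table);
    }
}

final class SessionAwareDrop {
    /**
     * Drops from the session metastore when session mode is on and the table
     * lives there; otherwise drops from the regular metastore.
     * Returns "session", "regular", or "none" to say where the drop happened.
     */
    static String dropTable(FakeMetastore session, FakeMetastore regular,
                            boolean sessionMode, String db, String table,
                            boolean ignoreUnknownTab) {
        // Decide by an explicit existence check, not by catching an exception.
        if (sessionMode && session.tableExists(db, table)) {
            session.dropTable(db, table);
            return "session";
        }
        if (regular.dropTable(db, table)) {
            return "regular";
        }
        if (!ignoreUnknownTab) {
            throw new IllegalStateException("Table not found: " + db + "." + table);
        }
        return "none";
    }
}
```

The table parameter suggested in point 3 would serve the same purpose as the existence check here: the caller would know up front which metastore owns the table.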
          
          Zheng Shao added a comment -

          In the mode of "hive.session.test = true", we should translate "create" to "create temporary", and "select" to "explain select".

          The Metastore/Hive.java code should only look at whether it's "create" or "create temporary" to decide which metastore to use.
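
A rough sketch of the translation described above, assuming a purely string-based rewrite for illustration. `SessionTestRewriter` is a hypothetical helper, and "create temporary" is the syntax proposed in this comment; a real implementation would more likely rewrite the parsed statement inside the compiler than the raw text.

```java
/** Hypothetical helper that rewrites statements when session test mode is on. */
final class SessionTestRewriter {
    private static boolean startsWithIgnoreCase(String s, String prefix) {
        return s.regionMatches(true, 0, prefix, 0, prefix.length());
    }

    static String rewrite(String sql, boolean sessionTest) {
        String trimmed = sql.trim();
        if (!sessionTest) {
            return trimmed; // normal mode: statement is left alone
        }
        if (startsWithIgnoreCase(trimmed, "create temporary ")) {
            return trimmed; // already targets the session metastore
        }
        if (startsWithIgnoreCase(trimmed, "create ")) {
            return "CREATE TEMPORARY " + trimmed.substring("create ".length()).trim();
        }
        if (startsWithIgnoreCase(trimmed, "select ")) {
            return "EXPLAIN " + trimmed; // queries are explained, not run
        }
        return trimmed;
    }
}
```

For example, with session test mode on, `rewrite("create table t (a int)", true)` yields `CREATE TEMPORARY table t (a int)`, so the metastore code only has to distinguish "create" from "create temporary".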

          Ashish Thusoo added a comment -

          Fair enough. Let me fix that. I presume we can create a separate Hive conf variable that keeps track of the temp or test namespace.

          Prasad Chakka added a comment -

          We need to do this now, since Metastore.createTable() will create the directory for you, so when the session-level metastore closes, these directories will be left hanging unnecessarily.
          I didn't look into the code yet.

          Ashish Thusoo added a comment -

          It is the same location. However, in session.test mode the queries and DML statements do not run; only an explain output is generated for them. We can extend this later to maintain a different namespace for this data, or we can do that now as well and it will simply not be used in test mode. Thoughts?

          Ashish Thusoo added a comment -

          submitting patch.

          Ashish Thusoo added a comment -

          Patch that implements this. Please send in your comments.

          Prasad Chakka added a comment -

          What is the HDFS location for tables in the session-level metastore? Is it the same location as for a regular table?


            People

            • Assignee:
              Ashish Thusoo
              Reporter:
              Ashish Thusoo
            • Votes:
              0
              Watchers:
              2
