Hive
  1. Hive
  2. HIVE-1428

ALTER TABLE ADD PARTITION fails with a remote Thrift metastore

    Details

    • Type: Bug Bug
    • Status: Closed
    • Priority: Major Major
    • Resolution: Fixed
    • Affects Version/s: 0.5.0
    • Fix Version/s: 0.6.0
    • Component/s: Metastore
    • Labels:
      None
    • Hadoop Flags:
      Reviewed

      Description

      If the hive cli is configured to use a remote metastore, ALTER TABLE ... ADD PARTITION commands will fail with an error similar to the following:

      [pradeepk@chargesize:~/dev/howl]hive --auxpath ult-serde.jar -e "ALTER TABLE mytable add partition(datestamp = '20091101', srcid = '10',action) location '/user/pradeepk/mytable/20091101/10';"
      10/06/16 17:08:59 WARN conf.Configuration: DEPRECATED: hadoop-site.xml found in the classpath. Usage of hadoop-site.xml is deprecated. Instead use core-site.xml, mapred-site.xml and hdfs-site.xml to override properties of core-default.xml, mapred-default.xml and hdfs-default.xml respectively
      Hive history file=/tmp/pradeepk/hive_job_log_pradeepk_201006161709_1934304805.txt
      FAILED: Error in metadata: org.apache.thrift.TApplicationException: get_partition failed: unknown result
      FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask
      [pradeepk@chargesize:~/dev/howl]

      This is due to a check that tries to retrieve the partition to see if it exists. If it does not, an attempt is made to pass a null value from the metastore. Since thrift does not support null return values, an exception is thrown.

      1. HIVE-1428.patch
        23 kB
        Pradeep Kamath
      2. HIVE-1428-0.6.0-patch.txt
        32 kB
        Carl Steinbach
      3. HIVE-1428-2.patch
        37 kB
        Pradeep Kamath
      4. HIVE-1428-3.patch
        29 kB
        Pradeep Kamath
      5. TestHiveMetaStoreRemote.java
        13 kB
        Pradeep Kamath

        Activity

        Paul Yang created issue -
        Paul Yang made changes -
        Field Original Value New Value
        Description If the hive cli is configured to use a remote metastore, ALTER TABLE ... ADD PARTITION commands will fail with an error similar to the following:

        {code}
        [pradeepk@chargesize:~/dev/howl]hive --auxpath ult-serde.jar -e "ALTER TABLE mytable add partition(datestamp = '20091101', srcid = '10',action) location '/user/pradeepk/mytable/20091101/10';"
        10/06/16 17:08:59 WARN conf.Configuration: DEPRECATED: hadoop-site.xml found in the classpath. Usage of hadoop-site.xml is deprecated. Instead use core-site.xml, mapred-site.xml and hdfs-site.xml to override properties of core-default.xml, mapred-default.xml and hdfs-default.xml respectively
        Hive history file=/tmp/pradeepk/hive_job_log_pradeepk_201006161709_1934304805.txt
        FAILED: Error in metadata: org.apache.thrift.TApplicationException: get_partition failed: unknown result
        FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask
        [pradeepk@chargesize:~/dev/howl]
        {code}

        This is due to a check that tries to retrieve the partition to check if it exists. If it does not, an attempt is made to pass a null partition value from the metastore. Since thrift does not support null return values, an exception is thrown when the CLI is configured to use a remote metastore.
        If the hive cli is configured to use a remote metastore, ALTER TABLE ... ADD PARTITION commands will fail with an error similar to the following:

        {code}
        [pradeepk@chargesize:~/dev/howl]hive --auxpath ult-serde.jar -e "ALTER TABLE mytable add partition(datestamp = '20091101', srcid = '10',action) location '/user/pradeepk/mytable/20091101/10';"
        10/06/16 17:08:59 WARN conf.Configuration: DEPRECATED: hadoop-site.xml found in the classpath. Usage of hadoop-site.xml is deprecated. Instead use core-site.xml, mapred-site.xml and hdfs-site.xml to override properties of core-default.xml, mapred-default.xml and hdfs-default.xml respectively
        Hive history file=/tmp/pradeepk/hive_job_log_pradeepk_201006161709_1934304805.txt
        FAILED: Error in metadata: org.apache.thrift.TApplicationException: get_partition failed: unknown result
        FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask
        [pradeepk@chargesize:~/dev/howl]
        {code}

        This is due to a check that tries to retrieve the partition to see if it exists. If it does not, an attempt is made to pass a null value from the metastore. Since thrift does not support null return values, an exception is thrown.
        Hide
        Pradeep Kamath added a comment -

        Attached patch addresses the issue by throwing a NoSuchObjectException in ObjectStore for get_partition() method when there are no partitions matching the arguments. This requires a change in the thrift idl for hive_metastore. Hence the patch also has the compiler generated files. I have tested that the "ALTER TABLE ADD pARTITION.." works when hive client uses thrift and connects to a thrift server and also in the case it connects directly to the db - not sure how to test this in a unit test framework - would need some guidance in that area.

        Show
        Pradeep Kamath added a comment - Attached patch addresses the issue by throwing a NoSuchObjectException in ObjectStore for get_partition() method when there are no partitions matching the arguments. This requires a change in the thrift idl for hive_metastore. Hence the patch also has the compiler generated files. I have tested that the "ALTER TABLE ADD pARTITION.." works when hive client uses thrift and connects to a thrift server and also in the case it connects directly to the db - not sure how to test this in a unit test framework - would need some guidance in that area.
        Pradeep Kamath made changes -
        Attachment HIVE-1428.patch [ 12448018 ]
        Hide
        Paul Yang added a comment -

        The unit tests that we have for the metastore run locally, so I don't think there's a way to use the existing framework for the thrift command - take a look at TestHiveMetastore.java. To test this, you would have to create a new JUnit test (or modify the existing one) to sets up a metastore server and issues commands (create table, drop table, etc) to it. Does that seem doable?

        Show
        Paul Yang added a comment - The unit tests that we have for the metastore run locally, so I don't think there's a way to use the existing framework for the thrift command - take a look at TestHiveMetastore.java. To test this, you would have to create a new JUnit test (or modify the existing one) to sets up a metastore server and issues commands (create table, drop table, etc) to it. Does that seem doable?
        Paul Yang made changes -
        Description If the hive cli is configured to use a remote metastore, ALTER TABLE ... ADD PARTITION commands will fail with an error similar to the following:

        {code}
        [pradeepk@chargesize:~/dev/howl]hive --auxpath ult-serde.jar -e "ALTER TABLE mytable add partition(datestamp = '20091101', srcid = '10',action) location '/user/pradeepk/mytable/20091101/10';"
        10/06/16 17:08:59 WARN conf.Configuration: DEPRECATED: hadoop-site.xml found in the classpath. Usage of hadoop-site.xml is deprecated. Instead use core-site.xml, mapred-site.xml and hdfs-site.xml to override properties of core-default.xml, mapred-default.xml and hdfs-default.xml respectively
        Hive history file=/tmp/pradeepk/hive_job_log_pradeepk_201006161709_1934304805.txt
        FAILED: Error in metadata: org.apache.thrift.TApplicationException: get_partition failed: unknown result
        FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask
        [pradeepk@chargesize:~/dev/howl]
        {code}

        This is due to a check that tries to retrieve the partition to see if it exists. If it does not, an attempt is made to pass a null value from the metastore. Since thrift does not support null return values, an exception is thrown.
        If the hive cli is configured to use a remote metastore, ALTER TABLE ... ADD PARTITION commands will fail with an error similar to the following:

        [pradeepk@chargesize:~/dev/howl]hive --auxpath ult-serde.jar -e "ALTER TABLE mytable add partition(datestamp = '20091101', srcid = '10',action) location '/user/pradeepk/mytable/20091101/10';"
        10/06/16 17:08:59 WARN conf.Configuration: DEPRECATED: hadoop-site.xml found in the classpath. Usage of hadoop-site.xml is deprecated. Instead use core-site.xml, mapred-site.xml and hdfs-site.xml to override properties of core-default.xml, mapred-default.xml and hdfs-default.xml respectively
        Hive history file=/tmp/pradeepk/hive_job_log_pradeepk_201006161709_1934304805.txt
        FAILED: Error in metadata: org.apache.thrift.TApplicationException: get_partition failed: unknown result
        FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask
        [pradeepk@chargesize:~/dev/howl]

        This is due to a check that tries to retrieve the partition to see if it exists. If it does not, an attempt is made to pass a null value from the metastore. Since thrift does not support null return values, an exception is thrown.
        Hide
        Pradeep Kamath added a comment -

        I have tried writing a unit test based on the existing one in TestHiveMetastore which starts off a thrift server. Unfortunately I have hit issues since some if the ant properties like build.dir do not get expanded in the conf seen by the server leading to error. I am attaching the Test file - can someone suggest how I can resolve this and if there is any other test which has attempted something similar.

        Here is the exception I see:

        2Codec, fs.checkpoint.size=67108864}
            [junit] MetaException(message:Got exception: java.io.FileNotFoundException File file:/test/data/warehouse/compdb.db does not exist.)
            [junit]     at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$create_database_result.read(ThriftHiveMetastore.java:2751)
            [junit]     at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.recv_create_database(ThriftHiveMetastore.java:127)
            [junit]     at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.create_database(ThriftHiveMetastore.java:104)
            [junit]     at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.createDatabase(HiveMetaStoreClient.java:270)
        

        Here is the value in the conf as seen by the server:
        hive.metastore.warehouse.dir=file://$

        {build.dir}

        /test/data/warehouse/

        Show
        Pradeep Kamath added a comment - I have tried writing a unit test based on the existing one in TestHiveMetastore which starts off a thrift server. Unfortunately I have hit issues since some if the ant properties like build.dir do not get expanded in the conf seen by the server leading to error. I am attaching the Test file - can someone suggest how I can resolve this and if there is any other test which has attempted something similar. Here is the exception I see: 2Codec, fs.checkpoint.size=67108864} [junit] MetaException(message:Got exception: java.io.FileNotFoundException File file:/test/data/warehouse/compdb.db does not exist.) [junit] at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$create_database_result.read(ThriftHiveMetastore.java:2751) [junit] at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.recv_create_database(ThriftHiveMetastore.java:127) [junit] at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.create_database(ThriftHiveMetastore.java:104) [junit] at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.createDatabase(HiveMetaStoreClient.java:270) Here is the value in the conf as seen by the server: hive.metastore.warehouse.dir= file://$ {build.dir} /test/data/warehouse/
        Pradeep Kamath made changes -
        Attachment TestHiveMetaStoreRemote.java [ 12448325 ]
        Hide
        Paul Yang added a comment -

        Can you try using a thread to create a HiveMetaStore instance?

        e.g.

        
          private static class RunMS implements Runnable {
        
            @Override
            public void run() {
              System.out.println("Running metastore!");
              String [] args = new String [0];
              HiveMetaStore.main(args);
            }
        
          }
        
          @Override
          protected void setUp() throws Exception {
            super.setUp();
        
            Thread t = new Thread(new RunMS());
            t.start();
        
            // Wait a little bit for the metastore to start. Should probably have
            // a better way of detecting if the metastore has started?
            Thread.sleep(5000);
        
            // Set conf to connect to the local metastore.
            HiveConf hiveConf = new HiveConf(this.getClass());
            // hive.metastore.local should be defined in HiveConf
            hiveConf.set("hive.metastore.local", "false");
            hiveConf.setVar(HiveConf.ConfVars.METASTOREURIS, "thrift://localhost:9083");
            // There's a bug in HiveConf that needs to be fixed before this next line 
            // will work
            // Change METATORETHRIFTRETRIES("hive.metastore.connect.retries", ""),
            // To METATORETHRIFTRETRIES("hive.metastore.connect.retries", 3),
            hiveConf.setIntVar(HiveConf.ConfVars.METATORETHRIFTRETRIES, 3);
        
            client = new HiveMetaStoreClient(hiveConf);
            // Now you have the client - run necessary tests.
          }
        
        Show
        Paul Yang added a comment - Can you try using a thread to create a HiveMetaStore instance? e.g. private static class RunMS implements Runnable { @Override public void run() { System .out.println( "Running metastore!" ); String [] args = new String [0]; HiveMetaStore.main(args); } } @Override protected void setUp() throws Exception { super .setUp(); Thread t = new Thread ( new RunMS()); t.start(); // Wait a little bit for the metastore to start. Should probably have // a better way of detecting if the metastore has started? Thread .sleep(5000); // Set conf to connect to the local metastore. HiveConf hiveConf = new HiveConf( this .getClass()); // hive.metastore.local should be defined in HiveConf hiveConf.set( "hive.metastore.local" , " false " ); hiveConf.setVar(HiveConf.ConfVars.METASTOREURIS, "thrift: //localhost:9083" ); // There's a bug in HiveConf that needs to be fixed before this next line // will work // Change METATORETHRIFTRETRIES( "hive.metastore.connect.retries" , ""), // To METATORETHRIFTRETRIES( "hive.metastore.connect.retries" , 3), hiveConf.setIntVar(HiveConf.ConfVars.METATORETHRIFTRETRIES, 3); client = new HiveMetaStoreClient(hiveConf); // Now you have the client - run necessary tests. }
        Hide
        Ashish Thusoo added a comment -

        Is your question the fact that build.dir is an empty string? build.dir gets defined in build-common.xml which inturn picks properties from build.properties. The build.xml in the metastore directory includes build-common.xml so it should be getting build.dir. How are you running this test?

        Show
        Ashish Thusoo added a comment - Is your question the fact that build.dir is an empty string? build.dir gets defined in build-common.xml which inturn picks properties from build.properties. The build.xml in the metastore directory includes build-common.xml so it should be getting build.dir. How are you running this test?
        Hide
        Paul Yang added a comment -

        Meant to add - By using a thread, you can avoid having to set up an environment, classpath, etc. An important line is "HiveConf hiveConf = new HiveConf(this.getClass());", which allows you to load the configuration files that are in the classpath defined for testing.

        And the command to run the test should be 'ant test -Dtestcase=TestHiveMetastoreRemote'

        Show
        Paul Yang added a comment - Meant to add - By using a thread, you can avoid having to set up an environment, classpath, etc. An important line is "HiveConf hiveConf = new HiveConf(this.getClass());", which allows you to load the configuration files that are in the classpath defined for testing. And the command to run the test should be 'ant test -Dtestcase=TestHiveMetastoreRemote'
        Hide
        Pradeep Kamath added a comment -

        New patch with unit tests included. Thanks for the suggestion of using threads Paul, have used that in the unit test. I have made a change in HiveConf to enable the unit test. The remaining changes in the patch are as in the first version - to throw NoSuchObjectException in getPartition() when no partition exists. This mainly changes the generated thrift code (to add throws in the method signature) and in other code which interacts with it to catch the exception and set the partition object to null.

        Show
        Pradeep Kamath added a comment - New patch with unit tests included. Thanks for the suggestion of using threads Paul, have used that in the unit test. I have made a change in HiveConf to enable the unit test. The remaining changes in the patch are as in the first version - to throw NoSuchObjectException in getPartition() when no partition exists. This mainly changes the generated thrift code (to add throws in the method signature) and in other code which interacts with it to catch the exception and set the partition object to null.
        Pradeep Kamath made changes -
        Attachment HIVE-1428-2.patch [ 12448984 ]
        Hide
        Pradeep Kamath added a comment -

        HIVE-1428-2.patch is ready for review.

        Show
        Pradeep Kamath added a comment - HIVE-1428 -2.patch is ready for review.
        Pradeep Kamath made changes -
        Status Open [ 1 ] Patch Available [ 10002 ]
        Hide
        John Sichi added a comment -

        Setting assignee to Pradeep.

        (Pradeep, I just added you as a contributor on Hive, so you should be able to assign issues to yourself going forward.)

        Show
        John Sichi added a comment - Setting assignee to Pradeep. (Pradeep, I just added you as a contributor on Hive, so you should be able to assign issues to yourself going forward.)
        John Sichi made changes -
        Assignee Pradeep Kamath [ pradeepkth ]
        Hide
        Paul Yang added a comment -

        Fix looks pretty good - two things though:

        1. The field identifiers for NoSuchObjectException and MetaException in hive_metastore.thrift should be swapped - this is because thrift uses those identifiers for versioning and we want to be consistent.

        2. TestHiveMetaStoreRemote and TestHiveMetaStore share quite a bit of code. Can you extract this out to a separate class? Or maybe roll the remote metastore client into TestHiveMetastore.

        Show
        Paul Yang added a comment - Fix looks pretty good - two things though: 1. The field identifiers for NoSuchObjectException and MetaException in hive_metastore.thrift should be swapped - this is because thrift uses those identifiers for versioning and we want to be consistent. 2. TestHiveMetaStoreRemote and TestHiveMetaStore share quite a bit of code. Can you extract this out to a separate class? Or maybe roll the remote metastore client into TestHiveMetastore.
        Paul Yang made changes -
        Status Patch Available [ 10002 ] Open [ 1 ]
        Hide
        Pradeep Kamath added a comment -

        New patch incorporating Paul's review comments attached.

        Show
        Pradeep Kamath added a comment - New patch incorporating Paul's review comments attached.
        Pradeep Kamath made changes -
        Attachment HIVE-1428-3.patch [ 12449276 ]
        Hide
        Pradeep Kamath added a comment -

        HIVE-1428-3.patch is ready to be reviewed.

        Show
        Pradeep Kamath added a comment - HIVE-1428 -3.patch is ready to be reviewed.
        Pradeep Kamath made changes -
        Status Open [ 1 ] Patch Available [ 10002 ]
        Hide
        Paul Yang added a comment -

        "hive.metastore.local" should be made into a HiveConf, but that can be done in a separate JIRA since it's probably used elsewhere as well. Looks good +1.

        Show
        Paul Yang added a comment - "hive.metastore.local" should be made into a HiveConf, but that can be done in a separate JIRA since it's probably used elsewhere as well. Looks good +1.
        Hide
        Pradeep Kamath added a comment -

        Paul - thanks for the review - I presume you mean it should be part of HiveConf.ConfVars enum - I was wondering why it wasn't too but did not go to make that change since it is not pertinent to the issue. I think we should address that in a separate jira. Can this patch be committed now? Any other steps needed from me?

        Show
        Pradeep Kamath added a comment - Paul - thanks for the review - I presume you mean it should be part of HiveConf.ConfVars enum - I was wondering why it wasn't too but did not go to make that change since it is not pertinent to the issue. I think we should address that in a separate jira. Can this patch be committed now? Any other steps needed from me?
        Hide
        John Sichi added a comment -

        +1. Will commit when tests pass.

        Show
        John Sichi added a comment - +1. Will commit when tests pass.
        John Sichi made changes -
        Summary ALTER TABLE ADD PARTITION fails with a remote Thirft metastore ALTER TABLE ADD PARTITION fails with a remote Thrift metastore
        Fix Version/s 0.7.0 [ 12315150 ]
        Hide
        John Sichi added a comment -

        Committed. Thanks Pradeep!

        Show
        John Sichi added a comment - Committed. Thanks Pradeep!
        John Sichi made changes -
        Hadoop Flags [Reviewed]
        Resolution Fixed [ 1 ]
        Status Patch Available [ 10002 ] Resolved [ 5 ]
        Hide
        Carl Steinbach added a comment -

        We need to backport this fix to 0.6.0

        Show
        Carl Steinbach added a comment - We need to backport this fix to 0.6.0
        Carl Steinbach made changes -
        Resolution Fixed [ 1 ]
        Status Resolved [ 5 ] Reopened [ 4 ]
        Carl Steinbach made changes -
        Attachment HIVE-1428-0.6.0-patch.txt [ 12451981 ]
        Carl Steinbach made changes -
        Fix Version/s 0.6.0 [ 12314524 ]
        Fix Version/s 0.7.0 [ 12315150 ]
        Affects Version/s 0.5.0 [ 12314156 ]
        Affects Version/s 0.6.0 [ 12314524 ]
        Affects Version/s 0.7.0 [ 12315150 ]
        Carl Steinbach made changes -
        Status Reopened [ 4 ] Patch Available [ 10002 ]
        Hide
        John Sichi added a comment -

        I'll work on the 0.6 commit.

        Show
        John Sichi added a comment - I'll work on the 0.6 commit.
        Hide
        John Sichi added a comment -

        Committed to 0.6.

        Show
        John Sichi added a comment - Committed to 0.6.
        John Sichi made changes -
        Status Patch Available [ 10002 ] Resolved [ 5 ]
        Resolution Fixed [ 1 ]
        Carl Steinbach made changes -
        Status Resolved [ 5 ] Closed [ 6 ]
        Transition Time In Source Status Execution Times Last Executer Last Execution Date
        Patch Available Patch Available Open Open
        7h 50m 1 Paul Yang 08/Jul/10 23:51
        Open Open Patch Available Patch Available
        19d 10h 40m 2 Pradeep Kamath 12/Jul/10 19:33
        Resolved Resolved Reopened Reopened
        29d 1h 40m 1 Carl Steinbach 13/Aug/10 02:26
        Reopened Reopened Patch Available Patch Available
        3m 10s 1 Carl Steinbach 13/Aug/10 02:29
        Patch Available Patch Available Resolved Resolved
        3d 3h 49m 2 John Sichi 14/Aug/10 01:06
        Resolved Resolved Closed Closed
        489d 22h 56m 1 Carl Steinbach 17/Dec/11 00:03

          People

          • Assignee:
            Pradeep Kamath
            Reporter:
            Paul Yang
          • Votes:
            0 Vote for this issue
            Watchers:
            3 Start watching this issue

            Dates

            • Created:
              Updated:
              Resolved:

              Development