Details

    • Type: Bug Bug
    • Status: Closed
    • Priority: Major Major
    • Resolution: Fixed
    • Affects Version/s: 0.7.0
    • Fix Version/s: 0.7.0
    • Component/s: None
    • Labels:
      None
    • Hadoop Flags:
      Reviewed

      Description

      After HIVE-1609, we are seeing some table not found errors intermittently.
      We have a test case where 5 processes are concurrently issueing the same query -
      explain extended insert .. select from <T>

      and once in a while, we get a error <T> not found -
      When we revert back the JDO version, the error is gone.

      We can investigate later to find the JDO bug, but for now this is a show-stopper for facebook, and needs
      to be reverted back immediately.

      This also means, that the filters will not be pushed to mysql.

      1. HIVE-1853.1.patch
        10 kB
        Paul Yang
      2. HIVE-1853.2.patch
        10 kB
        Paul Yang

        Issue Links

          Activity

          Hide
          Andy Jefferson added a comment -

          > Since the bug is within the newer version JDO/datanucleus

          Why not reference what this bug was in DataNucleus JIRA ? that way someone at that side can understand what this refers to. Or maybe it was never reported to DataNucleus?

          Show
          Andy Jefferson added a comment - > Since the bug is within the newer version JDO/datanucleus Why not reference what this bug was in DataNucleus JIRA ? that way someone at that side can understand what this refers to. Or maybe it was never reported to DataNucleus?
          Hide
          Namit Jain added a comment -

          Unfortunately, the query that I was running used some production tables.
          I will try to reproduce the query with some non-production tables.

          Show
          Namit Jain added a comment - Unfortunately, the query that I was running used some production tables. I will try to reproduce the query with some non-production tables.
          Hide
          Ashutosh Chauhan added a comment -

          We have a test case where 5 processes are concurrently issueing the same query - explain extended insert .. select from <T>and once in a while, we get a error <T> not found - When we revert back the JDO version, the error is gone.

          Namit,

          Will it be possible to post the testcase which shows this problem?

          Show
          Ashutosh Chauhan added a comment - We have a test case where 5 processes are concurrently issueing the same query - explain extended insert .. select from <T>and once in a while, we get a error <T> not found - When we revert back the JDO version, the error is gone. Namit, Will it be possible to post the testcase which shows this problem?
          Show
          John Sichi added a comment - Yeah, Hudson agrees with Ashutosh: https://hudson.apache.org/hudson/job/Hive-trunk-h0.20/445/testReport/junit/org.apache.hadoop.hive.metastore/TestEmbeddedHiveMetaStore/testPartitionFilter/
          Hide
          Ashutosh Chauhan added a comment -

          Was doing some testing for HIVE-1696 test cases now fail with

           <error message="listMPartitionsByFilter is not supported due to a JDO library downgrade" type="java.lang.RuntimeException">java.lang.RuntimeException: listMPartitionsByFilter is not supported due to a JDO library downgrade
                  at org.apache.hadoop.hive.metastore.ObjectStore.listMPartitionsByFilter(ObjectStore.java:1025)
                  at org.apache.hadoop.hive.metastore.ObjectStore.getPartitionsByFilter(ObjectStore.java:972)
                  at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler$36.run(HiveMetaStore.java:2126)
                  at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler$36.run(HiveMetaStore.java:2123)
                  at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.executeWithRetry(HiveMetaStore.java:240)
                  at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.get_partitions_by_filter(HiveMetaStore.java:2123)
                  at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.listPartitionsByFilter(HiveMetaStoreClient.java:589)
                  at org.apache.hadoop.hive.metastore.TestHiveMetaStore.checkFilter(TestHiveMetaStore.java:1147)
                  at org.apache.hadoop.hive.metastore.TestHiveMetaStore.testPartitionFilter(TestHiveMetaStore.java:1065)
          

          May be for time being (until underlying issue is fixed) we shall also disable the failing tests so that trunk is buildable with passing all tests.

          Show
          Ashutosh Chauhan added a comment - Was doing some testing for HIVE-1696 test cases now fail with <error message= "listMPartitionsByFilter is not supported due to a JDO library downgrade" type= "java.lang.RuntimeException" >java.lang.RuntimeException: listMPartitionsByFilter is not supported due to a JDO library downgrade at org.apache.hadoop.hive.metastore.ObjectStore.listMPartitionsByFilter(ObjectStore.java:1025) at org.apache.hadoop.hive.metastore.ObjectStore.getPartitionsByFilter(ObjectStore.java:972) at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler$36.run(HiveMetaStore.java:2126) at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler$36.run(HiveMetaStore.java:2123) at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.executeWithRetry(HiveMetaStore.java:240) at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.get_partitions_by_filter(HiveMetaStore.java:2123) at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.listPartitionsByFilter(HiveMetaStoreClient.java:589) at org.apache.hadoop.hive.metastore.TestHiveMetaStore.checkFilter(TestHiveMetaStore.java:1147) at org.apache.hadoop.hive.metastore.TestHiveMetaStore.testPartitionFilter(TestHiveMetaStore.java:1065) May be for time being (until underlying issue is fixed) we shall also disable the failing tests so that trunk is buildable with passing all tests.
          Hide
          Ashutosh Chauhan added a comment -

          We are still evaluating its impact on us and looking if there is any straight fwd fix which exists for us. I will have more details later.

          Show
          Ashutosh Chauhan added a comment - We are still evaluating its impact on us and looking if there is any straight fwd fix which exists for us. I will have more details later.
          Hide
          Namit Jain added a comment -

          Ashutosh, what is your timeline ?

          Right now, we dont have the infra-structure in place to pick some patches and ignore others.
          We pick all the patches from the open source to our internal tree.

          For the time it will take us to develop this, can you live with the current trunk
          (lower JDO) ?

          Show
          Namit Jain added a comment - Ashutosh, what is your timeline ? Right now, we dont have the infra-structure in place to pick some patches and ignore others. We pick all the patches from the open source to our internal tree. For the time it will take us to develop this, can you live with the current trunk (lower JDO) ?
          Hide
          Namit Jain added a comment -

          Also, is there some other more stable version of JDO which does not have this problem ?

          Show
          Namit Jain added a comment - Also, is there some other more stable version of JDO which does not have this problem ?
          Hide
          Paul Yang added a comment -

          This bug should affect you guys as well - you haven't been seeing occasionally missing tables when you issue metastore queries?

          Show
          Paul Yang added a comment - This bug should affect you guys as well - you haven't been seeing occasionally missing tables when you issue metastore queries?
          Hide
          Ashutosh Chauhan added a comment -

          This was creating a big problem for us - so, we needed to fix it asap.

          Unfortunately, this will cause a big problem for us : ( We were pushing filters from Pig to Metastore (in Howl) all those pig scripts will now break.

          Show
          Ashutosh Chauhan added a comment - This was creating a big problem for us - so, we needed to fix it asap. Unfortunately, this will cause a big problem for us : ( We were pushing filters from Pig to Metastore (in Howl) all those pig scripts will now break.
          Hide
          Paul Yang added a comment -

          The bug is that get_table (and likely get_partition) returns null even though the object clearly exists. This is without any pruning or filtering.

          Since the bug is within the newer version JDO/datanucleus, and HIVE-1609 (listPartitionsByFilter) depends on the newer version, it wouldn't have worked with the JDO downgrade.

          This patch undoes HIVE-1660, reverts JDO back to the older version, and disables listPartitionsByFilter().

          Show
          Paul Yang added a comment - The bug is that get_table (and likely get_partition) returns null even though the object clearly exists. This is without any pruning or filtering. Since the bug is within the newer version JDO/datanucleus, and HIVE-1609 (listPartitionsByFilter) depends on the newer version, it wouldn't have worked with the JDO downgrade. This patch undoes HIVE-1660 , reverts JDO back to the older version, and disables listPartitionsByFilter().
          Hide
          Namit Jain added a comment -

          This was creating a big problem for us - so, we needed to fix it asap.

          Now, we can think of more options.

          Show
          Namit Jain added a comment - This was creating a big problem for us - so, we needed to fix it asap. Now, we can think of more options.
          Hide
          Ashutosh Chauhan added a comment -

          That will disable the feature on metastore itself. I was hoping just undoing the changes of HIVE-1660 will revert back Hive to old behavior where QL was doing the pruning and thus will solve the problem you guys are seeing now. That way we could have left partition pruning in metastore to play with and try to fix the real bug.

          Show
          Ashutosh Chauhan added a comment - That will disable the feature on metastore itself. I was hoping just undoing the changes of HIVE-1660 will revert back Hive to old behavior where QL was doing the pruning and thus will solve the problem you guys are seeing now. That way we could have left partition pruning in metastore to play with and try to fix the real bug.
          Hide
          Paul Yang added a comment -

          Actually, I made the listPartitionsByFilter() throw a runtime exception until this issue is resolved because the feature is supposed to require a newer version of JDOQL to function properly, as stated in HIVE-1609. I'll confirm this again with Ajay.

          Show
          Paul Yang added a comment - Actually, I made the listPartitionsByFilter() throw a runtime exception until this issue is resolved because the feature is supposed to require a newer version of JDOQL to function properly, as stated in HIVE-1609 . I'll confirm this again with Ajay.
          Hide
          Namit Jain added a comment -

          Committed. Thanks Paul

          Show
          Namit Jain added a comment - Committed. Thanks Paul
          Hide
          Ashutosh Chauhan added a comment -

          Paul,

          Skimmed though the patch. It undo the changes you introduced in get_partition_ps() in HIVE-1660 but most of HIVE-1609 still lives on ? Is that correct? I think that will work fine as then Hive will not be pushing partition pruning to metastore and will avoid the problems as currently being experienced, but api will be there in metastore for others who want to play with it.

          Show
          Ashutosh Chauhan added a comment - Paul, Skimmed though the patch. It undo the changes you introduced in get_partition_ps() in HIVE-1660 but most of HIVE-1609 still lives on ? Is that correct? I think that will work fine as then Hive will not be pushing partition pruning to metastore and will avoid the problems as currently being experienced, but api will be there in metastore for others who want to play with it.
          Hide
          Namit Jain added a comment -

          +1

          Running tests

          Show
          Namit Jain added a comment - +1 Running tests
          Hide
          Paul Yang added a comment -

          Regenerated

          Show
          Paul Yang added a comment - Regenerated
          Hide
          Namit Jain added a comment -

          Can you regenerate the patch ?
          It is not applying cleanly.

          Show
          Namit Jain added a comment - Can you regenerate the patch ? It is not applying cleanly.
          Show
          John Sichi added a comment - This user report has similar symptoms: http://mail-archives.apache.org/mod_mbox/hive-user/201011.mbox/%3CAANLkTik6M1juxvx0Lerre-cNN5QnFwi6_4AY0pQ6zB42@mail.gmail.com%3E

            People

            • Assignee:
              Paul Yang
              Reporter:
              Namit Jain
            • Votes:
              0 Vote for this issue
              Watchers:
              3 Start watching this issue

              Dates

              • Created:
                Updated:
                Resolved:

                Development