Hive
  1. Hive
  2. HIVE-1862

Revive partition filtering in the Hive MetaStore

    Details

      Description

      HIVE-1853 downgraded the JDO version. This makes the feature of partition filtering in the metastore unusable. This jira is to keep track of the lost feature and discussing approaches to bring it back.

      1. HIVE-1862.1.patch.txt
        17 kB
        Mac Yang
      2. invoke_runqry.sh
        0.1 kB
        Namit Jain
      3. qry
        0.6 kB
        Namit Jain
      4. runqry
        0.7 kB
        Namit Jain
      5. qry-sch.Z
        2.20 MB
        Namit Jain
      6. HIVE-1862.2.patch.txt
        16 kB
        Mac Yang
      7. HIVE-1862.3.patch.txt
        17 kB
        Mac Yang

        Issue Links

          Activity

          Hide
          Paul Yang added a comment -

          Committed. Thanks Mac! Cool use of string manipulation. Hopefully, we'll find a workaround for those escaped partition names soon..

          Show
          Paul Yang added a comment - Committed. Thanks Mac! Cool use of string manipulation. Hopefully, we'll find a workaround for those escaped partition names soon..
          Hide
          Paul Yang added a comment -

          +1 looks good. Will commit if tests pass.

          Show
          Paul Yang added a comment - +1 looks good. Will commit if tests pass.
          Hide
          Mac Yang added a comment -

          Incorporated Paul's feedback

          Show
          Mac Yang added a comment - Incorporated Paul's feedback
          Hide
          Mac Yang added a comment -

          Paul, thanks for the comment.

          • I will add a check to catch the case where it's the last element in the path
          • Escaping the value does not work for the "like" operator, for example, the pattern "p.*3" would match values like "p13". However, it would also match value "p1#" since it got truned into "p1%23"
          • I will leave get_partition_names_ps()/get_partitions_ps() as the way they are in the trunk in the next patch
          Show
          Mac Yang added a comment - Paul, thanks for the comment. I will add a check to catch the case where it's the last element in the path Escaping the value does not work for the "like" operator, for example, the pattern "p.*3" would match values like "p13". However, it would also match value "p1#" since it got truned into "p1%23" I will leave get_partition_names_ps()/get_partitions_ps() as the way they are in the trunk in the next patch
          Hide
          Paul Yang added a comment -
          • There may be a potential bug with the string index logic:
          172	      String keyEqual = FileUtils.escapePathName(keyName) + "=";
          173	      int keyEqualLength = keyEqual.length();
          174	      String valString;
          175	      // partitionname ==>  (key=value/)*(key=value)
          176	      if (partitionColumnCount == 1) {
          177	        valString = "partitionName.substring(partitionName.indexOf(\"" + keyEqual + "\")+" + keyEqualLength + ")";
          178	      }
          179	      else {
          180	        valString = "partitionName.substring(partitionName.indexOf(\"" + keyEqual + "\")+" + keyEqualLength + ").substring(0, partitionName.substring(partitionName.indexOf(\"" + keyEqual + "\")+" + keyEqualLength + ").indexOf(\"/\"))";
          181	      }
          

          Say the partition name was ds=1/hr=2 and we want to extract the value of hr. Since there is no tailing '/', what will .indexOf(\"/\"))" return?

          • Can you elaborate further on why it can't handle escaped partition names? It seems that if we escape the value that we are searching for, instead of unescaping the partition name, we should be able to handle that case.

          If escaped partition names can't be handled, can we revert the code inside get_partition_names_ps() / get_partitions_ps() to return the correct, but slow results?

          Show
          Paul Yang added a comment - There may be a potential bug with the string index logic: 172 String keyEqual = FileUtils.escapePathName(keyName) + "=" ; 173 int keyEqualLength = keyEqual.length(); 174 String valString; 175 // partitionname ==> (key=value/)*(key=value) 176 if (partitionColumnCount == 1) { 177 valString = "partitionName.substring(partitionName.indexOf(\" " + keyEqual + " \ ")+" + keyEqualLength + ")" ; 178 } 179 else { 180 valString = "partitionName.substring(partitionName.indexOf(\" " + keyEqual + " \ ")+" + keyEqualLength + ").substring(0, partitionName.substring(partitionName.indexOf(\" " + keyEqual + " \ ")+" + keyEqualLength + ").indexOf(\" /\ "))" ; 181 } Say the partition name was ds=1/hr=2 and we want to extract the value of hr. Since there is no tailing '/', what will .indexOf(\"/\"))" return? Can you elaborate further on why it can't handle escaped partition names? It seems that if we escape the value that we are searching for, instead of unescaping the partition name, we should be able to handle that case. If escaped partition names can't be handled, can we revert the code inside get_partition_names_ps() / get_partitions_ps() to return the correct, but slow results?
          Hide
          Paul Yang added a comment -

          I'll take a look..

          Show
          Paul Yang added a comment - I'll take a look..
          Hide
          Devaraj Das added a comment -

          Folks, any update on this one? Does it serve our purpose with the limitation that partition values can't have special characters. If so, could it be reviewed. Thanks!

          Show
          Devaraj Das added a comment - Folks, any update on this one? Does it serve our purpose with the limitation that partition values can't have special characters. If so, could it be reviewed. Thanks!
          Hide
          Mac Yang added a comment -

          "A quick note about the patch. The work around is implemented in ExpressionTree.java. The rest of the patch just added back old code that was removed as part of HIVE-1853."

          This was a comment I made on 4/Jan/11. It's still mostly true. However, I did make some additional minor changes due to other commits in the trunk since then.

          Show
          Mac Yang added a comment - "A quick note about the patch. The work around is implemented in ExpressionTree.java. The rest of the patch just added back old code that was removed as part of HIVE-1853 ." This was a comment I made on 4/Jan/11. It's still mostly true. However, I did make some additional minor changes due to other commits in the trunk since then.
          Hide
          Mac Yang added a comment -

          Regenerate patch based on current trunk

          Show
          Mac Yang added a comment - Regenerate patch based on current trunk
          Hide
          Namit Jain added a comment -

          Can you generate the patch - I am getting some conflicts while applying the patch

          Show
          Namit Jain added a comment - Can you generate the patch - I am getting some conflicts while applying the patch
          Hide
          Mac Yang added a comment -

          @Namit, certainly, but I don't have a set up ready to go for that scale, so it will take some time.

          In the mean time, it would be great to get some feedback on the patch to see if the approach looks reasonable.

          Show
          Mac Yang added a comment - @Namit, certainly, but I don't have a set up ready to go for that scale, so it will take some time. In the mean time, it would be great to get some feedback on the patch to see if the approach looks reasonable.
          Hide
          Namit Jain added a comment -

          I think the problem exists in the thrift server mode also, it is just more difficult to reproduce.

          Can you run 200 concurrent clients, and see if the problem appears ?

          Show
          Namit Jain added a comment - I think the problem exists in the thrift server mode also, it is just more difficult to reproduce. Can you run 200 concurrent clients, and see if the problem appears ?
          Hide
          Mac Yang added a comment -

          Given that the issue does not seem to impact thrift server mode metastore server, I would like to propose the following approach,

          • Make datanucleus version configurable at build time (similar to hadoop.version). Use 2.0.3 as the default
          • Put back the code in listMPartitionsByFilter

          With those changes, the behavior of a default build will be what it is today, but people also has the option to use the partition filtering function.

          Show
          Mac Yang added a comment - Given that the issue does not seem to impact thrift server mode metastore server, I would like to propose the following approach, Make datanucleus version configurable at build time (similar to hadoop.version). Use 2.0.3 as the default Put back the code in listMPartitionsByFilter With those changes, the behavior of a default build will be what it is today, but people also has the option to use the partition filtering function.
          Hide
          Namit Jain added a comment -

          cool

          Show
          Namit Jain added a comment - cool
          Hide
          Mac Yang added a comment -

          Bingo!

          Ran the test with metastore in local mode and was able to reproduce the table not found error. It appears that the bug gets tickled when metastore is running in local mode, but not (at least not easily) in thrift server mode.

          Show
          Mac Yang added a comment - Bingo! Ran the test with metastore in local mode and was able to reproduce the table not found error. It appears that the bug gets tickled when metastore is running in local mode, but not (at least not easily) in thrift server mode.
          Hide
          John Sichi added a comment -

          Many of our clients connect to the DB directly (not through the metastore Thrift server). As far as I know, there are no issues with DBCP.

          Show
          John Sichi added a comment - Many of our clients connect to the DB directly (not through the metastore Thrift server). As far as I know, there are no issues with DBCP.
          Hide
          Mac Yang added a comment -

          I did not set those parameters and just picked up from hive-default.xml,

          datanucleus.transactionIsolation=read-committed
          datanucleus.connectionPoolingType=DBCP

          Will try the test with repeatable-read and no connection pooling and see if that makes a difference.

          I'm curious why you run without connection pooling. Was it because the metadata server was running in local mode or was there some issue with DBCP?

          Show
          Mac Yang added a comment - I did not set those parameters and just picked up from hive-default.xml, datanucleus.transactionIsolation=read-committed datanucleus.connectionPoolingType=DBCP Will try the test with repeatable-read and no connection pooling and see if that makes a difference. I'm curious why you run without connection pooling. Was it because the metadata server was running in local mode or was there some issue with DBCP?
          Hide
          John Sichi added a comment -

          I checked our server; it's MySQL 5.0.84. Also, we set the following hive-site.xml properties:

          datanucleus.transactionIsolation=repeatable-read
          datanucleus.valuegeneration.transactionIsolation=repeatable-read
          datanucleus.connectionPoolingType=none

          Show
          John Sichi added a comment - I checked our server; it's MySQL 5.0.84. Also, we set the following hive-site.xml properties: datanucleus.transactionIsolation=repeatable-read datanucleus.valuegeneration.transactionIsolation=repeatable-read datanucleus.connectionPoolingType=none
          Hide
          Mac Yang added a comment -

          Ran the test again with the full dataset (~300k partitions in jdo2 table) and 10 concurrent processes. It ran for 30 hours without error.

          The test was run against a remote metastore server, with datanucleus 2.1.1 and the partition filtering code in place (pre HIVE-1853). MySQL version is 5.1.46.

          Show
          Mac Yang added a comment - Ran the test again with the full dataset (~300k partitions in jdo2 table) and 10 concurrent processes. It ran for 30 hours without error. The test was run against a remote metastore server, with datanucleus 2.1.1 and the partition filtering code in place (pre HIVE-1853 ). MySQL version is 5.1.46.
          Hide
          Mac Yang added a comment -

          Namit, thanks for posting the test.

          I have run the test with partial data (36k out of 302k partitions for the jdo2 table). With six concurrent processes, the test ran for more than a day without error. Will try again after finish loading the full data set.

          Show
          Mac Yang added a comment - Namit, thanks for posting the test. I have run the test with partial data (36k out of 302k partitions for the jdo2 table). With six concurrent processes, the test ran for more than a day without error. Will try again after finish loading the full data set.
          Hide
          Namit Jain added a comment -

          I have attached the test case which I was able to run to reproduce the problem with the higher jdo version.
          Note that this is not deterministic, and may take some time to error out

          Show
          Namit Jain added a comment - I have attached the test case which I was able to run to reproduce the problem with the higher jdo version. Note that this is not deterministic, and may take some time to error out
          Hide
          Mac Yang added a comment -

          A quick note about the patch. The work around is implemented in ExpressionTree.java. The rest of the patch just added back old code that was removed as part of HIVE-1853.

          Show
          Mac Yang added a comment - A quick note about the patch. The work around is implemented in ExpressionTree.java. The rest of the patch just added back old code that was removed as part of HIVE-1853 .
          Hide
          Mac Yang added a comment -

          Datanucleus 2.0.3 does not support the get() method on Collection, which the
          partition filtering code depends on in order to retrieve the value for a
          particular partition and use it for filtering.

          The submitted patch is a quick work around. It uses the substring() function
          to extract the partition value out of the partitionName field, and thus
          eliminates the need of the get() method.

          However, this approach does not work if the partition value contains special
          characters. This is because the partitionName has the special characters
          escaped. Hence the partition value generated using the substring() approach is
          also in the escaped form. Here is the list of special characters for reference
          purpose, '"', '#', '%', '\'', '*', '/', ':', '=', '?', '
          ', '\u007F', '{',
          ']'

          While this solution is incomplete, I am hoping this submission will trigger more
          suggestions and ideas.

          Show
          Mac Yang added a comment - Datanucleus 2.0.3 does not support the get() method on Collection, which the partition filtering code depends on in order to retrieve the value for a particular partition and use it for filtering. The submitted patch is a quick work around. It uses the substring() function to extract the partition value out of the partitionName field, and thus eliminates the need of the get() method. However, this approach does not work if the partition value contains special characters. This is because the partitionName has the special characters escaped. Hence the partition value generated using the substring() approach is also in the escaped form. Here is the list of special characters for reference purpose, '"', '#', '%', '\'', '*', '/', ':', '=', '?', ' ', '\u007F', '{', ']' While this solution is incomplete, I am hoping this submission will trigger more suggestions and ideas.
          Hide
          Ashutosh Chauhan added a comment -

          Namit,

          Since HIVE-1853 is marked fixed and resolved can you attach the test scripts here and instructions on how to reproduce it.

          Show
          Ashutosh Chauhan added a comment - Namit, Since HIVE-1853 is marked fixed and resolved can you attach the test scripts here and instructions on how to reproduce it.
          Hide
          Ashutosh Chauhan added a comment -

          I can see four possible routes going forward:

          1) Fish out exact issue by coming up with a test case and submit it to jdo community and ask for a fix in the library.
          Assuming 1) may take longer and is unbounded, remaining on older stable jdo version presents following possibilities.

          2) Rewrite the jdo query to not make use of Collection.get() which only exists in newer jdo version and thereby required a jdo lib upgrade.

          3) Refactor the partition pruning code from ql to metastore so that metastore returns a filtered list of partition

          4) If 2) is not possible then, then push equality predicate to MySQL and do filtering for other predicates using 3) in Metastore.

          Show
          Ashutosh Chauhan added a comment - I can see four possible routes going forward: 1) Fish out exact issue by coming up with a test case and submit it to jdo community and ask for a fix in the library. Assuming 1) may take longer and is unbounded, remaining on older stable jdo version presents following possibilities. 2) Rewrite the jdo query to not make use of Collection.get() which only exists in newer jdo version and thereby required a jdo lib upgrade. 3) Refactor the partition pruning code from ql to metastore so that metastore returns a filtered list of partition 4) If 2) is not possible then, then push equality predicate to MySQL and do filtering for other predicates using 3) in Metastore.

            People

            • Assignee:
              Mac Yang
              Reporter:
              Devaraj Das
            • Votes:
              0 Vote for this issue
              Watchers:
              4 Start watching this issue

              Dates

              • Created:
                Updated:
                Resolved:

                Development