Uploaded image for project: 'Hive'
  1. Hive
  2. HIVE-6992

Implement PreparedStatement.getMetaData() in HIVE JDBC

Details

    • Bug
    • Status: Open
    • Major
    • Resolution: Unresolved
    • None
    • None
    • JDBC
    • None

    Description

      It would be very helpful to have methods PreparedStatement.getMetaData() and also PreparedStatement.getParameterMetaData() implemented.

      I especially would like PreparedStatmeent.getMetaData() implemented, as I could prepare a SQL statement, and then get information about the result set, as well as information that the query is valid.

      I am pretty sure this information is available in some form. When you do an "EXPLAIN query", the explain operation shows information about the result set including the column name/aliases and the column types.

      thanks you

      -bill

      Attachments

        1. HIVE-6992.1.patch
          85 kB
          Prafulla T
        2. HIVE-6992.2.patch
          85 kB
          Prafulla T

        Issue Links

          Activity

            rcswebb Richard Webb added a comment -

            I'd say this was a bug rather than a new feature. Getting the metadata of a prepared statement without executing it is rather important for us.

            rcswebb Richard Webb added a comment - I'd say this was a bug rather than a new feature. Getting the metadata of a prepared statement without executing it is rather important for us.
            prafulla Prafulla T added a comment -

            I have some thoughts about how PraparedStatement.getMetadata can be implemented.
            Let me know your comments and I can go ahead and implement this functionality.

            Here is my proposal.

            On server-side:

            Add couple of parameters TExecuteStatementReq thrift

            • A bool variable prepareOnly indicate this is “prepare only” request.
            • An operation handle to indicate existing operation on server .

            Add couple of parameters to HiveSession.executeStatement and executeStatementAsy - operationHandle and prepareOnly.

            In HiveSessionImpl.executeStatementInternal method,
            f prepareOnly flag is set and Create operation and invoke “prepare” method instead of run.
            If ExecuteStatementReq has valid operationHandle ( non-null ), then use that operation instead of creating new operation.
            If prepareOnly flag is not set, Invoke run method on corresponding Operation.
            Return error if Operation(if given) is not in valid state.

            On Client side:

            In PreparedStatement.getMetadata, Check if cached schema is available, if yes return that.
            Otherwise, send ExecuteStatementReq with prepareOnly flag and read schema using getSchema method.
            When this returns, Save operationHandle locally and use it for sending when user actually invokes PrepareMethod.execute* methods.

            prafulla Prafulla T added a comment - I have some thoughts about how PraparedStatement.getMetadata can be implemented. Let me know your comments and I can go ahead and implement this functionality. Here is my proposal. On server-side: Add couple of parameters TExecuteStatementReq thrift A bool variable prepareOnly indicate this is “prepare only” request. An operation handle to indicate existing operation on server . Add couple of parameters to HiveSession.executeStatement and executeStatementAsy - operationHandle and prepareOnly. In HiveSessionImpl.executeStatementInternal method, f prepareOnly flag is set and Create operation and invoke “prepare” method instead of run. If ExecuteStatementReq has valid operationHandle ( non-null ), then use that operation instead of creating new operation. If prepareOnly flag is not set, Invoke run method on corresponding Operation. Return error if Operation(if given) is not in valid state. On Client side: In PreparedStatement.getMetadata, Check if cached schema is available, if yes return that. Otherwise, send ExecuteStatementReq with prepareOnly flag and read schema using getSchema method. When this returns, Save operationHandle locally and use it for sending when user actually invokes PrepareMethod.execute* methods.
            navis Navis Ryu added a comment -

            I think this can be implemented with HIVE-4923, which is not yet committed and seemed not to be.

            navis Navis Ryu added a comment - I think this can be implemented with HIVE-4923 , which is not yet committed and seemed not to be.
            prafulla Prafulla T added a comment -

            Description of changes in attached patch HIVE-6992.1.patch
            HIVE-6992 - Support for PreparedStatement.getMetadata in hive-jdbc and server

            This patch
            1. Changes HiveSession/ICliService.executeStatement prototype to include two paramters.
            a. Boolean variable - prepareOnly to indicate if this execution request is only for preparing.
            b. existingOpHandle - to execute prepared operation.
            2. Changes TExecuteStatementReq to support above two parameters.
            3. Changes SQLOperation.java to support separate preparation step.
            4. Adds new OperationState called PREPARED to indicate prepared operation.
            5. Refactors HiveStatement.java/HiveQueryResult to support retrieving of schema
            from HiveStatement.java.
            6. Changes HiveJdbc class to support PreparedStatement.getMetadata

            Also includes new (basic) unit-test for PreapredStatement.getMetadata
            Test for executeStatement will be added later after initial code-review
            I would like to know if my approach looks ok OR reviewers would like seperate
            PreapreStatement API like ExecuteStatement

            prafulla Prafulla T added a comment - Description of changes in attached patch HIVE-6992 .1.patch HIVE-6992 - Support for PreparedStatement.getMetadata in hive-jdbc and server This patch 1. Changes HiveSession/ICliService.executeStatement prototype to include two paramters. a. Boolean variable - prepareOnly to indicate if this execution request is only for preparing. b. existingOpHandle - to execute prepared operation. 2. Changes TExecuteStatementReq to support above two parameters. 3. Changes SQLOperation.java to support separate preparation step. 4. Adds new OperationState called PREPARED to indicate prepared operation. 5. Refactors HiveStatement.java/HiveQueryResult to support retrieving of schema from HiveStatement.java. 6. Changes HiveJdbc class to support PreparedStatement.getMetadata Also includes new (basic) unit-test for PreapredStatement.getMetadata Test for executeStatement will be added later after initial code-review I would like to know if my approach looks ok OR reviewers would like seperate PreapreStatement API like ExecuteStatement
            prafulla Prafulla T added a comment -

            Initial patch

            prafulla Prafulla T added a comment - Initial patch
            prafulla Prafulla T added a comment -

            This patch is also available for review on reviewboard at following link.
            https://reviews.apache.org/r/29448/

            prafulla Prafulla T added a comment - This patch is also available for review on reviewboard at following link. https://reviews.apache.org/r/29448/
            hiveqa Hive QA added a comment -

            Overall: -1 at least one tests failed

            Here are the results of testing the latest attachment:
            https://issues.apache.org/jira/secure/attachment/12689283/HIVE-6992.1.patch

            ERROR: -1 due to 20 failed/errored test(s), 6722 tests executed
            Failed tests:

            org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_list_bucket_dml_10
            org.apache.hive.hcatalog.streaming.TestStreaming.testTransactionBatchEmptyCommit
            org.apache.hive.jdbc.miniHS2.TestHiveServer2.testConnection
            org.apache.hive.jdbc.miniHS2.TestHiveServer2.testGetVariableValue
            org.apache.hive.jdbc.miniHS2.TestHiveServer2SessionTimeout.testConnection
            org.apache.hive.service.cli.TestEmbeddedThriftBinaryCLIService.testConfOverlay
            org.apache.hive.service.cli.TestEmbeddedThriftBinaryCLIService.testExecuteStatement
            org.apache.hive.service.cli.TestEmbeddedThriftBinaryCLIService.testExecuteStatementAsync
            org.apache.hive.service.cli.operation.TestOperationLoggingAPI.testFetchResultsOfLog
            org.apache.hive.service.cli.operation.TestOperationLoggingAPI.testFetchResultsOfLogAsync
            org.apache.hive.service.cli.operation.TestOperationLoggingAPI.testFetchResultsOfLogCleanup
            org.apache.hive.service.cli.operation.TestOperationLoggingAPI.testFetchResultsOfLogWithOrientation
            org.apache.hive.service.cli.session.TestSessionGlobalInitFile.testSessionGlobalInitDir
            org.apache.hive.service.cli.session.TestSessionGlobalInitFile.testSessionGlobalInitFile
            org.apache.hive.service.cli.session.TestSessionGlobalInitFile.testSessionGlobalInitFileAndConfOverlay
            org.apache.hive.service.cli.session.TestSessionGlobalInitFile.testSessionGlobalInitFileWithUser
            org.apache.hive.service.cli.thrift.TestThriftBinaryCLIService.testExecuteStatement
            org.apache.hive.service.cli.thrift.TestThriftBinaryCLIService.testExecuteStatementAsync
            org.apache.hive.service.cli.thrift.TestThriftHttpCLIService.testExecuteStatement
            org.apache.hive.service.cli.thrift.TestThriftHttpCLIService.testExecuteStatementAsync
            

            Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/2204/testReport
            Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/2204/console
            Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-2204/

            Messages:

            Executing org.apache.hive.ptest.execution.PrepPhase
            Executing org.apache.hive.ptest.execution.ExecutionPhase
            Executing org.apache.hive.ptest.execution.ReportingPhase
            Tests exited with: TestsFailedException: 20 tests failed
            

            This message is automatically generated.

            ATTACHMENT ID: 12689283 - PreCommit-HIVE-TRUNK-Build

            hiveqa Hive QA added a comment - Overall : -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12689283/HIVE-6992.1.patch ERROR: -1 due to 20 failed/errored test(s), 6722 tests executed Failed tests: org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_list_bucket_dml_10 org.apache.hive.hcatalog.streaming.TestStreaming.testTransactionBatchEmptyCommit org.apache.hive.jdbc.miniHS2.TestHiveServer2.testConnection org.apache.hive.jdbc.miniHS2.TestHiveServer2.testGetVariableValue org.apache.hive.jdbc.miniHS2.TestHiveServer2SessionTimeout.testConnection org.apache.hive.service.cli.TestEmbeddedThriftBinaryCLIService.testConfOverlay org.apache.hive.service.cli.TestEmbeddedThriftBinaryCLIService.testExecuteStatement org.apache.hive.service.cli.TestEmbeddedThriftBinaryCLIService.testExecuteStatementAsync org.apache.hive.service.cli.operation.TestOperationLoggingAPI.testFetchResultsOfLog org.apache.hive.service.cli.operation.TestOperationLoggingAPI.testFetchResultsOfLogAsync org.apache.hive.service.cli.operation.TestOperationLoggingAPI.testFetchResultsOfLogCleanup org.apache.hive.service.cli.operation.TestOperationLoggingAPI.testFetchResultsOfLogWithOrientation org.apache.hive.service.cli.session.TestSessionGlobalInitFile.testSessionGlobalInitDir org.apache.hive.service.cli.session.TestSessionGlobalInitFile.testSessionGlobalInitFile org.apache.hive.service.cli.session.TestSessionGlobalInitFile.testSessionGlobalInitFileAndConfOverlay org.apache.hive.service.cli.session.TestSessionGlobalInitFile.testSessionGlobalInitFileWithUser org.apache.hive.service.cli.thrift.TestThriftBinaryCLIService.testExecuteStatement org.apache.hive.service.cli.thrift.TestThriftBinaryCLIService.testExecuteStatementAsync org.apache.hive.service.cli.thrift.TestThriftHttpCLIService.testExecuteStatement org.apache.hive.service.cli.thrift.TestThriftHttpCLIService.testExecuteStatementAsync Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/2204/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/2204/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-2204/ Messages: Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 20 tests failed This message is automatically generated. ATTACHMENT ID: 12689283 - PreCommit-HIVE-TRUNK-Build
            prafulla Prafulla T added a comment -

            This patch includes fixes for NullpointerException found while testing.

            prafulla Prafulla T added a comment - This patch includes fixes for NullpointerException found while testing.
            hiveqa Hive QA added a comment -

            Overall: -1 at least one tests failed

            Here are the results of testing the latest attachment:
            https://issues.apache.org/jira/secure/attachment/12689747/HIVE-6992.2.patch

            ERROR: -1 due to 1 failed/errored test(s), 6723 tests executed
            Failed tests:

            org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_subquery_multiinsert
            

            Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/2233/testReport
            Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/2233/console
            Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-2233/

            Messages:

            Executing org.apache.hive.ptest.execution.PrepPhase
            Executing org.apache.hive.ptest.execution.ExecutionPhase
            Executing org.apache.hive.ptest.execution.ReportingPhase
            Tests exited with: TestsFailedException: 1 tests failed
            

            This message is automatically generated.

            ATTACHMENT ID: 12689747 - PreCommit-HIVE-TRUNK-Build

            hiveqa Hive QA added a comment - Overall : -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12689747/HIVE-6992.2.patch ERROR: -1 due to 1 failed/errored test(s), 6723 tests executed Failed tests: org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_subquery_multiinsert Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/2233/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/2233/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-2233/ Messages: Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 1 tests failed This message is automatically generated. ATTACHMENT ID: 12689747 - PreCommit-HIVE-TRUNK-Build
            gates Alan Gates added a comment -

            First of all, apologies for the huge delay before this got reviewed. The patch no longer applies.

            But I have one big question, which is, I don't see how this works when there are parameters in the prepared statement. This patch passes the SQL query as provided to the prepared statement and then has Driver parse that statement. This will fail when the statement has parameters (e.g. select a from foo where b = ?). The only provided test case uses a query with no parameters.

            gates Alan Gates added a comment - First of all, apologies for the huge delay before this got reviewed. The patch no longer applies. But I have one big question, which is, I don't see how this works when there are parameters in the prepared statement. This patch passes the SQL query as provided to the prepared statement and then has Driver parse that statement. This will fail when the statement has parameters (e.g. select a from foo where b = ?). The only provided test case uses a query with no parameters.

            People

              Unassigned Unassigned
              bioliv Bill Oliver
              Votes:
              6 Vote for this issue
              Watchers:
              9 Start watching this issue

              Dates

                Created:
                Updated: