Uploaded image for project: 'Calcite'
  1. Calcite
  2. CALCITE-712

Avatica statement execute return all resultset instead of MaxRows from setMaxRows

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 1.3.0-incubating
    • Fix Version/s: 1.4.0-incubating
    • Component/s: None
    • Labels:

      Description

      setMaxRows is supported, but after moving to JdbcMeta, max row was ignore and full result set is return instead.

        Issue Links

          Activity

          Hide
          xhoong Xavier FH Leong added a comment - - edited
          Show
          xhoong Xavier FH Leong added a comment - - edited Pull request https://github.com/apache/incubator-calcite/pull/84
          Hide
          julianhyde Julian Hyde added a comment -

          You didn't test (a) calling setMaxRows on a prepared statement, (b) re-executing a prepared statement. I added those tests in my branch https://github.com/julianhyde/incubator-calcite/tree/712-maxRows. Can you fix please?

          Show
          julianhyde Julian Hyde added a comment - You didn't test (a) calling setMaxRows on a prepared statement, (b) re-executing a prepared statement. I added those tests in my branch https://github.com/julianhyde/incubator-calcite/tree/712-maxRows . Can you fix please?
          Hide
          xhoong Xavier FH Leong added a comment -

          The prepareStatement setMaxRows needs a bit of overhaul, as the current logic flow is defunct, one can't setMaxRows on a prepareStatement without the statement first created, eventhou the meta prototype have maxRowCount in it, one can only set the maxRowCounts after the fact the prepare is called:

          public StatementHandle prepare(ConnectionHandle ch, String sql,
          int maxRowCount)

          The only way to have prepareStatement supporting the setMaxRows, is to have it done via RPC, and extend CalciteSignature to have setMaxRows method to overwrite the private member during the RPC call.

          Open for ideas.

          Include Nick Dimiduk in here for Avatica discussion.

          Show
          xhoong Xavier FH Leong added a comment - The prepareStatement setMaxRows needs a bit of overhaul, as the current logic flow is defunct, one can't setMaxRows on a prepareStatement without the statement first created, eventhou the meta prototype have maxRowCount in it, one can only set the maxRowCounts after the fact the prepare is called: public StatementHandle prepare(ConnectionHandle ch, String sql, int maxRowCount) The only way to have prepareStatement supporting the setMaxRows, is to have it done via RPC, and extend CalciteSignature to have setMaxRows method to overwrite the private member during the RPC call. Open for ideas. Include Nick Dimiduk in here for Avatica discussion.
          Hide
          xhoong Xavier FH Leong added a comment - - edited

          Using re-executing prepare method, new commit at:

          https://github.com/xhoong/incubator-calcite/commit/48324c8da810963d9852887dd7e7e2a3f3e8faa0
          As you can see, can be quite flaky calling meta.prepare 2x at remote and at local when there's a setMaxRows

          Show
          xhoong Xavier FH Leong added a comment - - edited Using re-executing prepare method, new commit at: https://github.com/xhoong/incubator-calcite/commit/48324c8da810963d9852887dd7e7e2a3f3e8faa0 As you can see, can be quite flaky calling meta.prepare 2x at remote and at local when there's a setMaxRows
          Hide
          julianhyde Julian Hyde added a comment -

          I've reviewed this. I agree that it's quite messy, and not your fault. I think the root cause is the fact that a prepare call can specify the maxRowCount and if that prepared statement is re-executed with a higher maxRowCount (or no limit) the results will be invalid. Do you agree with the root cause?

          I think re-preparing a prepared statement is unacceptable.

          I believe that the only time a prepare request occurs with a row limit is a prepare-and-execute call (i.e. caused by Statement.executeQuery). Do you believe that to be true? If that is true, then it will be impossible to re-execute the prepare with a different row limit. If we can put in place an assert that will make sure that this is true, I think we can avoid the re-prepare.

          I found at least one place where you were checking 'parameter > 0' where the javadoc comment on the parameter said 'negative means no limit'. 0 is not negative. Please be careful.

          Show
          julianhyde Julian Hyde added a comment - I've reviewed this. I agree that it's quite messy, and not your fault. I think the root cause is the fact that a prepare call can specify the maxRowCount and if that prepared statement is re-executed with a higher maxRowCount (or no limit) the results will be invalid. Do you agree with the root cause? I think re-preparing a prepared statement is unacceptable. I believe that the only time a prepare request occurs with a row limit is a prepare-and-execute call (i.e. caused by Statement.executeQuery). Do you believe that to be true? If that is true, then it will be impossible to re-execute the prepare with a different row limit. If we can put in place an assert that will make sure that this is true, I think we can avoid the re-prepare. I found at least one place where you were checking 'parameter > 0' where the javadoc comment on the parameter said 'negative means no limit'. 0 is not negative. Please be careful.
          Hide
          julianhyde Julian Hyde added a comment -

          By the way, it is not hard to enforce row limit in the client. Enforcing the row limit on the server is a nice-to-have but it shouldn't dictate our architecture.

          Show
          julianhyde Julian Hyde added a comment - By the way, it is not hard to enforce row limit in the client. Enforcing the row limit on the server is a nice-to-have but it shouldn't dictate our architecture.
          Hide
          xhoong Xavier FH Leong added a comment -

          We have user on squirrel executing a query on a large table without SQL limit, by default squirrel apply row limit of 100. The problem lies in when user fire that query, the server side will assemble the whole result set before sending it over (which is millions of rows), and it choke.

          So there's 2 issue, the limit set by squirrel is not fulfilled. And the fetch is not working for Statement.execute() (it's working for PreparedStatement for now). Need another JIRA to capture the fetch deficiency in Statement.execute()

          Show
          xhoong Xavier FH Leong added a comment - We have user on squirrel executing a query on a large table without SQL limit, by default squirrel apply row limit of 100. The problem lies in when user fire that query, the server side will assemble the whole result set before sending it over (which is millions of rows), and it choke. So there's 2 issue, the limit set by squirrel is not fulfilled. And the fetch is not working for Statement.execute() (it's working for PreparedStatement for now). Need another JIRA to capture the fetch deficiency in Statement.execute()
          Hide
          xhoong Xavier FH Leong added a comment - - edited

          , not finding finger to point, but just illustrate the logic flow.

          I think the root cause is the fact that a prepare call can specify the maxRowCount and if that prepared statement is re-executed with a higher maxRowCount (or no limit) the results will be invalid. Do you agree with the root cause?

          No, if setting the maxRowCount will do a re-prepare, the execute will always get the latest settings (plus a new statement at the server end), so the above scenario would not happen.

          I believe that the only time a prepare request occurs with a row limit is a prepare-and-execute call (i.e. caused by Statement.executeQuery). Do you believe that to be true? If that is true, then it will be impossible to re-execute the prepare with a different row limit.

          I think what missing here is that we are trying to set the maxRowCount during prepare, instead of during execute. To fulfill JDBC contract for setting maxRowCount, even for PreparedStatment, I think is good to support RPC call that can

          • prepare
          • execute
          • prepareAndExecute

          prepareAndExecute can end up calling prepare and then execute, we can also choose to combine the call and indicating which task under the method argument, but I prefer having separate RPC which is much cleaner.

          I found at least one place where you were checking 'parameter > 0'

          I think that would be in the JdbcMeta.prepare, the original check is in prepareAndExecute, where maxRowCount has been change by following code:

          // In JDBC, maxRowCount = 0 means no limit; in prepare it means LIMIT 0
          final int maxRowCount1 = maxRowCount <= 0 ? -1 : maxRowCount;
          

          But I agree, prepare should takes 0, but for JdbcMeta.prepare case it is checking condition whether to call setMaxRows, not to pass in for meta.prepare. Should the above code be in setMaxRows instead of having to convert in every execute call?

          Show
          xhoong Xavier FH Leong added a comment - - edited , not finding finger to point, but just illustrate the logic flow. I think the root cause is the fact that a prepare call can specify the maxRowCount and if that prepared statement is re-executed with a higher maxRowCount (or no limit) the results will be invalid. Do you agree with the root cause? No, if setting the maxRowCount will do a re-prepare, the execute will always get the latest settings (plus a new statement at the server end), so the above scenario would not happen. I believe that the only time a prepare request occurs with a row limit is a prepare-and-execute call (i.e. caused by Statement.executeQuery). Do you believe that to be true? If that is true, then it will be impossible to re-execute the prepare with a different row limit. I think what missing here is that we are trying to set the maxRowCount during prepare, instead of during execute. To fulfill JDBC contract for setting maxRowCount, even for PreparedStatment, I think is good to support RPC call that can prepare execute prepareAndExecute prepareAndExecute can end up calling prepare and then execute, we can also choose to combine the call and indicating which task under the method argument, but I prefer having separate RPC which is much cleaner. I found at least one place where you were checking 'parameter > 0' I think that would be in the JdbcMeta.prepare, the original check is in prepareAndExecute, where maxRowCount has been change by following code: // In JDBC, maxRowCount = 0 means no limit; in prepare it means LIMIT 0 final int maxRowCount1 = maxRowCount <= 0 ? -1 : maxRowCount; But I agree, prepare should takes 0, but for JdbcMeta.prepare case it is checking condition whether to call setMaxRows, not to pass in for meta.prepare. Should the above code be in setMaxRows instead of having to convert in every execute call?
          Hide
          xhoong Xavier FH Leong added a comment -

          To add on, I agree row limit on server is nice to have, *only* if the fetch is working.

          This JIRA was originally open against a working method but breaks by JdbcMeta refactoring.

          Show
          xhoong Xavier FH Leong added a comment - To add on, I agree row limit on server is nice to have, * only * if the fetch is working. This JIRA was originally open against a working method but breaks by JdbcMeta refactoring.
          Hide
          xhoong Xavier FH Leong added a comment -

          Thinking of it, is there a circuit breaker?

          Show
          xhoong Xavier FH Leong added a comment - Thinking of it, is there a circuit breaker?
          Hide
          julianhyde Julian Hyde added a comment -

          Agreed, please log those as separate issues, and let's fix them one by one.

          Show
          julianhyde Julian Hyde added a comment - Agreed, please log those as separate issues, and let's fix them one by one.
          Hide
          julianhyde Julian Hyde added a comment -

          Regarding maxRowCount. In code that is near JDBC, we use JDBC's semantics: 0 means no limit. Those semantics should extend to AvaticaStatement.maxRowCount.

          Actually, the mistake was in JdbcMeta.prepareAndExecute.

          Show
          julianhyde Julian Hyde added a comment - Regarding maxRowCount. In code that is near JDBC, we use JDBC's semantics: 0 means no limit. Those semantics should extend to AvaticaStatement.maxRowCount. Actually, the mistake was in JdbcMeta.prepareAndExecute.
          Hide
          xhoong Xavier FH Leong added a comment -

          Hi, I'd remove the support for setMaxRows for PreparedStatement, and open 2 issues for fetch and prepare (see related link)

          Finalize commit:
          https://github.com/xhoong/incubator-calcite/commit/2b7edc6406a620c0a25c1c01cd3ec0b6edc5a201

          I think all AvaticaStatement should be consistent, and the -1 should only be internal to Calcite, so that in JdbcMeta, the check for setMaxRows do not need to apply. And in reality, the limit 0 row will never be set for prepare as in JDBC due to the fact that 0 represented as unlimited. I think if there's any need for 0 limit for parseSQL, it should be internal to Calcite. Hence the code is refactor to behave in such.

          Show
          xhoong Xavier FH Leong added a comment - Hi, I'd remove the support for setMaxRows for PreparedStatement, and open 2 issues for fetch and prepare (see related link) Finalize commit: https://github.com/xhoong/incubator-calcite/commit/2b7edc6406a620c0a25c1c01cd3ec0b6edc5a201 I think all AvaticaStatement should be consistent, and the -1 should only be internal to Calcite, so that in JdbcMeta, the check for setMaxRows do not need to apply. And in reality, the limit 0 row will never be set for prepare as in JDBC due to the fact that 0 represented as unlimited. I think if there's any need for 0 limit for parseSQL, it should be internal to Calcite. Hence the code is refactor to behave in such.
          Hide
          julianhyde Julian Hyde added a comment -

          I find JDBC's 1-based maxRows irksome just as I find its 1-based column ordinals (in ResultSet.getXxx and PreparedStatement.setXxx) irksome. Most implementations of Meta do not go to JDBC and so don't need to abide by JDBC's silly interpretation of limit 0.

          Show
          julianhyde Julian Hyde added a comment - I find JDBC's 1-based maxRows irksome just as I find its 1-based column ordinals (in ResultSet.getXxx and PreparedStatement.setXxx) irksome. Most implementations of Meta do not go to JDBC and so don't need to abide by JDBC's silly interpretation of limit 0.
          Hide
          xhoong Xavier FH Leong added a comment -

          I still think that the consistency within Avatica will be beneficial, if not, care needs to be in for special handling (as you pointed out earlier)

          Finalize commit:
          https://github.com/xhoong/incubator-calcite/commit/e644a3ad8344f68e114da0adf60f41a227b2aff9

          Added a test in CalciteRemoteDriverTest.

          My 2 cent is to have the interface supporting JDBC consistent with JDBC, extending the interface to support other use case if needed (where there's a use case calling for 0 row ResultSet)

          Show
          xhoong Xavier FH Leong added a comment - I still think that the consistency within Avatica will be beneficial, if not, care needs to be in for special handling (as you pointed out earlier) Finalize commit: https://github.com/xhoong/incubator-calcite/commit/e644a3ad8344f68e114da0adf60f41a227b2aff9 Added a test in CalciteRemoteDriverTest. My 2 cent is to have the interface supporting JDBC consistent with JDBC, extending the interface to support other use case if needed (where there's a use case calling for 0 row ResultSet)
          Hide
          xhoong Xavier FH Leong added a comment -

          Think of it, there's no mistake in JdbcMeta.prepareAndExecute

                if (maxRowCount > 0) {
                  statement.setMaxRows(maxRowCount);
                }
          

          As meta.prepareAndExecute expect maxRowCount==0 to return 0 row ResultSet, but for the scope in JdbcMeta (which is a façade for underlying JDBC driver) cannot fulfill that expectation, as setting the statement.setMaxRows(0) is giving unlimited rows instead. There will be no underlying JDBC driver that can fulfill 0 rows ResultSet, and would require additional handling to fulfill that expectation.

          Sorry for being pain in the ass, but I hope you can see that the miss-alignment can be implementation trap.

          Show
          xhoong Xavier FH Leong added a comment - Think of it, there's no mistake in JdbcMeta.prepareAndExecute if (maxRowCount > 0) { statement.setMaxRows(maxRowCount); } As meta.prepareAndExecute expect maxRowCount==0 to return 0 row ResultSet, but for the scope in JdbcMeta (which is a façade for underlying JDBC driver) cannot fulfill that expectation, as setting the statement.setMaxRows(0) is giving unlimited rows instead. There will be no underlying JDBC driver that can fulfill 0 rows ResultSet, and would require additional handling to fulfill that expectation. Sorry for being pain in the ass, but I hope you can see that the miss-alignment can be implementation trap.
          Hide
          julianhyde Julian Hyde added a comment -

          I hope you can see that the miss-alignment can be implementation trap.

          Yes, that's why I documented the SPI explicitly:

             * @param maxRowCount Negative for no limit (different meaning than JDBC)

          As an implementer, that's as good as it gets. You just need to implement the SPI as documented.

          Show
          julianhyde Julian Hyde added a comment - I hope you can see that the miss-alignment can be implementation trap. Yes, that's why I documented the SPI explicitly: * @param maxRowCount Negative for no limit (different meaning than JDBC) As an implementer, that's as good as it gets. You just need to implement the SPI as documented.
          Hide
          xhoong Xavier FH Leong added a comment -

          Seem like I'm not able to convince you to be consistent with JDBC "irkiness" maxRowCount

          https://github.com/xhoong/incubator-calcite/commit/05d1a6bd51e6a4a5e939b5a899122008da7bd189

          Please find the 0 row count implementation with JdbcMeta.prepareAndExecute, complete with test of course.

          Show
          xhoong Xavier FH Leong added a comment - Seem like I'm not able to convince you to be consistent with JDBC "irkiness" maxRowCount https://github.com/xhoong/incubator-calcite/commit/05d1a6bd51e6a4a5e939b5a899122008da7bd189 Please find the 0 row count implementation with JdbcMeta.prepareAndExecute, complete with test of course.
          Hide
          xhoong Xavier FH Leong added a comment -

          Julian Hyde Any chance to review the new implementation? thanks.

          Show
          xhoong Xavier FH Leong added a comment - Julian Hyde Any chance to review the new implementation? thanks.
          Show
          julianhyde Julian Hyde added a comment - Fixed in http://git-wip-us.apache.org/repos/asf/incubator-calcite/commit/e8c3cccd . Thanks, Xavier!
          Hide
          jnadeau Jacques Nadeau added a comment -

          Resolved in release 1.4.0-incubating (2015-08-23)

          Show
          jnadeau Jacques Nadeau added a comment - Resolved in release 1.4.0-incubating (2015-08-23)

            People

            • Assignee:
              julianhyde Julian Hyde
              Reporter:
              xhoong Xavier FH Leong
            • Votes:
              0 Vote for this issue
              Watchers:
              3 Start watching this issue

              Dates

              • Created:
                Updated:
                Resolved:

                Development