Uploaded image for project: 'Calcite'
  1. Calcite
  2. CALCITE-1031

In prepared statement, CsvScannableTable.scan is called twice

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: 1.5.0
    • Fix Version/s: 1.7.0, avatica-1.7.0
    • Component/s: avatica
    • Labels:
      None
    • Environment:

      jdk 1.8.0_20 linux mint 17.3

      Description

      Hi,

      I just noticed something when i am playing with examples.

        Class.forName("org.apache.calcite.jdbc.Driver")
        val properties: Properties = new Properties()
        properties.setProperty("caseSensitive","true")
        val connection = DriverManager.getConnection("jdbc:calcite:", properties)
        val calciteConnection=connection.unwrap(classOf[CalciteConnection])
      
        val schema= (new CsvSchemaFactory()).create(calciteConnection.getRootSchema,null,Map[String,AnyRef]("directory" -> "src/main/resources/csv","flavor" -> "scannable").asJava)
      
        calciteConnection.getRootSchema.add("TEST",schema)
      
      
        val statement2=calciteConnection.prepareStatement("select \"tarih\" from \"TEST\".\"timeseries\" where \"sensor\" = ?")
      
        statement2.setString(1,"sensor38")
        val resultSet1=statement2.executeQuery()
        while(resultSet1.next())
          println(resultSet1.getString("tarih"))
      

      My example like above. But when i call executeQuery i noticed that CsvScannableTable.scan called twice and enumerator iterating on values. When i inspected stacktraces i found that one of the call come from AvaticaConnection.java:463 and the other one is AvaticaConnection:489 .

      1. threads_report.txt
        3 kB
        Anıl Chalil
      2. threads_report2.txt
        3 kB
        Anıl Chalil

        Activity

        Hide
        elserj Josh Elser added a comment -

        Resolved in release Avatica 1.7.0

        Show
        elserj Josh Elser added a comment - Resolved in release Avatica 1.7.0
        Hide
        zhenw zhen wang added a comment -

        thanks Julian Hyde mine is too hacky to form a PR, I am good with the PR.

        Show
        zhenw zhen wang added a comment - thanks Julian Hyde mine is too hacky to form a PR, I am good with the PR.
        Hide
        julianhyde Julian Hyde added a comment -

        Fixed in http://git-wip-us.apache.org/repos/asf/calcite/commit/8bbef2d6.

        zhen wang, I wasn't able to use your fix, but thank you for your help finding the cause of this issue.

        Show
        julianhyde Julian Hyde added a comment - Fixed in http://git-wip-us.apache.org/repos/asf/calcite/commit/8bbef2d6 . zhen wang , I wasn't able to use your fix, but thank you for your help finding the cause of this issue.
        Hide
        julianhyde Julian Hyde added a comment -

        zhen wang, I would like to commit soon. If you don't provide a pull request in the next 2 days I will write a fix that does not use your changes. Please create a pull request.

        Show
        julianhyde Julian Hyde added a comment - zhen wang , I would like to commit soon. If you don't provide a pull request in the next 2 days I will write a fix that does not use your changes. Please create a pull request.
        Hide
        julianhyde Julian Hyde added a comment -

        zhen wang, I have a fix ready to commit that builds upon your patch. But I need a pull request before I can proceed. Please let me know when you have created one.

        Show
        julianhyde Julian Hyde added a comment - zhen wang , I have a fix ready to commit that builds upon your patch. But I need a pull request before I can proceed. Please let me know when you have created one.
        Hide
        julianhyde Julian Hyde added a comment -

        It looks good. Can you please open a pull request with your changes? (It's necessary red tape: Apache does not consider a link to a github branch adequate consent for a code submission.)

        Show
        julianhyde Julian Hyde added a comment - It looks good. Can you please open a pull request with your changes? (It's necessary red tape: Apache does not consider a link to a github branch adequate consent for a code submission.)
        Hide
        zhenw zhen wang added a comment -

        This is a hack which works ( all case pass )
        https://github.com/apache/calcite/compare/master...zinking:1031-scan-twice2?expand=1

        Revisiting: from the face of it, the sql plan is executed twice causing table scanned twice. firstly when result set is needed and secondly after the result set is generated and to actually retrieve the result set content.

        In my understanding, The intention of this lies in `RemoteMeta` design. where for the `RemoteMeta` the first execution is delegated to server side where result set is wrapped but not retrieved. and the second execute is to retrieve the content from the wrapped result set. Certainly, the `RemoteMeta` doesn’t deal with `TableScan` anyways. so this is good for RemoteMeta.

        but for the ‘Local’ Meta CalciteMetaImpl, both the two executions falls back to the same call, that is wrapping the result set for client use. thus making the Table scanned twice. CalciteMetaImpl certainly does not require the execution to be delegated to remote, so in this case the first execution can be safely skipped.

        The fix is a bit hacky though, currently no way provided to distinct these metas.

        Julian Hyde let me know what do you think. thanks.

        Show
        zhenw zhen wang added a comment - This is a hack which works ( all case pass ) https://github.com/apache/calcite/compare/master...zinking:1031-scan-twice2?expand=1 Revisiting: from the face of it, the sql plan is executed twice causing table scanned twice. firstly when result set is needed and secondly after the result set is generated and to actually retrieve the result set content. In my understanding, The intention of this lies in `RemoteMeta` design. where for the `RemoteMeta` the first execution is delegated to server side where result set is wrapped but not retrieved. and the second execute is to retrieve the content from the wrapped result set. Certainly, the `RemoteMeta` doesn’t deal with `TableScan` anyways. so this is good for RemoteMeta. but for the ‘Local’ Meta CalciteMetaImpl, both the two executions falls back to the same call, that is wrapping the result set for client use. thus making the Table scanned twice. CalciteMetaImpl certainly does not require the execution to be delegated to remote, so in this case the first execution can be safely skipped. The fix is a bit hacky though, currently no way provided to distinct these metas. Julian Hyde let me know what do you think. thanks.
        Hide
        zhenw zhen wang added a comment -

        best I can go today https://github.com/apache/calcite/compare/master...zinking:1031-scan-twice?expand=1
        still failing two cases in `CalciteRemoteDriverTest`

        my understanding currently is that in `CalciteMetaImpl` it always overrides the base class `createIterable` method to re-execute the statement regardless of whether it has already been executed.

        but once the Avatica driver wrapped result set is exposed to Calcite driver, there seem to be certain incompatibles. the result type List<List<object>> doesn't seem adapt perfectly everywhere in calcite driver which generate 12 errors Julian mentioned above.

        CalciteRemoteDriverTest left to be addressed.

        Show
        zhenw zhen wang added a comment - best I can go today https://github.com/apache/calcite/compare/master...zinking:1031-scan-twice?expand=1 still failing two cases in `CalciteRemoteDriverTest` my understanding currently is that in `CalciteMetaImpl` it always overrides the base class `createIterable` method to re-execute the statement regardless of whether it has already been executed. but once the Avatica driver wrapped result set is exposed to Calcite driver, there seem to be certain incompatibles. the result type List<List<object>> doesn't seem adapt perfectly everywhere in calcite driver which generate 12 errors Julian mentioned above. CalciteRemoteDriverTest left to be addressed.
        Hide
        julianhyde Julian Hyde added a comment -

        I think the problem is that we call execute twice for prepared statements. We should instead call bind then execute.

        Show
        julianhyde Julian Hyde added a comment - I think the problem is that we call execute twice for prepared statements. We should instead call bind then execute.
        Hide
        julianhyde Julian Hyde added a comment -

        The result string is easily fixed - it should have been "i=[0]\ni=[10]\n" all along.

        Bigger problem is the 12 errors in RelBuilderTest, JdbcAdapterTest, JdbcTest. I don't think executeEmpty is the right approach.

        Show
        julianhyde Julian Hyde added a comment - The result string is easily fixed - it should have been "i= [0] \ni= [10] \n" all along. Bigger problem is the 12 errors in RelBuilderTest, JdbcAdapterTest, JdbcTest. I don't think executeEmpty is the right approach.
        Hide
        zhenw zhen wang added a comment -

        since the statement is already executed, I think the second execution should wrap the first result.
        some thoughts here: https://github.com/apache/calcite/compare/master...zinking:1031-scan-twice?expand=1

        but I am getting
        java.lang.AssertionError:
        Expected: "i=0\ni=10\n"
        but: was "i=[0]\ni=[10]\n"

        seems I am having some dataType issue.

        is this the correct direction ?

        Show
        zhenw zhen wang added a comment - since the statement is already executed, I think the second execution should wrap the first result. some thoughts here: https://github.com/apache/calcite/compare/master...zinking:1031-scan-twice?expand=1 but I am getting java.lang.AssertionError: Expected: "i=0\ni=10\n" but: was "i= [0] \ni= [10] \n" seems I am having some dataType issue. is this the correct direction ?
        Hide
        julianhyde Julian Hyde added a comment -

        I can reproduce on master. See https://github.com/julianhyde/calcite/tree/1031-scan-twice, which has

        • a test case against CsvScannableTable that succeeds but demonstrates the issue if you set a break point, and
        • another test case in ScannableTableTest that counts the number of times scan is called, and fails.

        The heart of the problem is in AvaticaConnection.executeQueryInternal, which calls execute from line 463 then again from line 489. By the way, I checked and the issue already existed when Josh Elser changed that method to fix CALCITE-903.

        Show
        julianhyde Julian Hyde added a comment - I can reproduce on master. See https://github.com/julianhyde/calcite/tree/1031-scan-twice , which has a test case against CsvScannableTable that succeeds but demonstrates the issue if you set a break point, and another test case in ScannableTableTest that counts the number of times scan is called, and fails. The heart of the problem is in AvaticaConnection.executeQueryInternal, which calls execute from line 463 then again from line 489. By the way, I checked and the issue already existed when Josh Elser changed that method to fix CALCITE-903 .
        Hide
        zhenw zhen wang added a comment - - edited

        not reproduced on `master`

        not reproduced using new statement instead of `prepared statement`

        Show
        zhenw zhen wang added a comment - - edited not reproduced on `master` – not reproduced using new statement instead of `prepared statement`
        Hide
        capacman Anıl Chalil added a comment -

        I attached two files which include stack traces at the time scan is called

        Show
        capacman Anıl Chalil added a comment - I attached two files which include stack traces at the time scan is called

          People

          • Assignee:
            zhenw zhen wang
            Reporter:
            capacman Anıl Chalil
          • Votes:
            0 Vote for this issue
            Watchers:
            4 Start watching this issue

            Dates

            • Created:
              Updated:
              Resolved:

              Development