Calcite / CALCITE-767

Commit functionality not exposed by the RPC server

    Details

    • Type: Improvement
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 1.6.0
    • Component/s: avatica
    • Labels:
      None

      Description

      It seems that the commit/rollback functionality is not exposed by the RPC server, which means that it's only usable in autocommit mode. Avatica itself doesn't have a concept of commit in the RPC and the remote JDBC connection raises an exception when calling commit() on it, but Phoenix's native JDBC connection does implement commit(), so the RPC needs to be extended to allow calling that remotely.

      The easiest way to test this: "!autocommit off" followed by "!commit" fails in "sqlline-thin.py" but works in "sqlline.py".
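To make the gap concrete, here is a minimal Python sketch of an RPC-style dispatcher in front of a transactional connection. It is not Avatica's wire protocol: the request names (`execute`, `commit`, `rollback`), the `ToyRpcServer` class, and the in-memory SQLite database standing in for Phoenix are all assumptions made for illustration.

```python
import sqlite3


class ToyRpcServer:
    """Hypothetical stand-in for an RPC server fronting a database connection."""

    def __init__(self, expose_transactions):
        # With default settings, Python's sqlite3 implicitly opens a transaction
        # before INSERT/UPDATE/DELETE, which mimics "autocommit off".
        self.conn = sqlite3.connect(":memory:")
        self.conn.execute("CREATE TABLE t (id INTEGER PRIMARY KEY)")
        self.conn.commit()
        self.handlers = {"execute": self._execute}
        if expose_transactions:
            # The extension the issue asks for: forward commit/rollback
            # to the underlying connection.
            self.handlers["commit"] = lambda _req: self.conn.commit()
            self.handlers["rollback"] = lambda _req: self.conn.rollback()

    def _execute(self, req):
        return self.conn.execute(req["sql"]).fetchall()

    def handle(self, req):
        handler = self.handlers.get(req["request"])
        if handler is None:
            # Without the extension, a remote commit has nowhere to go.
            raise NotImplementedError("unsupported request: " + req["request"])
        return handler(req)
```

With `expose_transactions=False`, a `{"request": "commit"}` message is rejected even though the wrapped connection could commit, which is the situation described above. With `expose_transactions=True`, the same message reaches the connection.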

          Activity

          ndimiduk Nick Dimiduk added a comment -

          Thanks for kicking the tires here Lukas. This would be a feature gap on the Calcite side.

          lukaslalinsky Lukas Lalinsky added a comment -

          Ah, I was not sure if that wasn't intentionally not implemented in Calcite with the idea that projects that need that should extend the interface. But adding full JDBC interface there makes more sense.

          ndimiduk Nick Dimiduk added a comment -

          Bulk edit assigning avatica component to obvious issues.

          jamestaylor James Taylor added a comment -

          Nick Dimiduk - it'd be great if Avatica could add support for commit and rollback in its RPC. Is there a Calcite JIRA for that which I can follow?

          jamestaylor James Taylor added a comment -

          One complication of implementing this in Avatica, brought up by Josh Elser, is how to implement commit while preserving our "stateless client" approach. Specifically, say you're accessing a query server behind some "dumb" load balancer. You do some stuff, but before you call commit(), the server dies. How does the client handle this? When autocommit=off, do we buffer this in memory on the client and automatically replay the calls since the last commit()?

          A similar issue might be if the same client connection ends up getting a different query server node due to the load balancer. How is the uncommitted state maintained across different query server nodes?

          For transactional tables, one potential solution is that when a statement causes a transaction to start, we return the transaction metadata as part of the response. The metadata would vary based on the implementation, but for Phoenix the state that would need to be captured is based on Tephra's Transaction object [1]. The query server would need to ensure that the data for each statement was flushed to the cluster prior to returning (see PHOENIX-2411 for more detail). With transactions, the data can essentially be written to HBase, but not yet considered committed. Ideally, a transaction implementation could give back a simple transaction ID which would capture this metadata.

          For non-transactional tables, other than carrying around the uncommitted data on the client (which probably isn't a scalable solution for a true thin driver), I can't think of a better solution.

          [1] https://github.com/caskdata/tephra/blob/develop/tephra-api/src/main/java/co/cask/tephra/Transaction.java
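The "buffer on the client and replay since the last commit()" idea raised in this comment can be sketched roughly as follows. This is only an illustration of the trade-off, not a proposed Avatica design: `ToyServer`, `ReplayingClient`, and `ServerDown` are hypothetical names, statements are plain strings, and the shared list `store` stands in for the cluster.

```python
class ServerDown(Exception):
    """Raised by a toy server that has died."""


class ToyServer:
    """Hypothetical query server that holds uncommitted statements in memory."""

    def __init__(self, store):
        self.store = store    # shared "cluster" storage, a plain list
        self.pending = []     # uncommitted statements, lost if this server dies
        self.alive = True

    def execute(self, stmt):
        if not self.alive:
            raise ServerDown()
        self.pending.append(stmt)

    def commit(self):
        if not self.alive:
            raise ServerDown()
        self.store.extend(self.pending)
        self.pending = []


class ReplayingClient:
    """Buffers statements since the last commit and replays them on failover."""

    def __init__(self, servers):
        self.servers = list(servers)
        self.buffer = []      # statements sent since the last commit
        self.synced = None    # server that has already seen the buffer

    def _with_failover(self, action):
        while self.servers:
            server = self.servers[0]
            try:
                if server is not self.synced:
                    # New server: replay everything since the last commit.
                    for stmt in self.buffer:
                        server.execute(stmt)
                    self.synced = server
                return action(server)
            except ServerDown:
                self.servers.pop(0)   # give up on this server, try the next
        raise ServerDown("no servers left")

    def execute(self, stmt):
        self._with_failover(lambda server: server.execute(stmt))
        self.buffer.append(stmt)

    def commit(self):
        self._with_failover(lambda server: server.commit())
        self.buffer.clear()
```

The client's memory use grows with every uncommitted statement, which is the scalability concern noted above. The sketch also assumes a failed call applied nothing on the server; a server that dies after committing but before responding would cause a duplicate replay.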

          elserj Josh Elser added a comment -

          I've been racking my brains trying to come up with a better way around this for non-transactional tables (as James outlined above), and I'm just not coming up with anything better. This might be an area where operating stateless-ly just isn't feasible for a first-pass implementation.

          I think my approach might be to make the assumption that communications are going to a single instance for now, and come up with more robust strategies later.
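A small Python sketch of what the single-instance assumption means in practice: uncommitted work is kept in server memory, keyed by a connection ID, so only the instance that opened the connection can commit it. The class and method names here are hypothetical and do not reflect Avatica's actual server code.

```python
import uuid


class UnknownConnection(LookupError):
    """Raised when a server has never seen the given connection ID."""


class StatefulServer:
    """Hypothetical server keeping per-connection uncommitted state in memory."""

    def __init__(self, store):
        self.store = store        # shared "cluster" storage, a plain list
        self.connections = {}     # connection ID -> uncommitted statements

    def open_connection(self):
        conn_id = str(uuid.uuid4())
        self.connections[conn_id] = []
        return conn_id

    def _pending(self, conn_id):
        try:
            return self.connections[conn_id]
        except KeyError:
            raise UnknownConnection(conn_id) from None

    def execute(self, conn_id, stmt):
        self._pending(conn_id).append(stmt)

    def commit(self, conn_id):
        pending = self._pending(conn_id)
        self.store.extend(pending)
        pending.clear()

    def rollback(self, conn_id):
        self._pending(conn_id).clear()
```

If a load balancer sends `commit` for that connection ID to a second `StatefulServer`, it raises `UnknownConnection` and nothing is written, which is the failure mode a first-pass implementation accepts.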

          elserj Josh Elser added a comment -

          Just threw up some simple changes that enable commit/rollback. Didn't try to do any of the fancy things talked about previously; dirt simple implementation.

          Julian Hyde, if you have time, a glance by you is always appreciated.

          elserj Josh Elser added a comment -

          Did some prelim testing of this using Phoenix and protobuf. Using Phoenix's pherf tool, I was able to run the same test against the normal Phoenix driver and the Avatica-based Phoenix thin driver. For those curious, performance degradation without any thought of optimization in Avatica: 5x. Pretty good for a first pass, IMO.

          I'm going to rebase/merge this in tonight.

          jamestaylor James Taylor added a comment -

          I'd be interested in seeing your Pherf scenario, Josh Elser. 5x is more degradation than I'd have expected, but I'm sure it depends on the workload.

          FWIW, I'm about to implement PHOENIX-2411 which would support the "dumb" load balancer scenario you mentioned.

          elserj Josh Elser added a comment -

          > I'd be interested in seeing your Pherf scenario, Josh Elser. 5x is more degradation than I'd have expected, but I'm sure it depends on the workload.

          The frustrating part at the moment is that I have no good means to understand where time is being spent. I'm still working through what I think is some nonsense logging in pherf, but I'll get back to you on that over in Phoenixlandia.

          > FWIW, I'm about to implement PHOENIX-2411 which would support the "dumb" load balancer scenario you mentioned.

          Sick!

          jamestaylor James Taylor added a comment -

          > FWIW, I'm about to implement PHOENIX-2411 which would support the "dumb" load balancer scenario you mentioned.

          Some work would be required in Avatica to support this scenario. Phoenix would give you the transactional context as a result of executing a statement and you'd need to pass this through the RPC. I think you'd need to do the same for the connection URL and properties (though these wouldn't be modified by Phoenix, just read).

          If you want a lesson in "how to interpret Pherf numbers", Mujtaba Chohan is your man.
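The idea of returning a transactional context from statement execution and passing it back through the RPC can be sketched like this. Everything here is a simplified assumption: `ToyCluster` and `StatelessServer` are hypothetical, and the context is reduced to a bare transaction ID, whereas a real context (such as the Tephra Transaction object mentioned earlier) carries more state.

```python
import itertools


class ToyCluster:
    """Hypothetical shared storage: rows are tagged with the writing transaction."""

    def __init__(self):
        self.rows = []           # (tx_id, value) pairs, written before responding
        self.committed = set()   # transaction IDs considered committed
        self._ids = itertools.count(1)

    def begin(self):
        return next(self._ids)

    def visible_rows(self):
        # Written-but-uncommitted rows exist in storage but are not visible.
        return [value for tx_id, value in self.rows if tx_id in self.committed]


class StatelessServer:
    """Keeps no per-connection state; the request carries what it needs."""

    def __init__(self, cluster):
        self.cluster = cluster

    def execute(self, value, tx_context=None):
        if tx_context is None:
            # First statement of a transaction: start one and hand the
            # context back to the client as part of the response.
            tx_context = {"tx_id": self.cluster.begin()}
        self.cluster.rows.append((tx_context["tx_id"], value))
        return {"tx_context": tx_context}

    def commit(self, tx_context):
        self.cluster.committed.add(tx_context["tx_id"])

    def rollback(self, tx_context):
        tx_id = tx_context["tx_id"]
        self.cluster.rows = [row for row in self.cluster.rows if row[0] != tx_id]
```

Because the client sends the context with every call, a statement can run on one `StatelessServer` and the commit on another. The cost is that each statement's data must reach the cluster before the response is returned.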

          elserj Josh Elser added a comment -

          > If you want a lesson in "how to interpret Pherf numbers", Mujtaba Chohan is your man.

          Ok. I'll probably hit him up later this week. Still messing around with things.

          > Some work would be required in Avatica to support this scenario. Phoenix would give you the transactional context as a result of executing a statement and you'd need to pass this through the RPC. I think you'd need to do the same for the connection URL and properties (though these wouldn't be modified by Phoenix, just read).

          Ok, that makes sense. While I'm thinking about it, let me open up another issue and assign it to you for proper attribution on this work as well as tracking it.

          elserj Josh Elser added a comment -

          This was fixed in https://git1-us-west.apache.org/repos/asf?p=calcite.git;a=commit;h=322b97300d460cf7c98b6002e4f0d5dab455f188

          julianhyde Julian Hyde added a comment -

          Resolved in release 1.6.0 (2016-01-22).


            People

            • Assignee: elserj Josh Elser
            • Reporter: lukaslalinsky Lukas Lalinsky
            • Votes: 0
            • Watchers: 7