Details

    • Type: New Feature
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 1.6.0
    • Component/s: stream
    • Labels:
      None

      Description

      Stream joins are used to relate information from different streams, or from combinations of streams and relations. Calcite lacks (proper) support for stream-to-relation joins and stream-to-stream joins.

      A stream-to-relation join like the one below fails at the SQL validation stage.

      select stream orders.orderId, orders.productId, products.name from orders join products on orders.productId = products.id

      But if 'products' is a stream, the query is valid according to Calcite, even though the stream-to-stream join in the above query is not valid due to the unbounded nature of streams.

      Attachments: 1. CALCITE-968-0.patch (13 kB) by Milinda Lakmal Pathirage


          Activity

          Milinda Lakmal Pathirage added a comment (edited)

          Below is Julian Hyde's response from the developer mailing list on this issue.

          The design falls into 3 parts:

          • Validation. We should allow any combination: table-table, stream-table and stream-stream joins, as long as the query can make progress. That often means that where a stream is involved, the join condition should involve a monotonic expression. If it is a stream-table join you can make progress without the monotonic expression, but if there are 2 streams you will need it.
          • Translation to relational algebra. Inspired by differential calculus’ product rule[1], "stream(x join y)" becomes "x join stream(y) union all stream(x) join y". Suppose that products is a table (i.e. we do not receive notifications of new products); then "stream(products)" is empty. Suppose that orders is both a stream and a table; i.e. a stream with history. Because stream(products) is empty, "stream(products join orders)" is simply “products join stream(orders)”. These rewrites would happen in a DeltaJoinTransposeRule.
          • Updates to relations. Suppose that the products table is updated two or three times during each day. How quickly does the end user expect those updated records to appear in the output of the stream-table join? If the table is updated at 10am, should the new data be loaded only when processing transactions from 10am (which might not hit the join until say 10:07am)? There is no ‘right answer’ here; we should offer the end user a choice of policies. A good basic policy would be “cache for no more than T seconds” or “cache as long as you like” but give a manual way to flush the cache.

          [1] https://en.wikipedia.org/wiki/Product_rule
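
          The product-rule rewrite above can be checked with a small simulation. The sketch below is plain Java, not Calcite code, and the keys and payload values are invented. It verifies that the new rows of "x join y" after a batch of deltas arrives equal "(x union Δx) join Δy UNION ALL Δx join y", reading the first "x" of the rewrite as the relation including its new rows:

          ```java
          import java.util.LinkedHashSet;
          import java.util.Set;
          import java.util.TreeSet;

          public class DeltaJoinDemo {
            /** Rows are {key, payload}; join on key, emit "key/leftPayload/rightPayload". */
            static Set<String> join(Set<int[]> left, Set<int[]> right) {
              Set<String> out = new TreeSet<>();
              for (int[] l : left) {
                for (int[] r : right) {
                  if (l[0] == r[0]) {
                    out.add(l[0] + "/" + l[1] + "/" + r[1]);
                  }
                }
              }
              return out;
            }

            static Set<int[]> union(Set<int[]> a, Set<int[]> b) {
              Set<int[]> u = new LinkedHashSet<>(a);
              u.addAll(b);
              return u;
            }

            public static void main(String[] args) {
              Set<int[]> x = Set.of(new int[] {1, 10});      // existing left rows
              Set<int[]> dx = Set.of(new int[] {2, 20});     // new left rows: stream(x)
              Set<int[]> y = Set.of(new int[] {1, 100}, new int[] {2, 200});
              Set<int[]> dy = Set.of(new int[] {2, 300});    // new right rows: stream(y)

              // stream(x join y): rows of the join that are new after this batch
              Set<String> delta = new TreeSet<>(join(union(x, dx), union(y, dy)));
              delta.removeAll(join(x, y));

              // product-rule rewrite: (x union dx) join dy  UNION ALL  dx join y
              Set<String> rewrite = new TreeSet<>(join(union(x, dx), dy));
              rewrite.addAll(join(dx, y));

              System.out.println(delta.equals(rewrite));
            }
          }
          ```

          If dy is empty (the Products table receives no notifications), the first term vanishes and only "Δx join y" survives, matching the observation that "stream(products join orders)" reduces to "products join stream(orders)".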

          Milinda Lakmal Pathirage added a comment

          Below are some sample queries I came up with. But I am not sure what exactly Julian Hyde meant by

          the join condition should involve a monotonic expression

          above.

          In the queries below I assume that we are not going to extend the FROM and JOIN clauses with OVER to express the window specification, and I use <monotonic expression> as a placeholder for a monotonic expression that can express the window spec. All the windows expressed below are sliding windows.

          Sample Schemas

          Order Processing

          • Orders (rowtime, productId, orderId, units) - a stream
          • Products (productId, name, supplierId) - a table
          • Shipments (rowtime, shipmentId, orderId) - a stream
          • Suppliers (supplierId, name, location) - a table

          Packet Monitoring

          • PacketsR1 (rowtime, sourcetime, packetId) - a stream
          • PacketsR2 (rowtime, sourcetime, packetId) - a stream

          Stream-to-Stream Joins

          Network Packet Monitoring

          SELECT STREAM
          	PacketsR1.sourcetime, PacketsR1.packetId, PacketsR2.rowtime - PacketsR1.rowtime AS timeToTravel
          FROM PacketsR1
          JOIN PacketsR2 ON <monotonic expression> AND  PacketsR1.packetId = PacketsR2.packetId	
          

          With OVER clause:

          SELECT STREAM
          	PacketsR1.sourcetime, PacketsR1.packetId, PacketsR2.rowtime - PacketsR1.rowtime AS timeToTravel
          FROM PacketsR1 (ORDER BY rowtime RANGE INTERVAL '2' SECOND PRECEDING)
          JOIN PacketsR2 (ORDER BY rowtime RANGE INTERVAL '2' SECOND PRECEDING) ON  PacketsR1.packetId = PacketsR2.packetId	
          

          Online Auctions

          From http://www.sqlstream.com/docs/index.html?qs_stream_window_joins.html

          Joining the bids from the last minute with the Asks stream may be a valid query, but we don't have support for OVER in FROM and JOIN clauses. As Julian mentioned in the mail, it may be possible to express the window spec as a monotonic expression in the ON clause.

          SELECT STREAM 
          	Asks.askId as askId, Bids.bidId as bidId, Asks.rowtime as askRowtime, Bids.rowtime as bidRowtime, Asks.ticker, Asks.shares as askShares, Asks.prices as askPrice, Bids.shares as bidShares, Bids.price as bidPrice
          FROM Bids
          JOIN Asks ON <monotonic expression> AND Asks.ticker = Bids.ticker
          

          Above query with OVER clause.

          SELECT STREAM 
          	Asks.askId as askId, Bids.bidId as bidId, Asks.rowtime as askRowtime, Bids.rowtime as bidRowtime, Asks.ticker, Asks.shares as askShares, Asks.prices as askPrice, Bids.shares as bidShares, Bids.price as bidPrice
          FROM Bids OVER (ORDER BY rowtime RANGE INTERVAL '1' MINUTE PRECEDING)
          JOIN Asks OVER (ROWS CURRENT ROW) ON  Asks.ticker = Bids.ticker
          

          Stream-to-Relation Joins

          Add Supplier Information to Orders Stream

          SELECT STREAM
          	Orders.rowtime, Orders.orderId, Orders.productId, Orders.units, Products.supplierId
          FROM Orders
          JOIN Products 	ON Orders.productId = Products.productId
          
          CREATE VIEW OrdersWithSupplierId (rowtime, orderId, productId, units, supplierId) AS
          	SELECT STREAM
          		Orders.rowtime, Orders.orderId, Orders.productId, Orders.units, Products.supplierId
          	FROM Orders
          	JOIN Products 	ON Orders.productId = Products.productId
          
          SELECT STREAM
          	OrdersWithSupplierId.rowtime, OrdersWithSupplierId.orderId, OrdersWithSupplierId.supplierId, Suppliers.location
          FROM OrdersWithSupplierId
          JOIN Suppliers ON OrdersWithSupplierId.supplierId = Suppliers.supplierId
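
          A stream-to-relation join like the first example above boils down to a per-row lookup into the bounded table for each arriving stream row. A toy sketch in plain Java (not Calcite's runtime; the ids and values below are invented):

          ```java
          import java.util.Map;

          public class StreamTableJoinDemo {
            public static void main(String[] args) {
              // Products table: productId -> supplierId
              Map<Integer, Integer> products = Map.of(10, 100, 11, 101);
              // Orders stream rows: {orderId, productId, units}
              int[][] orders = {{1, 10, 5}, {2, 11, 3}};

              // FROM Orders JOIN Products ON Orders.productId = Products.productId
              for (int[] o : orders) {
                Integer supplierId = products.get(o[1]);
                if (supplierId != null) {  // inner join: drop unmatched rows
                  System.out.println("order " + o[0] + " -> supplier " + supplierId);
                }
              }
            }
          }
          ```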
          
          Julian Hyde added a comment

          Regarding

          the join condition should involve a monotonic expression

          I was mistaken. Stream-relation joins do not need any monotonic expression in the join condition to make progress. (I was thinking of some temporal database thing where an order would be joined to the version of the product table that was current at the moment the order was placed. But that can wait until later!)

          Your stream-to-relation examples look good. Let's create a new stream-to-relation-to-relation example, implied by expanding the view in the 2nd example, and switch to USING just because we can:

          // stream-to-relation example #3
          SELECT STREAM o.rowtime, o.orderId, p.supplierId, s.location
          FROM Orders AS o
          JOIN Products AS p USING (productId)
          JOIN Suppliers AS s USING (supplierId)

          Regarding stream-to-stream joins: I don't think it is ever necessary for a FROM item to have an OVER clause. (Please give a counter-example if you have one! In all the real cases I have seen, the left rowtime is joined to a range from the right, or vice versa.) I don't think that users will enjoy learning a new and unnecessary SQL construct, so let's see if we can do without it. Here are the queries with "monotonic expression" replaced:

          // stream-to-stream example #1
          SELECT STREAM GREATEST(PacketsR1.rowtime, PacketsR2.rowtime) AS rowtime,
          	PacketsR1.sourcetime,
                  PacketsR1.packetId,
                  PacketsR2.rowtime - PacketsR1.rowtime AS timeToTravel
          FROM PacketsR1
          JOIN PacketsR2 ON PacketsR1.rowtime
              BETWEEN PacketsR2.rowtime - INTERVAL '2' SECOND
                  AND PacketsR2.rowtime + INTERVAL '2' SECOND
          AND  PacketsR1.packetId = PacketsR2.packetId
          // stream-to-stream example #2
          SELECT STREAM  Asks.rowtime,
          	Asks.askId as askId, Bids.bidId as bidId,
                  Asks.rowtime as askRowtime, Bids.rowtime as bidRowtime,
                  Asks.ticker, Asks.shares as askShares, Asks.prices as askPrice,
                  Bids.shares as bidShares, Bids.price as bidPrice
          FROM Bids
          JOIN Asks ON Asks.rowtime BETWEEN Bids.rowtime AND Bids.rowtime + INTERVAL '1' MINUTE
          AND Asks.ticker = Bids.ticker

          Note that I added a 'rowtime' expression to each SELECT clause. Whereas SQLstream's join operator generates an implicit rowtime column (rowtime is a system column in SQLstream, but not in Calcite streams), we have to generate it explicitly. But I think it's clearer that way.
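
          The BETWEEN-based join condition can be read as a time-bounded join: each Bid matches the Asks whose rowtime falls in [bid.rowtime, bid.rowtime + 1 minute] with the same ticker, and GREATEST of the two rowtimes becomes the output rowtime. A toy sketch of example #2's semantics in plain Java (not Calcite's execution strategy; tickers, rowtimes, and ids are invented):

          ```java
          import java.util.Arrays;
          import java.util.List;

          public class WindowJoinDemo {
            static final class Row {
              final long rowtime; final String ticker; final int id;
              Row(long rowtime, String ticker, int id) {
                this.rowtime = rowtime; this.ticker = ticker; this.id = id;
              }
            }

            public static void main(String[] args) {
              List<Row> bids = Arrays.asList(new Row(0, "ORCL", 1), new Row(100_000, "ORCL", 2));
              List<Row> asks = Arrays.asList(new Row(30_000, "ORCL", 7), new Row(200_000, "ORCL", 8));
              long window = 60_000; // INTERVAL '1' MINUTE in milliseconds

              for (Row b : bids) {
                for (Row a : asks) {
                  // ON Asks.rowtime BETWEEN Bids.rowtime AND Bids.rowtime + INTERVAL '1' MINUTE
                  //    AND Asks.ticker = Bids.ticker
                  if (a.rowtime >= b.rowtime && a.rowtime <= b.rowtime + window
                      && a.ticker.equals(b.ticker)) {
                    // GREATEST(Bids.rowtime, Asks.rowtime) is the output rowtime
                    System.out.println("bid" + b.id + " matches ask" + a.id
                        + " at rowtime " + Math.max(a.rowtime, b.rowtime));
                  }
                }
              }
            }
          }
          ```

          The second ask (rowtime 200,000) falls outside both bids' one-minute windows, so it produces no output row.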

          Julian Hyde added a comment (edited)

          GREATEST is an Oracle function but it is pretty obvious: "GREATEST(a, b)" is defined as "CASE WHEN a > b THEN a ELSE b END".

          Milinda Lakmal Pathirage added a comment

          Julian Hyde, I started to change the SqlValidatorImpl to support stream-to-relation joins. I have several questions:

          1. Is it a requirement for the stream to be at the left-most position (just after the FROM clause) in a stream-to-relation join? As I understand it, it's possible for a stream to be in any place if we think in terms of instantaneous relations as in CQL.
          2. If a stream can be in any place, how can we throw a proper error in SqlValidatorImpl's validateModality? For example, let's say we try the following logic if the required modality is STREAM:

          boolean atLeastOneSupportsModality = false;
          for (Pair<String, SqlValidatorNamespace> namespace : scope.children) {
            if (namespace.right.supportsModality(modality)) {
              atLeastOneSupportsModality = true;
            }
          }

          if (!atLeastOneSupportsModality) {
            if (fail) {
              // Note: 'namespace' is the loop variable and is out of scope here;
              // which child's position should the error use?
              throw newValidationError(namespace.right.getNode(),
                  Static.RESOURCE.cannotConvertToStream(namespace.left));
            } else {
              return false;
            }
          }

          In the above, which child node should we use (to get the position) when throwing the error? If we move the error-throwing logic inside the for loop, is it okay to throw the error for the last child? Maybe we should throw the error with the position of the select node?

          Milinda Lakmal Pathirage added a comment (edited)

          Hi Julian Hyde,

          I tried a temporary fix to SqlValidatorImpl, implemented DeltaJoinTransposeRule based on the product rule (I don't think this is 100% correct), and wrote a simple stream-to-relation join test to verify the planner. But I am getting an error during optimization with VolcanoPlanner. According to my understanding this can happen for two reasons:

          • A missing rule. But I think Logical to Enumerable conversion works by default.
          • I am doing something incorrect when implementing ProductsTable.

          [1] is the link to the output from VolcanoPlanner. Highly appreciate any tips.

          [1] https://gist.github.com/milinda/97484c23a8dc57fc2682

          Julian Hyde added a comment

          The stream does not need to be left-most. Stream-join-table and table-join-stream are both valid.

          Not sure of the details, but validateModality is just a means to an end. If I were you I would write tests for valid and invalid queries and make sure that the invalid ones produce the most useful possible error messages. If you need to throw away validateModality and start again, that's fine.

          Julian Hyde added a comment

          I'll take a look shortly. Can you put your work on a github branch? Then we can more easily go back and forth.

          I see that you've introduced some '*' imports. That will break checkstyle.

          Milinda Lakmal Pathirage added a comment

          Fixed code style issues and pushed the current changes to https://github.com/milinda/calcite/tree/CALCITE-968

          Julian Hyde added a comment

          I have your code working but I haven't had a chance to fully analyze why Calcite cannot plan the query. But I wonder: do you have a rule

          Delta(Scan(constant-table)) -> Empty

          Without such a rule I don't think the planner can plan the query.
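
          A toy sketch of what such a rule, combined with empty-pruning, buys the planner. This is plain Java over a string-labelled plan tree, not Calcite's RelOptRule API; the operator names and the STREAMS set are illustrative. Delta over a scan of a non-streamable table rewrites to Empty, a join with an Empty input collapses to Empty, and a union drops Empty branches, so the two-branch plan from the product-rule expansion reduces to a single branch:

          ```java
          import java.util.ArrayList;
          import java.util.Arrays;
          import java.util.List;
          import java.util.Set;

          public class DeltaScanToEmptyDemo {
            static class Rel {
              final String op;
              final List<Rel> inputs;
              Rel(String op, Rel... in) {
                this.op = op;
                this.inputs = new ArrayList<>(Arrays.asList(in));
              }
              @Override public String toString() {
                return inputs.isEmpty() ? op : op + inputs;
              }
            }

            static final Set<String> STREAMS = Set.of("Orders"); // Products is a constant table

            /** Bottom-up rewrite: Delta(Scan(constant)) -> Empty, then prune empties. */
            static Rel rewrite(Rel r) {
              List<Rel> in = new ArrayList<>();
              for (Rel c : r.inputs) {
                in.add(rewrite(c));
              }
              if (r.op.equals("Delta") && in.get(0).op.startsWith("Scan:")
                  && !STREAMS.contains(in.get(0).op.substring(5))) {
                return new Rel("Empty"); // a constant table never emits new rows
              }
              if (r.op.equals("Join") && in.stream().anyMatch(c -> c.op.equals("Empty"))) {
                return new Rel("Empty"); // a join with an empty input is empty
              }
              if (r.op.equals("UnionAll")) {
                in.removeIf(c -> c.op.equals("Empty")); // a union drops empty branches
                if (in.size() == 1) {
                  return in.get(0);
                }
              }
              Rel out = new Rel(r.op);
              out.inputs.addAll(in);
              return out;
            }

            public static void main(String[] args) {
              // stream(Orders join Products) after the DeltaJoinTransposeRule expansion
              Rel plan = new Rel("UnionAll",
                  new Rel("Join", new Rel("Delta", new Rel("Scan:Orders")), new Rel("Scan:Products")),
                  new Rel("Join", new Rel("Scan:Orders"), new Rel("Delta", new Rel("Scan:Products"))));
              System.out.println(rewrite(plan));
            }
          }
          ```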

          Milinda Lakmal Pathirage added a comment

          I don't have that rule. I'll implement it and see whether the planner works.

          Milinda Lakmal Pathirage added a comment (edited)

          Hi Julian Hyde,

          I simply modified the DeltaTableScanRule to transform the call to an empty LogicalValues node if the table scan is not on a streamable table. When I tried this modification, PruneEmptyRules' UNION_INSTANCE failed with a NullPointerException because call.getChildRels(union) returned null. Then I replaced that call with union.getInputs(). But now the rule fails saying 'planner promised us at least one Empty child'. I did some debugging and found that the rule matches even when the union inputs are instances of RelSubset where one of the rels in the subset is an empty LogicalValues instance, but the isEmpty method invoked on the inputs doesn't handle this situation; it just checks whether the input node is an instance of Values and, if not, returns false. It looks like a bug, but I am not 100% sure. Changes related to this issue can be found in the https://github.com/milinda/calcite/tree/CALCITE-968-pruneempty-bug branch.

          Julian Hyde added a comment

          It looks as if there is a general bug with UNION_INSTANCE. I'll take a look and get back to you.

          Julian Hyde added a comment

          Milinda Lakmal Pathirage, your change from call.getChildRels(union) to union.getInputs() was not right (and isEmpty returns false when applied to a RelSubset). The real issue is that Volcano is not populating the list of siblings. I have logged CALCITE-990 for this, and will have a fix soon.

          If you're wondering why this was not discovered first: I previously tested PruneEmptyRules.UNION_INSTANCE with HepPlanner, which does not have this problem.

          Milinda Lakmal Pathirage added a comment

          Julian Hyde, thanks for looking into this. I knew that something was wrong, but I didn't know much about call.getChildRels; I thought it was returning the inputs of the RelNode.

          Julian Hyde added a comment

          Milinda Lakmal Pathirage, I have checked in a fix to CALCITE-990 in my branch https://github.com/julianhyde/calcite/tree/990-any-siblings. It should allow you to make progress. Please let me know either way.

          Milinda Lakmal Pathirage added a comment

          Julian Hyde, I tried your fix with the stream-to-relation join test. It works now. Thanks for the fix.

          Milinda Lakmal Pathirage added a comment

          Pull request with necessary changes and tests can be found at https://github.com/apache/calcite/pull/172.

          Julian Hyde added a comment

          Milinda Lakmal Pathirage, Looks good. I have rebased (but not squashed) and added a fix-up commit. See https://github.com/julianhyde/calcite/tree/968-stream.

          Can you check that I haven't screwed anything up? Then I'll squash and commit to master.

          Milinda Lakmal Pathirage added a comment (edited)

          Julian Hyde, the fixes look okay to me. But I am getting the following exception when I run StreamTest.

          Running org.apache.calcite.test.StreamTest
          Tests run: 7, Failures: 0, Errors: 1, Skipped: 1, Time elapsed: 8.753 sec <<< FAILURE! - in org.apache.calcite.test.StreamTest
          testStreamToRelationJoin(org.apache.calcite.test.StreamTest)  Time elapsed: 0.104 sec  <<< ERROR!
          java.lang.RuntimeException: exception while preparing [select stream orders.rowtime as rowtime, orders.id as orderId, products.supplier as supplierId from orders join products on orders.product = products.id]
          	at org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:20)
          	at org.junit.Assert.assertThat(Assert.java:865)
          	at org.junit.Assert.assertThat(Assert.java:832)
          	at org.apache.calcite.test.CalciteAssert$2.apply(CalciteAssert.java:225)
          	at org.apache.calcite.test.CalciteAssert$2.apply(CalciteAssert.java:219)
          	at org.apache.calcite.test.CalciteAssert$13.apply(CalciteAssert.java:523)
          	at org.apache.calcite.test.CalciteAssert$13.apply(CalciteAssert.java:521)
          	at org.apache.calcite.runtime.Hook.run(Hook.java:128)
          	at org.apache.calcite.prepare.Prepare.prepareSql(Prepare.java:266)
          	at org.apache.calcite.prepare.Prepare.prepareSql(Prepare.java:190)
          	at org.apache.calcite.prepare.CalcitePrepareImpl.prepare2_(CalcitePrepareImpl.java:727)
          	at org.apache.calcite.prepare.CalcitePrepareImpl.prepare_(CalcitePrepareImpl.java:586)
          	at org.apache.calcite.prepare.CalcitePrepareImpl.prepareSql(CalcitePrepareImpl.java:556)
          	at org.apache.calcite.jdbc.CalciteConnectionImpl.parseQuery(CalciteConnectionImpl.java:214)
          	at org.apache.calcite.jdbc.CalciteConnectionImpl.prepareStatement_(CalciteConnectionImpl.java:194)
          	at org.apache.calcite.jdbc.CalciteConnectionImpl.prepareStatement(CalciteConnectionImpl.java:184)
          	at org.apache.calcite.jdbc.CalciteConnectionImpl.prepareStatement(CalciteConnectionImpl.java:85)
          	at org.apache.calcite.avatica.AvaticaConnection.prepareStatement(AvaticaConnection.java:153)
          	at org.apache.calcite.test.CalciteAssert.assertPrepare(CalciteAssert.java:544)
          	at org.apache.calcite.test.CalciteAssert$AssertQuery.convertMatches(CalciteAssert.java:1257)
          	at org.apache.calcite.test.CalciteAssert$AssertQuery.convertContains(CalciteAssert.java:1252)
          	at org.apache.calcite.test.StreamTest.testStreamToRelationJoin(StreamTest.java:238)
          

          According to IntelliJ IDEA, line 20 of MatcherAssert looks like this:

          if (!matcher.matches(actual)) {


          Could the above exception be a NullPointerException? But if convertContains works for the other tests, it should work for testStreamToRelationJoin as well. Maybe I am missing something obvious in the code.
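A minimal, self-contained sketch (not the actual hamcrest source, and the Matcher interface below is simplified for illustration) of why MatcherAssert.assertThat could surface a NullPointerException: if the matcher reference is null, the very first check in assertThat dereferences it and throws before any assertion message is built, and a wrapping harness like CalciteAssert would then report it as a RuntimeException raised at MatcherAssert line 20.

```java
public class MatcherAssertSketch {
  // Simplified stand-in for org.hamcrest.Matcher
  interface Matcher<T> {
    boolean matches(T actual);
  }

  // Simplified stand-in for org.hamcrest.MatcherAssert.assertThat;
  // the if-check corresponds to MatcherAssert line 20 in the stack trace
  static <T> void assertThat(T actual, Matcher<T> matcher) {
    if (!matcher.matches(actual)) {
      throw new AssertionError("expected a match but got: " + actual);
    }
  }

  public static void main(String[] args) {
    Matcher<String> nonNull = a -> a != null;
    assertThat("plan", nonNull); // passes quietly

    try {
      // A null matcher blows up inside the if-check itself
      assertThat("plan", null);
    } catch (NullPointerException e) {
      System.out.println("NPE raised at the matcher.matches(...) check");
    }
  }
}
```

This does not prove the Calcite failure was an NPE — the wrapped cause could equally be an AssertionError from a plan mismatch — but it shows how a null handed to the matcher machinery would point at exactly that frame.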

          julianhyde Julian Hyde added a comment - - edited

          I got the same exception. I just fixed it in d35e061. I think the join got flipped because of some cost change.

          milinda Milinda Lakmal Pathirage added a comment -

          The fix works. Thanks for the fixes to my patch.

          julianhyde Julian Hyde added a comment -

          In d35e061 I'm seeing non-deterministic behavior. I just saw PlannerTest.checkBushy fail due to a join being reversed. I have seen sporadic failures of that test over the years, and we can live with that, but I need to do more testing to see whether this is new non-determinism.

          Show
          julianhyde Julian Hyde added a comment - In d35e061 I'm seeing non-deterministic behavior. I just saw PlannerTest.checkBushy fail due to a join being reversed. I have seen sporadic failures of that test over the years, and we can live with that, but I need to do more testing to see whether this is new non-determinism.
          julianhyde Julian Hyde added a comment -

          Fixed in http://git-wip-us.apache.org/repos/asf/calcite/commit/e9d50602, with some minor fix-up in http://git-wip-us.apache.org/repos/asf/calcite/commit/937fc461. Thanks for the PR, Milinda Lakmal Pathirage!
          julianhyde Julian Hyde added a comment -

          Resolved in release 1.6.0 (2016-01-22).


            People

            • Assignee:
              julianhyde Julian Hyde
              Reporter:
              milinda Milinda Lakmal Pathirage