Details
-
Task
-
Status: Triage Needed
-
P2
-
Resolution: Fixed
-
None
-
None
Description
Reported on user@
We are trying to setup a pipeline with using BeamSql and the trigger used is default (AfterWatermark crosses the window).
Below is the pipeline:KafkaSource (KafkaIO)
---> Windowing (FixedWindow 1min)
---> BeamSql
---> KafkaSink (KafkaIO)We are using Spark Runner for this.
The BeamSql query is:select Col3, count(*) as count_col1 from PCOLLECTION GROUP BY Col3We are grouping by Col3 which is a string. It can hold values string[0-9].
The records are getting emitted out at 1 min to kafka sink, but the output record in kafka is not as expected.
Below is the output observed: (WST and WET are indicators for window start time and window end time){"count_col1":1,"Col3":"string5","WST":"2018-10-09 09-55-00 0000 +0000","WET":"2018-10-09 09-56-00 0000 +0000"} {"count_col1":3,"Col3":"string7","WST":"2018-10-09 09-55-00 0000 +0000","WET":"2018-10-09 09-56-00 0000 +0000"} {"count_col1":2,"Col3":"string8","WST":"2018-10-09 09-55-00 0000 +0000","WET":"2018-10-09 09-56-00 0000 +0000"} {"count_col1":1,"Col3":"string2","WST":"2018-10-09 09-55-00 0000 +0000","WET":"2018-10-09 09-56-00 0000 +0000"} {"count_col1":1,"Col3":"string6","WST":"2018-10-09 09-55-00 0000 +0000","WET":"2018-10-09 09-56-00 0000 +0000"} {"count_col1":0,"Col3":"string6","WST":"2018-10-09 09-55-00 0000 +0000","WET":"2018-10-09 09-56-00 0000 +0000"} {"count_col1":0,"Col3":"string6","WST":"2018-10-09 09-55-00 0000 +0000","WET":"2018-10-09 09-56-00 0000 +0000"} {"count_col1":0,"Col3":"string6","WST":"2018-10-09 09-55-00 0000 +0000","WET":"2018-10-09 09-56-00 0000 +0000"} {"count_col1":0,"Col3":"string6","WST":"2018-10-09 09-55-00 0000 +0000","WET":"2018-10-09 09-56-00 0000 +0000"} {"count_col1":0,"Col3":"string6","WST":"2018-10-09 09-55-00 0000 +0000","WET":"2018-10-09 09-56-00 0000 +0000"} {"count_col1":0,"Col3":"string6","WST":"2018-10-09 09-55-00 0000 +0000","WET":"2018-10-09 09-56-00 0}
Attachments
Issue Links
- links to