  1. Spark
  2. SPARK-21590

Structured Streaming window start time should support negative values to adjust time zone

Details

    Description

      I want to calculate the (unique) daily access count using Structured Streaming (2.2.0).
      Currently, a Structured Streaming window with a 1 day duration starts at
      00:00:00 UTC and ends at 23:59:59 UTC each day, but my local timezone is CST (UTC + 8 hours) and I
      want the date boundaries to be at 00:00:00 CST (that is, 8 hours before 00:00:00 UTC, or 16:00:00 UTC of the previous day).
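
To make the requested boundaries concrete, here is a minimal sketch of tumbling-window assignment with an offset. It only illustrates the arithmetic and is not Spark's or Flink's implementation; the helper name `tumbling_window` and the sample timestamp are assumptions chosen for the example.

```python
from datetime import datetime, timedelta, timezone

UTC = timezone.utc
CST = timezone(timedelta(hours=8))          # the reporter's local zone, UTC + 8
EPOCH = datetime(1970, 1, 1, tzinfo=UTC)
DAY = timedelta(days=1)


def tumbling_window(ts, size=DAY, offset=timedelta(0)):
    """Return the half-open [start, end) window of length `size` containing `ts`.

    Hypothetical helper: windows are aligned to the epoch shifted by `offset`.
    """
    # Time elapsed since the most recent window boundary at or before ts.
    elapsed = (ts - EPOCH - offset) % size
    start = ts - elapsed
    return start, start + size


# An access at 01:30 local time on 2017-08-01 (17:30 UTC on 2017-07-31).
event = datetime(2017, 8, 1, 1, 30, tzinfo=CST)

# Offset 0: the event lands in the UTC day 2017-07-31.
utc_start, utc_end = tumbling_window(event)

# Offset -8 hours: the window starts at local midnight of 2017-08-01.
cst_start, cst_end = tumbling_window(event, offset=timedelta(hours=-8))

print(utc_start.isoformat(), utc_end.isoformat())
print(cst_start.astimezone(CST).isoformat(), cst_end.astimezone(CST).isoformat())
```

With an offset of 0 the event is counted toward 2017-07-31 (the UTC day), whereas with an offset of -8 hours it is counted toward 2017-08-01 in local time, which is the behavior asked for here.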

      In Flink I can set the window offset to -8 hours to achieve this, but in Structured Streaming, if I set the start time (the equivalent of the offset in Flink) to -8 hours or any other negative value, I get the following error:

      Exception in thread "main" org.apache.spark.sql.AnalysisException: cannot resolve 'timewindow(timestamp, 86400000000, 86400000000, -28800000000)' due to data type mismatch: The start time (-28800000000) must be greater than or equal to 0.;;
      

      because the time window checks the input parameters to guarantee each value is greater than or equal to 0.

      Could we remove the restriction that the start time cannot be negative?
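
As a sketch of why lifting the restriction should be safe, the arithmetic below shows that a negative start time can be folded into the range [0, slide) without moving any window boundary. The constants are the microsecond values from the error message above; the helpers `window_start` and `normalize_start_time` are hypothetical and are not part of Spark's API.

```python
DAY_US = 86_400_000_000          # window and slide duration from the error message
START_TIME_US = -28_800_000_000  # the rejected start time (-8 hours)


def window_start(ts_us, size_us, start_time_us):
    """Start of the tumbling window containing ts_us (all values in microseconds)."""
    # Python's % returns a non-negative result for a positive divisor,
    # so this also works when start_time_us is negative.
    return ts_us - (ts_us - start_time_us) % size_us


def normalize_start_time(start_time_us, slide_us):
    """Map any start time to the equivalent value in [0, slide_us)."""
    return start_time_us % slide_us


normalized = normalize_start_time(START_TIME_US, DAY_US)

# Sample timestamps spread across several days, one every 37 minutes.
samples = [1_501_000_000_000_000 + i * 37 * 60 * 1_000_000 for i in range(200)]

same_windows = all(
    window_start(t, DAY_US, START_TIME_US) == window_start(t, DAY_US, normalized)
    for t in samples
)
print(normalized, same_windows)
```

Here -8 hours normalizes to 57,600,000,000 microseconds, which is 16 hours. Under this arithmetic, a start time of 16 hours should produce the same daily boundaries as -8 hours, so it may serve as a workaround until negative values are accepted. That workaround is an assumption drawn from the arithmetic above and has not been verified against Spark 2.2.0.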

          People

            KevinZwx Kevin Zhang