  1. Beam
  2. BEAM-3683

Support BigQuery column-based time partitioning

Details

    • Type: Bug
    • Status: Resolved
    • Priority: P2
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 2.4.0
    • Component/s: io-java-gcp
    • Labels: None

    Description

      BigQuery now supports tables partitioned by a DATE or TIMESTAMP column. This is very useful for backfilling: it no longer requires one load job per partition (a single load job for the whole table is enough), and with BigQueryIO.write() it no longer requires DynamicDestinations. One only needs to specify which field to partition on.

      It is specified via TimePartitioning.field: https://cloud.google.com/bigquery/docs/reference/rest/v2/jobs#configuration.load (configuration.load.timePartitioning.field).
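
      As a rough illustration of where that field sits, the sketch below builds the request body for a single load job that targets a column-partitioned table. It uses only the Python standard library and sends nothing to BigQuery. The helper name build_load_config, the project, dataset, table, bucket, and column names are all hypothetical, and schema handling is omitted for brevity.

```python
import json


def build_load_config(project_id, dataset_id, table_id, source_uris, partition_field):
    """Build a jobs.insert request body for one load job (illustrative helper)."""
    return {
        "configuration": {
            "load": {
                "destinationTable": {
                    "projectId": project_id,
                    "datasetId": dataset_id,
                    "tableId": table_id,
                },
                "sourceUris": list(source_uris),
                "sourceFormat": "NEWLINE_DELIMITED_JSON",
                "writeDisposition": "WRITE_APPEND",
                # Column-based partitioning: rows are assigned to partitions
                # by the value of this DATE or TIMESTAMP column.
                "timePartitioning": {
                    "type": "DAY",
                    "field": partition_field,
                },
            }
        }
    }


if __name__ == "__main__":
    # All identifiers below are placeholders, not real resources.
    body = build_load_config(
        "my-project",
        "my_dataset",
        "events",
        ["gs://my-bucket/backfill/*.json"],
        "event_ts",
    )
    print(json.dumps(body, indent=2))
```

      Because the partition is derived from the column value, one request of this shape can cover a backfill spanning many days. The issue proposes passing the equivalent TimePartitioning setting through BigQueryIO.write().withTimePartitioning().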

      It seems that the only thing needed is to update the BigQuery client. Users can then use BigQueryIO.write().withTimePartitioning() in some cases where they previously needed write().to(DynamicDestinations).

      Plus publicity (e.g. a StackOverflow answer)

      CC: reuvenlax chamikara

            People

              Assignee: jkff Eugene Kirpichov
              Reporter: jkff Eugene Kirpichov
              Votes: 0
              Watchers: 3

                Time Tracking

                  Estimated: Not Specified
                  Remaining: 0h
                  Logged: 1h 10m