  1. Beam
  2. BEAM-2870

BQ Partitioned Table Write Fails When Destination has Partition Decorator

Details

    Description

      Dataflow Job ID: https://console.cloud.google.com/dataflow/job/2017-09-08_23_03_14-14637186041605198816

      Tagging reuvenlax as I believe he built the time partitioning integration that was merged into master.

      Background
      Our production pipeline ingests millions of events per day and routes them into our clients' numerous tables. To keep costs down, all of our tables are partitioned. However, because creating partitioned tables isn't supported in 2.1.0, we have to create the tables ourselves before any events can be processed. We've been looking forward to reuvenlax's partitioned table write feature (#3663) being merged into master for some time, as it will let us launch our client platforms much faster. Today we got around to testing the 2.2.0 nightly and discovered this bug.

      Issue
      Our pipeline writes to a destination that includes a partition decorator. When the destination is an existing partitioned table with a decorator, the write succeeds. When the destination is a partitioned table that doesn't exist yet and has no decorator, the write succeeds. However, when the destination is a partitioned table that doesn't exist yet and has a decorator, the write fails.
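
To make the failing case concrete: it is the only one of the three where a table has to be created and the destination still carries a decorator. A minimal sketch of separating the two parts of a table spec is shown here, assuming the decorator is everything after the first `$`. `TableSpecParts` is a hypothetical helper written for illustration and is not part of the Beam or BigQuery APIs.

```java
// Hypothetical helper (not a Beam API): splits a BigQuery table spec into
// the base table and an optional partition decorator.
final class TableSpecParts {
  final String baseTable;  // e.g. "PROJECT_ID:DATASET_ID.TABLE_ID"
  final String partition;  // e.g. "20170902", or null when there is no decorator

  private TableSpecParts(String baseTable, String partition) {
    this.baseTable = baseTable;
    this.partition = partition;
  }

  // Assumption: the decorator starts at the first '$' in the spec.
  static TableSpecParts parse(String tableSpec) {
    int idx = tableSpec.indexOf('$');
    if (idx < 0) {
      return new TableSpecParts(tableSpec, null);
    }
    return new TableSpecParts(tableSpec.substring(0, idx), tableSpec.substring(idx + 1));
  }
}
```

Under this reading, a create call would target `baseTable`, while the load or insert would keep the full decorated spec so rows still land in the intended partition.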

      Example Implementation

      BigQueryIO.writeTableRows()
        .withCreateDisposition(BigQueryIO.Write.CreateDisposition.CREATE_IF_NEEDED)
        .withWriteDisposition(BigQueryIO.Write.WriteDisposition.WRITE_APPEND)
        .withFailedInsertRetryPolicy(InsertRetryPolicy.alwaysRetry())
        .to(new DynamicDestinations<TableRow, String>() {
      
          // Every element is routed to the same table spec, which includes
          // the partition decorator ($20170902).
          @Override
          public String getDestination(ValueInSingleWindow<TableRow> element) {
            return "PROJECT_ID:DATASET_ID.TABLE_ID$20170902";
          }
      
          // The decorated spec is passed through unchanged, together with
          // day-based time partitioning for table creation.
          @Override
          public TableDestination getTable(String destination) {
            TimePartitioning dayPartition = new TimePartitioning().setType("DAY");
            return new TableDestination(destination, null, dayPartition);
          }
      
          @Override
          public TableSchema getSchema(String destination) {
            return TABLE_SCHEMA;
          }
        })
      

      Relevant Logs & Errors in StackDriver

      23:06:26.790 
      Trying to create BigQuery table: PROJECT_ID:DATASET_ID.TABLE_ID$20170902
      
      23:06:26.873 
      Invalid table ID "TABLE_ID$20170902". Table IDs must be alphanumeric (plus underscores) and must be at most 1024 characters long. Also, Table decorators cannot be used.
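
The rule quoted in the error can be checked locally before a create call is attempted. The following is a rough sketch that approximates the rule exactly as the message states it (ASCII letters, digits, underscores, at most 1024 characters). `TableIdCheck` is a hypothetical name, and the real server-side validation may differ.

```java
import java.util.regex.Pattern;

// Hypothetical check (not a Beam or BigQuery API): approximates the table ID
// rule as worded in the error message above.
final class TableIdCheck {
  // Assumption: "alphanumeric (plus underscores)" means ASCII [A-Za-z0-9_],
  // and the ID must be between 1 and 1024 characters long.
  private static final Pattern VALID_TABLE_ID = Pattern.compile("[A-Za-z0-9_]{1,1024}");

  static boolean isValidTableId(String tableId) {
    return VALID_TABLE_ID.matcher(tableId).matches();
  }
}
```

With this check, `TABLE_ID$20170902` is rejected because of the `$`, while `TABLE_ID` passes, which matches the behavior reported in the log.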
      

          People

            jkff Eugene Kirpichov
            SJAnderson Steven Jon Anderson