Uploaded image for project: 'Beam'
  1. Beam
  2. BEAM-7755

BigQuery Repeated Records do not seem to work

Attach filesAttach ScreenshotVotersWatch issueWatchersCreate sub-taskLinkCloneUpdate Comment AuthorReplace String in CommentUpdate Comment VisibilityDelete Comments
    XMLWordPrintableJSON

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: P2
    • Resolution: Fixed
    • Affects Version/s: 2.12.0, 2.13.0
    • Fix Version/s: 2.15.0
    • Component/s: io-java-avro, io-java-gcp
    • Labels:
      None

      Description

      When translating BigQuery rows to beam rows, specifically using theĀ  BigQueryUtils.toBeamRow(record, beamSchema) method, REPEATEDĀ RECORDS causes an error. This seems to be caused that avro arrays are thought to only have primitive types but these are arrays with a ROW type:

      Caused by: java.lang.RuntimeException: ROW is not primitive type.
      	at org.apache.beam.sdk.io.gcp.bigquery.BigQueryUtils.convertAvroPrimitiveTypes(BigQueryUtils.java:467)
      	at org.apache.beam.sdk.io.gcp.bigquery.BigQueryUtils.convertAvroArray(BigQueryUtils.java:427)
      	at org.apache.beam.sdk.io.gcp.bigquery.BigQueryUtils.convertAvroFormat(BigQueryUtils.java:373)
      	at org.apache.beam.sdk.io.gcp.bigquery.BigQueryUtils.lambda$toBeamRow$2(BigQueryUtils.java:222)
      	at java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:193)
      	at java.util.ArrayList$ArrayListSpliterator.forEachRemaining(ArrayList.java:1376)
      	at java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:481)
      	at java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:471)
      	at java.util.stream.ReduceOps$ReduceOp.evaluateSequential(ReduceOps.java:708)
      	at java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234)
      	at java.util.stream.ReferencePipeline.collect(ReferencePipeline.java:499)
      	at org.apache.beam.sdk.io.gcp.bigquery.BigQueryUtils.toBeamRow(BigQueryUtils.java:223)
      	at com.spotify.data.sql.RowSource.lambda$bigquery$120a5f9f$1(RowSource.java:55)
      	at org.apache.beam.sdk.io.gcp.bigquery.BigQuerySourceBase$1.apply(BigQuerySourceBase.java:242)
      	at org.apache.beam.sdk.io.gcp.bigquery.BigQuerySourceBase$1.apply(BigQuerySourceBase.java:235)
      	at org.apache.beam.sdk.io.AvroSource$AvroBlock.readNextRecord(AvroSource.java:597)
      	at org.apache.beam.sdk.io.BlockBasedSource$BlockBasedReader.readNextRecord(BlockBasedSource.java:209)
      	at org.apache.beam.sdk.io.FileBasedSource$FileBasedReader.advanceImpl(FileBasedSource.java:484)
      	at org.apache.beam.sdk.io.FileBasedSource$FileBasedReader.startImpl(FileBasedSource.java:479)
      	at org.apache.beam.sdk.io.OffsetBasedSource$OffsetBasedReader.start(OffsetBasedSource.java:249)
      	at org.apache.beam.runners.dataflow.worker.WorkerCustomSources$BoundedReaderIterator.start(WorkerCustomSources.java:601)
      

        Attachments

          Activity

            People

            • Assignee:
              snallapa Sahith Nallapareddy
              Reporter:
              snallapa Sahith Nallapareddy

              Dates

              • Created:
                Updated:
                Resolved:

                Time Tracking

                Estimated:
                Original Estimate - Not Specified
                Not Specified
                Remaining:
                Remaining Estimate - 0h
                0h
                Logged:
                Time Spent - 2.5h
                2.5h

                  Issue deployment