  1. Flink
  2. FLINK-29854

Make Record Size Flush Strategy Optional for Async Sink

Details

    • Type: Improvement
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: Connectors / Common
    • Labels: None

    Description

      Background

      Currently AsyncSinkWriter supports three mechanisms that trigger a flush to the destination:

      • Time based
      • Batch size in bytes
      • Number of records in the batch

      For the "batch size in bytes" trigger, the sink implementer must implement getSizeInBytes so that the base writer can calculate the total batch size. In some cases, computing the batch size within the AsyncSinkWriter is expensive or not possible. For example, the DynamoDB connector needs to determine the serialized size of DynamoDbWriteRequest (https://github.com/apache/flink-connector-dynamodb/pull/1/files#r1012223894).
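
      The three triggers can be illustrated with a minimal sketch. This is a simplified, hypothetical model and not the actual AsyncSinkWriter code: the class ThreeTriggerBuffer, its fields, and the onTimer method are assumptions made for illustration, and the sizeInBytes function only stands in for getSizeInBytes.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.ToLongFunction;

// Hypothetical, simplified model of a batching writer.
// It is NOT the Flink AsyncSinkWriter API; all names here are illustrative.
class ThreeTriggerBuffer<T> {
    private final int maxBatchSize;              // trigger: number of records in the batch
    private final long maxBatchSizeInBytes;      // trigger: batch size in bytes
    private final long maxTimeInBufferMs;        // trigger: time based
    private final ToLongFunction<T> sizeInBytes; // stands in for getSizeInBytes
    private final List<T> buffer = new ArrayList<>();
    private long bufferedBytes = 0;
    private long oldestRecordTimeMs = -1;
    int flushCount = 0;                          // exposed so the behaviour can be observed

    ThreeTriggerBuffer(int maxBatchSize, long maxBatchSizeInBytes,
                       long maxTimeInBufferMs, ToLongFunction<T> sizeInBytes) {
        this.maxBatchSize = maxBatchSize;
        this.maxBatchSizeInBytes = maxBatchSizeInBytes;
        this.maxTimeInBufferMs = maxTimeInBufferMs;
        this.sizeInBytes = sizeInBytes;
    }

    // Buffers a record and flushes if the count or byte threshold is reached.
    void write(T record, long nowMs) {
        if (buffer.isEmpty()) {
            oldestRecordTimeMs = nowMs;
        }
        buffer.add(record);
        // The size must be computed for every record, which is the costly step.
        bufferedBytes += sizeInBytes.applyAsLong(record);
        if (buffer.size() >= maxBatchSize || bufferedBytes >= maxBatchSizeInBytes) {
            flush();
        }
    }

    // Called periodically; flushes if the oldest record has waited long enough.
    void onTimer(long nowMs) {
        if (!buffer.isEmpty() && nowMs - oldestRecordTimeMs >= maxTimeInBufferMs) {
            flush();
        }
    }

    // A real writer would send the batch to the destination here.
    void flush() {
        flushCount++;
        buffer.clear();
        bufferedBytes = 0;
        oldestRecordTimeMs = -1;
    }
}
```

      In this model the size function runs once per record on the write path, which is why an expensive or unavailable size computation is a problem for the connector.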

      Scope

      Add a feature to make "size in bytes" support optional. This includes:

      • Connectors will not be required to implement getSizeInBytes
      • Batches will not be validated for max size
      • Records will not be validated for size
      • Batches will not be flushed when max size is exceeded

      The sink implementer can decide if it is appropriate to enable this feature.
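
      One possible shape for this, sketched under assumptions: the size function becomes optional, and when it is absent the writer skips record validation, batch size accounting, and the size-based flush. The class OptionalSizeBuffer and the factory methods withSizeChecks and withoutSizeChecks are hypothetical names, not part of Flink, and the time-based trigger is omitted for brevity.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.ToLongFunction;

// Hypothetical sketch of an opt-in size strategy; not the Flink API.
class OptionalSizeBuffer<T> {
    private final int maxBatchSize;
    private final long maxBatchSizeInBytes;
    private final long maxRecordSizeInBytes;
    private final ToLongFunction<T> sizeInBytes; // null means size checks are disabled
    private final List<T> buffer = new ArrayList<>();
    private long bufferedBytes = 0;
    int flushCount = 0;

    private OptionalSizeBuffer(int maxBatchSize, long maxBatchSizeInBytes,
                               long maxRecordSizeInBytes, ToLongFunction<T> sizeInBytes) {
        this.maxBatchSize = maxBatchSize;
        this.maxBatchSizeInBytes = maxBatchSizeInBytes;
        this.maxRecordSizeInBytes = maxRecordSizeInBytes;
        this.sizeInBytes = sizeInBytes;
    }

    // Size-aware variant: the implementer supplies a size function.
    static <T> OptionalSizeBuffer<T> withSizeChecks(int maxBatchSize, long maxBatchSizeInBytes,
                                                    long maxRecordSizeInBytes,
                                                    ToLongFunction<T> sizeInBytes) {
        return new OptionalSizeBuffer<>(maxBatchSize, maxBatchSizeInBytes,
                maxRecordSizeInBytes, sizeInBytes);
    }

    // Size-agnostic variant: no size function is required.
    static <T> OptionalSizeBuffer<T> withoutSizeChecks(int maxBatchSize) {
        return new OptionalSizeBuffer<>(maxBatchSize, Long.MAX_VALUE, Long.MAX_VALUE, null);
    }

    void write(T record) {
        if (sizeInBytes != null) {
            long size = sizeInBytes.applyAsLong(record);
            // Records are validated for size only when the strategy is enabled.
            if (size > maxRecordSizeInBytes) {
                throw new IllegalArgumentException("Record exceeds max record size: " + size);
            }
            bufferedBytes += size;
        }
        buffer.add(record);
        boolean countReached = buffer.size() >= maxBatchSize;
        boolean bytesReached = sizeInBytes != null && bufferedBytes >= maxBatchSizeInBytes;
        if (countReached || bytesReached) {
            flush();
        }
    }

    // A real writer would send the batch to the destination here.
    void flush() {
        flushCount++;
        buffer.clear();
        bufferedBytes = 0;
    }
}
```

      With withoutSizeChecks, only the record count (and, in a full writer, the timer) bounds a batch, so the implementer should enable it only when the destination tolerates batches of unknown byte size.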

          People

            chalixar Ahmed Hamdy
            dannycranmer Danny Cranmer