Uploaded image for project: 'Flink'
  1. Flink
  2. FLINK-10422

Follow AWS specs in Kinesis Consumer

    XMLWordPrintableJSON

    Details

      Description

      Related conversation in mailing list:

      https://lists.apache.org/thread.html/96de3bac9761564767cf283b58d664f5ae1b076e0c4431620552af5b@%3Cdev.flink.apache.org%3E

      Summary:

      Flink Kinesis consumer checks shards id for a particular pattern:

      "^shardId-\\d{12}"
      

      https://github.com/apache/flink/blob/master/flink-connectors/flink-connector-kinesis/src/main/java/org/apache/flink/streaming/connectors/kinesis/model/StreamShardHandle.java#L132

      While this inlines with current Kinesis streams server implementation (all streams follows this pattern), it confronts with AWS docs:

       

      ShardId
       The unique identifier of the shard within the stream.
       Type: String
       Length Constraints: Minimum length of 1. Maximum length of 128.
      Pattern: [a-zA-Z0-9_.-]+
       Required: Yes
      

       

      https://docs.aws.amazon.com/kinesis/latest/APIReference/API_Shard.html

      Intention:
      We have no guarantees and can't rely on patterns other than provided in AWS manifest.
      Any custom implementation of Kinesis mock should rely on AWS manifest which claims ShardID to be alfanums. This prevents anyone to use Flink with such kind of mocks.

      The reason behind the scene to use particular pattern "^shardId-d12" is to create Flink's custom Shard comparator, filter already seen shards, and pass latest shard for client.listShards only to limit the scope for RPC call to AWS.

      In the meantime, I think we can get rid of this logic at all. The current usage in project is:

        Attachments

          Issue Links

            Activity

              People

              • Assignee:
                eyushin eugen yushin
                Reporter:
                eyushin eugen yushin
              • Votes:
                0 Vote for this issue
                Watchers:
                2 Start watching this issue

                Dates

                • Created:
                  Updated:
                  Resolved:

                  Time Tracking

                  Estimated:
                  Original Estimate - Not Specified
                  Not Specified
                  Remaining:
                  Remaining Estimate - 0h
                  0h
                  Logged:
                  Time Spent - 10m
                  10m