Flume / FLUME-3118

S3 URLs do not find the correct region.


Details

    • Type: Improvement
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: Sinks+Sources
    • Labels: None

    Description

      I am trying to write to S3 through the HDFS sink, but I am running into hurdles at every turn. I need to push to S3 without access/secret Amazon keys, authenticating through the underlying instance profile instead. I also need to add the AWS encryption header for AES256. When I use a base path of `s3://something.us-east-2.something/else`, the request is signed for the wrong region and fails with `<Error><Code>AuthorizationHeaderMalformed</Code><Message>The authorization header is malformed; the region 'us-east-1' is wrong; expecting 'us-east-2'</Message><Region>us-east-2</Region><RequestId>N/A</RequestId><HostId>N/A</HostId></Error>`

      Here is my flume config:
      ```
      tier1.sources = source1
      tier1.channels = channel1
      tier1.sinks = sink1

      tier1.sources.source1.type = org.apache.flume.source.kafka.KafkaSource
      tier1.sources.source1.zookeeperConnect = localhost:2181
      tier1.sources.source1.topic = lynch

      # tier1.sources.source1.groupId = flume
      tier1.sources.source1.channels = channel1
      tier1.sources.source1.interceptors = i1
      tier1.sources.source1.interceptors.i1.type = timestamp
      tier1.sources.source1.kafka.consumer.timeout.ms = 100

      tier1.channels.channel1.type = memory
      #tier1.channels.channel1.capacity = 10000
      #tier1.channels.channel1.transactionCapacity = 1000

      tier1.sinks.sink1.type = hdfs
      tier1.sinks.sink1.hdfs.path = s3://something.us-east-2.something/else
      tier1.sinks.sink1.hdfs.rollInterval = 5
      tier1.sinks.sink1.hdfs.rollSize = 0
      tier1.sinks.sink1.hdfs.rollCount = 0
      tier1.sinks.sink1.hdfs.fileType = DataStream
      tier1.sinks.sink1.channel = channel1
      ```

      Here is the command to run it:
      ```
      bin/flume-ng agent -c . -f kafka-source.conf -n tier1
      ```

      It should not be this difficult to push to S3. Support for `s3://` addresses and instance profiles needs to be added. I have tried many permutations to get this working, and I would really like to see Flume become friendlier in these situations.
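
      For illustration, the three requirements above (regional endpoint, instance-profile credentials, AES256 encryption) map onto properties of Hadoop's S3A connector. The fragment below is only a sketch and has not been verified against this agent. It assumes the `hadoop-aws` and AWS SDK jars are on the Flume classpath, that the Hadoop version in use supports these S3A properties, that the file is a `core-site.xml` visible to the agent, and that `hdfs.path` is changed to an `s3a://` URL (for example the hypothetical `s3a://example-bucket/else`).

      ```xml
      <!-- core-site.xml fragment (sketch; values are assumptions, not tested here) -->
      <configuration>
        <!-- Regional endpoint, so requests are signed for us-east-2 rather than us-east-1 -->
        <property>
          <name>fs.s3a.endpoint</name>
          <value>s3.us-east-2.amazonaws.com</value>
        </property>
        <!-- Take credentials from the instance profile instead of access/secret keys -->
        <property>
          <name>fs.s3a.aws.credentials.provider</name>
          <value>com.amazonaws.auth.InstanceProfileCredentialsProvider</value>
        </property>
        <!-- Request server-side encryption with AES256 on upload -->
        <property>
          <name>fs.s3a.server-side-encryption-algorithm</name>
          <value>AES256</value>
        </property>
      </configuration>
      ```

      Even if something like this works, it remains a Hadoop-level workaround. The request here is for the sink to handle `s3://` paths, regions, and instance profiles without extra configuration.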

          People

            Assignee: Unassigned
            Reporter: alex balzer