Flume / FLUME-2796

File Channel with more than 1 TB of queued data files throws OutOfMemoryError (OOME) during replay

Details

    • Type: Question
    • Status: Open
    • Priority: Blocker
    • Resolution: Unresolved
    • Affects Version/s: 1.5.2
    • Fix Version/s: None
    • Component/s: File Channel
    • Labels: None
    • Environment: CDH 5.3, CentOS

    Description

      Due to some error, my Flume agent has queued 185,204 events (more than 1 TB in total, about 7.7 MB per event on average) in its file channel.
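As a quick sanity check on the figures above, the total queued volume can be estimated from the event count and the average event size. This is only a sketch: the function name is hypothetical, and treating 1 MB as 10**6 bytes is an assumption, since the report does not say which unit convention it uses.

```python
# Figures quoted in the report; treating 1 MB as 10**6 bytes is an assumption.
EVENT_COUNT = 185_204
AVG_EVENT_MB = 7.7


def total_queued_tb(event_count: int, avg_event_mb: float) -> float:
    """Estimate the queued volume in decimal terabytes (10**6 MB per TB)."""
    return event_count * avg_event_mb / 1_000_000


if __name__ == "__main__":
    # Prints "1.43 TB", consistent with "more than 1 TB" in the report.
    print(f"{total_queued_tb(EVENT_COUNT, AVG_EVENT_MB):.2f} TB")
```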

      I restarted the Flume agent with a larger JVM heap to let the file channel replay, but it failed with the following error:

      java.lang.OutOfMemoryError: Java heap space
              at com.google.protobuf.ByteString.copyFrom(ByteString.java:90)
              at com.google.protobuf.ByteString.copyFrom(ByteString.java:99)
              at com.google.protobuf.CodedInputStream.readBytes(CodedInputStream.java:294)
              at org.apache.flume.channel.file.proto.ProtosFactory$FlumeEvent$Builder.mergeFrom(ProtosFactory.java:5136)
              at org.apache.flume.channel.file.proto.ProtosFactory$FlumeEvent$Builder.mergeFrom(ProtosFactory.java:4950)
              at com.google.protobuf.CodedInputStream.readMessage(CodedInputStream.java:275)
              at org.apache.flume.channel.file.proto.ProtosFactory$Put$Builder.mergeFrom(ProtosFactory.java:3312)
              at org.apache.flume.channel.file.proto.ProtosFactory$Put$Builder.mergeFrom(ProtosFactory.java:3164)
              at com.google.protobuf.AbstractMessageLite$Builder.mergeFrom(AbstractMessageLite.java:212)
              at com.google.protobuf.AbstractMessage$Builder.mergeFrom(AbstractMessage.java:746)
              at com.google.protobuf.AbstractMessage$Builder.mergeFrom(AbstractMessage.java:238)
              at com.google.protobuf.AbstractMessageLite$Builder.mergeDelimitedFrom(AbstractMessageLite.java:282)
              at com.google.protobuf.AbstractMessage$Builder.mergeDelimitedFrom(AbstractMessage.java:760)
              at com.google.protobuf.AbstractMessageLite$Builder.mergeDelimitedFrom(AbstractMessageLite.java:288)
              at com.google.protobuf.AbstractMessage$Builder.mergeDelimitedFrom(AbstractMessage.java:752)
              at org.apache.flume.channel.file.proto.ProtosFactory$Put.parseDelimitedFrom(ProtosFactory.java:3121)
              at org.apache.flume.channel.file.Put.readProtos(Put.java:86)
              at org.apache.flume.channel.file.TransactionEventRecord.fromByteArray(TransactionEventRecord.java:201)
              at org.apache.flume.channel.file.LogFileV3$SequentialReader.doNext(LogFileV3.java:344)
              at org.apache.flume.channel.file.LogFile$SequentialReader.next(LogFile.java:498)
              at org.apache.flume.channel.file.ReplayHandler.replayLog(ReplayHandler.java:245)
              at org.apache.flume.channel.file.Log.doReplay(Log.java:435)
              at org.apache.flume.channel.file.Log.replay(Log.java:382)
      
      

      Settings in flume-env.sh

      JAVA_OPTS="-Xms40000m -Xmx40000m -Xss500k -XX:MaxDirectMemorySize=2000m
      -XX:+UseParNewGC -XX:+UseConcMarkSweepGC -XX:PermSize=256m -XX:MaxPermSize=512m -XX:-UseGCOverheadLimit"
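To put the heap setting in perspective, the sketch below extracts `-Xmx` from the options string and computes how many average-sized events would fit in that heap at once. The helper names are hypothetical, and the 7.7 MB average is taken as 7,700,000 bytes by assumption. The result is plain arithmetic; it says nothing about how many events replay actually keeps in memory, which the report does not establish.

```python
import re
from typing import Optional

# JAVA_OPTS as quoted in the report (GC and PermGen flags left out for brevity).
JAVA_OPTS = "-Xms40000m -Xmx40000m -Xss500k -XX:MaxDirectMemorySize=2000m"

# The JVM interprets k/m/g suffixes as binary multiples of 1024.
_UNIT_BYTES = {"": 1, "k": 1024, "m": 1024 ** 2, "g": 1024 ** 3}

# Assumption: "7.7 MB" in the report means 7.7 * 10**6 bytes.
AVG_EVENT_BYTES = 7_700_000


def max_heap_bytes(java_opts: str) -> Optional[int]:
    """Return the -Xmx value in bytes, or None if the option is absent."""
    match = re.search(r"(?<!\S)-Xmx(\d+)([kKmMgG]?)(?!\S)", java_opts)
    if match is None:
        return None
    number, unit = match.groups()
    return int(number) * _UNIT_BYTES[unit.lower()]


def events_that_fit(heap_bytes: int, avg_event_bytes: int) -> int:
    """Upper bound on average-sized event bodies that fit in the heap."""
    return heap_bytes // avg_event_bytes


if __name__ == "__main__":
    heap = max_heap_bytes(JAVA_OPTS)
    print(heap, events_that_fit(heap, AVG_EVENT_BYTES))
```

Under these assumptions the heap could hold roughly 5,400 average-sized event bodies, a small fraction of the 185,204 queued events.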
      

      Configuration for the file channel

      a1.channels.fc1.type = file
      a1.channels.fc1.dataDirs = ../../data
      a1.channels.fc1.checkpointDir = ../../check
      a1.channels.fc1.maxFileSize = 104857600
      a1.channels.fc1.capacity = 1000000
      a1.channels.fc1.transactionCapacity = 10000
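The configuration values can be related to the reported event sizes with some rough arithmetic. The sketch below parses the properties shown above and derives two figures: the payload of a hypothetical full transaction of average-sized events, and a lower bound on the number of data files needed for the queued volume. The parser is a minimal stand-in rather than Flume's own configuration loader, and the event count and average size come from the report, with 1 MB taken as 10**6 bytes by assumption. Whether replay ever holds a full transaction in memory is not stated in the report.

```python
import math

# File channel settings copied from the report.
CONFIG = """\
a1.channels.fc1.type = file
a1.channels.fc1.dataDirs = ../../data
a1.channels.fc1.checkpointDir = ../../check
a1.channels.fc1.maxFileSize = 104857600
a1.channels.fc1.capacity = 1000000
a1.channels.fc1.transactionCapacity = 10000
"""

# Figures from the report; the byte value of "7.7 MB" is an assumption.
TOTAL_EVENTS = 185_204
AVG_EVENT_BYTES = 7_700_000


def parse_properties(text: str) -> dict:
    """Parse simple 'key = value' lines, skipping blanks and # comments."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition("=")
        props[key.strip()] = value.strip()
    return props


props = parse_properties(CONFIG)
prefix = "a1.channels.fc1."
max_file_size = int(props[prefix + "maxFileSize"])        # 100 MiB per data file
tx_capacity = int(props[prefix + "transactionCapacity"])  # events per transaction

# Payload of one full transaction if every event had the average size.
full_tx_bytes = tx_capacity * AVG_EVENT_BYTES

# Lower bound on data files, ignoring per-record overhead and file headers.
min_data_files = math.ceil(TOTAL_EVENTS * AVG_EVENT_BYTES / max_file_size)

if __name__ == "__main__":
    print(f"full transaction payload: {full_tx_bytes / 1e9:.1f} GB")
    print(f"data files needed (lower bound): {min_data_files}")
```

With these inputs a full transaction of average-sized events would carry about 77 GB of payload, more than the 40000 MiB heap configured in flume-env.sh.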
      

      Is it possible to tune the Flume configuration or environment settings so that such a large volume of data files can be replayed?

          People

            Assignee: Unassigned
            Reporter: Max Lin (maxabcr2000)
            Votes: 0
            Watchers: 3
