  1. Flume
  2. FLUME-2796

File Channel which queued more than 1TB data files got OOME when doing replay


    Details

    • Type: Question
    • Status: Open
    • Priority: Blocker
    • Resolution: Unresolved
    • Affects Version/s: 1.5.2
    • Fix Version/s: None
    • Component/s: File Channel
    • Labels:
      None
    • Environment:

      CDH 5.3
      CentOS

      Description

      Due to some error, my Flume agent has queued 185204 events (more than 1 TB, about 7.7 MB per event on average) in its file channel.
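
As a rough sanity check on these figures, here is a minimal Python sketch that multiplies the event count by the average event size. It assumes "7.7 MB" means 7.7 × 10^6 bytes (the report does not say whether MB or MiB is meant), and the variable names are illustrative only.

```python
# Figures taken from the report; reading "MB" as decimal megabytes is an assumption.
EVENT_COUNT = 185_204
AVG_EVENT_BYTES = 7_700_000  # "about 7.7 MB" per event

# Total payload queued in the file channel, ignoring per-record overhead.
total_bytes = EVENT_COUNT * AVG_EVENT_BYTES
total_tb = total_bytes / 10**12

print(f"queued data: {total_bytes} bytes (~{total_tb:.2f} TB)")
```

Under that assumption the total comes to roughly 1.43 TB, which is consistent with "more than 1 TB".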

      I tried restarting the Flume agent with a larger JVM heap to let the file channel replay, but got the following error:

      java.lang.OutOfMemoryError: Java heap space
              at com.google.protobuf.ByteString.copyFrom(ByteString.java:90)
              at com.google.protobuf.ByteString.copyFrom(ByteString.java:99)
              at com.google.protobuf.CodedInputStream.readBytes(CodedInputStream.java:294)
              at org.apache.flume.channel.file.proto.ProtosFactory$FlumeEvent$Builder.mergeFrom(ProtosFactory.java:5136)
              at org.apache.flume.channel.file.proto.ProtosFactory$FlumeEvent$Builder.mergeFrom(ProtosFactory.java:4950)
              at com.google.protobuf.CodedInputStream.readMessage(CodedInputStream.java:275)
              at org.apache.flume.channel.file.proto.ProtosFactory$Put$Builder.mergeFrom(ProtosFactory.java:3312)
              at org.apache.flume.channel.file.proto.ProtosFactory$Put$Builder.mergeFrom(ProtosFactory.java:3164)
              at com.google.protobuf.AbstractMessageLite$Builder.mergeFrom(AbstractMessageLite.java:212)
              at com.google.protobuf.AbstractMessage$Builder.mergeFrom(AbstractMessage.java:746)
              at com.google.protobuf.AbstractMessage$Builder.mergeFrom(AbstractMessage.java:238)
              at com.google.protobuf.AbstractMessageLite$Builder.mergeDelimitedFrom(AbstractMessageLite.java:282)
              at com.google.protobuf.AbstractMessage$Builder.mergeDelimitedFrom(AbstractMessage.java:760)
              at com.google.protobuf.AbstractMessageLite$Builder.mergeDelimitedFrom(AbstractMessageLite.java:288)
              at com.google.protobuf.AbstractMessage$Builder.mergeDelimitedFrom(AbstractMessage.java:752)
              at org.apache.flume.channel.file.proto.ProtosFactory$Put.parseDelimitedFrom(ProtosFactory.java:3121)
              at org.apache.flume.channel.file.Put.readProtos(Put.java:86)
              at org.apache.flume.channel.file.TransactionEventRecord.fromByteArray(TransactionEventRecord.java:201)
              at org.apache.flume.channel.file.LogFileV3$SequentialReader.doNext(LogFileV3.java:344)
              at org.apache.flume.channel.file.LogFile$SequentialReader.next(LogFile.java:498)
              at org.apache.flume.channel.file.ReplayHandler.replayLog(ReplayHandler.java:245)
              at org.apache.flume.channel.file.Log.doReplay(Log.java:435)
              at org.apache.flume.channel.file.Log.replay(Log.java:382)
      
      

      Settings in flume-env.sh:

      JAVA_OPTS="-Xms40000m -Xmx40000m -Xss500k -XX:MaxDirectMemorySize=2000m
      -XX:+UseParNewGC -XX:+UseConcMarkSweepGC -XX:PermSize=256m -XX:MaxPermSize=512m -XX:-UseGCOverheadLimit"
      

      Configuration for the file channel:

      a1.channels.fc1.type = file
      a1.channels.fc1.dataDirs = ../../data
      a1.channels.fc1.checkpointDir = ../../check
      a1.channels.fc1.maxFileSize = 104857600
      a1.channels.fc1.capacity = 1000000
      a1.channels.fc1.transactionCapacity = 10000
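
To put these settings in proportion with the queued data, the sketch below compares the configured sizes with the reported event volume. It is only back-of-the-envelope arithmetic under stated assumptions: "7.7 MB" is read as 7.7 × 10^6 bytes, per-record overhead is ignored, and each data file is assumed to hold at most `maxFileSize` bytes. It does not show that replay keeps a full transaction of event bodies in memory; it only compares magnitudes.

```python
# Values from the report's configuration and description.
MAX_FILE_SIZE = 104_857_600         # a1.channels.fc1.maxFileSize (100 MiB)
TRANSACTION_CAPACITY = 10_000       # a1.channels.fc1.transactionCapacity
HEAP_BYTES = 40_000 * 1024 * 1024   # -Xmx40000m
EVENT_COUNT = 185_204
AVG_EVENT_BYTES = 7_700_000         # assumption: decimal megabytes


def ceil_div(a: int, b: int) -> int:
    """Integer division rounded up."""
    return -(-a // b)


total_bytes = EVENT_COUNT * AVG_EVENT_BYTES

# Rough lower bound on the number of data files the replay has to read.
min_data_files = ceil_div(total_bytes, MAX_FILE_SIZE)

# Size of one transaction if it were filled with average-sized events.
full_txn_bytes = TRANSACTION_CAPACITY * AVG_EVENT_BYTES

print(f"data files (lower bound): {min_data_files}")
print(f"full transaction: {full_txn_bytes} bytes, heap: {HEAP_BYTES} bytes")
print(f"full transaction exceeds heap: {full_txn_bytes > HEAP_BYTES}")
```

With these numbers, a transaction filled to `transactionCapacity` with average-sized events would be about 77 GB, larger than the 40000 MB heap, and the replay would have to read on the order of 13,600 data files.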
      

      Is it possible to tune the Flume configuration or environment settings so that such a large volume of data files can be replayed?

            People

            • Assignee:
              Unassigned
            • Reporter:
              maxabcr2000 Max Lin
            • Votes:
              0
            • Watchers:
              3