Flume / FLUME-2176

SpoolDir Source throws 'File has changed size' exception although the file has not changed

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 1.4.0
    • Fix Version/s: 1.5.0
    • Component/s: Sinks+Sources
    • Labels:
      None

      Description

      I am using a script to generate files and then copy them one by one to the spooling directory. I get a 'File has changed size since being read' exception, but I am fairly sure the file was not changed after it was copied.

      23 Aug 2013 10:37:02,704 ERROR [pool-5-thread-1] (org.apache.flume.source.SpoolDirectorySource$SpoolDirectoryRunnable.run:173)  - Uncaught exception in Runnable
      java.lang.IllegalStateException: File has changed size since being read: /log/flume-ng/agent1/spooldir/spd1/log.00000029.20130822-121450171+0900.8729556570223344.seq
              at org.apache.flume.client.avro.ReliableSpoolingFileEventReader.retireCurrentFile(ReliableSpoolingFileEventReader.java:286)
              at org.apache.flume.client.avro.ReliableSpoolingFileEventReader.readEvents(ReliableSpoolingFileEventReader.java:226)
              at org.apache.flume.source.SpoolDirectorySource$SpoolDirectoryRunnable.run(SpoolDirectorySource.java:160)
              at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441)
              at java.util.concurrent.FutureTask$Sync.innerRunAndReset(FutureTask.java:317)
              at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:150)
              at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$101(ScheduledThreadPoolExecutor.java:98)
              at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.runPeriodic(ScheduledThreadPoolExecutor.java:180)
              at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:204)
              at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
              at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
              at java.lang.Thread.run(Thread.java:662)
      

      My configuration:

      agent1.sources = spd1
      agent1.sources.spd1.type = spooldir
      agent1.sources.spd1.spoolDir = /log/flume-ng/agent1/spooldir/spd1
      agent1.sources.spd1.deserializer.maxLineLength = 8192
      agent1.sources.spd1.channels = file1
      
      agent1.channels = file1
      agent1.channels.file1.type = file
      agent1.channels.file1.checkpointDir = /log/flume-ng/agent1/checkpoint
      agent1.channels.file1.dataDirs = /log/flume-ng/agent1/data
      agent1.channels.file1.capacity = 2000000
      agent1.channels.file1.transactionCapacity = 100
      
      agent1.sinks = avro1
      agent1.sinks.avro1.type = avro
      agent1.sinks.avro1.channel = file1
      agent1.sinks.avro1.hostname = remote_host
      agent1.sinks.avro1.port = 33333
      
      

        Activity

        yongkun Yongkun Wang added a comment -

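        I just realized that 'cp' is not atomic: the source can start reading the file while it is still being copied, so its size changes after the read has begun. I now either 'mv' the file into the spooling directory, or 'cp' it as a .tmp file and then 'mv' it to the spooling directory. So far this works well.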
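
The copy-then-move workaround from the comment can be sketched roughly as below. This is an illustration only, not the reporter's actual script: the function name `deliver_to_spool` and the separate staging directory are assumptions. The sketch also assumes the staging directory sits on the same filesystem as the spooling directory, because `mv` is only a single rename in that case; across filesystems it falls back to a copy and the original problem can return.

```shell
#!/bin/sh
# Hypothetical helper: place a file into a spooling directory in one step.
# Assumption: staging_dir is on the same filesystem as spool_dir and is NOT
# inside the directory watched by the spooldir source.

deliver_to_spool() {
    src=$1          # file produced by the generating script
    spool_dir=$2    # directory watched by the spooldir source
    staging_dir=$3  # scratch directory the source does not watch

    name=$(basename "$src")
    tmp="$staging_dir/$name.tmp"

    # Slow part: copy the data where the source is not looking.
    cp "$src" "$tmp" || return 1

    # Fast part: the complete file appears under its final name via rename.
    mv "$tmp" "$spool_dir/$name"
}
```

A call might look like `deliver_to_spool ./log.seq /log/flume-ng/agent1/spooldir/spd1 /log/flume-ng/agent1/staging`, where the staging path is a made-up example. Keeping the temporary file outside the spooling directory means the source never sees a partially written file.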


          People

          • Assignee:
            yongkun Yongkun Wang
            Reporter:
            yongkun Yongkun Wang