Below are answers to the questions posted to the mailing lists.
> Btw., because the position in the file is checkpointed periodically, does
> that mean that it is possible that, after a restart, some number of lines
> that have already been tailed, will be read again?
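Yes, that is possible for lines tailed after the most recent checkpoint. Lines up to the recorded position will not be read again.
On restart this source resumes reading from the last read position stored in the position file.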
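
As a rough illustration of resuming from a recorded position, here is a minimal Java sketch. It is not the code of this source: the class name ResumeFromPosition and the method readFrom are hypothetical, and the byte offset is passed in directly rather than loaded from a position file.

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

public class ResumeFromPosition {
    // Hypothetical helper: returns every line found at or after the given
    // byte offset. Lines that end before the offset are skipped.
    public static List<String> readFrom(Path file, long position) throws IOException {
        List<String> lines = new ArrayList<>();
        try (RandomAccessFile raf = new RandomAccessFile(file.toFile(), "r")) {
            raf.seek(position);            // jump to the recorded offset
            String line;
            while ((line = raf.readLine()) != null) {
                lines.add(line);           // readLine strips the line terminator
            }
        }
        return lines;
    }

    public static void main(String[] args) throws IOException {
        Path log = Files.createTempFile("access", ".log");
        Files.write(log, "a\nb\nc\n".getBytes(StandardCharsets.US_ASCII));
        // Assume the recorded position is 2, i.e. just after the line "a".
        System.out.println(readFrom(log, 2L));
        Files.delete(log);
    }
}
```

With a recorded offset of 2, the line "a" is skipped and reading continues with "b".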
> - How does it know when to stop tailing the current file and switch to or start tailing another file
> - When there is a backlog of many files being built up... how does it order the files for consumption
This source does not guarantee any ordering among files, because it is meant to tail lines appended to files in near real time.
If there is a backlog of many files at start-up, one file is selected arbitrarily and read to EOF, then the next file is selected in the same way.
With the 'skipToEnd' property, it can also start tailing from the EOF of the current files.
> - Sounds like there is some C/C++ native code + JNI to work with inodes ? what api are you using.
No native code or JNI is involved. This source uses java.nio.file.Files.getAttribute() from the Java 7 API to identify the inode of a file.
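
A minimal sketch of such a lookup follows. The class name InodeLookup and the method inodeOf are hypothetical, and the attribute name "unix:ino" is an assumption about how the call is used here. That attribute is only available on Unix-like file systems.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

public class InodeLookup {
    // Hypothetical helper: returns the inode number of the given file.
    // "unix:ino" is only supported on Unix-like file systems; elsewhere
    // Files.getAttribute() throws UnsupportedOperationException.
    public static long inodeOf(Path file) throws IOException {
        return (Long) Files.getAttribute(file, "unix:ino");
    }

    public static void main(String[] args) throws IOException {
        Path log = Files.createTempFile("access", ".log");
        long before = inodeOf(log);

        // Rename the file within the same directory, as a log rotation would.
        Path rotated = log.resolveSibling(log.getFileName() + ".1");
        Files.move(log, rotated);

        // Compare the inode before and after the rename.
        System.out.println(before == inodeOf(rotated));
        Files.delete(rotated);
    }
}
```

Because a rename within the same file system normally keeps the inode, the inode can identify a file that is already being tailed even after its name changes.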
> - does it auto delete the consumed files ?
No. Consumed files do not need to be deleted with this source. The files to be tailed and the read position of each file are recorded in the position file.
For example, the log file of an application, such as /var/log/app/access.log, can be specified directly in flume.conf.