This is a proposal to improve the performance of the new tailing source.
Taildir source reads a file 1 byte at a time, so its performance is very low compared to tailing with the exec source.
I tested many ways to improve performance and implemented the best one:
- Read the file in 8K blocks instead of 1 byte at a time.
- Handle data as raw bytes instead of ByteArrayDataOutput/ByteBuffer(direct)/.. for better performance.
- Don't convert bytes to strings and vice versa.
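The three points above can be sketched roughly as follows. This is a minimal illustration, not the actual patch: the class name `BlockLineReader`, the method `readLines`, and the helper `concat` are hypothetical, and the sketch assumes lines are terminated by `\n`. For simplicity it also returns a trailing line that has no newline yet, which a real tailing source would probably hold back until the line is complete.

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper for illustration only; not the Taildir implementation.
public class BlockLineReader {
    private static final int BLOCK_SIZE = 8192; // 8K block, as in the proposal
    private static final byte NEWLINE = (byte) '\n';

    // Appends src[from, to) to head and returns the result as a new byte array.
    private static byte[] concat(byte[] head, byte[] src, int from, int to) {
        byte[] out = Arrays.copyOf(head, head.length + (to - from));
        System.arraycopy(src, from, out, head.length, to - from);
        return out;
    }

    // Reads lines starting at the given file offset and returns them as raw
    // bytes, without ever converting to String.
    public static List<byte[]> readLines(String path, long offset) throws IOException {
        List<byte[]> lines = new ArrayList<>();
        byte[] block = new byte[BLOCK_SIZE];
        byte[] partial = new byte[0]; // bytes of a line that spans several blocks
        try (RandomAccessFile file = new RandomAccessFile(path, "r")) {
            file.seek(offset);
            int n;
            // One read call fetches up to 8K bytes instead of a single byte.
            while ((n = file.read(block)) != -1) {
                int lineStart = 0;
                for (int i = 0; i < n; i++) {
                    if (block[i] == NEWLINE) {
                        lines.add(concat(partial, block, lineStart, i));
                        partial = new byte[0];
                        lineStart = i + 1;
                    }
                }
                // Keep the unfinished tail of this block for the next iteration.
                partial = concat(partial, block, lineStart, n);
            }
        }
        if (partial.length > 0) {
            lines.add(partial); // simplification: unterminated last line is returned too
        }
        return lines;
    }
}
```

A caller would pass the last committed position as `offset`, for example `BlockLineReader.readLines("/path/to/file.log", 0L)`, and use each returned `byte[]` directly as the event body.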
Simple file reading test results:
- File size: 100 MB
- Line size: 500 bytes
Estimated time to read the file:
- Reading 1 byte at a time (using the code in Taildir): 32544 ms
- Reading in 8K blocks: 431 ms
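A comparison of this kind could be reproduced with something like the sketch below. It is not the test code used for the numbers above: the class `ReadTimingSketch` and its method names are hypothetical, and it assumes the 1-byte path is an unbuffered `RandomAccessFile.read()` call per byte. Actual timings depend on the machine, the file system, and the OS cache.

```java
import java.io.IOException;
import java.io.RandomAccessFile;

// Hypothetical timing harness for illustration only.
public class ReadTimingSketch {
    // Reads the whole file with one read() call per byte; returns bytes read.
    public static long readByteByByte(String path) throws IOException {
        long total = 0;
        try (RandomAccessFile file = new RandomAccessFile(path, "r")) {
            while (file.read() != -1) {
                total++;
            }
        }
        return total;
    }

    // Reads the whole file in blocks of blockSize bytes; returns bytes read.
    public static long readInBlocks(String path, int blockSize) throws IOException {
        long total = 0;
        byte[] block = new byte[blockSize];
        try (RandomAccessFile file = new RandomAccessFile(path, "r")) {
            int n;
            while ((n = file.read(block)) != -1) {
                total += n;
            }
        }
        return total;
    }

    public static void main(String[] args) throws IOException {
        String path = args[0]; // file to read, e.g. a 100 MB log file
        long t0 = System.nanoTime();
        long bytesSingle = readByteByByte(path);
        long t1 = System.nanoTime();
        long bytesBlock = readInBlocks(path, 8192);
        long t2 = System.nanoTime();
        System.out.printf("1 byte at a time: %d bytes in %d ms%n", bytesSingle, (t1 - t0) / 1_000_000);
        System.out.printf("8K blocks:        %d bytes in %d ms%n", bytesBlock, (t2 - t1) / 1_000_000);
    }
}
```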
When tested on Flume, it catches up with the performance of tailing with the exec source (a 30x performance boost).