  1. Commons IO
  2. IO-523

Do not reload the entire file when a tailed file's length and position are the same but the file is newer

Details

    • Type: Improvement
    • Status: Open
    • Priority: Minor
    • Resolution: Unresolved
    • Affects Version/s: 2.5
    • Fix Version/s: None
    • Component/s: Streams/Writers
    • Labels: None
    • Environment: Windows 10

    • Flags: Patch

    Description

      In the Tailer class, when the file length is equal to the position and the file is newer, the following branch is executed:

      org.apache.commons.io.input.Tailer.java
      // ----------- Lines 461 - 472 --------------
      // ...
      else if (newer) {
        /*
         * This can happen if the file is truncated or overwritten with the exact same length of
         * information. In cases like this, the file position needs to be reset
         */
        position = 0;
        reader.seek(position); // cannot be null here
      
        // Now we can read new lines
        position = readLines(reader);
        last = file.lastModified();
      }
      // ...
      

      The comments in the branch specifically mention wanting to reset the position and reload the entire file. However, I believe this can lead to undesirable effects in certain cases.

      One example is tailing one file into another file. If this branch is hit, the entire input file is copied into the output file again. This is especially troublesome if you have a rogue file whose timestamp changes regularly without any content changes.
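
      To make the trigger concrete, here is a minimal sketch that models only the condition guarding the branch, not the Tailer itself. The class and method names (TouchedFileDemo, wouldReReadWholeFile, touchTriggersReRead) are hypothetical, and the sketch assumes "newer" means the file's last-modified time is greater than the last recorded timestamp.

```java
import java.io.File;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;

// Hypothetical demo class; not part of Commons IO.
final class TouchedFileDemo {

    // Simplified model of the condition guarding the quoted branch: the file
    // length equals the read position, but the timestamp has moved forward.
    static boolean wouldReReadWholeFile(long length, long position,
                                        long lastModified, long last) {
        boolean newer = lastModified > last; // assumed meaning of "newer"
        return length == position && newer;
    }

    // Writes a temp file, pretends it was tailed to the end, then changes
    // only its timestamp and reports whether the branch condition holds.
    static boolean touchTriggersReRead() {
        try {
            File file = Files.createTempFile("tailer-demo", ".log").toFile();
            file.deleteOnExit();
            Files.write(file.toPath(),
                    "line 1\nline 2\n".getBytes(StandardCharsets.UTF_8));

            long position = file.length();   // everything has been read
            long last = file.lastModified(); // timestamp recorded at that point

            // Bump the timestamp only; the content stays byte-for-byte identical.
            if (!file.setLastModified(last + 10_000L)) {
                throw new IOException("could not change timestamp of " + file);
            }
            return wouldReReadWholeFile(
                    file.length(), position, file.lastModified(), last);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("re-read triggered: " + touchTriggersReRead());
    }
}
```

      Under these assumptions, a process that merely touches the file is enough to satisfy the condition, so every such touch would replay the whole file into the output.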

      My proposed improvement is to remove this branch entirely, provided that doing so also works for the general case. Failing that, a parameter could be added that callers can set to prevent this behavior in special cases.
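
      As a rough illustration of the parameter variant, the sketch below expresses the per-poll decision as a pure function. Everything in it is hypothetical: TailDecision, Action, decide and the reReadOnTimestampChange flag are illustrative names, not existing Commons IO API, and this is not the content of the attached patch. It also assumes that when the re-read is skipped, the recorded timestamp should still be updated so the same change is not evaluated again on every poll.

```java
// Hypothetical sketch; none of these names exist in Commons IO.
final class TailDecision {

    enum Action {
        REOPEN_ROTATED,        // file shrank: treat it as rotated
        READ_APPENDED,         // file grew: read the new lines
        REREAD_FROM_START,     // current behavior for "same length, newer"
        UPDATE_TIMESTAMP_ONLY, // proposed behavior when the flag is off
        NONE                   // nothing changed
    }

    // Decides what a single poll should do.
    // reReadOnTimestampChange = true keeps the existing behavior;
    // false skips the full re-read when only the timestamp changed.
    static Action decide(long length, long position, boolean newer,
                         boolean reReadOnTimestampChange) {
        if (length < position) {
            return Action.REOPEN_ROTATED;
        }
        if (length > position) {
            return Action.READ_APPENDED;
        }
        if (newer) {
            return reReadOnTimestampChange
                    ? Action.REREAD_FROM_START
                    : Action.UPDATE_TIMESTAMP_ONLY;
        }
        return Action.NONE;
    }

    public static void main(String[] args) {
        // Same length, newer timestamp, flag disabled.
        System.out.println(decide(100L, 100L, true, false));
    }
}
```

      The trade-off with the flag disabled is the one the original code comment describes: a file overwritten with content of exactly the same length would no longer be picked up.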

      Attachments

        1. IO-523.patch
          1 kB
          Tyler Murry

          People

            Assignee: Unassigned
            Reporter: Tyler Murry (tylermurry)
            Votes: 1
            Watchers: 3

              Time Tracking

                Original Estimate: 24h
                Remaining Estimate: 24h
                Time Spent: Not Specified