Commons IO
IO-335

Tailer#readLines - incorrect CR handling

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 2.4
    • Component/s: None
    • Labels:
      None

      Description

      The readLines method checks for CR. If found, it is not stored immediately, but a flag is set.

      If the next char is an LF, the buffer is passed to the listener without the CR.
      As soon as the next non-LF (and non-CR) character is received, the saved CR is written to the buffer.

      The net result is that CR before LF migrates to the start of the next non-empty line, and repeated CRs are collapsed. This is clearly wrong.

      The original code (before IO-274) used RandomAccessFile#readLine() which returns on CR, LF or CRLF.

      It looks as though the intention was to retain this behaviour whilst not blocking.
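
      The intended behaviour (CR, LF and CRLF each end exactly one line) can be sketched as below. This is only an illustration, not the Tailer implementation: the class name LineSplitSketch and the method splitLines are hypothetical, and the sketch works on an in-memory String rather than a file. It emits the line as soon as a CR is seen and then swallows an immediately following LF, so a CR is never carried over into the next line.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration only; this is not the Tailer source.
class LineSplitSketch {

    /** Splits text into lines, treating CR, LF and CRLF each as a single terminator. */
    static List<String> splitLines(String text) {
        List<String> lines = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        boolean seenCR = false; // true when the previous character was a CR

        for (int i = 0; i < text.length(); i++) {
            char ch = text.charAt(i);
            switch (ch) {
                case '\n':
                    if (!seenCR) {
                        // A lone LF terminates the current line.
                        lines.add(current.toString());
                        current.setLength(0);
                    }
                    // An LF directly after a CR belongs to the same CRLF terminator.
                    seenCR = false;
                    break;
                case '\r':
                    // A CR always terminates the current line.
                    lines.add(current.toString());
                    current.setLength(0);
                    seenCR = true;
                    break;
                default:
                    seenCR = false;
                    current.append(ch);
            }
        }
        // An unterminated trailing fragment is not emitted; a tailer would
        // keep it and wait for more input.
        return lines;
    }
}
```

      With this sketch, splitLines("a\r\nb\rc\n\n") yields the four lines "a", "b", "c" and an empty line, and repeated CRs produce empty lines instead of being collapsed.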

        Activity

        Lantao Jin added a comment -

        I found that Tailer still erroneously treats the character CR (\r) as a line terminator in version 2.4.
        The issue I describe occurs under Linux. When Tailer#readLines receives a character sequence such as "aa\rbb\n", it splits it into 2 lines, which is not what I expect:
        aa
        bb
        However, Linux uses the ASCII character \n (LF) as the newline character, not CR. The Wikipedia article on newline (http://en.wikipedia.org/wiki/Newline) also lists which line terminator each OS uses.
        There we can see that CR alone is used as the newline character only on systems such as Mac OS.

        One (not very good) solution would be to take the OS environment into account: Tailer could record the OS when it is initialised (http://www.ziben.com.br/java/java-os-name-property-values). But I know this is not a good approach, because logs that record data from a Windows application may be processed by Tailer on Linux.

        Either way, the current code causes a problem with CR on Linux.

        Sebb added a comment -

        The original bug is fixed (r1347829); the code now treats CR, LF and CRLF as line terminators.

        Please open a new bug to request a change in this behaviour.

        Lantao Jin added a comment -

        Thank you. This reminded me that java.io.BufferedReader#readLine() takes the same approach to terminating a line:

        /**
         * Reads a line of text. A line is considered to be terminated by any one
         * of a line feed ('\n'), a carriage return ('\r'), or a carriage return
         * followed immediately by a linefeed.
         * … */
        String readLine(boolean ignoreLF) throws IOException { …

        Perhaps Java always treats them the same. I had better change my own code to adapt to it.
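
        The BufferedReader behaviour quoted above can be checked with a small program. This is a minimal sketch: the class name BufferedReaderCrDemo and the helper readAll are made up for the example, and it reads from a StringReader rather than a real log file.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class; only BufferedReader and StringReader are real JDK APIs here.
class BufferedReaderCrDemo {

    /** Reads every line that BufferedReader#readLine() returns for the given text. */
    static List<String> readAll(String text) {
        List<String> lines = new ArrayList<>();
        try (BufferedReader reader = new BufferedReader(new StringReader(text))) {
            String line;
            while ((line = reader.readLine()) != null) {
                lines.add(line);
            }
        } catch (IOException e) {
            // Not expected for an in-memory reader; rethrown unchecked to keep callers simple.
            throw new UncheckedIOException(e);
        }
        return lines;
    }
}
```

        Calling readAll("aa\rbb\n") returns the two lines "aa" and "bb", the same split that Tailer produces for this input, regardless of the operating system.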

          People

          • Assignee:
            Sebb
            Reporter:
            Sebb
          • Votes:
            0
            Watchers:
            2
