Flume / FLUME-252

Update Tail to get rid of races and truncation problems.

    Details

    • Type: Improvement
    • Status: Closed
    • Priority: Critical
    • Resolution: Fixed
    • Affects Version/s: v0.9.0, v0.9.1
    • Fix Version/s: v0.9.2
    • Component/s: Sinks+Sources
    • Labels:
      None

      Description

      The first tail implementation used buffered readers and file readers. This caused problems because the read call was blocking, so the source could not shut down properly.

      The second tail implementation was more closely based on GNU tail's C implementation but relies on RandomAccessFile. This version had problems with races (restarting from the beginning, FLUME-218) and truncation (new test in the FLUME-218 patch by Eric Sammer).

      The new approach will likely use NIO and nonblocking IO to behave cleanly, or possibly a JNI-based approach that uses Unix system calls to get and follow file descriptors or inode numbers.
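
To make the truncation and rotation cases concrete, here is a minimal sketch of a polling tail cursor built on NIO's FileChannel. It is not the code from the attached patch: the class name TailCursorSketch, the poll() method, and the 64 KiB read cap are all hypothetical. It assumes that truncation can be detected by the file size dropping below the saved offset, and that rotation can be detected by a change in the file key (inode-like identity, which may be null on some platforms, in which case rotation goes unnoticed here).

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.nio.file.attribute.BasicFileAttributes;
import java.util.Objects;

/** Hypothetical sketch of a polling tail cursor; not Flume's actual tail source. */
public class TailCursorSketch {
    private final Path path;
    private FileChannel channel; // null until the file is first opened
    private Object fileKey;      // inode-like identity; may be null on some platforms
    private long position;       // offset of the next unread byte

    public TailCursorSketch(Path path) {
        this.path = path;
    }

    /** Returns the bytes appended since the last call, or an empty array if there are none. */
    public byte[] poll() throws IOException {
        if (!Files.exists(path)) {
            return new byte[0];
        }
        Object currentKey =
            Files.readAttributes(path, BasicFileAttributes.class).fileKey();
        if (channel == null || !Objects.equals(currentKey, fileKey)) {
            // First open, or the path now names a different file (rotation).
            close();
            channel = FileChannel.open(path, StandardOpenOption.READ);
            fileKey = currentKey;
            position = 0;
        }
        long size = channel.size();
        if (size < position) {
            // The file shrank: treat it as truncated and restart from the beginning.
            position = 0;
        }
        if (size == position) {
            return new byte[0];
        }
        int toRead = (int) Math.min(size - position, 64 * 1024);
        ByteBuffer buf = ByteBuffer.allocate(toRead);
        int n = channel.read(buf, position); // positional read; shared channel position is untouched
        if (n <= 0) {
            return new byte[0];
        }
        position += n;
        byte[] out = new byte[n];
        buf.flip();
        buf.get(out);
        return out;
    }

    public void close() throws IOException {
        if (channel != null) {
            channel.close();
            channel = null;
        }
    }
}
```

A caller would invoke poll() on a timer and forward the returned bytes. The sketch still has gaps of the kind this issue is about: data left unread in the old file is dropped when rotation is detected, a file truncated and rewritten to an equal or larger size between polls is not noticed, and the path can change between the existence check and the open.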

        Issue Links

          Activity

          Transition             Time In Source Status   Execution Times   Last Executer    Last Execution Date
          Open -> Blocked        21d 13h 48m             1                 Jonathan Hsieh   20/Oct/10 19:26
          Blocked -> Resolved    15d 2h 36m              1                 Jonathan Hsieh   04/Nov/10 21:02
          Resolved -> Closed     38d 17h 35m             1                 Jonathan Hsieh   13/Dec/10 14:37
          hailinzeng made changes -
          Link This issue relates to FLUME-2354 [ FLUME-2354 ]
          Mark Thomas made changes -
          Project Import Tue Aug 02 16:57:12 UTC 2011 [ 1312304232406 ]
          Jonathan Hsieh made changes -
          Link This issue is duplicated by FLUME-518 [ FLUME-518 ]
          Jonathan Hsieh made changes -
          Status Resolved [ 5 ] Closed [ 6 ]
          Jonathan Hsieh added a comment -

          Closing released issues.

          Jonathan Hsieh made changes -
          Link This issue blocks FLUME-323 [ FLUME-323 ]
          Jonathan Hsieh made changes -
          Link This issue blocks FLUME-323 [ FLUME-323 ]
          Jonathan Hsieh made changes -
          Status Patch Available [ 10000 ] Resolved [ 5 ]
          Resolution Fixed [ 1 ]
          Jonathan Hsieh added a comment -

          committed

          Jonathan Hsieh added a comment -

          This version significantly reduces the chances of data duplication
          encountered due to the FLUME-252 (races in tail) bug. However, it
          does not completely fix the problem. A workaround on Linux/Unix
          systems is to use 'exec("tail -F <file>")' instead of the tail
          source. A new issue has been filed as FLUME-320 to continue tracking
          this problem.

          Jonathan Hsieh made changes -
          Link This issue relates to FLUME-320 [ FLUME-320 ]
          Jonathan Hsieh made changes -
          Fix Version/s v0.9.2 [ 10022 ]
          Jonathan Hsieh made changes -
          Link This issue blocks FLUME-258 [ FLUME-258 ]
          Jonathan Hsieh made changes -
          Attachment 0001-FLUME-252-Update-tail-to-get-rid-of-races-and-trunca.patch [ 10261 ]
          Attachment License Granted license to ASF [ licensed ]
          Jonathan Hsieh added a comment -

          Review here: https://review.cloudera.org/r/981/
          I am having problems uploading the patch right now.

          Jonathan Hsieh made changes -
          Status Open [ 1 ] Patch Available [ 10000 ]
          Jonathan Hsieh made changes -
          Link This issue relates to FLUME-232 [ FLUME-232 ]
          Jonathan Hsieh added a comment -

          FLUME-205 will be fixed by this patch. Here's a description of the problem I encountered: FLUME-261

          Jonathan Hsieh added a comment -

          Scratch FLUME-205. I tested with some Chinese characters (我是中國人) and the data did not get through properly. My guess is that the reads in this tail are correct, but that there is an endianness or output encoding issue on output. Will leave FLUME-205 open.

          Jonathan Hsieh added a comment - - edited

          The solution I have addresses these problems with the following mechanisms:

          • FLUME-205: by using the NIO API and only using byte[] (never doing character encoding translations)
          • FLUME-248: added a method to the cursor that is called on close and when the file being read is rotated, as well as a test that fails if this is not done properly
          • FLUME-218: passes the Python script found in that test.

          This does not address FLUME-148.
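
The byte[]-only point can be illustrated with a small sketch. It is not taken from the patch: the class ByteChunkSketch and both method names are hypothetical, and UTF-8 input is assumed. It contrasts decoding each read chunk on its own, which corrupts a multi-byte character that straddles a chunk boundary, with passing the raw bytes through untouched. It covers only the read side; as noted in a later comment on this issue, an encoding problem was still observed on output.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;

/** Hypothetical illustration of why a tail reader might avoid per-read decoding. */
public class ByteChunkSketch {

    /** Decodes every chunk independently, as a character-oriented reader might. */
    static String decodePerChunk(byte[] data, int chunkSize) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < data.length; i += chunkSize) {
            int end = Math.min(i + chunkSize, data.length);
            // A character split across two chunks is malformed in both of them
            // and is replaced with U+FFFD by the decoder.
            sb.append(new String(data, i, end - i, StandardCharsets.UTF_8));
        }
        return sb.toString();
    }

    /** Forwards raw bytes chunk by chunk and decodes only once, at the very end. */
    static String passBytesThrough(byte[] data, int chunkSize) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        for (int i = 0; i < data.length; i += chunkSize) {
            int end = Math.min(i + chunkSize, data.length);
            out.write(data, i, end - i);
        }
        return new String(out.toByteArray(), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String original = "我是中國人"; // each character takes 3 bytes in UTF-8
        byte[] bytes = original.getBytes(StandardCharsets.UTF_8);
        // A chunk size of 4 does not align with the 3-byte characters.
        System.out.println(original.equals(decodePerChunk(bytes, 4)));
        System.out.println(original.equals(passBytesThrough(bytes, 4)));
    }
}
```

The design choice being illustrated is that a reader which never interprets bytes cannot damage them, so any decoding is left to a stage that sees whole records.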

          Jonathan Hsieh made changes -
          Component/s Sinks+Sources [ 10041 ]
          Jonathan Hsieh made changes -
          Affects Version/s v0.9.0 [ 10014 ]
          Affects Version/s v0.9.1 [ 10013 ]
          Description Updated (minor wording changes; the text appears under Description above)
          Jonathan Hsieh made changes -
          Assignee Jonathan Hsieh [ jmhsieh ]
          Jonathan Hsieh made changes -
          Link This issue relates to FLUME-148 [ FLUME-148 ]
          Jonathan Hsieh made changes -
          Link This issue relates to FLUME-248 [ FLUME-248 ]
          Jonathan Hsieh made changes -
          Link This issue relates to FLUME-205 [ FLUME-205 ]
          Jonathan Hsieh added a comment -

          readline / char interpretation is a related problem.

          Jonathan Hsieh made changes -
          Field Original Value New Value
          Link This issue relates to FLUME-218 [ FLUME-218 ]
          Jonathan Hsieh created issue -

            People

            • Assignee: Jonathan Hsieh
            • Reporter: Jonathan Hsieh
            • Votes: 2
            • Watchers: 5
