Details

    • Type: New Feature
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 0.1.2, 0.2.0
    • Fix Version/s: 0.3.0
    • Component/s: Data Collection
    • Labels: None

      Description

      Right now, FileTailingAdaptors watch particular files. It'd be great to be able to watch a whole path: to say something like /var/logs/*, where new logs created in that directory get picked up.

      1. CHUKWA-185.patch
        10 kB
        Ari Rabkin
      2. CHUKWA-185.patch
        7 kB
        Ari Rabkin

        Issue Links

          Activity

          Hudson added a comment -

          Integrated in Chukwa-trunk #61 (see http://hudson.zones.apache.org/hudson/job/Chukwa-trunk/61/)

          Ari Rabkin added a comment -

          I just committed this to trunk.

          Ari Rabkin added a comment -

          I'm open to suggestions on picking the interval between scans. It's configurable, of course. But perhaps a thing to do is to have it scale with the duration of a scan. So that small dirs are scanned more frequently. If that sounds good, I'll open a separate JIRA for it.
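
As a rough illustration of the scaling idea (not part of the patch): the pause before the next scan could be a fixed multiple of how long the previous scan took, clamped between a floor and a ceiling. The class name `AdaptiveScanInterval`, the multiplier, and the bounds are all assumptions made for this sketch.

```java
// Hypothetical helper, not part of CHUKWA-185.patch: derives the pause before
// the next directory scan from how long the previous scan took.
final class AdaptiveScanInterval {
    private final long minMillis;   // floor, so tiny dirs are not scanned in a busy loop
    private final long maxMillis;   // ceiling, so big trees are still rescanned eventually
    private final long multiplier;  // pause = multiplier * duration of the last scan

    AdaptiveScanInterval(long minMillis, long maxMillis, long multiplier) {
        if (minMillis <= 0 || maxMillis < minMillis || multiplier <= 0) {
            throw new IllegalArgumentException("invalid interval settings");
        }
        this.minMillis = minMillis;
        this.maxMillis = maxMillis;
        this.multiplier = multiplier;
    }

    // Returns how long to wait before the next scan, in milliseconds.
    long nextDelayMillis(long lastScanMillis) {
        // Cap the input first so the multiplication stays small for sane settings.
        long capped = Math.min(Math.max(0L, lastScanMillis), maxMillis);
        long scaled = capped * multiplier;
        return Math.min(maxMillis, Math.max(minMillis, scaled));
    }
}
```

With a floor of 1 second, a ceiling of 60 seconds and a multiplier of 10, a scan that took 500 ms would be followed by a 5 second pause, while a scan that took 30 seconds would hit the 60 second ceiling.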

          Eric Yang added a comment -

          My only concern is that the wait time between scans is 10 seconds. That is a bit short for a directory structure two levels deep. For a small directory structure, this works fine. +1

          Ari Rabkin added a comment -

          Revised patch, demonstrating correct handling of old files.

          Ari Rabkin added a comment -

          I had misunderstood the intent of CHUKWA-204. Shutoff for file tailers is now CHUKWA-295.

          Ari Rabkin added a comment -

          What I was planning to do was this. DTA takes a time cut-off, and will not stream files last modified before the cutoff. So if you specify the epoch, you get everything. Whenever DTA does a scan of the directory, it updates that cutoff to the time when the scan started. So for a file that isn't being modified, DTA will start tailing it at most once.
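
The cutoff rule described above can be sketched roughly as follows. This is an illustration only, not the code in the attached patch: the class `CutoffScanner` and its method names are made up for the example, and it lists a single directory level to keep the cutoff logic visible.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of the time cut-off rule; not taken from CHUKWA-185.patch.
final class CutoffScanner {
    private long cutoffMillis; // files last modified before this are not streamed

    CutoffScanner(long initialCutoffMillis) {
        this.cutoffMillis = initialCutoffMillis; // 0 (the epoch) means "everything"
    }

    // Returns the files that qualify for tailing on this pass, then moves the
    // cutoff to the time this scan started.
    List<File> scan(File dir, long scanStartMillis) {
        List<File> toTail = new ArrayList<>();
        File[] entries = dir.listFiles(); // null if dir is missing or unreadable
        if (entries != null) {
            for (File f : entries) {
                if (f.isFile() && f.lastModified() >= cutoffMillis) {
                    toTail.add(f);
                }
            }
        }
        cutoffMillis = scanStartMillis;
        return toTail;
    }
}
```

A file that is never modified again is returned by at most one scan. A file that keeps changing would be returned repeatedly, so a real adaptor would still need to check whether a tailer is already running for it.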

          Time windowing for shutdown should be addressed by CHUKWA-204.

          It might be reasonable to build a command line tool or script that stops all FTAs in a given subdirectory. There's no need for that to be coupled to this patch in any way. But I think we should have real use cases before we hack on it.

          Jerome Boulon added a comment -

          >> What if the user want to stream over files that were previously archived and no longer receiving updates
          That one could be addressed by the backfilling tool, or DirTailingAdaptor could be started with a "--force" flag.

          Eric Yang added a comment -

          If we go with the time window approach, DTA will only work on files that have active updates. What if the user wants to stream over files that were previously archived and are no longer receiving updates? This is not among the previously identified use cases, but it may make sense to include it.

          If we specify a start time (as a processed-time flag) and a time window size, the system could process data in a queue and try to close the gap between past and present. The processed-time flag could also be used as an indicator for resuming after an agent crash.

          Ari Rabkin added a comment -

          Will revise as per Jerome's comments.

          Ari Rabkin added a comment -

          I had figured we'd solve the expiration problem in FTA, rather than here. We already have CHUKWA-204 open for this. However, my patch does need to do something with time windows to make sure it doesn't restart the adaptors after they stop themselves. I'll fix that in the next version.

          I'm happy to revisit the TerminatorThread mechanism, but again, that's an FTA problem, and there's no need to solve that problem and this one at the same time.

          As to duplicate data: once DirTailer starts a FileTailer, that FileTailer gets checkpointed in the usual way, and exactly one tailer can get started for each file. This patch shouldn't create any duplicate-data issues we didn't already have.

          Jerome Boulon added a comment -

          Ari, it would be good to have better control over TerminatorThread ... maybe a pool of TerminatorThreads instead of creating a new one every time. A simpler solution would be to limit the number of running TerminatorThread instances...

          Also, I'm not sure the solution can be so simple.

          If the agent crashes, it shouldn't resend something that has already been sent.
          Here's what I was thinking of:

          • make the timeWindow mandatory; it could default to XX minutes
          • keep track of all files that are in the processing window (file.lastModifiedDate > now - timeWindow), using a simple text file (the tracking file)
          • when the last modified date for a file exceeds the timeWindow:
            ---> do a shutdown on the adaptor for this file's entry
            ---> delete the file's entry from the tracking file
          • keep the tracking file in a chukwa directory and reload it at agent re-start to avoid sending the same file twice
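
A minimal sketch of the tracking-file idea in the list above, under stated assumptions: the class `TrackingFile`, its methods, and the one-path-per-line file format are invented for illustration and are not part of any patch on this issue. Starting and shutting down the adaptors themselves is left to the caller.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical sketch: a text file listing every file currently inside the
// processing window, reloaded at start-up so nothing is started twice.
final class TrackingFile {
    private final Path trackingFile;
    private final long timeWindowMillis;
    private final Set<String> tracked = new TreeSet<>();

    TrackingFile(Path trackingFile, long timeWindowMillis) throws IOException {
        this.trackingFile = trackingFile;
        this.timeWindowMillis = timeWindowMillis;
        if (Files.exists(trackingFile)) { // reload entries left by a previous run
            tracked.addAll(Files.readAllLines(trackingFile, StandardCharsets.UTF_8));
        }
    }

    // True if the file is in the window and was not tracked before, i.e. the
    // caller should start an adaptor for it.
    boolean track(Path file, long nowMillis) throws IOException {
        if (!inWindow(file, nowMillis)) {
            return false;
        }
        boolean added = tracked.add(file.toString());
        if (added) {
            save();
        }
        return added;
    }

    // Drops entries that fell out of the window (or vanished) and returns them
    // so the caller can shut down the matching adaptors.
    List<String> expire(long nowMillis) throws IOException {
        List<String> expired = new ArrayList<>();
        for (String name : tracked) {
            Path p = Paths.get(name);
            if (!Files.exists(p) || !inWindow(p, nowMillis)) {
                expired.add(name);
            }
        }
        if (!expired.isEmpty()) {
            tracked.removeAll(expired);
            save();
        }
        return expired;
    }

    private boolean inWindow(Path file, long nowMillis) throws IOException {
        return Files.getLastModifiedTime(file).toMillis() > nowMillis - timeWindowMillis;
    }

    private void save() throws IOException {
        Files.write(trackingFile, tracked, StandardCharsets.UTF_8);
    }
}
```

Rewriting the whole file on every change is the simplest option and is fine for a sketch; a real agent would probably want an atomic write (write to a temporary file, then rename) so a crash mid-write cannot corrupt the list.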

          How do you stop tailing a file? We cannot assume that we can delete a file so we need to have that built in. My proposal is to use the last modified date and the timeWindow to automatically remove adaptors.

          Ari Rabkin added a comment -

          Ideally, we'd add some more test coverage to make sure that the created adaptors have the right classes and params, and that the recursion works properly.

          Ari Rabkin added a comment -

          I have some code, but a couple concerns.

          What should we do if a user tries to tail "/"? Creating millions of adaptors is probably the Wrong Thing. But I'm okay saying "this is the user's problem". Another approach is to only gradually create the FileTailingAdaptors that do the real tailing, so that the user can kill it if it goes out of control.

          When the DirTailer is stopped, should that stop tailing all the files in the directory, or just stop scanning for new ones?

          Is DirTailer responsible for shutting off the FileTailers after a set period, or is that the responsibility of the Tailers themselves?

          Eric Yang added a comment -

          Trunk is ideal for testing this feature.

          Ari Rabkin added a comment -

          Proposal:
          DirTailingAdaptor should take two parameters: a directory and an optional date.
          It will periodically scan the directory and all subdirs, and then start filetailing adaptors on any file modified since that date, if none is running. Date defaults to the epoch.
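
One pass of the proposed scan might look like the sketch below. It is only an illustration of the proposal: `DirScanSketch` and `scanOnce` are hypothetical names, the set of absolute paths stands in for "a tailer is already running", and actually creating FileTailingAdaptors is left out.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical sketch of one scan pass; not the DirTailingAdaptor from the patch.
final class DirScanSketch {
    private final File root;
    private final long sinceMillis;                        // 0 means the epoch
    private final Set<String> running = new HashSet<>();  // files that already have a tailer

    DirScanSketch(File root, long sinceMillis) {
        this.root = root;
        this.sinceMillis = sinceMillis;
    }

    // Walks the directory and all subdirs and returns the files that need a
    // new tailer: modified since the given date and not already being tailed.
    List<File> scanOnce() {
        List<File> started = new ArrayList<>();
        walk(root, started);
        return started;
    }

    private void walk(File dir, List<File> started) {
        File[] entries = dir.listFiles(); // null if dir is missing or unreadable
        if (entries == null) {
            return;
        }
        for (File f : entries) {
            if (f.isDirectory()) {
                walk(f, started); // note: symlink loops are not handled here
            } else if (f.isFile()
                    && f.lastModified() >= sinceMillis
                    && running.add(f.getAbsolutePath())) {
                started.add(f);
            }
        }
    }
}
```

Calling `scanOnce()` periodically returns each qualifying file exactly once, which matches the "if none is running" condition in the proposal.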

          Combined with CHUKWA-204, this will prevent adaptor count from rising without bound, while still making it easy to snarf a whole directory tree.

          I assume this is going into trunk, not 0.2?

          Mac Yang added a comment -

          +1
          this feature will make it easier to collect job history, job conf and task syslog

          Ari Rabkin added a comment -

          The approach I had in mind was the following –
          Define a "DirTailingAdaptor", that takes as parameters a directory, and enough options to create a FileTailingAdaptor. (probably a class name and a data type)

          That adaptor should scan the directory; if it sees a new file, it should start a tailing adaptor on it.
          Keep a list of currently running adaptors in the directory.

          For now, we can punt on expiring the adaptors – CHUKWA-204 will solve that problem.


            People

            • Assignee: Ari Rabkin
            • Reporter: Ari Rabkin
            • Votes: 0
            • Watchers: 1
