Chukwa / CHUKWA-410

Does the BackfillingLoader return only after HDFS blocks are committed?

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Not A Problem
    • Affects Version/s: 0.3.0
    • Fix Version/s: None
    • Component/s: Data Collection
    • Labels:
      None
    • Environment:

      Hadoop 0.20.0, Debian 4 (Etch), Chukwa rev 817532

      Description

      I see that the BackfillingLoader is set to AdaptorShutdownPolicy.WAIT_TILL_FINISHED. What are the semantics of this policy? Does it mean that the BackfillingLoader returns once the last HDFS write request has been issued, while the DFSClient may still be flushing blocks to the DataNodes in the background? Or does it mean that the entire file has been written, flushed to HDFS, closed, and made fully available?

      I'm running the Demux immediately after the BackfillingLoader completes. The raw log files are complete, but the Demux picks up only half of the entries in those log files. Could this be because some blocks are not closed yet?

        Activity

        Jiaqi Tan added a comment -

        Closing this issue, as I have not seen the problem again. It seems to have been caused by a full disk.

        Jiaqi Tan added a comment -

        From backfill.log:

        2009-11-06 04:30:05,157 INFO main FileTailingAdaptor - started file tailer on file <file.log> with first byte at offset 0
        2009-11-06 04:30:05,158 INFO main FileTailingAdaptor - Enter Shutdown:WAIT_TILL_FINISHED - ObjectId:Lightweight Tailer on <file.log>
        2009-11-06 04:30:05,160 INFO main FileTailingAdaptor - WAIT_TILL_FINISHED Retry:0
        2009-11-06 04:30:05,160 INFO Thread-2 TerminatorThread - Terminator thread started.<file.log>
        2009-11-06 04:30:05,175 INFO Thread-2 FileTailingAdaptor - Adaptor||Opening the file for the first time|seek|0
        2009-11-06 04:30:05,484 INFO QueueToWriterConnectorThread SeqFileWriter - start Date [Fri Nov 06 04:30:05 EST 2009]
        2009-11-06 04:30:05,485 INFO QueueToWriterConnectorThread SeqFileWriter - Rotate from QueueToWriterConnectorThread
        2009-11-06 04:30:05,573 INFO QueueToWriterConnectorThread QueueToWriterConnector - processing data for QueueToWriterConnector
        2009-11-06 04:30:05,573 INFO Thread-2 WaitingQueue - MemLimitQueue is full [8388608]
        2009-11-06 04:30:05,578 INFO QueueToWriterConnectorThread QueueToWriterConnector - Got 16 chunks back from the queue
        2009-11-06 04:30:05,874 INFO QueueToWriterConnectorThread QueueToWriterConnector - Got 16 chunks back from the queue
        2009-11-06 04:30:05,875 INFO Thread-2 WaitingQueue - MemLimitQueue is full [8388608]
        2009-11-06 04:30:05,919 INFO Thread-2 WaitingQueue - MemLimitQueue is full [8388608]
        2009-11-06 04:30:05,926 INFO QueueToWriterConnectorThread QueueToWriterConnector - Got 16 chunks back from the queue
        ... (continues like this) ...
        2009-11-06 04:30:07,840 INFO Thread-2 WaitingQueue - MemLimitQueue is full [8388608]
        2009-11-06 04:30:07,846 INFO Thread-2 TerminatorThread - Terminator thread finished.<file.log>
        2009-11-06 04:30:07,864 INFO QueueToWriterConnectorThread QueueToWriterConnector - Got 16 chunks back from the queue
        2009-11-06 04:30:07,901 INFO QueueToWriterConnectorThread QueueToWriterConnector - Got 16 chunks back from the queue
        2009-11-06 04:30:07,927 INFO QueueToWriterConnectorThread QueueToWriterConnector - Got 16 chunks back from the queue
        2009-11-06 04:30:07,960 INFO QueueToWriterConnectorThread QueueToWriterConnector - Got 16 chunks back from the queue
        2009-11-06 04:30:07,987 INFO QueueToWriterConnectorThread QueueToWriterConnector - Got 12 chunks back from the queue
        2009-11-06 04:30:08,198 INFO main FileTailingAdaptor - Exist Shutdown:WAIT_TILL_FINISHED - ObjectId:Lightweight Tailer on <file.log>
        2009-11-06 04:30:08,198 INFO main QueueToWriterConnector - Shutdown in progress ...
        2009-11-06 04:30:09,234 INFO main QueueToWriterConnector - Shutdown done.
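
        One way to read the log above is to check mechanically that the writer's last dequeue was logged before "Shutdown done.", and to total the chunks it reported. The following is a minimal sketch under stated assumptions: the class and method names (BackfillLogCheck, totalChunks, shutdownAfterLastDequeue) are hypothetical and not part of Chukwa, and the check relies only on the message text visible in this log. It says nothing about whether HDFS has committed the blocks; it only confirms the ordering of the log lines.

```java
import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for inspecting a backfill.log excerpt; not part of Chukwa.
public class BackfillLogCheck {
    // Matches writer-side lines such as "Got 16 chunks back from the queue".
    private static final Pattern GOT_CHUNKS =
            Pattern.compile("Got (\\d+) chunks back from the queue");

    /** Sums the chunk counts reported in "Got N chunks back from the queue" lines. */
    static int totalChunks(List<String> lines) {
        int total = 0;
        for (String line : lines) {
            Matcher m = GOT_CHUNKS.matcher(line);
            if (m.find()) {
                total += Integer.parseInt(m.group(1));
            }
        }
        return total;
    }

    /** Returns the index of the last line containing the marker, or -1 if none does. */
    static int lastIndexContaining(List<String> lines, String marker) {
        for (int i = lines.size() - 1; i >= 0; i--) {
            if (lines.get(i).contains(marker)) {
                return i;
            }
        }
        return -1;
    }

    /** True if "Shutdown done." is present and appears after the last dequeue line. */
    static boolean shutdownAfterLastDequeue(List<String> lines) {
        int lastDequeue = lastIndexContaining(lines, "chunks back from the queue");
        int shutdownDone = lastIndexContaining(lines, "Shutdown done.");
        return shutdownDone >= 0 && shutdownDone > lastDequeue;
    }

    public static void main(String[] args) {
        // The tail of the log quoted in this comment.
        List<String> tail = Arrays.asList(
            "2009-11-06 04:30:07,846 INFO Thread-2 TerminatorThread - Terminator thread finished.<file.log>",
            "2009-11-06 04:30:07,960 INFO QueueToWriterConnectorThread QueueToWriterConnector - Got 16 chunks back from the queue",
            "2009-11-06 04:30:07,987 INFO QueueToWriterConnectorThread QueueToWriterConnector - Got 12 chunks back from the queue",
            "2009-11-06 04:30:08,198 INFO main QueueToWriterConnector - Shutdown in progress ...",
            "2009-11-06 04:30:09,234 INFO main QueueToWriterConnector - Shutdown done.");
        System.out.println("chunks in excerpt: " + totalChunks(tail));
        System.out.println("shutdown logged after last dequeue: " + shutdownAfterLastDequeue(tail));
    }
}
```

        Comparing the chunk total from a full log against the number of entries the Demux processed could help narrow down whether data was lost before or after the writer stage.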

        Jiaqi Tan added a comment -

        > What do you mean by: "the raw log files are complete"?
        > --> the datasink file from the collector is complete?

        Actually, let me check on that. I was just wondering if the semantics of WAIT_TILL_FINISHED could result in any races, i.e. blocks closed without the file being fully written, and the Demux hitting an incomplete file and processing only the blocks that had been closed so far.

        Jerome Boulon added a comment -

        What do you mean by: "the raw log files are complete"?
        --> the datasink file from the collector is complete?


          People

          • Assignee:
            Unassigned
            Reporter:
            Jiaqi Tan