Flume / FLUME-746

Correct the behavior and log messages for state transitions of WAL chunks on retry

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: v0.9.4
    • Fix Version/s: v0.9.5
    • Component/s: Node
    • Labels:

      Description

      Flume logs often contain scary-looking messages like these:

      2011-08-14 00:00:56,177 INFO com.cloudera.flume.agent.WALAckManager: Retransmitting log.00000038.20110801-235819417-0400.6477874242784836.seq after being stale for 60802ms
      2011-08-14 00:00:56,177 WARN com.cloudera.flume.agent.durability.NaiveFileWALManager: There was a race that happend with SENT vs SENDING states
      2011-08-14 00:00:56,177 INFO com.cloudera.flume.agent.WALAckManager: Retransmitting log.00000038.20110805-005128611-0400.6740261462532911.seq after being stale for 60802ms
      2011-08-14 00:00:56,177 WARN com.cloudera.flume.agent.durability.NaiveFileWALManager: There was a race that happend with SENT vs SENDING states
      2011-08-14 00:00:56,177 INFO com.cloudera.flume.agent.WALAckManager: Retransmitting log.00000038.20110805-031622274-0400.6748955125414911.seq after being stale for 60802ms
      2011-08-14 00:00:56,177 WARN com.cloudera.flume.agent.durability.NaiveFileWALManager: There was a race that happend with SENT vs SENDING states

      This is because we previously expected to deal with only three states:

      LOGGED, SENDING, SENT.

      We actually need to deal with all possible states and, importantly, SENDING is a valid state to transition from (not a race, as the warning reports). Here's the high-level idea:

      Current state -> state to transition to on retry:
      IMPORT -> IMPORT // new: warn that this is an odd case.
      WRITING -> WRITING // new: warn that this is an odd case.
      LOGGED -> LOGGED // changed: used to be considered a race. This is legal: if the chunk is logged, it is already slated for retry, so stay put.
      SENDING -> SENDING // changed: used to be considered a race. This is legal: if we are already sending the chunk, keep sending it; no need to retry.
      SENT -> LOGGED // already sent, but the acks didn't work out. Move to the LOGGED state to retry.
      E2EACKED -> E2EACKED // new: already acked means it is good. No need to retry.
      others -> others // other states are unexpected and remain in their state.
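
      The transition table above can be sketched as a small pure function. This is only an illustration of the intended behavior, not the actual NaiveFileWALManager code: the class name WalRetrySketch, the method names onRetry and isOddOnRetry, and the ERROR state (a stand-in for "others") are assumptions made for this example.

```java
/** Illustrative sketch only; not Flume's actual NaiveFileWALManager code. */
class WalRetrySketch {
    // State names follow the issue text. ERROR is a hypothetical stand-in
    // for the "others" row of the table.
    enum State { IMPORT, WRITING, LOGGED, SENDING, SENT, E2EACKED, ERROR }

    /** Returns the state a WAL chunk should be in after a retry request. */
    static State onRetry(State current) {
        switch (current) {
            case SENT:
                // Sent, but the ack never arrived: go back to LOGGED to resend.
                return State.LOGGED;
            default:
                // LOGGED, SENDING, E2EACKED and every other state stay put.
                return current;
        }
    }

    /** True when a retry request in this state is odd enough to warn about. */
    static boolean isOddOnRetry(State current) {
        switch (current) {
            case IMPORT:
            case WRITING:
            case ERROR:
                return true;
            default:
                // LOGGED, SENDING, SENT and E2EACKED are all legal; no warning.
                return false;
        }
    }

    public static void main(String[] args) {
        // Print the full transition table for inspection.
        for (State s : State.values()) {
            System.out.println(s + " -> " + onRetry(s)
                    + (isOddOnRetry(s) ? " (warn)" : ""));
        }
    }
}
```

      Keeping the transition a pure function of the current state makes it clear that SENT is the only state that actually moves on retry, and that SENDING no longer triggers the misleading "race" warning.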

            People

            • Assignee: Jonathan Hsieh
            • Reporter: Jonathan Hsieh