Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: Data Collection
    • Labels:
      None

      Description

      ExecPlugin never properly disposes of the subprocess's input fd. This means we run out of file descriptors eventually.

      This only affects ExecAdaptor; if ExecPlugin is invoked by the inputtools framework, the process doesn't stay running.
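
For illustration, here is a minimal sketch of the kind of cleanup the description calls for. It is not the ExecPlugin code and not the contents of the attached patches; the class `ExecSketch` and the method `runAndClose` are hypothetical names. It assumes the leak comes from one of the three pipes the JVM opens to a child process (the child's stdin, stdout and stderr), and simply closes all of them once the child has finished.

```java
import java.io.BufferedReader;
import java.io.Closeable;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

/** Hypothetical helper: runs a command and releases every pipe fd tied to the child. */
final class ExecSketch {

    static String runAndClose(String... command) {
        Process proc = null;
        StringBuilder out = new StringBuilder();
        try {
            // Merge stderr into stdout so an unread stderr pipe cannot block the child.
            proc = new ProcessBuilder(command).redirectErrorStream(true).start();

            // The child's stdin is our OutputStream; close it so the child sees EOF.
            proc.getOutputStream().close();

            BufferedReader reader = new BufferedReader(
                    new InputStreamReader(proc.getInputStream(), StandardCharsets.UTF_8));
            String line;
            while ((line = reader.readLine()) != null) {
                out.append(line).append('\n');
            }
            proc.waitFor();
            return out.toString();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for child", e);
        } finally {
            if (proc != null) {
                // Close all three streams explicitly instead of waiting for finalization.
                closeQuietly(proc.getOutputStream());
                closeQuietly(proc.getInputStream());
                closeQuietly(proc.getErrorStream());
                proc.destroy();
            }
        }
    }

    private static void closeQuietly(Closeable c) {
        try {
            c.close();
        } catch (IOException ignored) {
            // Nothing useful to do if close itself fails.
        }
    }
}
```

Closing in a `finally` block means the descriptors are released even when reading the output fails partway through, which matters for a long-running caller such as an adaptor that executes a command repeatedly.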

      1. CHUKWA-229c.patch
        0.8 kB
        Ari Rabkin
      2. crashOnOutOfFDs.patch
        0.8 kB
        Ari Rabkin
      3. fixExecFDLeak.patch
        1 kB
        Ari Rabkin

        Activity

        asrabkin Ari Rabkin added a comment -

        Also adds a few sentences of javadoc.

        asrabkin Ari Rabkin added a comment -

        Can we get this in to 0.1.2? Shouldn't break anything at Yahoo.

        eyang Eric Yang added a comment -

        +1, yes we can include this in 0.1.2.

        eyang Eric Yang added a comment -

        I just committed this, thanks Ari.

        hudson Hudson added a comment -

        Integrated in Chukwa-trunk #24 (See http://hudson.zones.apache.org/hudson/job/Chukwa-trunk/24/ )

        asrabkin Ari Rabkin added a comment -

        That fix didn't do the trick. We're still leaking. What's more, it's sporadic; it doesn't happen every execution. So I suspect it's a race condition somewhere. Instead of tracking it down immediately, I propose the following medium-term workaround: If exec adaptor detects that we're out of file handles, we stop the agent process and wait for watchdog to respawn it.

        We went to a lot of trouble to implement robust checkpointing; I figure we might as well rely on it here.
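
A rough sketch of the proposed workaround, for illustration only. This is not the contents of crashOnOutOfFDs.patch; `FdExhaustionSketch`, `looksLikeOutOfFileDescriptors` and `handleExecFailure` are hypothetical names. It assumes the exhaustion shows up as an `IOException` whose message contains "Too many open files", which is what Linux JVMs typically report for EMFILE but is not guaranteed on every platform.

```java
import java.io.IOException;

/** Hypothetical helper: decides whether an exec failure should stop the agent. */
final class FdExhaustionSketch {

    /** Heuristic check; the message text is OS- and JVM-dependent (an assumption). */
    static boolean looksLikeOutOfFileDescriptors(IOException e) {
        String msg = e.getMessage();
        return msg != null && msg.contains("Too many open files");
    }

    /**
     * Runs the supplied shutdown action if the failure looks like fd exhaustion.
     * Returns true when shutdown was requested, false otherwise.
     */
    static boolean handleExecFailure(IOException e, Runnable shutdown) {
        if (looksLikeOutOfFileDescriptors(e)) {
            shutdown.run();
            return true;
        }
        return false;
    }
}
```

In an agent, the shutdown action could be something like `() -> System.exit(1)`, leaving the watchdog to restart the process and checkpointing to restore adaptor state. Passing the action in as a `Runnable` keeps the check testable without actually killing the JVM.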

        hudson Hudson added a comment -

        Integrated in Chukwa-trunk #45 (See http://hudson.zones.apache.org/hudson/job/Chukwa-trunk/45/ )

        asrabkin Ari Rabkin added a comment -

        I can't track this down; seems okay when I unit test.

        asrabkin Ari Rabkin added a comment -

        Okay. Nailed this.

        asrabkin Ari Rabkin added a comment -

        I think this is NOT a blocker for 0.3, because many users don't need ExecAdaptor, and there's an easy workaround: periodically killing the process.

        asrabkin Ari Rabkin added a comment -

        I just committed this to Trunk.

        hudson Hudson added a comment -

        Integrated in Chukwa-trunk #213 (See http://hudson.zones.apache.org/hudson/job/Chukwa-trunk/213/ )

          People

          • Assignee:
            asrabkin Ari Rabkin
            Reporter:
            asrabkin Ari Rabkin
          • Votes:
            0
            Watchers:
            0

            Dates

            • Created:
              Updated:
              Resolved:
