Chukwa / CHUKWA-631

Agents and Collectors do not respond to HUP signals

    Details

    • Type: Bug
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 0.4.0
    • Fix Version/s: None
    • Component/s: Data Collection
    • Labels:
      None
    • Environment:
    • Tags:
      redhat enterprise hup signal agent collector

      Description

      On RHEL 5.4, agents and collectors do not respond to HUP signals sent to the process PID.

      I start the agent, confirm its PID, and then send it a HUP signal:

      $ pwd
      /usr/local/chukwa/bin
      
      $ ./start-agents.sh
      
      $ pgrep -lf Agent
      23148 /usr/java/latest/bin/java -Djava.library.path=/usr/lib/hadoop/lib/native/Linux-X86-64 [...] org.apache.hadoop.chukwa.datacollection.agent.ChukwaAgent
      
      $ cat /var/run/Agent.pid 
      23148
      
      $ kill -hup 23148
      
      $ ps -fp 23148
      UID        PID  PPID  C STIME TTY          TIME CMD
      root     23148     1  0 Jan23 pts/0    00:00:19 /usr/java/latest/bin/java -Djava.library.path=/usr/lib/hadoop/lib/native/
      
      $ pgrep -lf Agent
      23148 /usr/java/latest/bin/java -Djava.library.path=/usr/lib/hadoop/lib/native/Linux-X86-64  [...] : org.apache.hadoop.chukwa.datacollection.agent.ChukwaAgent
      

      The agent is still running, and /var/log/agent.log shows no indication that the signal was seen.
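
      One thing worth ruling out is whether the process has SIGHUP set to "ignored" (a disposition that is inherited from a launcher such as nohup). The report does not say how the start scripts launch the JVM, so the following is only a diagnostic sketch, assuming a Linux /proc filesystem. The function names mask_has_sighup and sighup_disposition are hypothetical helpers, not part of Chukwa.

```shell
#!/bin/sh
# Diagnostic sketch (not part of Chukwa): report how a process treats SIGHUP.
# Assumes Linux, where /proc/<pid>/status exposes SigIgn and SigCgt hex masks.

# Succeed if the hexadecimal signal mask in $1 has the SIGHUP bit set.
# SIGHUP is signal 1, so it is the least significant bit of the mask;
# that bit is set exactly when the last hex digit is odd.
mask_has_sighup() {
    case "$1" in
        *[13579bdfBDF]) return 0 ;;
        *) return 1 ;;
    esac
}

# Print "ignored", "caught", or "default" for the PID in $1,
# or "unknown" if its status file cannot be read.
sighup_disposition() {
    pid=$1
    status_file="/proc/$pid/status"
    [ -r "$status_file" ] || { echo "unknown"; return 1; }
    ign=$(awk '/^SigIgn:/ {print $2}' "$status_file")
    cgt=$(awk '/^SigCgt:/ {print $2}' "$status_file")
    if mask_has_sighup "$ign"; then
        echo "ignored"
    elif mask_has_sighup "$cgt"; then
        echo "caught"
    else
        echo "default"
    fi
}
```

      For the agent above this would be run as "sighup_disposition 23148". A result of "ignored" would be consistent with the signal having no visible effect, while "caught" would suggest that a handler runs but does nothing observable.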

      The agent is running one adaptor:

      $ echo "list" | nc localhost 9093
      adaptor_b5382452cfdc2ae81bd45d8126804843)  org.apache.hadoop.chukwa.datacollection.adaptor.ExecAdaptor Iostat 60 /usr/bin/iostat -x -k 55 2 11215696
      

      We can see this child process running too:

      $ ps -f --ppid 23148
      UID        PID  PPID  C STIME TTY          TIME CMD
      root      7116 23148  0 12:40 pts/0    00:00:00 /usr/bin/iostat -x -k 55 2
      

      I start the collector and send it a HUP signal in the same way:

      $ ./start-collectors.sh 
      localhost: starting collector, logging to /var/log/chukwa-chukwa-collector-<hostname>.out
      localhost: 2012-01-24 04:24:11.04::INFO:  Logging to STDERR via org.mortbay.log.StdErrLog
      localhost: 2012-01-24 04:24:11.104::INFO:  jetty-6.1.11
      
      $ pgrep -lf collector
      10220 /usr/java/latest/bin/java -Djava.library.path=/usr/lib/hadoop/lib/native/Linux-X86-64 -DCHUKWA_HOME=/usr/local/chukwa/bin/.. [...] : org.apache.hadoop.chukwa.datacollection.collector.CollectorStub
      
      $ kill -HUP 10220
      
      $ pgrep -lf collector
      10220 /usr/java/latest/bin/java -Djava.library.path=/usr/lib/hadoop/lib/native/Linux-X86-64 -DCHUKWA_HOME=/usr/local/chukwa/bin/ [...]: org.apache.hadoop.chukwa.datacollection.collector.CollectorStub
      

      The logs on the collector show no indication that any signal was received.
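
      The manual steps above (send the signal, then check whether the PID is still present) can be scripted to make the behaviour easier to reproduce. This is a minimal sketch; signal_and_wait is a hypothetical helper name, and the default timeout of 10 seconds is an arbitrary assumption.

```shell
#!/bin/sh
# Reproduction sketch (not part of Chukwa): send a signal to a PID and
# report whether the process exits within a timeout.
# Usage: signal_and_wait <signal> <pid> [timeout_seconds]
signal_and_wait() {
    sig=$1
    pid=$2
    timeout=${3:-10}

    # kill -0 delivers no signal; it only tests whether the PID can be
    # signalled, so it fails for a missing PID or one we may not signal.
    if ! kill -0 "$pid" 2>/dev/null; then
        echo "not running"
        return 0
    fi

    kill -s "$sig" "$pid" || return 1

    # Poll once per second until the process disappears or the timeout expires.
    elapsed=0
    while [ "$elapsed" -lt "$timeout" ]; do
        if ! kill -0 "$pid" 2>/dev/null; then
            echo "exited"
            return 0
        fi
        sleep 1
        elapsed=$((elapsed + 1))
    done
    echo "still running"
    return 2
}
```

      With the PIDs from this report the calls would be "signal_and_wait HUP 23148" for the agent and "signal_and_wait HUP 10220" for the collector.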

        Activity

        Noel Duffy created issue -

          People

          • Assignee:
            Unassigned
            Reporter:
            Noel Duffy