Ambari / AMBARI-24638

Ambari-agent process memory leak

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version: 2.7.0
    • Fix Version: 2.7.3
    • None

    Description

      One ambari-agent process began consuming memory rapidly at a certain point and grew to ~27 GB of RSS before we eventually restarted it. This happened after about a month of running 10 ambari-agent nodes.

      [root@andrew2-1n01 ~]# ps aux | grep ambari_agent
      root 39955 0.0 0.0 47580 6024 ? S Aug17 0:00 /usr/bin/python /usr/lib/ambari-agent/lib/ambari_agent/AmbariAgent.py start
      root 39959 20.4 10.2 31623096 27154348 ? Sl Aug17 7645:55 /usr/bin/python /usr/lib/ambari-agent/lib/ambari_agent/main.py start

      Just before the memory usage started to grow, this exception appeared in the agent log:

      ERROR 2018-09-11 10:56:59,716 websocket.py:552 - Websocket connection was closed with an exception
      Traceback (most recent call last):
        File "/usr/lib/ambari-agent/lib/ambari_ws4py/websocket.py", line 549, in run
          if not self.once():
        File "/usr/lib/ambari-agent/lib/ambari_ws4py/websocket.py", line 428, in once
          if not self.process(self.buf[:requested]):
        File "/usr/lib/ambari-agent/lib/ambari_ws4py/websocket.py", line 483, in process
          self.reading_buffer_size = s.parser.send(bytes) or DEFAULT_READING_SIZE
      ValueError: generator already executing

      This exception does not appear on any other node, nor on this node at any other time during the month, so I suspect it is the root cause.
      The error means that the parser generator is being resumed by more than one thread at the same time. I will upload a fix that guards this call with a thread lock.
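To illustrate the failure mode described above, here is a minimal sketch. It is not the attached patch and not the ambari_ws4py code. All names in it (`make_parser`, `LockedParser`, `demo_unlocked`, `demo_locked`) are hypothetical. It assumes only standard CPython behaviour: calling `send()` on a generator that another thread is currently executing raises `ValueError: generator already executing`, and serializing the calls with a `threading.Lock` avoids that.

```python
import threading


def make_parser(inside, resume):
    """Return a primed generator that stands in for a stream parser (hypothetical)."""
    def parser():
        total = 0
        while True:
            chunk = yield total
            inside.set()      # a thread is now executing the generator body
            resume.wait(2)    # hold it there so a second caller can collide
            total += len(chunk)
    gen = parser()
    next(gen)  # advance to the first yield so send() can be used
    return gen


class LockedParser(object):
    """Serializes send() calls so only one thread resumes the generator at a time."""
    def __init__(self, gen):
        self._gen = gen
        self._lock = threading.Lock()

    def send(self, data):
        with self._lock:
            return self._gen.send(data)


def demo_unlocked():
    """Two threads call send() concurrently; return the resulting error text."""
    inside, resume = threading.Event(), threading.Event()
    gen = make_parser(inside, resume)
    worker = threading.Thread(target=gen.send, args=(b"abc",))
    worker.start()
    inside.wait(2)            # the worker is now inside the generator
    try:
        gen.send(b"de")       # second caller while the generator is running
        error = None
    except ValueError as exc:
        error = str(exc)
    finally:
        resume.set()
        worker.join()
    return error


def demo_locked():
    """Same scenario, but every send() goes through the lock."""
    inside, resume = threading.Event(), threading.Event()
    parser = LockedParser(make_parser(inside, resume))
    results = []
    worker = threading.Thread(target=lambda: results.append(parser.send(b"abc")))
    worker.start()
    inside.wait(2)
    threading.Timer(0.1, resume.set).start()  # let the worker finish shortly
    results.append(parser.send(b"de"))        # blocks on the lock until then
    worker.join()
    return sorted(results)


if __name__ == "__main__":
    print(demo_unlocked())
    print(demo_locked())
```

Whether a lock of this kind also stops the memory growth is a separate question that the sketch does not address. It only shows that the exception in the traceback is consistent with two threads driving the same generator.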

      This is a speculative fix that may or may not work, and there is no practical way to test it. It is still worth trying.

      This was also observed in ambari-2.7.1.0-73.

      Attachments

        1. AMBARI-24638.patch
          0.7 kB
          Andrew Onischuk
        2. AMBARI-24638.patch
          0.7 kB
          Andrew Onischuk

          People

            Assignee: Andrew Onischuk (aonishuk)
            Reporter: Andrew Onischuk (aonishuk)
            Votes: 0
            Watchers: 2

            Dates

              Created:
              Updated:
              Resolved:

              Time Tracking

                Estimated: Not Specified
                Remaining: 0h
                Logged: 1h 20m