Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Duplicate
    • Affects Version/s: 1.5.0
    • Fix Version/s: None
    • Component/s: master
    • Labels:
      None

      Description

      This has been seen a few times. The master runs with Xms1g and Xmx4g, which should be more than enough. The most recent case had 44 nodes, 1.74k tablets, and 22 tables including !METADATA. There are NO conspicuous messages in the master log (just DefaultLoadBalancer messages for each table), with the possible exception of an error like "received invalid frame size of -..., are you using TTframeProtocol" (I can't remember the exact message). But then the master out file has a message about an OoM being received and kill -9.

      I don't really know how to get more information out of it for when this does occur again.

        Issue Links

          Activity

          Eric Newton added a comment -

          The master is regularly run over clusters with "hundreds and hundreds" of nodes, half a million tablets and 40+ tables, with as little as Xmx3g.

          Can you provide the logs? Especially the specific text of the OoM?

          William Slacum added a comment - edited

          Set the process to do a heap dump when it OoMs. http://stackoverflow.com/questions/542979/using-heapdumponoutofmemoryerror-parameter-for-heap-dump-for-jboss

          John Vines added a comment -

          Unfortunately, the default config doesn't show the OOM stack trace when it OOMs; you just get the OnOutOfMemory message about kill -9 PID. I've since added configuration for it to heap dump on OOM, but who knows when I can get it to occur again.
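
          For reference, the standard HotSpot options for this are -XX:+HeapDumpOnOutOfMemoryError and -XX:HeapDumpPath=<dir>. The exact configuration added for this ticket is not shown, so the following is only a sketch of how one might confirm, from inside a running HotSpot JVM, that those options took effect. The class name HeapDumpCheck and the helper vmOption are hypothetical and are not part of Accumulo.

```java
import com.sun.management.HotSpotDiagnosticMXBean;
import java.lang.management.ManagementFactory;

// Hypothetical helper, not part of Accumulo: reports whether the running
// HotSpot JVM was started with heap-dump-on-OOM enabled.
final class HeapDumpCheck {

    // Looks up the current value of a HotSpot VM option by name.
    // Throws IllegalArgumentException if the JVM does not know the option.
    static String vmOption(String name) {
        HotSpotDiagnosticMXBean bean =
            ManagementFactory.getPlatformMXBean(HotSpotDiagnosticMXBean.class);
        return bean.getVMOption(name).getValue();
    }

    public static void main(String[] args) {
        // "true" only if -XX:+HeapDumpOnOutOfMemoryError was passed
        System.out.println("HeapDumpOnOutOfMemoryError = "
            + vmOption("HeapDumpOnOutOfMemoryError"));
        // Empty unless -XX:HeapDumpPath=<dir> was passed
        System.out.println("HeapDumpPath = " + vmOption("HeapDumpPath"));
    }
}
```

          This only works on HotSpot-based JVMs, since it relies on the com.sun.management API.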

          John Vines added a comment -

          The description is in the ticket. Just 4 bytes sent to the master can easily bring it down. Setting the MAX_BUFFER_SIZE in Thrift limits how big the allocation can be. I will open a task to revisit these more thoroughly, but 2360 is a good interim fix.
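
          To illustrate why 4 bytes are enough: a framed transport reads a 4-byte big-endian length prefix and then allocates a buffer of that size. The sketch below is not Thrift's code; it only mimics that step under the assumption of a signed 32-bit length prefix. The class FrameSizeGuard and the cap MAX_FRAME_BYTES are hypothetical stand-ins for the kind of limit described above.

```java
import java.nio.ByteBuffer;

// Illustrative only: mimics how a 4-byte length prefix is decoded, and how a
// size cap stops a bogus prefix from triggering a huge allocation.
final class FrameSizeGuard {

    // Hypothetical cap chosen for this example; not a Thrift or Accumulo default.
    static final int MAX_FRAME_BYTES = 16 * 1024 * 1024;

    // Decodes the first 4 bytes as a big-endian signed 32-bit frame length.
    static int readFrameSize(byte[] header) {
        return ByteBuffer.wrap(header, 0, 4).getInt();
    }

    // True only if a buffer of this size may be allocated safely.
    static boolean isAcceptable(int frameSize) {
        return frameSize > 0 && frameSize <= MAX_FRAME_BYTES;
    }

    public static void main(String[] args) {
        // A non-Thrift client (for example, an HTTP request) starts with "GET ".
        byte[] stray = {'G', 'E', 'T', ' '};
        int size = readFrameSize(stray);
        System.out.println("decoded frame size: " + size);
        if (isAcceptable(size)) {
            byte[] frame = new byte[size]; // without the cap, this is the risky allocation
            System.out.println("allocated " + frame.length + " bytes");
        } else {
            System.out.println("rejected frame of " + size + " bytes");
        }
    }
}
```

          The bytes "GET " decode to 1195725856, so an unguarded reader would try to allocate more than 1 GB for a single frame. A few such connections are enough to exhaust a 4g heap.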


            People

            • Assignee:
              John Vines
              Reporter:
              John Vines
            • Votes:
              0
              Watchers:
              3

              Dates

              • Created:
                Updated:
                Resolved:
