Solr / SOLR-3658

SolrCmdDistributor can briefly create spikes of threads in the thousands.

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 4.0, 6.0
    • Component/s: SolrCloud
    • Labels:
      None

      Description

        Activity

        Mark Miller added a comment -

        SolrCmdDistributor uses a thread executor with no bound on the number of threads it can create. A commit or heavy activity on a replica can cause a small backup that leads to spikes of threads on the leader. The thread count tends to spike and then come back down. Since each thread consumes memory, this can be very costly or cause an OOME (unable to create new native thread).

        One possible workaround is to lower the thread stack size. On many 64-bit platforms it defaults to 1 MB; -Xss128k might be enough and might improve the situation. It's not a solution, though.
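
        The -Xss flag changes the default stack size for every thread in the JVM. As a narrower illustration of the same idea, the sketch below requests a stack size for the threads of a single pool through a thread factory. It is not part of the Solr patch: the class name SmallStackThreadFactory and the thread name are hypothetical, and the JVM is free to round the requested size up or ignore it entirely.

```java
import java.util.concurrent.ThreadFactory;

// Hypothetical illustration, not Solr code: a factory that asks the JVM
// for a specific stack size on each thread it creates.
class SmallStackThreadFactory implements ThreadFactory {
    private final long stackSizeBytes;

    SmallStackThreadFactory(long stackSizeBytes) {
        this.stackSizeBytes = stackSizeBytes;
    }

    @Override
    public Thread newThread(Runnable r) {
        // Thread(ThreadGroup, Runnable, String, long stackSize):
        // the stack size is only a suggestion to the JVM.
        Thread t = new Thread(null, r, "small-stack-worker", stackSizeBytes);
        t.setDaemon(true);
        return t;
    }
}
```

        Such a factory could be passed to an executor's constructor. As noted in the comment above, a smaller stack only reduces the cost per thread; it does not bound how many threads get created.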

        As a fix, after discussing with Yonik, I've been working on limiting the number of threads to 8 x the number of hosts in the cluster as a start. Perhaps it will be overridable with config.
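
        A minimal sketch of what capping an executor at 8 x hosts can look like with the standard java.util.concurrent classes. This is an illustration only, not the patch attached to this issue (a later comment notes that the first approach was replaced). The class BoundedPoolSketch, its method names, the queue size, and the choice of CallerRunsPolicy are all assumptions made for the example.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

// Hypothetical helper, not Solr code: a pool that never grows past hosts * 8.
class BoundedPoolSketch {
    // Factor taken from the comment above (8 threads per host).
    static final int THREADS_PER_HOST = 8;

    static int maxThreads(int hosts) {
        // Treat a host count below 1 as a single host so the pool is never empty.
        return Math.max(1, hosts) * THREADS_PER_HOST;
    }

    static ThreadPoolExecutor newBoundedPool(int hosts) {
        int max = maxThreads(hosts);
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                max, max,                          // core == max: a hard upper bound
                60L, TimeUnit.SECONDS,             // idle threads exit after 60 seconds
                new ArrayBlockingQueue<Runnable>(max), // bounded queue (size is an assumption)
                new ThreadPoolExecutor.CallerRunsPolicy()); // when full, the submitter runs the task
        pool.allowCoreThreadTimeOut(true);         // let the pool shrink back after a spike
        return pool;
    }
}
```

        With CallerRunsPolicy, a full pool slows the submitting thread down instead of creating more threads or dropping work. That back-pressure is one possible design choice, and it can cost throughput on the leader.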

        Mark Miller added a comment -

        Here is an initial patch. It limits the total number of threads to hosts*8 (8 is the max connections per host we currently allow anyhow). Just an initial stab, but it seems to test out all right so far.

        Yonik Seeley added a comment - (edited)

        Another thing to keep in mind memory-wise is that each thread is associated with a number of outgoing update requests and other state, which in a realistic system probably takes up more memory than the thread stack.

        edit: Another thing to note is that if Java thread stacks are treated the same as C stacks (and I believe they are), then the max stack size has nothing to do with how much real RAM the stack uses (it's expanded as needed by mapping real RAM into the address space via the MMU). In 32-bit programs, it's still a resource you can easily run out of, of course.

        Mark Miller added a comment -

        Yeah, I'm not sure of the nitty-gritty, but on a couple of sites I found, people were experimenting with just firing up threads (that were not really doing anything) to see how many they could create, and they could fire up far more before running out of RAM if they used a low Xss.

        Mark Miller added a comment -

        (although I don't know if they were just using 32-bit or what)

        Mark Miller added a comment -

        I'm going to commit this as a start - it seems to test out fine so far - we can change or tweak as necessary after.

        Mark Miller added a comment -

        I committed what we have so far.

        Mark Miller added a comment -

        There were some real problems with my previous solution. It somewhat worked by accident, but I think it probably really damaged performance.

        I just committed a new approach that has tested out nicely so far.

        Robert Muir added a comment -

        rmuir20120906-bulk-40-change

        Uwe Schindler added a comment -

        Closed after release.


          People

          • Assignee:
            Mark Miller
            Reporter:
            Mark Miller
          • Votes:
            0
            Watchers:
            5
