Lucene - Core
LUCENE-2143

Understand why NRT performance is affected by flush frequency

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Not a Problem
    • Affects Version/s: None
    • Fix Version/s: 4.1
    • Component/s: core/index
    • Labels:
      None
    • Lucene Fields:
      New

      Description

      In LUCENE-2061 (perf tests for NRT), I test NRT performance by first
      getting a baseline QPS with only searching, using enough threads to
      saturate.

      Then, I pick an indexing rate (I used 100 docs/sec), and index docs at
      that rate, and I also reopen a NRT reader at different frequencies
      (10/sec, 1/sec, every 5 seconds, etc.), and then again test QPS
      (saturated).

      I think this is a good approach for testing NRT – apps can see, as a
      function of "freshness" and at a fixed indexing rate, what the cost is
      to QPS. You'd expect that as the indexing rate and the freshness go up,
      QPS will go down.
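
      As a concrete (hedged) illustration, here is a rough single-threaded
      sketch of that kind of harness, written against the
      IndexWriter.getReader() / IndexReader.reopen() NRT API of the 2.9/3.x
      line; the field name "body", the 100 docs/sec rate and the one-second
      reopen period are only illustrative, and the real test drives searches
      from multiple threads until saturation:

        import java.io.File;
        import org.apache.lucene.analysis.standard.StandardAnalyzer;
        import org.apache.lucene.document.Document;
        import org.apache.lucene.document.Field;
        import org.apache.lucene.index.IndexReader;
        import org.apache.lucene.index.IndexWriter;
        import org.apache.lucene.index.Term;
        import org.apache.lucene.search.IndexSearcher;
        import org.apache.lucene.search.TermQuery;
        import org.apache.lucene.store.FSDirectory;
        import org.apache.lucene.util.Version;

        public class NrtCostSketch {
          public static void main(String[] args) throws Exception {
            FSDirectory dir = FSDirectory.open(new File(args[0]));
            IndexWriter writer = new IndexWriter(dir,
                new StandardAnalyzer(Version.LUCENE_30),
                IndexWriter.MaxFieldLength.UNLIMITED);

            final int docsPerSec = 100;        // fixed indexing rate
            final long reopenPeriodMs = 1000;  // "freshness": how often the NRT reader is reopened

            IndexReader nrtReader = writer.getReader();
            long nextReopen = System.currentTimeMillis() + reopenPeriodMs;
            long queries = 0;
            final long start = System.currentTimeMillis();

            while (System.currentTimeMillis() - start < 60000) {  // run for one minute
              // index one doc, then sleep the per-doc budget to hold the fixed rate
              Document doc = new Document();
              doc.add(new Field("body", "some wikipedia text",
                                Field.Store.NO, Field.Index.ANALYZED));
              writer.addDocument(doc);
              Thread.sleep(1000 / docsPerSec);

              // reopen the NRT reader whenever the freshness interval has elapsed
              if (System.currentTimeMillis() >= nextReopen) {
                IndexReader newReader = nrtReader.reopen();
                if (newReader != nrtReader) {
                  nrtReader.close();
                  nrtReader = newReader;
                }
                nextReopen += reopenPeriodMs;
              }

              // single-query stand-in for the saturated search threads of the real test
              IndexSearcher searcher = new IndexSearcher(nrtReader);
              searcher.search(new TermQuery(new Term("body", "1")), 10);
              queries++;
            }

            System.out.println("QPS ~= " + (queries * 1000.0 / (System.currentTimeMillis() - start)));
            nrtReader.close();
            writer.rollback();
            dir.close();
          }
        }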

      But I found something very strange: the low frequency reopen rates
      often caused a highish hit to QPS. When I forced IW to flush every
      100 docs (= once per second), the performance was generally much
      better.

      I actually would've expected the reverse – flushing in batch ought to
      use fewer resources.
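
      For reference, "forcing IW to flush every 100 docs" can be done with
      the IndexWriter setters of that era roughly like this (newer releases
      move these settings onto IndexWriterConfig); this is a sketch, not
      necessarily the exact configuration used in the test:

        // Flush a new segment every 100 buffered docs (~once per second at
        // 100 docs/sec), instead of letting the default 16 MB RAM buffer
        // batch many more docs into each flush.
        writer.setMaxBufferedDocs(100);
        writer.setRAMBufferSizeMB(IndexWriter.DISABLE_AUTO_FLUSH);  // flush by doc count only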

      One theory is something odd about my test env (based on OpenSolaris),
      so I'd like to retest on a more mainstream OS.

      I'm opening this issue to get to the bottom of it...

      Attachments

      1. SearchTest.java
        2 kB
        Michael McCandless

        Activity

        Jake Mannix added a comment -

        So why also, are you only picking one indexing rate? Should you not also be varying this from low, all the way up to saturating the system with maximum indexing rate, to properly stress the system?

        Michael McCandless added a comment -

        So why also, are you only picking one indexing rate? Should you not also be varying this from low, all the way up to saturating the system with maximum indexing rate, to properly stress the system?

        Yes, you're right – we need to see the full curve in general, when
        testing NRT. I already did that in LUCENE-2061 (see the first 3
        tables), testing saturated QPS at 1/10/100/1000 DPS, and reopen rate
        of every 0.1/1.0/5.0/10.0/30.0 seconds, and update vs add case.

        It's a 4d space that NRT tests should cover – freshness, QPS, DPS,
        add vs update; you fix 3 of them and saturate the other, to get the
        curve. The big question is how much you have to pay (lower QPS/DPS)
        for different levels of freshness.

        What was good about the tests in LUCENE-2061 is, as long as you flush
        once per second, you don't pay much for freshness, even down to 10X
        reopens per second, at least at the 100 docs/sec indexing rate. I
        think for most NRT apps, 100 msec freshness is more than enough.

        I'm focused on the 100 docs/sec indexing rate, for this issue, because
        that's able to reproduce this odd problem (though I imagine other
        DPS would as well).

        Michael McCandless added a comment -

        So, good news / bad news...

        The good news is I got a more mainstream test env (CentOS 5.4) online.

        The bad news is the strange anomalies when testing NRT still occur,
        and flushing every 100 docs does not work around them.

        But then the good news is, I managed to isolate the problem to the
        hotspot compiler: somehow, it consistently compiles Lucene's search
        code less efficiently (20-30% slower) depending on which test is being
        run, which basically makes it impossible to really test performance
        tradeoffs of NRT.

        I've attached a simple SearchTest.java that shows the hotspot issue.

        Run it like this:

        java SearchTest /path/to/index <warmMethod>
        

        I'm testing against a 5M doc Wikipedia index.

        The <warmMethod> can be:

        • "writer": open a writer, indexes docs, then rollback
        • "nrt": same as "writer", but periodically get an NRT reader
        • "reader": just open an IndexReader N times, then close it
        • "searcher": same as "reader", but do some searching against each
          opened reader
        • "none": do no warming

        After the warming, the test just runs a set of searches (TermQuery for
        terms 0, 1, 2 ... 9) 10 times, then prints the min time.

        I ran the tests on a 5M-doc Wikipedia index.
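
        A sketch of the measurement phase described above (not the attached
        SearchTest.java itself; the field name "body" is an assumption about
        the Wikipedia index):

        // Measurement phase (sketch): run TermQuery for terms "0".."9" over
        // the index 10 times and report the minimum pass time.
        static void measure(org.apache.lucene.store.Directory dir) throws Exception {
          IndexReader reader = IndexReader.open(dir, true);   // read-only reader
          IndexSearcher searcher = new IndexSearcher(reader);
          long best = Long.MAX_VALUE;
          for (int pass = 0; pass < 10; pass++) {
            long t0 = System.currentTimeMillis();
            for (int i = 0; i < 10; i++) {
              // field name "body" is an assumption about the index layout
              searcher.search(new TermQuery(new Term("body", Integer.toString(i))), 10);
            }
            best = Math.min(best, System.currentTimeMillis() - t0);
          }
          System.out.println("  " + best + " msec");
          reader.close();
        }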

        On nearly all JRE versions I've tested, on OpenSolaris 2009.06 &
        CentOS 5.4, warming with NRT causes a "permanent" loss of search
        performance of somewhere between 20-30%. E.g., here are my results on
        OpenSolaris:

        nrt...
          5718 msec
        searcher...
          4664 msec
        reader...
          4771 msec
        writer...
          4785 msec
        none...
          4839 msec
        

        On CentOS:

        nrt...
          4550 msec
        searcher...
          3760 msec
        reader...
          4730 msec
        writer...
          3780 msec
        none...
          3766 msec
        

        (In this case the "reader" warming also kicked hotspot into the slow
        mode... it seems to be intermittant because sometimes "reader" is
        fast)

        I run java as "java -server -Xms1g -Xmx1g"

        It's very odd... I suspect something buggy in hotspot, but I'm not
        sure how to isolate it. It seems to somehow kick itself into a state
        where it produces less optimal code for searching. And we're not
        talking about that many methods, on the hotspots for running
        TermQuery...

        I even printed out the assembly code (-XX:+PrintOptoAssembly) and it
        was very strange – e.g. even IndexInput.readVInt was compiled
        differently, if you warmed with "nrt" vs the others. I don't get it.
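
        For reference, the runs look roughly like this; note that
        -XX:+PrintOptoAssembly is typically only available on debug builds of
        HotSpot, and on a product JRE the closest equivalent is
        -XX:+UnlockDiagnosticVMOptions -XX:+PrintAssembly with the hsdis
        plugin installed:

        java -server -Xms1g -Xmx1g SearchTest /path/to/index nrt
        java -server -Xms1g -Xmx1g -XX:+PrintOptoAssembly SearchTest /path/to/index nrt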

        I'm trying to find a workaround that makes hotspot more manageable so
        I can run real tests....

        Michael McCandless added a comment -

        This is a hotspot issue ... not much we can do about it.

        Uwe Schindler added a comment -

        Closed after release.


          People

          • Assignee:
            Michael McCandless
          • Reporter:
            Michael McCandless
          • Votes:
            0
          • Watchers:
            3
