Details

    • Type: Bug
    • Status: Reopened
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 4.0-ALPHA
    • Fix Version/s: 4.9, 5.0
    • Component/s: general/build
    • Labels:
      None
    • Lucene Fields:
      New

      Description

      The nightly builds are taking 4-7 hours.

      This is caused by a few bad apples (as can be seen at https://builds.apache.org/job/Lucene-trunk/1896/testReport/).

      The top 5 are (all in analysis):

      • TestSynonymMapFilter: 1 hr 54 min
      • TestRandomChains: 1 hr 22 min
      • TestRemoveDuplicatesTokenFilter: 32 min
      • TestMappingCharFilter: 28 min
      • TestWordDelimiterFilter: 22 min

      So that's about 4.5 hours right there for that run...

        Activity

        Robert Muir created issue -
        Robert Muir added a comment -

        Patch, removing n^2 growth in these tests, and some other tuning of atLeast.

        In general, when tests like this hog the cpu for so long, we lose coverage overall.

        I'll keep an eye on the nightlies for other cpu-hogs.

        Here are the new timings for analyzers/ tests after the patch.

        'ant test' with no multiplier:

        BUILD SUCCESSFUL
        Total time: 1 minute 28 seconds
        

        'ant test -Dtests.multiplier=3 -Dtests.nightly=true'

        BUILD SUCCESSFUL
        Total time: 3 minutes 15 seconds
        
        Robert Muir made changes -
        Field Original Value New Value
        Attachment LUCENE-3994.patch [ 12522972 ]
        Robert Muir made changes -
        Status Open [ 1 ] Resolved [ 5 ]
        Fix Version/s 4.0 [ 12314025 ]
        Resolution Fixed [ 1 ]
        Dawid Weiss added a comment -

        You could also update statistics: remove the previous ones, run two or three times, then update.

        Alternatively, we could have Jenkins update stats and fetch these from time to time.

        Robert Muir added a comment -

        I think statistics are mostly useless for nightly builds, since we pass huge multipliers and such?

        If anything, this issue did more for the stats than any stats update could do, as these tests
        now grow linearly instead of quadratically with the multiplier...
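
The difference between the two growth rates can be illustrated with a minimal sketch. This is not code from the patch: it assumes a test whose iteration count and per-iteration input length were both scaled by the multiplier, and the class name, method names, and base constants are all hypothetical.

```java
// Hypothetical illustration only; not taken from the LUCENE-3994 patch.
public class MultiplierScaling {
    static final int BASE_ITERATIONS = 100; // assumed base iteration count
    static final int BASE_LENGTH = 50;      // assumed base input length per iteration

    // Both the iteration count and the input length scale with the multiplier,
    // so the total work grows with the square of the multiplier.
    static long quadraticWork(int multiplier) {
        long iterations = (long) BASE_ITERATIONS * multiplier;
        long length = (long) BASE_LENGTH * multiplier;
        return iterations * length;
    }

    // Only the iteration count scales; the input length stays fixed,
    // so the total work grows linearly with the multiplier.
    static long linearWork(int multiplier) {
        long iterations = (long) BASE_ITERATIONS * multiplier;
        return iterations * BASE_LENGTH;
    }

    public static void main(String[] args) {
        for (int m : new int[] {1, 3, 10}) {
            System.out.println("multiplier=" + m
                + " quadratic=" + quadraticWork(m)
                + " linear=" + linearWork(m));
        }
    }
}
```

With a nightly multiplier of 3, the first variant does nine times the base work and the second does three times the base work.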

        Dawid Weiss added a comment -

        Ok. I'll recalculate them from time to time. There is a large variance in tests anyway (this can also be computed from log stats because we can keep a history of N runs... it'd be interesting to see which tests have the largest variance).
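
As a rough sketch of the variance idea, the following computes the per-test variance over a recorded history of runs and picks the test with the largest one. The class, the method names, the test names, and the timings are all made up for illustration; the real statistics format is not shown in this issue.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch: names and timings below are invented sample data.
public class TestTimeVariance {
    // Population variance of one test's run times (in seconds).
    static double variance(double[] seconds) {
        double mean = 0;
        for (double s : seconds) {
            mean += s;
        }
        mean /= seconds.length;
        double sumSq = 0;
        for (double s : seconds) {
            sumSq += (s - mean) * (s - mean);
        }
        return sumSq / seconds.length;
    }

    // Returns the name of the test whose run times vary the most.
    static String mostVariable(Map<String, double[]> history) {
        String worst = null;
        double worstVariance = -1;
        for (Map.Entry<String, double[]> e : history.entrySet()) {
            double v = variance(e.getValue());
            if (v > worstVariance) {
                worstVariance = v;
                worst = e.getKey();
            }
        }
        return worst;
    }

    // Invented history of N = 4 runs for two hypothetical tests.
    static Map<String, double[]> sampleHistory() {
        Map<String, double[]> history = new LinkedHashMap<>();
        history.put("TestA", new double[] {10, 11, 10, 11});
        history.put("TestB", new double[] {5, 60, 8, 120});
        return history;
    }

    public static void main(String[] args) {
        System.out.println("largest variance: " + mostVariable(sampleHistory()));
    }
}
```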

        Robert Muir added a comment -

        Another thing I toned down here was the multithreaded testing in BaseTokenStreamTestCase.
        There is something OS-specific about FreeBSD's Java that causes this to take a lot more time
        than locally... that's why analysis tests take so long in nightly builds (especially with the n^2!)

        Robert Muir added a comment -

        There is a large variance in tests anyway

        Like this? https://builds.apache.org/job/Lucene-trunk/1896/testReport/org.apache.lucene.index/TestIndexWriterReader/history/

        Robert Muir added a comment -

        This still occurs.

        I profiled the slow tests and found it's because of the huge 1GB linedocs file. The problem is that opening this 1GB gzipped file and seeking to a random place (which is what LineDocs does) is really costly.

        so all the time is spent in GZIPInputStream.inflateBytes!

        I will temporarily disable the huge file for nightly builds.
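
The cost comes from the gzip format having no index: reaching a given uncompressed offset means inflating everything before it. A minimal sketch of that behavior follows. It works on in-memory data rather than the real linedocs file, and the class and method names are hypothetical; it is not how LineDocs itself is written.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical sketch; uses in-memory data instead of a 1GB file.
public class GzipSeekCost {
    // Builds deterministic sample content.
    static byte[] sampleData(int size) {
        byte[] data = new byte[size];
        for (int i = 0; i < size; i++) {
            data[i] = (byte) (i * 31);
        }
        return data;
    }

    // Compresses the data with gzip.
    static byte[] gzip(byte[] data) {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        try (GZIPOutputStream out = new GZIPOutputStream(bos)) {
            out.write(data);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bos.toByteArray();
    }

    // Reads the byte at 'offset' of the uncompressed content.
    // skip() on a GZIPInputStream cannot jump: it inflates and discards
    // every byte before 'offset', so the cost grows with the offset.
    static int byteAtViaGzip(byte[] gzipped, long offset) {
        try (InputStream in = new GZIPInputStream(new ByteArrayInputStream(gzipped))) {
            long remaining = offset;
            while (remaining > 0) {
                long skipped = in.skip(remaining);
                if (skipped <= 0) {
                    // No progress from skip(); try a single read to detect end of stream.
                    if (in.read() < 0) {
                        return -1;
                    }
                    skipped = 1;
                }
                remaining -= skipped;
            }
            return in.read();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // With uncompressed content the same access is a direct index
    // (the in-memory analogue of a file seek).
    static int byteAtDirect(byte[] data, int offset) {
        return offset < data.length ? (data[offset] & 0xFF) : -1;
    }

    public static void main(String[] args) {
        byte[] data = sampleData(100_000);
        byte[] gz = gzip(data);
        System.out.println(byteAtViaGzip(gz, 77_777) == byteAtDirect(data, 77_777));
    }
}
```

Both paths return the same byte, but the gzip path has to decompress all preceding content on every random access.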

        Robert Muir made changes -
        Resolution Fixed [ 1 ]
        Status Resolved [ 5 ] Reopened [ 4 ]
        rmuir committed 1327499 (1 file)
        Reviews: none

        LUCENE-3994: disable huge linedocs file until something else is figured out, so nightly builds don't take hours

        Robert Muir added a comment -

        I'll leave the issue open, until we get the next nightly done, but this was pretty difficult to debug:

        Jenkins test time is now a total lie! I think it's the clover time?

        Have a look at last night's build: https://builds.apache.org/job/Lucene-Trunk/1898/
        The entire build took 5 hours, yet it says tests took only 47 minutes: https://builds.apache.org/job/Lucene-Trunk/1898/testReport/

        Looking at the console you can see this is not the case:

        Actual tests:

        BUILD SUCCESSFUL
        Total time: 225 minutes 56 seconds
        

        Clovered tests:

        BUILD SUCCESSFUL
        Total time: 54 minutes 31 seconds
        

        It's possible I screwed this up with the nightly build changes from LUCENE-3965. I'll investigate.

        Dawid Weiss added a comment -

        I've already fixed that per-suite constant suite randomization in GitHub, but I'll need some time to push to Maven Central, etc.

        Robert Muir added a comment -

        Thanks Dawid, I am looking forward to that!

        Michael McCandless added a comment -

        so all the time is spent in GZIPInputStream.inflateBytes!

        Ugh, nice find Robert!

        I think for nightly hudson we should just pre-gunzip the file?

        I was also curious if this is substantially slowing down tests for the checked-in lines file ... it's much smaller so much less seeking. I ran a few tests (ran all lucene tests, using the python runner, with compressed vs uncompressed) and it seems to be in the noise...
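
A minimal sketch of the pre-gunzip idea, under stated assumptions: the file is decompressed once up front, after which a random offset can be reached with a real seek. The class, the method names, and the temp-file demo are hypothetical; this is not the change that was committed for the nightly job.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.RandomAccessFile;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical sketch of decompressing once, then seeking cheaply.
public class PreGunzip {
    // Decompresses 'gz' into 'plain' a single time.
    static void gunzip(Path gz, Path plain) {
        try (InputStream in = new GZIPInputStream(Files.newInputStream(gz))) {
            Files.copy(in, plain, StandardCopyOption.REPLACE_EXISTING);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Jumps straight to 'offset' in the uncompressed file and reads one byte.
    static int byteAt(Path plain, long offset) {
        try (RandomAccessFile raf = new RandomAccessFile(plain.toFile(), "r")) {
            raf.seek(offset);
            return raf.read();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Demo on temporary files: byte i of the sample content is i % 251.
    static int demo(int size, long offset) {
        try {
            Path gz = Files.createTempFile("linedocs-demo", ".gz");
            Path plain = Files.createTempFile("linedocs-demo", ".txt");
            try {
                byte[] data = new byte[size];
                for (int i = 0; i < size; i++) {
                    data[i] = (byte) (i % 251);
                }
                try (OutputStream out = new GZIPOutputStream(Files.newOutputStream(gz))) {
                    out.write(data);
                }
                gunzip(gz, plain);
                return byteAt(plain, offset);
            } finally {
                Files.deleteIfExists(gz);
                Files.deleteIfExists(plain);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo(10_000, 1_000));
    }
}
```

The trade-off is disk space for the uncompressed copy in exchange for seeks that no longer depend on the offset.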

        Robert Muir made changes -
        Fix Version/s 4.1 [ 12321140 ]
        Fix Version/s 4.0 [ 12314025 ]
        Michael McCandless committed 1330485 (2 files)
        Reviews: none

        LUCENE-3994: pre-gunzip the nightly line docs file so the random seeking for tests is low cost

        Mark Miller made changes -
        Fix Version/s 5.0 [ 12321663 ]
        Mark Miller made changes -
        Fix Version/s 4.2 [ 12323899 ]
        Fix Version/s 4.1 [ 12321140 ]
        Robert Muir made changes -
        Fix Version/s 4.3 [ 12324143 ]
        Fix Version/s 5.0 [ 12321663 ]
        Fix Version/s 4.2 [ 12323899 ]
        Uwe Schindler made changes -
        Fix Version/s 4.4 [ 12324323 ]
        Fix Version/s 4.3 [ 12324143 ]
        Steve Rowe added a comment -

        Bulk move 4.4 issues to 4.5 and 5.0

        Steve Rowe made changes -
        Fix Version/s 5.0 [ 12321663 ]
        Fix Version/s 4.5 [ 12324742 ]
        Fix Version/s 4.4 [ 12324323 ]
        Adrien Grand made changes -
        Fix Version/s 4.6 [ 12324999 ]
        Fix Version/s 5.0 [ 12321663 ]
        Fix Version/s 4.5 [ 12324742 ]
        Simon Willnauer made changes -
        Fix Version/s 4.7 [ 12325572 ]
        Fix Version/s 4.6 [ 12324999 ]
        David Smiley made changes -
        Fix Version/s 4.8 [ 12326269 ]
        Fix Version/s 4.7 [ 12325572 ]
        Uwe Schindler added a comment -

        Move issue to Lucene 4.9.

        Uwe Schindler made changes -
        Fix Version/s 4.9 [ 12326730 ]
        Fix Version/s 5.0 [ 12321663 ]
        Fix Version/s 4.8 [ 12326269 ]

          People

          • Assignee:
            Unassigned
            Reporter:
            Robert Muir
          • Votes:
            0
            Watchers:
            0
