Lucene - Core
  1. Lucene - Core
  2. LUCENE-3489

Refactor test classes that use assumeFalse(codec != SimpleText, Memory) to use new annotation and move the expensive methods to separate classes

    Details

    • Type: Test Test
    • Status: Open
    • Priority: Major Major
    • Resolution: Unresolved
    • Affects Version/s: 4.0-ALPHA
    • Fix Version/s: 4.9, 5.0
    • Component/s: general/test
    • Labels:
      None
    • Lucene Fields:
      New

      Description

      Folloup for LUCENE-3463.

      TODO:

      • Move test-methods that need the new @UseNoMemoryExpensiveCodec annotation to separate classes
      • Eliminate the assumeFalse-calls that check the current codec and disable the test if SimpleText or Memory is used
      1. LUCENE-3489.patch
        34 kB
        Robert Muir
      2. LUCENE-3489.patch
        33 kB
        Robert Muir
      3. LUCENE-3489.patch
        34 kB
        Robert Muir

        Issue Links

          Activity

          Hide
          Uwe Schindler added a comment -

          Move issue to Lucene 4.9.

          Show
          Uwe Schindler added a comment - Move issue to Lucene 4.9.
          Hide
          Steve Rowe added a comment -

          Bulk move 4.4 issues to 4.5 and 5.0

          Show
          Steve Rowe added a comment - Bulk move 4.4 issues to 4.5 and 5.0
          Hide
          Robert Muir added a comment -

          ok one last change, renamed to SuppressCodecs (it actually is not just funny, but better since it works the same way etc)

          Show
          Robert Muir added a comment - ok one last change, renamed to SuppressCodecs (it actually is not just funny, but better since it works the same way etc)
          Hide
          Robert Muir added a comment -

          updated patch using value[], much less wordy.

          I will commit soon.

          Show
          Robert Muir added a comment - updated patch using value[], much less wordy. I will commit soon.
          Hide
          Robert Muir added a comment -

          I agree, this is the main problem with the current patch. We should fix this before committing.

          Show
          Robert Muir added a comment - I agree, this is the main problem with the current patch. We should fix this before committing.
          Hide
          Uwe Schindler added a comment - - edited

          It's easy, just rename codecs to "String[] value" and you are done. After that you can use @AvoidCodecs("SimpleText") or @AvoidCodecs(

          {"SimpleText","Lucene3x"}

          )

          See: http://docs.oracle.com/javase/1.5.0/docs/api/java/lang/SuppressWarnings.html

          Show
          Uwe Schindler added a comment - - edited It's easy, just rename codecs to "String[] value" and you are done. After that you can use @AvoidCodecs("SimpleText") or @AvoidCodecs( {"SimpleText","Lucene3x"} ) See: http://docs.oracle.com/javase/1.5.0/docs/api/java/lang/SuppressWarnings.html
          Hide
          Uwe Schindler added a comment -

          I like the annotation. Can we maybe change it to look like @SuppressWarnings? So it does not need codecs={} or if there is only one codec, no {} at all? Should be not too hard?

          Otherwise strong +1!

          Show
          Uwe Schindler added a comment - I like the annotation. Can we maybe change it to look like @SuppressWarnings? So it does not need codecs={} or if there is only one codec, no {} at all? Should be not too hard? Otherwise strong +1!
          Hide
          Robert Muir added a comment -

          attached is a patch generalizing the UseNoExpensiveMemory annotation to @AvoidCodecs that takes a list of codecs to avoid.

          This way, tests that cannot work with Lucene3x codec can just avoid it, using another codec, rather than assuming (in general its bad that many of the tests of actual new functionality often dont run at all because of the current assumes)

          Show
          Robert Muir added a comment - attached is a patch generalizing the UseNoExpensiveMemory annotation to @AvoidCodecs that takes a list of codecs to avoid. This way, tests that cannot work with Lucene3x codec can just avoid it, using another codec, rather than assuming (in general its bad that many of the tests of actual new functionality often dont run at all because of the current assumes)
          Hide
          Robert Muir added a comment -

          One last thing, thinking thru the simplifications Dawid is looking at doing,
          and knowing how horrible the code currently is, we could consider trying some things like:

          • upgrade/fix our tests to work with latest junit? maybe there are less frustrations
          • contribute some of the more general things like assume(String message, xxx) to junit to get them out of our codebase?
          Show
          Robert Muir added a comment - One last thing, thinking thru the simplifications Dawid is looking at doing, and knowing how horrible the code currently is, we could consider trying some things like: upgrade/fix our tests to work with latest junit? maybe there are less frustrations contribute some of the more general things like assume(String message, xxx) to junit to get them out of our codebase?
          Hide
          Robert Muir added a comment -

          And just so you know, its not possible i could have used 4.8 here, because all of our tests fail with 4.8

          Thats because of breaks in the lifecycle of TestWatchMan (Its initialized before the @Before's in 4.8, not in 4.7).
          A separate problem, but just something to mention. currently you cannot use junit 4.8 with lucene's tests for this reason.

          Show
          Robert Muir added a comment - And just so you know, its not possible i could have used 4.8 here, because all of our tests fail with 4.8 Thats because of breaks in the lifecycle of TestWatchMan (Its initialized before the @Before's in 4.8, not in 4.7). A separate problem, but just something to mention. currently you cannot use junit 4.8 with lucene's tests for this reason.
          Hide
          Robert Muir added a comment -

          This is what's causing the problem (superclass impl. changing over time - I think you just hit two different junit versions in that issue).

          I disagree. I used the same junit version (4.7) myself in both eclipse and via ant to deal with this problem. It has nothing to do with that.

          The junit test lifecycle is really undefined just as I described, its unfortunate.

          Show
          Robert Muir added a comment - This is what's causing the problem (superclass impl. changing over time - I think you just hit two different junit versions in that issue). I disagree. I used the same junit version (4.7) myself in both eclipse and via ant to deal with this problem. It has nothing to do with that. The junit test lifecycle is really undefined just as I described, its unfortunate.
          Hide
          Dawid Weiss added a comment -

          Re-reading the above algorithm I think I'll make it clearer: my point is that you can write repeatable runner by starting from a single initial seed and assigning initial seeds to all execution start points (tests) regardless of whether they are executed or not (and how many times). Hope I'm a bit clear(er) now.

          Show
          Dawid Weiss added a comment - Re-reading the above algorithm I think I'll make it clearer: my point is that you can write repeatable runner by starting from a single initial seed and assigning initial seeds to all execution start points (tests) regardless of whether they are executed or not (and how many times). Hope I'm a bit clear(er) now.
          Hide
          Dawid Weiss added a comment -

          Thanks. I did go through the code, so I know where the seeds are used and had a pretty much good understanding as to why.

          As for the different lifecycle - this is weird, but isn't it a direct consequence of subclassing BlockJUnit4ClassRunner and relying on what it internally does? This is what's causing the problem (superclass impl. changing over time - I think you just hit two different junit versions in that issue).

          Perhaps I was wrong and a custom runner is indeed needed, but if so then I still think a single seed (logically) would be fine. A custom runner then:

          • collects methods to be executed (per-class)
          • initializes the global init seed/ random. This random becomes the initial randomness source for everything that follows to make it repeatable.
          • shuffles methods,
          • execute any @BeforeClass rules (see note below),
          • for each selected method (-Dtestmethod limits the selection and acts as a filter), repeat test.iter-times (seed changes predictably): {initialize per-method starting seed based on the current random, create a new test instance, execute}

            .

          • execute any @AfterClass rules

          The question how to randomize class-level fixtures could be answered by a static utility method that would return the per-class seed using ThreadLocal or a thread map updated by the runner. Still predictable and repeatable.

          I'll chew a bit on the possibilities and report back tomorrow.

          Show
          Dawid Weiss added a comment - Thanks. I did go through the code, so I know where the seeds are used and had a pretty much good understanding as to why. As for the different lifecycle - this is weird, but isn't it a direct consequence of subclassing BlockJUnit4ClassRunner and relying on what it internally does? This is what's causing the problem (superclass impl. changing over time - I think you just hit two different junit versions in that issue). Perhaps I was wrong and a custom runner is indeed needed, but if so then I still think a single seed (logically) would be fine. A custom runner then: collects methods to be executed (per-class) initializes the global init seed/ random. This random becomes the initial randomness source for everything that follows to make it repeatable. shuffles methods, execute any @BeforeClass rules (see note below), for each selected method (-Dtestmethod limits the selection and acts as a filter), repeat test.iter-times (seed changes predictably): {initialize per-method starting seed based on the current random, create a new test instance, execute} . execute any @AfterClass rules The question how to randomize class-level fixtures could be answered by a static utility method that would return the per-class seed using ThreadLocal or a thread map updated by the runner. Still predictable and repeatable. I'll chew a bit on the possibilities and report back tomorrow.
          Hide
          Robert Muir added a comment -

          The need for the runner seed is explained in LUCENE-3362. One problem is the "Test lifecyle" of junit is ill-defined, it depends on how you are running tests!

          The problem is that via ant, tests work like this (e.g. for 3 test classes):
          computeTestMethods
          beforeClass
          afterClass
          computeTestMethods
          beforeClass
          AfterClass
          computeTestMethods
          beforeClass
          afterClass
          
          but via an IDE or maven, if you run it from a folder like you did, then it does this:
          computeTestMethods
          computeTestMethods
          computeTestMethods
          beforeClass
          afterClass
          beforeClass
          afterClass
          beforeClass
          afterClass 
          
          Show
          Robert Muir added a comment - The need for the runner seed is explained in LUCENE-3362 . One problem is the "Test lifecyle" of junit is ill-defined, it depends on how you are running tests! The problem is that via ant, tests work like this (e.g. for 3 test classes): computeTestMethods beforeClass afterClass computeTestMethods beforeClass AfterClass computeTestMethods beforeClass afterClass but via an IDE or maven, if you run it from a folder like you did, then it does this: computeTestMethods computeTestMethods computeTestMethods beforeClass afterClass beforeClass afterClass beforeClass afterClass
          Hide
          Uwe Schindler added a comment - - edited

          Yeah, I figured that you want to keep it compact. These may be compatible because there's nothing forbidding us to keep LuceneTestCase as a base class (descending from Assert and providing Lucene-related infrastructure).

          Yes, I just wanted to mention this.

          The other stuff in LuceneTestRunner is just to work around some limitations in JUnit's @BeforeClass: @BeforeClass does not pass the Class object to the annotated method, and you cannot find out which child class is initialized. So checking for annotations on the implementation class from the abstract LuceneTestCase base class does not work.

          Oh, by the way – is there any particular reason for so many things to be static (class level)? I get these are fixtures reused by tests but would people scream if they were object-level fixtures rather than class-level fixtures? It'd make things a bit easier... starting with the need for a single initial seed, for example.

          The reason is simple: We want those per test-class lifetime, but JUnit allocates a new class instance for each test method. And lot's of Lucene tests use @BeforeClass to produce indexes (random) static indexes, then used by all test methods in a read-only way. Currently we have 3 seeds, one for class-level stuff, one for instance stuff and a third one for the runner (according to Mister Robert Muir: LUCENE-3362).

          The randoms must therefore be static and initialized in @BeforeClass.

          Show
          Uwe Schindler added a comment - - edited Yeah, I figured that you want to keep it compact. These may be compatible because there's nothing forbidding us to keep LuceneTestCase as a base class (descending from Assert and providing Lucene-related infrastructure). Yes, I just wanted to mention this. The other stuff in LuceneTestRunner is just to work around some limitations in JUnit's @BeforeClass: @BeforeClass does not pass the Class object to the annotated method, and you cannot find out which child class is initialized. So checking for annotations on the implementation class from the abstract LuceneTestCase base class does not work. Oh, by the way – is there any particular reason for so many things to be static (class level)? I get these are fixtures reused by tests but would people scream if they were object-level fixtures rather than class-level fixtures? It'd make things a bit easier... starting with the need for a single initial seed, for example. The reason is simple: We want those per test-class lifetime, but JUnit allocates a new class instance for each test method. And lot's of Lucene tests use @BeforeClass to produce indexes (random) static indexes, then used by all test methods in a read-only way. Currently we have 3 seeds, one for class-level stuff, one for instance stuff and a third one for the runner (according to Mister Robert Muir : LUCENE-3362 ). The randoms must therefore be static and initialized in @BeforeClass.
          Hide
          Robert Muir added a comment -

          Oh, by the way – is there any particular reason for so many things to be static (class level)? I get these are fixtures reused by tests but would people scream if they were object-level fixtures rather than class-level fixtures? It'd make things a bit easier... starting with the need for a single initial seed, for example.

          why we have the different seeds:
          One thing we do is support running a test class (test1(), test2(), test3()). If test2() fails, we want to be able to just run that method and reproduce it.
          So we allow you to specify -Dtestmethod to only run a single method.

          At the same time, we want to support doing things like creating indexes in beforeClass() and afterClass() for efficient tests.
          We also support -Dtests.iter, where you run a single test method over and over... this is often convenient. If we only had 1 class-level seed, this would
          be useless as it would just do the same thing over and over!

          So the need for multiple seeds comes from the fact that some things are random at "class-level" and some things are at "method level".
          If you look at the 3 parts to the random seed, its really part1:part2:part3,

          • part1 = class seed
          • part2 = method seed
          • part3 = runner seed (this is needed for consistent randomization of test methods)
          Show
          Robert Muir added a comment - Oh, by the way – is there any particular reason for so many things to be static (class level)? I get these are fixtures reused by tests but would people scream if they were object-level fixtures rather than class-level fixtures? It'd make things a bit easier... starting with the need for a single initial seed, for example. why we have the different seeds: One thing we do is support running a test class (test1(), test2(), test3()). If test2() fails, we want to be able to just run that method and reproduce it. So we allow you to specify -Dtestmethod to only run a single method. At the same time, we want to support doing things like creating indexes in beforeClass() and afterClass() for efficient tests. We also support -Dtests.iter, where you run a single test method over and over... this is often convenient. If we only had 1 class-level seed, this would be useless as it would just do the same thing over and over! So the need for multiple seeds comes from the fact that some things are random at "class-level" and some things are at "method level". If you look at the 3 parts to the random seed, its really part1:part2:part3, part1 = class seed part2 = method seed part3 = runner seed (this is needed for consistent randomization of test methods)
          Hide
          Dawid Weiss added a comment -

          Yeah, I figured that you want to keep it compact. These may be compatible because there's nothing forbidding us to keep LuceneTestCase as a base class (descending from Assert and providing Lucene-related infrastructure). I'm just trying to push all the randomization (seed initialization, reproducibility, thread controls) out of LuceneTestCase and into something more generic. So far it looks good to my eyes, but I'll be looking forward to your strict German opinion, Uwe

          Oh, by the way – is there any particular reason for so many things to be static (class level)? I get these are fixtures reused by tests but would people scream if they were object-level fixtures rather than class-level fixtures? It'd make things a bit easier... starting with the need for a single initial seed, for example.

          Show
          Dawid Weiss added a comment - Yeah, I figured that you want to keep it compact. These may be compatible because there's nothing forbidding us to keep LuceneTestCase as a base class (descending from Assert and providing Lucene-related infrastructure). I'm just trying to push all the randomization (seed initialization, reproducibility, thread controls) out of LuceneTestCase and into something more generic. So far it looks good to my eyes, but I'll be looking forward to your strict German opinion, Uwe Oh, by the way – is there any particular reason for so many things to be static (class level)? I get these are fixtures reused by tests but would people scream if they were object-level fixtures rather than class-level fixtures? It'd make things a bit easier... starting with the need for a single initial seed, for example.
          Hide
          Uwe Schindler added a comment -

          (for example it's based on JUnit @Rules only, not on a custom runner/ base abstract class)

          This is a nice idea, although the reason for the LuceneTestCaseRunner and the abstract base class is more because we hate @Test annotations and dont want to add thousands of stupid Assert-includes (having this is abstract base class is more convenient). Just to mention UweSays on twitter: @UweSays

          Show
          Uwe Schindler added a comment - (for example it's based on JUnit @Rules only, not on a custom runner/ base abstract class) This is a nice idea, although the reason for the LuceneTestCaseRunner and the abstract base class is more because we hate @Test annotations and dont want to add thousands of stupid Assert-includes (having this is abstract base class is more convenient). Just to mention UweSays on twitter: @UweSays
          Hide
          Dawid Weiss added a comment -

          While working on my presentation I've been trying to generalize the concept of "randomized tests". There's definitely a lot of great concepts here, but they're closely coupled with Lucene and the rest of the Solr/Lucene infrastructure.

          I have a draft code of a RandomizedTesting framework that provides very much similar functionality, although in a slightly different technical way (for example it's based on JUnit @Rules only, not on a custom runner/ base abstract class). I would really like you to peek at this and, perhaps with some effort, generalize the concepts in the test framework instead of introducing more Lucene-specific annotations.

          I'll publish the code tomorrow (it still needs some, ehm, polishing) along with some thoughts that I had about the current code in LuceneTestCase/Runner.

          I'd really like this to evolve into something of a stand-alone project (even if bundled with Lucene) so that other project can benefit without necessarily rely on the rest of Lucene code. We're already using a somewhat decoupled code internally and making it really cross-project applicable is a great way of proving these concepts are generally useful.

          Until tomorrow?

          Show
          Dawid Weiss added a comment - While working on my presentation I've been trying to generalize the concept of "randomized tests". There's definitely a lot of great concepts here, but they're closely coupled with Lucene and the rest of the Solr/Lucene infrastructure. I have a draft code of a RandomizedTesting framework that provides very much similar functionality, although in a slightly different technical way (for example it's based on JUnit @Rules only, not on a custom runner/ base abstract class). I would really like you to peek at this and, perhaps with some effort, generalize the concepts in the test framework instead of introducing more Lucene-specific annotations. I'll publish the code tomorrow (it still needs some, ehm, polishing) along with some thoughts that I had about the current code in LuceneTestCase/Runner. I'd really like this to evolve into something of a stand-alone project (even if bundled with Lucene) so that other project can benefit without necessarily rely on the rest of Lucene code. We're already using a somewhat decoupled code internally and making it really cross-project applicable is a great way of proving these concepts are generally useful. Until tomorrow?
          Hide
          Uwe Schindler added a comment -

          Nice idea, this can be easily transformed to a annotation with param! Of course it would be per-class, too.

          Show
          Uwe Schindler added a comment - Nice idea, this can be easily transformed to a annotation with param! Of course it would be per-class, too.
          Hide
          Robert Muir added a comment -

          you know, another similar issue we have is tests that assumeFalse(codec != PreFlex), because of things like new index statistics or byte terms or other features that it doesn't support.

          Maybe there is some way we could generalize the annotation?

          something like @AvoidCodecs("SimpleText", "Memory"), @AvoidCodecs("PreFlex"), and this set would be handled like the boolean today?

          Show
          Robert Muir added a comment - you know, another similar issue we have is tests that assumeFalse(codec != PreFlex), because of things like new index statistics or byte terms or other features that it doesn't support. Maybe there is some way we could generalize the annotation? something like @AvoidCodecs("SimpleText", "Memory"), @AvoidCodecs("PreFlex"), and this set would be handled like the boolean today?

            People

            • Assignee:
              Unassigned
              Reporter:
              Uwe Schindler
            • Votes:
              0 Vote for this issue
              Watchers:
              0 Start watching this issue

              Dates

              • Created:
                Updated:

                Development