Details

    • Type: Improvement
    • Status: Closed
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 1.3.0
    • Component/s: Tests
    • Labels:
      None

      Description

      Now we have test which run the same code 4 times, every run 17+ seconds.

      Need do small refactoring and remove duplicated code.

        Issue Links

          Activity

          Hide
          githubbot ASF GitHub Bot added a comment -

          Github user StephanEwen commented on the issue:

          https://github.com/apache/flink/pull/3089

          Thanks you for the effort to increase testing performance.

          I think we have to look very closely here, because it may easily decrease the test coverage. I think there were subtle differences between the tests concerning reusability of result holder objects.
          The hash table classes are at the core of many DataSet operations and had subtle bugs before that we caught and fixed by massively expanding test coverage. We must absolutely preserve that.

          That being said, removing exact duplicates makes total sense - we simply need to double check that these are in fact exact duplicates and not fuzzy duplicates.

          Show
          githubbot ASF GitHub Bot added a comment - Github user StephanEwen commented on the issue: https://github.com/apache/flink/pull/3089 Thanks you for the effort to increase testing performance. I think we have to look very closely here, because it may easily decrease the test coverage. I think there were subtle differences between the tests concerning reusability of result holder objects. The hash table classes are at the core of many DataSet operations and had subtle bugs before that we caught and fixed by massively expanding test coverage. We must absolutely preserve that. That being said, removing exact duplicates makes total sense - we simply need to double check that these are in fact exact duplicates and not fuzzy duplicates.
          Hide
          githubbot ASF GitHub Bot added a comment -

          Github user xhumanoid commented on the issue:

          https://github.com/apache/flink/pull/3089

          @StephanEwen
          Hi, I did check more than double time

          This two part of code fully equals:

          ```java
          Tuple2<Integer, Integer> record;
          final Tuple2<Integer, Integer> recordReuse = new Tuple2<>();

          if (record = buildSide.next(recordReuse) != null )

          { .. }
          while (record = buildSide.next(record) != null ) { .. }


          ```

          ```java
          Tuple2<Integer, Integer> record;
          final Tuple2<Integer, Integer> recordReuse = new Tuple2<>();

          if (record = buildSide.next(recordReuse) != null )

          { .. }
          while (record = buildSide.next(recordReuse) != null ) { .. }

          ```

          after `if` record and recordReuse are pointing to one object.

          Other point it's comment:
          'This test is basically identical to the "testSpillingHashJoinWithMassiveCollisions" test, only that the number',
          but when @aljoscha did refactoring (2 years ago) he replaced record => recordReuse for one function (both place), but for second function replaced only in one place.

          p.s. i spent evening until unraveled this code from first commits =\

          Show
          githubbot ASF GitHub Bot added a comment - Github user xhumanoid commented on the issue: https://github.com/apache/flink/pull/3089 @StephanEwen Hi, I did check more than double time This two part of code fully equals: ```java Tuple2<Integer, Integer> record; final Tuple2<Integer, Integer> recordReuse = new Tuple2<>(); if (record = buildSide.next(recordReuse) != null ) { .. } while (record = buildSide.next(record) != null ) { .. } ``` ```java Tuple2<Integer, Integer> record; final Tuple2<Integer, Integer> recordReuse = new Tuple2<>(); if (record = buildSide.next(recordReuse) != null ) { .. } while (record = buildSide.next(recordReuse) != null ) { .. } ``` after `if` record and recordReuse are pointing to one object. Other point it's comment: 'This test is basically identical to the "testSpillingHashJoinWithMassiveCollisions" test, only that the number', but when @aljoscha did refactoring (2 years ago) he replaced record => recordReuse for one function (both place), but for second function replaced only in one place. p.s. i spent evening until unraveled this code from first commits =\
          Hide
          githubbot ASF GitHub Bot added a comment -

          Github user StephanEwen commented on the issue:

          https://github.com/apache/flink/pull/3089

          Thanks!
          I will try to get to this in the remainder of the week.
          Speeding up builds is a valuable thing

          Show
          githubbot ASF GitHub Bot added a comment - Github user StephanEwen commented on the issue: https://github.com/apache/flink/pull/3089 Thanks! I will try to get to this in the remainder of the week. Speeding up builds is a valuable thing
          Hide
          githubbot ASF GitHub Bot added a comment -

          Github user StephanEwen commented on the issue:

          https://github.com/apache/flink/pull/3089

          Okay, I finally found the time to double check this.

          The changes are good, merging this...

          Show
          githubbot ASF GitHub Bot added a comment - Github user StephanEwen commented on the issue: https://github.com/apache/flink/pull/3089 Okay, I finally found the time to double check this. The changes are good, merging this...
          Hide
          githubbot ASF GitHub Bot added a comment -

          Github user StephanEwen commented on the issue:

          https://github.com/apache/flink/pull/3089

          Thank you for fixing this!

          Show
          githubbot ASF GitHub Bot added a comment - Github user StephanEwen commented on the issue: https://github.com/apache/flink/pull/3089 Thank you for fixing this!
          Hide
          githubbot ASF GitHub Bot added a comment -

          Github user asfgit closed the pull request at:

          https://github.com/apache/flink/pull/3089

          Show
          githubbot ASF GitHub Bot added a comment - Github user asfgit closed the pull request at: https://github.com/apache/flink/pull/3089
          Hide
          StephanEwen Stephan Ewen added a comment -

          Fixed via 53134594644407d0a3cd691b0e93ae09ff6c8102

          Show
          StephanEwen Stephan Ewen added a comment - Fixed via 53134594644407d0a3cd691b0e93ae09ff6c8102

            People

            • Assignee:
              Unassigned
              Reporter:
              humanoid Alexey Diomin
            • Votes:
              0 Vote for this issue
              Watchers:
              3 Start watching this issue

              Dates

              • Created:
                Updated:
                Resolved:

                Development