Uploaded image for project: 'Maven Resolver'
  1. Maven Resolver
  2. MRESOLVER-93

PathRecordingDependencyVisitor to handle 3 cycles

    XMLWordPrintableJSON

    Details

    • Type: Improvement
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 1.3.3, 1.4.0, 1.4.1
    • Fix Version/s: 1.4.2
    • Component/s: Resolver
    • Labels:
      None

      Description

      PathRecordingDependencyVisitor cannot handle dependency graphs that have 3 or more cycles such as below:
       

      gid:a:1 (1)
      +- gid:b:0
      |  \- ^1
      +- gid:b:1
      |  \- ^1
      \- gid:b:2
         \- ^1
      

      It fails with StackOverflowError or OutOfMemoryError. Test case.

       

      Solutions

      I came up with three solutions. I pick solution #1 for simplicity. 

      1. Use "parents" to check the cycle, rather than visited set

      This is the simplest. Checking array element member is usually discouraged especially for large data set. The implementation should confirm the overhead of this solution.

      2. Use AbstractMapBag/Multiset for visited set

      Creating a new class that extends AbstractMapBag and leverages IdentityHashMap. Although this solution would be theoretically more efficient than solution #1, I felt it's overkill to create a class just for this solution.

      AbstractMapBag(new IdentityHashMap<DependencyNode, AbstractMapBag.MutableInteger>())

       

      https://commons.apache.org/proper/commons-collections/apidocs/org/apache/commons/collections4/bag/AbstractMapBag.html

       

      IdentityHashMap<DependencyNode, Integer>() would work as a multiset.

      3. Call visitLeave only when visitEnter is true

      The cause of this bug is DefaultDependencyNode calling visitLeave regardless of visitEnter result.

      I'm not sure how many other visitors rely on visitLeave being called regardless of visitEnter result.

      Illustration on why existing algorithm does not catch cycle 

      The following illustration is the node traversal for the test case above by current algorithm. This illustration tracks the dependency node graph and the "visited" set maintained by the visitor.

      • visited set. An internal data structure in PathRecordingDependencyVisitor to avoid cycle (link).
      • visitEnter(node): PathRecordingDependencyVisitor's function (link). When returning true, the node's children is traversed by the algorithm. This function adds the node to visited set.
      • visitLeave(node): PathRecordingDependencyVisitor's function (link). This function removes the node from visited set.

        

      The initial state starts with node "a" and visited set {a}.

      First child of a is b0. Because visited does not contain, visitEnter(b0) returns true, meaning that the algorithm traverses this b0's children next. B0 is added to visited.

      B0's children is "a". Because visited set contains "a", visitEnter(a) returns false. This means that the algorithm does not traverse this "a"'s children. A is added to visited set (already it has).

       

      Now not traversing this "a"'s children, the algorithm calls visitLeave(a). This removes "a" from visited set.

      B0's children are all traversed. the algorithm calls visitLeave(b0). This removes "b0" from visited set.

      Now visited set is empty.

      Next child of the root "a" is b1. B1 is not in visited set, thus visitEnter(b1) returns true. This means the algorithm traverses the children of this b1.

      B1's only child is a. "a" is not in visited set. visitEnter(a) returns true. This means to traverse "a"'s children.

      A's first children is b0. b0 is not in visited set. visitEnter(b0) returns true, meaning to traverse children of this b0.

       
      (img 0242)

      The only child of b0 is "a". Visited set contains "a", and thus not traversing its children.

      visitLeave(a) removes "a" from visited set.

      b0's children is all traversed. VisitLeave(b0) removes b0 from visited set.

      Next child of this "a" is b1. B1 is in visited set, and thus visitEnter(b1) returns false. This node's children is not to be traversed.

      (img 0255)

      visitLeave(b1) removes b1 from visited set. Now visited is emtpy.

      The last child of "a" is b2. VisitEnter(b2) returns true. It's children is to be traversed. B2 is in visited set.

       B2's only child is "a". "a" is not in visited set, thus visitEnter(a) returns true. The algorithm traverses this "a"'s children.

      (img 0258)

       

      (...omit...)

       

      IMG_0266 shows the step where I decided to give up. The algorithm does not seem to stop. Indeed the test shows that. The path from the root to the furthest a includes 5 "a" nodes. I concluded the visited set is not working as expected to avoid cycle.

       

       

        Attachments

        1. IMG_0234.jpg
          313 kB
          Tomo Suzuki
        2. IMG_0237.jpg
          372 kB
          Tomo Suzuki
        3. IMG_0235.jpg
          445 kB
          Tomo Suzuki
        4. IMG_0236.jpg
          463 kB
          Tomo Suzuki
        5. IMG_0238.jpg
          441 kB
          Tomo Suzuki
        6. IMG_0240.jpg
          504 kB
          Tomo Suzuki
        7. IMG_0243.jpg
          411 kB
          Tomo Suzuki
        8. IMG_0242.jpg
          500 kB
          Tomo Suzuki
        9. IMG_0244.jpg
          389 kB
          Tomo Suzuki
        10. IMG_0245.jpg
          386 kB
          Tomo Suzuki
        11. IMG_0257.jpg
          452 kB
          Tomo Suzuki
        12. IMG_0241.jpg
          544 kB
          Tomo Suzuki
        13. IMG_0255.jpg
          371 kB
          Tomo Suzuki
        14. IMG_0259.jpg
          398 kB
          Tomo Suzuki
        15. IMG_0258.jpg
          500 kB
          Tomo Suzuki
        16. IMG_0256.jpg
          480 kB
          Tomo Suzuki
        17. IMG_0260.jpg
          377 kB
          Tomo Suzuki
        18. IMG_0262.jpg
          352 kB
          Tomo Suzuki
        19. IMG_0261.jpg
          362 kB
          Tomo Suzuki
        20. IMG_0265.jpg
          318 kB
          Tomo Suzuki
        21. IMG_0263.jpg
          403 kB
          Tomo Suzuki
        22. IMG_0266.jpg
          325 kB
          Tomo Suzuki
        23. IMG_0264.jpg
          414 kB
          Tomo Suzuki

          Issue Links

            Activity

              People

              • Assignee:
                tibordigana Tibor Digana
                Reporter:
                suztomo Tomo Suzuki
              • Votes:
                1 Vote for this issue
                Watchers:
                4 Start watching this issue

                Dates

                • Created:
                  Updated:
                  Resolved:

                  Time Tracking

                  Estimated:
                  Original Estimate - Not Specified
                  Not Specified
                  Remaining:
                  Remaining Estimate - 0h
                  0h
                  Logged:
                  Time Spent - 20m
                  20m