  1. Subversion
  2. SVN-3443

Obtaining implicit subtree mergeinfo hammers merge performance

Details

    • Type: Improvement
    • Status: Closed
    • Priority: Critical
    • Resolution: Fixed
    • Affects Version/s: trunk
    • Fix Version/s: 1.6.6
    • Component/s: libsvn_client

    Description

      Note: This has been a known problem for a while; see
      http://svn.collab.net/repos/svn/branches/subtree-mergeinfo/notes/subtree-mergeinfo/the-performance-problem.txt,
      particularly the 'Implicit Mergeinfo Query Problem'.  It is only now
      becoming a real issue.
      
      ~~~~~
      
      When performing a merge-tracking-aware merge of URL1@R1:URL2@R2 to a target with
      subtrees with explicit mergeinfo, the merge logic looks at the explicit
      mergeinfo on the merge target and each subtree:
      
        * For forward merges, any revisions from the merge source
          that are already represented are *not* merged.
      
        * For reverse merges, only revisions from the merge source
          which are represented are reverse merged.
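
      The two rules above can be sketched with plain sets of revision numbers.
      This is only an illustration: the function name and the set-based
      representation are assumptions made for this example, and the actual
      merge code works on mergeinfo rangelists rather than Python sets.

```python
def revisions_to_merge(requested, already_represented, reverse=False):
    """Return the revisions from `requested` that the merge should act on.

    requested           -- revisions named by the merge source (hypothetical input)
    already_represented -- revisions already recorded for the target or subtree
    reverse             -- True for a reverse merge, False for a forward merge
    """
    requested = set(requested)
    recorded = set(already_represented)
    if reverse:
        # Reverse merge: only revisions that are represented can be undone.
        return sorted(requested & recorded)
    # Forward merge: skip revisions that are already represented.
    return sorted(requested - recorded)
```

      For example, if r38293 is already recorded on a subtree, a forward merge
      of r38290, r38293 and r38294 acts on r38290 and r38294 only, while a
      reverse merge of the same revisions acts on r38293 only.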
      
      The logic also considers the implicit mergeinfo (a.k.a. natural history) of each
      target and subtree in a similar way.  The problem is, for each subtree this
      means a call to svn_client__get_history_as_mergeinfo() and the expense of a
      network round trip.  The slower the network and/or the more subtrees with
      mergeinfo, the slower the merge becomes.  If the merge target has hundreds or
      thousands of subtrees with explicit mergeinfo then even simple merges can become
      excruciatingly slow.  
      
      For example, using a 1.6.x@38363 release build, checkout
      http://svn.collab.net/repos/svn/branches/1.6.x@38364.  Then, to simulate a "bad
      case" of subtree mergeinfo, set empty mergeinfo on every path that doesn't
      already have explicit mergeinfo (1800+ paths in this case).  Then merge a few
      backport-nominated revisions.  Ultimately there are only three editor
      drives, one for each backported revision, since none of the subtrees have
      these changes after all.  But the work to actually figure that out takes a
      *lot* of time:
      
        1.6.x>timethis svn merge http://svn.collab.net/repos/svn/trunk -c38290,38293,38294 .

        TimeThis :  Command Line :  svn merge http://svn.collab.net/repos/svn/trunk -c38290,38293,38294 .
        TimeThis :    Start Time :  Tue Jul 07 16:07:03 2009
      
        --- Merging r38290 into 'build\ac-macros\apache.m4':
        U    build\ac-macros\apache.m4
        --- Merging r38293 into 'build':
        U    build\ac-macros\serf.m4
        --- Merging r38294 into 'build':
        G    build\ac-macros\apache.m4
      
        TimeThis :  Command Line :  svn merge http://svn.collab.net/repos/svn/trunk -c38290,38293,38294 .
        TimeThis :    Start Time :  Tue Jul 07 16:07:03 2009
        TimeThis :      End Time :  Tue Jul 07 16:32:15 2009
        TimeThis :  Elapsed Time :  00:25:12.062
                                    ^^^^^^^^^^^^
      
      But what if we just inherit the implicit mergeinfo of the merge target (or the
      root of any switched subtree) and calculate the implicit mergeinfo for the
      subtrees from that?  Put another way:
           
           When merging SRC -rX:Y to TARGET with a subtree with explicit mergeinfo
           TARGET/SUBTREE, if SRC -rX:Y is part of TARGET's natural history/implicit
           mergeinfo, when is SRC/SUBTREE -rX:Y *not* part of TARGET/SUBTREE's
           natural history?  Or inversely, if SRC -rX:Y is not part of TARGET's
           natural history/implicit mergeinfo, when *is* SRC/SUBTREE -rX:Y part of
           TARGET/SUBTREE's natural history?
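
      A minimal sketch of that idea follows.  It assumes implicit mergeinfo
      can be modelled as a mapping from repository path to a list of
      (start, end) revision ranges, and that a subtree's natural history is
      the target's history with the subtree's relative path appended to each
      path.  All names here (derive_subtree_history, FakeRepository,
      subtree_histories) and the sample ranges are hypothetical and exist only
      for this example; the sketch deliberately ignores switched subtrees and
      subtrees whose history diverges from the target's, which are exactly the
      cases the question above asks about.

```python
import posixpath


def derive_subtree_history(target_history, rel_path):
    """Derive a subtree's implicit mergeinfo from the target's.

    target_history maps repository paths to lists of (start, end) ranges.
    The subtree is assumed to have lived at rel_path below the target for
    the target's whole history (an assumption, not a guarantee).
    """
    return {
        posixpath.join(path, rel_path): list(ranges)
        for path, ranges in target_history.items()
    }


class FakeRepository:
    """Stand-in for the server that counts history queries (round trips)."""

    def __init__(self, target_history):
        self._target_history = target_history
        self.round_trips = 0

    def fetch_target_history(self):
        self.round_trips += 1
        return self._target_history


def subtree_histories(repo, subtrees):
    """Query the target's history once, then derive every subtree locally."""
    target_history = repo.fetch_target_history()
    return {rel: derive_subtree_history(target_history, rel) for rel in subtrees}


# Hypothetical history for a branch target; the ranges are made up.
repo = FakeRepository({"/trunk": [(1, 100)], "/branches/1.6.x": [(101, 200)]})
histories = subtree_histories(repo, ["build", "build/ac-macros"])
```

      However many subtrees are passed in, the fake repository is queried
      once, which is the property that removes the per-subtree round trip.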
           
      We already do something quite similar to this in the special case of subtrees
      without explicit mergeinfo which are the immediate children of paths with
      non-inheritable mergeinfo -- see r37491.  If we do this for *all* subtrees then
      the performance improvement in the above scenario is huge:
      
        1.6.x.imq.fix>timethis svn merge http://svn.collab.net/repos/svn/trunk -c38290,38293,38294 .

        TimeThis :  Command Line :  svn merge http://svn.collab.net/repos/svn/trunk -c38290,38293,38294 .
        TimeThis :    Start Time :  Tue Jul 07 13:53:05 2009
      
        --- Merging r38290 into 'build\ac-macros\apache.m4':
        U    build\ac-macros\apache.m4
        --- Merging r38293 into 'build':
        U    build\ac-macros\serf.m4
        --- Merging r38294 into 'build':
        G    build\ac-macros\apache.m4
      
        TimeThis :  Command Line :  svn merge http://svn.collab.net/repos/svn/trunk -c38290,38293,38294 .
        TimeThis :    Start Time :  Tue Jul 07 13:53:05 2009
        TimeThis :      End Time :  Tue Jul 07 13:53:47 2009
        TimeThis :  Elapsed Time :  00:00:41.968
      
      25 minutes down to 41 seconds!  That's lovely, but are there any edge cases
      where this change will actually break things?  Going to think on that a bit
      while I take a side trip into wcng and try to figure out how to fix 38335 on
      Windows...
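
      For reference, the two TimeThis elapsed times can be compared directly.
      The helper name below is an assumption made for this example; the input
      strings are the elapsed times reported in the two runs.

```python
def elapsed_seconds(stamp):
    """Convert a TimeThis 'HH:MM:SS.mmm' elapsed time into seconds."""
    hours, minutes, seconds = stamp.split(":")
    return int(hours) * 3600 + int(minutes) * 60 + float(seconds)


before = elapsed_seconds("00:25:12.062")  # per-subtree history queries
after = elapsed_seconds("00:00:41.968")   # history inherited from the target
speedup = before / after
```

      That works out to roughly a 36x speedup for this particular working
      copy and network.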
      

          People

            Assignee: Unassigned
            Reporter: Paul Burba (pburba)