  1. Subversion
  2. SVN-3356

svn_client__get_copy_source() can be abysmally slow

    Details

    • Type: Improvement
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: trunk
    • Fix Version/s: 1.6.0
    • Component/s: libsvn_ra
    • Labels:
      None

      Description

      Here's the mission, should you choose to accept it:  Answer the question, "What
      was the last place that the line of history identified by PATH@REV was copied
      from, if any?" in the most efficient way.
      
      Options today:
      
      1.  Use svn_ra_get_log() to do a full history crawl of PATH@REV, and
          remember the most recent copy operation's source.  The cost of
          this approach is directly proportional to the number of changes
          made to PATH@REV -- the entirety of that item's history.
      
      2.  Same as #1, but bail out of the log processing as soon as you have
          your answer.  Cost here is the same as #1, except you avoid
          some client-side processing of the logs which predate that most
          recent copy (if any).  Server cost is the same.
      
      3.  Use svn_ra_get_location_segments() to get a summary of the
          locations this line of history has occupied over time.  The cost
          here is completely variable.  At best, the server is of 1.5 or
          better pedigree with a fully populated node origins cache, and
          returns the answer ("no copies") very quickly.  At worst there are
          no copies in the item's history and the server is pre-1.5 and has
          to basically do the same full history crawl that svn_ra_get_log() does
          in order to answer the question.  In between are scenarios where
          the server can get the location segments fairly quickly but the
          client has to then figure out if those location changes were due
          to the item itself being copied, or because some parent of it was.
      
      4.  An incremental version of one or more of the above, trying to fetch
          history in chunks to avoid over-processing.  This might reduce the
          cost, or might not.  Either way, code complexity grows.
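
The cost difference between options #1 and #2 can be sketched with a toy model. This is not the Subversion API: `LogEntry`, `server_crawl`, `full_crawl`, and `early_exit` are hypothetical names, and the history is invented for illustration. The sketch assumes the server has already walked the whole history before the client inspects anything, which is why bailing out early saves client work only.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

CopySource = Tuple[str, int]  # (copyfrom path, copyfrom revision)


@dataclass(frozen=True)
class LogEntry:
    """Hypothetical stand-in for one log entry on the traced line of history."""
    revision: int
    copyfrom: Optional[CopySource] = None  # set only when this revision is a copy


def server_crawl(history: List[LogEntry]) -> List[LogEntry]:
    """Model the server side: walk the entire history, youngest first."""
    return sorted(history, key=lambda e: e.revision, reverse=True)


def full_crawl(entries: List[LogEntry]) -> Tuple[Optional[CopySource], int]:
    """Option #1: inspect every entry, remembering the most recent copy."""
    source = None
    inspected = 0
    for entry in entries:
        inspected += 1
        if source is None and entry.copyfrom is not None:
            source = entry.copyfrom  # entries are youngest first, so keep the first hit
    return source, inspected


def early_exit(entries: List[LogEntry]) -> Tuple[Optional[CopySource], int]:
    """Option #2: stop inspecting as soon as the most recent copy is found."""
    inspected = 0
    for entry in entries:
        inspected += 1
        if entry.copyfrom is not None:
            return entry.copyfrom, inspected
    return None, inspected


# Invented history: six changes, one of which (r20) is a copy.
HISTORY = [
    LogEntry(1), LogEntry(5), LogEntry(12),
    LogEntry(20, copyfrom=("/trunk/foo.c", 19)),
    LogEntry(30), LogEntry(40),
]

served = server_crawl(HISTORY)         # server cost: all 6 entries either way
print(len(served))                     # 6
print(full_crawl(served))              # (('/trunk/foo.c', 19), 6)
print(early_exit(served))              # (('/trunk/foo.c', 19), 3)
```

Both functions return the same copy source. Only the count of entries the client inspects differs, while the number of entries the modeled server produces stays at six.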
      
      svn_client__get_copy_source() today goes with option #1.  I think, though, that
      as we come to rely more on our server for complex information, we need to be
      less shy about giving the server the power to answer complex questions.  I'd
      suggest at *least* an svn_ra_get_copy_source() API, but am open to other ideas
      (like a tunneled version of the svn_fs_history* interfaces, or an RA version of
      svn_fs_closest_copy(), or *something*).
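
As a rough illustration of the kind of answer a server-side query could return in a single round trip, here is a toy model. Everything in it is an assumption made for the sketch: `Copy`, `CopySourceAnswer`, and `closest_copy_source` are hypothetical names, not a proposed signature for svn_ra_get_copy_source() and not how svn_fs_closest_copy() is implemented. It assumes canonical, non-root paths and ignores deletions and replacements.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Copy:
    """Hypothetical record of one copy operation stored on the server."""
    dst_path: str
    dst_rev: int
    src_path: str
    src_rev: int


@dataclass(frozen=True)
class CopySourceAnswer:
    """Hypothetical reply: where PATH@REV was last copied from."""
    src_path: str
    src_rev: int
    copied_in_rev: int
    item_itself_copied: bool  # False when a parent directory was copied


def covers(copy_dst: str, path: str) -> bool:
    """True if the copy destination is the path itself or one of its parents."""
    return path == copy_dst or path.startswith(copy_dst + "/")


def closest_copy_source(copies: List[Copy], path: str,
                        rev: int) -> Optional[CopySourceAnswer]:
    """Return the most recent copy affecting path at or before rev, if any."""
    relevant = [c for c in copies
                if c.dst_rev <= rev and covers(c.dst_path, path)]
    if not relevant:
        return None
    # Youngest copy wins; on a tie, prefer the deeper (more specific) path.
    closest = max(relevant, key=lambda c: (c.dst_rev, len(c.dst_path)))
    # Map the part of the path below the copy destination onto the source.
    remainder = path[len(closest.dst_path):]
    return CopySourceAnswer(closest.src_path + remainder, closest.src_rev,
                            closest.dst_rev, closest.dst_path == path)


# Invented repository: a branch copy, then a file copy inside the branch.
COPIES = [
    Copy("/branches/b", 20, "/trunk", 19),
    Copy("/branches/b/new.c", 25, "/trunk/old.c", 24),
]

print(closest_copy_source(COPIES, "/branches/b/foo.c", 40))
print(closest_copy_source(COPIES, "/branches/b/new.c", 40))
print(closest_copy_source(COPIES, "/trunk/foo.c", 40))
```

The `item_itself_copied` flag is there because the server already knows whether the item or one of its parents was copied, which is the distinction the client has to work out for itself under option #3.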
      

              People

              • Assignee:
                Unassigned
                Reporter:
                cmpilato C. Michael Pilato