  Subversion / SVN-1324

cvs2svn dumpfiles can be huge

    Details

    • Type: Improvement
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: all
    • Fix Version/s: cvs2svn-1.0
    • Component/s: tools
    • Labels: None

      Description

      Now that cvs2svn writes to a dumpfile, it can use a lot of disk space
      during a conversion, since dumpfiles express all node revisions in
      fulltext.
      
      One solution is to stream it to stdout, and pipe that into 'svnadmin
      load'.  The problem with that is that cvs2svn needs to be able to seek
      backwards on its dumpfile stream, since it goes back to patch up
      checksum and file size headers.  So in its current incarnation, it
      can't write to stdout.  Still, Robert Pluim made a patch, which he
      posted here:
      
         http://subversion.tigris.org/servlets/ReadMsg?list=dev&msgNo=37846
      
      In his commentary, he discusses some of the problems with this method:
      
         I cheated of course, by using a temporary file to hold
         the output of the rcs co command whilst we calculate the
         checksum, then writing the checksums & lengths etc to the
         dump stream.  Patch attached, needs more work.  BTW, doing
         it this way will probably be slower than using a dumpfile,
         since we process each file's data twice, but you never know,
         it might be faster on a multi-processor machine.
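
The temporary-file approach described in the quote above can be sketched roughly as follows. This is an illustrative Python sketch, not Robert's patch: `write_file_node` and its parameters are hypothetical names, and the node headers are a simplified subset of the dumpfile format (no properties). The idea is to spool each file's fulltext to a temporary file while computing its length and MD5, so that the headers can be written before the content and the output stream never needs to seek backwards.

```python
import hashlib
import io
import shutil
import tempfile


def write_file_node(out, path, chunks):
    """Write one simplified file node to `out`, a binary stream that
    may be non-seekable (for example a pipe to 'svnadmin load').

    `chunks` is an iterable of bytes objects making up the fulltext.
    All names here are hypothetical; this is not cvs2svn's real API.
    """
    md5 = hashlib.md5()
    length = 0
    with tempfile.TemporaryFile() as spool:
        # First pass: spool the content while computing checksum and size.
        for chunk in chunks:
            spool.write(chunk)
            md5.update(chunk)
            length += len(chunk)
        # The headers are now fully known, so emit them up front.
        header = (
            f"Node-path: {path}\n"
            "Node-kind: file\n"
            "Node-action: add\n"
            f"Text-content-length: {length}\n"
            f"Text-content-md5: {md5.hexdigest()}\n"
            f"Content-length: {length}\n"
            "\n"
        )
        out.write(header.encode("utf-8"))
        # Second pass: copy the spooled content to the stream.
        spool.seek(0)
        shutil.copyfileobj(spool, out)
    out.write(b"\n\n")


# Small demonstration against an in-memory stream.
demo = io.BytesIO()
write_file_node(demo, "trunk/a.txt", [b"hello ", b"world"])
```

As the quote notes, each file's data is handled twice, which is the price of avoiding the backwards seek.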
      
      Elsewhere in the same thread, I proposed another solution: an option
      to cvs2svn that causes it to output just "the next N revisions" of the
      dumpfile.  You can then load them, remove the dumpfile, lather, rinse,
      repeat.  (Or, instead of N revisions, the limit could be N bytes:
      cvs2svn may exceed the limit in order to finish the current revision,
      but then it stops.)
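
The byte-limited variant of this proposal might look something like the sketch below. It assumes each revision is already available as one serialized bytes record; `split_into_chunks` and `max_bytes` are hypothetical names, not existing cvs2svn options. A chunk is closed only after the revision that crosses the limit has been added, so no revision is ever split across two dumpfiles.

```python
def split_into_chunks(revisions, max_bytes):
    """Group serialized revision records into chunks of about max_bytes.

    `revisions` is an iterable of bytes objects, one per revision.
    A chunk may exceed max_bytes in order to finish the current
    revision; it is then closed and a new chunk is started.
    """
    chunk, size = [], 0
    for rev in revisions:
        chunk.append(rev)
        size += len(rev)
        if size >= max_bytes:
            yield chunk
            chunk, size = [], 0
    if chunk:
        # Whatever remains after the last full chunk.
        yield chunk
```

Each chunk would then be written to its own dumpfile, loaded with 'svnadmin load', and deleted before the next one is produced. The sketch leaves out details such as writing the dumpfile format header at the start of every chunk.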
      
      One way or another, this problem needs to be solved eventually.
      
      Note that it's not as bad as it could be: dumpfiles express copies
      with a tiny bit of metadata, a 'copy' action.  So it's not as if a
      repository containing Q tags and branches will increase the size of a
      dumpfile by a factor of Q.  (Whew!)
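
To illustrate how small such a record is, here is a sketch that builds a copy node. To my understanding, the dump format expresses a copy as an 'add' node carrying copyfrom headers rather than any fulltext; `copy_node_record` and the example paths are made up for illustration.

```python
def copy_node_record(dest_path, src_path, src_rev, kind="dir"):
    """Return a dumpfile node record that copies src_path@src_rev
    to dest_path. Only headers are emitted, no file content.
    """
    record = (
        f"Node-path: {dest_path}\n"
        f"Node-kind: {kind}\n"
        "Node-action: add\n"
        f"Node-copyfrom-rev: {src_rev}\n"
        f"Node-copyfrom-path: {src_path}\n"
        "\n"
    )
    return record.encode("utf-8")


# A tag of the whole trunk costs on the order of a hundred bytes,
# regardless of how much data lives under trunk.
tag_record = copy_node_record("tags/v1", "trunk", 5)
```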
      

            People

            • Assignee: Unassigned
            • Reporter: Karl Fogel (kfogel)