  1. Subversion
  2. SVN-2520

Working copy optimized for space not time

Details

    • Type: Improvement
    • Status: Closed
    • Priority: Major
    • Resolution: Duplicate
    • all
    • ---
    • unknown
    • None

    Description

      There should be an optional working copy format optimized for client space
      requirements.  This can be accomplished in part by storing hashcodes instead of
      copies of the actual clean files.
      
      From an actual repository I obtain these stats:
      
      Exported files: 35774
      Check-out files: 221299
      
      Exported space: 1554784k
      Check-out space: 3378467k
      
      I'm sure you can corroborate with svn's own repo.  Thus, svn requires ~520% more
      files and ~120% more space than necessary (not counting per-file fs overhead,
      probably about 1k per file).  I have used monotone on this same repository[*]
      with no significant difference in day-to-day operations on a 100 Mb LAN, so I
      know most of this is unnecessary in practice.  Monotone has essentially zero
      space overhead (about 0.5% increase).
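
The overhead percentages can be re-derived from the four figures quoted above. The following is only an arithmetic check, not part of the original report; the helper name `percent_more` is made up for illustration, and the 1k-per-file filesystem cost is the report's own rough guess.

```python
# Figures copied from the report (file counts, and sizes in kilobytes).
exported_files, checkout_files = 35774, 221299
exported_kb, checkout_kb = 1554784, 3378467


def percent_more(actual, baseline):
    """Return how much larger `actual` is than `baseline`, in percent."""
    return (actual - baseline) / baseline * 100


file_overhead = percent_more(checkout_files, exported_files)
space_overhead = percent_more(checkout_kb, exported_kb)

# Assumption taken from the report: roughly 1k of filesystem overhead per file.
PER_FILE_KB = 1
space_overhead_with_fs = percent_more(
    checkout_kb + checkout_files * PER_FILE_KB,
    exported_kb + exported_files * PER_FILE_KB,
)

print(f"extra files: {file_overhead:.0f}%")
print(f"extra space: {space_overhead:.0f}%")
print(f"extra space incl. per-file overhead: {space_overhead_with_fs:.0f}%")
```

The first two results come out at roughly 519% and 117%, which matches the rounded ~520% and ~120% given above.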
      
      The primary benefit of svn's current approach is that reverts and diffs happen
      without contacting the server and downloading the file.  In my opinion this is
      an extremely weak justification for the large overhead of an svn working copy,
      since reverts rarely happen and neither reverts nor diffs typically involve
      many files or large files (most files are skipped due to unchanged mod time).
      
      Solutions are simple: 
      
      * store hashcodes of files so the server only needs to be contacted if the file
      actually changes, not if it was just touched.
      * remove "empty-file"
      * put "format" into entries file (as a schema ideally)
      * create the files/folders for svn structure changes (add, mv, cp, etc) on-demand
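
The first bullet above (hashcodes instead of clean copies) could work roughly as sketched below. This is a minimal illustration, not Subversion's actual working copy code: the `Entry` record, the function names, and the choice of SHA-1 are all assumptions made for the example.

```python
import hashlib
import os
from dataclasses import dataclass


@dataclass
class Entry:
    """Hypothetical per-file record kept in place of a clean copy."""
    size: int
    mtime_ns: int
    sha1: str


def file_sha1(path, chunk_size=1 << 16):
    """Hash a file in chunks so large files are never fully held in memory."""
    digest = hashlib.sha1()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def record(path):
    """Capture size, mod time and hash of a clean file (e.g. after checkout)."""
    st = os.stat(path)
    return Entry(st.st_size, st.st_mtime_ns, file_sha1(path))


def is_modified(path, entry):
    """Return True only if the file content differs from the recorded state."""
    st = os.stat(path)
    if st.st_size == entry.size and st.st_mtime_ns == entry.mtime_ns:
        return False  # unchanged mod time and size: skip without reading
    if st.st_size != entry.size:
        return True   # a different size means the content must differ
    # Mod time changed but size did not: the hash decides whether the file
    # was really edited or merely touched.
    return file_sha1(path) != entry.sha1
```

With a record like this, a file that was only touched never triggers a server round trip; the clean text would be fetched only for files whose hash actually changed and for which a diff or revert is requested.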
      
      [*] I couldn't convert the whole repo, because once I got about 1/3 of the way
      through, the number of inodes exceeded the amount storable in 1 GB of main
      memory, so converting a revision required disk I/O to stat 200k+ files (due to
      the simple method of determining added/removed files and svn's massive use of
      files).  I had to skip to the head revision at that point, as it would have
      taken over a month to commit each further revision that way.
      

      Original issue reported by cmonkey

            People

              Assignee: Unassigned
              Reporter: Subversion Importer (subversion-importer)