[SVN-2520] Working copy optimized for space not time - ASF JIRA

Details

Type: Improvement
Status: Closed
Priority: Major
Resolution: Duplicate
Affects Version/s: all
Fix Version/s: ---
Component/s: unknown
Labels: None

Description

There should be an optional working copy format optimized for the client space
requirements.  This can be accomplished in part by storing hashcodes instead of
copies of the actual clean files.

From an actual repository I obtain these stats:

Exported files: 35774
Check-out files: 221299

Exported space: 1554784k
Check-out space: 3378467k

I'm sure you can corroborate with svn's own repo.  Thus, svn requires ~520% more
files and ~120% more space than necessary (not counting per-file fs
overhead, probably about 1k per file).  I have used monotone on this same
repository[*] with no significant difference in day-to-day operations on a 100mb
lan, so I know most of this is unnecessary in practice.  Monotone has
essentially zero space overhead (about 0.5% increase).
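
For anyone who wants to check the percentages, here is a small sketch of the
arithmetic using only the four numbers quoted above.  The helper name
overhead_percent is made up for illustration.

```python
def overhead_percent(baseline: int, actual: int) -> float:
    """Return how much larger `actual` is than `baseline`, in percent."""
    return (actual - baseline) / baseline * 100.0


# Figures copied from the stats in this report.
exported_files, checkout_files = 35774, 221299
exported_kb, checkout_kb = 1554784, 3378467

file_overhead = overhead_percent(exported_files, checkout_files)
space_overhead = overhead_percent(exported_kb, checkout_kb)

print(f"extra files: {file_overhead:.0f}%")   # prints "extra files: 519%"
print(f"extra space: {space_overhead:.0f}%")  # prints "extra space: 117%"
```

These round to the ~520% and ~120% figures given above.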

The primary benefit of svn's current approach is that reverts and diffs happen
without contacting the server and downloading the file.  In my opinion this is
an extremely weak justification for the large overhead of an svn working copy,
since reverts rarely happen and neither reverts nor diffs typically
involve many files or large files (most files are skipped due to unchanged mod
time).

Solutions are simple: 

* store hashcodes of files so the server only needs to be contacted if the file
actually changes, not if it was just touched.
* remove "empty-file"
* put "format" into entries file (as a schema ideally)
* create the files/folders for svn structure changes (add, mv, cp, etc) on-demand
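
The first item above can be sketched roughly as follows.  This is only an
illustration of the idea, not how Subversion stores its metadata: the entry
layout (a dict with mtime, size and a hash), the choice of SHA-1, and the
function names record_clean and is_modified are all assumptions of mine.

```python
import hashlib
from pathlib import Path


def file_digest(path: Path) -> str:
    """Hash the file contents in chunks so large files are not read at once."""
    h = hashlib.sha1()  # hash choice is an assumption; any strong hash works
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(65536), b""):
            h.update(chunk)
    return h.hexdigest()


def record_clean(path: Path) -> dict:
    """Hypothetical per-file entry kept instead of a full pristine copy."""
    st = path.stat()
    return {
        "mtime_ns": st.st_mtime_ns,
        "size": st.st_size,
        "sha1": file_digest(path),
    }


def is_modified(path: Path, entry: dict) -> bool:
    """Decide locally whether the file really differs from its clean state."""
    st = path.stat()
    if st.st_mtime_ns == entry["mtime_ns"] and st.st_size == entry["size"]:
        return False  # unchanged mod time and size: skip hashing entirely
    if st.st_size != entry["size"]:
        return True   # a size change implies a content change
    # Same size but different mod time: the file may only have been touched,
    # so compare hashes before deciding the server needs to be contacted.
    return file_digest(path) != entry["sha1"]
```

With something like this, a file that was merely touched hashes to the same
value and never triggers a download; only a genuine content change would
require fetching the clean text for a diff or revert.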

[*] I couldn't convert the whole repo, because once I got about 1/3 of the way
through, the number of inodes exceeded what could be cached in 1g of main memory,
so converting a revision required disk I/O to stat 200k+ files (due to the simple
method of determining added/removed files and svn's massive use of files).  I
had to skip to the head revision at that point, as it would have taken over a
month to commit each further revision that way.

Original issue reported by cmonkey

Issue Links

duplicates

SVN-525 Allow working copies without .svn/pristine/ cache (a.k.a. "text-base/" files). (In Progress)

People

Assignee: Unassigned

Reporter: Subversion Importer

Votes: 0

Watchers: 0

Dates

Created: 10/Mar/06 20:11

Updated: 27/Oct/18 16:34

Resolved: 10/Mar/06 20:50