Details
Type: Bug
Status: Open
Priority: Critical
Resolution: Unresolved
all
Description
Rename should not be implemented as "copy + delete", because that creates a new revision of the renamed file. Instead, a rename should only change the (old and new) parent directory, not the file itself. This is related to http://subversion.tigris.org/servlets/ReadMsg?list=dev&msgNo=19922
Attachments
- 1_Subversion_1.4_true_renames_problems.ppt
- 227 kB
- C. Michael Pilato
Activity
Branko Čibej
added a comment - This absolutely must be fixed by Beta.
Greg Stein
added a comment - "Absolutely must" ? This sounds like a feature which can be postponed
until after 1.0.
Philip Martin
added a comment - One common operation is to rename, edit to take account of the new
names, then commit. Would atomic rename apply in this case as well? I
don't really want to do it in two commits, because neither the edit nor
the rename will compile without the other.
Karl Fogel
added a comment - Philip,
Sure, you can get the atomicity you need in that scenario -- just
include everything in one commit. The fact that rename is currently
broken down into copy+delete doesn't mean you can't commit both the
copy and the delete in the same commit.
Philip Martin
added a comment - Perhaps I didn't make my question clear :)
As I understand the proposal an atomic rename would operate a bit like
a cheap copy, whichever directory referred to the file before the
rename would have a new version that didn't refer to the file. There
would also be a new reference to the original file either in the new
directory with a different filename or in a different directory.
My question is then how such a rename would work if the user both
renamed and modified in a single commit. In this circumstance using a
new reference to the original file won't work. So I am wondering if
atomic rename makes any difference to this case.
If I compare to ClearCase, for instance, the atomic rename involves
committing directories, and the file modification involves
committing the file. Essentially "rename and modify" is always two
separate commits, and the "rename" commit never modifies the file
itself, it just moves it. I was wondering how this would be achieved
with Subversion's whole repository commit.
Branko Čibej
added a comment - I think the client would have to send directory modifications (i.e.,
the rename) before file modifications -- and if it doesn't do that
now, we're in trouble anyway.
So what happens in your case:
-- First, the rename is sent, which produces one or two directory
changes in the commit txn, essentially just a relinking of the
existing file node.
-- Later, the modification for that file comes in (in the new location,
mind!), which produces a new file node and again modifies the directory
that's the target of the rename.
I don't *think* that would be a problem, the server should already
notice that the directory was modified before. In fact, it should work
right out of the box, because a commit can handle several distinct
directory changes already.
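
The two-step sequence described above can be sketched with a toy model. This is a minimal sketch, not Subversion's filesystem code: the dictionaries `dirs` and `files`, the helpers `rename` and `modify`, and the `(node, txn)` tuples are all assumptions made up for illustration.

```python
def rename(dirs, src_dir, src_name, dst_dir, dst_name):
    """Relink an existing file node: only directory listings change."""
    node_rev = dirs[src_dir].pop(src_name)
    dirs[dst_dir][dst_name] = node_rev


def modify(dirs, files, dir_path, name, new_text, txn):
    """Store new contents as a new file node revision and repoint the entry."""
    node, _old_txn = dirs[dir_path][name]
    new_rev = (node, txn)
    files[new_rev] = new_text
    dirs[dir_path][name] = new_rev


# Hypothetical starting state: one file node revision, two directories.
files = {("f1", 1): "old text"}
dirs = {"/a": {"old.c": ("f1", 1)}, "/b": {}}

# Step 1: the rename touches the directories only.
rename(dirs, "/a", "old.c", "/b", "new.c")
revs_after_rename = len(files)

# Step 2: the modification arrives at the new location and creates a new
# file node revision, modifying the target directory a second time.
modify(dirs, files, "/b", "new.c", "new text", txn=2)
revs_after_modify = len(files)

print(revs_after_rename, revs_after_modify, dirs)
```

In this model the rename alone adds no file node revision; only the later modification does, which is the behaviour the comment expects from the server.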
Subversion Importer
added a comment - Adding a small description of a use case where the distinction between
rename and copy+delete is important.
  svn cp http://server/trunk http://server/branch
  svn co http://server/branch
  cd branch
  svn cp template.c foo.c
  svn ci
  edit foo.c
  svn ci
  cd ../trunk
  edit template.c
  svn ci
Now I want to merge the changes back and forth. I probably don't
want to merge both sets of changes together, because this was a copy.
However, if I do the exact same thing, but with a mv instead of cp, I
DO want to merge the changes in.
  svn cp http://server/trunk http://server/branch
  svn co http://server/branch
  cd branch
  svn mv template.c foo.c
  svn ci
  edit foo.c
  svn ci
  cd ../trunk
  edit template.c
  svn ci
Now I probably do want the changes to be applied. As Kazinator
pointed out on IRC, copy is to create a new, but related item in the fs,
whereas move/rename should not change the object, just its location
as far as operations like merge are concerned.
Original comment by kevin
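
The difference between the two sessions above can be illustrated with a toy model. This is a minimal sketch, assuming each versioned object carries a hypothetical entity ID that survives a move but not a copy; the IDs 7 and 8 and the helper `merge_targets` are invented for illustration and are not Subversion APIs.

```python
def merge_targets(branch_entries, changed_entity):
    """Paths in the branch that refer to the entity changed on trunk."""
    return sorted(
        path for path, entity in branch_entries.items() if entity == changed_entity
    )


TEMPLATE = 7  # hypothetical entity ID of trunk's template.c

# After "svn cp template.c foo.c": foo.c is a new, related entity.
after_cp = {"template.c": TEMPLATE, "foo.c": 8}

# After "svn mv template.c foo.c": foo.c is still the same entity.
after_mv = {"foo.c": TEMPLATE}

cp_targets = merge_targets(after_cp, TEMPLATE)
mv_targets = merge_targets(after_mv, TEMPLATE)
print(cp_targets)  # ['template.c']
print(mv_targets)  # ['foo.c']
```

With identity tracked this way, a trunk edit to template.c would land on foo.c only in the move case, which matches the behaviour asked for in the comment.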
Ben Collins-Sussman
added a comment - *** Issue 1864 has been marked as a duplicate of this issue. ***
Karl Fogel
added a comment - Goodness, I thought Mike and I had copied the important data below into this
issue already, but apparently not...
Here are some notes from discussions we had on 18 August 2004, regarding subtle
problems in a "naive" implementation of true renames:
---------------------------------------------------------------------------
We posited that renames would work the "natural way", that is, entries
would get shifted around, but the target's entity ID would not change.
Then Mike demonstrated the collision problem, whereby we could end up
with two different names for the exact same entity ID:
1. begin with /trunk/foo/bar.c
2. copy /trunk /branches
3. rename /branches/foo /trunk/bloo
4. modify /branches/foo/bar.c and /trunk/bloo/bar.c in the same txn.
Here's what things look like right before step 4:
r1: root 0.0.1
      trunk  -> 1.0.1
                  foo -> 2.0.1
                           bar.c -> 3.0.1 (file)
r2: root 0.0.2
      trunk  -> 1.0.1
      branch -> 1.1.2
                  foo -> 2.0.1
r3: root 0.0.3
      trunk  -> 1.0.3
                  foo  -> 2.0.1
                  bloo -> 2.0.1
      branch -> 1.1.3 (no entries)
Every foo and bloo entry references the same node revision 2.0.1, whose
bar.c entry references the file 3.0.1.
You can see how modifying bar.c in both places now will result in an
entity ID collision.
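
The collision can be reproduced in a toy model of the IDs. This is a minimal sketch under stated assumptions: IDs are treated as plain (node, copy, txn) triples, the `successor` rule is a deliberately naive stand-in for what the filesystem does, and the two modified paths are the ones that reach bar.c in the r3 layout (via trunk's foo and bloo entries). None of the names below are Subversion APIs.

```python
from typing import NamedTuple


class NodeRevID(NamedTuple):
    node_id: int
    copy_id: int
    txn_id: int

    def __str__(self) -> str:
        return f"{self.node_id}.{self.copy_id}.{self.txn_id}"


def successor(old: NodeRevID, txn_id: int) -> NodeRevID:
    """Naive rule: a modified node keeps node and copy IDs; only the txn changes."""
    return old._replace(txn_id=txn_id)


# r3 after the naive rename: both directory entries reference the same IDs.
R3 = {
    "/trunk/foo": NodeRevID(2, 0, 1),
    "/trunk/bloo": NodeRevID(2, 0, 1),
    "/trunk/foo/bar.c": NodeRevID(3, 0, 1),
    "/trunk/bloo/bar.c": NodeRevID(3, 0, 1),
}


def find_collisions(tree, modified_paths, txn_id):
    """Return the new IDs that more than one modified path would allocate."""
    allocated = {}
    for path in modified_paths:
        new_id = successor(tree[path], txn_id)
        allocated.setdefault(new_id, []).append(path)
    return {str(i): paths for i, paths in allocated.items() if len(paths) > 1}


collisions = find_collisions(
    R3, ["/trunk/foo/bar.c", "/trunk/bloo/bar.c"], txn_id=4
)
print(collisions)  # {'3.0.4': ['/trunk/foo/bar.c', '/trunk/bloo/bar.c']}
```

Two independent edits in the same transaction end up asking for the same ID, which is the problem the notes describe.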
(Open question: can we also construct a cycle via normal Subversion
operations, with these new renames? That would be even worse.)
Some possible solutions to the collision situation:
* Schema-preserving Brute Force Fix:
Disallow the rename if the immediate parents of the merge source
and dest have different CopyIDs.
* Schema-semi-preserving fix:
Add an optional component(s) to entity IDs. In new filesystem
code, all new IDs get extended, but the code is still able to
handle the old 3-component kinds too. This needs to be fleshed
out to describe exactly what those components would be, of
course; the point here is merely that schema changes are not
always an all-or-nothing proposition.
* Schema-non-preserving fix:
Add another component to entity IDs. After several iterations of
discussion, Mike reduced it to just a UniqueID, constantly
incremented -- the UniqueID would be different for *every* Node
Revision in the database, in fact, it could maybe serve as a
unique identifier itself?
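
The schema-non-preserving idea can be sketched in the same toy style. This is only an illustration of the UniqueID proposal as worded above, assuming a hypothetical fourth ID component drawn from a global counter; the field name `unique_id` and the counter are assumptions, and nothing here reflects an actual Subversion schema.

```python
import itertools
from typing import NamedTuple


class NodeRevID(NamedTuple):
    node_id: int
    copy_id: int
    txn_id: int
    unique_id: int  # hypothetical extra component, never reused


_next_unique = itertools.count(1)


def successor(old: NodeRevID, txn_id: int) -> NodeRevID:
    """Allocate a new node revision ID with a fresh UniqueID."""
    return old._replace(txn_id=txn_id, unique_id=next(_next_unique))


shared = NodeRevID(3, 0, 1, 0)  # bar.c, reachable through two names

# Two modifications of the shared node in the same transaction.
via_foo = successor(shared, 4)
via_bloo = successor(shared, 4)

print(via_foo, via_bloo, via_foo != via_bloo)
```

The first three components still coincide, but the IDs as a whole no longer do, so the counter alone is enough to tell the two node revisions apart.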
Some other random thoughts:
* 'svn log renamed-tgt' requires that renames be tracked in the copies
table. While we're at it, copies themselves should be tracked
directly there, Mike says.
* Since we have to implement copy-on-write to handle subnodes of
renamed trees anyway, why not go all the way and implement
copy-on-write completely? Something like: 'svn cp URL/foo URL/bar'
would create new entity IDs (because that is the correct semantic of
copy), but the two would share the same underlying string in the db.
* User Interface:
An important benefit of true rename support would be that users can
tell (from the client side) when two differently-named entities are
the same object. Whether we do this by simply exposing entity IDs,
or a uniquely-identifying portion of entity IDs, or something else,
doesn't matter so much, as long as the question is unambiguously
resolvable.
Philip Martin
added a comment - Back in 2003 I produced a patch that implemented svn_fs_rename, see
http://subversion.tigris.org/servlets/ReadMsg?list=dev&msgNo=29594
There was a later version (I think it had more extensive tests) but I can't find
it in the archives.
Bill Tutt discovered the ID collision problem, and I think it could be triggered
using normal Subversion operations, see
http://subversion.tigris.org/servlets/ReadMsg?list=dev&msgNo=29880
C. Michael Pilato
added a comment - What do you mean by saying you think the ID collision could be caused by "normal
Subversion operations"? Had we svn_fs_rename(), that would be a normal
Subversion operation.
Or are you saying this bug exists today, sans-rename?
Philip Martin
added a comment - I mean that if we were using the svn_fs_rename in my patch then a sequence of
svn commands could be sufficient to trigger the problem.
Karl Fogel
added a comment - Changing summary because "atomic" is a bit misleading (or at least ambiguous) in
this context
Subversion Importer
added a comment - Moving to 1.4, since, while cmpilato has started this work on the
fs-atomic-renames branch, it won't be merged before 1.3.
Original comment by lundblad
Karl Fogel
added a comment - This message+thread sums up one problem that true renames will solve,
about how peg revisions should be able to track a renamed object from
a point in the past to a point in the future:
http://subversion.tigris.org/servlets/ReadMsg?list=users&msgNo=46814
From: kfogel@collab.net
To: Scott Palmer <scott.palmer@2connected.org>
Cc: "users@subversion subversion" <users@subversion.tigris.org>
References: <68B5F4BC6852D244BDDAAE69E31B15F9085394EA@intrepid.hypertherm.com>
<FD36CE0B-5E32-4399-B76B-9A750EB4F848@2connected.org>
<85r7524364.fsf@newton.ch.collab.net>
<9609F461-2487-47F4-88AE-C2CCE7E1AE00@2connected.org>
Date: 21 Mar 2006 13:41:39 -0600
Message-ID: <8564m7zito.fsf@newton.ch.collab.net>
Subject: Re: Help explain peg revisions
Karl Fogel
added a comment - Garrett Rooney is actively working on this, so assigning the issue to him (I'm
assuming he doesn't mind).
Karl Fogel
added a comment - Mark as started.
Subversion Importer
added a comment - The 1.4 branching point is approaching and this is not happening before that,
so move into 1.5-consider.
Original comment by lundblad
Subversion Importer
added a comment - I think the OID story (see my comments in issue 1525) must be implemented first
before you can properly implement this one.
Original comment by ringods
Garrett Rooney
added a comment - Let's not kid ourselves, there's no way I'm ever going to finish this stuff.
Assigning back to issues@subversion on the off chance someone else wants to pick
it up.
Subversion Importer
added a comment - Status email from March 2006, in case someone wanted to pick this up:
http://svn.haxx.se/dev/archive-2006-03/1334.shtml
Original comment by malcolm
Subversion Importer
added a comment - According to sussman, any fix for this should also incorporate a fix for issue 2685 - when merging a
rename from (say) trunk to branch, we should merge by renaming an existing branch file, not overwriting
the branch file with one on trunk.
Original comment by malcolm
Daniel Rall
added a comment - Issue 2685 is related, but may not be dependent upon this issue.
Erik Huelsmann
added a comment - Signing away, I said.
Erik Huelsmann
added a comment - Blundering with IZ again. Sorry for the last comment...
Erik Huelsmann
added a comment - With resources fully concentrated on Merge Tracking, I don't think having this
in 1.5 is viable. Moving to 1.6-consider.
C. Michael Pilato
added a comment - Attachment 1_Subversion_1.4_true_renames_problems.ppt has been added with description: Slideshow demonstrating some of the common problems with today's rename implementation
C. Michael Pilato
added a comment - Created an attachment (id=729)
Slideshow demonstrating some of the common problems with today's rename implementation
Hyrum Kurt Wright
added a comment - Post-1.6 issue sweep. Since 1.7 is already shaping up to be a large release,
move to 1.8-consider.
Subversion Importer
added a comment - To my mind, the rename issue is far more important than automatic merge
tracking.
Merge tracking is solving a problem I don't really have. I follow the best
practices in the 1.4 documentation and I include the merged revisions, source,
and destination in the commit message. When I want to merge again, I grep for
the previous merge and look at the message again.
Seems like this thing is going to be pushed out indefinitely.
Original comment by mehaase
Subversion Importer
added a comment - AFAIK only bzr (and possibly Monotone) is doing this properly currently, so
having this in SVN could potentially be a competitive edge.
As for the other competitors, both git and Mercurial are guessing their way
through renames:
http://automatthias.wordpress.com/2007/06/07/directory-renaming-in-scm/
http://www.selenic.com/mercurial/bts/issue850
Original comment by walles
Subversion Importer
added a comment - > Instead, a rename should only change the (old and new) parent directory, not
the file itself.
ClearCase has this concept, and it's not particularly easy for developers to
understand or use.
An easier to understand concept/implementation is to have a (versioned)
directory-path as a property of the file. When you move the file, you create a
new version (commit) of that file where the directory-path property is the only
change. The file should keep its original unique id, and all internal
operations should use the unique-id instead of the pathname+filename.
Original comment by sbrown2009
Branko Čibej
added a comment - > An easier to understand concept/implementation is to have a (versioned)
directory-path as a property of the file. When you move the file, you create a
new version (commit) of that file where the directory-path property is the only
change.
That would become *really* hairy IMHO because then we suddenly lose the concept
of a first-class versioned tree (and consequently versioned directories); the
whole object hierarchy in the repository becomes a guessing game based on
properties of the object, instead of the other way around, as it is now.
The idea that file moves/renames only affect directories, and that by
implication file names are properties of directories, not files, is how
the Unix virtual file system hierarchy is structured. It works very well on the
implementation level, although it may be slightly confusing since users would
typically expect the name to be a property of the file -- that's how DOS/Windows
(and their conceptual predecessors) originally treated it, with implementation
problems and limitations arising from the fact that names were file properties
but paths (i.e., containing directories) were not. Notice how NTFS, for example,
sneakily adopted the Unix concept on the implementation level, which is what
makes hard-links possible on NTFS but not on FAT.
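
The Unix behaviour referred to above can be observed from a script. This is a minimal sketch, assuming a filesystem that supports hard links and reports stable inode numbers (typical on Unix); the directory and file names are made up, and the result is filesystem-dependent rather than guaranteed everywhere.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as root:
    src_dir = os.path.join(root, "dir1")
    dst_dir = os.path.join(root, "dir2")
    os.mkdir(src_dir)
    os.mkdir(dst_dir)

    old_path = os.path.join(src_dir, "file1")
    new_path = os.path.join(dst_dir, "renamed")
    with open(old_path, "w") as handle:
        handle.write("content\n")

    # The name lives in the directory; the inode identifies the file itself.
    inode_before = os.stat(old_path).st_ino
    os.rename(old_path, new_path)
    inode_after = os.stat(new_path).st_ino
    same_object = inode_before == inode_after

    # A hard link is a second directory entry for the same file.
    alias_path = os.path.join(src_dir, "alias")
    os.link(new_path, alias_path)
    link_count = os.stat(new_path).st_nlink

print(same_object, link_count)
```

The rename changes two directories and leaves the file object untouched, which is the model the issue asks Subversion to follow.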
Julian Foad
added a comment - See also:
Issue #3630 "Rename tracking"
Issue #3633 "Track renames as renames inside Subversion repository"
Stewart Gordon
added a comment - Re Steve
Well said - you've taken the words out of my mouth. Think of it like page
moving on MediaWiki (Wikipedia et al). Current SVN renaming is a cut-and-paste
move, a common sin among WP users who don't know better. The file under its new
name is a new file, with a new revision history, split from the file under its
old name. When pages are moved correctly under MW, the revision history remains
intact. A file rename under SVN ought to do the same.
Re Branko
This would only be the case if we move _all_ path information from the directory
node to the file node. Here's an approach that, as far as I can see, would
achieve the best of both worlds.
Each file node contains:
- unique node ID (immutable)
- filename (versioned)
- ID of the directory node containing it (versioned)
- other properties (versioned)
- file contents (versioned)
All directory nodes are file nodes - but the "file contents" become a list of
the IDs of files in the directory.
File renaming is then straightforward - just change the filename in the file node.
Moving a file would entail these steps:
- change the directory node ID in the file node
- change the filename in the file node, if it is being renamed at the same time
- remove the ID of this node from the source directory node's list
- add the ID of this node to the destination directory node's list
The drawback is that it would be a major change to SVN repository structure, and
so a lot of SVN code would need to be rewritten. The question is: can the
changes be kept on the server side so that existing clients will still work on
repositories using the new system? Or will SVN clients need to change as well?
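
The node layout and the four move steps proposed above can be sketched as a small data structure. This is a minimal sketch of the proposal only: versioning of the individual fields is left out, and the `Node` class, the `move` and `path_of` helpers, and the sample IDs are hypothetical names, not part of Subversion.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Union


@dataclass
class Node:
    node_id: int                      # unique and immutable
    name: str                         # would be versioned in the proposal
    parent_id: Optional[int]          # ID of the containing directory node
    props: Dict[str, str] = field(default_factory=dict)
    contents: Union[bytes, List[int]] = b""  # directories hold child IDs


def move(nodes: Dict[int, Node], node_id: int, dest_dir_id: int,
         new_name: Optional[str] = None) -> None:
    """Apply the four steps listed in the comment above."""
    node = nodes[node_id]
    source = nodes[node.parent_id]
    dest = nodes[dest_dir_id]
    node.parent_id = dest_dir_id          # change the directory node ID
    if new_name is not None:
        node.name = new_name              # rename at the same time, if asked
    source.contents.remove(node_id)       # drop from the source listing
    dest.contents.append(node_id)         # add to the destination listing


def path_of(nodes: Dict[int, Node], node_id: int) -> str:
    """Rebuild a path by walking parent IDs up to the root."""
    parts = []
    current = nodes[node_id]
    while current.parent_id is not None:
        parts.append(current.name)
        current = nodes[current.parent_id]
    return "/" + "/".join(reversed(parts))


nodes = {
    0: Node(0, "", None, contents=[1, 2]),
    1: Node(1, "dir1", 0, contents=[3]),
    2: Node(2, "dir2", 0, contents=[]),
    3: Node(3, "file1", 1, contents=b"hello"),
}

path_before = path_of(nodes, 3)
move(nodes, 3, dest_dir_id=2, new_name="file2")
path_after = path_of(nodes, 3)
print(path_before, "->", path_after, "node_id:", nodes[3].node_id)
```

The node keeps its ID and contents across the move; only the name, the parent reference, and the two directory listings change.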
Stewart Gordon
added a comment - A challenge is how to deal with file moves when the user updates, when either
(a) it has moved into or out of the subdirectory the user has updated
(b) the new name clashes with something
An example of (a): dir1\file1 has been renamed to dir2\file1
User updates whole working copy - just move - straightforward
User updates only dir1 or even file1 - does file1 just disappear, or is the file
moved across?
User updates only dir2 - is he left with two copies of file1, or is the file
moved across?
User has checked out only dir1 - does file1 just disappear?
It gets even more complicated when you consider that the local file may have
uncommitted changes.
Possible cases of (b):
- new name conflicts with an unversioned file that the user has put there
- file1 has been deleted or renamed, and file2 has been renamed file1, but user
updated only file2 and still has the old file1
I suppose the simplest is to continue to process renaming as delete and add when
updating. Of course, we would have to re-retrieve the full history of the
renamed file, and think about how access to a deleted file's history would work.
Daniel Shahaf
added a comment - I don't see the problem. If you moved d1/f1 to d2/f2 and you update d1 to rN, then either d1@N
contains an 'f1' child or it doesn't. If you had no local changes, then *why* it does or doesn't have an
'f1' child is irrelevant; we just create or delete the wc's 'f1' child as appropriate.
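A minimal sketch of that update rule, assuming the working copy has no local changes: compare the directory's children in the working copy with its children at rN and create or delete entries accordingly. The function name plan_update and the use of plain sets of child names are assumptions made for illustration, not Subversion's update code.

```python
from typing import List, Set, Tuple


def plan_update(wc_children: Set[str],
                repo_children_at_n: Set[str]) -> Tuple[List[str], List[str]]:
    """Return (to_create, to_delete) for one directory updated to rN.

    Why a child appeared or disappeared (move, copy, plain add or delete)
    is deliberately ignored; only the target state at rN matters.
    """
    to_create = sorted(repo_children_at_n - wc_children)
    to_delete = sorted(wc_children - repo_children_at_n)
    return to_create, to_delete
```

For example, if d1/f1 was moved to d2/f2, updating only d1 yields a deletion of 'f1', and updating only d2 yields a creation of 'f2'.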
Stewart Gordon
added a comment - That's basically what I meant by my final paragraph. I was pondering over
whether that or something else is the best strategy for dealing with it.
Tim Bain
added a comment - I think Stewart Gordon's comments from 4/19 make sense. However, the challenge
will be that many (most?) users interact with SVN via a client on a machine
whose file system doesn't use the scheme he described. A file on Windows (or
Unix) has no way to store the immutable unique node ID he proposes in a manner
that is invisible to the client yet inseparable from the file content, leaving
two possibilities:
1. Store the data within the file itself, which is a non-starter since that
would probably invalidate the file format for any applications using the file.
So we're left with...
2. Store the data outside of the file itself (e.g. in the .svn directory
somewhere). This will require that users performing a rename/move operation do
so in a manner that properly migrates the unique node ID to the new location for
the file (which might simply be a different filename within the same .svn
directory).
Option 2 would probably be possible within the major stand-alone SVN clients
(including the IDE plugins), though I imagine it would be a breaking change for
older versions of those clients. But I don't see how it would work for the
developer using Windows Explorer or the mv command to move/rename, unless the
developer explicitly picks some shell extension option ("Move in SVN" instead of
"Move") or runs a SVN-specific mv command.
But if we do manage to make that work, it eliminates the concerns about things
like unversioned copies existing at the time of an update (your merge resolution
process involves deciding which unique node ID to use, just like you decide
whether you'd rather keep your version or the server's version of line 43 when
both have changes).
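To make option 2 concrete, here is a rough sketch of a move that migrates the node ID along with the file. Everything in it is an assumption for illustration: the ".meta" directory merely stands in for something like .svn, the JSON index and the names load_index, save_index, tracked_move and demo are hypothetical, and none of this reflects how Subversion actually stores working copy metadata.

```python
import json
import os
import tempfile
from pathlib import Path

META_DIR = ".meta"          # hypothetical stand-in for a metadata directory
INDEX_FILE = "node_ids.json"


def _index_path(wc_root: Path) -> Path:
    return wc_root / META_DIR / INDEX_FILE


def load_index(wc_root: Path) -> dict:
    """Load the path -> node ID mapping kept outside the files themselves."""
    p = _index_path(wc_root)
    return json.loads(p.read_text()) if p.exists() else {}


def save_index(wc_root: Path, index: dict) -> None:
    p = _index_path(wc_root)
    p.parent.mkdir(parents=True, exist_ok=True)
    p.write_text(json.dumps(index, indent=2, sort_keys=True))


def tracked_move(wc_root: Path, src: str, dst: str) -> None:
    """Move a file and carry its node ID to the new path."""
    index = load_index(wc_root)
    if src not in index:
        raise KeyError(f"{src} has no recorded node ID")
    (wc_root / dst).parent.mkdir(parents=True, exist_ok=True)
    os.replace(wc_root / src, wc_root / dst)   # the plain filesystem move
    index[dst] = index.pop(src)                # the part a bare 'mv' skips
    save_index(wc_root, index)


def demo() -> dict:
    """Move dir1/file1 to dir2/file1 in a scratch directory."""
    with tempfile.TemporaryDirectory() as tmp:
        wc = Path(tmp)
        (wc / "dir1").mkdir()
        (wc / "dir1" / "file1").write_text("hello")
        save_index(wc, {"dir1/file1": 42})
        tracked_move(wc, "dir1/file1", "dir2/file1")
        return load_index(wc)
```

A move done with Windows Explorer or plain mv corresponds to only the os.replace line, which is why the mapping would go stale unless the move goes through an SVN-aware command.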
Stewart Gordon
added a comment - My idea was on the basis that node IDs would be just an SVN thing, not part of
the filesystem. Using filesystem node IDs probably wouldn't work even if they
were a feature of all filesystems, because they cannot be in sync across the
repository and all working copies.
SVN already has this move/rename command. The problem is that it's currently
just shorthand for a copy (which in turn is just shorthand for copying the file
manually and then adding it) followed by a delete. But current SVN clients
would probably still do this, whereas newer versions would have the proper
rename. Actually, ISTM the node ID doesn't need to be stored in working copies,
if we take Daniel's approach.
The problem I can see is that existing SVN clients might not correctly retrieve
the revision history of a file that's been renamed.
Mauro Molinari
added a comment - Please fix this bug for 1.8! Isn't the problem a bit easier to fix now,
with WC-NG?
Please don't blame me; consider this a "vote". I'm not an SVN internals
expert, but I think that the problem described in
http://svnbook.red-bean.com/en/1.7/svn.branchmerge.advanced.html#svn.branchmerge.advanced.moves
is quite severe, and it would be great if Subversion could finally handle it
correctly.
Stefan Sperling
added a comment - Mauro, work is being done for 1.8 in this area, but not in the context of this
issue. This issue is about changing the way renames are represented in the
Subversion filesystem. But there are other approaches to fix usability problems
present in the current implementation of moves (or, rather, lack of
implementation). See issue 3630 and the issues linked from it, in particular
issue 3631 which addresses what wc-ng can provide in this area.
It is not clear yet what the exact set of improvements in move support shipped
in 1.8 will look like. Discussion is still on-going, and some experimental work
is being done on branches which may or may not be merged back to the trunk (and
hence 1.8) eventually. Feel free to contribute to discussion and implementation.
Subversion Importer
added a comment - Just lost files added to my branch before merging from trunk, where the parent
folder was simply renamed...
Yes, it's a well-known weak spot, but it's REALLY SEVERE, so it would be good if
at least people could vote for its resolution.
Original comment by davide_cavestro
Stefan Sperling
added a comment - There is no point in having people vote on this, or add even more "me too"
comments to this and related issues. The developers are well aware that this is
a severe limitation for many users. Work is being done, aiming towards 1.8.
If you want to get this fixed asap, please help out by sending patches or
funding additional developer time for new and/or existing developers. Else, be
patient, let those who are putting in the effort work at their own pace, and
wait for results in some future release. Thanks.