Uploaded image for project: 'Subversion'
  1. Subversion
  2. SVN-3093

svn_ra_neon__get_dir returns wrong directory set with public URL



    • Type: Bug
    • Status: Closed
    • Priority: Critical
    • Resolution: Fixed
    • Affects Version/s: trunk
    • Fix Version/s: 1.7.0
    • Component/s: libsvn_ra_neon
    • Labels:


      IRC log:
      [14:47] mm_aa: Hello! I'm having troubles with svn_ra_get_dir2... Resulting dir entries contain the 
      directory being listed istelf. Anyone knows how to fix this? Or at list determine that it's the same dir 
      and not process it?
      [14:47] mm_aa: I've got this:
      [14:47] mm_aa: http://pastie.caboo.se/147968
      [14:47] mm_aa: And my code is: http://pastie.caboo.se/147975
      [14:50] jackr: looks like your extra entry might be "."
      [14:50] mm_aa: It indeed is. But not with the "." key.
      [14:50] mm_aa: can I filter it out somehow?
      [14:51] mm_aa: And what's strange enough - 'svn ls' doesn't have this problem
      [14:52] jackr: I also find it interesting that the surprising entry comes out last
      [14:53] mm_aa: well, it's a hash.
      [14:53] mm_aa: in longer dirs it could be anywhere in the middle
      [14:55] pburba left the chat room. (Read error: 110 (Connection timed out))
      [14:56] jackr: What have you for dirent_fields?
      [14:57] mm_aa: SVN_DIRENT_ALL
      [14:57] jackr: oh, that's probably "which info to return on each entry," not "which entries to return"
      [14:57] jackr: n/m
      [14:57] mm_aa: right.
      [15:02] pburba_ is now known as pburba.
      [15:02] pburba was promoted to operator by ChanServ.
      [15:05] jackr: I looked in list.c, to see how it filters out ".", but can't spot any such code
      [15:07] mm_aa: Yeah. I looked there too. That's indeed some kind of magic 
      [15:07] fitz_: jackr in da hizzouse!
      [15:07] jackr: hey, itza fitza!
      [15:08] jackr: any guesses why svn_ra_get_dir2() returns "." (by its proper name) sometimes, but maybe 
      not others?
      [15:10] cmpilato: should never return "."
      [15:10] cmpilato: "" perhaps, but not "."
      [15:10] jackr: well, that's what we expected, but see the caboo pastes above
      [15:10] epg___: mm_aa: i use get_dir2 all the time with no such problem
      [15:10] epg___ is now known as epg.
      [15:12] cmpilato: mm_aa: what happens if your program also prints the value of "dir"?
      [15:13] mm_aa: my dir is ""
      [15:13] cmpilato: and the anchor of your ra session?
      [15:13] cmpilato: (the full url provided to the program, perhaps?)
      [15:14] mm_aa: http://pastie.caboo.se/147988      http://pastie.caboo.se/147989
      [15:14] • jackr doesn't see cmpilato's answers in the pasties
      [15:15] mm_aa: anchors? Don't know what are you talking about... Let me see
      [15:15] cmpilato: does googlecode implement a custom webdav handler?
      [15:15] epg: mm_aa: he means what you passed to ra_open
      [15:16] epg: cmpilato: what do you mean?
      [15:16] cmpilato: epg: is it just mod_dav_svn atop your bigtable stuff?
      [15:16] epg: cmpilato: httpd/apr/subversion are all stock, except for the fs
      [15:16] cmpilato: ok.
      [15:16] cmpilato: that's what i was asking.
      [15:16] mm_aa: svn_ra_open2(&session, url, callbacks, NULL, NULL, pool); URL =  
      [15:16] • jackr notes that pasties say that "svn ls" shows the expected
      [15:16] epg: and in front of httpd is the google load balancing foo
      [15:17] jelmer joined the chat room.
      [15:18] cmpilato: so, note that 'svn ls -v' shows the '.' entry.  it's the client code that filters out that 
      entry when -v isn't provided.
      [15:18] cmpilato: see print_dirent() in list-cmd.c
      [15:18] jackr: ah - tried that, but googlecode's gateway blocks "-v" or something? bad gateway?
      [15:19] cmpilato: worked for me.
      [15:19] cmpilato: $ svn ls http://mercurialidea.googlecode.com/svn/trunk/src/ -v
      [15:19] cmpilato:     50 gevesson              Dec 06 08:39 ./
      [15:19] cmpilato:     50 gevesson              Dec 06 08:39 com/
      [15:19] cmpilato: $
      [15:19] jackr: odd
      [15:19] epg: i guess i get that after all, and just forgot
      [15:19] epg: mm_aa: sorry for the confusion 
      [15:20] jackr: svn ls -v  http://svn.collab.net/repos/svn  does not show . for me
      [15:20] epg: jackr: just retry when you get 502
      [15:20] cmpilato: this doesn't help mm_aa, who for some reason is seeing "src" in his dirents, not ""
      [15:21] epg: hm
      [15:21] • epg plays in his python buffer
      [15:23] danielsh left the chat room. (Read error: 110 (Connection timed out))
      [15:26] jackr: is this a 1.5 change?
      [15:26] jackr: cmpilato shows (pastes, I assume) "." returning from "ls -v", but when I do the same 
      thing, I don't get no dot
      [15:26] jackr: I suspect cmpilato has some 1.5 build, and my experiments are with 1.4.4
      [15:26] mm_aa: I don't get a dot too
      [15:26] cmpilato: ah.
      [15:27] cmpilato: yes.  1.4 shows no dot
      [15:27] jackr: mm_aa: but, is the library you bind your actual program to 1.5 (i.e., built from trunk )
      [15:27] mm_aa: nope. it should be 1.4
      [15:27] jackr: it's not nice to shoot down people's theories 
      [15:28] mm_aa: I wonder how client library version can interfere with server result?
      [15:28] jackr: unless cmpilato means "the library always returned "", and command line always filtered 
      it, but 1.5 learned to show it sometimes"?
      [15:29] cmpilato: no, i think there's code in the 'ls' implementation to fake the '.' dirent.
      [15:29] mm_aa: let me check 1.4 list.c sources. (Was checking trunk).
      [15:29] • jackr too
      [15:29] cmpilato: the function call is not -- i thought -- supposed to return the self entry.
      [15:31] cmpilato: fwiw, i don't see a difference between serf and neon output.
      [15:31] mm_aa: serf/neon?
      [15:32] jackr: neon is the low-level HTTP transport; serf is a replacement now under development
      [15:33] mm_aa: aha! list.c has this:  /* Report the dirent for the target. */  SVN_ERR(list_func(baton, "", 
      dirent, locks ? (apr_hash_get(locks, fs_path,                                  APR_HASH_KEY_STRING)) : NULL, fs_path, 
      [15:34] mm_aa: so "." is indeed processed differently... I wonder now why I've got extra dirent...
      [15:35] pburba left the chat room. ("Escaping the clutches IRC for a while 
      [15:37] mm_aa: damn it... I will make a debug svn build and run it with gdb! 
      [15:38] epg: i don't get . *or* trunk in my dirents with 1.5ish
      [15:38] epg: (Pdb) foo = svn.ra.open2('http://mercurialidea.googlecode.com/svn/trunk/src', pr
      [15:38] epg: oject.repository.ra_callbacks, project.repository._config.svn_config, project.re
      [15:38] epg: pository._pool)
      [15:38] epg: (Pdb) p svn.ra.get_dir2(foo, '', SVN_INVALID_REVNUM, 0, pool)[0]               
      [15:38] epg: {'com': <libsvn.core.svn_dirent_t; proxy of C svn_dirent_t instance at 
      [15:39] epg: yet svn ls -v from same build prints a line for .
      [15:39] mm_aa: it's added by lib-client
      [15:40] epg: *added*?
      [15:40] epg: how does it know what to add?
      [15:41] mm_aa: it has a direct call of "" for list_func (see a couple of messages earlier)
      [15:41] epg: but i don't
      [15:41] jackr: epg: printing [0] of a hash might not show what might still be there later on
      [15:42] epg: later on?
      [15:42] jackr: in the hash
      [15:42] epg: it doesn't change
      [15:42] jackr: if it returns two, and you print only one, you'll never know the other
      [15:42] epg: get_dir2 in python returns (dirents, revision, properties)
      [15:42] epg: my [0] is getting the dirents hash
      [15:42] epg: which doesn't change, so i don't know what you're getting at
      [15:44] epg: this may make it clearer:
      [15:44] epg: (Pdb) result = svn.ra.get_dir2(foo, '', SVN_INVALID_REVNUM, 0, pool)
      [15:44] epg: (Pdb) foo_dirents = result[0]
      [15:44] epg: (Pdb) p foo_dirents
      [15:44] epg: {'com': <libsvn.core.svn_dirent_t; proxy of C svn_dirent_t instance at 
      [15:45] epg: (Pdb) p result[1]
      [15:45] epg: 50
      [15:46] epg: ah, i think my problem is 0 is not SVN_DIRENT_ALL; i misread that; so i'm just noise here
      [15:46] • epg wonders why SVN_DIRENT_ALL is not bound for python
      [15:57] epg: ah, "how does it know what to add?" => it also uses ra_stat
      [15:58] gabor left the chat room. (Read error: 104 (Connection reset by peer))
      [15:58] gabor joined the chat room.
      [15:58] cmpilato: epg: yup.
      [15:58] cmpilato: ra_stat
      [15:58] gabor left the chat room. (Read error: 104 (Connection reset by peer))
      [15:59] gabor joined the chat room.
      [16:02] DannyB_ left the chat room.
      [16:03] mm_aa: you mean, for every entry?
      [16:03] epg: no, just for the one passed to svn_client_list
      [16:24] CIA-21: dlr * r29196 /trunk/COMMITTERS: * COMMITTERS: Update my email address.
      [16:27] epg: dlr: slacker
      [16:27] dlr: grumble
      [17:04] mm_aa: You still here guys?
      [17:04] mm_aa: I've created a self-contained sample:
      [17:04] mm_aa: http://pastie.caboo.se/148029
      [17:05] mm_aa: $ ./svnt src
      [17:05] mm_aa: com
      [17:05] mm_aa: http://pastie.caboo.se/148031
      [17:05] mm_aa: (missing new line in my msg)
      [17:06] mm_aa: It clearly shows a problem
      [17:06] mm_aa: gcc -I/usr/include/subversion-1/ -I/usr/include/apr-0 -lsvn_ra-1 -lstdc++ -o svnt -
      g svnt.cc
      [17:10] cmpilato: $ g++ -o go -I/usr/local/include/subversion-1 -I/usr/local/apache2/include -
      L/usr/local/apache2/lib -L/usr/local/lib -lapr-1 -lsvn_ra-1 test.cpp
      [17:10] cmpilato: $ go
      [17:10] cmpilato: src
      [17:10] cmpilato: com
      [17:10] cmpilato: $
      [17:10] mm_aa: here it is
      [17:10] mm_aa: 2 dirs instead of 1
      [17:11] mm_aa: any ideas?
      [17:13] cmpilato: so, debugging in, i definitely see logic that simply won't work.
      [17:14] mm_aa: where?
      [17:14] cmpilato: inside the RA layer.
      [17:15] mm_aa: does it mean that I've stumbled upon a bug?
      [17:15] cmpilato: or possibly in mod_dav_svn.
      [17:15] • cmpilato needs to fire up wireshark.
      [17:17] cmpilato: ewwww...
      [17:17] cmpilato: <D:href>/svn/trunk/src/com/</D:href>
      [17:17] cmpilato: i thought those were supposed to be full URLs!
      [17:17] • cmpilato suspects a bad svn_path_join()
      [17:18] • cmpilato does not, however, understand why this situation gets any better with the 
      command-line client.
      [17:19] mm_aa: btw it's the same with normal svn (http://svn.collab.net/repos/svn/trunk/packages/)
      [17:19] epg: well, i'm mystified: i get src and com with mm_aa's test program, but *using the exact 
      same libs* from python i only get com
      [17:22] cmpilato: scratch that. looks like d:href isn't a full url after all.
      [17:23] cmpilato: but there's definitely some whack logic on our end at any rate.
      [17:24] cmpilato: if you look at svn_ra_neon__get_dir(), you find us comparing the number of path 
      components in the URL we query ("http://mercurialidea.googlecode.com/svn/trunk/src") ...
      [17:25] cmpilato: ... with the number found in those non-URLs in the D:href tags.
      [17:25] cmpilato: if they match, we omit the entry.
      [17:25] cmpilato: but it's a bogus comparison.
      [17:26] mm_aa: you mean this: " Skip the effective '.' entry that comes back from" ?
      [17:27] cmpilato: yes.
      [17:28] cmpilato: final_url_n_components = 5; the count of components from the dirents is 3 (for src) 
      and 4 (for com)
      [17:28] mm_aa: hm... we're lucky enough it doesn't skip random urls 
      [17:29] mm_aa: why is final count 5?
      [17:30] cmpilato: because svn_path_component_count  is URL-ignorant.
      [17:30] cmpilato: (and that could very well be the bug we need to fix)
      [17:30] mm_aa: ah... http:// counts for 1 too? 
      [17:31] mm_aa: and how does svn ls work in this case?
      [17:33] cmpilato: well, one difference is that subversion does peg-revision resolving
      [17:33] cmpilato: so the actual request neon makes is against a baseline collection URL, not a public 
      [17:33] cmpilato: i bet that's the difference!
      [17:34] mm_aa: how can I resolve the URL?
      [17:35] cmpilato: pass a real revision number to the get-dir API
      [17:35] jackr left the chat room. (Read error: 104 (Connection reset by peer))
      [17:36] cmpilato: epg: using python, are you providing a revision number through the API?
      [17:36] mm_aa: yep, this solves the problem
      [17:36] mm_aa: Thanks a lot.
      [17:36] mm_aa: Should I enter a bug somewhere?
      [17:36] cmpilato: PLEASE!
      [17:37] cmpilato: i was going to offer to, but if you'd spare me that, my love for you would increment 
      by some value.
      [17:37] mm_aa: is there an irc log? To attach a link.
      [17:37] cmpilato: don't think so.
      [17:38] mm_aa: Ok. I'll attach last dozen of lines to issue
      [17:38] cmpilato: is that all the history you have?
      [17:38] mm_aa: you mean have I discussed this earlier?
      [17:39] cmpilato: no, i mean do you need more lines of irc history?
      [17:39] mm_aa: nope. I've got it all.

      Original issue reported by mm_aa




            • Assignee:
              subversion-importer Subversion Importer
            • Votes:
              0 Vote for this issue
              0 Start watching this issue


              • Created: