Subversion / SVN-2067

Performance issue with source directories containing a large number of files

Details

    • Type: Bug
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: all
    • Fix Version/s: unscheduled
    • Component/s: libsvn_fs_base
    • Labels: None

    Description

      Julian Foad [julianfoad@btopenworld.com]'s message:
      
      Jaap Vermeulen wrote:
      > 
      > I'm running 1.1.0 RC 3 using the svnserve server on one machine, and use the
      > svn: protocol on the client machine.
      > 
      > Everything has been working fine until I hit a directory with approx. 10332
      > files in it (small HTML files, combined total size of 18.2 MB).  This seems
      > to bog down SVN (or Tortoise SVN 1.1.0 RC2) to an enormous extent.  If I use
      > the command line svn command, it will eat up close to 100% CPU time for 10
      > minutes or so on a commit.  If I use the RepoBrowser and try to expand the
      > directory with those files, the TortoiseProc will eat up 100% for a long
      > time.  If I try to open or refresh an explorer window, it goes to 100% for a
      > couple of minutes.
      > 
      > Is this to be expected?  Doesn't seem like it should cause such a dramatic
      > slowdown.
      
      It is a known problem that Subversion is slow when using the BDB back-end and
      there are very many files in the same directory.  There doesn't seem to be an
      entry for it in the issue tracker.  Could you please file one?
      
      In the thread "Perf issues with BDB and directories with a large number of 
      items",
      Branko Čibej wrote:
      > Brian W. Fitzpatrick wrote:
      > 
      >> Summary:  If you have a directory in your repository with a large number
      >> of items in it, the BDB backend gets slower as you add more items (due
      >> to parsing/unparsing skels, I'll bet).
      >>
      >> FSFS does not exhibit this behavior.
      >>
      >> Details and Pretty Pictures here: http://www.red-bean.com/fitz/svn/
      >>  
      > This is a textbook example of quadratic vs. logarithmic behaviour, and 
      > it's a safe bet that it's caused by our saving directory data in 
      > unsorted lists and rewriting the whole list at every directory change.
      
      - Julian
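
To make the quadratic claim above concrete, here is a toy cost model. It is only a sketch, not the libsvn_fs_base code: the function names `full_rewrite_cost` and `indexed_cost` and the unit-cost-per-entry assumption are made up for illustration. The first function charges one unit per entry every time the whole directory list is rewritten. The second charges roughly log2(count) units per insertion, as a hypothetical sorted or indexed structure might.

```python
def full_rewrite_cost(n_entries):
    """Units written if every added entry rewrites the whole directory list.

    Assumption: rewriting a list of `count` entries costs `count` units.
    """
    total = 0
    for count in range(1, n_entries + 1):
        total += count  # the entire list, now `count` entries long, is rewritten
    return total


def indexed_cost(n_entries):
    """Units touched if each insertion costs about log2(count) steps.

    Assumption: a hypothetical indexed structure; bit_length() is used as an
    integer stand-in for floor(log2(count)) + 1.
    """
    return sum(count.bit_length() for count in range(1, n_entries + 1))


if __name__ == "__main__":
    n = 10332  # number of files in the directory from the original report
    print("full rewrite:", full_rewrite_cost(n))
    print("indexed:     ", indexed_cost(n))
```

Under these assumptions the full-rewrite total grows as n(n+1)/2, so doubling the number of files roughly quadruples the work, while the indexed total grows only slightly faster than n.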
      
      Additional comment from Jaap:
      
      Some rough comparisons for the case I described:
      
      SVN import BDB: 1 hour, 17 mins.
      SVN checkout BDB: 22 mins
      SVN list BDB: 6 mins
      
      SVN import FSFS: 8 mins
      SVN checkout FSFS: 20 mins
      SVN list FSFS: 6 secs
      
      Getting it from a non-SVN repository:  10 mins
      Disk copy, from SVN server: 4 mins
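
As a rough reading of the numbers above, the sketch below converts the reported times to seconds and computes how many times slower BDB was than FSFS for each operation. The figures are taken from the comment as given (they are described there as rough comparisons). The dictionary layout and the `slowdown` helper are illustrative names only.

```python
# Reported wall-clock times, converted to seconds.
TIMINGS_SECONDS = {
    "import": {"BDB": 77 * 60, "FSFS": 8 * 60},    # 1 hour 17 mins vs 8 mins
    "checkout": {"BDB": 22 * 60, "FSFS": 20 * 60},  # 22 mins vs 20 mins
    "list": {"BDB": 6 * 60, "FSFS": 6},             # 6 mins vs 6 secs
}


def slowdown(operation):
    """Return how many times slower BDB was than FSFS for one operation."""
    times = TIMINGS_SECONDS[operation]
    return times["BDB"] / times["FSFS"]


if __name__ == "__main__":
    for op in TIMINGS_SECONDS:
        print(f"{op}: BDB took {slowdown(op):.1f}x as long as FSFS")
```

On these figures the gap is largest for list and import and small for checkout, which suggests the BDB overhead is concentrated in operations that read or rewrite the directory listing itself.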
      

      Original issue reported by jaap