Subversion / SVN-3719

Extremely slow checkout on Windows

Details

    • Type: Improvement
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version: 1.6.x
    • Fix Version: 1.6.17
    • Component: libsvn_wc
    • Environment: Windows Vista

    Description

      Our repo has 31k files, a total of 120 MB. A Linux checkout takes 2 minutes;
      a Windows checkout takes 37 minutes (on a machine that is only slightly slower).
      
      While checking out on Windows (Windows 7 x64, svn 1.6.12, NTFS), svn uses 50%
      CPU (that is, one of the two cores). When a large file is being downloaded, the
      load drops. This clearly indicates a per-file problem (rather than a
      throughput or bandwidth limit, some kind of conversion [we even ran a test to
      check whether the CR -> CR/LF conversion takes too much time], etc.).
      
      I then fired up ProcMon from SysInternals. Here are some bottlenecks I found:
      
      1. Whenever an "entries" file is read, I see the following sequence: Open, Read 80
      bytes, Close, Open, Read 80 bytes, Close, Open, Read whole file, Close. What is
      the reason behind this?
      
      2. I've also found that the same "entries" file is read several times (in
      the above way) consecutively, without any writes to that file and without any other
      operations between the two reads. So O, R80, C, O, R80, C, O, Rall, C, O, R80,
      C, O, R80, C, O, Rall, C, etc.
      
      3. In some directories I see a loop. Svn tries to create a file "tempfile.tmp"
      and gets a NAME COLLISION result. "tempfile.2.tmp" is then tried, with the same
      result. And so on, sometimes going up as far as "tempfile.340.tmp". It seems a
      DeleteFile call is missing for the temporaries. But why not use the GetTempFileName
      function anyway?
      
      4. When a large file is being checked out, I see the following sequence:
      Write 4k from offs 0,
      Write 4k from offs 4k,
      Write 4k from offs 8k,
      Read 16k from offs 16k, <- Why?
      Write 4k from offs 12k,
      Write 4k from offs 16k,
      etc.
      This also suggests that either the TCP packet size is set to 4096 bytes, or the file
      buffer size in svn is set to this silly small value.
      
      5. It seems that the "entries" files in the whole directory tree are checked for each
      repository file. Say file "root/dirA/dirB/fileC" is processed, and both dirA
      and dirB have already been created. svn checks "root/entries", "root/dirA/entries",
      "root/dirA/dirB/entries", deals with the file, then checks (reads!)
      "root/dirA/dirB/entries" again, then "root/dirA/entries" and "root/entries".
      
      All in all: Most of the lost time is spent with the "entries" files.
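
To make the cost described in items 1 and 5 concrete, here is a rough model of the observed access pattern. It is only a sketch of what ProcMon showed, not Subversion's actual code: the function names are hypothetical, and the factor of three opens per read is taken from the Open/Read 80/Close, Open/Read 80/Close, Open/Read all/Close sequence in item 1.

```python
from pathlib import PurePosixPath

# Item 1: each logical read of an "entries" file showed up as three opens.
OPENS_PER_ENTRIES_READ = 3


def entries_reads_for_file(file_path):
    """Return the "entries" files read for one checked-out file (item 5)."""
    parents = list(PurePosixPath(file_path).parents)
    # parents is ordered deepest-first and ends with "."; drop "." and
    # reverse so the walk starts at the working copy root.
    dirs = [p for p in reversed(parents) if str(p) != "."]
    down = [str(d / "entries") for d in dirs]
    # After the file is handled, the same files are read again, deepest first.
    up = list(reversed(down))
    return down + up


def count_opens(file_paths):
    """Estimate file opens spent on "entries" files for a list of files."""
    reads = sum(len(entries_reads_for_file(p)) for p in file_paths)
    return reads * OPENS_PER_ENTRIES_READ


print(entries_reads_for_file("root/dirA/dirB/fileC"))
print(count_opens(["root/dirA/dirB/fileC"]))
```

Under this model a file three directories deep costs six "entries" reads, or 18 opens, before any file content is written, and the cost grows with directory depth for every file in the repository.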
      
      I'm willing to check any tests or test versions sent to me.
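
Returning to item 3: the sketch below imitates the sequential naming seen in ProcMon ("tempfile.tmp", "tempfile.2.tmp", ...) and counts create attempts with and without deleting each temporary after use. It is an illustration under assumptions, not the svn implementation; the helper names are hypothetical and the naming scheme is inferred from the trace.

```python
import os
import tempfile


def create_probed_temp(directory, base="tempfile", ext=".tmp"):
    """Create a temp file by probing sequential names; return (path, attempts)."""
    attempts = 0
    suffix = 1
    while True:
        # Candidates: "tempfile.tmp", then "tempfile.2.tmp", "tempfile.3.tmp", ...
        name = base + ext if suffix == 1 else f"{base}.{suffix}{ext}"
        path = os.path.join(directory, name)
        attempts += 1
        try:
            # O_EXCL makes the open fail when the name already exists,
            # which corresponds to the NAME COLLISION result in the trace.
            fd = os.open(path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
        except FileExistsError:
            suffix += 1
            continue
        os.close(fd)
        return path, attempts


def total_attempts(num_files, delete_after_use):
    """Count create attempts for num_files temporaries in one fresh directory."""
    total = 0
    with tempfile.TemporaryDirectory() as directory:
        for _ in range(num_files):
            path, attempts = create_probed_temp(directory)
            total += attempts
            if delete_after_use:
                os.remove(path)
    return total


print(total_attempts(5, delete_after_use=True))
print(total_attempts(5, delete_after_use=False))
```

If stale temporaries are left behind, the n-th file needs n attempts, so the total grows quadratically with the number of files in a directory. An API that picks unique names itself (Python's `tempfile.mkstemp`, or the Win32 `GetTempFileName` suggested above) would avoid the linear probe.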
      
      PS: The OS field of the issue form should now include Windows 7 as well...
      

      Original issue reported by yogurt2

      Attachments

        1. 1_slow-checkout-fix.patch
          5 kB
          Subversion Importer
        2. 2_slow-checkout-fix3.patch
          6 kB
          Subversion Importer
