Subversion / SVN-2916

Subversion is not always a good steward of its HTTP(S) network connections



    • Type: Bug
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: trunk
    • Fix Version/s: unscheduled
    • Component/s: libsvn_ra
    • Labels: None


      Over in issue #2048, we dealt with a specific problem: Subversion sometimes
      spent so long doing stuff with secondary network connections (and simultaneously
      ignoring its primary connection) that the primary connection would time out,
      server-side.  This was showing up in merge operations, especially those where
      contextual diff calculations and applications were especially time-consuming.  
      But guess what -- it's not a merge-only problem.  It's a general problem in
      Subversion, a somewhat natural-yet-unanticipated tension between the desire to
      do things streamily (avoiding front-loaded and back-loaded processing) and the
      need to suck data off of timeout-able sources at a sufficient rate.
      I've now observed this happening during 'svn update', with close_directory()'s
      WC log processing taking too long (for a directory with 928 files in it).  I've
      suggested for some time that this might also pose problems in other areas, such
      as with interactive merge conflict handlers left unattended or an external merge
      tool taking too long to do its thang.  But in general, *any* processing which
      occurs between network reads is a liability in this area.
      The obvious and universal fix is to always spool server responses to disk as
      they come down, and then read back the spool.  (This was, in fact, the solution
      employed for issue #2048.)  There are some similarly obvious cons to that
      solution, too (disk space usage, the perception of nothing happening until the
      spooling is done and the response is re-read from disk, extra I/O overhead,
      etc.).  And these pros and cons vary depending on whether we are trying to spool
      a "full update" versus a "skelta"-type response.
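
      A minimal sketch of the spool-then-replay idea, written in Python rather
      than Subversion's C and not taken from the issue #2048 fix.  The names
      spool_then_process, response, and process_chunk are hypothetical: response
      stands in for the network connection, and process_chunk for whatever slow
      work (WC log processing, a merge tool) would otherwise sit between reads.

```python
import io
import tempfile

CHUNK_SIZE = 64 * 1024  # arbitrary read size chosen for this sketch


def spool_then_process(response, process_chunk, chunk_size=CHUNK_SIZE):
    """Drain `response` to a temporary file, then replay it to `process_chunk`.

    `response` is any binary file-like object (a stand-in for the network
    connection); `process_chunk` is the possibly slow consumer.
    Returns the number of bytes spooled.
    """
    total = 0
    with tempfile.TemporaryFile() as spool:
        # Phase 1: read from the connection as fast as it delivers data,
        # doing no other work between reads.
        while True:
            chunk = response.read(chunk_size)
            if not chunk:
                break
            spool.write(chunk)
            total += len(chunk)

        # Phase 2: the connection is fully drained, so slow processing here
        # no longer holds up network reads.
        spool.seek(0)
        while True:
            chunk = spool.read(chunk_size)
            if not chunk:
                break
            process_chunk(chunk)
    return total


if __name__ == "__main__":
    received = []
    size = spool_then_process(io.BytesIO(b"abc" * 1000), received.append, 512)
    print(size, len(received))
```

      The sketch also makes the cons visible: the whole response occupies disk
      before process_chunk sees its first byte, and every byte is written once
      and read once more than in the streamy approach.
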
      Sometimes, folks can work around these problems by reorganizing their data
      layout, but a version control tool shouldn't force that on its users.
      Sometimes, they can raise their Apache timeouts, but on public servers, my
      understanding is that doing so can make DoS attacks more severe.
      So, I think Subversion just needs to be a better steward of its network
      connections.  Further, this kind of stewardship belongs inside the affected RA
      layers -- it should not be exposed through the RA interface.




            Assignee: Unassigned
            Reporter: C. Michael Pilato (cmpilato)