Subversion / SVN-3705

fix root cause of known fixable FSFS corruption

Details

    • Bug
    • Status: Open
    • Major
    • Resolution: Unresolved
    • all
    • unscheduled
    • libsvn_fs_fs
    • None

    Description

      This issue is about the known FSFS corruption issues which the fsfsverify.py
      script can fix.
      
      The problem manifests itself in broken delta streams in representations.
      When it happens, svnadmin verify errors out with a message such as
      "Decompression of svndiff data failed".
      fsfsverify.py errors out with messages such as
      "Error InvalidCompressedStream: Invalid compressed instr stream at offset 169947
      (Error -3 while decompressing: incorrect header check)"
      "Error InvalidWindow: The window header at offset 537446 appears to be corrupted"
      
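      When checking several revision files, it can help to pull the error name
      and byte offset out of messages like the ones quoted above. The following
      is a minimal sketch in Python. The regular expression is an assumption
      derived only from the two sample fsfsverify.py messages in this issue, and
      parse_error_offsets is a hypothetical helper, not part of fsfsverify.py.

```python
import re

# Assumed message shape, based only on the two samples quoted in this issue:
#   "Error <Name>: <text> at offset <number>"
# Other fsfsverify.py messages may be worded differently.
_OFFSET_RE = re.compile(r"Error (\w+): .*? at offset (\d+)")


def parse_error_offsets(output):
    """Return a list of (error_name, offset) pairs found in the given text."""
    return [(m.group(1), int(m.group(2))) for m in _OFFSET_RE.finditer(output)]
```

      Lines that carry no offset, such as the zlib detail in parentheses, are
      ignored by this sketch.
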
      The problem has been known since at least 2006.
      This is probably the first issue filed on the matter. It is meant to aggregate
      all information known about the problem so far, as collected during an IRC
      conversation on the #svn-dev channel (a link to the full log of this
      conversation is below).
      
      For some unknown reason, a block of svndiff data up to 4K in size is written
      twice in succession.
      This corrupts the representation, causing fatal errors when the representation
      is accessed, but is fixable by erasing the first of the duplicated blocks. 
      The duplicated block apparently forms a complete logical unit in the svndiff stream.
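
      To illustrate the symptom, the sketch below scans a byte string for a
      block that is immediately followed by an identical copy of itself. It is
      only an illustration under stated assumptions: find_adjacent_duplicate and
      its min_block threshold are hypothetical, the 4096-byte limit comes from
      the "up to 4K" observation above, and this is not how fsfsverify.py
      locates the damage (it parses the svndiff stream).

```python
def find_adjacent_duplicate(data, max_block=4096, min_block=16):
    """Return (offset, length) of the first block that is immediately
    followed by an identical copy, or None if there is no such block.

    At each offset the longest candidate block is tried first.
    min_block is a hypothetical threshold to skip trivial repeats.
    """
    n = len(data)
    for offset in range(n):
        # A block and its copy must both fit in the remaining data.
        longest = min(max_block, (n - offset) // 2)
        for length in range(longest, min_block - 1, -1):
            first = data[offset:offset + length]
            second = data[offset + length:offset + 2 * length]
            if first == second:
                return offset, length
    return None
```

      This brute-force scan is slow on large files and can report false
      positives on naturally repetitive data, so a hit is a hint for closer
      inspection rather than proof of corruption.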
      
      In http://svn.haxx.se/dev/archive-2006-02/0473.shtml, Malcolm Rowe gave a
      description of a possible cause of the problem.
      This description contains small inaccuracies:
      The data being written is not a node revision, but svndiff data referred to by a
      node revision.
      Also, the fsfsverify.py script does not copy-and-paste data into a gap, leaving
      the duplicate block in place, but actually moves the data that should follow the
      first instance of the duplicated block up in the data stream, overwriting the
      duplicated block.
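
      The effect of that repair can be sketched on an in-memory byte string:
      everything after the second copy is moved up by one block length, so the
      second copy is overwritten. Because the two copies are identical, the
      result is the same bytes as erasing the first copy. drop_duplicate_block
      is a hypothetical helper for illustration only; it does not reflect how
      fsfsverify.py handles the revision file on disk.

```python
def drop_duplicate_block(data, offset, length):
    """Return a copy of data with one of two adjacent identical blocks removed.

    offset and length describe the first of the two blocks. The bytes that
    follow the second block are moved up by length bytes.
    """
    first = data[offset:offset + length]
    second = data[offset + length:offset + 2 * length]
    if length <= 0 or first != second:
        raise ValueError("no adjacent duplicate block at the given offset")
    # Keep everything up to the end of the first block, then skip the copy.
    return data[:offset + length] + data[offset + 2 * length:]
```

      The explicit comparison guards against removing data when the caller
      supplies a wrong offset or length.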
      
      As Malcolm explains, concurrent writes to the representation from multiple
      processes/threads, with interleaved fsync() calls to flush data to disk,
      are a possible cause of this.
      
      It is not yet known whether current versions of Subversion are affected,
      but it is not unlikely.
      
      In an instance I observed, a 1.4.x server was used with 1.6.x clients.
      The problem was observed fairly often after a post-commit hook script was
      enabled which modified revision properties of the HEAD revision and other revisions.
      
      This problem has also been reported to have occurred in the ASF repository in 2006:
      http://thread.gmane.org/gmane.comp.version-control.subversion.devel/78960/focus=79016
      Further details which were supposedly posted to a mailing list back then have
      not been found yet.
      
      Daniel Shahaf points out that there have been known failures in the FSFS packing
      tests, when run on Windows with -DSVN_FS_FS_DEFAULT_MAX_FILES_PER_DIR=4
      -DPACK_AFTER_EVERY_COMMIT.
      The failure to pack revisions can still be reproduced with the current trunk
      code, and is assumed to be related to file descriptors which are left open.
      It's not clear yet whether the failing packing tests are related to this issue,
      but they also point to a problem with file descriptor handling in FSFS.
      
      Related links:
      
      Background information on the Subversion backend:
      http://svn.apache.org/repos/asf/subversion/trunk/subversion/libsvn_fs_base/notes/fs-history
      http://svn.apache.org/repos/asf/subversion/trunk/subversion/libsvn_fs_base/notes/structure
      http://svn.apache.org/repos/asf/subversion/trunk/subversion/libsvn_fs_fs/structure
      http://svn.apache.org/repos/asf/subversion/trunk/notes/svndiff
      
      fsfsverify.py:
      http://svn.apache.org/repos/asf/subversion/trunk/contrib/server-side/fsfsverify.py
      
      Log of the IRC conversation about this issue:
      http://colabti.org/irclogger/irclogger_log/svn-dev?date=2010-09-03#l46
      


            People

              Unassigned Unassigned
              stsp Stefan Sperling