Details
-
Bug
-
Status: Open
-
Major
-
Resolution: Unresolved
-
all
-
None
Description
This issue is about the known FSFS corruption issues which the fsfsverify.py script can fix.

The problem manifests itself as broken delta streams in representations. When it happens, svnadmin verify errors out with a message such as:

  "Decompression of svndiff data failed"

fsfsverify.py errors out with messages such as:

  "Error InvalidCompressedStream: Invalid compressed instr stream at offset 169947 (Error -3 while decompressing: incorrect header check)"
  "Error InvalidWindow: The window header at offset 537446 appears to be corrupted"

The problem has been known for a while, since 2006 at least. This is probably the first issue filed on the matter, and is meant to aggregate all known information about the problem so far, collected during an IRC conversation on the #svn-dev channel (a link to the full log of this conversation is below).

For some unknown reason, a block of svndiff data up to 4K in size is written twice in succession. This corrupts the representation, causing fatal errors when the representation is accessed, but it is fixable by erasing the first of the duplicated blocks. The duplicated block apparently forms a complete logical unit in the svndiff stream.

In http://svn.haxx.se/dev/archive-2006-02/0473.shtml, Malcolm Rowe gave a description of a possible cause of the problem. This description contains small inaccuracies:

- The data being written is not a node revision, but svndiff data referred to by a node revision.
- The fsfsverify.py script does not copy-and-paste data into a gap, leaving the duplicate block in place. It actually moves the data that should follow the first instance of the duplicated block up in the data stream, overwriting the duplicated block.

As Malcolm explains, it is possible that concurrent writes to the representation from multiple processes/threads, with interleaved fsync() calls to flush data to disk, are a cause of this.

It is currently unknown whether current versions of Subversion are affected, but it is not unlikely.
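To illustrate the shape of the corruption and of the repair described above, here is a minimal Python sketch. It is not taken from fsfsverify.py and does not parse svndiff windows; it only scans raw bytes for a block that is immediately followed by an identical copy, then removes the first copy so that the following data moves up. The function names, the 4096-byte upper bound (taken from the "up to 4K" observation) and the 16-byte lower bound are assumptions made for illustration. A byte-level repeat can also occur in legitimate data, so a match is only a candidate for inspection.

```python
def find_adjacent_duplicate(data: bytes, max_block: int = 4096, min_block: int = 16):
    """Return (offset, length) of a block immediately followed by an
    identical copy of itself, or None if no such block is found.

    Hypothetical helper: longer blocks are tried first, and for a given
    length the lowest matching offset wins.
    """
    longest = min(max_block, len(data) // 2)
    for length in range(longest, min_block - 1, -1):
        for offset in range(0, len(data) - 2 * length + 1):
            first = data[offset:offset + length]
            second = data[offset + length:offset + 2 * length]
            if first == second:
                return offset, length
    return None


def drop_first_copy(data: bytes, offset: int, length: int) -> bytes:
    """Remove the first of two adjacent identical blocks.

    The bytes that follow the first copy are moved up over it, so the
    result is `length` bytes shorter than the input.
    """
    return data[:offset] + data[offset + length:]
```

The brute-force scan is quadratic and only suitable for small inputs such as a single extracted representation. Run it on a copy of the data, never on a live repository.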
In an instance I observed, a 1.4.x server was used with 1.6.x clients. The problem was observed fairly often after a post-commit hook script was enabled which modified revision properties of the HEAD revision and other revisions.

This problem has also been reported to have occurred in the ASF repository in 2006:
http://thread.gmane.org/gmane.comp.version-control.subversion.devel/78960/focus=79016
Further details which were supposedly posted to a mailing list back then have not been found yet.

Daniel Shahaf points out that there have been known failures in the FSFS packing tests, when run on Windows with -DSVN_FS_FS_DEFAULT_MAX_FILES_PER_DIR=4 -DPACK_AFTER_EVERY_COMMIT. The failure to pack revisions can still be reproduced with the current trunk code, and is assumed to be related to file descriptors which are left open. It's not clear yet whether the failing packing tests are related to this issue, but it also points to a problem with file descriptor handling in FSFS.

Related links:

Background information on the Subversion backend:
http://svn.apache.org/repos/asf/subversion/trunk/subversion/libsvn_fs_base/notes/fs-history
http://svn.apache.org/repos/asf/subversion/trunk/subversion/libsvn_fs_base/notes/structure
http://svn.apache.org/repos/asf/subversion/trunk/subversion/libsvn_fs_fs/structure
http://svn.apache.org/repos/asf/subversion/trunk/notes/svndiff

fsfsverify.py:
http://svn.apache.org/repos/asf/subversion/trunk/contrib/server-side/fsfsverify.py

Log of the IRC conversation about this issue:
http://colabti.org/irclogger/irclogger_log/svn-dev?date=2010-09-03#l46
Attachments
Issue Links
- blocks
-
SVN-3885 FSFS corruption issues
- Open