  1. CouchDB
  2. COUCHDB-2040

Compaction fails when copying attachment


Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 1.7.0
    • Component/s: Database Core
    • Labels: None

    Description

      Original discussion from the user mailing list: http://mail-archives.apache.org/mod_mbox/couchdb-user/201401.mbox/%3cD14F971A540B974BB75ADC55F00F34CA69A356DA@SEX1.getback.ad2008r2.corp%3e

      Digest:
      During database compaction, the process fails at about 50% with the following error: http://pastebin.com/qeaZNHMj (CouchDB 1.2.0, Windows Server 2008 R2 Enterprise).
      After upgrading both the server and CouchDB, the error remained the same: http://pastebin.com/feJWu7bN (CouchDB 1.5.0, Ubuntu 12.04.3 LTS (GNU/Linux 3.8.0-33-generic x86_64)).

      There was one prior attempt at compaction that failed because of insufficient disk space: http://pastebin.com/S1URXN0p
      After this initial failure, I've made sure that there's sufficient disk space for the .compact file.

      The .compact file was always removed before trying compaction again.
      At the request of Robert Samuel Newson, I've also tried with an empty .compact file - the results were the same: http://pastebin.com/MJCgGM8C.
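For reference, each retry described above amounts to a POST to the database's _compact endpoint, followed by polling the database info document until compact_running turns false. A minimal sketch, assuming the host and port used by the curl calls later in this report and a server that does not require admin credentials; the function names are hypothetical:

```python
import json
import urllib.request

BASE_URL = "http://localhost:5984"  # assumption: same host/port as the curl calls
DB_NAME = "ecrepo"                  # database name taken from the curl calls


def compaction_request(base_url=BASE_URL, db_name=DB_NAME):
    """Build (but do not send) the request that starts compaction."""
    return urllib.request.Request(
        "%s/%s/_compact" % (base_url, db_name),
        data=b"",  # empty body; CouchDB still expects the JSON content type
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def compaction_running(base_url=BASE_URL, db_name=DB_NAME):
    """Read the database info document and report its compact_running flag."""
    with urllib.request.urlopen("%s/%s" % (base_url, db_name)) as resp:
        return bool(json.load(resp).get("compact_running"))
```

Passing the result of compaction_request() to urllib.request.urlopen starts the run. If the server has admins configured, the request would also need credentials, which this sketch leaves out.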

      Our I/O subsystem consists of several RAID 5 arrays - the admins claim that they've been running error-free since inception. We have yet to run a parity check, since that would require taking the array offline, and I'd rather not do that without exhausting other options.

      Config files from the 1.2.0/Windows server (since that's where the fault must have occurred):
      default.ini: http://pastebin.com/kUz0qyNk
      local.ini: http://pastebin.com/srZUMwzB

      Other than delayed_commits, which is left at its default of true, there are no options set that could affect fsync() behavior.

      I've run:
      curl localhost:5984/ecrepo/_changes?include_docs=true
      curl localhost:5984/ecrepo/_all_docs?include_docs=true
      and both calls succeeded, which would suggest that a faulty attachment (incorrect checksum/length) is at fault somewhere.
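Since both calls return only document bodies and attachment stubs, one way to narrow this down would be to read every attachment back and compare it with its stub. The following is a rough sketch, not a tested diagnostic: it assumes the same host and database as the curl calls, loads the whole _all_docs response into memory, and uses hypothetical helper names. The digest comparison is off by default because CouchDB computes the digest over the stored bytes, which may be gzip-compressed for some content types, so a mismatch there is only a hint.

```python
import base64
import hashlib
import json
import urllib.parse
import urllib.request

BASE_URL = "http://localhost:5984"  # assumption: same host/port as the curl calls
DB_NAME = "ecrepo"                  # database name taken from the curl calls


def doc_path(doc_id):
    """URL-encode a document ID, keeping the literal '_design/' prefix."""
    if doc_id.startswith("_design/"):
        return "_design/" + urllib.parse.quote(doc_id[len("_design/"):], safe="")
    return urllib.parse.quote(doc_id, safe="")


def check_attachment(stub, body, check_digest=False):
    """Return a list of mismatches between an attachment stub and the bytes read."""
    problems = []
    if "length" in stub and stub["length"] != len(body):
        problems.append("length %d != expected %d" % (len(body), stub["length"]))
    digest = stub.get("digest", "")
    # Opt-in: only meaningful for attachments stored without compression.
    if check_digest and digest.startswith("md5-"):
        actual = "md5-" + base64.b64encode(hashlib.md5(body).digest()).decode("ascii")
        if actual != digest:
            problems.append("digest %s != expected %s" % (actual, digest))
    return problems


def scan_database(base_url=BASE_URL, db_name=DB_NAME):
    """Yield (doc_id, attachment_name, problems) for each suspicious attachment."""
    url = "%s/%s/_all_docs?include_docs=true" % (base_url, db_name)
    with urllib.request.urlopen(url) as resp:
        rows = json.load(resp)["rows"]
    for row in rows:
        doc = row.get("doc") or {}
        for name, stub in doc.get("_attachments", {}).items():
            att_url = "%s/%s/%s/%s" % (
                base_url, db_name, doc_path(doc["_id"]), urllib.parse.quote(name))
            try:
                with urllib.request.urlopen(att_url) as resp:
                    body = resp.read()
            except Exception as exc:  # HTTP errors, truncated reads, resets
                yield doc["_id"], name, ["read failed: %r" % (exc,)]
                continue
            problems = check_attachment(stub, body)
            if problems:
                yield doc["_id"], name, problems
```

Iterating over scan_database() and printing each tuple would list the documents whose attachments cannot be read back cleanly; an attachment that fails here is a plausible candidate for the one compaction chokes on.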

          People

            Assignee: Unassigned
            Reporter: Igor Klimer (i.klimer)