CouchDB / COUCHDB-926

Compaction does not release file descriptors

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 1.0.1
    • Fix Version/s: 1.0.2, 1.1
    • Component/s: Database Core
    • Labels: None
    • Environment: Ubuntu 9.04/10.4
    • Skill Level: Committers Level (Medium to Hard)

      Description

      When couch compacts a database, file descriptors of the deleted files are left open, causing the freed disk space to not be released to the system. With regular compaction, the system eventually runs out of disk space.

      There is a conversation thread in the user mailing list titled "Couch not releasing deleted files" that gives more insight into the problem, but I have been unable to find a bug report for it, so please accept my apologies if this has already been dealt with.
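The behaviour described above follows from POSIX file semantics: unlinking a file removes its name, but its blocks are reclaimed only once the last open descriptor is closed. A minimal Python sketch of that mechanism, independent of CouchDB (the helper name and the temporary file are illustrative assumptions, not anything from the CouchDB code base):

```python
import os
import tempfile


def demo_unlinked_file_keeps_space(size=1024 * 1024):
    """Write `size` bytes, unlink the file while it is still open, and
    return (links_before, links_after, size_after_unlink)."""
    fd, path = tempfile.mkstemp()
    try:
        os.write(fd, b"x" * size)
        links_before = os.fstat(fd).st_nlink
        os.unlink(path)                 # the name is gone from the directory
        st = os.fstat(fd)               # but the open descriptor still works
        return links_before, st.st_nlink, st.st_size
    finally:
        os.close(fd)                    # only now can the blocks be freed
```

Until `os.close` runs, the data still occupies disk space even though no directory entry points at it, which is why `df` would not report the space as free.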

      1. test_comp.sh (0.6 kB, Filipe Manana)

          Activity

          Adam Kocoloski added a comment - - edited

          Do you have views on the database? How reproducible is the problem? If you restart the server and compact the database without querying any views are the old file descriptors released?

          Russell van der Walt added a comment -

          Yes, the database has views and it happens always. I'm not sure about whether the querying has any effect, but I will put together a test case tomorrow and let you know.

          Adam Kocoloski added a comment -

          Thanks Russell. One of the patches that has been discussed dealt explicitly with view queries, so if the problem persists without any queries we'll know we need to look elsewhere to track it down.

          Russell van der Walt added a comment -

          Hi Adam,

          Have now confirmed that it is only applicable to databases with views. No view, no problem.

          I set up a test that simply kept updating a single document in a loop (and touching the view after every update), with the following map function:

          function(doc) {
            if (doc.type && doc.type == 'record') {
              emit(doc.id, null);
            }
          }

          I let that run for a bit and then, on exiting the loop, compacted the database and the views. The database and view file sizes go down as expected, but "df" does not show an increase in free space.

          lsof has the following output which I reckon sums it up nicely:

          root@ubuntu904:~# lsof | grep -P 'COMMAND|/var/data/couchdb/'
          COMMAND PID USER FD TYPE DEVICE SIZE NODE NAME
          beam 21197 root 13u REG 252,0 4185 3653752 /var/data/couchdb/_users.couch
          beam 21197 root 14u REG 252,0 191959757 3653679 /var/data/couchdb/.delete/33acca4299388cfa73e271ca6b662518 (deleted)
          beam 21197 root 18u REG 252,0 8281 3884279 /var/data/couchdb/.storage-test_design/800b1d5fe6334f272f2a8ad9feb9e0d9.view
          beam 21197 root 20u REG 252,0 36956 3654908 /var/data/couchdb/storage-test.couch

          Note the deleted view file still being held open by beam.

          Hope this helps.
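To total up how much space is pinned by deleted files, lsof output in the layout shown above can be parsed. A rough sketch, assuming the nine-column format from this listing (other lsof versions print different columns or non-numeric sizes, and the function and type names here are hypothetical):

```python
from typing import NamedTuple


class OpenFile(NamedTuple):
    command: str
    pid: int
    fd: str
    size: int
    inode: int
    path: str
    deleted: bool


DELETED_SUFFIX = " (deleted)"

# Two lines copied from the listing above, plus the header row.
SAMPLE = """\
COMMAND PID USER FD TYPE DEVICE SIZE NODE NAME
beam 21197 root 14u REG 252,0 191959757 3653679 /var/data/couchdb/.delete/33acca4299388cfa73e271ca6b662518 (deleted)
beam 21197 root 20u REG 252,0 36956 3654908 /var/data/couchdb/storage-test.couch
"""


def parse_lsof(text):
    """Parse lsof lines with columns COMMAND PID USER FD TYPE DEVICE SIZE NODE NAME."""
    entries = []
    for line in text.splitlines():
        parts = line.split(None, 8)      # NAME may itself contain spaces
        if len(parts) < 9 or parts[0] == "COMMAND":
            continue                     # skip the header and short lines
        command, pid, _user, fd, _type, _device, size, inode, name = parts
        deleted = name.endswith(DELETED_SUFFIX)
        if deleted:
            name = name[: -len(DELETED_SUFFIX)]
        entries.append(
            OpenFile(command, int(pid), fd, int(size), int(inode), name, deleted)
        )
    return entries


def bytes_held_by_deleted(entries):
    """Sum the sizes of files that are unlinked but still open."""
    return sum(e.size for e in entries if e.deleted)
```

Calling `bytes_held_by_deleted(parse_lsof(SAMPLE))` gives the number of bytes that will not be returned to the filesystem until beam closes the descriptor.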

          Russell van der Walt added a comment -

          Sorry, I just realized I didn't actually answer your last question.

          Having dug around a bit more, I can confirm that it only happens if the view file is open.

          If you compact the database without opening the database after a restart, no problem.
          If you compact the database after opening it, but not querying any of the views, no problem.

          If however you compact after querying a view, file handles are left behind.

          It appears that both views and database files are left behind:

          Before compaction:

          root@ubuntu904:~# lsof | grep -P 'COMMAND|/var/data/couchdb/'
          COMMAND PID USER FD TYPE DEVICE SIZE NODE NAME
          beam 22022 root 13u REG 252,0 4185 3653752 /var/data/couchdb/_users.couch
          beam 22022 root 20u REG 252,0 18128988 3653679 /var/data/couchdb/storage-test.couch
          beam 22022 root 21u REG 252,0 12377 3884278 /var/data/couchdb/.storage-test_design/800b1d5fe6334f272f2a8ad9feb9e0d9.view

          After compaction:

          root@ubuntu904:~# lsof | grep -P 'COMMAND|/var/data/couchdb/'
          COMMAND PID USER FD TYPE DEVICE SIZE NODE NAME
          beam 22022 root 13u REG 252,0 4185 3653752 /var/data/couchdb/_users.couch
          beam 22022 root 20u REG 252,0 18128988 3653679 /var/data/couchdb/.delete/3d7816bb09602b0ac9092e1d9fba9172 (deleted)
          beam 22022 root 21u REG 252,0 12377 3884278 /var/data/couchdb/.delete/cb1436e9818a0fde236d7e130cf02fbe (deleted)
          beam 22022 root 23u REG 252,0 4185 3884279 /var/data/couchdb/.storage-test_design/800b1d5fe6334f272f2a8ad9feb9e0d9.view
          beam 22022 root 25u REG 252,0 32860 3654908 /var/data/couchdb/storage-test.couch

          Adam Kocoloski added a comment -

          Oh, now that's interesting - both views and DBs are not being released.

          Can you share the script?

          Russell van der Walt added a comment -

          I'm not sure that it is always the case, as in my first test, only the database wasn't released.

          The script is in C# using the Divan library, but I'm sure you'll get the gist of it:

          var dbname = "storage-test";
          var server = new CouchServer("192.168.1.21");
          if (server.HasDatabase(dbname))
              server.DeleteDatabase(dbname);

          var database = server.GetNewDatabase(dbname);

          var design = database.NewDesignDocument("attached");
          design.AddView("id", @"function(doc) {
              if (doc.type && doc.type == 'telemetry') {
                  emit(doc.id, null);
              }
          }");
          database.WriteDocument(design);

          var id = Guid.NewGuid().ToString();

          using (var timer = new LoopTimer("records")) {
              var terminated = false;
              while (!terminated) {
                  var item = new {
                      id = id,
                      ownerId = Guid.Empty,
                      originId = Guid.NewGuid(),
                      linked = new Guid[] { Guid.NewGuid(), Guid.NewGuid() },
                      date = DateTime.Now.ToUniversalTime(),
                      received = DateTime.Now.ToUniversalTime(),
                      active = true,
                      status = new string[] { "ignition_on" },
                      type = "record"
                  };

                  database.TouchView("attached", "id");

                  var doc = database.GetDocument(id);
                  if (doc != null)
                      database.SaveDocument(new CouchJsonDocument(Json.Encode(item), id, doc.Rev));
                  else
                      database.SaveDocument(new CouchJsonDocument(Json.Encode(item), id));

                  timer.Increment();
                  terminated = Console.KeyAvailable;
              }
          }

          Console.ReadKey();
          Console.WriteLine("Press any key to compact");
          Console.ReadLine();

          database.Compact(true); // compact database and views
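For readers without the Divan library, the last steps of the script (touch the view, then compact the database and its views) might look roughly like this over CouchDB's HTTP API. This is a sketch rather than a translation of what Divan does internally: the helper names, the port 5984 in the usage note, and the assumption that `Compact(true)` corresponds to these two compaction endpoints are all assumptions.

```python
import urllib.request


def repro_steps(db="storage-test", ddoc="attached", view="id"):
    """Return the (method, path) requests that trigger the reported condition:
    query a view so its file is opened, then compact the database and views."""
    return [
        ("GET", f"/{db}/_design/{ddoc}/_view/{view}"),
        ("POST", f"/{db}/_compact"),
        ("POST", f"/{db}/_compact/{ddoc}"),
    ]


def run(base_url, steps):
    """Send each request in order and print the HTTP status it returns."""
    for method, path in steps:
        # CouchDB expects a JSON content type on the compaction POSTs.
        data = b"" if method == "POST" else None
        req = urllib.request.Request(
            base_url + path,
            data=data,
            method=method,
            headers={"Content-Type": "application/json"},
        )
        with urllib.request.urlopen(req) as resp:
            print(method, path, resp.status)


# Example (assumed host and port, not executed here):
# run("http://192.168.1.21:5984", repro_steps())
```

Compaction runs in the background, so the lsof check should be made only after the compaction tasks have finished.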

          Filipe Manana added a comment -

          Fix applied to trunk (revision 1036294) and branch 1.0.x (revision 1036295).

          Filipe Manana added a comment -

          Attaching the script used to reproduce the issue.

          Mathias Leppich added a comment -

          This is a major bug for anyone running a production environment.
          Is there a date for the release of v1.0.2, which will hopefully include the fix (rev 1036295)?

          Adam Kocoloski added a comment -

          Hi Mathias, Filipe already backported the fix to the 1.0.x branch. It will be included in 1.0.2.

          Bob Clary added a comment -

          While the situation in 1.0.2 is much improved over 1.0.1, there are still cases where deleted files are not released and remain held open by beam. I don't have specific steps to reproduce; however, it appears to be related to heavy load, with couchdb crashing after compacting views and leaving the original file open.
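On Linux, lingering handles like these can also be spotted without lsof by reading the symlinks under /proc/<pid>/fd, where the kernel appends " (deleted)" to the target of an unlinked file. A small sketch under that assumption (the function name is hypothetical, and inspecting another user's process typically requires root):

```python
import os

DELETED_SUFFIX = " (deleted)"


def deleted_open_files(pid="self"):
    """Return sorted (fd, original_path) pairs for descriptors of process
    `pid` whose target has been unlinked. Linux only: relies on /proc."""
    fd_dir = f"/proc/{pid}/fd"
    found = []
    for name in os.listdir(fd_dir):
        try:
            target = os.readlink(os.path.join(fd_dir, name))
        except OSError:
            continue                     # descriptor was closed in the meantime
        if target.endswith(DELETED_SUFFIX):
            found.append((int(name), target[: -len(DELETED_SUFFIX)]))
    return sorted(found)
```

Run periodically against the beam process id, a check like this could help narrow down which compactions leave files behind under load.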


            People

            • Assignee: Filipe Manana
            • Reporter: Russell van der Walt
            • Votes: 0
            • Watchers: 2
