Mesos / MESOS-184

Log has a space leak


    Details

    • Type: Bug
    • Status: Reviewable
    • Priority: Minor
    • Resolution: Unresolved
    • Affects Version/s: 0.9.0, 0.14.0, 0.14.1, 0.14.2, 0.15.0, 0.16.0, 0.17.0, 0.18.0, 0.18.1, 0.18.2, 0.19.0
    • Fix Version/s: None
    • Component/s: c++ api, replicated log
    • Labels:

      Description

      In short, the Log's access pattern against its underlying LevelDB storage makes background compactions ineffective, so a long-running Log leaks disk space even when Log::Writer::truncate is called often enough that the log should otherwise stay bounded.
      The right fix seems to be to issue a DB::CompactRange(NULL, Slice(truncateToKey)) after a replica learns an Action::TRUNCATE Record. The cost is a synchronous compaction stall on every truncate, so this should perhaps be a configuration option or even an explicit API.
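One way the "configuration option" idea could look is sketched below. This is only an illustration: StorageHooks, TruncateOptions, and onTruncateLearned are hypothetical names that appear in neither Mesos nor LevelDB, and the storage layer is modelled as two callbacks so the sketch stays self-contained. In a real replica the compaction hook would presumably wrap LevelDB's DB::CompactRange, which takes its bounds as pointers (const Slice* begin, const Slice* end), with a null begin meaning "from the first key".

```cpp
#include <functional>
#include <string>

// Hypothetical hooks standing in for the replica's storage layer.
// These are assumptions for illustration, not Mesos or LevelDB types.
struct StorageHooks {
  // Removes every entry whose key sorts before `key`.
  std::function<void(const std::string&)> deleteBefore;
  // Synchronously compacts the key range from the first key up to `key`.
  // With LevelDB this might be implemented roughly as:
  //   leveldb::Slice end(key); db->CompactRange(nullptr, &end);
  std::function<void(const std::string&)> compactUpTo;
};

// Hypothetical option: compaction on truncate is off by default because
// it stalls the caller for the duration of the compaction.
struct TruncateOptions {
  bool compactOnTruncate = false;
};

// Called after the replica has learned a truncate up to `truncateToKey`.
void onTruncateLearned(const StorageHooks& storage,
                       const std::string& truncateToKey,
                       const TruncateOptions& options) {
  storage.deleteBefore(truncateToKey);
  if (options.compactOnTruncate) {
    storage.compactUpTo(truncateToKey);
  }
}
```

Keeping the compaction behind a flag lets latency-sensitive deployments skip the stall, while long-running ones can trade it for bounded disk usage.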
      ===

      Snip of email explanation:
      I spent some time understanding what was going on here, and our usage pattern of leveldb does in fact defeat the background compaction algorithm.

      The docs are here: http://leveldb.googlecode.com/svn/trunk/doc/impl.html in the 'Compactions' section. In short, a compaction operates on one uncompacted file from a level plus all files in the next level that overlap its key range. Since we write sequential keys with no randomness at all, by definition the only overlap we can ever get is in level 0, which is the only level in which leveldb allows sstables to overlap in the first place.
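The overlap argument can be illustrated with a small model. This is a simplified sketch, not LevelDB code: FileRange and countOverlapping are hypothetical names, and keys are modelled as plain integers standing in for sequential Log positions.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Simplified stand-in for an sstable: the inclusive key range it covers.
struct FileRange {
  uint64_t smallest;
  uint64_t largest;
};

// Counts the files in `nextLevel` whose key range overlaps `input`.
// A compaction of `input` would have to rewrite exactly these files.
std::size_t countOverlapping(const FileRange& input,
                             const std::vector<FileRange>& nextLevel) {
  std::size_t count = 0;
  for (const FileRange& file : nextLevel) {
    // Two inclusive ranges overlap when each one starts before the
    // other one ends.
    if (file.smallest <= input.largest && input.smallest <= file.largest) {
      ++count;
    }
  }
  return count;
}
```

With purely sequential writes, a newly written file covers keys strictly above everything already in the next level, so the count is zero. That is consistent with the many "Compacting 1@2 + 0@3 files" entries in the LOG excerpts, where a file is pushed down without merging with anything.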

      That leaves the question of why there is no compaction on open. Looking there: http://code.google.com/p/leveldb/source/browse/db/db_impl.cc#1376
      I see a call to MaybeScheduleCompaction, but following that trail just leads to http://code.google.com/p/leveldb/source/browse/db/version_set.cc?spec=svnbc1ee4d25e09b04e074db330a41f54ef4af0e31b&r=36a5f8ed7f9fb3373236d5eace4f5fea369856ee#1156 which implements the compaction strategy I tried to summarize above. Thus background compactions in our case are limited to level 0 -> level 1 compactions, and level 1 and higher never compact automatically.

      This seems borne out by the LOG files. For example, from smf1-prod (restarts after your manual compaction fix in bold):
      [jsirois@smf1-ajb-35-sr1 ~]$ grep Compacting /var/lib/mesos/scheduler_db/mesos_log/LOG.old
      2012/04/13-00:24:20.356673 44c1e940 Compacting 3@0 + 4@1 files
      2012/04/13-00:24:20.490113 44c1e940 Compacting 5@1 + 281@2 files
      2012/04/13-00:24:25.824995 44c1e940 Compacting 1@1 + 0@2 files
      2012/04/13-00:24:26.008857 44c1e940 Compacting 1@2 + 0@3 files
      2012/04/13-00:24:26.196877 44c1e940 Compacting 1@2 + 0@3 files
      2012/04/13-00:24:26.312465 44c1e940 Compacting 1@2 + 0@3 files
      2012/04/13-00:24:26.429817 44c1e940 Compacting 1@2 + 0@3 files
      2012/04/13-00:24:26.533483 44c1e940 Compacting 1@2 + 0@3 files
      2012/04/13-00:24:26.631044 44c1e940 Compacting 1@2 + 0@3 files
      2012/04/13-00:24:26.733702 44c1e940 Compacting 1@2 + 0@3 files
      2012/04/13-00:24:26.832787 44c1e940 Compacting 1@2 + 0@3 files
      2012/04/13-00:24:26.949864 44c1e940 Compacting 1@2 + 0@3 files
      2012/04/13-00:24:27.052502 44c1e940 Compacting 1@2 + 0@3 files
      2012/04/13-00:24:27.164623 44c1e940 Compacting 1@2 + 0@3 files
      2012/04/13-00:24:27.275621 44c1e940 Compacting 1@2 + 0@3 files
      2012/04/13-00:24:27.376748 44c1e940 Compacting 1@2 + 0@3 files
      2012/04/13-00:24:27.477728 44c1e940 Compacting 1@2 + 0@3 files
      2012/04/13-00:24:27.611332 44c1e940 Compacting 1@2 + 0@3 files
      2012/04/13-00:24:28.050275 44c1e940 Compacting 50@2 + 242@3 files
      2012/04/13-00:24:32.455665 44c1e940 Compacting 1@2 + 0@3 files
      2012/04/13-00:24:32.538566 44c1e940 Compacting 1@3 + 0@4 files
      2012/04/13-00:24:32.819205 44c1e940 Compacting 1@3 + 0@4 files
      2012/04/13-00:24:33.052064 44c1e940 Compacting 1@3 + 0@4 files
      2012/04/13-00:24:33.198850 44c1e940 Compacting 1@3 + 0@4 files
      2012/04/13-00:24:33.350893 44c1e940 Compacting 1@3 + 0@4 files
      2012/04/13-00:24:33.521784 44c1e940 Compacting 1@3 + 0@4 files
      2012/04/13-00:24:33.693531 44c1e940 Compacting 1@3 + 0@4 files
      2012/04/13-00:24:33.847151 44c1e940 Compacting 1@3 + 0@4 files
      2012/04/13-00:24:34.034277 44c1e940 Compacting 1@3 + 0@4 files
      2012/04/13-00:24:34.225582 44c1e940 Compacting 1@3 + 0@4 files
      2012/04/13-00:24:34.390228 44c1e940 Compacting 1@3 + 0@4 files
      2012/04/13-00:24:34.554127 44c1e940 Compacting 1@3 + 0@4 files
      2012/04/13-00:24:34.715242 44c1e940 Compacting 1@3 + 0@4 files
      2012/04/13-00:24:34.852110 44c1e940 Compacting 1@3 + 0@4 files
      2012/04/13-00:24:35.046899 44c1e940 Compacting 68@3 + 331@4 files
      2012/04/13-00:25:02.582758 44c1e940 Compacting 433@3 + 2159@4 files
      2012/04/13-00:26:39.827096 44c1e940 Compacting 1@3 + 0@4 files
      2012/04/13-00:26:39.992623 44c1e940 Compacting 72@4 + 354@5 files
      2012/04/13-00:27:13.024120 44c1e940 Compacting 9@4 + 51@5 files
      2012/04/13-00:27:18.007566 44c1e940 Compacting 9@4 + 48@5 files
      2012/04/13-00:27:23.026351 44c1e940 Compacting 8@4 + 41@5 files
      2012/04/13-00:27:28.408619 44c1e940 Compacting 6@4 + 33@5 files
      2012/04/13-00:27:32.522630 44c1e940 Compacting 6@4 + 32@5 files
      2012/04/13-00:27:36.719610 44c1e940 Compacting 6@4 + 31@5 files
      2012/04/13-00:27:41.277302 44c1e940 Compacting 6@4 + 33@5 files
      2012/04/13-00:27:44.928451 44c1e940 Compacting 6@4 + 32@5 files
      2012/04/13-00:27:48.168874 44c1e940 Compacting 6@4 + 34@5 files
      2012/04/13-00:27:52.718402 44c1e940 Compacting 6@4 + 32@5 files
      2012/04/13-00:27:55.665107 44c1e940 Compacting 6@4 + 33@5 files
      2012/04/13-00:27:59.381808 44c1e940 Compacting 6@4 + 34@5 files
      2012/04/13-00:28:03.592802 44c1e940 Compacting 6@4 + 33@5 files
      2012/04/13-00:28:07.179032 44c1e940 Compacting 664@4 + 3330@5 files
      2012/04/13-00:29:58.239662 44c1e940 Compacting 101@4 + 500@5 files
      2012/04/13-00:30:22.333750 44c1e940 Compacting 1@4 + 0@5 files

      2012/04/13-00:45:28.851715 44c1e940 Compacting 4@0 + 1@1 files
      2012/04/13-01:00:31.152105 44c1e940 Compacting 4@0 + 3@1 files
      2012/04/13-01:10:33.167940 44c1e940 Compacting 4@0 + 3@1 files
      2012/04/13-01:25:35.113416 44c1e940 Compacting 4@0 + 3@1 files
      2012/04/13-01:35:37.621499 44c1e940 Compacting 4@0 + 2@1 files

      [jsirois@smf1-ajb-35-sr1 ~]$ grep Compacting /var/lib/mesos/scheduler_db/mesos_log/LOG
      2012/04/13-01:44:32.533694 44c27940 Compacting 2@0 + 3@1 files
      2012/04/13-01:44:32.586958 44c27940 Compacting 2@1 + 6@2 files
      2012/04/13-01:44:32.739514 44c27940 Compacting 1@2 + 0@3 files
      2012/04/13-01:44:32.768764 44c27940 Compacting 1@2 + 0@3 files
      2012/04/13-01:44:32.843866 44c27940 Compacting 1@3 + 0@4 files
      2012/04/13-01:44:32.973304 44c27940 Compacting 1@3 + 0@4 files
      2012/04/13-01:44:33.009686 44c27940 Compacting 1@4 + 2@5 files
      2012/04/13-01:44:33.074056 44c27940 Compacting 1@4 + 0@5 files

      2012/04/13-02:01:42.947456 44c27940 Compacting 4@0 + 1@1 files
      2012/04/13-02:16:45.326088 44c27940 Compacting 4@0 + 4@1 files
      2012/04/13-02:31:48.360851 44c27940 Compacting 4@0 + 1@1 files
      2012/04/13-02:41:50.055622 44c27940 Compacting 4@0 + 3@1 files
      2012/04/13-02:51:51.889148 44c27940 Compacting 4@0 + 4@1 files
      2012/04/13-03:01:54.345784 44c27940 Compacting 4@0 + 4@1 files
      2012/04/13-03:11:55.987774 44c27940 Compacting 4@0 + 4@1 files
      2012/04/13-03:21:57.701121 44c27940 Compacting 4@0 + 4@1 files
      2012/04/13-03:31:59.373435 44c27940 Compacting 4@0 + 4@1 files
      2012/04/13-03:42:01.047061 44c27940 Compacting 4@0 + 4@1 files
      2012/04/13-03:52:03.088683 44c27940 Compacting 4@0 + 4@1 files
      2012/04/13-04:02:05.181165 44c27940 Compacting 4@0 + 4@1 files
      2012/04/13-04:12:06.757773 44c27940 Compacting 4@0 + 4@1 files
      2012/04/13-04:22:08.598259 44c27940 Compacting 4@0 + 4@1 files
      2012/04/13-04:32:10.882913 44c27940 Compacting 4@0 + 4@1 files
      2012/04/13-04:42:12.602192 44c27940 Compacting 4@0 + 4@1 files
      2012/04/13-04:52:14.779705 44c27940 Compacting 4@0 + 4@1 files
      2012/04/13-05:02:16.621063 44c27940 Compacting 4@0 + 4@1 files
      2012/04/13-05:12:18.608767 44c27940 Compacting 4@0 + 4@1 files
      2012/04/13-05:22:20.453201 44c27940 Compacting 4@0 + 4@1 files
      2012/04/13-05:32:22.215804 44c27940 Compacting 4@0 + 4@1 files
      2012/04/13-05:42:23.882423 44c27940 Compacting 4@0 + 4@1 files
      2012/04/13-05:52:25.553032 44c27940 Compacting 4@0 + 4@1 files

      The format is [number of sstables compacted]@[level #]. The logs show that all levels are now compacted on startup, but once running we only see level 0 -> level 1 compactions, which accounts for the observed space leak.
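To check the same pattern on other hosts, the LOG lines can be tallied by source level. The sketch below assumes only the line format shown in the excerpts above; Compaction, parseCompaction, and compactionsPerLevel are hypothetical helper names, not part of LevelDB or Mesos.

```cpp
#include <cstdio>
#include <map>
#include <sstream>
#include <string>

// One parsed "Compacting N@L + M@L' files" entry.
struct Compaction {
  int inputFiles;    // files taken from the source level
  int inputLevel;    // source level
  int overlapFiles;  // overlapping files taken from the next level
  int outputLevel;   // next level
};

// Parses a single LOG line. Returns false if the line is not a
// compaction entry in the expected format.
bool parseCompaction(const std::string& line, Compaction* out) {
  const std::string marker = "Compacting ";
  const std::string::size_type pos = line.find(marker);
  if (pos == std::string::npos) {
    return false;
  }
  return std::sscanf(line.c_str() + pos + marker.size(),
                     "%d@%d + %d@%d files",
                     &out->inputFiles, &out->inputLevel,
                     &out->overlapFiles, &out->outputLevel) == 4;
}

// Counts compaction entries per source level across a LOG excerpt.
std::map<int, int> compactionsPerLevel(const std::string& log) {
  std::map<int, int> counts;
  std::istringstream in(log);
  std::string line;
  Compaction c;
  while (std::getline(in, line)) {
    if (parseCompaction(line, &c)) {
      ++counts[c.inputLevel];
    }
  }
  return counts;
}
```

Feeding the steady-state portion of a LOG file through compactionsPerLevel should yield entries only for level 0 if the behaviour described above holds.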


              People

              • Assignee: Ilya Pronin (ipronin)
              • Reporter: John Sirois (jsirois)
              • Votes: 0
              • Watchers: 8
