CouchDB / COUCHDB-753

Add config option for view compact dir

    Details

    • Type: Improvement
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: Database Core
    • Labels:
    • Skill Level:
      New Contributors Level (Easy)

      Description

      CouchDB creates a "foo.view.compact" file in the view directory ("view_index_dir") when you run compact against a view.

      I'd really like to be able to specify another directory where this ".compact" file is created and worked on. This is especially helpful when it's difficult to run compaction because you run out of disk space on the same device.

        Activity

        Till Klampaeckel created issue -
        Adam Kocoloski added a comment -

        Hi Till, I'm having trouble figuring out how this would work. Are you saying you want to create the .compact file on another volume, then move it back to the volume hosting the view_index_dir when it's time to do the switchover?

        The trouble I see with that approach is that the final move of the compacted view index file would take a very long time. If the server crashed during the move, how would it know which file to choose? I guess one possible priority list might be

        1) .view file in view_index_dir
        2) .view.compact file in view_compact_dir
        3) .view.compact file in view_index_dir

        This would hopefully take care of the case where the .view file has been deleted but the server crashed before the .view.compact file could be completely transferred over to the original volume and renamed.
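The priority list above can be sketched as a small lookup routine. This is only an illustration in Python, not CouchDB code: `pick_view_file` is a hypothetical helper, and `view_compact_dir` is the proposed option from this ticket, not an existing setting.

```python
import os
import tempfile
from typing import Optional


def pick_view_file(name: str, view_index_dir: str, view_compact_dir: str) -> Optional[str]:
    """Return the first existing file following the proposed priority order.

    `view_compact_dir` is the option proposed in this ticket (an assumption,
    not an existing CouchDB setting).
    """
    candidates = [
        # 1) the regular index file always wins
        os.path.join(view_index_dir, name + ".view"),
        # 2) a compaction file still sitting on the other volume
        os.path.join(view_compact_dir, name + ".view.compact"),
        # 3) a compaction file in the index directory itself
        os.path.join(view_index_dir, name + ".view.compact"),
    ]
    for path in candidates:
        if os.path.isfile(path):
            return path
    return None  # nothing on disk: the index would have to be rebuilt
```

With this ordering, a copy that was only partially transferred into view_index_dir is never chosen while the complete file still exists in view_compact_dir.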

        Till Klampaeckel added a comment -

        I admit, I haven't really thought this through. My issue is that sometimes people run out of disk space with compaction.

        "You" (not necessarily you or CouchDB) could do something like block writes etc. when a compact is about to replace the database dir. Expose something from the server via JSON?

        Jan Lehnardt added a comment -

        I don't have much to add, but I could envision a pro setup with two I/O subsystems/drives/mountpoints where CouchDB would alternate between db_dir and db_compact_dir (and similarly for views). It'd be a specific set-up case, but still very neat.

        Filipe Manana added a comment -

        I like this last proposition from Jan.

        We could define in the .ini config a group of directories where DBs and compaction would be stored/done.

        When a DB open request comes in, CouchDB would loop over that list of directories and stop as soon as it finds one having the DB file.
        If none has it, it would loop through all the directories again but looking for one containing a DB compaction file, and rename it to the main DB file (like it's currently done).

        With 2 or 3 directories, I think this wouldn't slow down (significantly) open DB requests.
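The two-pass lookup described above might look roughly like the following. It is a Python sketch rather than the actual Erlang code; `open_db_path` is a hypothetical helper, and the `.couch` / `.couch.compact` file suffixes are assumed here for illustration.

```python
import os
import tempfile
from typing import List, Optional


def open_db_path(db_name: str, db_dirs: List[str]) -> Optional[str]:
    """Locate a DB file across several directories (hypothetical helper)."""
    # Pass 1: stop at the first directory that holds the main DB file.
    for d in db_dirs:
        main = os.path.join(d, db_name + ".couch")
        if os.path.isfile(main):
            return main
    # Pass 2: no main file anywhere, so look for a compaction file and
    # promote it. The rename stays inside one directory, so no data is
    # copied between devices.
    for d in db_dirs:
        compact = os.path.join(d, db_name + ".couch.compact")
        if os.path.isfile(compact):
            main = os.path.join(d, db_name + ".couch")
            os.rename(compact, main)
            return main
    return None  # the DB does not exist in any configured directory
```

The cost per open request is at most two file checks per configured directory, which is consistent with the expectation that 2 or 3 directories would not slow things down noticeably.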

        When creating a DB (or compacting an existing one), we could randomly choose which directory to use.

        If everyone sees this as a plus (I do) and no one finds a flaw in the idea, I can implement it and provide a patch soon.

        Filipe Manana added a comment -

        I just implemented a solution where:

        • when creating a new DB, couch_server will select one directory in a round-robin fashion
        • when compacting a DB, one of the directories is also selected in a round-robin fashion to store the compaction file
        • when compaction finishes, the new DB file is not moved from one directory to another (avoiding expensive IO)

        For example, if we have 2 DB dirs and we create 6 DBs, each DB dir will have 3 DBs. This is great if each directory maps to a different IO device.
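The round-robin placement can be illustrated with a short Python sketch. It is not the code from the linked branch (which lives in couch_server and is written in Erlang); `DirAllocator` and its methods are hypothetical names, and sharing one rotation between creation and compaction is an assumption.

```python
import itertools
from collections import Counter
from typing import Dict, Iterable


class DirAllocator:
    """Round-robin choice of a directory for new DBs and compaction files."""

    def __init__(self, db_dirs: Iterable[str]) -> None:
        self._cycle = itertools.cycle(list(db_dirs))
        self.location: Dict[str, str] = {}  # db name -> directory holding it

    def create(self, db_name: str) -> str:
        # A new DB goes into the next directory in the rotation.
        self.location[db_name] = next(self._cycle)
        return self.location[db_name]

    def compact(self, db_name: str) -> str:
        # The compaction file is written to the next directory in the
        # rotation; after the switchover the DB simply lives there, so
        # nothing is moved between directories.
        self.location[db_name] = next(self._cycle)
        return self.location[db_name]


alloc = DirAllocator(["/data1", "/data2"])
for i in range(6):
    alloc.create("db%d" % i)
spread = Counter(alloc.location.values())  # 3 DBs in each directory
```

Because a compacted DB stays wherever its compaction file was written, the per-directory counts can drift from an even split over time.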

        The code is at http://github.com/fdmanana/couchdb/compare/multiple_db_dirs
        It includes a comprehensive Etap test case.

        I'll attach a patch here if I get positive feedback.

        If there's no objection to this approach, I'll do the same for multiple view index directories, though it makes sense for that to go into a separate patch.

        Paul Joseph Davis made changes -
        Field Original Value New Value
        Skill Level New Contributors Level (Easy)

          People

          • Assignee:
            Unassigned
            Reporter:
            Till Klampaeckel
          • Votes:
            1
            Watchers:
            3
