Solr / SOLR-1533

Partition data directories into multiple "bucket" directories

    Details

    • Type: New Feature
    • Status: Closed
    • Priority: Major
    • Resolution: Won't Fix
    • Affects Version/s: None
    • Fix Version/s: 4.3, 5.0
    • Component/s: multicore
    • Labels:
      None

      Description

      Provide a way to partition data directories into multiple "bucket" directories. For example, instead of creating 10,000 data directories inside one base data directory, Solr can assign each core to one of 4 base directories, thereby distributing them.

      The underlying problem is that with a large number of indexes, system performance gets slower and slower as the number of cores grows, because every new core adds another directory to the single base data directory.
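As an illustration of the idea described above, here is a minimal sketch of one way to spread cores across bucket directories by hashing the core name. It is not code from Solr: the class `BucketAssigner`, the method `dataDirFor`, and the directory names `data/bucket0` through `data/bucket3` are all hypothetical and chosen only for this example.

```java
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical helper, not part of Solr: maps a core name to one of several base directories. */
final class BucketAssigner {
    private final List<Path> buckets;

    BucketAssigner(List<Path> buckets) {
        if (buckets.isEmpty()) {
            throw new IllegalArgumentException("at least one bucket directory is required");
        }
        this.buckets = new ArrayList<>(buckets);
    }

    /** Returns <bucket>/<coreName>/data, where the bucket is picked by hashing the core name. */
    Path dataDirFor(String coreName) {
        // floorMod keeps the index non-negative even when hashCode() is negative.
        int index = Math.floorMod(coreName.hashCode(), buckets.size());
        return buckets.get(index).resolve(coreName).resolve("data");
    }

    public static void main(String[] args) {
        // Four assumed bucket directories, matching the "4 base directories" example.
        BucketAssigner assigner = new BucketAssigner(Arrays.asList(
                Paths.get("data", "bucket0"),
                Paths.get("data", "bucket1"),
                Paths.get("data", "bucket2"),
                Paths.get("data", "bucket3")));
        for (String core : new String[] {"core1", "core2", "core3"}) {
            System.out.println(core + " -> " + assigner.dataDirFor(core));
        }
    }
}
```

Hashing gives a stable mapping without any stored state, but it does not rebalance when the number of buckets changes; that trade-off is an assumption of this sketch, not something the issue specifies.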

          Activity

          Otis Gospodnetic added a comment -

          Trying to understand the need for this (I might have missed the discussion on the ML?).
          Isn't the creator of the core in control of the data dir ( http://wiki.apache.org/solr/CoreAdmin#CREATE ) and thus their distribution?
          Or is the goal of this to remove the logic and knowledge from the client and let Solr control where core's data is going to be placed, depending on the "core data distribution policy"?
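For reference, the client-controlled approach mentioned here could look roughly like the sketch below, which builds a CoreAdmin CREATE request that sets `dataDir` explicitly. The `action`, `name`, `instanceDir`, and `dataDir` parameters are the ones documented for CoreAdmin CREATE; the host, port, core name, and paths are assumptions made up for the example, and `CreateCoreUrl` is a hypothetical helper.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

/** Hypothetical helper: builds a CoreAdmin CREATE URL with an explicit dataDir. */
final class CreateCoreUrl {

    static String build(String solrBase, String coreName, String instanceDir, String dataDir) {
        return solrBase + "/admin/cores?action=CREATE"
                + "&name=" + encode(coreName)
                + "&instanceDir=" + encode(instanceDir)
                + "&dataDir=" + encode(dataDir);
    }

    private static String encode(String value) {
        try {
            return URLEncoder.encode(value, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is required on every Java platform, so this should not happen.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Assumed local Solr instance and an assumed bucket path chosen by the client.
        System.out.println(build("http://localhost:8983/solr",
                "core42", "core42", "/data/bucket1/core42"));
    }
}
```

With this approach the client has to decide, and remember, which bucket each core lives in.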

          Shalin Shekhar Mangar added a comment -

          Trying to understand the need for this (I might have missed the discussion on the ML?).

          No, there was no discussion on the ML though this has been an item on the wiki for some time now.

          Isn't the creator of the core in control of the data dir ( http://wiki.apache.org/solr/CoreAdmin#CREATE ) and thus their distribution?
          Or is the goal of this to remove the logic and knowledge from the client and let Solr control where core's data is going to be placed, depending on the "core data distribution policy"?

          Yes, the creator can specify the dataDir during core creation. The problem is that maintaining exact filesystem path information in an external system, in addition to the actual host/port/core name mapping, is a huge pain. Also, Solr has more information about the system, e.g. how many directories are in a given bucket. It is far easier to have Solr manage it. The strategy for distribution can be made pluggable if you want.
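A pluggable distribution strategy of the kind suggested above might be sketched as follows. This is only an illustration under stated assumptions: `BucketStrategy` and `LeastLoadedStrategy` are hypothetical names, not Solr interfaces, and the policy shown (pick the bucket with the fewest subdirectories) is just one possible use of the "how many directories in a given bucket" information.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.stream.Stream;

/** Hypothetical plug-in point: decides which bucket directory a new core goes into. */
interface BucketStrategy {
    Path chooseBucket(String coreName, List<Path> buckets) throws IOException;
}

/** Picks the bucket that currently holds the fewest subdirectories. */
final class LeastLoadedStrategy implements BucketStrategy {

    @Override
    public Path chooseBucket(String coreName, List<Path> buckets) throws IOException {
        if (buckets.isEmpty()) {
            throw new IllegalArgumentException("no buckets configured");
        }
        Path best = null;
        long bestCount = Long.MAX_VALUE;
        for (Path bucket : buckets) {
            long count = countCoreDirs(bucket);
            if (count < bestCount) { // ties go to the earlier bucket in the list
                best = bucket;
                bestCount = count;
            }
        }
        return best;
    }

    /** Counts immediate subdirectories; a bucket that does not exist yet counts as empty. */
    static long countCoreDirs(Path bucket) throws IOException {
        if (!Files.isDirectory(bucket)) {
            return 0;
        }
        try (Stream<Path> children = Files.list(bucket)) {
            return children.filter(p -> Files.isDirectory(p)).count();
        }
    }
}
```

Because the strategy sits behind an interface, a hash-based or round-robin policy could be swapped in without changing the caller. Listing each bucket on every core creation is simple but not free; a real implementation would probably cache the counts.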

          Noble Paul added a comment -

          I guess we can make that a pluggable CoreAdminHandler, so users who need this feature can use that.

          Hoss Man added a comment -

          Bulk updating 240 Solr issues to set the Fix Version to "next" per the process outlined in this email...

          http://mail-archives.apache.org/mod_mbox/lucene-dev/201005.mbox/%3Calpine.DEB.1.10.1005251052040.24672@radix.cryptio.net%3E

          Selection criteria were "Unresolved" with a Fix Version of 1.5, 1.6, 3.1, or 4.0. Email notifications were suppressed.

          A unique token for finding these 240 issues in the future: hossversioncleanup20100527

          Robert Muir added a comment -

          Bulk move 3.2 -> 3.3

          Robert Muir added a comment -

          3.4 -> 3.5

          Hoss Man added a comment -

          Bulk move of fixVersion=3.6 -> fixVersion=4.0 for issues that have no assignee and have not been updated recently.

          Email notification suppressed to prevent mass-spam.
          Pseudo-unique token identifying these issues: hoss20120321nofix36

          Erick Erickson added a comment -

          I'm thinking of killing this JIRA. SOLR-1306 (which is up real soon now on my list) seems like it would handle this case. Having a pluggable CoreDescriptorProvider should allow the resident code to provide whatever instanceDir one wants. Since it's resident on the server, it can do whatever is needed to ensure that the actual core directories are distributed as desired.

          So I'll kill this (or, more accurately, say it's a duplicate of 1306) unless there's a reason 1306 won't handle this.

          Steve Rowe added a comment -

          Erick, do you plan on following through on your threat to kill this issue?

          Erick Erickson added a comment -

          This is very much in the works. It's related to SOLR-1028, etc. I'm waiting on 4.1 to be cut so it can all bake in 4.2 for a while before being released into the wild. There are a bunch of other related JIRAs as well, SOLR-4196 and SOLR-4083 in particular.

          So I'll be tracking this and probably kill it when I'm sure the directory-discovery bits work...

          Uwe Schindler added a comment -

          Closed after release.


            People

            • Assignee: Erick Erickson
            • Reporter: Shalin Shekhar Mangar
            • Votes: 1
            • Watchers: 3
