SOLR-4506

[solr4.0.0] many index.{date} dir in replication node

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: 4.0
    • Fix Version/s: 5.3, 6.0
    • Component/s: SolrCloud
    • Labels:
      None
    • Environment:

      Solr 4.0 running on SUSE 11.
      Memory: 32 GB
      CPU: 16 cores

      Description

      In our test, we used the SolrCloud feature in Solr 4.0 (version detail: 4.0.0.2012.10.06.03.04.33).
      The SolrCloud configuration is 3 shards with 2 replicas per shard.
      We found more than 25 directories named index.{date} on one replica node belonging to shard 3.
      For example:
      index.20130217233335864 index.20130218012211880 index.20130218015714713 index.20130218023101958 index.20130218030424083 tlog
      index.20130218005648324 index.20130218012751078 index.20130218020141293

      The issue looks like SOLR-1781, but that one is marked as fixed in 4.0-BETA and 5.0.
      Is it also fixed in Solr 4.0? If so, why are we seeing the issue again?
      What can I do?
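
      To check how many such directories a node has accumulated, the listing above can be reproduced with a short script. This is a minimal sketch, not part of Solr: the function name find_index_dirs is hypothetical, and reading the 17-digit suffix as yyyyMMddHHmmssSSS is an assumption inferred from the directory names in this report.

```python
import re
from datetime import datetime
from pathlib import Path

# Matches names such as index.20130217233335864 (17 digits after the dot).
INDEX_DIR_PATTERN = re.compile(r"^index\.(\d{17})$")


def find_index_dirs(data_dir):
    """Return (name, timestamp) pairs for index.<digits> directories, oldest first."""
    found = []
    for entry in Path(data_dir).iterdir():
        match = INDEX_DIR_PATTERN.match(entry.name)
        if not (match and entry.is_dir()):
            continue  # skips tlog, a plain "index" directory, and regular files
        digits = match.group(1)
        # Assumed layout: 14 digits of date and time, then 3 digits of milliseconds.
        stamp = datetime.strptime(digits[:14], "%Y%m%d%H%M%S")
        stamp = stamp.replace(microsecond=int(digits[14:]) * 1000)
        found.append((entry.name, stamp))
    return sorted(found, key=lambda item: item[1])
```

      Calling find_index_dirs on the core's data directory and taking the length of the result gives the number of timestamped index directories, which makes it easy to see whether the count grows after each interrupted recovery.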

      1. SOLR-4506.patch
        21 kB
        Timothy Potter

          Activity

          Mark Miller added a comment -

          if it is fixed too in solr4.0, why we find the issue again ?

          I think because it was probably a different bug. It will probably also be fixed in 4.2, given some other recent fixes.

          These are all high-level guesses, though.

          zhuojunjian added a comment -

          Hi,
          Thanks for your reply.
          What can we do now? Should we upgrade from Solr 4.0 to Solr 4.1, or wait?
          Some more specific questions:
          1. In what cases are directories such as index.{date} created?
          2. Is there a configuration option in Solr 4.0 to clean up unused index.{date} directories?

          zhuojunjian added a comment -

          I checked the Solr JIRA list and found some similar issues, for example SOLR-3853. Because we lost the log files, we cannot determine what causes the issue. I am trying to reproduce it.

          zhuojunjian added a comment -

          Hi,
          We reproduced the issue today.
          Step 1: Kill one replica node (node A) and leave it stopped.
          Step 2: Import a large amount of data into the SolrCloud cluster so that the leader node creates a lot of new index data.
          Step 3: Start node A normally; it begins downloading files from its leader node.
          Step 4: Before node A finishes the download, kill node A again.
          Step 5: Start node A normally again. There are now two index directories in ../data/., and each repetition of steps 3 and 4 adds another index directory.

          I think this may be a bug. Do you have any idea about it?

          Mark Miller added a comment -

          I think it's a known issue that interrupted replications will leave directories around. We can look at cleaning them up on startup or something...

          zhuojunjian added a comment -

          OK
          I got that.

          zhuojunjian added a comment -

          thanks for your reply.

          zhuojunjian added a comment -

          Hi,
          Do you have a plan for this issue?
          Will it be fixed in 4.1 or in another release?

          Mark Miller added a comment -

          Yup, I can try and deal with this for 4.3. 4.2 is rolling now and it will take me some time to get this worked out.

          zhuojunjian added a comment -

          Hi,
          OK.
          When do you think version 4.3 will be released?

          Steve Rowe added a comment -

          Bulk move 4.4 issues to 4.5 and 5.0

          Uwe Schindler added a comment -

          Move issue to Solr 4.9.

          Timothy Potter added a comment -

          Hi Mark Miller, I'd like to take this one up, as I need it for another issue I'm working on. Let me know if you have any updated code or ideas on this; otherwise, feel free to assign it to me and I'll work on it for 5.3. Thanks

          Mark Miller added a comment -

          Fire away - fell off my radar - can't remember 2013

          Timothy Potter added a comment -

          Patch that performs the cleanup operation at the end of the SolrCore.initIndex method. The delete work is done in a background daemon thread (which should run quickly). Might be overkill, but I added a check on the livePaths known to the CachingDirectoryFactory before deleting a directory. Patch works with local FS and HDFS.
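
          The approach described in this comment can be illustrated with a short sketch. It is written in Python for brevity and is not the code from SOLR-4506.patch: the function name cleanup_old_index_dirs, its parameters, and the 17-digit directory pattern are assumptions made for illustration. The live_paths argument plays the role of the livePaths check mentioned above.

```python
import re
import shutil
import threading
from pathlib import Path

# Assumed naming pattern, taken from the directory names in this report.
INDEX_DIR_PATTERN = re.compile(r"^index\.\d{17}$")


def cleanup_old_index_dirs(data_dir, current_index_dir, live_paths=()):
    """Delete unused index.<digits> directories in a background daemon thread.

    Directories that are the current index or appear in live_paths are kept.
    Returns the started thread so that callers can join it if they need to.
    """
    keep = {Path(current_index_dir).resolve()}
    keep.update(Path(p).resolve() for p in live_paths)

    # Decide what to delete up front; only the deletion runs in the background.
    stale = [
        entry
        for entry in Path(data_dir).iterdir()
        if entry.is_dir()
        and INDEX_DIR_PATTERN.match(entry.name)
        and entry.resolve() not in keep
    ]

    def delete_stale():
        for path in stale:
            # Best effort: a failed delete should not break core startup.
            shutil.rmtree(path, ignore_errors=True)

    worker = threading.Thread(
        target=delete_stale, name="old-index-dir-cleanup", daemon=True
    )
    worker.start()
    return worker
```

          Running the deletion in a daemon thread keeps it from delaying startup or blocking shutdown, and the check against live paths is a safety net against removing a directory that is still in use.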

          Mark Miller added a comment -

          +1, LGTM.

          ASF subversion and git services added a comment -

          Commit 1683601 from Timothy Potter in branch 'dev/trunk'
          [ https://svn.apache.org/r1683601 ]

          SOLR-4506: Clean-up old (unused) index directories in the background after initializing a new index

          ASF subversion and git services added a comment -

          Commit 1683604 from Timothy Potter in branch 'dev/branches/branch_5x'
          [ https://svn.apache.org/r1683604 ]

          SOLR-4506: Clean-up old (unused) index directories in the background after initializing a new index

          Shalin Shekhar Mangar added a comment -

          Bulk close for 5.3.0 release

          Mark Miller added a comment -

          This appears to be triggering the following nightly fail: SOLR-8447


            People

            • Assignee: Timothy Potter
            • Reporter: zhuojunjian
            • Votes: 2
            • Watchers: 6

                Time Tracking

                • Original Estimate: 12h
                • Remaining Estimate: 12h
                • Time Spent: Not Specified