SOLR-658

Allow Solr to load index from arbitrary directory in dataDir

    Details

    • Type: Improvement
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 1.4
    • Fix Version/s: 1.4
    • Component/s: None
    • Labels:
      None

      Description

      This is a requirement for Java-based Solr replication.

      Use case for an arbitrary index directory:
      If the slave has a corrupted index and the filesystem does not allow overwriting files that are in use (e.g. NTFS), replication will fail. The solution is to copy the index from the master to an alternate directory on the slave and open the IndexReader/IndexWriter from this alternate directory.

      1. SOLR-658-reopen-windows-fix.patch
        0.7 kB
        Shalin Shekhar Mangar
      2. SOLR-658.patch
        5 kB
        Shalin Shekhar Mangar
      3. SOLR-658.patch
        9 kB
        Akshay K. Ukey
      4. SOLR-658.patch
        9 kB
        Akshay K. Ukey
      5. SOLR-658.patch
        10 kB
        Shalin Shekhar Mangar
      6. SOLR-658.patch
        10 kB
        Shalin Shekhar Mangar
      7. SOLR-658.patch
        10 kB
        Shalin Shekhar Mangar

          Activity

          Noble Paul added a comment -

          Implementation

          • Keep a file named index.properties in the data dir.
          • Have an entry index=<new.index> in that file.
          • This file may also keep the version.
          • When a new IndexSearcher/IndexWriter is loaded, read this property and try to load the index from that folder.
          • If it is absent, default to the hardcoded value for the index and the latest commit point.
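
          The lookup described above could be sketched roughly as follows. This is only an illustration under stated assumptions, not the code from the patch: the class name IndexDirResolver and the method resolveIndexDir are hypothetical, and the fallback directory name "index" is taken from the default mentioned in this thread.

```java
import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.Properties;

// Hypothetical helper, not part of Solr: resolves the index directory
// from an optional index.properties file inside the data directory.
final class IndexDirResolver {

    static File resolveIndexDir(File dataDir) {
        // Assumed default: <dataDir>/index
        File defaultDir = new File(dataDir, "index");
        File propsFile = new File(dataDir, "index.properties");
        if (!propsFile.isFile()) {
            return defaultDir; // no properties file: use the default
        }

        Properties props = new Properties();
        try (InputStream in = new FileInputStream(propsFile)) {
            props.load(in);
        } catch (IOException e) {
            return defaultDir; // unreadable file: fall back to the default
        }

        // The "index" entry names a directory relative to dataDir.
        String name = props.getProperty("index");
        if (name != null && name.trim().length() > 0) {
            File candidate = new File(dataDir, name.trim());
            if (candidate.isDirectory()) {
                return candidate;
            }
        }
        return defaultDir; // entry missing, empty, or not a directory
    }
}
```

          With a data directory containing index.properties with the line index=index.20090101 and a matching subdirectory, resolveIndexDir would return that subdirectory; in every other case it returns the default.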
          Shalin Shekhar Mangar added a comment -

          This patch, cut out of the SOLR-561 patch, supports loading the index from an arbitrary directory.

          Changes

          1. A new method SolrCore#getNewIndexDir() is introduced, which tries to read the latest indexDir from the index.properties file. If that file is not present, the default value (dataDir + "index/") is used.
          2. SolrIndexSearcher now stores the path (indexDir) on which it is opened and has a getter for it.
          3. When SolrCore#getIndexDir() is called, it returns the current searcher's index directory; if there is no current searcher, the default value is returned.
          4. SolrIndexSearcher is always created with getNewIndexDir(), and UpdateHandler also uses getNewIndexDir() to open IndexWriter instances.

          TODO:

          • Add a test
          • Add feature for loading arbitrary commit point.
          Akshay K. Ukey added a comment -

          Patch in sync with trunk and with a test case (loading arbitrary commit point feature not supported in this patch).

          Akshay K. Ukey added a comment -

          Patch in sync with the trunk.

          Shalin Shekhar Mangar added a comment -

          Thanks Akshay.

          Updated patch which calls getNewIndexDir before calling IndexReader#reopen so that if the new index directory is different from the old index directory, we always create a new SolrIndexSearcher with the new index directory.

          I'd like to commit this in the next two or three days if there are no objections.
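
          The decision described in this comment might look something like the sketch below. It is an illustration only: SearcherOpenPolicy, Action and decide are hypothetical names, and the File-based comparison is simply one way to express "same directory" (how best to compare the two paths is discussed further down the thread).

```java
import java.io.File;

// Hypothetical sketch, not Solr code: choose between reopening the
// existing reader and opening a brand-new searcher.
final class SearcherOpenPolicy {

    enum Action { REOPEN_EXISTING, OPEN_NEW }

    static Action decide(String currentIndexDir, String newIndexDir) {
        // Same directory: the existing reader can simply be reopened.
        if (new File(currentIndexDir).equals(new File(newIndexDir))) {
            return Action.REOPEN_EXISTING;
        }
        // Different directory: a new searcher must be created on it.
        return Action.OPEN_NEW;
    }
}
```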

          Shalin Shekhar Mangar added a comment -

          Updated with a bug fix:

          if (result != null && result.trim().length() > 0) {
            File tmp = new File(dataDir + s);
            if (tmp.exists() && tmp.isDirectory())
              result = dataDir + s;
          }
          

          should be:

          if (s != null && s.trim().length() > 0) {
            File tmp = new File(dataDir + s);
            if (tmp.exists() && tmp.isDirectory())
              result = dataDir + s;
          }
          

          I'll commit shortly.

          Shalin Shekhar Mangar added a comment -

          Instead of comparing path strings, we should compare the corresponding File objects to handle relative and absolute paths correctly.

          Patch to cover the above case.

          Shalin Shekhar Mangar added a comment -

          Removing references to rollbacks and commit points, which are being handled in SOLR-670.

          Shalin Shekhar Mangar added a comment -

          Committed revision 703981.

          Thanks Noble and Akshay!

          Yonik Seeley added a comment -

          This causes reopen() to never be used on Windows because the following condition comes up false:

               if(new File(getIndexDir()).equals(new File(newIndexDir)))  {
          
          Shalin Shekhar Mangar added a comment -

          Copying over from the solr-dev thread on failing tests:

          The first problem is that File.equals compares only the path as given, not the absolute path. A workaround is to compare absolute paths ourselves. But a bigger problem is with the canonical paths, where long directory names are uppercased and shortened into 8-character names (e.g. "C:\Documents and Settings" becomes "C:\DOCUME~1").

          The test fails because we use java.io.tmpdir, which defaults to the user's home directory (shortened and canonicalized) on Windows, and the comparison on this path fails. What I'm not able to figure out yet is why the slave Jetty, running on this canonical path, returns the full path of the index directory.

          Slave's SolrCore.getIndexDir gives:
          C:\Documents and Settings\shalinsmangar\Local Settings\Temp\org.apache.solr.handler.TestReplicationHandler$SolrInstance-1233681533000master\data\index

          The value written by TestReplicationHandler is:
          C:\DOCUME~1\SHALIN~1\LOCALS~1\Temp\org.apache.solr.handler.TestReplicationHandler$SolrInstance-1233681533000master\data\index
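
          The first problem above can be reproduced with a small sketch like this one. The class and method names are hypothetical; only java.io.File calls from the standard library are used.

```java
import java.io.File;

// Hypothetical demo, not Solr code: contrasts File.equals on the paths
// as given with a comparison of their absolute forms.
final class FileEqualsDemo {

    // Compares the abstract pathnames exactly as given.
    static boolean samePathname(String a, String b) {
        return new File(a).equals(new File(b));
    }

    // Resolves both paths against the working directory first.
    static boolean sameAbsolute(String a, String b) {
        return new File(a).getAbsoluteFile()
                .equals(new File(b).getAbsoluteFile());
    }
}
```

          Given a relative path and the absolute form of the same path, samePathname returns false while sameAbsolute returns true. Note that getAbsoluteFile does not resolve "." or ".." segments or Windows short names, so it would not be expected to help with the second problem.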

          Shalin Shekhar Mangar added a comment -

          I should read javadocs more. This patch compares the index directories using their canonical paths. This fixes the problem on Windows.
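
          A canonical-path comparison along these lines could be written as below. This is a sketch under assumptions, not the committed change: IndexDirComparator and sameDirectory are hypothetical names.

```java
import java.io.File;
import java.io.IOException;

// Hypothetical helper, not Solr code: treats two paths as the same
// directory when their canonical forms are equal.
final class IndexDirComparator {

    static boolean sameDirectory(String a, String b) throws IOException {
        // getCanonicalFile makes the path absolute and resolves
        // "." and ".." segments in a system-dependent way.
        File canonicalA = new File(a).getCanonicalFile();
        File canonicalB = new File(b).getCanonicalFile();
        return canonicalA.equals(canonicalB);
    }
}
```

          Because canonicalization touches the filesystem, it can throw IOException, so a caller would need to decide how to treat that case (for example, by opening a new searcher).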

          Shalin Shekhar Mangar added a comment -

          Committed revision 759641.

          Grant Ingersoll added a comment -

          Bulk close for Solr 1.4


            People

            • Assignee:
              Shalin Shekhar Mangar
              Reporter:
              Noble Paul
            • Votes:
              1
              Watchers:
              2