SOLR-13908

Possible bugs when using HdfsDirectoryFactory w/ softCommit=true + openSearcher=true

Details

    • Type: Bug
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: hdfs
    • Labels: None

    Description

      While working on SOLR-13872 something caught my eye that seems fishy....

      Background:

      SOLR-4916 introduced the API DirectoryFactory.searchersReserveCommitPoints(), a method that SolrIndexSearcher uses to decide whether it needs to explicitly save/release the IndexCommit point of its DirectoryReader with the IndexDeletionPolicyWrapper, for use on filesystems that don't in some way "protect" open files...

      SolrIndexSearcher
          if (directoryFactory.searchersReserveCommitPoints()) {
            // reserve commit point for life of searcher
            core.getDeletionPolicy().saveCommitPoint(reader.getIndexCommit().getGeneration());
          }
      
      DirectoryFactory
        /**
         * If your implementation can count on delete-on-last-close semantics
         * or throws an exception when trying to remove a file in use, return
         * false (eg NFS). Otherwise, return true. Defaults to returning false.
         * 
         * @return true if factory impl requires that Searcher's explicitly
         * reserve commit points.
         */
        public boolean searchersReserveCommitPoints() {
          return false;
        }
      

      HdfsDirectoryFactory is (still) the only DirectoryFactory Impl that returns true.


      Concern:

      As noted in LUCENE-9040, the behavior of DirectoryReader.getIndexCommit() is a little weird / underspecified when dealing with an "NRT" IndexReader (opened directly off of an IndexWriter using "un-committed" changes) ... which is exactly what SolrIndexSearcher is using in Solr setups that use softCommit=true&openSearcher=true.

      In particular, the IndexCommit.getGeneration() value that will be used when SolrIndexSearcher executes core.getDeletionPolicy().saveCommitPoint(reader.getIndexCommit().getGeneration()); will be (as of the current code) the generation of the last hard commit. That means new segment/data files written since the last hard commit will not be protected from deletion if additional commits/merges happen on the index during the life of the SolrIndexSearcher, either via concurrent rapid commits or via commit=true&softCommit=false&openSearcher=false.
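
      To make the suspected failure mode concrete, here is a toy model in plain Java. It is not Lucene or Solr code: ToyDeletionPolicy, the file names, and the generation numbers are all hypothetical, and it assumes files are protected only through saved commit generations, ignoring any other reference counting the real IndexWriter may do. It only shows how reserving the last hard commit's generation could leave an NRT searcher's newer files uncovered.

```java
import java.util.Collections;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

/** Toy model of the concern; NOT the real Lucene/Solr classes. */
public class CommitPointSketch {

    /** Hypothetical stand-in for a deletion policy that tracks reserved generations. */
    static final class ToyDeletionPolicy {
        private final Map<Long, Set<String>> commits = new TreeMap<>(); // generation -> files
        private final Set<Long> saved = new HashSet<>();

        void recordCommit(long generation, Set<String> files) {
            commits.put(generation, new TreeSet<>(files));
        }

        void saveCommitPoint(long generation) {
            saved.add(generation);
        }

        /** Files that survive cleanup: those in the newest commit or in any saved commit. */
        Set<String> protectedFiles() {
            long newest = Collections.max(commits.keySet());
            Set<String> keep = new TreeSet<>();
            for (Map.Entry<Long, Set<String>> e : commits.entrySet()) {
                if (e.getKey() == newest || saved.contains(e.getKey())) {
                    keep.addAll(e.getValue());
                }
            }
            return keep;
        }
    }

    /** Returns the files an NRT searcher still needs that no saved commit protects. */
    static Set<String> unprotectedSearcherFiles() {
        ToyDeletionPolicy policy = new ToyDeletionPolicy();

        // Hard commit, generation 1, containing one segment.
        policy.recordCommit(1L, Set.of("_0.cfs"));

        // Soft commit: the NRT searcher also sees the uncommitted segment _1,
        // but (per the concern above) reports the last hard commit's generation.
        Set<String> searcherFiles = Set.of("_0.cfs", "_1.cfs");
        long reportedGeneration = 1L;
        policy.saveCommitPoint(reportedGeneration);

        // Later: a merge rewrites _0 + _1 into _2, followed by a hard commit (generation 2).
        policy.recordCommit(2L, Set.of("_2.cfs"));

        Set<String> missing = new TreeSet<>(searcherFiles);
        missing.removeAll(policy.protectedFiles());
        return missing;
    }

    public static void main(String[] args) {
        System.out.println("Unprotected files still in use: " + unprotectedSearcherFiles());
    }
}
```

      In this model the searcher's reservation covers _0.cfs (part of generation 1) but not _1.cfs, which never belonged to any saved commit. Whether the real code behaves this way on HDFS is exactly what remains to be verified.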

      I have not investigated this in depth, but I believe there is a risk here of unpredictable bugs when using HDFS in conjunction with softCommit=true&openSearcher=true.

            People

              Assignee: Unassigned
              Reporter: Chris M. Hostetter (hossman)