Solr
  1. Solr
  2. SOLR-6228

TestReplicationHandler.doTestIndexAndConfigReplication failure on jenkins

    Details

    • Type: Bug Bug
    • Status: Resolved
    • Priority: Major Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 4.10, 6.0
    • Component/s: SolrCloud, Tests
    • Labels:
      None

      Description

      I recently saw a failure on jenkins:

      http://jenkins.thetaphi.de/job/Lucene-Solr-4.x-Windows/4077/

      REGRESSION:  org.apache.solr.handler.TestReplicationHandler.doTestIndexAndConfigReplication
      
      Error Message:
      expected:<0> but was:<1404646041002>
      
      Stack Trace:
      java.lang.AssertionError: expected:<0> but was:<1404646041002>
              at __randomizedtesting.SeedInfo.seed([7B532BE2B6B38F19:6F1B70B795B43207]:0)
              at org.junit.Assert.fail(Assert.java:93)
              at org.junit.Assert.failNotEquals(Assert.java:647)
              at org.junit.Assert.assertEquals(Assert.java:128)
              at org.junit.Assert.assertEquals(Assert.java:147)
              at org.apache.solr.handler.TestReplicationHandler.assertVersions(TestReplicationHandler.java:987)
              at org.apache.solr.handler.TestReplicationHandler.doTestIndexAndConfigReplication(TestReplicationHandler.java:559)
      

        Activity

        Hide
        Shalin Shekhar Mangar added a comment -

        I think this is a timing issue in the test:

         [junit4]   2> 1250539 T5542 C2538 oasc.SolrCore.execute [collection1] webapp=/solr path=/replication params={_trace=getDetails&qt=/replication&wt=javabin&version=2&command=details} status=0 QTime=20 
           [junit4]   2> 1250539 T5550 oash.SnapPuller.fetchLatestIndex Configuration files are modified, core will be reloaded
           [junit4]   2> 1250539 T5550 oasu.DefaultSolrCoreState.openIndexWriter Creating new IndexWriter...
           [junit4]   2> 1250540 T5550 oasu.RandomMergePolicy.<init> RandomMergePolicy wrapping class org.apache.lucene.index.LogByteSizeMergePolicy: [LogByteSizeMergePolicy: minMergeSize=1677721, mergeFactor=20, maxMergeSize=2147483648, maxMergeSizeForForcedMerge=9223372036854775807, calibrateSizeByDeletes=false, maxMergeDocs=2147483647, maxCFSSegmentSizeMB=8.796093022207999E12, noCFSRatio=0.4184992878575281]
           [junit4]   2> 1250541 T5550 oasc.SolrDeletionPolicy.onInit SolrDeletionPolicy.onInit: commits: num=2
           [junit4]   2> 		commit{dir=MockDirectoryWrapper(RAMDirectory@3859e74f lockFactory=org.apache.lucene.store.SingleInstanceLockFactory@7ef8b356),segFN=segments_1,generation=1}
           [junit4]   2> 		commit{dir=MockDirectoryWrapper(RAMDirectory@3859e74f lockFactory=org.apache.lucene.store.SingleInstanceLockFactory@7ef8b356),segFN=segments_2,generation=2}
           [junit4]   2> 1250541 T5550 oasc.SolrDeletionPolicy.updateCommits newest commit generation = 2
           [junit4]   2> 1250542 T5550 oasu.DefaultSolrCoreState.openIndexWriter New IndexWriter is ready to be used.
           [junit4]   2> 1250542 T5553 oasc.SolrResourceLoader.<init> new SolrResourceLoader for directory: '.\solr.handler.TestReplicationHandler-7B532BE2B6B38F19-001\solr-instance-002\collection1\'
           [junit4]   2> 1250592 T5553 oasc.SolrConfig.<init> Using Lucene MatchVersion: LUCENE_4_10
           [junit4]   2> ASYNC  NEW_CORE C2539 name=collection1 org.apache.solr.core.SolrCore@6ecac580
           [junit4]   2> 1250622 T5528 C2539 oasc.SolrCore.execute [collection1] webapp=/solr path=/replication params={_trace=assertVersions&qt=/replication&wt=javabin&version=2&command=indexversion} status=0 QTime=0 
           [junit4]   2> 1250628 T5553 oasc.SolrConfig.<init> Loaded SolrConfig: solrconfig.xml
           [junit4]   2> 1250628 T5553 oass.IndexSchema.readSchema Reading Solr Schema from schema.xml
           [junit4]   2> 1250632 T5553 oass.IndexSchema.readSchema [collection1] Schema name=test
           [junit4]   2> ASYNC  NEW_CORE C2540 name=collection1 org.apache.solr.core.SolrCore@6efe46f0
           [junit4]   2> 1250682 T5544 C2540 oasc.SolrCore.execute [collection1] webapp=/solr path=/replication params={_trace=assertVersions&qt=/replication&wt=javabin&version=2&command=indexversion} status=0 QTime=0 
           [junit4]   2> 1250683 T5255 oas.SolrTestCaseJ4.tearDown ###Ending doTestIndexAndConfigReplication
        

        In this log:

        1. 1250539 - the getDetails call on the slave happens
        2. The slave core reloads
        3. 1250682 - the indexversion call on the slave happens

        The test should have failed earlier if assertVersions did not have:

        if (maxVersionClient1 > 0 && maxVersionClient2 > 0) {
              assertEquals(maxVersionClient1, maxVersionClient2);
            }
        

        I think a simple fix is to move the assertVersions call after the rQuery call on the slave to make sure that the replication is complete.

        I have committed this fix to trunk and branch_4x but it is not recorded automatically here because Jira was down for maintenance at that time.

        http://svn.apache.org/r1608601
        http://svn.apache.org/r1608602

        Show
        Shalin Shekhar Mangar added a comment - I think this is a timing issue in the test: [junit4] 2> 1250539 T5542 C2538 oasc.SolrCore.execute [collection1] webapp=/solr path=/replication params={_trace=getDetails&qt=/replication&wt=javabin&version=2&command=details} status=0 QTime=20 [junit4] 2> 1250539 T5550 oash.SnapPuller.fetchLatestIndex Configuration files are modified, core will be reloaded [junit4] 2> 1250539 T5550 oasu.DefaultSolrCoreState.openIndexWriter Creating new IndexWriter... [junit4] 2> 1250540 T5550 oasu.RandomMergePolicy.<init> RandomMergePolicy wrapping class org.apache.lucene.index.LogByteSizeMergePolicy: [LogByteSizeMergePolicy: minMergeSize=1677721, mergeFactor=20, maxMergeSize=2147483648, maxMergeSizeForForcedMerge=9223372036854775807, calibrateSizeByDeletes= false , maxMergeDocs=2147483647, maxCFSSegmentSizeMB=8.796093022207999E12, noCFSRatio=0.4184992878575281] [junit4] 2> 1250541 T5550 oasc.SolrDeletionPolicy.onInit SolrDeletionPolicy.onInit: commits: num=2 [junit4] 2> commit{dir=MockDirectoryWrapper(RAMDirectory@3859e74f lockFactory=org.apache.lucene.store.SingleInstanceLockFactory@7ef8b356),segFN=segments_1,generation=1} [junit4] 2> commit{dir=MockDirectoryWrapper(RAMDirectory@3859e74f lockFactory=org.apache.lucene.store.SingleInstanceLockFactory@7ef8b356),segFN=segments_2,generation=2} [junit4] 2> 1250541 T5550 oasc.SolrDeletionPolicy.updateCommits newest commit generation = 2 [junit4] 2> 1250542 T5550 oasu.DefaultSolrCoreState.openIndexWriter New IndexWriter is ready to be used. [junit4] 2> 1250542 T5553 oasc.SolrResourceLoader.<init> new SolrResourceLoader for directory: '.\solr.handler.TestReplicationHandler-7B532BE2B6B38F19-001\solr-instance-002\collection1\' [junit4] 2> 1250592 T5553 oasc.SolrConfig.<init> Using Lucene MatchVersion: LUCENE_4_10 [junit4] 2> ASYNC NEW_CORE C2539 name=collection1 org.apache.solr.core.SolrCore@6ecac580 [junit4] 2> 1250622 T5528 C2539 oasc.SolrCore.execute [collection1] webapp=/solr path=/replication params={_trace=assertVersions&qt=/replication&wt=javabin&version=2&command=indexversion} status=0 QTime=0 [junit4] 2> 1250628 T5553 oasc.SolrConfig.<init> Loaded SolrConfig: solrconfig.xml [junit4] 2> 1250628 T5553 oass.IndexSchema.readSchema Reading Solr Schema from schema.xml [junit4] 2> 1250632 T5553 oass.IndexSchema.readSchema [collection1] Schema name=test [junit4] 2> ASYNC NEW_CORE C2540 name=collection1 org.apache.solr.core.SolrCore@6efe46f0 [junit4] 2> 1250682 T5544 C2540 oasc.SolrCore.execute [collection1] webapp=/solr path=/replication params={_trace=assertVersions&qt=/replication&wt=javabin&version=2&command=indexversion} status=0 QTime=0 [junit4] 2> 1250683 T5255 oas.SolrTestCaseJ4.tearDown ###Ending doTestIndexAndConfigReplication In this log: 1250539 - the getDetails call on the slave happens The slave core reloads 1250682 - the indexversion call on the slave happens The test should have failed earlier if assertVersions did not have: if (maxVersionClient1 > 0 && maxVersionClient2 > 0) { assertEquals(maxVersionClient1, maxVersionClient2); } I think a simple fix is to move the assertVersions call after the rQuery call on the slave to make sure that the replication is complete. I have committed this fix to trunk and branch_4x but it is not recorded automatically here because Jira was down for maintenance at that time. http://svn.apache.org/r1608601 http://svn.apache.org/r1608602
        Hide
        ASF subversion and git services added a comment -

        Commit 1610448 from shalin@apache.org in branch 'dev/trunk'
        [ https://svn.apache.org/r1610448 ]

        SOLR-6228: Fixed bug in TestReplicationHandler.doTestIndexAndConfigReplication

        Show
        ASF subversion and git services added a comment - Commit 1610448 from shalin@apache.org in branch 'dev/trunk' [ https://svn.apache.org/r1610448 ] SOLR-6228 : Fixed bug in TestReplicationHandler.doTestIndexAndConfigReplication
        Hide
        ASF subversion and git services added a comment -

        Commit 1610450 from shalin@apache.org in branch 'dev/branches/branch_4x'
        [ https://svn.apache.org/r1610450 ]

        SOLR-6228: Fixed bug in TestReplicationHandler.doTestIndexAndConfigReplication

        Show
        ASF subversion and git services added a comment - Commit 1610450 from shalin@apache.org in branch 'dev/branches/branch_4x' [ https://svn.apache.org/r1610450 ] SOLR-6228 : Fixed bug in TestReplicationHandler.doTestIndexAndConfigReplication
        Hide
        Shalin Shekhar Mangar added a comment -

        I haven't seen this fail since my commit on 8th July. We can re-open it if necessary.

        Show
        Shalin Shekhar Mangar added a comment - I haven't seen this fail since my commit on 8th July. We can re-open it if necessary.

          People

          • Assignee:
            Shalin Shekhar Mangar
            Reporter:
            Shalin Shekhar Mangar
          • Votes:
            0 Vote for this issue
            Watchers:
            3 Start watching this issue

            Dates

            • Created:
              Updated:
              Resolved:

              Development