[SOLR-6640] Replication can cause index corruption. - ASF JIRA

Details

Type: Bug
Status: Closed
Priority: Blocker
Resolution: Fixed
Affects Version/s: 5.0
Fix Version/s: 5.0, 6.0
Component/s: replication (java)
Labels: None

Description

Test failure found on Jenkins:
http://jenkins.thetaphi.de/job/Lucene-Solr-5.x-Linux/11333/

1 tests failed.
REGRESSION:  org.apache.solr.cloud.ChaosMonkeySafeLeaderTest.testDistribSearch

Error Message:
shard2 is not consistent.  Got 62 from http://127.0.0.1:57436/collection1lastClient and got 24 from http://127.0.0.1:53065/collection1

Stack Trace:
java.lang.AssertionError: shard2 is not consistent.  Got 62 from http://127.0.0.1:57436/collection1lastClient and got 24 from http://127.0.0.1:53065/collection1
        at __randomizedtesting.SeedInfo.seed([F4B371D421E391CD:7555FFCC56BCF1F1]:0)
        at org.junit.Assert.fail(Assert.java:93)
        at org.apache.solr.cloud.AbstractFullDistribZkTestBase.checkShardConsistency(AbstractFullDistribZkTestBase.java:1255)
        at org.apache.solr.cloud.AbstractFullDistribZkTestBase.checkShardConsistency(AbstractFullDistribZkTestBase.java:1234)
        at org.apache.solr.cloud.ChaosMonkeySafeLeaderTest.doTest(ChaosMonkeySafeLeaderTest.java:162)
        at org.apache.solr.BaseDistributedSearchTestCase.testDistribSearch(BaseDistributedSearchTestCase.java:869)

Cause of inconsistency is:

Caused by: org.apache.lucene.index.CorruptIndexException: file mismatch, expected segment id=yhq3vokoe1den2av9jbd3yp8, got=yhq3vokoe1den2av9jbd3yp7 (resource=BufferedChecksumIndexInput(MMapIndexInput(path="/mnt/ssd/jenkins/workspace/Lucene-Solr-5.x-Linux/solr/build/solr-core/test/J0/temp/solr.cloud.ChaosMonkeySafeLeaderTest-F4B371D421E391CD-001/tempDir-001/jetty3/index/_1_2.liv")))
   [junit4]   2> 		at org.apache.lucene.codecs.CodecUtil.checkSegmentHeader(CodecUtil.java:259)
   [junit4]   2> 		at org.apache.lucene.codecs.lucene50.Lucene50LiveDocsFormat.readLiveDocs(Lucene50LiveDocsFormat.java:88)
   [junit4]   2> 		at org.apache.lucene.codecs.asserting.AssertingLiveDocsFormat.readLiveDocs(AssertingLiveDocsFormat.java:64)
   [junit4]   2> 		at org.apache.lucene.index.SegmentReader.<init>(SegmentReader.java:102)
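The exception above reports a file with the expected name (`_1_2.liv`) whose header carries a different segment id than expected. The linked issue SOLR-6920 proposes verifying files by checksum during replication. A minimal sketch of why that helps, assuming a hypothetical check that treats two files as identical when only name and length match: `ReplicaFileCheck`, `FileMeta`, and both comparison methods are illustrative names, not Solr or Lucene classes, and the byte strings are stand-ins for file contents, not the real `.liv` format.

```java
import java.nio.charset.StandardCharsets;
import java.util.zip.CRC32;

public class ReplicaFileCheck {

    /** Hypothetical file descriptor (name, length, CRC32 of content); not a Solr class. */
    static final class FileMeta {
        final String name;
        final long length;
        final long checksum;

        FileMeta(String name, byte[] content) {
            this.name = name;
            this.length = content.length;
            CRC32 crc = new CRC32();
            crc.update(content, 0, content.length);
            this.checksum = crc.getValue();
        }
    }

    /** Assumed weak check: same file name and same length. */
    static boolean sameByNameAndLength(FileMeta a, FileMeta b) {
        return a.name.equals(b.name) && a.length == b.length;
    }

    /** Stronger check: name, length, and content checksum must all agree. */
    static boolean sameByChecksum(FileMeta a, FileMeta b) {
        return sameByNameAndLength(a, b) && a.checksum == b.checksum;
    }

    public static void main(String[] args) {
        // Stand-in contents that differ only in the last character of the segment id.
        byte[] leaderBytes =
            "segment id=yhq3vokoe1den2av9jbd3yp8".getBytes(StandardCharsets.UTF_8);
        byte[] replicaBytes =
            "segment id=yhq3vokoe1den2av9jbd3yp7".getBytes(StandardCharsets.UTF_8);

        FileMeta leader = new FileMeta("_1_2.liv", leaderBytes);
        FileMeta replica = new FileMeta("_1_2.liv", replicaBytes);

        System.out.println("name+length match: " + sameByNameAndLength(leader, replica));
        System.out.println("checksum match:    " + sameByChecksum(leader, replica));
    }
}
```

In this sketch the weak check accepts the replica's file as up to date, while the checksum check rejects it, so the file would be fetched again from the leader instead of being reused.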

Attachments

- corruptindex.log (25/Jan/15 20:13, 4.47 MB, Mark Miller)
- Lucene-Solr-5.x-Linux-64bit-jdk1.8.0_20-Build-11333.txt (21/Oct/14 21:07, 889 kB, Shalin Shekhar Mangar)
- SOLR-6640_new_index_dir.patch (30/Dec/14 17:44, 12 kB, Varun Thacker)
- SOLR-6640.patch (20/Jan/15 15:36, 8 kB, Shalin Shekhar Mangar)
- SOLR-6640.patch (05/Jan/15 19:35, 4 kB, Shalin Shekhar Mangar)
- SOLR-6640.patch (16/Dec/14 16:32, 3 kB, Varun Thacker)
- SOLR-6640.patch (08/Dec/14 18:20, 36 kB, Varun Thacker)
- SOLR-6640-test.patch (20/Jan/15 11:07, 4 kB, Shalin Shekhar Mangar)
- SOLR-6920.patch (06/Feb/15 19:58, 16 kB, Varun Thacker)

Issue Links

depends upon: SOLR-6920 During replication use checksums to verify if files are the same (Closed)
is duplicated by: SOLR-6942 DistribDocExpirationUpdateProcessorTest failure. (Closed)
is related to: SOLR-7134 Replication can still cause index corruption. (Closed)
relates to: SOLR-7093 Cleanup comment and move hardcoded value of file size to replicate to a final (Open)

People

Assignee: Mark Miller
Reporter: Shalin Shekhar Mangar
Votes: 0
Watchers: 10

Dates

Created: 21/Oct/14 21:06
Updated: 02/Oct/19 17:24
Resolved: 09/Feb/15 18:49