Description
Filing this bug based on an email to solr-user@lucene from Tom Chen (Fri, 18 Jul 2014)...
Steps to reproduce:
1) Set up Solr to run on HDFS like this:
java -Dsolr.directoryFactory=HdfsDirectoryFactory -Dsolr.lock.type=hdfs -Dsolr.hdfs.home=hdfs://host:port/path
For the purpose of this testing, turn off the default auto commit in solrconfig.xml, i.e. comment out autoCommit like this:
<!--
<autoCommit>
<maxTime>${solr.autoCommit.maxTime:15000}</maxTime>
<openSearcher>false</openSearcher>
</autoCommit>
-->
2) Add a document without commit:
{{curl "http://localhost:8983/solr/collection1/update?commit=false" -H "Content-type:text/xml; charset=utf-8" --data-binary "@solr.xml"}}
3) Solr generates empty tlog files (size 0; the last one ends with 6):
[hadoop@hdtest042 exampledocs]$ hadoop fs -ls /path/collection1/core_node1/data/tlog
Found 5 items
-rw-r--r-- 1 hadoop hadoop 667 2014-07-18 08:47 /path/collection1/core_node1/data/tlog/tlog.0000000000000000001
-rw-r--r-- 1 hadoop hadoop 67 2014-07-18 08:47 /path/collection1/core_node1/data/tlog/tlog.0000000000000000003
-rw-r--r-- 1 hadoop hadoop 667 2014-07-18 08:47 /path/collection1/core_node1/data/tlog/tlog.0000000000000000004
-rw-r--r-- 1 hadoop hadoop 0 2014-07-18 09:02 /path/collection1/core_node1/data/tlog/tlog.0000000000000000005
-rw-r--r-- 1 hadoop hadoop 0 2014-07-18 09:02 /path/collection1/core_node1/data/tlog/tlog.0000000000000000006
4) Simulate a Solr crash by killing the process with the -9 option.
5) Restart the Solr process. The uncommitted document is not replayed, and the
files in the tlog directory are cleaned up. Hence the uncommitted
document(s) are lost.
Am I missing something, or is this a bug?
BTW, additional observations:
a) If in step 4) Solr is stopped gracefully (i.e. without the -9 option), a
non-empty tlog file is generated, and after restarting Solr the uncommitted
document is replayed as expected.
b) If Solr doesn't run on HDFS (i.e. it runs on the local file system), this
issue is not observed either.
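The check in step 3 can be scripted when repeating the test. The following is only a minimal sketch, assuming the listing has the eight-column `hadoop fs -ls` layout shown above and that paths contain no spaces; the function name `empty_tlogs` is hypothetical and not part of Solr or Hadoop.

```python
def empty_tlogs(listing):
    """Return the paths of zero-byte tlog files found in `hadoop fs -ls` output."""
    empty = []
    for line in listing.splitlines():
        fields = line.split()
        # A file entry is assumed to have 8 columns:
        # perms, replication, owner, group, size, date, time, path.
        # Anything else (e.g. the "Found N items" header) is skipped.
        if len(fields) != 8 or not fields[4].isdigit():
            continue
        size, path = int(fields[4]), fields[7]
        if size == 0 and "/tlog/tlog." in path:
            empty.append(path)
    return empty


# Listing copied from step 3 of the report.
LISTING = """\
Found 5 items
-rw-r--r-- 1 hadoop hadoop 667 2014-07-18 08:47 /path/collection1/core_node1/data/tlog/tlog.0000000000000000001
-rw-r--r-- 1 hadoop hadoop 67 2014-07-18 08:47 /path/collection1/core_node1/data/tlog/tlog.0000000000000000003
-rw-r--r-- 1 hadoop hadoop 667 2014-07-18 08:47 /path/collection1/core_node1/data/tlog/tlog.0000000000000000004
-rw-r--r-- 1 hadoop hadoop 0 2014-07-18 09:02 /path/collection1/core_node1/data/tlog/tlog.0000000000000000005
-rw-r--r-- 1 hadoop hadoop 0 2014-07-18 09:02 /path/collection1/core_node1/data/tlog/tlog.0000000000000000006
"""

for path in empty_tlogs(LISTING):
    print(path)
```

Run against the listing from step 3, this reports the two files ending in 5 and 6. A zero size reported while the file is still open for writing may not mean no data was written, so this is a hint for the reproduction rather than proof of loss.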
Issue Links
- duplicates SOLR-6969 When opening an HDFSTransactionLog for append we must first attempt to recover it's lease to prevent data loss. (Closed)