Steps to reproduce this issue are simple: just issue a commit on master without posting any documents.
Ah... ok, wait a minute - this looks promising. in reviewing the logs you sent to solr-user@lucene i didn't notice that this "empty commit" was happening, but in attempting to reproduce i think i see what you're talking about.
Steps i followed...
1) took the Solr 4.2 example/solr dir and cloned it as "master-home"
2) edited the /replication handler to match what you posted in this issue (but adjusted the replication time to 30 seconds for a faster test)
3) cloned "master-home" as "slave-home"
4) ran two instances of Solr 4.2 using the following commands...
java -Denable.master=true -Dmaster.port=8999 -Djetty.port=8999 -Dsolr.solr.home=/home/hossman/tmp/slave_version_higher/master-home -jar start.jar &> /home/hossman/tmp/slave_version_higher/master.log
java -Denable.slave=true -Dmaster.port=8999 -Djetty.port=9999 -Dsolr.solr.home=/home/hossman/tmp/slave_version_higher/slave-home -jar start.jar &> /home/hossman/tmp/slave_version_higher/slave.log
5) ran two scripts to monitor replication details using the following commands...
while true; do date --utc && curl -sS "http://localhost:9999/solr/collection1/replication?command=details&indent=true&wt=json" && echo && echo && sleep 2; done &> slave_rep_details.txt
while true; do date --utc && curl -sS "http://localhost:8999/solr/collection1/replication?command=details&indent=true&wt=json" && echo && echo && sleep 2; done &> master_rep_details.txt
6) triggered an indexing of all the example docs on master, and waited for replication.
java -Durl=http://localhost:8999/solr/collection1/update -jar post.jar *.xml
7) triggered an explicit commit on master...
java -Durl=http://localhost:8999/solr/collection1/update -jar post.jar -
8) shut down both servers and the scripts (Ctrl-C)
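For reading the captured details output from step 5, a small helper can pull out the generation of the commit currently in use. This is only a sketch: the name first_generation is made up here, the sample JSON is a hypothetical, heavily trimmed response rather than real captured output, and it assumes the in-use generation is the first "generation":N key in the response.

```shell
# Print the first "generation":N value found on stdin.
# Assumption: the in-use commit's generation is the first such key in the JSON.
first_generation() {
  grep -o '"generation":[ ]*[0-9][0-9]*' | head -n 1 | grep -o '[0-9][0-9]*$'
}

# Hypothetical, trimmed sample response (structure is an assumption).
sample='{"details":{"indexVersion":1364927050819,"generation":2,"commits":[["indexVersion",1364927114002,"generation",3]]}}'

printf '%s\n' "$sample" | first_generation
```

Against a live server the same helper could be fed from curl, e.g. curl -sS "http://localhost:8999/solr/collection1/replication?command=details&wt=json" | first_generation.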
I've attached the full logs and home dirs at the completion of this test, but as a summary of the results...
a) slave & master index files are identical except for segments.gen
b) the master's replication details indicate that the current commit being used is "indexVersion#1364927050819, generation#2" but its list of commits does not include this, it contains a single commit of "indexVersion#1364927114002, generation#3"
c) the slave's replication details indicate that the current commit being used is "indexVersion#1364927050819, generation#2" and that this is the only commit it has locally. The slave's information about the master is consistent with what the master itself reports.
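One way to check a result like (a) is a recursive diff of the two index directories. A minimal sketch, assuming the index lives under collection1/data/index in each home dir (the default layout of the example, not something confirmed above); the function name index_diff is made up for illustration.

```shell
# Report which files differ between two index directories (names only).
# diff exits non-zero when anything differs, so callers may want "|| true".
index_diff() {
  diff -rq "$1" "$2"
}

# Example call, with assumed paths:
#   index_diff master-home/collection1/data/index slave-home/collection1/data/index
```

Files present in both dirs with different contents are reported as "differ", and files present on only one side are reported as "Only in ...".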
I'm not certain, but I believe this is just an optimization where the searcher is not re-opened when the currently opened "commit" is identical to the new commit – this optimization is working on the master, but apparently not on the slave (maybe the slave can't tell that the commits are identical?)
FWIW: after running this test, i restarted the master and its replication details were consistent with the list of commit points – it was using generation #3.
You can also observe the exact same behavior from master's replication details (current generation lower than the generation of any commit point) if you do a hard commit with openSearcher=false.
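A hard commit with openSearcher=false can be sent as request parameters on the update handler. The sketch below only builds the URL; the host, port and core name are taken from the test setup earlier in this comment, and commit_url is a made-up helper name.

```shell
# Build the update URL for a hard commit; $1 = port, $2 = openSearcher value.
commit_url() {
  printf 'http://localhost:%s/solr/collection1/update?commit=true&openSearcher=%s\n' "$1" "$2"
}

# Against the master from the test above (port 8999) this would be sent with:
#   curl -sS "$(commit_url 8999 false)"
commit_url 8999 false
```

Checking the master's replication details right after sending it should show whether the in-use generation lags behind the newest commit point.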
I think most of the behavior here makes sense – the slave is replicating the commits from the master, even if the master isn't using them yet because it hasn't opened a new searcher. The key questions i wonder about:
1) why was segments.gen different when i ran my experiment? is that normal?
2) Assuming i'm correct about there being an optimization to not open a new searcher if the commits are identical, can we make this same optimization work on slaves in the case of replication?