[SOLR-830] snappuller picks bad snapshot name - ASF JIRA

Details

Type: Bug
Status: Closed
Priority: Major
Resolution: Fixed
Affects Version/s: 1.2, 1.3
Fix Version/s: 1.4
Component/s: replication (scripts)
Labels: None

Description

As mentioned on the mailing list:

http://www.nabble.com/FileNotFoundException-on-slave-after-replication---script-bug--to20111313.html#a20111313

We're seeing strange behavior on one of our slave nodes after replication. 
When the new searcher is created we see FileNotFoundExceptions in the log
and the index is strangely invalid/corrupted.

We may have identified the root cause but wanted to run it by the community. 
We figure there is a bug in the snappuller shell script, line 181:

snap_name=`ssh -o StrictHostKeyChecking=no ${master_host} "ls ${master_data_dir}|grep 'snapshot\.'|grep -v wip|sort -r|head -1"`

This line determines the directory name of the latest snapshot to download
to the slave from the master.  The problem is that it can grab the
temporary work directory of a snapshot in progress.  Those temporary
directories are prefixed with "temp" and, as far as I can tell, should never
get pulled from the master, so it's easy to disambiguate.  It seems that this
temp directory, if it exists, will be the newest one, so if present it will be
the one replicated: FAIL.
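
To illustrate the failure mode described above, here is a minimal sketch that runs the same pipeline against a made-up directory listing. It is not taken from the attached patch. The directory names, including the "temp-snapshot." form of the temporary prefix, and the latest_snapshot helper are assumptions made for the demonstration. Because the grep pattern is unanchored it also matches names that merely contain "snapshot.", and a reverse lexical sort puts names starting with "t" ahead of names starting with "s". Anchoring the pattern at the start of the name is one possible way to exclude the temporary directory.

```shell
#!/bin/sh
# Hypothetical helper (not from snappuller): reads directory names on stdin
# and prints the newest completed snapshot.
latest_snapshot() {
    # '^' anchors the match at the start of the name, so a name such as
    # "temp-snapshot.*" no longer matches; "wip" names are still excluded.
    grep '^snapshot\.' | grep -v wip | LC_ALL=C sort -r | head -1
}

# Made-up listing of a master data directory (names are assumptions).
listing='snapshot.20081028120000
snapshot.20081029120000
temp-snapshot.20081029130000'

# Pipeline as quoted in the report: the unanchored pattern lets the
# temporary directory through, and it sorts first in reverse order.
buggy=$(printf '%s\n' "$listing" | grep 'snapshot\.' | grep -v wip | LC_ALL=C sort -r | head -1)

# Same listing through the anchored variant.
fixed=$(printf '%s\n' "$listing" | latest_snapshot)

echo "unanchored: $buggy"
echo "anchored:   $fixed"
```

With this listing, the unanchored pipeline selects the temporary directory, while the anchored one selects the newest completed snapshot.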

Attachments

solr-830.patch
17/Nov/08 19:13
0.6 kB
William Au

People

Assignee: William Au

Reporter: Chris M. Hostetter

Votes: 1

Watchers: 1

Dates

Created: 29/Oct/08 15:12

Updated: 10/Nov/09 15:51

Resolved: 20/Nov/08 14:14