Solr
  1. Solr
  2. SOLR-830

snappuller picks bad snapshot name

    Details

    • Type: Bug Bug
    • Status: Closed
    • Priority: Major Major
    • Resolution: Fixed
    • Affects Version/s: 1.2, 1.3
    • Fix Version/s: 1.4
    • Component/s: replication (scripts)
    • Labels:
      None

      Description

      as mentioned on the mailing list...

      http://www.nabble.com/FileNotFoundException-on-slave-after-replication---script-bug--to20111313.html#a20111313

      We're seeing strange behavior on one of our slave nodes after replication. 
      When the new searcher is created we see FileNotFoundExceptions in the log
      and the index is strangely invalid/corrupted.
      
      We may have identified the root cause but wanted to run it by the community. 
      We figure there is a bug in the snappuller shell script, line 181:
      
      snap_name=`ssh -o StrictHostKeyChecking=no ${master_host} "ls
      ${master_data_dir}|grep 'snapshot\.'|grep -v wip|sort -r|head -1"` 
      
      This line determines the directory name of the latest snapshot to download
      to the slave from the master.  Problem with this line is that it grab the
      temporary work directory of a snapshot in progress.  Those temporary
      directories are prefixed with  "temp" and as far as I can tell should never
      get pulled from the master so its easy to disambiguate.  It seems that this
      temp directory, if it exists will be the newest one so if present it will be
      the one replicated: FAIL.
      

        Activity

        Hide
        Hoss Man added a comment -

        while i haven't noticed this bug myself, and i'm not very "shell script smart", it does seem like it would be a problem ... prior to r529471 snappuller used a find command that required the filenames start with "snapshot." where as the current grep just requires that it contain "snapshot."

        the original poster had the following suggested fix...

        We've tweaked the line to exclude any directories starting with "temp":
        snap_name=`ssh -o StrictHostKeyChecking=no ${master_host} "ls
        ${master_data_dir}|grep 'snapshot\.'|grep -v wip|grep -v temp|sort -r|head
        -1"` 
        

        ...my first reaction would be to change the grep to use an anchored regex, but i'm not sure how portable that is - so perhaps the "grep -v temp" is the way to go. I'll leave it to people who understand shell scripts better.

        Show
        Hoss Man added a comment - while i haven't noticed this bug myself, and i'm not very "shell script smart", it does seem like it would be a problem ... prior to r529471 snappuller used a find command that required the filenames start with "snapshot." where as the current grep just requires that it contain "snapshot." the original poster had the following suggested fix... We've tweaked the line to exclude any directories starting with "temp": snap_name=`ssh -o StrictHostKeyChecking=no ${master_host} "ls ${master_data_dir}|grep 'snapshot\.'|grep -v wip|grep -v temp|sort -r|head -1"` ...my first reaction would be to change the grep to use an anchored regex, but i'm not sure how portable that is - so perhaps the "grep -v temp" is the way to go. I'll leave it to people who understand shell scripts better.
        Hide
        Bill Au added a comment -

        I think we should use a regular expression and match against the naming convention of the snapshot (snapshot.YYYYmmddHHMMss). Here is what I propose:

        snap_name=`ssh -o StrictHostKeyChecking=no $

        {master_host}

        "ls
        $

        {master_data_dir}

        |egrep '^snapshot\.[123456789][0-9]

        {13}

        $'|sort -r|head
        -1"`

        Show
        Bill Au added a comment - I think we should use a regular expression and match against the naming convention of the snapshot (snapshot.YYYYmmddHHMMss). Here is what I propose: snap_name=`ssh -o StrictHostKeyChecking=no $ {master_host} "ls $ {master_data_dir} |egrep '^snapshot\. [123456789] [0-9] {13} $'|sort -r|head -1"`
        Hide
        Hoss Man added a comment -

        that's kind of what i thinking, but i wasn't sure how portable egrep is.

        Show
        Hoss Man added a comment - that's kind of what i thinking, but i wasn't sure how portable egrep is.
        Hide
        Steve Rowe added a comment -

        From the egrep(1) man page:

        Traditional egrep did not support the { metacharacter, and some egrep
        implementations  support \{ instead, so portable scripts should avoid {
        in egrep patterns and should use [{] to match a literal {.
        

        Perl is pretty portable, and has a stable regex language - how about (untested):

        snap_name=`ssh -o StrictHostKeyChecking=no ${master_host} \
        "perl -e 'chdir q/${master_data_dir}/; print ((sort grep {/^snapshot[.][1-9][0-9]{13}\$/} <*>)[-1])'"`
        
        Show
        Steve Rowe added a comment - From the egrep(1) man page: Traditional egrep did not support the { metacharacter, and some egrep implementations support \{ instead, so portable scripts should avoid { in egrep patterns and should use [{] to match a literal {. Perl is pretty portable, and has a stable regex language - how about (untested): snap_name=`ssh -o StrictHostKeyChecking=no ${master_host} \ "perl -e 'chdir q/${master_data_dir}/; print ((sort grep {/^snapshot[.][1-9][0-9]{13}\$/} <*>)[-1])'" `
        Hide
        Bill Au added a comment -

        Steve, thanks for the perl code. I need to get rid of the "\" before the "$" in order to get it to work for me:

        perl -e 'chdir q/$

        {master_data_dir}

        /; print ((sort grep {/^snapshot[.][1-9][0-9]

        {13}

        $/} <*>)[-1])'

        I have tested this on Linux and FreeBSD. I will test on Mac OS X tonight. It will be good if someone can do a quick test on Solaris. You really don't need a full brown Solr installation to test it. Just create some dummy directory with various names like:

        snapshot.00080527124131
        snapshot.20080527124131
        snapshot.20080527124131-wip
        snapshot.20080527140518
        snapshot.20080527140610
        snapshot.20081113113700
        snapshot.2080527124131
        temp-snapshot.20080527124131

        and then run the perl command to make sure the right one is returned. With the data set above, you should get:

        snapshot.20081113113700

        Show
        Bill Au added a comment - Steve, thanks for the perl code. I need to get rid of the "\" before the "$" in order to get it to work for me: perl -e 'chdir q/$ {master_data_dir} /; print ((sort grep {/^snapshot [.] [1-9] [0-9] {13} $/} <*>) [-1] )' I have tested this on Linux and FreeBSD. I will test on Mac OS X tonight. It will be good if someone can do a quick test on Solaris. You really don't need a full brown Solr installation to test it. Just create some dummy directory with various names like: snapshot.00080527124131 snapshot.20080527124131 snapshot.20080527124131-wip snapshot.20080527140518 snapshot.20080527140610 snapshot.20081113113700 snapshot.2080527124131 temp-snapshot.20080527124131 and then run the perl command to make sure the right one is returned. With the data set above, you should get: snapshot.20081113113700
        Hide
        Steve Rowe added a comment -

        Bill, I ran the following script on a Linux box against a remote Solaris 2.8 box, with Perl 5.8.0 installed, on which I created the set of empty directories you listed:

        master_host=remotesolaris
        master_data_dir=/tmp/solrtest
        snap_name=`ssh -o StrictHostKeyChecking=no ${master_host} \
        "perl -e 'chdir q|${master_data_dir}|; print ((sort grep {/^snapshot[.][1-9][0-9]{13}$/} <*>)[-1])'"`
        echo ${snap_name}
        

        N.B.: I had to change the quoting boundary character around $

        {master_data_dir}, from '/' to '|', since ${master_data_dir}

        contains '/' characters – I think '|' is really unlikely to be a pathname component.

        It worked for me both with and without the '\' before the '$' – I get the following output from the script:

        snapshot.20081113113700

        Show
        Steve Rowe added a comment - Bill, I ran the following script on a Linux box against a remote Solaris 2.8 box, with Perl 5.8.0 installed, on which I created the set of empty directories you listed: master_host=remotesolaris master_data_dir=/tmp/solrtest snap_name=`ssh -o StrictHostKeyChecking=no ${master_host} \ "perl -e 'chdir q|${master_data_dir}|; print ((sort grep {/^snapshot[.][1-9][0-9]{13}$/} <*>)[-1])'" ` echo ${snap_name} N.B.: I had to change the quoting boundary character around $ {master_data_dir}, from '/' to '|', since ${master_data_dir} contains '/' characters – I think '|' is really unlikely to be a pathname component. It worked for me both with and without the '\' before the '$' – I get the following output from the script: snapshot.20081113113700
        Hide
        Bill Au added a comment -

        Steven, thanks for testing on Solaris. It looks like on Linux and FreeBSD that the '\' in front of '$' escape the special meaning of '$' so it is trying to match against the literal '$' after all the digits (ie snapshot.20080527124131$). Unless this does not work on Mac OS X, I will go with perl without the '\' before the '$'. I will attach a patch here after I test on my Mac tonight.

        Show
        Bill Au added a comment - Steven, thanks for testing on Solaris. It looks like on Linux and FreeBSD that the '\' in front of '$' escape the special meaning of '$' so it is trying to match against the literal '$' after all the digits (ie snapshot.20080527124131$). Unless this does not work on Mac OS X, I will go with perl without the '\' before the '$'. I will attach a patch here after I test on my Mac tonight.
        Hide
        Bill Au added a comment -

        It works on Mac OS X.

        Show
        Bill Au added a comment - It works on Mac OS X.
        Hide
        Bill Au added a comment -

        patch to use perl regular expression to improve accuracy in finding latest snapshot.

        Show
        Bill Au added a comment - patch to use perl regular expression to improve accuracy in finding latest snapshot.
        Hide
        Bill Au added a comment -

        Patch committed:

        Sending CHANGES.txt
        Sending src/scripts/snappuller
        Transmitting file data ..
        Committed revision 719233.

        Show
        Bill Au added a comment - Patch committed: Sending CHANGES.txt Sending src/scripts/snappuller Transmitting file data .. Committed revision 719233.
        Hide
        Grant Ingersoll added a comment -

        Bulk close for Solr 1.4

        Show
        Grant Ingersoll added a comment - Bulk close for Solr 1.4

          People

          • Assignee:
            Bill Au
            Reporter:
            Hoss Man
          • Votes:
            1 Vote for this issue
            Watchers:
            1 Start watching this issue

            Dates

            • Created:
              Updated:
              Resolved:

              Development