  1. Solr
  2. SOLR-15371

Backups randomly fail sometimes

Details

    • Type: Bug
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 8.5.2, 8.8.2
    • Fix Version/s: None
    • Component/s: Backup/Restore
    • Labels: None

    Description

      Hi, we have an issue where one shard sometimes fails to back up, possibly due to a race condition around creating the folder and starting the backup. When this happens, we have to restart the first server in the shard before the backup succeeds again. The cluster backs up to a shared NFS mount. About 4 out of 5 times the backup completes without issues (a backup of another collection that runs later in the morning even succeeds, although it uses the same servers). Below is the error I get.

      "Response":"Failed to backup core=slprod_shard4_replica_n6 because org.apache.solr.common.SolrException: Directory to contain snapshots doesn't exist: file:///mnt/solr_backups/slprod/slprod-04-25-2021. Note that Backup/Restore of a SolrCloud collection requires a shared file system mounted at the same path on all nodes!"},
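
Since the error says the snapshot directory does not exist, one way to narrow this down is to check, on each node right before the backup starts, whether the backup location is visible and writable there. The following is only a diagnostic sketch: `check_backup_location` is a hypothetical helper name, and it assumes `BACKUP_PATH` is a plain filesystem path such as the NFS mount directory.

```shell
#!/usr/bin/env bash
# Hypothetical diagnostic helper, not part of Solr.
# Prints "ok", "missing", or "not-writable" for the given directory.
check_backup_location() {
  local dir="$1"
  if [ ! -d "$dir" ]; then
    echo "missing"        # directory is not visible on this node
  elif [ ! -w "$dir" ]; then
    echo "not-writable"   # directory exists but the current user cannot write to it
  else
    echo "ok"
  fi
}

# Example call (BACKUP_PATH is the variable from the backup script):
#   check_backup_location "${BACKUP_PATH}"
```

Running this as the Solr user on every node of the failing shard would show whether the mount itself is missing on a node or whether only the per-backup subdirectory is absent when the core backup starts.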
      

      Below is the command I use to run the backup (the bash variables are set earlier in the script):

      curl -s "http://localhost:8983/solr/admin/collections?action=BACKUP&name=${COLLECTION}-${DATE}&collection=${COLLECTION}&location=${BACKUP_PATH}&async=1000"
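
Because the command above is submitted with `async`, the curl call returns before the backup finishes, so the script has to ask the Collections API for the outcome via `action=REQUESTSTATUS`. The sketch below polls that endpoint. `parse_async_state` and `wait_for_async` are hypothetical helper names, `SOLR_URL` is assumed from the command above, and the per-run request ID in the usage comment is a choice of this sketch, not something identified as the cause of the failure.

```shell
#!/usr/bin/env bash
# Assumed base URL, taken from the backup command in this report.
SOLR_URL="${SOLR_URL:-http://localhost:8983/solr}"

# Read a REQUESTSTATUS JSON response on stdin and print the value of "state"
# (for example completed, failed, running, submitted, notfound).
parse_async_state() {
  sed -n 's/.*"state"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p' | head -n 1
}

# Poll until the async request completes or fails.
# Arguments: request id, max attempts (default 60), seconds between polls (default 10).
# Returns 0 on completed, 1 on failed, 2 if no final state was seen in time.
wait_for_async() {
  local request_id="$1" attempts="${2:-60}" delay="${3:-10}"
  local i=0 state
  while [ "$i" -lt "$attempts" ]; do
    state=$(curl -s "${SOLR_URL}/admin/collections?action=REQUESTSTATUS&requestid=${request_id}&wt=json" | parse_async_state)
    case "$state" in
      completed) return 0 ;;
      failed)    return 1 ;;
    esac
    sleep "$delay"
    i=$((i + 1))
  done
  return 2
}

# Example usage with the variables from the backup script:
#   REQUEST_ID="${COLLECTION}-${DATE}-$$"
#   curl -s "${SOLR_URL}/admin/collections?action=BACKUP&name=${COLLECTION}-${DATE}&collection=${COLLECTION}&location=${BACKUP_PATH}&async=${REQUEST_ID}"
#   wait_for_async "${REQUEST_ID}" || echo "backup ${REQUEST_ID} did not complete" >&2
```

Logging the final state per run would make it easier to see which runs fail and to correlate them with the directory error shown earlier.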
      

          People

            Assignee: Unassigned
            Reporter: Roy Perkins (meltingrobot)