Solr / SOLR-4519

Corrupt tlog causes a fullCopy download of the index files every time a node is rebooted


Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Cannot Reproduce
    • Affects Version/s: 4.0
    • Fix Version/s: None
    • Component/s: None
    • Labels: None
    • Environment: The SolrCloud cluster runs on three servers, with three Solr instances on each server. The collection has three shards, and every shard has three replicas. The replicas of a given shard run in Solr instances on different servers.

    Description

      There are two questions:
      1. The tlog of one replica of shard1 was damaged for some reason. We are still looking for the cause. Please share any clues if you are familiar with this problem.

      2. The damaged replica recovered successfully by downloading the index files from the leader with a full copy. I then killed the instance and started it again, and the recovery process was again a full-copy download. In my opinion, the tlog should have been fixed after the first full-copy recovery. Here is part of the log:

      2013-02-28 15:04:58,622 INFO org.apache.solr.cloud.ZkController:757 - Core needs to recover:metadata
      2013-02-28 15:04:58,622 INFO org.apache.solr.update.DefaultSolrCoreState:214 - Running recovery - first canceling any ongoing recovery
      2013-02-28 15:04:58,625 INFO org.apache.solr.cloud.RecoveryStrategy:217 - Starting recovery process. core=metadata recoveringAfterStartup=true
      2013-02-28 15:04:58,626 INFO org.apache.solr.common.cloud.ZkStateReader:295 - Updating cloud state from ZooKeeper...
      2013-02-28 15:04:58,628 ERROR org.apache.solr.update.UpdateLog:957 - Exception reading versions from log
      java.io.EOFException
      at org.apache.solr.common.util.FastInputStream.readUnsignedByte(FastInputStream.java:72)
      at org.apache.solr.common.util.FastInputStream.readInt(FastInputStream.java:206)
      at org.apache.solr.update.TransactionLog$ReverseReader.next(TransactionLog.java:705)
      at org.apache.solr.update.UpdateLog$RecentUpdates.update(UpdateLog.java:906)
      at org.apache.solr.update.UpdateLog$RecentUpdates.access$000(UpdateLog.java:846)
      at org.apache.solr.update.UpdateLog.getRecentUpdates(UpdateLog.java:996)
      at org.apache.solr.cloud.RecoveryStrategy.doRecovery(RecoveryStrategy.java:256)
      at org.apache.solr.cloud.RecoveryStrategy.run(RecoveryStrategy.java:220)
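
The stack trace above shows the EOFException being raised while the log is read in reverse (`TransactionLog$ReverseReader.next` calling `FastInputStream.readInt`). The following is a minimal sketch of how a truncated tail can break a backwards scan. It is a simplified model, not Solr's actual tlog format: the record layout (payload followed by a 4-byte length) and the names `TruncatedLogDemo`, `writeLog` and `readBackwards` are assumptions made for illustration only.

```java
import java.io.EOFException;
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class TruncatedLogDemo {

    // Hypothetical layout: each record is its payload followed by a 4-byte length.
    static byte[] writeLog(List<String> records) {
        int size = 0;
        for (String r : records) {
            size += r.getBytes(StandardCharsets.UTF_8).length + 4;
        }
        ByteBuffer buf = ByteBuffer.allocate(size);
        for (String r : records) {
            byte[] payload = r.getBytes(StandardCharsets.UTF_8);
            buf.put(payload).putInt(payload.length);
        }
        return buf.array();
    }

    // Walk the log from the end: read the trailing length, then the payload before it.
    static List<String> readBackwards(byte[] log) throws EOFException {
        List<String> out = new ArrayList<>();
        ByteBuffer buf = ByteBuffer.wrap(log);
        int end = log.length;
        while (end > 0) {
            if (end < 4) {
                throw new EOFException("no room for a length field before offset " + end);
            }
            int len = buf.getInt(end - 4);
            int start = end - 4 - len;
            if (len < 0 || start < 0) {
                // A cut-off tail makes the "length" bytes meaningless.
                throw new EOFException("record length " + len + " runs past the start of the log");
            }
            out.add(new String(log, start, len, StandardCharsets.UTF_8));
            end = start;
        }
        return out;
    }

    public static void main(String[] args) throws EOFException {
        byte[] log = writeLog(List.of("add:doc1", "add:doc2", "add:doc3"));
        System.out.println(readBackwards(log)); // newest record first

        // Simulate a write that was cut off two bytes short of a complete record.
        byte[] truncated = Arrays.copyOf(log, log.length - 2);
        try {
            readBackwards(truncated);
        } catch (EOFException e) {
            System.out.println("truncated log: " + e.getMessage());
        }
    }
}
```

In this model a reader that starts from the end of the file cannot recover any record once the tail is incomplete, because the first value it reads is the damaged one. Whether that is what happened to the tlog in this report is not established by the log excerpt.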

      2013-02-28 15:05:01,857 INFO org.apache.solr.cloud.RecoveryStrategy:399 - Begin buffering updates. core=metadata
      2013-02-28 15:05:01,857 INFO org.apache.solr.update.UpdateLog:1015 - Starting to buffer updates. FSUpdateLog{state=ACTIVE, tlog=null}

      2013-02-28 15:05:01,857 INFO org.apache.solr.cloud.RecoveryStrategy:126 - Attempting to replicate from http://23.61.21.121:65201/solr/metadata/. core=metadata

      2013-02-28 15:05:02,882 INFO org.apache.solr.handler.SnapPuller:305 - Master's generation: 6993
      2013-02-28 15:05:02,882 INFO org.apache.solr.handler.SnapPuller:306 - Slave's generation: 6993
      2013-02-28 15:05:02,882 INFO org.apache.solr.handler.SnapPuller:307 - Starting replication process
      2013-02-28 15:05:02,893 INFO org.apache.solr.handler.SnapPuller:312 - Number of files in latest index in master: 422
      2013-02-28 15:05:02,897 INFO org.apache.solr.handler.SnapPuller:325 - Starting download to /solr/nodes/node1/bin/../solr/metadata/data/index.20130228150502893 fullCopy=true

      2013-02-28 15:33:55,848 INFO org.apache.solr.handler.SnapPuller:334 - Total time taken for download : 1732 secs (The size of index files is 94G)
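
To put the cost of each restart in numbers, the sketch below derives the download duration from the two SnapPuller timestamps in the log and estimates the transfer rate. It assumes that "94G" means 94 GiB; the class and method names (`ReplicationThroughput`, `elapsedSeconds`, `mibPerSecond`) are hypothetical helpers, not part of Solr.

```java
import java.time.Duration;
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;

public class ReplicationThroughput {

    // Timestamp layout used by the log lines quoted in this issue.
    private static final DateTimeFormatter LOG_TIME =
            DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ss,SSS");

    // Whole seconds elapsed between two log timestamps.
    static long elapsedSeconds(String start, String end) {
        LocalDateTime s = LocalDateTime.parse(start, LOG_TIME);
        LocalDateTime e = LocalDateTime.parse(end, LOG_TIME);
        return Duration.between(s, e).getSeconds();
    }

    // Average rate in MiB/s, assuming the size is given in GiB.
    static double mibPerSecond(double sizeGiB, long seconds) {
        return sizeGiB * 1024.0 / seconds;
    }

    public static void main(String[] args) {
        long secs = elapsedSeconds("2013-02-28 15:05:02,897", "2013-02-28 15:33:55,848");
        System.out.println("download took " + secs + " s");
        System.out.printf("average rate: %.1f MiB/s%n", mibPerSecond(94, secs));
    }
}
```

The timestamps give 1732 whole seconds, which matches the figure SnapPuller reports, so every restart of this node costs roughly 29 minutes of full-copy download at about 55.6 MiB/s.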

          People

            Assignee: Unassigned
            Reporter: Simon Scofield (icanfly0421)